dependabot/pip/develop/torch-2.12.1
dependabot/pip/develop/praw-8.0.1
dependabot/pip/develop/yt-dlp-2026.6.9
dependabot/pip/develop/elevenlabs-2.53.0
dependabot/pip/develop/transformers-5.12.1
develop
master
blocked-words
New-WebDriver
1.0.0
1.1
2.0
2.1
2.2
2.2.1
2.2.2
2.3
2.4
2.4.1
2.4.2
3.0
3.0.1
3.1
3.2
3.2.1
3.3.0
3.4.0
${ noResults }
2 Commits (71cbbacd60e428bd9ac9bf9332a387987e5f457a)
| Author | SHA1 | Message | Date |
|---|---|---|---|
|
|
076b65f04c |
feat: pro caption system with WhisperX word-level alignment
Core changes: - utils/caption_renderer.py: new single-responsibility rendering engine - Three display modes: aligned, single, multi - 8-direction stroke technique for clean text outlines - Transparent PNG overlays (no more solid box) - utils/whisper_aligner.py: WhisperX forced alignment module - Word-level timestamps from any TTS audio - Graceful fallback to single mode if unavailable - utils/imagenarator.py: refactored as thin orchestrator - Delegates to caption_renderer - Saves timing_map.json for final_video sync - utils/sentiment_map.py: added STYLE_MAP with display_mode per sentiment - utils/sentiment.py: stores sentiment in settings for downstream use - TTS/engine_wrapper.py: runs WhisperX after each TTS save - video_creation/final_video.py: reads timing_map, handles absolute + fraction timing - video_creation/screenshot_downloader.py: clean imagemaker call Assets: - fonts/: added Montserrat, Nunito, Oswald, Raleway, Lato, Anton font families Dependencies: - requirements.txt: updated with all current dependencies |
3 weeks ago |
|
|
af0940045c |
feat: sentiment-aware video pipeline with DeepSeek, metadata generation, and per-video folder structure
SENTIMENT DETECTION (utils/sentiment.py)
- Integrate DeepSeek API using OpenAI-compatible SDK to classify each Reddit post
- Detect sentiment from post title + body (first 500 chars) into 8 labels:
sad, happy, angry, mysterious, funny, dramatic, wholesome, scary
- Override in-memory config per post (background_video, background_audio, voice)
- Falls back to 'dramatic' label if DeepSeek API fails or is unavailable
- Can be enabled/disabled via config.toml [deepseek] enabled = true/false
SENTIMENT MAPS (utils/sentiment_map.py)
- BACKGROUND_MAP: maps each sentiment to optimal background video + audio pair
- OPENAI_VOICE_MAP: maps each sentiment to best-fit OpenAI TTS voice
- ELEVENLABS_VOICE_MAP: maps each sentiment to best-fit ElevenLabs voice
(fully mapped to real voices: Adam, George, Harry, Callum, Jessica, Brian, Laura, Matilda)
- All overrides are in-memory only — config.toml is never modified
METADATA GENERATION (utils/sentiment.py)
- Single DeepSeek API call generates both sentiment + social media metadata
- Generates per-platform content:
* YouTube: title (max 70 chars) + full description
* TikTok: caption (max 150 chars) with hashtags
* Instagram: caption with hashtags
* Facebook: caption
* Hashtags: list of relevant tags
- Falls back to basic title-based metadata if DeepSeek fails
- Saves metadata.json inside each video's output folder
RESULTS FOLDER RESTRUCTURE (video_creation/final_video.py)
- Changed output structure from results/{subreddit}/{filename}.mp4
- New structure: results/{actual_subreddit}/{thread_id}_{sentiment}/video.mp4
- Each video now has its own isolated folder containing:
* video.mp4
* metadata.json
* thumbnail.png (if thumbnail generation is enabled)
* OnlyTTS/video.mp4 (if enable_extra_audio is enabled)
SUBREDDIT TRACKING (reddit/subreddit.py)
- Added thread_subreddit field to reddit_object using submission.subreddit.display_name
- Posts from r/AmItheAsshole now save to results/AmItheAsshole/
- Posts from r/tifu now save to results/tifu/
- Posts from r/confession now save to results/confession/
- Previously all posts were grouped under the combined subreddit string
PIPELINE INTEGRATION (main.py)
- Added apply_sentiment_config() call between post fetching and video generation
- Sentiment detection runs before TTS and background selection
- Controlled by settings.config['deepseek']['enabled'] flag
CONFIG CHANGES (config.toml + utils/.config.template.toml)
- Added [deepseek] section with api_key and enabled fields
- elevenlabs_voice_name changed from optional=false to optional=true
- Prevents prompt appearing when ElevenLabs is not the selected TTS provider
|
4 weeks ago |