RedditVideoMakerBot

Commit Graph

Author	SHA1	Message	Date
Abdessamad Haddouche	076b65f04c	feat: pro caption system with WhisperX word-level alignment	3 weeks ago

feat: pro caption system with WhisperX word-level alignment

Core changes:
- utils/caption_renderer.py: new single-responsibility rendering engine
  - Three display modes: aligned, single, multi
  - 8-direction stroke technique for clean text outlines
  - Transparent PNG overlays (no more solid box)
- utils/whisper_aligner.py: WhisperX forced alignment module
  - Word-level timestamps from any TTS audio
  - Graceful fallback to single mode if unavailable
- utils/imagenarator.py: refactored as thin orchestrator
  - Delegates to caption_renderer
  - Saves timing_map.json for final_video sync
- utils/sentiment_map.py: added STYLE_MAP with display_mode per sentiment
- utils/sentiment.py: stores sentiment in settings for downstream use
- TTS/engine_wrapper.py: runs WhisperX after each TTS save
- video_creation/final_video.py: reads timing_map, handles absolute + fraction timing
- video_creation/screenshot_downloader.py: clean imagemaker call
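
The 8-direction stroke technique listed under utils/caption_renderer.py can be sketched roughly as below. This is a minimal illustration, not the code from this commit: the function names, default colours, and stroke width are all assumptions. The only library call it relies on is the `text()` method of a Pillow `ImageDraw` object.

```python
# Sketch only: these names are hypothetical and are not taken from
# utils/caption_renderer.py.


def stroke_offsets(width):
    """Return the 8 compass-direction (dx, dy) offsets for a stroke width."""
    return [
        (dx * width, dy * width)
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)  # skip the centre; that is the fill pass
    ]


def draw_outlined_text(draw, xy, text, font, fill="white", stroke="black", width=2):
    """Draw outlined text on a Pillow ImageDraw-like object.

    The text is stamped once per direction in the stroke colour, then once
    more at the true position in the fill colour, so the fill sits on top.
    """
    x, y = xy
    for dx, dy in stroke_offsets(width):
        draw.text((x + dx, y + dy), text, font=font, fill=stroke)
    draw.text((x, y), text, font=font, fill=fill)
```

With Pillow, `draw` would typically come from `ImageDraw.Draw(img)` on an RGBA image created with a fully transparent background, which matches the transparent PNG overlays mentioned above. Recent Pillow versions also accept `stroke_width` and `stroke_fill` directly in `text()`, which may be a simpler alternative.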

Assets:
- fonts/: added Montserrat, Nunito, Oswald, Raleway, Lato, Anton font families

Dependencies:
- requirements.txt: updated with all current dependencies
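
As a rough illustration of how the per-sentiment `display_mode` and the fallback to single mode described in the core changes might fit together, here is a hedged sketch. The sentiment keys, the contents of `STYLE_MAP`, and the `choose_display_mode` helper are assumptions made for this example; only the three mode names and the fallback to `single` come from the commit message.

```python
# Hypothetical sketch: the real STYLE_MAP in utils/sentiment_map.py may use
# different keys and carry more style fields than display_mode.
STYLE_MAP = {
    "positive": {"display_mode": "aligned"},
    "neutral": {"display_mode": "multi"},
    "negative": {"display_mode": "single"},
}

DEFAULT_MODE = "single"


def choose_display_mode(sentiment, word_timestamps):
    """Pick a caption display mode for a sentiment label.

    Assumption: only "aligned" needs word-level timestamps, so it falls
    back to "single" when alignment produced nothing (for example because
    WhisperX is not installed).
    """
    style = STYLE_MAP.get(sentiment, {"display_mode": DEFAULT_MODE})
    mode = style["display_mode"]
    if mode == "aligned" and not word_timestamps:
        return DEFAULT_MODE
    return mode
```

Keeping the fallback in one small function means the renderer never has to know whether alignment ran; it only receives a mode it can always satisfy.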

1 commit (71cbbacd60e428bd9ac9bf9332a387987e5f457a)