17 KiB
CLAUDE.md — VideoMakerBot Development Guide
Project Overview
VideoMakerBot — Automated short-form video creator from social media content.
Status: Production-ready, actively maintained (v3.4.0)
Language: Python 3.10 (locked by Dockerfile; host venv may use 3.14 for tooling only)
Runtime: Docker only — all CLI, GUI, and test invocations go through docker compose. Do not invoke python on the host.
Platforms: Reddit (PRAW API), Threads (Graph API + Web Scraping)
Core Mission
Transforms social media threads (post + comments/replies) into complete short-form videos with:
- AI-generated speech (7+ TTS providers)
- UI screenshots (Playwright, headless Chromium pre-installed in image)
- Background video/audio overlays
- FFmpeg composition & output (Linux ffmpeg with full filter set, including
drawtext) - Optional YouTube upload
- Modern web UI (Tailwind CSS + DaisyUI + Lucide + vanilla ES6) on
localhost:4000
Architecture at a Glance
main.py (CLI)
↓ [platform factory]
├─→ reddit/subreddit.py [PRAW API]
└─→ platforms/threads/
├─→ fetcher.py [Graph API — your own posts]
├─→ scraper.py [Web scraping — trending For You feed]
└─→ auth.py [Shared Playwright login + cookies]
↓ [standard data dict]
├─→ TTS/engine_wrapper.py [7+ providers, auto-fallback]
├─→ screenshot_downloader.py (Reddit)
│ or platforms/threads/screenshot.py (Threads)
├─→ video_creation/background.py [local or yt-dlp]
├─→ video_creation/youtube_uploader.py [optional auto-upload]
└─→ video_creation/final_video.py [FFmpeg with libx264]
↓
results/{category}/{video.mp4}
Data Contract: The "content_object" Dict
All fetchers return this shape:
{
"thread_id": str, # Used for temp folder: assets/temp/{id}/
"thread_category": str, # "reddit", "threads" → output folder
"thread_title": str, # TTS + output filename (clean, no metadata)
"thread_url": str, # Playwright navigates here for screenshot
"is_nsfw": bool,
"comments": [
{
"comment_body": str, # TTS per reply (clean body text)
"comment_url": str, # Playwright navigates here
"comment_id": str, # Unique identifier (URL-based for scraper)
}
],
"thread_post": str | list, # Story mode (no comments)
}
File Organization
VideoMakerBot/
├── platforms/
│ ├── __init__.py # Factory: get_content_object(), get_screenshot_fn()
│ └── threads/
│ ├── auth.py # Shared Playwright login + cookie management
│ ├── fetcher.py # Graph API → content_object (your own posts)
│ ├── scraper.py # Web scraping → content_object (trending feed)
│ └── screenshot.py # Playwright Threads screenshotter (div-based)
│
├── reddit/
│ └── subreddit.py # PRAW API → content_object
│
├── video_creation/
│ ├── final_video.py # FFmpeg composition (libx264, no drawtext on macOS)
│ ├── background.py # Video/audio downloader (local files or yt-dlp)
│ ├── screenshot_downloader.py # Playwright Reddit UI capturer
│ ├── voices.py # TTS orchestrator
│ └── youtube_uploader.py # YouTube OAuth2 upload (post-render hook)
│
├── TTS/
│ ├── engine_wrapper.py # Provider abstraction + TikTok→pyttsx3 fallback
│ ├── TikTok.py # TikTok TTS (hardened error handling)
│ └── ... # 7+ provider implementations
│
├── utils/
│ ├── settings.py # Config loading + interactive validation
│ ├── videos.py # check_done() + check_done_by_id()
│ ├── console.py # Rich terminal output
│ ├── .config.template.toml # Config schema
│ ├── background_videos.json # Background video manifest
│ ├── background_audios.json # Background audio manifest
│ └── ...
│
├── GUI/ # Flask templates (Tailwind + DaisyUI + Lucide)
│ ├── layout.html # Base layout (no jQuery, no Bootstrap)
│ ├── index.html # Video Library (3 buttons: source / download / copy link)
│ ├── backgrounds.html # Background Manager (videos catalog)
│ ├── settings.html # Config editor (validated against template)
│ └── create.html # Render progress page
│
├── tests/
│ └── test_gui_utils.py # pytest regression for add/delete background
│
├── main.py # CLI entry (platform-routed via factory)
├── GUI.py # Flask web UI; `/video/<id>` serves files with sanitized headers
├── Dockerfile # python:3.10-slim-bookworm + ffmpeg + playwright + pytest
├── docker-compose.yml # Services: gui, cli, test
├── docker-entrypoint.sh # Runs `utils.docker_bootstrap` then exec's the command
├── requirements.txt
└── CLAUDE.md
Configuration
Threads (full config)
[settings]
platform = "threads"
[threads]
discovery_method = "scrape" # "api" (Graph API, own posts) or "scrape" (trending feed)
[threads.creds]
username = "your_insta" # For Playwright login (always needed)
password = "your_password"
access_token = "" # Only for discovery_method="api"
user_id = "" # Only for discovery_method="api"
[threads.thread]
post_id = "" # Specific post ID; blank = auto-pick from feed
max_reply_length = 500
min_reply_length = 1
min_replies = 5 # Minimum replies for post eligibility
min_engagement = 0 # Minimum likes+reposts for viral filter (0=disabled, 10000=viral)
blocked_words = ""
[settings.tts]
voice_choice = "googletranslate" # Best for macOS: no API key, fast, free
# voice_choice = "tiktok" # Needs tiktok_sessionid; auto-falls back to pyttsx3
# voice_choice = "OpenAI" # Needs openai_api_key
[settings.background]
background_video = "minecraft"
background_audio = "lofi"
background_audio_volume = 0.15
Reddit (reference)
[settings]
platform = "reddit"
[reddit.creds]
client_id = "..."
client_secret = "..."
username = "..."
password = "..."
2fa = false
2fa_secret = "" # TOTP base32 secret for auto-2FA
[reddit.thread]
subreddit = "AskReddit"
min_comments = 20
YouTube upload
[youtube]
enabled = false # Set true to auto-upload after render
privacy = "public" # or "private", "unlisted"
client_secret_path = "" # Path to youtube_client_secret.json
Platform-Specific Knowledge
Threads — Web Scraping (discovery_method = "scrape")
DOM Structure:
- Threads.net uses div-based card layout — NO
<article>elements anywhere - Feed posts:
a[href*="/post/"]links inside<div>cards (class containsx1a2a7pz) - Post pages: same structure; main post link appears first, replies follow
- Screenshots: Use
a[href*="/post/"]→ ancestor div card, NOTpage.locator("article")
Card Text Format (used by _parse_card_text()):
Line 0: username
Line 1: timestamp (e.g., "14h", "1d")
Line 2..N: post body text
Last 1-4: engagement metrics (likes, replies, reposts, quotes)
Engagement Parsing:
- Numbers can be plain ("266") or abbreviated ("1K", "2.5M")
likes= first trailing number,replies= second,reposts= thirdmin_engagementfilters bylikes + repoststotal- Posts are sorted by engagement descending before selection
Login Flow:
- Threads uses Instagram auth (
threads.net/login) - Selectors:
input[autocomplete="username"],input[autocomplete="current-password"] - Button:
get_by_role("button", name="Log in", exact=True).first - Cookies cached at
video_creation/data/cookie-threads.json - Login logic is shared via
platforms/threads/auth.py
API Limitation:
- Graph API v1.0 only accesses YOUR OWN posts — no trending/discovery
- Scraping bypasses this entirely — no API token needed
Threads — Graph API (discovery_method = "api")
- Auth: Bearer token, 60-day expiry
- Only accesses authenticated user's own threads + replies
- Use when you have your own content with replies
- API: PRAW (Python Reddit API Wrapper)
- Post discovery:
subreddit.hot(limit=25)→get_subreddit_undone()→ fallback totop(day/hour/month/week/year/all) - Screenshot: Playwright on new.reddit.com
- 2FA: Auto-TOTP via
pyotpwhen2fa_secretis configured in config.toml
Development Guidelines
✅ DO:
- Run everything through Docker —
docker compose up gui,docker compose run --rm cli,docker compose run --rm test - Use platform factory — never import platform modules directly
- Return standard content_object from all fetchers
- Use clean body text for TTS — parse out username/timestamp metadata
- Default to
googletranslateTTS for headless containers — no API key, fast, free - Use
libx264encoder —h264_nvencis NVIDIA-only and not available in the slim image - Test both Threads discovery methods:
apiandscrape - Bind-mount preserves state — edits to
config.toml,results/,assets/temp/,video_creation/data/, and theutils/background_*.jsoncatalogs persist across container runs - GUI must bind to
0.0.0.0in Docker (already enforced viaGUI_HOST=0.0.0.0env) - Use
/video/<id>to serve renders — the route looks up the file by id invideos.json, sanitizes theContent-Dispositionfilename, and avoids 404s caused by literal newlines in titles
❌ DON'T:
- Don't run
python GUI.pyorpython main.pyon the host — Docker is the only supported path - Don't use
<article>selectors on Threads.net — the DOM is div-based - Don't hardcode
h264_nvenc— uselibx264for cross-platform compatibility - Don't import platform modules directly in main.py/utils
- Don't assume config keys exist without
.get()fallback - Don't reintroduce jQuery, Bootstrap, or ClipboardJS — the UI is vanilla ES6 + Tailwind + DaisyUI + Lucide
- Don't write to
utils/backgrounds.json— it is a legacy empty file. Useutils/background_videos.jsonandutils/background_audios.json
Web UI (Flask, served by gui service)
- Stack: Tailwind CSS, DaisyUI, Lucide Icons, vanilla ES6 (no jQuery, no Bootstrap, no ClipboardJS)
- Routes:
/— Video Library; cards show source-post link, download, and copy-link buttons/video/<id>— serves the rendered mp4 by id (lookup viavideos.json); guards path-traversal and sanitizes the filename forContent-Disposition/backgrounds— Background Manager UI/backgrounds.json— servesutils/background_videos.json(the videos catalog)/background/add,/background/delete— POST endpoints; mutate bothutils/background_videos.jsonand thesettings.background.background_video.optionsarray inutils/.config.template.toml/settings— config editor; loads fromconfig.toml, validates againstutils/.config.template.toml, persists viautils/gui_utils.modify_settings(preserves comments/formatting viatomlkit)
- HTML escaping: the
h()helper inindex.htmlescapes& " < >for any user-controlled string embedded in attributes — use it for any new dynamic data on the Library page
Key Files to Know
| File | Purpose |
|---|---|
main.py |
CLI entry; pipeline orchestration via factory |
platforms/__init__.py |
Factory dispatch (platform + discovery_method) |
platforms/threads/scraper.py |
NEW — Web scraping fetcher with engagement parsing |
platforms/threads/auth.py |
NEW — Shared Playwright login + cookie management |
platforms/threads/fetcher.py |
Graph API client (own posts only) |
platforms/threads/screenshot.py |
Div-based Threads screenshotter |
video_creation/final_video.py |
FFmpeg composition (libx264, platform-aware output) |
video_creation/background.py |
Background downloader (local files + yt-dlp) |
video_creation/youtube_uploader.py |
NEW — OAuth2 YouTube upload |
TTS/engine_wrapper.py |
TTS provider abstraction + TikTok fallback |
TTS/TikTok.py |
Hardened TikTok TTS with graceful error handling |
reddit/subreddit.py |
PRAW Reddit fetcher with auto-2FA |
utils/settings.py |
Config loading + interactive validation |
utils/videos.py |
Video dedup tracking |
utils/.config.template.toml |
Config schema (also drives Settings page validation) |
utils/background_videos.json |
Background video manifest (served at /backgrounds.json) |
utils/background_audios.json |
Background audio manifest |
utils/gui_utils.py |
add_background, delete_background, modify_settings, get_checks |
GUI.py |
Flask app: /, /video/<id>, /backgrounds, /settings, /create |
Dockerfile |
python:3.10-slim-bookworm + ffmpeg + Playwright Chromium + pytest |
docker-compose.yml |
Three services: gui (port 4000), cli, test |
tests/test_gui_utils.py |
Pytest regression for Background Manager round-trip |
Debugging Tips
FFmpeg "Unknown encoder 'h264_nvenc'"
→ Use libx264. Find-and-replace h264_nvenc → libx264 in video_creation/final_video.py. The slim image does not ship with NVIDIA encoders.
yt-dlp "Requested format is not available"
→ Bump the pinned version in requirements.txt and rebuild (docker compose build). Also prefer best[height<=1080] over bestvideo in video_creation/background.py — many videos lack video-only streams.
Threads screenshots fail ("Main post article not found")
→ Threads.net uses div cards, not <article>. Ensure screenshot code uses a[href*="/post/"] → ancestor div approach.
Config validator EOFError in non-interactive mode
→ check_toml() prompts for ALL platform sections regardless of platform setting. Either fill all required fields, edit through /settings, or pre-populate config.toml before docker compose run cli.
Playwright timeout on Threads login
→ Cookies corrupted. Delete video_creation/data/cookie-threads.json for fresh login (the file is bind-mounted, so deleting on host clears the container too). Also confirm selectors: button uses exact=True due to multiple "Log in" buttons.
No viral posts found
→ Lower min_engagement in config. Most Threads feed posts have <100 likes — 10000 filters almost everything.
Background Manager grid is empty
→ /backgrounds.json must serve utils/background_videos.json (split catalog), not the legacy utils/backgrounds.json (empty {}). Verify in GUI.py:backgrounds_json.
/video/<id> returns 404
→ The route looks up the entry in video_creation/data/videos.json by id and resolves the file under results/<thread_category>/<filename>.mp4. Confirm both the JSON entry and the file exist; the file may have been pruned.
JS "Unexpected end of input" on Library page
→ Any user-controlled string interpolated into an HTML attribute must go through the h() helper in index.html. Avoid inline onclick= with ${JSON.stringify(...)}.
Stale image after editing requirements.txt or Dockerfile
→ docker compose build to rebuild. Code changes alone do NOT need a rebuild because the repo root is bind-mounted to /app.
Useful Commands (Docker-only)
# Build (or rebuild after Dockerfile / requirements.txt changes)
docker compose build
# Run the GUI (foreground)
docker compose up gui
# → http://localhost:4000
# Run the GUI in the background
docker compose up -d gui
docker compose logs -f gui
docker compose down
# Run the CLI pipeline (one-off, removed on exit)
docker compose run --rm cli
docker compose run --rm cli python main.py <post_id>
# Run the test suite
docker compose run --rm test
# Open a shell in a fresh container for ad-hoc commands
docker compose run --rm --entrypoint /bin/bash gui
# inside: python -m py_compile main.py platforms/threads/scraper.py
# Tail a running GUI container
docker compose exec gui ls /app/results/threads/
Anything that needs
pip install,playwright install, orapt-getbelongs inDockerfilefollowed bydocker compose build— never run those on the host.