You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
393 lines
13 KiB
393 lines
13 KiB
# AGENT.md — Guidance for Agents & AI Working on VideoMakerBot
|
|
|
|
This document guides **agents, bots, and AI assistants** on how to work effectively with the VideoMakerBot codebase.
|
|
|
|
---
|
|
|
|
## Quick Start for Agents
|
|
|
|
### Core Principle
|
|
**VideoMakerBot uses a platform-agnostic factory pattern.** Always respect the abstraction:
|
|
- Don't import platform-specific modules (reddit/, threads/) directly
|
|
- Always use `platforms/__init__.py` factory functions
|
|
- Keep platform-specific logic in `platforms/{platform}/`
|
|
|
|
### The "Do This" Checklist
|
|
1. ✅ Read existing CLAUDE.md for architecture context
|
|
2. ✅ Use factory: `from platforms import get_content_object, get_screenshot_fn`
|
|
3. ✅ Return standard `content_object` dict from all fetchers
|
|
4. ✅ Test both Reddit and Threads modes before declaring completion
|
|
5. ✅ Use config fallback chains for cross-platform keys
|
|
6. ✅ Document platform-specific logic in docstrings
|
|
|
|
### The "Don't Do This" List
|
|
1. ❌ Import `reddit.subreddit` directly in main.py or generic modules
|
|
2. ❌ Hardcode subreddit/platform names in core video pipeline
|
|
3. ❌ Add platform-specific selectors outside `platforms/{platform}/`
|
|
4. ❌ Assume config keys exist without `.get()` and fallbacks
|
|
5. ❌ Modify screenshot_downloader.py for non-Reddit platforms
|
|
|
|
---
|
|
|
|
## Understanding the Codebase Structure
|
|
|
|
### Entry Point
|
|
**`main.py`** — Single CLI entry point using platform factory
|
|
- Calls `get_content_object(POST_ID)` from factory
|
|
- Calls `get_screenshot_fn()` from factory
|
|
- Everything else is platform-agnostic
|
|
|
|
### Platform Layer (`platforms/`)
|
|
- **`__init__.py`** — Factory dispatch functions (add new platforms here)
|
|
- **`threads/fetcher.py`** — Threads Graph API client (returns standard dict)
|
|
- **`threads/screenshot.py`** — Threads.net Playwright screenshotter
|
|
|
|
### Legacy Platform (`reddit/`)
|
|
- **`subreddit.py`** — PRAW API client (returns standard dict)
|
|
- No changes needed; called via factory
|
|
|
|
### Video Pipeline (`video_creation/`)
|
|
- **`final_video.py`** — FFmpeg composition (platform-aware output folder only)
|
|
- **`screenshot_downloader.py`** — Reddit Playwright screenshotter (not called for Threads)
|
|
- **`voices.py`** — TTS orchestration (platform-agnostic)
|
|
- **`background.py`** — Video/audio download (platform-agnostic)
|
|
|
|
### TTS Layer (`TTS/`)
|
|
- **`engine_wrapper.py`** — Provider abstraction (handles `post_lang` fallback)
|
|
- **`*.py`** — Individual provider implementations (elevenlabs, aws_polly, etc.)
|
|
|
|
### Config & Utils (`utils/`)
|
|
- **`settings.py`** — TOML config loading & validation
|
|
- **`videos.py`** — Dedup tracking (`check_done()` + `check_done_by_id()`)
|
|
- **`.config.template.toml`** — Config schema with `[settings]`, `[reddit.*]`, `[threads.*]`, `[ai]`
|
|
|
|
---
|
|
|
|
## How to Approach Common Tasks
|
|
|
|
### Adding a New Social Platform (e.g., X/Twitter)
|
|
|
|
**Steps:**
|
|
1. Create `platforms/twitter/fetcher.py`:
|
|
```python
|
|
def get_twitter_content(POST_ID=None) -> dict:
|
|
"""Fetch post + replies, return standard content_object."""
|
|
# Implement API fetching logic here
|
|
return {
|
|
"thread_id": ...,
|
|
"thread_category": "twitter", # NEW: generic field for output folder
|
|
"thread_title": ...,
|
|
"thread_url": ...,
|
|
"comments": [...]
|
|
}
|
|
```
|
|
|
|
2. Create `platforms/twitter/screenshot.py`:
|
|
```python
|
|
def get_screenshots_of_twitter_posts(content_object: dict, screenshot_num: int):
|
|
"""Use Playwright to screenshot X/Twitter posts."""
|
|
# Implement Playwright logic here
|
|
```
|
|
|
|
3. Update `platforms/__init__.py`:
|
|
```python
|
|
elif platform == "twitter":
|
|
from platforms.twitter.fetcher import get_twitter_content
|
|
return get_twitter_content(POST_ID)
|
|
```
|
|
|
|
4. Add config section to `utils/.config.template.toml`:
|
|
```toml
|
|
[twitter.creds]
|
|
api_key = { ... }
|
|
api_secret = { ... }
|
|
|
|
[twitter.thread]
|
|
post_id = { ... }
|
|
```
|
|
|
|
5. Update `main.py` helper:
|
|
```python
|
|
elif platform == "twitter":
|
|
return config.get("twitter", {}).get("thread", {}).get("post_id", "")
|
|
```
|
|
|
|
6. **Zero changes needed to:** TTS, backgrounds, video composition, utils.
|
|
|
|
**Verification:**
|
|
```bash
|
|
# Test Reddit (regression check)
|
|
sed -i 's/platform = "twitter"/platform = "reddit"/' config.toml
|
|
python3 main.py
|
|
# Verify results/{subreddit}/ output
|
|
|
|
# Test Twitter
|
|
sed -i 's/platform = "reddit"/platform = "twitter"/' config.toml
|
|
python3 main.py --post-id <twitter-id>
|
|
# Verify results/twitter/ output
|
|
```
|
|
|
|
---
|
|
|
|
### Modifying the Video Pipeline
|
|
|
|
**Scenario:** You need to change FFmpeg composition or add a new processing step.
|
|
|
|
**Approach:**
|
|
1. Check which data the modified code consumes (`content_object` dict)
|
|
2. Verify it works with both Reddit and Threads content structures
|
|
3. If platform-specific: move logic to `platforms/{platform}/`
|
|
4. If generic: keep in `video_creation/`
|
|
5. Test both modes before merging
|
|
|
|
**Example:** Adding video filters
|
|
```python
|
|
# In final_video.py (generic, works for all platforms)
|
|
def apply_filter(video_clip, filter_type):
|
|
# No platform-specific logic here
|
|
return video_clip.filter(...)
|
|
|
|
# Test:
|
|
# - Reddit mode produces filtered video
|
|
# - Threads mode produces filtered video
|
|
```
|
|
|
|
---
|
|
|
|
### Fixing a Bug in Config Handling
|
|
|
|
**Scenario:** `post_lang` is not being applied correctly.
|
|
|
|
**Debug Path:**
|
|
1. Check `utils/settings.py` — how is config loaded?
|
|
2. Check `TTS/engine_wrapper.py:182` — uses fallback chain:
|
|
```python
|
|
lang = (settings.config["settings"].get("post_lang") or
|
|
settings.config.get("reddit", {}).get("thread", {}).get("post_lang", ""))
|
|
```
|
|
3. Check `video_creation/final_video.py:78` — same fallback logic
|
|
4. If still broken: verify `utils/.config.template.toml` has the key defined
|
|
5. Test both platforms with `post_lang = "es"` in config
|
|
|
|
---
|
|
|
|
### Adding Support for a New TTS Provider
|
|
|
|
**Scenario:** User wants Whisper TTS support.
|
|
|
|
**Steps:**
|
|
1. Create `TTS/whisper_tts.py`:
|
|
```python
|
|
class WhisperTTS:
|
|
def make_voice(self, text):
|
|
# Call Whisper API
|
|
return audio_bytes
|
|
```
|
|
|
|
2. Update `TTS/engine_wrapper.py:make_voice()`:
|
|
```python
|
|
elif voice_choice == "whisper":
|
|
from TTS.whisper_tts import WhisperTTS
|
|
return WhisperTTS().make_voice(text)
|
|
```
|
|
|
|
3. Add config to `utils/.config.template.toml`:
|
|
```toml
|
|
[settings.tts]
|
|
whisper_api_key = { optional = true, ... }
|
|
```
|
|
|
|
4. Test:
|
|
```bash
|
|
# In config.toml:
|
|
voice_choice = "whisper"
|
|
# Run: python3 main.py
|
|
```
|
|
|
|
---
|
|
|
|
## Common Pitfalls & How to Avoid Them
|
|
|
|
### Pitfall 1: Platform-Specific Code in Generic Modules
|
|
**Problem:**
|
|
```python
|
|
# BAD: In video_creation/final_video.py
|
|
subreddit = settings.config["reddit"]["thread"]["subreddit"]
|
|
```
|
|
**Will break** when platform = "threads" (no reddit.thread.subreddit).
|
|
|
|
**Solution:**
|
|
```python
|
|
# GOOD:
|
|
platform = settings.config["settings"].get("platform", "reddit")
|
|
if platform == "reddit":
|
|
category = settings.config["reddit"]["thread"]["subreddit"]
|
|
else:
|
|
category = reddit_obj.get("thread_category", platform)
|
|
```
|
|
|
|
### Pitfall 2: Hardcoding Selectors in Platform-Agnostic Code
|
|
**Problem:**
|
|
```python
|
|
# BAD: In video_creation/voices.py
|
|
element = page.locator("#t1_{comment_id}") # Reddit-only selector!
|
|
```
|
|
**Will fail** when running Threads mode (different DOM).
|
|
|
|
**Solution:**
|
|
- Keep all Playwright logic in `platforms/{platform}/screenshot.py`
|
|
- Never hardcode selectors in generic modules
|
|
|
|
### Pitfall 3: Forgetting to Test Both Modes
|
|
**Problem:** You change `final_video.py`, test with Reddit, declare done.
|
|
Threads mode breaks because you didn't test it.
|
|
|
|
**Solution:**
|
|
```bash
|
|
# Test both before committing:
|
|
sed -i 's/platform = "threads"/platform = "reddit"/' config.toml
|
|
python3 main.py
|
|
# Check results/{subreddit}/
|
|
|
|
sed -i 's/platform = "reddit"/platform = "threads"/' config.toml
|
|
python3 main.py --post-id <id>
|
|
# Check results/threads/
|
|
```
|
|
|
|
### Pitfall 4: Assuming Config Keys Exist
|
|
**Problem:**
|
|
```python
|
|
# BAD:
|
|
lang = settings.config["reddit"]["thread"]["post_lang"]
|
|
```
|
|
**Will crash** if key doesn't exist.
|
|
|
|
**Solution:**
|
|
```python
|
|
# GOOD:
|
|
lang = (settings.config["settings"].get("post_lang") or
|
|
settings.config.get("reddit", {}).get("thread", {}).get("post_lang", ""))
|
|
```
|
|
|
|
---
|
|
|
|
## Code Review Checklist for Agents
|
|
|
|
Before marking work complete, verify:
|
|
|
|
- [ ] **No platform imports in main.py** — Uses factory only
|
|
- [ ] **Standard content_object dict** — All fetchers return same shape
|
|
- [ ] **Platform-specific logic isolated** — Only in `platforms/{platform}/`
|
|
- [ ] **Config fallback chains** — No hardcoded section names in generic code
|
|
- [ ] **Both modes tested** — Reddit AND Threads produce correct output
|
|
- [ ] **Docstrings updated** — New functions document platform assumptions
|
|
- [ ] **Error messages clear** — Include platform name + actionable guidance
|
|
- [ ] **Video dedup works** — No duplicate videos created
|
|
|
|
---
|
|
|
|
## Understanding Data Flow
|
|
|
|
### Happy Path: Fetch → TTS → Screenshot → Compose → Output
|
|
|
|
```
|
|
1. main.py:main()
|
|
└─→ platforms/__init__.py:get_content_object()
|
|
└─→ platforms/threads/fetcher.py:get_threads_content()
|
|
└─→ Returns: {thread_id, thread_title, comments, ...}
|
|
|
|
2. video_creation/voices.py:save_text_to_mp3()
|
|
└─→ TTS/engine_wrapper.py:process_text()
|
|
└─→ TTS/engine_wrapper.py:make_voice()
|
|
└─→ TTS/{provider}.py: {elevenlabs,tiktok,etc}
|
|
└─→ Returns: audio_length, comment_count
|
|
|
|
3. platforms/__init__.py:get_screenshot_fn()
|
|
└─→ platforms/threads/screenshot.py:get_screenshots_of_threads_posts()
|
|
└─→ Uses Playwright on threads.net
|
|
└─→ Saves: assets/temp/{thread_id}/png/{title,comment_0,etc}.png
|
|
|
|
4. video_creation/background.py
|
|
└─→ download_background_video() & download_background_audio()
|
|
└─→ Uses yt-dlp to fetch YouTube videos/audio
|
|
└─→ Saves to: assets/temp/{thread_id}/{video,audio}
|
|
|
|
5. video_creation/final_video.py:make_final_video()
|
|
└─→ Uses FFmpeg to compose everything
|
|
└─→ Reads: audio files, screenshot PNGs, background video
|
|
└─→ Writes: results/{thread_category}/{filename}.mp4
|
|
|
|
6. utils/videos.py:save_data()
|
|
└─→ Records video in videos.json for dedup
|
|
```
|
|
|
|
### Config Flow
|
|
|
|
```
|
|
config.toml (user settings)
|
|
↓
|
|
utils/settings.py:check_toml()
|
|
└─→ Validates against .config.template.toml schema
|
|
└─→ Returns: settings.config (dict)
|
|
|
|
Used by:
|
|
├─ main.py (platform selection)
|
|
├─ platforms/reddit/ (subreddit, etc.)
|
|
├─ platforms/threads/ (Graph API token, etc.)
|
|
├─ TTS/engine_wrapper.py (post_lang fallback)
|
|
├─ video_creation/ (theme, resolution, etc.)
|
|
└─ utils/videos.py (dedup behavior)
|
|
```
|
|
|
|
---
|
|
|
|
## Deployment Notes
|
|
|
|
### Python Version
|
|
- **Minimum:** 3.10
|
|
- **Tested:** 3.10, 3.11, 3.12
|
|
- **Reason:** F-strings, type hints, modern async patterns
|
|
|
|
### Critical Dependencies
|
|
- **reddit platform:** praw 7.8.1 (requires Reddit OAuth app)
|
|
- **threads platform:** requests (for Graph API calls)
|
|
- **screenshots:** playwright 1.49.1 (requires browser installation: `playwright install`)
|
|
- **video:** moviepy 2.2.1, ffmpeg-python 0.2.0 (requires FFmpeg system binary)
|
|
- **tts:** varies per provider (elevenlabs, aws_polly, openai, etc.)
|
|
|
|
### Versions That Caused Issues
|
|
- **yt-dlp==2026.3.17** — Doesn't exist (use 2025.10.14 or latest stable)
|
|
- **playwright without browser install** — Will crash on first screenshot
|
|
|
|
---
|
|
|
|
## When to Escalate
|
|
|
|
### Escalate to User if:
|
|
- User needs new platform support (only they know requirements)
|
|
- Config changes affect backward compatibility
|
|
- Performance optimization needed (only user knows acceptable limits)
|
|
- Security concern (token handling, credential storage, etc.)
|
|
|
|
### Safe to Implement as Agent:
|
|
- Bug fixes within existing architecture
|
|
- Adding new TTS providers
|
|
- Extending config options for existing platforms
|
|
- Performance optimizations (caching, parallelization)
|
|
- New filter/processing features that work platform-agnostically
|
|
- Documentation & refactoring
|
|
|
|
---
|
|
|
|
## Final Guidance
|
|
|
|
**Golden Rule:** The factory pattern is your friend. When in doubt, check if your change breaks the abstraction. If it does, rethink it.
|
|
|
|
**Test Obsessively:** Always run both Reddit and Threads modes. The codebase is designed for multi-platform support, and it's easy to break one platform while fixing another.
|
|
|
|
**Document Platform Assumptions:** If your code works differently for Reddit vs Threads, say so explicitly in docstrings and comments.
|
|
|
|
**Ask Yourself:** "Would this work for X/Twitter?" If no, it probably belongs in `platforms/threads/`, not in generic code.
|
|
|
|
Good luck, and happy contributing! 🎥
|