RedditVideoMakerBot/AGENT.md

# AGENT.md — Guidance for Agents & AI Working on VideoMakerBot

This document guides **agents, bots, and AI assistants** on how to work effectively with the VideoMakerBot codebase.

---

## Quick Start for Agents

### Core Principle
**VideoMakerBot uses a platform-agnostic factory pattern.** Always respect the abstraction:
- Don't import platform-specific modules (reddit/, threads/) directly
- Always use `platforms/__init__.py` factory functions
- Keep platform-specific logic in `platforms/{platform}/`

### The "Do This" Checklist
1. ✅ Read existing CLAUDE.md for architecture context
2. ✅ Use factory: `from platforms import get_content_object, get_screenshot_fn`
3. ✅ Return standard `content_object` dict from all fetchers
4. ✅ Test both Reddit and Threads modes before declaring completion
5. ✅ Use config fallback chains for cross-platform keys
6. ✅ Document platform-specific logic in docstrings

### The "Don't Do This" List
1. ❌ Import `reddit.subreddit` directly in main.py or generic modules
2. ❌ Hardcode subreddit/platform names in core video pipeline
3. ❌ Add platform-specific selectors outside `platforms/{platform}/`
4. ❌ Assume config keys exist without `.get()` and fallbacks
5. ❌ Modify screenshot_downloader.py for non-Reddit platforms

---

## Understanding the Codebase Structure

### Entry Point
**`main.py`** — Single CLI entry point using platform factory
- Calls `get_content_object(POST_ID)` from factory
- Calls `get_screenshot_fn()` from factory
- Everything else is platform-agnostic

### Platform Layer (`platforms/`)
- **`__init__.py`** — Factory dispatch functions (add new platforms here)
- **`threads/fetcher.py`** — Threads Graph API client (returns standard dict)
- **`threads/screenshot.py`** — Threads.net Playwright screenshotter

### Legacy Platform (`reddit/`)
- **`subreddit.py`** — PRAW API client (returns standard dict)
- No changes needed; called via factory

### Video Pipeline (`video_creation/`)
- **`final_video.py`** — FFmpeg composition (platform-aware output folder only)
- **`screenshot_downloader.py`** — Reddit Playwright screenshotter (not called for Threads)
- **`voices.py`** — TTS orchestration (platform-agnostic)
- **`background.py`** — Video/audio download (platform-agnostic)

### TTS Layer (`TTS/`)
- **`engine_wrapper.py`** — Provider abstraction (handles `post_lang` fallback)
- **`*.py`** — Individual provider implementations (elevenlabs, aws_polly, etc.)

### Config & Utils (`utils/`)
- **`settings.py`** — TOML config loading & validation
- **`videos.py`** — Dedup tracking (`check_done()` + `check_done_by_id()`)
- **`.config.template.toml`** — Config schema with `[settings]`, `[reddit.*]`, `[threads.*]`, `[ai]`

---

## How to Approach Common Tasks

### Adding a New Social Platform (e.g., X/Twitter)

**Steps:**
1. Create `platforms/twitter/fetcher.py`:
   ```python
   def get_twitter_content(POST_ID=None) -> dict:
       """Fetch post + replies, return standard content_object."""
       # Implement API fetching logic here
       return {
           "thread_id": ...,
           "thread_category": "twitter",  # NEW: generic field for output folder
           "thread_title": ...,
           "thread_url": ...,
           "comments": [...]
       }
   ```

2. Create `platforms/twitter/screenshot.py`:
   ```python
   def get_screenshots_of_twitter_posts(content_object: dict, screenshot_num: int):
       """Use Playwright to screenshot X/Twitter posts."""
       # Implement Playwright logic here
   ```

3. Update `platforms/__init__.py`:
   ```python
   elif platform == "twitter":
       from platforms.twitter.fetcher import get_twitter_content
       return get_twitter_content(POST_ID)
   ```

4. Add config section to `utils/.config.template.toml`:
   ```toml
   [twitter.creds]
   api_key = { ... }
   api_secret = { ... }

   [twitter.thread]
   post_id = { ... }
   ```

5. Update `main.py` helper:
   ```python
   elif platform == "twitter":
       return config.get("twitter", {}).get("thread", {}).get("post_id", "")
   ```

6. **Zero changes needed to:** TTS, backgrounds, video composition, utils.

**Verification:**
```bash
# Test Reddit (regression check)
sed -i 's/platform = "twitter"/platform = "reddit"/' config.toml
python3 main.py
# Verify results/{subreddit}/ output

# Test Twitter
sed -i 's/platform = "reddit"/platform = "twitter"/' config.toml
python3 main.py --post-id <twitter-id>
# Verify results/twitter/ output
```

---

### Modifying the Video Pipeline

**Scenario:** You need to change FFmpeg composition or add a new processing step.

**Approach:**
1. Check which data the modified code consumes (`content_object` dict)
2. Verify it works with both Reddit and Threads content structures
3. If platform-specific: move logic to `platforms/{platform}/`
4. If generic: keep in `video_creation/`
5. Test both modes before merging

**Example:** Adding video filters
```python
# In final_video.py (generic, works for all platforms)
def apply_filter(video_clip, filter_type):
    # No platform-specific logic here
    return video_clip.filter(...)

# Test:
# - Reddit mode produces filtered video
# - Threads mode produces filtered video
```

---

### Fixing a Bug in Config Handling

**Scenario:** `post_lang` is not being applied correctly.

**Debug Path:**
1. Check `utils/settings.py` — how is config loaded?
2. Check `TTS/engine_wrapper.py:182` — uses fallback chain:
   ```python
   lang = (settings.config["settings"].get("post_lang") or
           settings.config.get("reddit", {}).get("thread", {}).get("post_lang", ""))
   ```
3. Check `video_creation/final_video.py:78` — same fallback logic
4. If still broken: verify `utils/.config.template.toml` has the key defined
5. Test both platforms with `post_lang = "es"` in config

---

### Adding Support for a New TTS Provider

**Scenario:** User wants Whisper TTS support.

**Steps:**
1. Create `TTS/whisper_tts.py`:
   ```python
   class WhisperTTS:
       def make_voice(self, text):
           # Call Whisper API
           return audio_bytes
   ```

2. Update `TTS/engine_wrapper.py:make_voice()`:
   ```python
   elif voice_choice == "whisper":
       from TTS.whisper_tts import WhisperTTS
       return WhisperTTS().make_voice(text)
   ```

3. Add config to `utils/.config.template.toml`:
   ```toml
   [settings.tts]
   whisper_api_key = { optional = true, ... }
   ```

4. Test:
   ```bash
   # In config.toml:
   voice_choice = "whisper"
   # Run: python3 main.py
   ```

---

## Common Pitfalls & How to Avoid Them

### Pitfall 1: Platform-Specific Code in Generic Modules
**Problem:**
```python
# BAD: In video_creation/final_video.py
subreddit = settings.config["reddit"]["thread"]["subreddit"]
```
**Will break** when platform = "threads" (no reddit.thread.subreddit).

**Solution:**
```python
# GOOD:
platform = settings.config["settings"].get("platform", "reddit")
if platform == "reddit":
    category = settings.config["reddit"]["thread"]["subreddit"]
else:
    category = reddit_obj.get("thread_category", platform)
```

### Pitfall 2: Hardcoding Selectors in Platform-Agnostic Code
**Problem:**
```python
# BAD: In video_creation/voices.py
element = page.locator("#t1_{comment_id}")  # Reddit-only selector!
```
**Will fail** when running Threads mode (different DOM).

**Solution:**
- Keep all Playwright logic in `platforms/{platform}/screenshot.py`
- Never hardcode selectors in generic modules

### Pitfall 3: Forgetting to Test Both Modes
**Problem:** You change `final_video.py`, test with Reddit, declare done.
Threads mode breaks because you didn't test it.

**Solution:**
```bash
# Test both before committing:
sed -i 's/platform = "threads"/platform = "reddit"/' config.toml
python3 main.py
# Check results/{subreddit}/

sed -i 's/platform = "reddit"/platform = "threads"/' config.toml
python3 main.py --post-id <id>
# Check results/threads/
```

### Pitfall 4: Assuming Config Keys Exist
**Problem:**
```python
# BAD:
lang = settings.config["reddit"]["thread"]["post_lang"]
```
**Will crash** if key doesn't exist.

**Solution:**
```python
# GOOD:
lang = (settings.config["settings"].get("post_lang") or
        settings.config.get("reddit", {}).get("thread", {}).get("post_lang", ""))
```

---

## Code Review Checklist for Agents

Before marking work complete, verify:

- [ ] **No platform imports in main.py** — Uses factory only
- [ ] **Standard content_object dict** — All fetchers return same shape
- [ ] **Platform-specific logic isolated** — Only in `platforms/{platform}/`
- [ ] **Config fallback chains** — No hardcoded section names in generic code
- [ ] **Both modes tested** — Reddit AND Threads produce correct output
- [ ] **Docstrings updated** — New functions document platform assumptions
- [ ] **Error messages clear** — Include platform name + actionable guidance
- [ ] **Video dedup works** — No duplicate videos created

---

## Understanding Data Flow

### Happy Path: Fetch → TTS → Screenshot → Compose → Output

```
1. main.py:main()
   └─→ platforms/__init__.py:get_content_object()
       └─→ platforms/threads/fetcher.py:get_threads_content()
           └─→ Returns: {thread_id, thread_title, comments, ...}

2. video_creation/voices.py:save_text_to_mp3()
   └─→ TTS/engine_wrapper.py:process_text()
       └─→ TTS/engine_wrapper.py:make_voice()
           └─→ TTS/{provider}.py: {elevenlabs,tiktok,etc}
               └─→ Returns: audio_length, comment_count

3. platforms/__init__.py:get_screenshot_fn()
   └─→ platforms/threads/screenshot.py:get_screenshots_of_threads_posts()
       └─→ Uses Playwright on threads.net
           └─→ Saves: assets/temp/{thread_id}/png/{title,comment_0,etc}.png

4. video_creation/background.py
   └─→ download_background_video() & download_background_audio()
       └─→ Uses yt-dlp to fetch YouTube videos/audio
           └─→ Saves to: assets/temp/{thread_id}/{video,audio}

5. video_creation/final_video.py:make_final_video()
   └─→ Uses FFmpeg to compose everything
       └─→ Reads: audio files, screenshot PNGs, background video
           └─→ Writes: results/{thread_category}/{filename}.mp4

6. utils/videos.py:save_data()
   └─→ Records video in videos.json for dedup
```

### Config Flow

```
config.toml (user settings)
    ↓
utils/settings.py:check_toml()
    └─→ Validates against .config.template.toml schema
        └─→ Returns: settings.config (dict)

            Used by:
            ├─ main.py (platform selection)
            ├─ platforms/reddit/ (subreddit, etc.)
            ├─ platforms/threads/ (Graph API token, etc.)
            ├─ TTS/engine_wrapper.py (post_lang fallback)
            ├─ video_creation/ (theme, resolution, etc.)
            └─ utils/videos.py (dedup behavior)
```

---

## Deployment Notes

### Python Version
- **Minimum:** 3.10
- **Tested:** 3.10, 3.11, 3.12
- **Reason:** F-strings, type hints, modern async patterns

### Critical Dependencies
- **reddit platform:** praw 7.8.1 (requires Reddit OAuth app)
- **threads platform:** requests (for Graph API calls)
- **screenshots:** playwright 1.49.1 (requires browser installation: `playwright install`)
- **video:** moviepy 2.2.1, ffmpeg-python 0.2.0 (requires FFmpeg system binary)
- **tts:** varies per provider (elevenlabs, aws_polly, openai, etc.)

### Versions That Caused Issues
- **yt-dlp==2026.3.17** — Doesn't exist (use 2025.10.14 or latest stable)
- **playwright without browser install** — Will crash on first screenshot

---

## When to Escalate

### Escalate to User if:
- User needs new platform support (only they know requirements)
- Config changes affect backward compatibility
- Performance optimization needed (only user knows acceptable limits)
- Security concern (token handling, credential storage, etc.)

### Safe to Implement as Agent:
- Bug fixes within existing architecture
- Adding new TTS providers
- Extending config options for existing platforms
- Performance optimizations (caching, parallelization)
- New filter/processing features that work platform-agnostically
- Documentation & refactoring

---

## Final Guidance

**Golden Rule:** The factory pattern is your friend. When in doubt, check if your change breaks the abstraction. If it does, rethink it.

**Test Obsessively:** Always run both Reddit and Threads modes. The codebase is designed for multi-platform support, and it's easy to break one platform while fixing another.

**Document Platform Assumptions:** If your code works differently for Reddit vs Threads, say so explicitly in docstrings and comments.

**Ask Yourself:** "Would this work for X/Twitter?" If no, it probably belongs in `platforms/threads/`, not in generic code.

Good luck, and happy contributing! 🎥