From e8d95dfa3c09ba79f1ed01b1fcc0483050f4edce Mon Sep 17 00:00:00 2001
From: Hong Phuc <hongphuc.dthp@gmail.com>
Date: Fri, 24 Apr 2026 01:20:42 +0700
Subject: [PATCH] Keep container workflow instructions aligned with the code

Document the Docker Compose workflow, persistent runtime paths, and container-specific GUI binding so future work on this branch follows the implemented setup rather than the old direct-Python assumptions.

Constraint: The repo now supports both host and container execution paths, and the agent guidance needs to reflect the new operational defaults.
Rejected: Leave AGENTS.md untouched | it would continue pointing contributors at stale runtime behavior.
Confidence: high
Scope-risk: narrow
Directive: Treat the Docker Compose commands as the default local workflow for GUI/CLI work on this branch.
Tested: Reviewed AGENTS.md against the implemented Docker files and runtime bootstrap.
Not-tested: No code-path changes; documentation-only update.
---
 AGENTS.md | 413 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 413 insertions(+)
 create mode 100644 AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..6e1634c
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,413 @@
+# AGENTS.md — VideoMakerBot Development Guide
+
+## Project Overview
+
+**VideoMakerBot** — Automated short-form video creator from social media content.
+
+**Status:** Production-ready, actively maintained (v3.4.0)
+**Language:** Python 3.10+
+**Platforms:** Reddit (original), Threads (NEW), X/Twitter (planned)
+
+### Core Mission
+Transforms social media threads (post + comments/replies) into complete short-form videos with:
+- AI-generated speech (7+ TTS providers)
+- UI screenshots (Playwright)
+- Background video/audio overlays
+- FFmpeg composition & output
+
+---
+
+## Architecture at a Glance
+
+```
+main.py (CLI)
+    ↓ [platform factory]
+    ├─→ reddit/subreddit.py [PRAW API]
+    └─→ platforms/threads/fetcher.py [Graph API]
+        ↓ [standard data dict]
+        ├─→ TTS/engine_wrapper.py [7+ providers]
+        ├─→ screenshot_downloader.py (Reddit)
+        │   or platforms/threads/screenshot.py (Threads)
+        ├─→ video_creation/background.py
+        └─→ video_creation/final_video.py [FFmpeg]
+            ↓
+            results/{category}/{video.mp4}
+```
+
+### Key Design: Platform Abstraction via Factory Pattern
+
+**Why:** Single codebase supports multiple platforms without tight coupling.
+
+**How:** `platforms/__init__.py` exports:
+- `get_content_object(POST_ID=None)` — routes to right fetcher
+- `get_screenshot_fn()` — routes to right screenshotter
+
+**Result:** Adding X/Twitter requires only: new module + config section + two `elif` branches.
+
+---
+
+## Data Contract: The "content_object" Dict
+
+All fetchers return this shape (defined in `platforms/__init__.py`):
+
+```python
+{
+    # Unique identifiers
+    "thread_id":       str,           # Used for temp folder: assets/temp/{id}/
+    "thread_category": str,           # "reddit", "threads", etc. → output folder
+
+    # Content
+    "thread_title":    str,           # TTS as title + output filename
+    "thread_url":      str,           # Playwright navigates here for screenshot
+    "is_nsfw":         bool,          # Content filter flag
+
+    # Replies/Comments (mutually exclusive with thread_post)
+    "comments": [
+        {
+            "comment_body": str,      # TTS per reply
+            "comment_url":  str,      # Playwright navigates here
+            "comment_id":   str,      # CSS selector ID or unique identifier
+        }
+    ],
+
+    # OR Story mode:
+    "thread_post":     str | list,    # Long-form text (no comments)
+}
+```
+
+**Why:** Loose coupling—TTS, backgrounds, and video composition don't need platform-specific logic.
+
+---
+
+## File Organization
+
+```
+VideoMakerBot/
+├── platforms/                      # Multi-platform abstraction
+│   ├── __init__.py                # Factory: get_content_object(), get_screenshot_fn()
+│   └── threads/                   # Threads (Meta) implementation
+│       ├── fetcher.py             # Graph API → content_object
+│       └── screenshot.py          # Playwright Threads screenshotter
+│
+├── reddit/                        # Reddit implementation (kept as-is)
+│   └── subreddit.py              # PRAW API → content_object + thread_category
+│
+├── video_creation/
+│   ├── final_video.py            # FFmpeg composition (platform-aware folder naming)
+│   ├── screenshot_downloader.py  # Playwright Reddit UI capturer
+│   ├── voices.py                 # TTS orchestrator (platform-agnostic)
+│   ├── background.py             # Video/audio downloader (platform-agnostic)
+│   └── data/
+│       ├── videos.json           # Dedup tracker
+│       ├── cookie-dark-mode.json # Reddit theme cookie
+│       └── cookie-threads.json   # Threads session cookie (auto-created)
+│
+├── TTS/                          # Text-to-Speech
+│   ├── engine_wrapper.py         # Provider abstraction + post_lang fallback
+│   ├── elevenlabs.py, aws_polly.py, etc. # 7+ provider implementations
+│
+├── utils/
+│   ├── settings.py               # Config loading + validation
+│   ├── videos.py                 # check_done() + check_done_by_id()
+│   ├── console.py                # Rich terminal output
+│   ├── .config.template.toml     # Config schema (platform sections)
+│   └── ... (id, voice, cleanup, etc.)
+│
+├── main.py                       # CLI entry (platform-routed via factory)
+├── GUI.py                        # Flask web UI (localhost:4000 in host mode, 0.0.0.0 in Docker)
+├── requirements.txt              # Dependencies
+└── AGENTS.md / AGENT.md          # This file + agent guidelines
+```
+
+---
+
+## Configuration
+
+**File:** `utils/.config.template.toml` (schema) → `config.toml` (user config)
+
+### Platform Selection
+```toml
+[settings]
+platform = "reddit"     # or "threads"
+post_lang = "es-cr"     # Optional: translation language (all platforms)
+```
+
+### Reddit Config
+```toml
+[reddit.creds]
+client_id = "..."       # OAuth app
+client_secret = "..."
+username = "..."
+password = "..."
+2fa = true/false
+
+[reddit.thread]
+subreddit = "AskReddit"
+post_id = ""            # Leave blank for auto-pick
+max_comment_length = 500
+min_comment_length = 1
+min_comments = 20
+blocked_words = "..."
+```
+
+### Threads Config (NEW)
+```toml
+[threads.creds]
+access_token = "EAABsbCS..."  # Meta Graph API token (60-day expiry)
+user_id = "12345678901234567"
+username = "your_insta"       # For Playwright login
+password = "your_password"
+
+[threads.thread]
+post_id = ""            # Leave blank for auto-pick
+max_reply_length = 500
+min_reply_length = 1
+min_replies = 5
+blocked_words = "..."
+```
+
+### Generic Settings
+```toml
+[settings]
+theme = "dark"
+resolution_w = 1080
+resolution_h = 1920
+storymode = false
+times_to_run = 1
+
+[settings.tts]
+voice_choice = "tiktok"     # or "elevenlabs", "awspolly", "googletranslate", etc.
+random_voice = true
+silence_duration = 0.3
+
+[settings.background]
+background_video = "minecraft"
+background_audio = "lofi"
+background_audio_volume = 0.15
+```
+
+---
+
+## Development Guidelines
+
+### ✅ DO:
+
+1. **Use platform factory in main.py**
+   ```python
+   from platforms import get_content_object, get_screenshot_fn
+   reddit_object = get_content_object(POST_ID)
+   screenshot_fn = get_screenshot_fn()
+   screenshot_fn(reddit_object, number_of_comments)
+   ```
+
+2. **Return standard content dict** from all fetchers
+   ```python
+   return {
+       "thread_id": ...,
+       "thread_category": ...,  # NEW: replaces hardcoded subreddit
+       "comments": [...]
+   }
+   ```
+
+3. **Use config fallback chains** for cross-platform keys
+   ```python
+   lang = (settings.config["settings"].get("post_lang") or
+           settings.config.get("reddit", {}).get("thread", {}).get("post_lang", ""))
+   ```
+
+4. **Read thread_category from dict** instead of config
+   ```python
+   # WRONG:
+   subreddit = settings.config["reddit"]["thread"]["subreddit"]
+
+   # RIGHT:
+   platform = settings.config["settings"].get("platform", "reddit")
+   if platform == "reddit":
+       subreddit = settings.config["reddit"]["thread"]["subreddit"]
+   else:
+       subreddit = reddit_obj.get("thread_category", platform)
+   ```
+
+5. **Test both platforms** after core pipeline changes
+   ```bash
+   # Test Reddit (must not regress)
+   sed -i 's/platform = "threads"/platform = "reddit"/' config.toml
+   python3 main.py
+
+   # Test Threads
+   sed -i 's/platform = "reddit"/platform = "threads"/' config.toml
+   python3 main.py --post-id <threads-id>
+   ```
+
+### ❌ DON'T:
+
+1. **Don't import platform modules directly** in main.py/utils
+   ```python
+   # WRONG: from reddit.subreddit import get_subreddit_threads
+   # RIGHT: from platforms import get_content_object
+   ```
+
+2. **Don't hardcode platform names** in generic modules
+   ```python
+   # WRONG in final_video.py:
+   subreddit = settings.config["reddit"]["thread"]["subreddit"]
+
+   # RIGHT:
+   subreddit = reddit_obj.get("thread_category", "unknown")
+   ```
+
+3. **Don't add platform-specific UI selectors** outside `platforms/{platform}/screenshot.py`
+   - Reddit selectors stay in `video_creation/screenshot_downloader.py`
+   - Threads selectors stay in `platforms/threads/screenshot.py`
+
+4. **Don't assume config keys exist** without fallback
+   ```python
+   # WRONG: lang = settings.config["reddit"]["thread"]["post_lang"]
+   # RIGHT: lang = settings.config.get("settings", {}).get("post_lang", "")
+   ```
+
+---
+
+## Platform-Specific Knowledge
+
+### Reddit
+- **API:** PRAW (Python Reddit API Wrapper)
+- **Auth:** OAuth app (client_id, secret) + username/password
+- **Screenshot:** Playwright on reddit.com/new.reddit.com
+  - Login form: `input[name="username"]`, `input[name="password"]`
+  - Post selector: `[data-test-id="post-content"]`
+  - Comment selector: `#t1_{comment_id}`
+- **NSFW:** `submission.over_18`
+- **Output folder:** `results/{subreddit}/`
+
+### Threads
+- **API:** Meta Graph API (v18.0+)
+- **Auth:** User access token (60-day lifetime) via https://developers.facebook.com/
+- **Screenshot:** Playwright on threads.net
+  - Login form: `input[autocomplete="username"]`, `input[autocomplete="current-password"]`
+  - Post selector: `article` (universal, more stable than Reddit)
+  - Cookies saved to: `video_creation/data/cookie-threads.json`
+- **NSFW:** API doesn't provide; always False
+- **Output folder:** `results/threads/`
+
+### Future: X/Twitter
+Create: `platforms/twitter/fetcher.py` + `platforms/twitter/screenshot.py` + config section
+Update: `platforms/__init__.py` with `elif platform == "twitter"` branches
+
+---
+
+## Extending the Project
+
+### Adding a New TTS Provider
+1. Create `TTS/my_provider.py` with a class implementing the TTS interface
+2. Add config keys to `[settings.tts]` in `.config.template.toml`
+3. Update `TTS/engine_wrapper.py` to call your provider
+4. Test with `settings.config["settings"]["tts"]["voice_choice"] = "my_provider"`
+
+### Adding a New Platform (e.g., X/Twitter)
+1. **Create fetcher:** `platforms/twitter/fetcher.py`
+   - Implement `get_twitter_content(POST_ID=None)` returning standard dict
+2. **Create screenshotter:** `platforms/twitter/screenshot.py`
+   - Implement `get_screenshots_of_twitter_posts(content_object, screenshot_num)`
+3. **Update config:** Add `[twitter.creds]` and `[twitter.thread]` sections
+4. **Update factory:** Add `elif platform == "twitter"` in `platforms/__init__.py`
+5. **Update CLI helper:** Add case to `_get_platform_post_id()` in `main.py`
+6. **Test:** Verify Reddit mode still works, test Twitter mode end-to-end
+
+**Zero changes needed to:** TTS, backgrounds, video composition, or utils.
+
+---
+
+## Debugging Tips
+
+### "No matching distribution found for yt-dlp==2026.3.17"
+→ yt-dlp uses date versioning (YYYY.M.DD, no leading zeros). Use `2025.10.14` (latest stable).
+
+### "Threads API: Invalid or expired access_token"
+→ Meta tokens expire every 60 days. Refresh at https://developers.facebook.com/tools/explorer/
+
+### Playwright timeout on Threads screenshot
+→ Login cookies corrupted or expired. Delete `video_creation/data/cookie-threads.json` to force fresh login next run.
+
+### "No eligible Threads posts found"
+→ Configure `[threads.thread].min_replies = 5` (or lower). Ensure your Threads account has public posts with replies.
+
+### Video dedup not working
+→ Check `video_creation/data/videos.json` is writable. Ensure `check_done_by_id()` is called before fetching content.
+
+---
+
+## Testing Checklist
+
+- [ ] Reddit mode: `platform = "reddit"` produces video to `results/{subreddit}/`
+- [ ] Threads mode: `platform = "threads"` produces video to `results/threads/`
+- [ ] Video dedup: Running same post_id twice skips second run
+- [ ] Translation: `post_lang = "es"` translates filenames
+- [ ] TTS providers: Test with different voice_choice values
+- [ ] Background selection: Custom background video/audio works
+- [ ] Story mode: storymode=true only uses thread_post, not comments
+- [ ] Error handling: Invalid credentials show clear messages
+
+---
+
+## Key Files to Know
+
+| File | Purpose |
+|------|---------|
+| `main.py` | CLI entry; orchestrates pipeline via factory |
+| `platforms/__init__.py` | Factory dispatch for multi-platform support |
+| `platforms/threads/fetcher.py` | Threads Graph API client |
+| `platforms/threads/screenshot.py` | Threads.net Playwright screenshotter |
+| `video_creation/final_video.py` | FFmpeg composition; platform-aware output naming |
+| `TTS/engine_wrapper.py` | TTS provider abstraction; post_lang fallback |
+| `utils/settings.py` | Config loading & validation |
+| `utils/videos.py` | Video dedup tracking |
+| `utils/.config.template.toml` | Config schema |
+| `requirements.txt` | Dependencies |
+
+---
+
+## Useful Commands
+
+```bash
+# Install dependencies
+pip install -r requirements.txt
+
+# Run CLI
+python3 main.py
+
+# Run with specific post
+python3 main.py <post_id>
+
+# Run Flask GUI
+python3 GUI.py
+
+# Check syntax
+python3 -m py_compile main.py platforms/threads/fetcher.py
+
+# Format code
+black main.py platforms/ utils/
+
+# Lint
+pylint main.py
+```
+
+## Docker Workflow
+
+- Use `docker compose build` to build the shared image for both CLI and GUI.
+- Use `docker compose up gui` to run the Flask app on port `4000`.
+- Use `docker compose run --rm cli` to run the video generator in a container.
+- The repo root is bind-mounted in Compose, so `config.toml`, `results/`, `assets/temp/`, `video_creation/data/videos.json`, and `utils/backgrounds.json` should persist across runs.
+- The GUI must bind to `0.0.0.0` in Docker; do not switch it back to `localhost` for container use.
+
+---
+
+## When You Get Stuck
+
+1. **"What does this module do?"** → Check imports in `main.py` or docstrings
+2. **"How do I add support for platform X?"** → See "Adding a New Platform" section above
+3. **"Why is my config not being read?"** → Check `utils/settings.py:check_toml()` and `.config.template.toml` schema
+4. **"Why isn't my TTS provider being called?"** → Check `TTS/engine_wrapper.py:make_voice()` and config `voice_choice`
+5. **"How do I debug the Playwright screenshot?"** → Uncomment `page.pause()` in screenshot downloader, run headful browser
+
+Good luck! 🚀