- Added a module to handle text-to-speech for screenshots, generating MP3 files and updating post objects with audio paths and durations.
- Introduced a module to assemble videos from screenshots and TTS audio, including background video and audio management.
- Created the entry point for the manual pipeline, supporting commands to initialize posts, render videos, and list post statuses.
- Updated background audio and video configurations in JSON files, removing outdated entries and adding new options.
- Adjusted file permissions for several utility scripts to ensure proper execution.

pull/2558/head
parent 569f25098a
commit 2301f9c3b4

@ -0,0 +1,235 @

# 🧠 Brainstorming: Manual Screenshot → Video Pipeline

> **Context**: The Reddit API cannot be used. A new workflow is needed that lets the user capture screenshots from **Reddit, Threads (Meta), X (Twitter)** by hand, after which the system generates the video automatically.
>
> **Status**: ✅ **IMPLEMENTED**. Phase 1 is complete.

---

## 1. Core Problem Analysis

### Where does the current flow depend on the Reddit API?

| Step | Depends on API? | Details |
|------|:---:|----------|
| Fetch thread + comments | ✅ **YES** | `reddit/subreddit.py`: PRAW login, fetch post, filter comments |
| **Text for TTS** | ✅ **YES** | `TTS/engine_wrapper.py`: reads text from `reddit_object["comments"]` |
| **Screenshot** | ✅ **YES** | `screenshot_downloader.py`: Playwright logs in to Reddit, navigates, captures |
| Background video/audio | ❌ NO | `background.py`: uses YouTube only, unrelated to Reddit |
| Final video assembly | ❌ NO | `final_video.py`: uses FFmpeg only, but requires the `reddit_obj` dict |
| Video tracking | ❌ NO | `videos.json`: stores metadata only |

**Conclusion**: The **first three steps** (fetch → TTS text → screenshot) must be replaced entirely with a manual flow.

---

## 2. Chosen Approach: **.mp3 preferred, .txt fallback**

The user supplies **audio files (.mp3) directly** along with the screenshots. TTS is only a fallback for when just a `.txt` file is present.

```
User captures screenshot + supplies .mp3 audio → Video
                      (or .txt fallback → TTS → Video)
```

### Audio priority:

```
.mp3 present? ──YES──▶ Use the .mp3 directly (skip TTS)
    │
    NO
    │
.txt present? ──YES──▶ TTS generates .mp3 from the text (fallback)
    │
    NO
    │
    ▼
⚠ SKIP (screenshot has no audio)
```

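The audio priority above can be sketched as a small lookup. This is a minimal illustration, not the code in `manual/tts_processor.py`; the function name `resolve_audio_source` and its return shape are assumptions made for this example.

```python
from pathlib import Path
from typing import Optional, Tuple


def resolve_audio_source(post_dir: Path, stem: str) -> Optional[Tuple[str, Path]]:
    """Pick the audio source for one screenshot stem such as "0_title".

    Hypothetical helper: returns ("mp3", path) when a pre-recorded file exists,
    ("txt", path) when only TTS text exists, and None when the screenshot
    has no audio at all and should be skipped.
    """
    mp3_path = post_dir / f"{stem}.mp3"
    if mp3_path.is_file():
        return ("mp3", mp3_path)  # pre-recorded audio wins, TTS is skipped

    txt_path = post_dir / f"{stem}.txt"
    if txt_path.is_file():
        return ("txt", txt_path)  # fallback: text must go through TTS

    return None  # neither file present: skip this screenshot
```

Checking the `.mp3` first keeps the TTS engines entirely optional for users who record their own audio.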

---

## 3. Implemented Structure

### 3.1 Directories

```
RedditVideoMakerBot/
├── main.py                      # Old flow (unchanged)
├── manual_main.py               # 🆕 Entry point for the new flow
│
├── manual/                      # 🆕 New-flow module (fully separate)
│   ├── __init__.py              # Module docstring
│   ├── scanner.py               # Scans folders, validates (.png + .mp3 + .txt)
│   ├── tts_processor.py         # Audio processor (.mp3 preferred, TTS fallback)
│   └── video_builder.py         # FFmpeg pipeline (libx264 CPU)
│
├── manual_posts/                # 🆕 Input directory
│   └── post_001/
│       ├── meta.json            # (optional) metadata
│       ├── 0_title.png          # Post screenshot
│       ├── 0_title.mp3          # Audio (pre-recorded)
│       ├── 1_comment.png        # Comment screenshot
│       └── 1_comment.mp3        # Comment audio
│
├── manual_results/              # 🆕 Output directory
│   └── post_001.mp4
│
├── reddit/                      # Old flow (unchanged)
├── TTS/                         # Shared: common TTS engines (fallback)
├── video_creation/              # Old flow (unchanged)
└── utils/                       # Shared: common utilities
```

### 3.2 File Naming Rules

```
<index>_<type>.<ext>
```

| Pattern | Meaning | Required? |
|---------|----------|-----------|
| `0_title.png` | Screenshot of the main post | ✅ Required |
| `0_title.mp3` | Pre-recorded audio | ✅ (or .txt) |
| `0_title.txt` | Text for the TTS fallback | Fallback |
| `1_comment.png` | Screenshot of comment 1 | Optional |
| `1_comment.mp3` | Audio for comment 1 | ✅ (or .txt) |
| `meta.json` | Metadata | Optional |

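A scanner following the `<index>_<type>.<ext>` rule could parse file names roughly as below. This is a sketch under assumptions: the regular expression, the `AssetName` tuple, and `parse_asset_name` are illustrative names, and the real `manual/scanner.py` may accept a different set of types or extensions.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: numeric index, underscore, alphabetic type, one of three extensions.
ASSET_NAME = re.compile(r"^(?P<index>\d+)_(?P<kind>[A-Za-z]+)\.(?P<ext>png|mp3|txt)$")


class AssetName(NamedTuple):
    index: int  # ordering key; 0 is the title/main post
    kind: str   # e.g. "title" or "comment"
    ext: str    # "png", "mp3" or "txt"


def parse_asset_name(filename: str) -> Optional[AssetName]:
    """Return the parsed parts, or None for files such as meta.json."""
    match = ASSET_NAME.match(filename)
    if match is None:
        return None
    return AssetName(int(match["index"]), match["kind"], match["ext"])
```

Converting the index to `int` means `10_comment.png` sorts after `2_comment.png`, which plain string sorting would get wrong.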
### 3.3 `post_object`: Data Structure

```python
post_object = {
    "post_id": "post_001",
    "platform": "reddit",           # reddit | threads | x | other
    "title": "What's the most...",
    "author": "u/example_user",
    "url": "https://...",
    "post_dir": "manual_posts/post_001",

    "screenshots": [
        {
            "index": 0,
            "type": "title",
            "image_path": "manual_posts/post_001/0_title.png",
            "text": "",             # From the .txt file (if present)
            "audio_path": "manual_posts/post_001/0_title.mp3",  # From the .mp3 file
            "audio_duration": 3.5,  # Measured after processing
        },
        {
            "index": 1,
            "type": "comment",
            "image_path": "manual_posts/post_001/1_comment.png",
            "text": "",
            "audio_path": "manual_posts/post_001/1_comment.mp3",
            "audio_duration": 5.2,
        },
    ],

    "total_duration": 8.7,
    "output_path": "manual_results/post_001.mp4",
}
```

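In the example above, `total_duration` equals the sum of the per-screenshot `audio_duration` values (3.5 + 5.2 = 8.7). A minimal sketch of that relationship follows; the helper name `with_total_duration` and the rounding to two decimals are assumptions, not something the source specifies.

```python
def with_total_duration(post_object: dict) -> dict:
    """Return a copy of post_object with total_duration recomputed.

    Hypothetical helper: sums audio_duration over all screenshots and
    rounds to two decimals to hide floating-point noise.
    """
    total = sum(shot["audio_duration"] for shot in post_object["screenshots"])
    return {**post_object, "total_duration": round(total, 2)}
```

Returning a copy rather than mutating the input keeps the scan step and the audio step easy to test separately.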
### 3.4 Processing Flow

```mermaid
flowchart TD
    A["manual_main.py"] --> B{"Command?"}

    B -->|"render"| G["Scan manual_posts/"]
    G --> H["Validate: image + audio/text present?"]
    H --> I["Build post_object from files"]
    I --> J{".mp3 present?"}
    J -->|"YES"| K["Use .mp3 directly"]
    J -->|"NO, .txt present"| L["TTS: text → .mp3"]
    K --> M["Randomly pick background video + audio"]
    L --> M
    M --> N["FFmpeg: combine images + audio + background"]
    N --> O["Output → manual_results/"]

    B -->|"render --all"| P["Loop over all folders"]
    P --> G

    B -->|"init"| Q["Create folder + meta.json"]
    B -->|"list"| R["List posts + status"]
```

---

## 4. Old Flow vs New Flow

| Aspect | Old Flow (`main.py`) | New Flow (`manual_main.py`) |
|--------|---------------------|---------------------------|
| **Data source** | Reddit API (PRAW) | Manual screenshots + audio files |
| **Screenshot** | Playwright auto-capture | Captured by the user |
| **Audio source** | TTS from comment text | **User-supplied .mp3** (or .txt → TTS) |
| **Platform** | Reddit only | Reddit + Threads + X + any |
| **TTS engines** | Required | Optional (fallback for .txt only) |
| **Background** | Hardcoded YouTube list | **Random from local folder** (YouTube fallback) |
| **Encoder** | `h264_nvenc` (GPU) | `libx264` (CPU) |
| **Config** | `config.toml` (template-based) | `config.toml` `[manual]` section + built-in defaults |
| **Output** | `results/<subreddit>/` | `manual_results/` |
| **Tracking** | `videos.json` | `videos.json` (shared) |

---

## 5. Config

```toml
[manual]
input_dir = "manual_posts"
output_dir = "manual_results"
encoder = "libx264"
resolution_w = 1080
resolution_h = 1920
opacity = 0.9
background_video = "random"                         # "random" or a specific name (e.g. "minecraft")
background_audio = "random"                         # "random" or a specific name (e.g. "lofi")
background_video_dir = "assets/backgrounds/video"   # Directory holding local background videos
background_audio_dir = "assets/backgrounds/audio"   # Directory holding local background music
background_audio_volume = 0.15
max_video_length = 120
```

**Note**: The `[manual]` section is optional. If it is absent, the built-in defaults are used.

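One way to make the `[manual]` section optional is to merge whatever the user supplied over a defaults table. The sketch below assumes the built-in defaults match the sample values shown above; `MANUAL_DEFAULTS` and `resolve_manual_config` are illustrative names, not the project's actual API.

```python
# Assumed defaults, copied from the sample [manual] section.
MANUAL_DEFAULTS = {
    "input_dir": "manual_posts",
    "output_dir": "manual_results",
    "encoder": "libx264",
    "resolution_w": 1080,
    "resolution_h": 1920,
    "opacity": 0.9,
    "background_video": "random",
    "background_audio": "random",
    "background_video_dir": "assets/backgrounds/video",
    "background_audio_dir": "assets/backgrounds/audio",
    "background_audio_volume": 0.15,
    "max_video_length": 120,
}


def resolve_manual_config(config: dict) -> dict:
    """Overlay the user's [manual] table (if any) on the defaults."""
    overrides = config.get("manual") or {}  # missing or empty section is fine
    return {**MANUAL_DEFAULTS, **overrides}
```

For example, `resolve_manual_config({"manual": {"encoder": "h264_nvenc"}})` changes only the encoder and leaves every other key at its default.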
### Background: random pick from a local folder

Drop background video/audio files into these directories and the system picks one at random on every render:

```
assets/backgrounds/video/   ← Put .mp4/.mkv/.webm/.avi/.mov files here
assets/backgrounds/audio/   ← Put .mp3/.wav/.ogg/.m4a/.flac files here
```

- **Local files present** → one is picked at random
- **No local files** → falls back to downloading from YouTube (the old list)

The TTS fallback uses the settings from `[settings.tts]` (default: GoogleTranslate, no API key needed).

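The local-folder selection described above might look like the following sketch. `pick_local_background` is a hypothetical helper; a `None` result stands for the "no local files" case in which the caller would fall back to the YouTube list.

```python
import random
from pathlib import Path
from typing import Optional

# Extensions taken from the folder listing above.
VIDEO_EXTS = {".mp4", ".mkv", ".webm", ".avi", ".mov"}
AUDIO_EXTS = {".mp3", ".wav", ".ogg", ".m4a", ".flac"}


def pick_local_background(
    folder: Path, extensions: set, rng: Optional[random.Random] = None
) -> Optional[Path]:
    """Return a random matching file from folder, or None if there is none."""
    if not folder.is_dir():
        return None
    candidates = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    if not candidates:
        return None  # caller falls back to the YouTube download
    return (rng or random).choice(candidates)
```

Passing a seeded `random.Random` as `rng` makes the choice reproducible in tests, while production calls can omit it.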

---

## 6. Decisions Log

| Question | Decision |
|----------|------------|
| Audio source | **.mp3 preferred**, .txt falls back to TTS |
| Background | **Random from local folder**, YouTube fallback |
| Encoder | `libx264` (CPU), since no NVIDIA GPU is available |
| Config | `[manual]` section in `config.toml` |
| Thumbnail | Skipped |
| Video tracking | Shared `videos.json` file |
| OCR (Phase 2) | EN + VI, using EasyOCR |

---

## 7. Phases

| Phase | Status | Description |
|-------|:---:|--------|
| **Phase 1: Core** | ✅ Done | .png + .mp3 → Video (+ .txt TTS fallback) |
| Phase 2: OCR | ⏳ Planned | Auto-read text from screenshots (EN + VI) |
| Phase 3: GUI | ⏳ Planned | Flask web interface for the manual flow |

---

> 📝 **Summary**: The `manual/` module is fully separate. Main input: screenshots (.png) + audio (.mp3). TTS is only a fallback when .txt is used. Background functions are reused from the old code. Videos are written to `manual_results/`. Platform-agnostic.

@ -0,0 +1,139 @

# 📖 Manual Pipeline User Guide

> **Summary**: Create videos from hand-captured screenshots (Reddit, Threads, X) without any API.

---

## 🚀 Quick Start (3 steps)

### Step 1: Create a folder for the new post

```bash
cd /home/minhvu/projects/RedditVideoMakerBot
python manual_main.py init my_first_post --platform reddit
```

Result:

```
manual_posts/my_first_post/
├── meta.json          ← (optional) metadata
├── 0_title.txt        ← Edit the TTS text here
└── 1_comment.txt      ← Edit the comment text here
```

### Step 2: Add screenshots + text

1. **Capture a screenshot** of the post → save it as `0_title.png`
2. **Capture screenshots** of the comments → save them as `1_comment.png`, `2_comment.png`, ...
3. **Edit the matching `.txt` files** and enter the text the bot will read aloud

```
manual_posts/my_first_post/
├── meta.json
├── 0_title.png        ← Post screenshot
├── 0_title.txt        ← "What's the most underrated life hack?"
├── 1_comment.png      ← Screenshot of comment 1
├── 1_comment.txt      ← "I always put my phone on airplane mode..."
├── 2_comment.png      ← Screenshot of comment 2
└── 2_comment.txt      ← "Using a binder clip as a phone stand..."
```

> [!IMPORTANT]
> Every `.png` file **must** have a `.txt` file with the same index. Index `0` is always the title/main post.

### Step 3: Render the video

```bash
python manual_main.py render my_first_post
```

The video is saved to `manual_results/my_first_post.mp4`.

---

## 📋 All Commands

| Command | Description |
|---------|--------|
| `python manual_main.py init <post_id>` | Create a new folder with template files |
| `python manual_main.py init <post_id> --platform threads` | Create a folder for a Threads post |
| `python manual_main.py render <post_id>` | Render one post to video |
| `python manual_main.py render --all` | Render every post not yet rendered |
| `python manual_main.py render <post_id> --force` | Re-render (even if it was rendered before) |
| `python manual_main.py list` | List all posts + status |

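A command surface like the table above maps naturally onto `argparse` subcommands. The sketch below only mirrors the documented commands and flags; the default platform, the `choices` list, and the `render_all` attribute name are assumptions, and the real `manual_main.py` may be structured differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the init / render / list subcommands."""
    parser = argparse.ArgumentParser(prog="manual_main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    init = sub.add_parser("init", help="Create a new post folder with template files")
    init.add_argument("post_id")
    # Assumed default and choices, based on the platform values in post_object.
    init.add_argument("--platform", default="reddit",
                      choices=["reddit", "threads", "x", "other"])

    render = sub.add_parser("render", help="Render one post, or all pending posts")
    render.add_argument("post_id", nargs="?")  # optional so that --all can stand alone
    render.add_argument("--all", action="store_true", dest="render_all")
    render.add_argument("--force", action="store_true")

    sub.add_parser("list", help="List posts and their status")
    return parser
```

With this layout, `build_parser().parse_args(["render", "--all"])` yields `post_id=None` and `render_all=True`, so the caller can branch on either field.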

---

## 📁 File Naming Rules

```
<index>_<type>.<ext>
```

| File | Meaning |
|------|----------|
| `0_title.png` | Screenshot of the main post (required) |
| `0_title.txt` | TTS text for the post (required) |
| `1_comment.png` | Screenshot of comment 1 |
| `1_comment.txt` | TTS text for comment 1 |
| `N_comment.png/txt` | Nth comment |
| `meta.json` | Metadata (optional) |

> [!TIP]
> `.txt` files support comment lines starting with `#`; these lines are skipped during TTS.

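The `#` comment rule from the tip above can be illustrated with a short filter. This is a sketch, not the project's reader: `read_tts_text` is a hypothetical name, and treating indented `#` lines as comments and joining the remaining lines with spaces are assumptions.

```python
def read_tts_text(raw: str) -> str:
    """Drop blank lines and lines starting with '#', then join the rest."""
    kept = [
        line.strip()
        for line in raw.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return " ".join(kept)
```

For instance, a file containing a `# source: r/AskReddit` line followed by the post title would send only the title to the TTS engine.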

---

## ⚙️ Configuration

Add a `[manual]` section to `config.toml` (or leave it out and the bot will use the defaults):

```toml
[manual]
input_dir = "manual_posts"           # Input directory
output_dir = "manual_results"        # Output directory
encoder = "libx264"                  # CPU encoder (or h264_nvenc if a GPU is available)
resolution_w = 1080                  # Video width
resolution_h = 1920                  # Video height (1080x1920 = portrait)
opacity = 0.9                        # Opacity of the screenshot overlay
background_video = "minecraft"       # Background video
background_audio = "lofi"            # Background audio
background_audio_volume = 0.15       # Background audio volume (0 = off)
max_video_length = 120               # Maximum video length (seconds)
```

The TTS engine is taken from the `[settings.tts]` section of `config.toml`. The default is **GoogleTranslate** (no API key needed).

---

## 🏗️ Module Architecture

```
manual/
├── __init__.py          # Module docstring
├── scanner.py           # Scans folders, validates, builds post_object
├── tts_processor.py     # TTS: text → MP3 (reuses TTS/ engines)
└── video_builder.py     # FFmpeg: screenshots + audio → video

manual_main.py           # CLI entry point (init, render, list)
```

**Fully separate** from the old flow (`main.py`). No file belonging to the old flow is modified.

### Files created/modified

| File | Action | Description |
|------|--------|--------|
| [manual/__init__.py](file:///home/minhvu/projects/RedditVideoMakerBot/manual/__init__.py) | 🆕 Created | Module init |
| [manual/scanner.py](file:///home/minhvu/projects/RedditVideoMakerBot/manual/scanner.py) | 🆕 Created | Folder scanner & validator |
| [manual/tts_processor.py](file:///home/minhvu/projects/RedditVideoMakerBot/manual/tts_processor.py) | 🆕 Created | TTS processor |
| [manual/video_builder.py](file:///home/minhvu/projects/RedditVideoMakerBot/manual/video_builder.py) | 🆕 Created | Video assembler |
| [manual_main.py](file:///home/minhvu/projects/RedditVideoMakerBot/manual_main.py) | 🆕 Created | CLI entry point |
| [.gitignore](file:///home/minhvu/projects/RedditVideoMakerBot/.gitignore) | ✏️ Updated | Adds `manual_posts/`, `manual_results/` |

---

## ⚠️ Notes

1. **FFmpeg** must already be installed on the system
2. **Background video** is downloaded from YouTube automatically on first run (requires internet)
3. `config.toml` may be empty; the bot then uses built-in defaults (GoogleTranslate TTS)
4. The default encoder is `libx264` (CPU), which suits machines without an NVIDIA GPU

@ -0,0 +1,483 @

# 📦 RedditVideoMakerBot: Project Init Documentation

> **Version**: 3.4.0
> **Original authors**: Lewis Menelaws & [TMRRW](https://tmrrwinc.ca)
> **License**: GPL + Roboto Fonts (Apache 2.0)
> **Python**: 3.10 / 3.11 / 3.12

---

## 1. Overview

RedditVideoMakerBot is a tool that automates the creation of short videos (TikTok/YouTube Shorts/Instagram Reels) from Reddit posts. The bot will:

1. **Fetch a post** from a subreddit (via the Reddit API / PRAW)
2. **Convert text to speech** (TTS, 7 different engines)
3. **Capture screenshots** of the post/comments with Playwright
4. **Download & trim background video/audio** from YouTube
5. **Combine everything** into a finished video with FFmpeg

Final result: an `.mp4` file in the `results/<subreddit>/` directory.

---

## 2. Directory Structure

```
RedditVideoMakerBot/
├── main.py                          # 🚀 Main entry point
├── GUI.py                           # 🖥️ Web GUI (Flask, port 4000)
├── config.toml                      # ⚙️ Configuration file (user-generated)
├── ptt.py                           # 🔊 Helper script that lists system voices
├── requirements.txt                 # 📦 Python dependencies
├── Dockerfile                       # 🐳 Docker support (python:3.10-slim)
├── build.sh / run.sh / run.bat      # 📜 Quick-run scripts
├── install.sh                       # 📜 Auto-installer (Linux/macOS)
│
├── reddit/                          # 📡 Module that fetches data from Reddit
│   └── subreddit.py                 # Logs in to Reddit, fetches threads & comments
│
├── TTS/                             # 🗣️ Text-to-Speech module (7 engines)
│   ├── engine_wrapper.py            # TTSEngine: common wrapper for all TTS providers
│   ├── TikTok.py                    # TikTok TTS API
│   ├── aws_polly.py                 # AWS Polly (boto3)
│   ├── elevenlabs.py                # ElevenLabs API
│   ├── openai_tts.py                # OpenAI TTS API
│   ├── GTTS.py                      # Google Translate TTS (gTTS)
│   ├── pyttsx.py                    # pyttsx3 (offline, system voices)
│   └── streamlabs_polly.py          # Streamlabs Polly
│
├── video_creation/                  # 🎬 Video creation module
│   ├── voices.py                    # Orchestrator: selects the TTS provider & runs it
│   ├── screenshot_downloader.py     # Captures Reddit screenshots with Playwright
│   ├── background.py                # Downloads & trims background video/audio (yt-dlp)
│   ├── final_video.py               # Combines everything into a video (FFmpeg pipeline)
│   └── data/                        # Cookie files + videos.json (tracking)
│       ├── cookie-dark-mode.json
│       ├── cookie-light-mode.json
│       └── videos.json
│
├── utils/                           # 🛠️ Utilities
│   ├── settings.py                  # Reads/validates config.toml against the template
│   ├── .config.template.toml        # Config template (defines all fields)
│   ├── console.py                   # Rich console helpers (print_step, handle_input...)
│   ├── ai_methods.py                # AI similarity sorting (sentence-transformers)
│   ├── subreddit.py                 # Logic for picking a not-yet-done post + filters
│   ├── voice.py                     # sanitize_text(), rate limit, sleep_until()
│   ├── videos.py                    # check_done(), save_data(): tracking
│   ├── cleanup.py                   # Deletes temp files
│   ├── ffmpeg_install.py            # Installs FFmpeg automatically if missing
│   ├── imagenarator.py              # Renders images for storymode method 1
│   ├── thumbnail.py                 # Creates the video thumbnail
│   ├── fonts.py                     # Font size helpers
│   ├── id.py                        # extract_id(): sanitizes the Reddit thread ID
│   ├── posttextparser.py            # Splits post text into segments
│   ├── playwright.py                # Helper that clears cookies
│   ├── version.py                   # Checks GitHub for a newer version
│   ├── gui_utils.py                 # Utils for the Flask GUI
│   ├── background_videos.json       # List of background videos (YouTube URLs)
│   └── background_audios.json       # List of background audios (YouTube URLs)
│
├── GUI/                             # 🌐 Flask templates (HTML)
│   ├── layout.html                  # Base template
│   ├── index.html                   # Home page: list of generated videos
│   ├── settings.html                # Settings page
│   ├── backgrounds.html             # Background management
│   └── voices/                      # Voice sample files
│
├── fonts/                           # 🔤 Roboto font files
│   ├── Roboto-Regular.ttf
│   ├── Roboto-Bold.ttf
│   ├── Roboto-Medium.ttf
│   ├── Roboto-Black.ttf
│   └── LICENSE.txt
│
├── assets/                          # 🎨 Static assets
│   ├── title_template.png           # Image template for the fancy thumbnail
│   └── backgrounds/                 # Downloaded background files (video/audio)
│
├── results/                         # 📁 Output videos (auto-created)
│   └── <subreddit>/
│       ├── <video>.mp4
│       ├── OnlyTTS/                 # Videos without background audio
│       └── thumbnails/              # Generated thumbnails
│
└── threads/                         # 📂 (Unused/placeholder)
```

---

## 3. Processing Pipeline (Main Flow)

```mermaid
flowchart TD
    A["main.py: Entry Point"] --> B["1. get_subreddit_threads()"]
    B --> C["2. save_text_to_mp3()"]
    C --> D["3. get_screenshots_of_reddit_posts()"]
    D --> E["4. download/chop backgrounds"]
    E --> F["5. make_final_video()"]
    F --> G["results/subreddit/video.mp4"]

    B -.-> B1["reddit/subreddit.py"]
    B1 -.-> B2["PRAW: Reddit API"]
    B1 -.-> B3["utils/subreddit.py: filter logic"]
    B1 -.-> B4["utils/ai_methods.py: similarity sort"]

    C -.-> C1["video_creation/voices.py"]
    C1 -.-> C2["TTS/engine_wrapper.py"]
    C2 -.-> C3["7 TTS Engines"]

    D -.-> D1["video_creation/screenshot_downloader.py"]
    D1 -.-> D2["Playwright: Headless Chrome"]

    E -.-> E1["video_creation/background.py"]
    E1 -.-> E2["yt-dlp: YouTube download"]

    F -.-> F1["video_creation/final_video.py"]
    F1 -.-> F2["FFmpeg: Video assembly"]
```

### Step 1: Fetch the Reddit Thread (`reddit/subreddit.py`)

- Logs in to Reddit via **PRAW** (client_id, client_secret, username, password)
- Supports **2FA** (code entered manually)
- Selects a post in one of these ways:
  - **Specific post ID** (from config; multiple IDs separated by `+` are supported)
  - **AI Similarity**: uses `sentence-transformers/all-MiniLM-L6-v2` to compare similarity against keywords
  - **Random** from `subreddit.hot(limit=25)`
- **Filters** (in `utils/subreddit.py`):
  - Skip posts already done (checked against `videos.json`)
  - Skip NSFW (if `allow_nsfw = false`)
  - Skip pinned posts
  - Skip posts containing **blocked words**
  - Skip posts with fewer than `min_comments`
  - Storymode: check `selftext` length
- Collects comments (filtered by `min/max_comment_length`, skipping deleted/removed/stickied)
- **Output**: Dict containing `thread_url`, `thread_title`, `thread_id`, `is_nsfw`, and `comments[]` or `thread_post`

### Step 2: Text-to-Speech (`video_creation/voices.py` + `TTS/`)

**7 TTS providers** with different `max_chars` limits:

| Provider | Class | Max Chars | API Key Required | Notes |
|----------|-------|-----------|------------------|-------|
| **TikTok** | `TikTok` | 200 | Session ID | Uses the unofficial TikTok API |
| **Google Translate** | `GTTS` | 5,000 | No | Uses the gTTS library |
| **AWS Polly** | `AWSPolly` | 3,000 | AWS Profile | Neural engine, 15 voices |
| **Streamlabs Polly** | `StreamlabsPolly` | 550 | No | Free Polly wrapper |
| **ElevenLabs** | `elevenlabs` | 2,500 | API Key | Multilingual v1 model |
| **OpenAI** | `OpenAITTS` | 4,096 | API Key | tts-1, tts-1-hd, gpt-4o-mini-tts |
| **pyttsx3** | `pyttsx` | 5,000 | No | Offline, system voices |

**TTSEngine wrapper** (`TTS/engine_wrapper.py`):

- Takes the reddit object → creates an MP3 for the title + each comment
- Automatically **splits** text longer than `max_chars` into several parts and joins them with FFmpeg concat
- Adds **silence** between parts (`silence_duration`, default 0.3s)
- Sanitizes text: removes URLs and special characters, replaces `+` → "plus", `&` → "and"
- Supports **translation** into another language (via the `translators` library)
- Computes the total audio `length` → used as the video length
- **Max video length**: default 50 seconds (hardcoded `DEFAULT_MAX_LENGTH`)
- **Output**: MP3 files in `assets/temp/<thread_id>/mp3/`

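To illustrate the `max_chars` splitting idea, here is a simple word-boundary splitter. It is a sketch only: the wrapper in `TTS/engine_wrapper.py` may split on sentence boundaries or by another rule, and `split_for_tts` is a name invented for this example.

```python
def split_for_tts(text: str, max_chars: int) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on whitespace."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be cut mid-word.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)  # current is non-empty here
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the provider separately and the resulting clips concatenated, which is why providers with small limits (such as TikTok at 200 characters) produce more, shorter clips.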
### Step 3: Screenshot Reddit Posts (`video_creation/screenshot_downloader.py`)

- Uses **Playwright** (headless Chromium)
- **Logs in** to Reddit (username/password)
- Opens the thread URL on `new.reddit.com`
- Supports **Dark/Light/Transparent** themes (loads the matching cookies)
- Captures screenshots:
  - **Title** → `assets/temp/<id>/png/title.png`
  - **Comments** → `assets/temp/<id>/png/comment_<i>.png`
  - **Story content** → `assets/temp/<id>/png/story_content.png`
- Supports **zoom** (browser scaling)
- Supports **translating** the text before capture
- Handles the NSFW warning popup
- **Storymode method 1**: instead of taking screenshots, uses `imagemaker()` to render images from the text with PIL

### Step 4: Background Video/Audio (`video_creation/background.py`)

**Background Videos** (10 options):

| Name | Source | Credit |
|------|--------|--------|
| minecraft | YouTube parkour | bbswitzer |
| minecraft-2 | YouTube | Itslpsn |
| gta | GTA stunt race | Achy Gaming |
| motor-gta | Bike parkour GTA | Achy Gaming |
| rocket-league | Rocket League | Orbital Gameplay |
| csgo-surf | CSGO Surf | Aki |
| cluster-truck | Cluster Truck | No Copyright Gameplay |
| multiversus | MultiVersus | MKIceAndFire |
| fall-guys | Fall Guys | Throneful |
| steep | Steep | joel |

**Background Audios** (3 options): `lofi`, `lofi-2`, `chill-summer`

- Downloaded with **yt-dlp** (first run only, cached in `assets/backgrounds/`)
- A **random segment** of the video/audio is cut to match the video length
- Output: `assets/temp/<id>/background.mp4` and `background.mp3`

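Cutting a random segment that matches the video length comes down to choosing a start time that leaves enough room for the clip. The following is a sketch under assumptions: `pick_segment` is a hypothetical helper, and the real `background.py` may pick the start differently (for example with integer seconds or a minimum offset).

```python
import random
from typing import Optional


def pick_segment(
    source_length: float, clip_length: float, rng: Optional[random.Random] = None
) -> tuple[float, float]:
    """Return (start, end) in seconds for a random clip inside the source."""
    if clip_length <= 0:
        raise ValueError("clip_length must be positive")
    if clip_length > source_length:
        raise ValueError("clip is longer than the source")
    # Latest valid start is the point where the clip ends exactly at the source end.
    start = (rng or random).uniform(0, source_length - clip_length)
    return start, start + clip_length
```

The returned pair can be passed to whatever trimming step is used, such as an FFmpeg seek and duration.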
### Step 5: Final Video (`video_creation/final_video.py`)

- **Concat** all audio clips → `assets/temp/<id>/audio.mp3`
- **Merge** the background audio (configurable volume, default 0.15)
- **Prepare background**: crop the background video to the `W/H` aspect ratio (default 1080x1920, portrait)
- **Create the fancy thumbnail**: take `title_template.png`, stretch the middle section, draw the title text on it
- **Overlay** the screenshots on the background video according to the audio clip timings
  - Each screenshot is shown for the time span of its own audio clip
  - Supports `opacity` (default 0.9)
- **Draw credit text** in the bottom-right corner
- **Render** with FFmpeg:
  - Codec: `h264_nvenc` (NVIDIA GPU acceleration)
  - Video bitrate: 20Mbps
  - Audio bitrate: 192kbps
  - Threads: `multiprocessing.cpu_count()`
- **Optional**: also render an "OnlyTTS" version (no background audio)
- **Save metadata** to `videos.json`
- **Clean up** temp files
- **Output**: `results/<subreddit>/<normalized_title>.mp4`

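The overlay timing described above (each screenshot visible for the span of its own audio clip) can be computed as running start and end times. This sketch ignores any silence padding between clips, and `overlay_windows` is an illustrative name rather than a function from `final_video.py`.

```python
def overlay_windows(durations: list[float]) -> list[tuple[float, float]]:
    """Turn per-clip audio durations into (start, end) display windows."""
    windows = []
    start = 0.0
    for duration in durations:
        end = start + duration
        windows.append((start, end))
        start = end  # the next screenshot appears when this clip ends
    return windows
```

Each `(start, end)` pair could then drive a time-limited overlay, for example FFmpeg's `enable='between(t,start,end)'` expression on the overlay filter.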

---

## 4. Configuration (`config.toml`)

The configuration is validated automatically against the template `utils/.config.template.toml`. On first run, or when a field is missing, the bot prompts the user for it.

### `[reddit.creds]`: Reddit login credentials

| Key | Type | Required | Description |
|-----|------|----------|-------|
| `client_id` | string | ✅ | Reddit App ID (12-30 chars) |
| `client_secret` | string | ✅ | Reddit App Secret (20-40 chars) |
| `username` | string | ✅ | Reddit username (3-20 chars) |
| `password` | string | ✅ | Reddit password |
| `2fa` | bool | ❌ | Enable 2FA? Default: `false` |

### `[reddit.thread]`: Post settings

| Key | Type | Default | Description |
|-----|------|---------|-------|
| `subreddit` | string | - | Subreddit name (supports `+` for multiple subs) |
| `post_id` | string | `""` | Specific post ID (supports `+` for multiple IDs) |
| `random` | bool | `false` | Random thread? |
| `max_comment_length` | int | `500` | Max characters per comment |
| `min_comment_length` | int | `1` | Min characters per comment |
| `post_lang` | string | `""` | Translation language (e.g. `vi`, `es`, `ja`) |
| `min_comments` | int | `20` | Min number of comments on the post |
| `blocked_words` | string | `""` | Comma-separated blocked words |

### `[ai]`: AI Similarity

| Key | Type | Default | Description |
|-----|------|---------|-------|
| `ai_similarity_enabled` | bool | `false` | Enable sorting by similarity |
| `ai_similarity_keywords` | string | - | Comma-separated keywords |

### `[settings]`: General settings

| Key | Type | Default | Description |
|-----|------|---------|-------|
| `allow_nsfw` | bool | `false` | Allow NSFW? |
| `theme` | string | `"dark"` | `dark` / `light` / `transparent` |
| `times_to_run` | int | `1` | Number of consecutive runs |
| `opacity` | float | `0.9` | Opacity of overlaid comments (0-1) |
| `storymode` | bool | `false` | Read only the title + post content |
| `storymodemethod` | int | `1` | `0`: one static image, `1`: fancy images |
| `storymode_max_length` | int | `1000` | Max characters for storymode |
| `resolution_w` | int | `1080` | Video width (pixels) |
| `resolution_h` | int | `1920` | Video height (pixels) |
| `zoom` | float | `1` | Browser zoom level (0.1-2.0) |
| `channel_name` | string | `"Reddit Tales"` | Channel name shown on the thumbnail |

### `[settings.background]`: Background

| Key | Type | Default | Description |
|-----|------|---------|-------|
| `background_video` | string | `"minecraft"` | Background video |
| `background_audio` | string | `"lofi"` | Background audio |
| `background_audio_volume` | float | `0.15` | Background audio volume (0 = off) |
| `enable_extra_audio` | bool | `false` | Also render a version without background audio |
| `background_thumbnail` | bool | `false` | Create a thumbnail? |
| `background_thumbnail_font_*` | - | - | Font family/size/color for the thumbnail |

### `[settings.tts]`: Text-to-Speech

| Key | Type | Default | Description |
|-----|------|---------|-------|
| `voice_choice` | string | `"tiktok"` | TTS provider |
| `random_voice` | bool | `true` | Random voice for each comment |
| `silence_duration` | float | `0.3` | Silence between TTS clips (seconds) |
| `no_emojis` | bool | `false` | Remove emojis? |
| `tiktok_voice` | string | `"en_us_001"` | Voice for TikTok TTS |
| `tiktok_sessionid` | string | - | TikTok session ID |
| `elevenlabs_voice_name` | string | `"Bella"` | Voice for ElevenLabs |
| `elevenlabs_api_key` | string | - | ElevenLabs API Key |
| `aws_polly_voice` | string | `"Matthew"` | Voice for AWS Polly |
| `streamlabs_polly_voice` | string | `"Matthew"` | Voice for Streamlabs |
| `openai_api_url` | string | `"https://api.openai.com/v1/"` | OpenAI API endpoint |
| `openai_api_key` | string | - | OpenAI API Key |
| `openai_voice_name` | string | `"alloy"` | Voice for OpenAI TTS |
| `openai_model` | string | `"tts-1"` | OpenAI TTS model |
| `python_voice` | string | `"1"` | System voice index |
| `py_voice_num` | string | `"2"` | Number of system voices |

---

## 5. Dependencies (`requirements.txt`)

| Package | Version | Role |
|---------|---------|---------|
| `praw` | 7.8.1 | Reddit API wrapper |
| `playwright` | 1.49.1 | Browser automation (screenshot) |
| `moviepy` | 2.2.1 | Video/audio clip processing |
| `ffmpeg-python` | 0.2.0 | FFmpeg pipeline builder |
| `yt-dlp` | 2025.10.22 | YouTube video/audio downloader |
| `gTTS` | 2.5.4 | Google Translate TTS |
| `pyttsx3` | 2.98 | Offline system TTS |
| `elevenlabs` | 1.57.0 | ElevenLabs TTS SDK |
| `boto3` / `botocore` | 1.36.8 | AWS Polly TTS |
| `requests` | 2.32.3 | HTTP requests (TikTok/Streamlabs API) |
| `rich` | 13.9.4 | Terminal formatting (progress bars, panels) |
| `toml` / `tomlkit` | 0.10.2 / 0.13.2 | Config file parsing |
| `translators` | 5.9.9 | Multi-language translation |
| `Pillow` (PIL) | - | Image processing (thumbnails, storymode) |
| `clean-text` | 0.6.0 | Text cleaning (emoji removal) |
| `unidecode` | 1.4.0 | Unicode → ASCII |
| `spacy` | 3.8.7 | NLP (text processing) |
| `torch` | 2.7.0 | PyTorch (AI similarity) |
| `transformers` | 4.52.4 | HuggingFace transformers (sentence-transformers) |
| `Flask` | 3.1.1 | Web GUI |

---

## 6. Two Operating Modes

### Mode 1: Comment Mode (default)
- Fetches the **top comments** from a Reddit thread
- Converts each comment into its own MP3
- Takes a screenshot of each comment
- The video shows the comments one after another

### Mode 2: Story Mode (`storymode = true`)
- Reads only the post's **title + selftext**
- Two methods:
  - **Method 0**: Screenshot the entire post content → a single static image
  - **Method 1**: Parse the text into segments → render each segment as its own image with PIL → fancier effect

---

## 7. Web GUI (`GUI.py`)

- Framework: **Flask** (port 4000)
- Routes:
  - `/`: list of generated videos (from `videos.json`)
  - `/settings`: form for editing `config.toml`
  - `/backgrounds`: manage background videos
  - `/background/add`: add a new background
  - `/background/delete`: delete a background
  - `/results/<path>`: serve video files
  - `/voices/<path>`: serve voice samples
- Opens the browser automatically on startup

---

## 8. Important Technical Notes

### ⚠️ FFmpeg Encoder
- The code uses **`h264_nvenc`** (the NVIDIA GPU encoder), which requires an NVIDIA GPU
- Without an NVIDIA GPU, change it to `libx264`

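
Instead of editing the encoder by hand, the choice could be made at startup. The sketch below is only an illustration, not the project's code: `pick_encoder` and `available_encoders` are hypothetical helper names, and the only external behavior assumed is that `ffmpeg -encoders` prints one encoder per row with the encoder name in the second column.

```python
import shutil
import subprocess


def pick_encoder(encoder_listing: str, preferred: str = "h264_nvenc") -> str:
    """Return `preferred` if it appears in an `ffmpeg -encoders` listing, else libx264."""
    for line in encoder_listing.splitlines():
        # Rows look like " V....D h264_nvenc   NVIDIA NVENC H.264 encoder":
        # flags first, encoder name second.
        fields = line.split()
        if len(fields) >= 2 and fields[1] == preferred:
            return preferred
    return "libx264"


def available_encoders() -> str:
    """Ask the local ffmpeg binary for its encoder list (empty string if ffmpeg is missing)."""
    if shutil.which("ffmpeg") is None:
        return ""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

A caller would use `pick_encoder(available_encoders())`. Being listed only shows that FFmpeg was built with NVENC support; encoding can still fail without a usable GPU or driver, so a retry with `libx264` around the render call may still be worthwhile.
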
### ⚠️ Cleanup Bug
- `utils/cleanup.py` uses the path `../assets/temp/{reddit_id}/` (a relative path containing `..`), which can fail depending on the working directory

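
A possible fix is to build the temp path from an explicit project root rather than from the current working directory. This is a minimal sketch under that assumption; `temp_dir_for` and `cleanup` are illustrative names and do not mirror the actual signature in `utils/cleanup.py`.

```python
import shutil
from pathlib import Path


def temp_dir_for(reddit_id: str, project_root: Path) -> Path:
    """Build <project_root>/assets/temp/<reddit_id> without relying on the cwd."""
    return project_root / "assets" / "temp" / reddit_id


def cleanup(reddit_id: str, project_root: Path) -> int:
    """Delete the temp directory for one post and return how many files it held."""
    target = temp_dir_for(reddit_id, project_root)
    if not target.is_dir():
        return 0
    count = sum(1 for p in target.rglob("*") if p.is_file())
    shutil.rmtree(target)
    return count
```

Inside a module that lives in `utils/`, the root could be derived with `Path(__file__).resolve().parent.parent`, which stays the same regardless of where the bot is launched from.
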
### ⚠️ Security Concerns
- `utils/settings.py` calls `eval()` twice (lines 33 and 81); both calls are marked `fixme` but have not been fixed
- `utils/console.py` also uses `eval()` (line 105)

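
If those `eval()` calls exist to turn a type name from the config template (such as `"int"`) into a converter, which is an assumption about their purpose, an explicit whitelist would cover the same need without executing arbitrary strings. `CASTERS`, `_to_bool` and `cast_value` are hypothetical names for this sketch.

```python
from typing import Any, Callable, Dict


def _to_bool(value: Any) -> bool:
    """Parse common textual booleans; plain bool("false") would wrongly be True."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in ("true", "1", "yes", "y"):
        return True
    if text in ("false", "0", "no", "n", ""):
        return False
    raise ValueError(f"not a boolean: {value!r}")


# Hypothetical whitelist: only these type names are accepted from the template.
CASTERS: Dict[str, Callable[[Any], Any]] = {
    "int": int,
    "float": float,
    "str": str,
    "bool": _to_bool,
}


def cast_value(type_name: str, value: Any) -> Any:
    """Cast `value` with a whitelisted converter; unknown type names are rejected."""
    try:
        caster = CASTERS[type_name]
    except KeyError:
        raise ValueError(f"unsupported type name: {type_name!r}") from None
    return caster(value)
```

Anything outside the whitelist raises `ValueError` instead of being evaluated, so a malicious or mistyped template entry cannot run code.
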
### ⚠️ Hardcoded Values
- `DEFAULT_MAX_LENGTH = 50` (seconds) in `TTS/engine_wrapper.py`
- The NSFW button selector is hardcoded with a specific post ID (`#t3_12hmbug`) in `screenshot_downloader`
- The username position on `title_template.png` is hardcoded at `(205, 825)`

### ⚠️ Video Tracking
- Generated videos are recorded in `video_creation/data/videos.json`
- Each entry: `{subreddit, id, time, background_credit, reddit_title, filename}`
- The bot skips posts that are already in the list (unless forced via the `post_id` config option)

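
The skip rule can be illustrated with a few lines of Python. This is a sketch of the behavior described above, not the bot's actual code: the function names are hypothetical and the sample entry contains made-up values that only follow the field list.

```python
import json
from typing import Any, Dict, List, Optional

# Made-up sample data shaped like one videos.json entry.
SAMPLE_VIDEOS_JSON = """
[
  {"subreddit": "AskReddit", "id": "abc123", "time": "1700000000",
   "background_credit": "example", "reddit_title": "Sample title",
   "filename": "Sample title.mp4"}
]
"""


def already_rendered(videos: List[Dict[str, Any]], post_id: str) -> bool:
    """True if an entry with this post id is already tracked."""
    return any(entry.get("id") == post_id for entry in videos)


def should_render(
    videos: List[Dict[str, Any]], post_id: str, forced_post_id: Optional[str] = None
) -> bool:
    """Render new posts, plus a tracked post when it is explicitly forced."""
    if forced_post_id and forced_post_id == post_id:
        return True
    return not already_rendered(videos, post_id)
```

With the sample data, `should_render(json.loads(SAMPLE_VIDEOS_JSON), "abc123")` is false, while passing `forced_post_id="abc123"` overrides the skip.
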
### ⚠️ AI Similarity Feature
- Uses the `sentence-transformers/all-MiniLM-L6-v2` model
- The model is downloaded on first run (~80MB)
- Computes cosine similarity between thread titles + content and the user's keywords
- Enabled with `ai_similarity_enabled = true`

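
The ranking step reduces to cosine similarity between embedding vectors. The sketch below uses plain Python and toy vectors so it runs without `torch`; in the real feature the vectors would come from the model, and `rank_threads` is a hypothetical helper name.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_threads(
    thread_vectors: Dict[str, Sequence[float]], keyword_vector: Sequence[float]
) -> List[Tuple[str, float]]:
    """Sort thread ids by similarity to the keyword embedding, best match first."""
    scored = [
        (thread_id, cosine_similarity(vec, keyword_vector))
        for thread_id, vec in thread_vectors.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Because cosine similarity ignores vector length, a long post and a short title can still score as close matches when they point in the same direction in embedding space.
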
---

## 9. How to Run

```bash
# 1. Clone & setup
git clone https://github.com/elebumm/RedditVideoMakerBot.git
cd RedditVideoMakerBot
python -m venv ./venv
source ./venv/bin/activate  # Linux/macOS
# .\venv\Scripts\activate   # Windows

# 2. Install dependencies
pip install -r requirements.txt
python -m playwright install
python -m playwright install-deps

# 3. Run the bot (CLI)
python main.py

# 4. Or run the GUI
python GUI.py
```

### Docker:
```bash
docker build -t reddit-video-bot .
docker run reddit-video-bot
```

---

## 10. Module Dependency Diagram

```mermaid
graph LR
    main["main.py"] --> reddit["reddit/subreddit.py"]
    main --> voices["video_creation/voices.py"]
    main --> screenshots["video_creation/screenshot_downloader.py"]
    main --> background["video_creation/background.py"]
    main --> final["video_creation/final_video.py"]

    reddit --> praw["praw"]
    reddit --> ai["utils/ai_methods.py"]
    reddit --> sub_utils["utils/subreddit.py"]

    voices --> engine["TTS/engine_wrapper.py"]
    engine --> tiktok["TTS/TikTok.py"]
    engine --> gtts["TTS/GTTS.py"]
    engine --> aws["TTS/aws_polly.py"]
    engine --> eleven["TTS/elevenlabs.py"]
    engine --> openai["TTS/openai_tts.py"]
    engine --> pyttsx["TTS/pyttsx.py"]
    engine --> streamlabs["TTS/streamlabs_polly.py"]

    screenshots --> playwright["playwright"]
    screenshots --> imagenarator["utils/imagenarator.py"]

    background --> ytdlp["yt-dlp"]
    background --> moviepy["moviepy"]

    final --> ffmpeg["ffmpeg-python"]
    final --> pil["PIL/Pillow"]

    ai --> torch["torch + transformers"]

    subgraph "Shared Utils"
        settings["utils/settings.py"]
        console["utils/console.py"]
        voice_util["utils/voice.py"]
        video_util["utils/videos.py"]
        cleanup["utils/cleanup.py"]
    end
```

---

> 📝 **Document generated**: 2026-04-20 | Based on an analysis of the project's entire source code.
@ -0,0 +1,15 @@
"""
Manual Screenshot → Video Pipeline

This module provides an alternative workflow that creates videos
from manually captured screenshots and text files, without requiring
any social media API access.

Supported platforms: Reddit, Threads (Meta), X (Twitter), or any other.

Usage:
    python manual_main.py init <post_id>      # Create folder structure
    python manual_main.py render <post_id>    # Render one post
    python manual_main.py render --all        # Render all unrendered posts
    python manual_main.py list                # List all posts with status
"""
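
The three commands in the usage text map naturally onto `argparse` subcommands. The sketch below is one possible layout and is not taken from `manual_main.py`, whose source is not shown here; `build_parser`, `parse` and the `render_all` attribute are illustrative names.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Declare the init / render / list subcommands from the usage text."""
    parser = argparse.ArgumentParser(prog="manual_main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    init = sub.add_parser("init", help="Create folder structure")
    init.add_argument("post_id")

    render = sub.add_parser("render", help="Render one post or all unrendered posts")
    render.add_argument("post_id", nargs="?")
    render.add_argument("--all", action="store_true", dest="render_all")

    sub.add_parser("list", help="List all posts with status")
    return parser


def parse(argv: List[str]) -> argparse.Namespace:
    """Parse argv and require exactly one of <post_id> or --all for render."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command == "render" and bool(args.post_id) == args.render_all:
        parser.error("render needs either <post_id> or --all")
    return args
```

Keeping the "exactly one of" check in `parse` makes `render` with no argument, or with both a post id and `--all`, fail with a usage message before any rendering starts.
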
@ -0,0 +1,318 @@
"""
Scanner module for the manual pipeline.

Scans manual_posts/ directories for screenshots (.png, .jpg, .jpeg), audio files (.mp3),
and optional text files (.txt). Builds a unified post_object for processing.

Folder convention:
    manual_posts/
    └── my_post_001/
        ├── meta.json           (optional - metadata)
        ├── 0_title.png         (required - screenshot of post title)
        ├── 0_title.mp3         (preferred - pre-recorded audio)
        ├── 0_title.txt         (fallback - text for TTS if no .mp3)
        ├── 1_comment.png       (optional - comment screenshots)
        ├── 1_comment.mp3       (preferred - pre-recorded audio)
        ├── 1_comment.txt       (fallback - text for TTS if no .mp3)
        └── ...

Priority: .mp3 > .txt (if both exist, .mp3 is used and TTS is skipped).
"""

import json
import re
from pathlib import Path
from typing import Dict, List, Optional, Tuple

from utils.console import print_step, print_substep


class PostScanner:
    """Scans manual_posts/ directory, validates structure, builds post_object."""

    # Regex pattern: <number>_<type>.<ext> where ext is png/jpg/jpeg/mp3/txt
    FILE_PATTERN = re.compile(r"^(\d+)_(title|comment)\.(png|jpg|jpeg|mp3|txt)$", re.IGNORECASE)

    def __init__(self, input_dir: str = "manual_posts"):
        self.input_dir = Path(input_dir)

    def scan_all(self) -> List[dict]:
        """Scan all post folders in the input directory.

        Returns:
            List of post_object dicts, sorted by folder name
        """
        if not self.input_dir.exists():
            print_substep(f"Input directory '{self.input_dir}' does not exist.", style="red")
            return []

        posts = []
        for post_dir in sorted(self.input_dir.iterdir()):
            if post_dir.is_dir() and not post_dir.name.startswith("."):
                post_obj = self.scan_one(post_dir.name)
                if post_obj is not None:
                    posts.append(post_obj)

        return posts

    def scan_one(self, post_id: str) -> Optional[dict]:
        """Scan a single post folder and build post_object.

        Args:
            post_id: Name of the folder inside manual_posts/

        Returns:
            post_object dict or None if invalid
        """
        post_dir = self.input_dir / post_id

        if not post_dir.exists():
            print_substep(f"Post directory '{post_dir}' does not exist.", style="red")
            return None

        is_valid, errors = self.validate(post_dir)
        if not is_valid:
            print_substep(f"Validation failed for '{post_id}':", style="red")
            for err in errors:
                print_substep(f" ✗ {err}", style="red")
            return None

        return self._build_post_object(post_dir)

    def validate(self, post_dir: Path) -> Tuple[bool, List[str]]:
        """Validate a post folder structure.

        Checks:
            - At least 1 image file exists
            - Title image (0_title.png) exists
            - Each image has a corresponding .mp3 or .txt file
            - A .txt file used as the TTS fallback is not empty

        Files that do not match the naming convention are ignored, not reported.

        Returns:
            (is_valid, list_of_errors)
        """
        errors = []

        # Gather all matching files
        images, audios, texts = self._categorize_files(post_dir)

        # Check: at least 1 image
        if not images:
            errors.append("No image files found. Need at least 0_title.png")
            return False, errors

        # Check: title image exists (index 0)
        if 0 not in images:
            errors.append("Missing title image: 0_title.png (must start with '0_')")

        # Check: each image has a corresponding .mp3 or .txt file
        for idx in sorted(images.keys()):
            if idx not in audios and idx not in texts:
                # Use the image's own stem (e.g. "1_comment") so the hint names the right file
                stem = images[idx].stem
                errors.append(
                    f"Missing audio/text for image #{idx}: "
                    f"provide '{stem}.mp3' (or '{stem}.txt' as fallback)"
                )

        # Check: text files (used as TTS fallback) are not empty
        for idx, txt_path in texts.items():
            if idx not in audios:  # Only check .txt if no .mp3 exists
                content = txt_path.read_text(encoding="utf-8").strip()
                if not content:
                    errors.append(f"Text file is empty (and no .mp3 provided): {txt_path.name}")

        return len(errors) == 0, errors

    def list_status(self) -> List[dict]:
        """List all posts with their status.

        Returns:
            List of dicts with keys: post_id, num_images, num_audios, num_texts, status, errors
        """
        if not self.input_dir.exists():
            return []

        results = []
        for post_dir in sorted(self.input_dir.iterdir()):
            if not post_dir.is_dir() or post_dir.name.startswith("."):
                continue

            images, audios, texts = self._categorize_files(post_dir)
            is_valid, errors = self.validate(post_dir)

            # Determine status
            if not images:
                status = "empty"
            elif not is_valid:
                status = "incomplete"
            else:
                status = "ready"

            results.append(
                {
                    "post_id": post_dir.name,
                    "num_images": len(images),
                    "num_audios": len(audios),
                    "num_texts": len(texts),
                    "status": status,
                    "errors": errors,
                }
            )

        return results

    def _categorize_files(self, post_dir: Path) -> Tuple[Dict[int, Path], Dict[int, Path], Dict[int, Path]]:
        """Categorize files in a post directory into images, audios, and texts.

        Returns:
            (images_dict, audios_dict, texts_dict) where key is the index number
        """
        images = {}  # {0: Path("0_title.png"), ...}
        audios = {}  # {0: Path("0_title.mp3"), ...}
        texts = {}  # {0: Path("0_title.txt"), ...}

        for f in post_dir.iterdir():
            match = self.FILE_PATTERN.match(f.name)
            if match:
                idx = int(match.group(1))
                ext = match.group(3).lower()
                if ext in ("png", "jpg", "jpeg"):
                    images[idx] = f
                elif ext == "mp3":
                    audios[idx] = f
                elif ext == "txt":
                    texts[idx] = f

        return images, audios, texts

    def _build_post_object(self, post_dir: Path) -> dict:
        """Build the unified post_object from a validated post directory.

        Returns:
            dict with structure:
            {
                "post_id": str,
                "platform": str,
                "title": str,
                "author": str,
                "url": str,
                "post_dir": str,
                "screenshots": [
                    {
                        "index": int,
                        "type": "title" | "comment",
                        "image_path": str,
                        "text": str,
                        "audio_path": str | None,  # set here only if a .mp3 exists
                        "audio_duration": None,
                    },
                    ...
                ],
                "total_duration": 0,
                "output_path": None,
            }
        """
        post_id = post_dir.name

        # Read optional meta.json
        meta = self._read_meta(post_dir)

        # Categorize files
        images, audios, texts = self._categorize_files(post_dir)

        # Build screenshots list (sorted by index)
        screenshots = []
        for idx in sorted(images.keys()):
            img_path = images[idx]
            # Determine type from filename
            match = self.FILE_PATTERN.match(img_path.name)
            entry_type = match.group(2).lower() if match else "comment"

            # Audio: prefer .mp3, fallback to .txt for TTS
            audio_path = str(audios[idx]) if idx in audios else None
            text_content = ""
            if idx in texts:
                text_content = texts[idx].read_text(encoding="utf-8").strip()

            screenshots.append(
                {
                    "index": idx,
                    "type": entry_type,
                    "image_path": str(img_path),
                    "text": text_content,
                    "audio_path": audio_path,  # Pre-filled if .mp3 exists
                    "audio_duration": None,
                }
            )

        # Use title text, meta title, or folder name
        title = ""
        if screenshots and screenshots[0]["text"]:
            title = screenshots[0]["text"][:100]
        elif meta.get("title"):
            title = meta["title"]
        else:
            title = post_id

        return {
            "post_id": post_id,
            "platform": meta.get("platform", "other"),
            "title": title,
            "author": meta.get("author", ""),
            "url": meta.get("url", ""),
            "post_dir": str(post_dir),
            "screenshots": screenshots,
            "total_duration": 0,
            "output_path": None,
        }

    def _read_meta(self, post_dir: Path) -> dict:
        """Read meta.json if it exists, return empty dict otherwise."""
        meta_path = post_dir / "meta.json"
        if meta_path.exists():
            try:
                with open(meta_path, "r", encoding="utf-8") as f:
                    return json.load(f)
            except (json.JSONDecodeError, IOError) as e:
                print_substep(f"Warning: Could not read meta.json: {e}", style="yellow")
        return {}


def create_post_folder(input_dir: str, post_id: str, platform: str = "reddit") -> Path:
    """Create a new post folder with template files.

    Args:
        input_dir: Base directory for manual posts
        post_id: Name for the new post folder
        platform: Source platform (reddit, threads, x, other)

    Returns:
        Path to the created folder
    """
    post_dir = Path(input_dir) / post_id
    post_dir.mkdir(parents=True, exist_ok=True)

    # Create meta.json template
    meta = {
        "platform": platform,
        "post_id": post_id,
        "title": "",
        "author": "",
        "url": "",
        "created_at": "",
        "tags": [],
        "notes": "",
    }
    meta_path = post_dir / "meta.json"
    if not meta_path.exists():
        with open(meta_path, "w", encoding="utf-8") as f:
            json.dump(meta, f, indent=4, ensure_ascii=False)

    print_step(f"Created post folder: {post_dir}")
    print_substep("Next steps:", style="bold cyan")
    print_substep(" 1. Add screenshots: 0_title.png, 1_comment.png, ...")
    print_substep(" 2. Add audio files: 0_title.mp3, 1_comment.mp3, ...")
    print_substep(" (Or use .txt files instead — TTS will generate audio)")
    print_substep(" 3. (Optional) Edit meta.json with post details")
    print_substep(f" 4. Run: python manual_main.py render {post_id}")

    return post_dir
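
The naming convention and the `.mp3 > .txt` priority can be tried out on plain file names, without touching the filesystem. The sketch below reuses the same regular expression as `FILE_PATTERN`; the `audio_source` helper is hypothetical and exists only for this illustration.

```python
import re
from typing import Dict, Iterable, Optional

FILE_PATTERN = re.compile(r"^(\d+)_(title|comment)\.(png|jpg|jpeg|mp3|txt)$", re.IGNORECASE)


def audio_source(names: Iterable[str]) -> Dict[int, Optional[str]]:
    """Map each image index to the file that supplies its audio.

    A .mp3 wins over a .txt; None means the image has neither.
    """
    images = set()
    audios: Dict[int, str] = {}
    texts: Dict[int, str] = {}
    for name in names:
        match = FILE_PATTERN.match(name)
        if not match:
            continue  # files outside the convention are ignored
        idx = int(match.group(1))
        ext = match.group(3).lower()
        if ext in ("png", "jpg", "jpeg"):
            images.add(idx)
        elif ext == "mp3":
            audios[idx] = name
        else:
            texts[idx] = name
    return {idx: audios.get(idx) or texts.get(idx) for idx in sorted(images)}
```

An index that maps to `None` corresponds to the "Missing audio/text" validation error, and an index that maps to a `.txt` file is one that will need TTS.
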
@ -0,0 +1,277 @@
"""
TTS Processor for the manual pipeline.

Takes a post_object (built by scanner.py), generates MP3 audio files
for each screenshot's text using the existing TTS engines, and updates
the post_object with audio paths and durations.

Reuses TTS engines from TTS/ module — no code duplication.
"""

import re
from pathlib import Path
from typing import Tuple

from moviepy import AudioFileClip

from utils import settings
from utils.console import print_step, print_substep
from utils.voice import sanitize_text


class ManualTTSProcessor:
    """Processes text-to-speech for manual pipeline posts."""

    def __init__(self, post_object: dict, max_length: int = 120):
        """
        Args:
            post_object: Post data from scanner.py
            max_length: Maximum total audio length in seconds (default: 120s = 2 min)
        """
        self.post = post_object
        self.post_id = post_object["post_id"]
        self.max_length = max_length
        self.mp3_dir = Path(f"assets/temp/{self.post_id}/mp3")
        self.tts_module = None

    def process(self) -> dict:
        """Process audio for all screenshots.

        For each screenshot:
            - If .mp3 already provided (audio_path set by scanner) → skip TTS, just measure duration
            - If only .txt provided → run TTS to generate .mp3
            - If neither → skip

        Returns:
            Updated post_object with audio_path and audio_duration filled in
        """
        self.mp3_dir.mkdir(parents=True, exist_ok=True)
        print_step("🔊 Processing audio files...")

        total_duration = 0
        processed_count = 0
        tts_needed = False

        for screenshot in self.post["screenshots"]:
            idx = screenshot["index"]

            # Case 1: .mp3 already provided — just measure duration
            if screenshot.get("audio_path"):
                try:
                    clip = AudioFileClip(screenshot["audio_path"])
                    duration = clip.duration
                    clip.close()
                except Exception as e:
                    print_substep(f" ✗ Failed to read audio #{idx}: {e}", style="red")
                    duration = 0

                screenshot["audio_duration"] = duration
                total_duration += duration
                processed_count += 1
                print_substep(
                    f" ✓ #{idx} → {duration:.1f}s (pre-recorded .mp3)",
                    style="green",
                )
                continue

            # Case 2: Only .txt provided — need TTS
            text = screenshot.get("text", "").strip()
            if not text:
                print_substep(
                    f" ⚠ Screenshot #{idx} has no audio or text, skipping.",
                    style="yellow",
                )
                continue

            # Initialize TTS engine only when needed (lazy)
            if not tts_needed:
                print_substep(" 📝 Some entries need TTS generation...")
                self.tts_module = self._get_tts_engine()
                tts_needed = True

            mp3_path = str(self.mp3_dir / f"{idx}.mp3")

            # Sanitize and process text
            clean_text = self._process_text(text)
            if not clean_text or clean_text.isspace():
                print_substep(
                    f" ⚠ Screenshot #{idx} text is empty after sanitization, skipping.",
                    style="yellow",
                )
                continue

            # Handle long text by splitting
            if len(clean_text) > self.tts_module.max_chars:
                self._generate_split_audio(clean_text, idx, mp3_path)
            else:
                self._generate_audio(clean_text, mp3_path)

            # Measure duration
            try:
                clip = AudioFileClip(mp3_path)
                duration = clip.duration
                clip.close()
            except Exception as e:
                print_substep(f" ✗ Failed to read audio #{idx}: {e}", style="red")
                duration = 0

            # Update screenshot entry
            screenshot["audio_path"] = mp3_path
            screenshot["audio_duration"] = duration
            total_duration += duration
            processed_count += 1

            print_substep(
                f" ✓ #{idx} → {duration:.1f}s (TTS generated, {len(clean_text)} chars)",
                style="green",
            )

            # Check max length
            if total_duration > self.max_length and processed_count > 1:
                print_substep(
                    f" ⚠ Total duration ({total_duration:.1f}s) exceeds max ({self.max_length}s). "
                    f"Stopping at {processed_count} clips.",
                    style="yellow",
                )
                break

        self.post["total_duration"] = total_duration
        print_substep(
            f"✅ {processed_count} audio clips ready, total: {total_duration:.1f}s",
            style="bold green",
        )

        return self.post

    def _get_tts_engine(self):
        """Initialize the TTS engine based on config.

        Reuses the TTS engines from video_creation/voices.py
        """
        from TTS.GTTS import GTTS
        from TTS.TikTok import TikTok
        from TTS.aws_polly import AWSPolly
        from TTS.elevenlabs import elevenlabs
        from TTS.openai_tts import OpenAITTS
        from TTS.pyttsx import pyttsx
        from TTS.streamlabs_polly import StreamlabsPolly

        providers = {
            "googletranslate": GTTS,
            "awspolly": AWSPolly,
            "streamlabspolly": StreamlabsPolly,
            "tiktok": TikTok,
            "pyttsx": pyttsx,
            "elevenlabs": elevenlabs,
            "openai": OpenAITTS,
        }

        voice_choice = settings.config["settings"]["tts"]["voice_choice"]
        engine_class = providers.get(str(voice_choice).lower())

        if engine_class is None:
            print_substep(
                f"Unknown TTS provider: {voice_choice}. Falling back to GoogleTranslate.",
                style="yellow",
            )
            engine_class = GTTS

        print_substep(f"Using TTS engine: {engine_class.__name__}")
        return engine_class()

    def _generate_audio(self, text: str, filepath: str):
        """Generate a single audio file from text."""
        try:
            random_voice = settings.config["settings"]["tts"].get("random_voice", False)

            if str(settings.config["settings"]["tts"]["voice_choice"]).lower() == "googletranslate":
                # GTTS doesn't support random_voice parameter
                self.tts_module.run(text, filepath=filepath)
            else:
                self.tts_module.run(text, filepath=filepath, random_voice=random_voice)
        except Exception as e:
            print_substep(f" ✗ TTS generation failed: {e}", style="red")
            raise

    def _generate_split_audio(self, text: str, idx: int, final_path: str):
        """Split long text and concat into one audio file.

        For texts longer than the TTS engine's max_chars limit.
        """
        import os
        import subprocess

        # Split text into chunks at sentence boundaries.
        # (.|\n) matches any character including a newline; the original "\\n" in a raw
        # string matched a literal backslash followed by "n" instead.
        max_chars = self.tts_module.max_chars
        chunks = [
            x.group().strip()
            for x in re.finditer(
                r" *(((.|\n){0," + str(max_chars) + r"})(\.|.$))", text
            )
        ]

        if not chunks:
            chunks = [text[:max_chars]]

        part_files = []
        for part_idx, chunk in enumerate(chunks):
            if not chunk or chunk.isspace():
                continue
            part_path = str(self.mp3_dir / f"{idx}-{part_idx}.part.mp3")
            self._generate_audio(chunk, part_path)
            part_files.append(part_path)

        if not part_files:
            return

        # Concat using ffmpeg
        list_path = str(self.mp3_dir / f"{idx}_list.txt")
        with open(list_path, "w", encoding="utf-8") as f:
            for part in part_files:
                f.write(f"file '{Path(part).name}'\n")

        # Pass arguments as a list so paths containing spaces are not split by a shell
        subprocess.run(
            [
                "ffmpeg", "-f", "concat", "-y", "-hide_banner", "-loglevel", "panic",
                "-safe", "0", "-i", list_path, "-c", "copy", final_path,
            ],
            check=False,
        )

        # Cleanup part files
        for part in part_files:
            try:
                os.unlink(part)
            except OSError:
                pass
        try:
            os.unlink(list_path)
        except OSError:
            pass

    def _process_text(self, text: str) -> str:
        """Clean and sanitize text for TTS.

        - Removes lines starting with # (comments in txt files)
        - Removes URLs and turns line breaks into sentence breaks
        - Sanitizes using existing sanitize_text()
        """
        # Remove comment lines (lines starting with #)
        lines = text.split("\n")
        lines = [line for line in lines if not line.strip().startswith("#")]
        # Keep the line breaks for now; joining with spaces here would make the
        # newline replacement below a no-op
        text = "\n".join(lines).strip()

        # Remove URLs
        regex_urls = r"((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:@\-_=#]+\.([a-zA-Z]){2,6}([a-zA-Z0-9\.\&\/\?\:@\-_=#])*"
        text = re.sub(regex_urls, " ", text)

        # Replace newlines with periods for natural speech
        text = text.replace("\n", ". ")

        # Add period at end if missing
        if text and text[-1] not in ".!?":
            text += "."

        # Clean repeated dots
        text = re.sub(r"\.{2,}", ".", text)
        text = re.sub(r"\.\s*\.", ".", text)

        # Use existing sanitize_text for final cleanup
        text = sanitize_text(text)

        return text
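
Splitting long text for an engine with a character limit can also be done without one large regular expression. The sketch below is a different technique from the one in `_generate_split_audio`: it packs whole sentences greedily and hard-wraps only a sentence that is itself too long. `split_for_tts` is a hypothetical name and not part of the pipeline.

```python
import re
from typing import List


def split_for_tts(text: str, max_chars: int) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-wrap a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Every returned chunk is guaranteed to fit the limit, so each one can be sent to the engine as is and the resulting parts concatenated in order.
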
@ -0,0 +1,479 @@
"""
Video Builder for the manual pipeline.

Takes a post_object (with TTS audio already generated), downloads/chops
background video and audio, overlays screenshots onto the background
with correct timing, and renders the final video.

Reuses background download functions from video_creation/background.py.
Uses libx264 encoder (CPU-based) by default.
"""

import math
import multiprocessing
import os
import re
import tempfile
import threading
import time
from pathlib import Path
from typing import Dict, Tuple

import ffmpeg
from moviepy import AudioFileClip, VideoFileClip
from rich.console import Console

from utils import settings
from utils.console import print_step, print_substep

console = Console()


class ProgressFfmpeg(threading.Thread):
    """Thread to monitor FFmpeg progress during rendering."""

    def __init__(self, vid_duration_seconds, progress_update_callback):
        threading.Thread.__init__(self, name="ProgressFfmpeg")
        self.stop_event = threading.Event()
        self.output_file = tempfile.NamedTemporaryFile(mode="w+", delete=False)
        self.vid_duration_seconds = vid_duration_seconds
        self.progress_update_callback = progress_update_callback

    def run(self):
        while not self.stop_event.is_set():
            latest_progress = self.get_latest_ms_progress()
            if latest_progress is not None:
                completed_percent = latest_progress / self.vid_duration_seconds
                self.progress_update_callback(completed_percent)
            time.sleep(1)

    def get_latest_ms_progress(self):
        lines = self.output_file.readlines()
        if lines:
            for line in lines:
                if "out_time_ms" in line:
                    out_time_ms_str = line.split("=")[1].strip()
                    if out_time_ms_str.isnumeric():
                        return float(out_time_ms_str) / 1000000.0
        return None

    def stop(self):
        self.stop_event.set()

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, *args, **kwargs):
        self.stop()


class ManualVideoBuilder:
    """Builds the final video from screenshots + TTS audio + background."""

    def __init__(self, post_object: dict, manual_config: dict):
        """
        Args:
            post_object: Post data with audio already generated (from tts_processor)
            manual_config: Manual-specific config dict
        """
        self.post = post_object
        self.post_id = post_object["post_id"]
        self.config = manual_config
        self.temp_dir = Path(f"assets/temp/{self.post_id}")

        # Video settings
        self.W = int(self.config.get("resolution_w", settings.config["settings"].get("resolution_w", 1080)))
        self.H = int(self.config.get("resolution_h", settings.config["settings"].get("resolution_h", 1920)))
        self.opacity = float(self.config.get("opacity", settings.config["settings"].get("opacity", 0.9)))
        self.encoder = self.config.get("encoder", "libx264")

        # Background settings
        self.bg_video_name = self.config.get(
            "background_video",
            settings.config["settings"]["background"].get("background_video", "random"),
        )
        self.bg_audio_name = self.config.get(
            "background_audio",
            settings.config["settings"]["background"].get("background_audio", "random"),
        )
        self.bg_audio_volume = float(
            self.config.get(
                "background_audio_volume",
                settings.config["settings"]["background"].get("background_audio_volume", 0.15),
            )
        )

        # Local background directories (user drops files here)
        self.bg_video_dir = Path(self.config.get("background_video_dir", "assets/backgrounds/video"))
        self.bg_audio_dir = Path(self.config.get("background_audio_dir", "assets/backgrounds/audio"))

        # Output settings
        self.output_dir = Path(self.config.get("output_dir", "manual_results"))

def build(self) -> str:
|
||||
"""Build the final video.
|
||||
|
||||
Pipeline:
|
||||
1. Filter screenshots that have audio
|
||||
2. Download background video & audio (cached)
|
||||
3. Chop background to match video length
|
||||
4. Prepare background (crop to aspect ratio)
|
||||
5. Concat all audio clips → final audio track
|
||||
6. Mix with background audio
|
||||
7. Overlay screenshots onto background with timing
|
||||
8. Render final video
|
||||
|
||||
Returns:
|
||||
Path to the output video file
|
||||
"""
|
||||
# Filter screenshots with audio
|
||||
clips = [s for s in self.post["screenshots"] if s.get("audio_path") and s.get("audio_duration")]
|
||||
if not clips:
|
||||
print_substep("No audio clips found. Cannot create video.", style="red")
|
||||
return ""
|
||||
|
||||
total_duration = sum(s["audio_duration"] for s in clips)
|
||||
video_length = math.ceil(total_duration)
|
||||
|
||||
console.log(f"[bold green] Video will be: {video_length} seconds long ({len(clips)} clips)")
|
||||
|
||||
# Ensure temp directory exists
|
||||
self.temp_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Step 1: Download backgrounds
|
||||
print_step("📥 Downloading backgrounds (if needed)...")
|
||||
bg_config = self._get_background_config()
|
||||
self._download_backgrounds(bg_config)
|
||||
|
||||
# Step 2: Chop backgrounds to video length
|
||||
print_step("✂️ Chopping backgrounds to video length...")
|
||||
self._chop_backgrounds(bg_config, video_length)
|
||||
|
||||
# Step 3: Prepare background (crop to aspect ratio)
|
||||
print_step("🎬 Preparing background...")
|
||||
bg_path = self._prepare_background()
|
||||
background_clip = ffmpeg.input(bg_path)
|
||||
|
||||
# Step 4: Concat audio clips
|
||||
print_step("🔊 Building audio track...")
|
||||
audio_inputs = [ffmpeg.input(s["audio_path"]) for s in clips]
|
||||
audio_concat = ffmpeg.concat(*audio_inputs, a=1, v=0)
|
||||
audio_path = str(self.temp_dir / "audio.mp3")
|
||||
ffmpeg.output(
|
||||
audio_concat, audio_path, **{"b:a": "192k"}
|
||||
).overwrite_output().run(quiet=True)
|
||||
|
||||
# Step 5: Merge with background audio
|
||||
audio = ffmpeg.input(audio_path)
|
||||
final_audio = self._merge_background_audio(audio)
|
||||
|
||||
        # Step 6: Overlay screenshots
        print_step("🖼️ Overlaying screenshots...")
        screenshot_width = int((self.W * 45) // 100)
        current_time = 0

        for s in clips:
            img_input = ffmpeg.input(s["image_path"])["v"].filter("scale", screenshot_width, -1)
            img_overlay = img_input.filter("colorchannelmixer", aa=self.opacity)

            background_clip = background_clip.overlay(
                img_overlay,
                enable=f"between(t,{current_time},{current_time + s['audio_duration']})",
                x="(main_w-overlay_w)/2",
                y="(main_h-overlay_h)/2",
            )
            current_time += s["audio_duration"]

        # Scale to final resolution
        background_clip = background_clip.filter("scale", self.W, self.H)

        # Step 7: Render
        print_step("🎥 Rendering the video...")
        self.output_dir.mkdir(parents=True, exist_ok=True)

        # Normalize filename
        filename = self._normalize_filename(self.post.get("title", self.post_id))
        output_path = str(self.output_dir / f"{filename}.mp4")
        # Prevent path too long
        if len(output_path) > 251:
            output_path = output_path[:247] + ".mp4"

        from tqdm import tqdm

        pbar = tqdm(total=100, desc="Progress: ", bar_format="{l_bar}{bar}", unit=" %")

        def on_update(progress):
            status = round(progress * 100, 2)
            old_percentage = pbar.n
            pbar.update(status - old_percentage)

        with ProgressFfmpeg(video_length, on_update) as progress:
            try:
                ffmpeg.output(
                    background_clip,
                    final_audio,
                    output_path,
                    f="mp4",
                    **{
                        "c:v": self.encoder,
                        "b:v": "20M",
                        "b:a": "192k",
                        "threads": multiprocessing.cpu_count(),
                    },
                ).overwrite_output().global_args(
                    "-progress", progress.output_file.name
                ).run(
                    quiet=True,
                    overwrite_output=True,
                    capture_stdout=False,
                    capture_stderr=False,
                )
            except ffmpeg.Error as e:
                print_substep(f"FFmpeg error: {e.stderr.decode('utf8') if e.stderr else str(e)}", style="red")
                pbar.close()
                return ""

        old_percentage = pbar.n
        pbar.update(100 - old_percentage)
        pbar.close()

        # Save to tracking (shared videos.json)
        self._save_tracking(bg_config, output_path)

        # Cleanup temp files
        print_step("🗑️ Removing temporary files...")
        self._cleanup()

        self.post["output_path"] = output_path
        print_step(f"✅ Done! Video saved to: {output_path}")

        return output_path

    def _scan_local_files(self, directory: Path, extensions: tuple) -> list:
        """Scan a directory for files matching given extensions.

        Returns:
            List of Path objects, sorted by name
        """
        if not directory.exists():
            return []
        files = []
        for f in directory.iterdir():
            if f.is_file() and f.suffix.lower() in extensions:
                files.append(f)
        return sorted(files)

    def _get_background_config(self) -> dict:
        """Get background video & audio: random local file, or YouTube fallback.

        Priority:
            1. Scan the local background directories for video/audio files
            2. If any local files exist, pick one at random (the configured name is ignored)
            3. Otherwise fall back to YouTube: use the configured name if it is a
               known background, else pick a random known one

        Returns:
            dict with 'video_path', 'audio_path', 'video_credit', 'audio_credit',
            plus internal '_youtube_video' / '_youtube_audio' entries
        """
        import random

        result = {
            "video_path": None,
            "audio_path": None,
            "video_credit": "unknown",
            "audio_credit": "unknown",
            "_youtube_video": None,  # YouTube config tuple (for download if needed)
            "_youtube_audio": None,
        }

        # --- Video background ---
        video_exts = (".mp4", ".mkv", ".webm", ".avi", ".mov")
        local_videos = self._scan_local_files(self.bg_video_dir, video_exts)

        if local_videos:
            # Pick random from local files
            chosen = random.choice(local_videos)
            result["video_path"] = str(chosen)
            result["video_credit"] = chosen.stem
            print_substep(f"🎬 Background video: {chosen.name} (random from {len(local_videos)} files)")
        else:
            # Fallback: YouTube download via background_options
            try:
                from video_creation.background import background_options
                video_name = self.bg_video_name
                if video_name == "random" or video_name not in background_options["video"]:
                    video_name = random.choice(list(background_options["video"].keys()))
                result["_youtube_video"] = background_options["video"][video_name]
                print_substep(f"🎬 Background video: {video_name} (YouTube)")
            except Exception as e:
                print_substep(f"⚠ Could not load YouTube backgrounds: {e}", style="yellow")

        # --- Audio background ---
        if self.bg_audio_volume > 0:
            audio_exts = (".mp3", ".wav", ".ogg", ".m4a", ".flac", ".aac")
            local_audios = self._scan_local_files(self.bg_audio_dir, audio_exts)

            if local_audios:
                chosen = random.choice(local_audios)
                result["audio_path"] = str(chosen)
                result["audio_credit"] = chosen.stem
                print_substep(f"🎵 Background audio: {chosen.name} (random from {len(local_audios)} files)")
            else:
                try:
                    from video_creation.background import background_options
                    audio_name = self.bg_audio_name
                    if audio_name == "random" or audio_name not in background_options["audio"]:
                        audio_name = random.choice(list(background_options["audio"].keys()))
                    result["_youtube_audio"] = background_options["audio"][audio_name]
                    print_substep(f"🎵 Background audio: {audio_name} (YouTube)")
                except Exception as e:
                    print_substep(f"⚠ Could not load YouTube audio backgrounds: {e}", style="yellow")

        return result

    def _download_backgrounds(self, bg_config: dict):
        """Download YouTube backgrounds only if no local files were found."""
        if bg_config.get("_youtube_video"):
            from video_creation.background import download_background_video
            download_background_video(bg_config["_youtube_video"])
            # Set video_path to the downloaded file
            yt_cfg = bg_config["_youtube_video"]
            bg_config["video_path"] = f"assets/backgrounds/video/{yt_cfg[2]}-{yt_cfg[1]}"
            bg_config["video_credit"] = yt_cfg[2]

        if bg_config.get("_youtube_audio"):
            from video_creation.background import download_background_audio
            download_background_audio(bg_config["_youtube_audio"])
            yt_cfg = bg_config["_youtube_audio"]
            bg_config["audio_path"] = f"assets/backgrounds/audio/{yt_cfg[2]}-{yt_cfg[1]}"
            bg_config["audio_credit"] = yt_cfg[2]

    def _chop_backgrounds(self, bg_config: dict, video_length: int):
        """Chop background video and audio to match the video length."""
        from video_creation.background import get_start_and_end_times

        # Chop background audio
        if self.bg_audio_volume > 0 and bg_config.get("audio_path"):
            audio_file = bg_config["audio_path"]
            if Path(audio_file).exists():
                background_audio = AudioFileClip(audio_file)
                start_a, end_a = get_start_and_end_times(video_length, background_audio.duration)
                chopped = background_audio.subclipped(start_a, end_a)
                chopped.write_audiofile(str(self.temp_dir / "background.mp3"))
                background_audio.close()
                chopped.close()

        # Chop background video
        video_file = bg_config.get("video_path")
        if video_file and Path(video_file).exists():
            with VideoFileClip(video_file) as video:
                start_v, end_v = get_start_and_end_times(video_length, video.duration)
                chopped = video.subclipped(start_v, end_v)
                chopped.write_videofile(str(self.temp_dir / "background.mp4"))
        else:
            print_substep("⚠ No background video file found!", style="red")
            raise FileNotFoundError(f"Background video not found: {video_file}")

    def _prepare_background(self) -> str:
        """Crop background video to correct aspect ratio (W:H).

        Returns:
            Path to the cropped background video
        """
        output_path = str(self.temp_dir / "background_noaudio.mp4")
        try:
            (
                ffmpeg.input(str(self.temp_dir / "background.mp4"))
                .filter("crop", f"ih*({self.W}/{self.H})", "ih")
                .output(
                    output_path,
                    an=None,
                    **{
                        "c:v": self.encoder,
                        "b:v": "20M",
                        "threads": multiprocessing.cpu_count(),
                    },
                )
                .overwrite_output()
                .run(quiet=True)
            )
        except ffmpeg.Error as e:
            # quiet=True captures stderr, so decode it to show the real FFmpeg message
            print_substep(
                f"Background prepare error: {e.stderr.decode('utf8') if e.stderr else str(e)}",
                style="red",
            )
            raise
        return output_path

    def _merge_background_audio(self, tts_audio):
        """Merge TTS audio with background audio.

        Args:
            tts_audio: FFmpeg audio input of the TTS track

        Returns:
            Merged audio stream or original if background audio disabled
        """
        if self.bg_audio_volume == 0:
            return tts_audio

        bg_audio_path = self.temp_dir / "background.mp3"
        if not bg_audio_path.exists():
            return tts_audio

        bg_audio = ffmpeg.input(str(bg_audio_path)).filter("volume", self.bg_audio_volume)
        merged = ffmpeg.filter([tts_audio, bg_audio], "amix", duration="longest")
        return merged

    def _normalize_filename(self, name: str) -> str:
        """Normalize a string to be safe for filenames."""
        # Remove problematic characters
        name = re.sub(r'[?\\"%*:|<>]', "", name)
        name = re.sub(r"[/]", " ", name)
        name = name.strip()
        if not name:
            name = self.post_id
        # Limit length
        return name[:100]

    def _save_tracking(self, bg_config: dict, output_path: str):
        """Save rendered video info to shared videos.json.

        Handles missing file gracefully (creates it if needed).
        Does NOT import from utils.videos to avoid praw dependency.
        """
        import json
        import time as t

        videos_path = Path("./video_creation/data/videos.json")
        videos_path.parent.mkdir(parents=True, exist_ok=True)

        # Load existing data or start fresh
        done_vids = []
        if videos_path.exists():
            try:
                with open(videos_path, "r", encoding="utf-8") as f:
                    done_vids = json.load(f)
            except (json.JSONDecodeError, IOError):
                done_vids = []

        # Skip if already recorded
        if self.post_id in [v.get("id") for v in done_vids]:
            return

        payload = {
            "subreddit": self.post.get("platform", "manual"),
            "id": self.post_id,
            "time": str(int(t.time())),
            "background_credit": bg_config.get("video_credit", "unknown"),
            "reddit_title": self.post.get("title", ""),
            "filename": Path(output_path).name,
        }
        done_vids.append(payload)

        with open(videos_path, "w", encoding="utf-8") as f:
            json.dump(done_vids, f, ensure_ascii=False, indent=4)

    def _cleanup(self):
        """Remove temporary files for this post."""
        temp_path = f"assets/temp/{self.post_id}/"
        if Path(temp_path).exists():
            import shutil
            shutil.rmtree(temp_path)
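The overlay loop in `build()` shows each screenshot for exactly the duration of its audio clip by accumulating a running start time. The timing logic can be sketched in isolation, without FFmpeg. This is only an illustration: `overlay_windows` and `enable_expr` are hypothetical helper names that do not exist in this commit, and the durations are made-up values.

```python
def overlay_windows(durations):
    """Return consecutive (start, end) display windows, one per clip.

    Mirrors the running `current_time` counter in the overlay loop:
    each clip starts where the previous one ended.
    """
    windows = []
    current_time = 0.0
    for duration in durations:
        windows.append((current_time, current_time + duration))
        current_time += duration
    return windows


def enable_expr(start, end):
    """Build the timeline expression passed as `enable=` to the overlay filter."""
    return f"between(t,{start},{end})"


if __name__ == "__main__":
    # Illustrative audio durations in seconds (assumed values).
    for start, end in overlay_windows([2.5, 4.0, 1.5]):
        print(enable_expr(start, end))
```

FFmpeg's `between(t,a,b)` is inclusive at both ends, so two adjacent windows both match at the shared boundary instant. That overlap lasts at most one frame and the later overlay is drawn on top.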
@ -0,0 +1,438 @@
#!/usr/bin/env python
"""
Manual Screenshot → Video Pipeline: Entry Point

Create videos from manually captured screenshots and text files,
without requiring any social media API access.

Supports screenshots from: Reddit, Threads (Meta), X (Twitter), or any platform.

Usage:
    python manual_main.py init <post_id> [--platform reddit|threads|x|other]
    python manual_main.py render <post_id>
    python manual_main.py render --all
    python manual_main.py list
"""

import argparse
import json
import sys
from os.path import exists
from pathlib import Path

import toml

from utils import settings
from utils.console import print_markdown, print_step, print_substep
from utils.ffmpeg_install import ffmpeg_install
from manual.scanner import PostScanner
from manual.tts_processor import ManualTTSProcessor
from manual.video_builder import ManualVideoBuilder

__VERSION__ = "1.0.0"


# ────────────────────────────────────────────────────────────────
# Configuration
# ────────────────────────────────────────────────────────────────

# Default config for manual pipeline (used when [manual] section not in config.toml)
MANUAL_DEFAULTS = {
    "input_dir": "manual_posts",
    "output_dir": "manual_results",
    "encoder": "libx264",
    "resolution_w": 1080,
    "resolution_h": 1920,
    "opacity": 0.9,
    "background_video": "random",
    "background_audio": "random",
    "background_video_dir": "assets/backgrounds/video",
    "background_audio_dir": "assets/backgrounds/audio",
    "background_audio_volume": 0.1,
    "max_video_length": 120,
}

# Full default settings.config that TTS engines and shared modules expect.
# This ensures the manual flow works even if config.toml is empty or missing sections.
_BASE_SETTINGS_DEFAULTS = {
    "reddit": {
        "creds": {
            "client_id": "",
            "client_secret": "",
            "username": "",
            "password": "",
            "2fa": False,
        },
        "thread": {
            "subreddit": "",
            "post_id": "",
            "max_comment_length": 500,
            "min_comment_length": 1,
            "post_lang": "vi",
            "min_comments": 20,
            "blocked_words": "",
        },
    },
    "ai": {
        "ai_similarity_enabled": False,
        "ai_similarity_keywords": "",
    },
    "settings": {
        "allow_nsfw": False,
        "theme": "dark",
        "times_to_run": 1,
        "opacity": 0.9,
        "storymode": False,
        "storymodemethod": 1,
        "storymode_max_length": 1000,
        "resolution_w": 1080,
        "resolution_h": 1920,
        "zoom": 1,
        "channel_name": "Reddit Tales",
        "background": {
            "background_video": "minecraft",
            "background_audio": "lofi",
            "background_audio_volume": 0.1,
            "enable_extra_audio": False,
            "background_thumbnail": False,
            "background_thumbnail_font_family": "arial",
            "background_thumbnail_font_size": 96,
            "background_thumbnail_font_color": "255,255,255",
        },
        "tts": {
            "voice_choice": "googletranslate",
            "random_voice": False,
            "elevenlabs_voice_name": "Bella",
            "elevenlabs_api_key": "",
            "aws_polly_voice": "Matthew",
            "streamlabs_polly_voice": "Matthew",
            "tiktok_voice": "en_us_001",
            "tiktok_sessionid": "",
            "python_voice": "1",
            "py_voice_num": "2",
            "silence_duration": 0.3,
            "no_emojis": False,
            "openai_api_url": "https://api.openai.com/v1/",
            "openai_api_key": "",
            "openai_voice_name": "alloy",
            "openai_model": "tts-1",
        },
    },
}


def _deep_merge(base: dict, override: dict) -> dict:
    """Deep merge two dicts. Values in 'override' take priority."""
    result = base.copy()
    for key, value in override.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = _deep_merge(result[key], value)
        else:
            result[key] = value
    return result


def load_config() -> dict:
    """Load config and set up settings.config for TTS engines and backgrounds.

    Strategy:
        1. Start with full default config (so TTS engines always have what they need)
        2. If config.toml exists and has content, deep-merge on top of defaults
        3. Extract [manual] section for manual-specific settings
        4. Set settings.config globally so shared modules (TTS, background, etc.) work

    Returns:
        dict: Manual-specific config merged with defaults
    """
    # Start with complete defaults
    config = _deep_merge({}, _BASE_SETTINGS_DEFAULTS)

    # Try to load config.toml and merge on top
    config_path = Path("config.toml")
    if config_path.exists():
        try:
            file_config = toml.load(str(config_path))
            if file_config:  # Not empty
                config = _deep_merge(config, file_config)
                print_substep("Loaded config from config.toml", style="dim")
        except Exception as e:
            print_substep(f"Warning: Could not parse config.toml: {e}", style="yellow")
    else:
        print_substep(
            "config.toml not found, using built-in defaults. "
            "TTS will use GoogleTranslate (no API key needed).",
            style="yellow",
        )

    # Set global settings.config so TTS engines and shared modules work
    settings.config = config

    # Build manual-specific config: defaults + [manual] section from config.toml
    manual_config = {**MANUAL_DEFAULTS}
    if "manual" in config:
        manual_config.update(config["manual"])

    return manual_config


# ────────────────────────────────────────────────────────────────
# Commands
# ────────────────────────────────────────────────────────────────


def cmd_init(args, manual_config):
    """Create a new post folder with template files."""
    from manual.scanner import create_post_folder

    post_id = args.post_id
    platform = getattr(args, "platform", "reddit")

    input_dir = manual_config["input_dir"]
    post_dir = create_post_folder(input_dir, post_id, platform)

    print_markdown(f"### Post folder created: `{post_dir}`")


def cmd_render(args, manual_config):
    """Render one or all posts into videos."""

    scanner = PostScanner(input_dir=manual_config["input_dir"])

    if args.all:
        # Render all ready posts
        posts = scanner.scan_all()
        if not posts:
            print_substep("No valid posts found in the input directory.", style="red")
            return

        # Filter out already rendered
        posts_to_render = []
        for post in posts:
            if _is_already_done(post["post_id"]):
                print_substep(f" ⏭ {post['post_id']}: already rendered, skipping", style="blue")
            else:
                posts_to_render.append(post)

        if not posts_to_render:
            print_substep("All posts have already been rendered!", style="green")
            return

        print_step(f"📋 Rendering {len(posts_to_render)} posts...")
        for i, post in enumerate(posts_to_render):
            print_markdown(
                f"### [{i+1}/{len(posts_to_render)}] Rendering: {post['post_id']}"
            )
            _render_single(post, manual_config)
    else:
        # Render single post
        if not args.post_id:
            print_substep("Please specify a post_id or use --all", style="red")
            return

        post = scanner.scan_one(args.post_id)
        if post is None:
            return  # Error already printed by scanner

        if _is_already_done(post["post_id"]) and not args.force:
            print_substep(
                f"Post '{post['post_id']}' already rendered. Use --force to re-render.",
                style="yellow",
            )
            return

        _render_single(post, manual_config)


def _render_single(post_object: dict, manual_config: dict):
    """Render a single post into a video.

    Pipeline:
        1. TTS: Convert text → MP3 audio files
        2. Video: Assemble screenshots + audio + background → MP4
    """
    post_id = post_object["post_id"]
    print_step(f"🚀 Starting render for: {post_id}")

    # Step 1: TTS
    max_length = manual_config.get("max_video_length", 120)
    tts = ManualTTSProcessor(post_object, max_length=max_length)
    post_object = tts.process()

    # Check if we have audio
    clips_with_audio = [s for s in post_object["screenshots"] if s.get("audio_path")]
    if not clips_with_audio:
        print_substep("No audio generated. Check text files.", style="red")
        return

    # Step 2: Video build
    builder = ManualVideoBuilder(post_object, manual_config)
    output_path = builder.build()

    if output_path:
        print_markdown(f"### ✅ Video saved: `{output_path}`")
    else:
        print_substep("Video rendering failed.", style="red")


def cmd_list(args, manual_config):
    """List all posts and their status."""
    # PostScanner is already imported at module level; no local import needed.
    scanner = PostScanner(input_dir=manual_config["input_dir"])
    statuses = scanner.list_status()

    if not statuses:
        print_substep(
            f"No posts found in '{manual_config['input_dir']}/'. "
            "Run 'python manual_main.py init <post_id>' to create one.",
            style="yellow",
        )
        return

    # Status emoji map
    status_icons = {
        "ready": "✅",
        "incomplete": "⚠️",
        "empty": "❌",
    }

    print_step("📋 Manual Posts Status")
    print()

    for s in statuses:
        icon = status_icons.get(s["status"], "❓")
        rendered = "🎬" if _is_already_done(s["post_id"]) else " "
        print_substep(
            f" {icon} {rendered} {s['post_id']:30s} "
            f"| {s['num_images']} 🖼️ {s.get('num_audios', 0)} 🎵 {s['num_texts']} 📝 "
            f"| {s['status']}",
            style="bold" if s["status"] == "ready" else "",
        )
        if s["errors"]:
            for err in s["errors"]:
                print_substep(f" ↳ {err}", style="red")

    print()
    ready_count = sum(1 for s in statuses if s["status"] == "ready")
    rendered_count = sum(1 for s in statuses if _is_already_done(s["post_id"]))
    print_substep(
        f" Total: {len(statuses)} posts | "
        f"{ready_count} ready | "
        f"{rendered_count} rendered",
        style="bold cyan",
    )


def _is_already_done(post_id: str) -> bool:
    """Check if a post has already been rendered (shared videos.json)."""
    videos_path = "./video_creation/data/videos.json"
    if not exists(videos_path):
        return False
    try:
        with open(videos_path, "r", encoding="utf-8") as f:
            done_videos = json.load(f)
        return any(v.get("id") == post_id for v in done_videos)
    except (json.JSONDecodeError, IOError):
        return False


# ────────────────────────────────────────────────────────────────
# CLI
# ────────────────────────────────────────────────────────────────


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="manual_main.py",
        description="Manual Screenshot → Video Pipeline. "
        "Create videos from screenshots captured from Reddit, Threads, X, or any platform.",
    )
    parser.add_argument(
        "--version", action="version", version=f"%(prog)s {__VERSION__}"
    )
    subparsers = parser.add_subparsers(dest="command", help="Available commands")

    # init command
    init_parser = subparsers.add_parser("init", help="Create a new post folder with template files")
    init_parser.add_argument("post_id", type=str, help="Name/ID for the post folder")
    init_parser.add_argument(
        "--platform",
        type=str,
        default="reddit",
        choices=["reddit", "threads", "x", "other"],
        help="Source platform (default: reddit)",
    )

    # render command
    render_parser = subparsers.add_parser("render", help="Render post(s) into video(s)")
    render_parser.add_argument(
        "post_id", type=str, nargs="?", default=None, help="Post ID to render"
    )
    render_parser.add_argument(
        "--all", action="store_true", help="Render all unrendered posts"
    )
    render_parser.add_argument(
        "--force", action="store_true", help="Re-render even if already done"
    )

    # list command
    subparsers.add_parser("list", help="List all posts and their status")

    return parser


def main():
    print(
        """
╔══════════════════════════════════════════════════════════╗
║        Manual Screenshot → Video Pipeline v1.0.0         ║
║      Supports: Reddit • Threads • X • Any Platform       ║
╚══════════════════════════════════════════════════════════╝
"""
    )

    parser = build_parser()
    args = parser.parse_args()

    if not args.command:
        parser.print_help()
        sys.exit(1)

    # Check Python version
    if sys.version_info.major != 3 or sys.version_info.minor not in [10, 11, 12]:
        print("This program requires Python 3.10, 3.11, or 3.12.")
        sys.exit(1)

    # Check FFmpeg
    ffmpeg_install()

    # Load config
    manual_config = load_config()

    # Create input directory if it doesn't exist
    input_dir = Path(manual_config["input_dir"])
    input_dir.mkdir(parents=True, exist_ok=True)

    # Dispatch command
    commands = {
        "init": cmd_init,
        "render": cmd_render,
        "list": cmd_list,
    }

    cmd_func = commands.get(args.command)
    if cmd_func:
        try:
            cmd_func(args, manual_config)
        except KeyboardInterrupt:
            print("\nInterrupted by user.")
            sys.exit(0)
        except Exception as e:
            print_substep(f"Error: {e}", style="red")
            raise
    else:
        parser.print_help()


if __name__ == "__main__":
    main()
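`load_config()` layers settings in a fixed order: built-in defaults first, then whatever `config.toml` provides, with nested tables merged key by key. A minimal sketch of that layering follows. It is a variant rather than the committed code: it uses `copy.deepcopy` so the merged result never shares nested dicts with the defaults (the committed `_deep_merge` uses a shallow `dict.copy()`), and the `defaults` and `file_config` values are illustrative assumptions.

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into a deep copy of `base`.

    Nested dicts are merged key by key; any other value in `override`
    replaces the value in `base`. Neither input is mutated.
    """
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(result.get(key), dict) and isinstance(value, dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Illustrative subset of the defaults (assumed values for the example).
defaults = {
    "settings": {
        "opacity": 0.9,
        "tts": {"voice_choice": "googletranslate", "silence_duration": 0.3},
    }
}

# What a user's config.toml might parse to (hypothetical content).
file_config = {
    "settings": {"tts": {"voice_choice": "elevenlabs"}},
    "manual": {"encoder": "h264_nvenc"},
}

merged = deep_merge(defaults, file_config)

if __name__ == "__main__":
    print(merged["settings"]["tts"])
    print(merged["manual"])
```

Only `voice_choice` is overridden here; `silence_duration` and `opacity` survive from the defaults, which is what lets shared modules read `settings.config` even when `config.toml` is sparse.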
@ -1,18 +1,18 @@
 {
     "__comment": "Supported Backgrounds Audio. Can add/remove background audio here...",
     "lofi": [
-        "https://www.youtube.com/watch?v=LTphVIore3A",
+        "https://www.youtube.com/watch?v=Q7HjxOAU5Kc",
         "lofi.mp3",
-        "Super Lofi World"
+        "Breaking Copyright"
     ],
     "lofi-2":[
-        "https://www.youtube.com/watch?v=BEXL80LS0-I",
+        "https://www.youtube.com/watch?v=cTMOQiY0axo",
         "lofi-2.mp3",
-        "stompsPlaylist"
+        "Breaking Copyright"
     ],
-    "chill-summer":[
-        "https://www.youtube.com/watch?v=EZE8JagnBI8",
-        "chill-summer.mp3",
-        "Mellow Vibes Radio"
+    "lofi-3":[
+        "https://www.youtube.com/watch?v=4sFVeqvJu-0",
+        "lofi-3.mp3",
+        "Chill - Copyright Free Music"
     ]
 }
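Each audio entry in this JSON is a three-element list of URL, file name, and credit, and `_download_backgrounds` builds the local path as `{credit}-{filename}` from elements 2 and 1. A small sketch of that mapping, assuming the `__comment` key is dropped before lookup; `load_audio_options` and `local_audio_path` are hypothetical helpers, not functions from the repository, and the embedded JSON is a trimmed excerpt of the new entries.

```python
import json
from pathlib import PurePosixPath

# Trimmed excerpt of the updated audio entries (comment text shortened).
AUDIO_JSON = """
{
    "__comment": "Supported Backgrounds Audio.",
    "lofi": ["https://www.youtube.com/watch?v=Q7HjxOAU5Kc", "lofi.mp3", "Breaking Copyright"],
    "lofi-3": ["https://www.youtube.com/watch?v=4sFVeqvJu-0", "lofi-3.mp3", "Chill - Copyright Free Music"]
}
"""


def load_audio_options(raw):
    """Parse the JSON and drop the documentation-only `__comment` key."""
    options = json.loads(raw)
    options.pop("__comment", None)
    return {name: tuple(entry) for name, entry in options.items()}


def local_audio_path(entry, base_dir="assets/backgrounds/audio"):
    """Map a (url, filename, credit) entry to its expected download path."""
    _url, filename, credit = entry
    return PurePosixPath(base_dir) / f"{credit}-{filename}"


if __name__ == "__main__":
    for name, entry in load_audio_options(AUDIO_JSON).items():
        print(name, "->", local_audio_path(entry))
```

Because the credit string becomes part of the file name, a credit containing spaces or dashes (as in the `lofi-3` entry) ends up verbatim in the path.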