indiser/ViralContent-Factory

🎬 ViralContent Factory

Autonomous AI-Powered Viral Content Generation Pipeline

Fully automated Reddit story scraping → AI voice synthesis → viral short-form video generation

Features • Architecture • Installation • Usage • Tech Stack


🚀 Overview

ViralContent Factory is an end-to-end automated content generation system that transforms Reddit stories into professionally edited, viral-ready short-form videos for TikTok, YouTube Shorts, and Instagram Reels. The pipeline handles everything from content discovery to final video rendering with zero manual intervention.

💡 What Makes This Special?

  • 🤖 Fully Autonomous: Set it and forget it. The system runs via scheduled tasks (3 videos per run)
  • 🧠 AI-Powered Intelligence: Multi-provider LLM router with automatic failover across 5 AI providers
  • 🎯 Production-Ready: Includes failover systems, persistent database, and email alerting
  • ⚡ Optimized Performance: Word-level subtitle sync, smart caching, and resource management
  • 📊 Scalable Architecture: Modular phase-based design for easy extension and maintenance
  • 🔄 Smart LLM Routing: Automatic failover between Groq, Cerebras, Gemini, HuggingFace, and OpenRouter

✨ Features

🔍 Phase 1: Intelligent Content Acquisition

  • Multi-Source Scraping: Waterfall system across 30+ high-engagement subreddits (AITA, TIFU, TrueOffMyChest, confessions, pettyrevenge, etc.)
  • Smart Filtering:
    • Language detection (English-only)
    • Optimal word count (120-380 words for 60-180 second videos)
    • Duplicate prevention via persistent JSON database
    • Automatic removal of deleted/removed posts
  • AI Enhancement:
    • Multi-provider LLM router with automatic quota management
    • Gender detection for voice matching (fast models)
    • Viral hook generation with creative reasoning (strong models)
    • Hook A/B testing (AI-generated vs original title ranking)
    • Dynamic SEO tag generation (5 keywords per video)
    • Slang/acronym normalization (AITA → "Am I the jerk", 19F → "a 19 year old woman", etc.)
  • Failover System: Falls back to local cold storage if all live sources fail
  • Upload Automation: YouTube and Instagram automation modules (setup required)
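
The slang/acronym normalization step listed above can be sketched roughly as follows. This is a minimal illustration, not the code in phase1.py: the `ACRONYMS` table, the `TIFU` expansion, the regex, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical lookup table; the project's real acronym list is not shown in this README.
ACRONYMS = {
    "AITA": "Am I the jerk",
    "TIFU": "Today I messed up",
}

# Matches age/gender tags such as "19F", "32M", or "(24f)".
AGE_GENDER = re.compile(r"\(?\b(\d{1,2})([MFmf])\b\)?")


def _expand_age_gender(match: re.Match) -> str:
    """Turn a tag like 19F into a phrase a TTS voice can read naturally."""
    age, gender = match.group(1), match.group(2).upper()
    noun = "woman" if gender == "F" else "man"
    # "an 8 / 11 / 18 / 80-89 year old", otherwise "a ..."
    article = "an" if age.startswith("8") or age in {"11", "18"} else "a"
    return f"{article} {age} year old {noun}"


def normalize_for_tts(text: str) -> str:
    """Expand known acronyms, then age/gender tags."""
    for short, spoken in ACRONYMS.items():
        text = re.sub(rf"\b{re.escape(short)}\b", spoken, text)
    return AGE_GENDER.sub(_expand_age_gender, text)


print(normalize_for_tts("AITA for yelling at my sister (19F)?"))
```

A pattern this simple will also rewrite strings like "5m" when they mean "five meters", so a production version would likely need extra context checks.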

🎙️ Phase 2: Professional Audio Synthesis

  • Edge TTS Integration: Microsoft's neural voices for natural-sounding narration
  • Dynamic Voice Selection: Gender-matched voices (3 female variants: Jenny/Michelle/Aria, 1 male: Christopher)
  • Word-Level Timing: Precise timestamp extraction for perfect subtitle synchronization
  • Sync Offset System: Configurable timing adjustment (-0.3s default) for perfect alignment
  • Fallback Mechanisms: Sentence-level heuristics if word boundaries fail
  • JSON Export: Word-by-word timing data saved for video compositor
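
To make the word-level timing and sync offset concrete, here is a small sketch of converting boundary events into second-based timestamps. It assumes events shaped like the WordBoundary chunks that edge-tts streams (with `offset` and `duration` in 100-nanosecond ticks); the `WordTiming` class and `to_word_timings` function are hypothetical names, not taken from phase2.py.

```python
from dataclasses import dataclass

TICKS_PER_SECOND = 10_000_000  # assumption: offsets/durations arrive in 100-nanosecond ticks
SYNC_OFFSET = -0.3             # seconds; negative shifts subtitles earlier


@dataclass
class WordTiming:
    word: str
    start: float  # seconds
    end: float    # seconds


def to_word_timings(boundaries, sync_offset=SYNC_OFFSET):
    """Convert boundary events into per-word start/end times in seconds."""
    timings = []
    for event in boundaries:
        start = event["offset"] / TICKS_PER_SECOND + sync_offset
        end = start + event["duration"] / TICKS_PER_SECOND
        # Clamp so a negative offset never yields a timestamp before 0.
        timings.append(WordTiming(event["text"], max(start, 0.0), max(end, 0.0)))
    return timings


events = [
    {"offset": 5_000_000, "duration": 2_500_000, "text": "Hello"},
    {"offset": 8_000_000, "duration": 4_000_000, "text": "Reddit"},
]
for timing in to_word_timings(events):
    print(timing)
```

Each `WordTiming` can be serialized with `dataclasses.asdict` and `json.dump` to produce the kind of word-by-word timing file the compositor reads.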

🎥 Phase 3: Viral Video Composition

  • 9:16 Vertical Format: Optimized for mobile-first platforms
  • Dynamic Background Selection: Random gameplay footage (Minecraft, GTA 5)
  • Animated Subtitles:
    • Impact font with stroke for maximum readability
    • 3-word chunks with pop-in animations
    • Mathematically synced to word-level audio timestamps
    • Configurable sync offset for perfect timing
  • Smart Cropping: Automatic center-crop from 16:9 to 9:16
  • Random Start Points: Prevents repetitive background footage
  • Test Mode: 10-second preview rendering for quick testing
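
The center-crop arithmetic behind the 16:9 to 9:16 conversion can be sketched as below. The function name and the even-dimension rounding are assumptions for illustration; the actual logic in phase3.py may differ.

```python
def center_crop_box(width: int, height: int, target_ratio: float = 9 / 16):
    """Return (x1, y1, x2, y2) of the largest centered box with the target aspect ratio."""
    if width / height > target_ratio:
        # Source is too wide: keep the full height and trim the sides.
        new_w = int(round(height * target_ratio))
        new_w -= new_w % 2  # even dimensions are safer for common video codecs
        x1 = (width - new_w) // 2
        return x1, 0, x1 + new_w, height
    # Source is too tall (or already matches): keep the full width, trim top and bottom.
    new_h = int(round(width / target_ratio))
    new_h -= new_h % 2
    y1 = (height - new_h) // 2
    return 0, y1, width, y1 + new_h


print(center_crop_box(1920, 1080))  # box for a 1080p landscape source
```

With MoviePy 1.0.3, the resulting box could be passed to `clip.crop(x1=..., y1=..., x2=..., y2=...)` and the result resized to 1080x1920.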

🤖 LLM Router System

  • Multi-Provider Architecture: Supports 5 AI providers with automatic failover
  • Intelligent Task Routing:
    • Fast models (OpenRouter, HuggingFace, Gemini) for classification and tagging
    • Strong models (Groq, Cerebras) for creative writing and reasoning
  • Quota Management: Automatically detects rate limits (429, 400 errors) and switches providers
  • Error Recovery: Retry logic with provider fallback chain
  • Cost Optimization: Routes cheap tasks to free tiers, expensive tasks to premium models
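
The failover behaviour described above boils down to trying providers in priority order and moving on when one fails. The sketch below shows that pattern with stand-in provider functions; `route`, `QuotaExceeded`, `AllProvidersFailed`, `rate_limited`, and `healthy` are hypothetical names, and real provider wrappers would call the respective SDKs.

```python
from typing import Callable, Sequence


class QuotaExceeded(Exception):
    """Hypothetical error a provider wrapper raises on HTTP 429/400 responses."""


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


def route(prompt: str, providers: Sequence[Callable[[str], str]]) -> str:
    """Try each provider in order and return the first successful reply."""
    errors = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # quota errors, timeouts, malformed replies
            name = getattr(provider, "__name__", repr(provider))
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in providers for demonstration only.
def rate_limited(prompt: str) -> str:
    raise QuotaExceeded("429 Too Many Requests")


def healthy(prompt: str) -> str:
    return f"hook for: {prompt}"


print(route("my story", [rate_limited, healthy]))
```

Keeping each provider behind the same one-argument callable makes the priority lists easy to reorder without touching the routing loop.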

🔧 Production Features

  • Automated Cleanup: Removes temporary audio/JSON files after each run
  • Batch Management: Collects 7+ videos before triggering upload alert
  • Email Notifications: Gmail SMTP alerts when batch threshold reached
  • Sanitized Filenames: OS-safe naming with Reddit ID-based uniqueness
  • Error Handling: Comprehensive try/except blocks with detailed logging
  • Video Path Utilities: Batch processing helpers for upload automation
  • Persistent Database: JSON-based story tracking with "used" flag system
  • Sleep Prevention: Windows execution state management to prevent system sleep
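
As an illustration of the sanitized-filename feature, the following sketch builds an OS-safe name from a post title and appends the Reddit ID for uniqueness. The function name, the allowed character set, and the 60-character cap are assumptions, not the project's actual rules.

```python
import re


def safe_filename(title: str, reddit_id: str, max_len: int = 60) -> str:
    """Build an OS-safe .mp4 filename that stays unique via the Reddit post ID."""
    # Drop anything that is not a letter, digit, space, or hyphen.
    cleaned = re.sub(r"[^A-Za-z0-9 \-]", "", title)
    # Collapse whitespace runs into single underscores.
    cleaned = re.sub(r"\s+", "_", cleaned.strip())
    # Truncate long titles; fall back to a placeholder if nothing survives.
    cleaned = cleaned[:max_len].rstrip("_") or "untitled"
    return f"{cleaned}_{reddit_id}.mp4"


print(safe_filename('AITA: for "leaving" my/own party?', "1abc2d"))
```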

🏗️ System Architecture

┌───────────────────────────────────────┐
│      MAIN PIPELINE ORCHESTRATOR       │
│          (main_pipeline.py)           │
│   Prevents system sleep during run    │
└───────────────────┬───────────────────┘
                    │
          ┌─────────┴─────────┐
          │                   │
          ▼                   ▼
    ┌───────────┐       ┌───────────┐
    │  Phase 1  │───────│  Phase 2  │
    │  Scraper  │       │   Audio   │
    │  +AI LLM  │       │  +Timing  │
    └─────┬─────┘       └─────┬─────┘
          │                   │
          │                   ▼
          │             ┌───────────┐
          │             │  Phase 3  │
          └─────────────│   Video   │
                        │ Compositor│
                        └─────┬─────┘
                              │
                              ▼
                      ┌───────────────┐
                      │   Cleanup &   │
                      │ Notification  │
                      └───────┬───────┘
                              │
                              ▼
                      ┌───────────────┐
                      │    Upload     │
                      │  Automation   │
                      │ (Manual/API)  │
                      └───────────────┘

📁 Project Structure

AutoContent/
├── 📜 main_pipeline.py  # Orchestrator - coordinates all phases, prevents sleep
├── 🔍 phase1.py  # Content acquisition & AI processing (30+ subreddits)
├── 🎙️ phase2.py  # Audio synthesis & word-level timestamp extraction
├── 🎥 phase3.py  # Video composition & subtitle rendering
├── 🤖 llm_router.py  # Multi-provider LLM failover system (5 providers)
├── 📥 yt_downloader.py  # Background footage downloader (yt-dlp wrapper)
├── 📧 reminder.py  # Batch management & email alerts (7-video threshold)
├── 📤 yt_automation.py  # YouTube upload automation (OAuth setup required)
├── 📱 ig_login.py  # One-time Instagram login (run once)
├── 📱 insta_automation.py  # Instagram upload automation (Graph API setup required)
├── 🔧 get_videopaths.py  # Video path utility for batch processing
├── ⚙️ run_factory.bat  # Windows Task Scheduler entry point (3 videos per run)
├── 📦 requirements.txt  # Python dependencies
├── 🗄️ scripts.json  # Persistent story database with "used" tracking
├── 📝 hidden_depedencies.txt  # System dependency checklist
├── 📄 TrendingDescription.txt  # Sample trending content reference
├── 🎬 downloads/  # Background video assets (2 videos included)
├── 📤 reels/  # Final rendered videos (staging area)
└── 📦 ready_to_upload/  # Batched videos ready for upload (7 videos)

🛠️ Tech Stack

| Category         | Technology            | Purpose                                         |
|------------------|-----------------------|-------------------------------------------------|
| Language         | Python 3.11+          | Core runtime                                    |
| AI/LLM           | Multi-Provider Router | Groq, Cerebras, Gemini, HuggingFace, OpenRouter |
| Voice Synthesis  | Edge-TTS              | Neural text-to-speech (streaming)               |
| Video Processing | MoviePy 1.0.3         | Compositing & rendering                         |
| Image Processing | ImageMagick           | Text rendering backend for subtitles            |
| Web Scraping     | Requests              | Reddit JSON API interaction                     |
| NLP              | langdetect            | Language filtering                              |
| Video Download   | yt-dlp                | Background footage acquisition                  |
| Email            | smtplib               | Gmail SMTP notifications                        |
| Environment      | python-dotenv         | Secure credential management                    |

📦 Installation

Prerequisites

# Required System Dependencies
- Python 3.11 or higher
- FFmpeg (for audio/video processing)
- ImageMagick (for subtitle rendering)
- Deno or Node.js (for yt-dlp YouTube signature extraction)

Step 1: Clone the Repository

git clone https://github.com/indiser/ViralContent-Factory.git
cd ViralContent-Factory

Step 2: Install Python Dependencies

pip install -r requirements.txt

Dependencies installed:

  • requests
  • python-dotenv
  • langdetect
  • edge-tts
  • moviepy==1.0.3
  • yt-dlp
  • groq
  • openai
  • google-genai
  • huggingface_hub

Step 3: Install System Dependencies

Windows (via winget):

winget install Gyan.FFmpeg
winget install ImageMagick.ImageMagick
winget install DenoLand.Deno

macOS (via Homebrew):

brew install ffmpeg imagemagick deno

Linux (Ubuntu/Debian):

sudo apt update
sudo apt install ffmpeg imagemagick
curl -fsSL https://deno.land/install.sh | sh

Step 4: Configure Environment Variables

Create a .env file in the project root:

# LLM API Keys (at least one required, more = better failover)
GROQ_API_KEY=your_groq_api_key_here
CEREBRAS_API_KEY=your_cerebras_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
HUGGINGFACE_API_KEY=your_huggingface_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
# Gmail SMTP (for batch notifications)
EMAIL_USER=your_email@gmail.com
EMAIL_APP_PASS=your_gmail_app_password

Note: Gmail requires an App Password here, not your regular account password.

LLM Keys: One API key is enough to start, but multiple keys improve reliability through automatic failover.
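
A quick way to see which providers are usable is to check which keys are actually set. The sketch below is an assumption-laden helper (the `PROVIDER_KEYS` mapping and `configured_providers` function are not part of the project); it treats the `your_..._here` placeholders from the template above as unset.

```python
import os

# Maps a provider label to the environment variable named in the .env template.
PROVIDER_KEYS = {
    "groq": "GROQ_API_KEY",
    "cerebras": "CEREBRAS_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "huggingface": "HUGGINGFACE_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def configured_providers(env=os.environ):
    """Return provider labels whose key is set and is not a template placeholder."""
    ready = []
    for name, var in PROVIDER_KEYS.items():
        value = env.get(var, "").strip()
        if value and not value.startswith("your_"):
            ready.append(name)
    return ready


print(configured_providers())
```

Calling `load_dotenv()` from python-dotenv before this check copies the values from .env into `os.environ`.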

Step 5: Download Background Videos

python yt_downloader.py "https://youtube.com/watch?v=MINECRAFT_VIDEO_ID"
python yt_downloader.py "https://youtube.com/watch?v=GTA5_VIDEO_ID"

Or manually place 9:16 or 16:9 gameplay videos in the downloads/ folder.

Current background videos:

  • Insanely Crazy GTA 5 Mega Ramp Gameplay (4K 60fps)
  • Minecraft Parkour Gameplay No Copyright (4K)

Step 6: Configure ImageMagick Path (Windows Only)

Edit phase3.py line 5 to match your ImageMagick installation:

os.environ["IMAGEMAGICK_BINARY"] = r"C:\Program Files\ImageMagick-7.1.2-Q16-HDRI\magick.exe"

Step 7: Configure Batch Script Path (Windows Only)

Edit run_factory.bat lines 5 and 17 to match your project location and Python installation:

cd /d "C:\Users\YOUR_USERNAME\Desktop\AutoContent"
"C:\Path\To\Your\python.exe" main_pipeline.py

🎯 Usage

Manual Execution (Single Video)

python main_pipeline.py

Automated Batch Execution (Windows)

  1. Open Task Scheduler
  2. Create a new task:
    • Trigger: Daily at 3:00 AM (or your preferred time)
    • Action: Run run_factory.bat
  3. The system will automatically:
    • Generate 3 videos per run (configurable in batch script)
    • Collect videos until 7+ are ready
    • Send email alert when batch threshold is reached

Batch script configuration:

  • Edit run_factory.bat line 9 to change video count: FOR /L %%A IN (1,1,3) (change 3 to desired count)

Batch Management

python reminder.py

This checks whether at least 7 videos are waiting in reels/ and, if so, moves them to the ready_to_upload/ folder.
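
The threshold check can be pictured with the sketch below. It is an illustration only: `promote_batch` is a hypothetical name, and the real reminder.py also sends the email alert.

```python
import shutil
from pathlib import Path

BATCH_THRESHOLD = 7  # matches the 7-video threshold described in this README


def promote_batch(staging: Path, ready: Path, threshold: int = BATCH_THRESHOLD) -> list[Path]:
    """Move all staged .mp4 files to the ready folder once enough have accumulated."""
    videos = sorted(staging.glob("*.mp4"))
    if len(videos) < threshold:
        return []  # not enough videos yet; leave the staging folder untouched
    ready.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.move(str(video), str(ready / video.name))) for video in videos]


if __name__ == "__main__":
    moved = promote_batch(Path("reels"), Path("ready_to_upload"))
    print(f"Moved {len(moved)} videos")
```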

Get Video Paths for Upload

python get_videopaths.py

Returns absolute paths of all videos in ready_to_upload/ for batch upload scripts.


📊 Workflow Example

1. [03:00 AM] Task Scheduler triggers run_factory.bat
2. [03:00:05] Phase 1 scrapes random subreddit from 30+ sources
3. [03:00:12] LLM Router tries Groq → generates viral hook
4. [03:00:15] Gender detected: Female → Voice: en-US-AriaNeural (random from 3 variants)
5. [03:00:18] Hook ranking: AI vs Original → Winner selected
6. [03:00:22] SEO tags generated: ["reddit", "storytime", "drama", ...]
7. [03:00:45] Phase 2 generates audio + word-level timestamps
8. [03:01:30] Phase 3 renders vertical video with animated subtitles
9. [03:02:00] Cleanup removes temporary audio/JSON files
10. [03:02:05] Loop repeats 2 more times (3 videos total per run)
11. [03:06:15] Reminder script checks inventory (9/7 videos)
12. [03:06:20] Email sent: "🟢 FACTORY ALERT: Weekly Batch Ready"
13. [03:06:25] 9 videos moved to ready_to_upload/ folder
14. [Manual] Run upload automation scripts or manual upload

🎨 Customization

Add More Subreddits

Edit phase1.py lines 30-65:

SUBREDDITS = [
 "AmItheAsshole",
 "AITAH",
 "YourNewSubreddit", # Add here
]

Current subreddits (30+): AmItheAsshole, AITAH, TrueOffMyChest, confessions, confession, tifu, pettyrevenge, entitledparents, MaliciousCompliance, EntitledPeople, relationships, relationship_advice, Vent, stories, moraldilemmas, self, PointlessStories, TwoHotTakes, dating, offmychest, UnsentLetters, SeriousConversation, Adulting, lonely, BreakUps, TalesFromTheFrontDesk, legaladvice, RBI, UnresolvedMysteries, Glitch_in_the_Matrix, raisedbynarcissists, dadjokes, Jokes

Change Voice Models

Edit phase2.py lines 6-10:

WOMAN_VOICE_LIST = [
 "en-US-JennyNeural",
 "en-US-MichelleNeural",
 "en-US-AriaNeural",
 "en-GB-SoniaNeural", # Add British accent
]

Male voice is set on line 19: "en-US-ChristopherNeural"

Adjust Video Length

Edit phase1.py line 175:

if 120 < len(words) < 380: # Change word count range (current: ~60-180 seconds)
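
For context, the word-count check typically sits inside a broader viability filter. The sketch below combines it with the removed-post and duplicate checks described under Phase 1. It assumes posts are dicts with the `id` and `selftext` fields from Reddit's JSON listings; `is_viable` and `used_ids` are hypothetical names, and the language check via langdetect is left out to keep the example dependency-free.

```python
MIN_WORDS, MAX_WORDS = 120, 380  # exclusive bounds, mirroring the check above
REMOVED_MARKERS = {"", "[removed]", "[deleted]"}


def is_viable(post: dict, used_ids: set[str]) -> bool:
    """Return True if a post is usable: present, not seen before, and the right length."""
    body = post.get("selftext", "").strip()
    if body in REMOVED_MARKERS:
        return False  # deleted or moderator-removed post
    if post.get("id") in used_ids:
        return False  # already turned into a video
    return MIN_WORDS < len(body.split()) < MAX_WORDS


sample = {"id": "a1", "selftext": "word " * 200}
print(is_viable(sample, used_ids=set()))
```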

Modify Subtitle Style

Edit phase3.py lines 33-44:

txt_clip = TextClip(
 chunk_text,
 font="Impact", # Change font
 fontsize=85, # Adjust size
 color="white", # Change color
 stroke_color="black", # Outline color
 stroke_width=5, # Outline thickness
 method="caption",
 size=(video_width * 0.9, None)
)

Adjust Subtitle Chunk Size

Edit phase3.py line 20:

chunk_size = 3 # Words per subtitle (current: 3 words)
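
To show how the chunk size interacts with word timings, here is a minimal sketch that groups timed words into subtitle chunks. The tuple layout `(word, start, end)` and the function name `chunk_words` are assumptions for illustration.

```python
def chunk_words(timings, chunk_size=3):
    """Group (word, start, end) tuples into (text, start, end) subtitle chunks."""
    chunks = []
    for i in range(0, len(timings), chunk_size):
        group = timings[i:i + chunk_size]
        text = " ".join(word for word, _, _ in group)
        # A chunk appears with its first word and disappears after its last word.
        chunks.append((text, group[0][1], group[-1][2]))
    return chunks


words = [
    ("I", 0.0, 0.2),
    ("never", 0.2, 0.6),
    ("expected", 0.6, 1.1),
    ("this", 1.1, 1.4),
]
for chunk in chunk_words(words):
    print(chunk)
```

In MoviePy 1.0.3, each chunk could then become a `TextClip` positioned in time with `.set_start(start).set_duration(end - start)`.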

Adjust Audio Sync Timing

If subtitles appear too early or late, edit phase2.py line 15:

SYNC_OFFSET = -0.3 # Negative = earlier, Positive = later

Configure LLM Provider Priority

Edit llm_router.py lines 125-127:

CHEAP_PROVIDERS = [openrouter_chat, hf_chat, gemini_chat]
STRONG_PROVIDERS = [groq_chat, cerebras_chat]

Enable Test Mode (10-second preview)

Edit main_pipeline.py line 18:

TEST_MODE = True # Renders only first 10 seconds

🐛 Troubleshooting

Issue: "ImageMagick not found"

Solution: Update the path in phase3.py line 5 to match your installation

Issue: "No viable stories found"

Solution: The subreddit may have no posts matching criteria. The system will automatically try the next subreddit in the randomized list

Issue: "FFmpeg not found"

Solution: Ensure FFmpeg is in your system PATH. Run ffmpeg -version to verify. The yt_downloader.py script includes dependency checks
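
A dependency check along these lines can be written with the standard library alone. This is a sketch, not the check in yt_downloader.py; the tool list and function name are assumptions, and on systems with ImageMagick 6 the binary is usually `convert` rather than `magick`.

```python
import shutil

REQUIRED_TOOLS = ("ffmpeg", "magick")  # assumption: ImageMagick 7 exposes "magick"
JS_RUNTIMES = ("deno", "node")         # either one is enough for yt-dlp


def missing_dependencies(which=shutil.which):
    """Return the names of required command-line tools not found on PATH."""
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    if not any(which(runtime) for runtime in JS_RUNTIMES):
        missing.append("deno or node")
    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    print("All dependencies found" if not problems else f"Missing: {', '.join(problems)}")
```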

Issue: "Email sending failed"

Solution:

  1. Enable 2FA on Gmail
  2. Generate an App Password
  3. Use the App Password in .env, not your regular password
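
For reference, a Gmail alert using an App Password can be sent with the standard library as sketched below. The function names and message body are assumptions, and the subject line is a simplified version of the one shown in the workflow example; `EMAIL_USER` and `EMAIL_APP_PASS` are the variables from the .env template.

```python
import os
import smtplib
from email.message import EmailMessage


def build_alert(sender: str, video_count: int, threshold: int = 7) -> EmailMessage:
    """Compose the batch-ready notification (sent to the sender's own inbox)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = sender
    msg["Subject"] = "FACTORY ALERT: Weekly Batch Ready"
    msg.set_content(f"{video_count} videos are ready for upload (threshold: {threshold}).")
    return msg


def send_alert(msg: EmailMessage) -> None:
    """Send via Gmail over implicit TLS; needs EMAIL_USER and EMAIL_APP_PASS to be set."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(os.environ["EMAIL_USER"], os.environ["EMAIL_APP_PASS"])
        server.send_message(msg)


if __name__ == "__main__":
    send_alert(build_alert(os.environ["EMAIL_USER"], video_count=9))
```

If login fails with an authentication error, the most likely cause is using the regular account password instead of the App Password.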

Issue: "All LLM providers failed"

Solution:

  1. Check that at least one API key is valid in .env
  2. Verify API quotas haven't been exceeded
  3. Check internet connection
  4. The router automatically tries all 5 providers before failing

Issue: "Word boundaries missing"

Solution: The system automatically falls back to sentence-level timing. This is expected behavior for some voices
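
One plausible shape for such a fallback is to spread a known audio duration across the words in proportion to their length. The sketch below is a stand-in heuristic, not the one in phase2.py; the function name and the length-plus-one weighting are assumptions.

```python
def estimate_word_timings(text: str, total_duration: float):
    """Approximate (word, start, end) tuples when no word boundaries are available."""
    words = text.split()
    if not words or total_duration <= 0:
        return []
    # Weight each word by its length plus one (for the pause after it).
    weights = [len(word) + 1 for word in words]
    total_weight = sum(weights)
    timings, cursor = [], 0.0
    for word, weight in zip(words, weights):
        span = total_duration * weight / total_weight
        timings.append((word, cursor, cursor + span))
        cursor += span
    return timings


for item in estimate_word_timings("I never expected this to happen", 2.4):
    print(item)
```

Estimates like this drift on long passages, so applying the heuristic sentence by sentence keeps the error bounded.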

Issue: "yt-dlp download fails"

Solution: Install Deno or Node.js for YouTube signature extraction. The script checks dependencies automatically


📈 Performance Metrics

  • Average Runtime: 2-3 minutes per video (single-threaded)
  • Batch Runtime: ~6-9 minutes for 3 videos (run_factory.bat default)
  • Video Quality: 1080x1920 @ 30fps (9:16 vertical)
  • Audio Quality: Edge TTS neural voices (streaming)
  • Storage: ~15-25MB per final video
  • Success Rate: 95%+ (with multi-subreddit + LLM failover)
  • LLM Failover: <2 seconds between provider switches
  • Subtitle Sync: ±0.3s accuracy with configurable offset
  • Content Sources: 30+ subreddits with randomized selection

🔒 Security & Privacy

  • ✅ No user data collection
  • ✅ API keys stored in .env (gitignored)
  • ✅ Reddit scraping complies with API terms
  • ✅ All source content comes from publicly accessible Reddit posts
  • ✅ No personal information in generated videos
  • ✅ Multi-provider LLM routing prevents vendor lock-in

🚧 Roadmap

  • Multi-provider LLM router with automatic failover (5 providers)
  • Batch video management system (7-video threshold)
  • Word-level subtitle synchronization with timing offset
  • Hook A/B testing (AI vs Original title ranking)
  • Dynamic SEO tag generation
  • Gender-based voice selection
  • Automated cleanup system
  • Email notification system
  • YouTube upload automation (OAuth setup required)
  • Instagram Reels upload automation
  • TikTok upload automation (no official API - Selenium needed)
  • Thumbnail generation with text overlay
  • Analytics dashboard (views, engagement tracking)
  • GPU-accelerated rendering (NVENC support)
  • Cloud deployment (AWS Lambda + S3)
  • Web UI for manual overrides
  • Multi-language support (Spanish, French, etc.)

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • Reddit API - Content source
  • Microsoft Edge TTS - Neural voice synthesis
  • Groq, Cerebras, Gemini, HuggingFace, OpenRouter - LLM infrastructure
  • MoviePy - Video processing framework
  • yt-dlp - Video download utility

📞 Contact


⭐ If this project helped you, please consider giving it a star!

Made with ❤️ and Python
