EudaLabs/deepcast

deepcast

License: MIT

deepcast is an AI-powered CLI tool that generates engaging podcast-style conversations with realistic text-to-speech capabilities. Perfect for creating educational content, practice conversations, or exploring topics in a dialogue format.

✨ Features

  • 🤖 AI-Powered Conversations: Uses the DeepSeek-V3 model to generate natural, educational dialogues
  • 🎧 Interactive Format: Generates engaging podcast-style conversations between two speakers
  • 📚 Educational Content: Creates deep, insightful discussions on any given topic
  • 🗣️ Text-to-Speech: Integrates PlayHT for converting conversations into realistic audio
  • 🚀 Background Music: Add ambient music with adjustable volume
  • 😊 Voice Emotions: Control speaker emotions (happy, serious, excited, etc.)
  • 📄 Rich File Support: Generate from TXT, PDF, DOCX, EPUB, Markdown, and HTML files
  • 🌐 Web Content: Generate from web articles, YouTube transcripts, and URLs
  • 🔄 Content Combination: Combine multiple sources into one podcast
  • 🌍 Multiple Languages: Support for English, Spanish, French, German, Italian, and Portuguese
  • 🎭 Podcast Styles: Different conversation styles (interview, debate, storytelling, etc.)
  • 📊 Complexity Levels: Adjust content for beginner, intermediate, or expert audiences
  • 🚀 Easy to Use: Simple CLI with rich terminal output

πŸ› οΈ Installation

  1. Clone the repository:
git clone https://github.com/byigitt/deepcast.git
cd deepcast
  1. Install dependencies using uv:
uv venv
uv pip install -e .
  1. Create a .env file from the example:
cp .env.example .env
  1. Add your API keys to the .env file:
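
As a sketch of what the finished file might look like, assuming only the variables listed under Configuration and using placeholder values (the repository's .env.example may define other keys):

```shell
# .env (placeholder values; replace them with your own keys)
OPENROUTER_API_KEY=your-openrouter-key
FAL_KEY=your-fal-key
# Optional; defaults to INFO
LOG_LEVEL=INFO
```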

🚀 Usage

View Available Options

List available podcast styles:

deepcast styles

List available background music:

deepcast music

List available voice emotions:

deepcast emotions

Generate from a Topic

Create a podcast about any topic with custom settings:

# Basic usage
deepcast generate "Quantum Computing"
# With custom style
deepcast generate "Quantum Computing" --style debate
# With background music
deepcast generate "Quantum Computing" --music ambient --volume 0.2
# With voice emotions
deepcast generate "Quantum Computing" \
 --speaker1-emotion professional \
 --speaker2-emotion friendly
# Full customization
deepcast generate "Quantum Computing" \
 --style educational \
 --complexity expert \
 --language french \
 --exchanges 7 \
 --music soft_piano \
 --volume 0.15 \
 --speaker1-emotion serious \
 --speaker2-emotion excited \
 --save-audio \
 --format mp3

Generate from Files

Create a podcast from various file types:

# From a single file with music
deepcast generate "Research Paper" \
 --file paper.pdf \
 --music ambient
# From multiple files with emotions
deepcast generate "Research Summary" \
 --file paper1.pdf \
 --file paper2.pdf \
 --speaker1-emotion professional \
 --speaker2-emotion friendly
# From different file types with full audio
deepcast generate "Documentation" \
 --file intro.md \
 --file chapter1.docx \
 --file appendix.pdf \
 --music soft_piano \
 --volume 0.2 \
 --save-audio
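
Since --file can be repeated, a long list of inputs can be assembled in a loop rather than typed by hand. The sketch below is one way to do that in POSIX shell. `generate_from_files` and the `DEEPCAST_BIN` variable are hypothetical helpers for illustration, not part of deepcast.

```shell
# Hypothetical wrapper (not part of deepcast):
#   generate_from_files TITLE [PATH...]
# Runs `deepcast generate TITLE --file PATH1 --file PATH2 ...`.
generate_from_files() {
  title=$1
  shift
  count=$#
  # Append a "--file <path>" pair for every path given.
  # The for-list is expanded once, so growing "$@" inside the loop is safe.
  for path in "$@"; do
    set -- "$@" --file "$path"
  done
  # Drop the original bare paths, leaving only the --file pairs.
  shift "$count"
  # DEEPCAST_BIN defaults to the real CLI; set it to `echo` for a dry run.
  ${DEEPCAST_BIN:-deepcast} generate "$title" "$@"
}
```

For example, `generate_from_files "Documentation" intro.md chapter1.docx appendix.pdf` would run the same command as the third example above, minus the music and audio options.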

Generate from Web Content

Create a podcast from web content:

# From a web article with music
deepcast generate "News Article" \
 --url "https://example.com/article" \
 --music cinematic
# From a YouTube video with emotions
deepcast generate "Video Summary" \
 --youtube "https://youtube.com/watch?v=..." \
 --speaker1-emotion excited \
 --speaker2-emotion professional
# Combine web and file content with full audio
deepcast generate "Research Review" \
 --file paper.pdf \
 --url "https://example.com/article" \
 --youtube "https://youtube.com/watch?v=..." \
 --music jazz \
 --volume 0.1 \
 --save-audio

Output Options

Save the transcript to a file:

deepcast generate "Artificial Intelligence" --output transcript.txt
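
When generating transcripts for several topics, each run needs its own output file name. A small sketch, assuming a hypothetical `slugify` helper (not part of deepcast) that derives a file name from the topic:

```shell
# Hypothetical helper (not part of deepcast): turn a topic into a file-name-safe slug.
slugify() {
  printf '%s' "$1" |
    tr '[:upper:]' '[:lower:]' |   # lowercase
    tr -cs 'a-z0-9' '-' |          # collapse every run of other characters into one '-'
    sed 's/^-//; s/-$//'           # trim a leading or trailing '-'
}

# Example loop (shown as a comment so the block has no side effects):
#   for topic in "Artificial Intelligence" "Space Exploration"; do
#     deepcast generate "$topic" --output "$(slugify "$topic").txt"
#   done
```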

Only get the audio URL:

deepcast generate "Space Exploration" --audio-only

Save audio locally:

deepcast generate "Nature Documentary" \
 --music nature \
 --save-audio \
 --format mp3

Full Example

Combine all features:

deepcast generate "Advanced Physics" \
 --file research.pdf \
 --file notes.md \
 --url "https://example.com/article" \
 --youtube "https://youtube.com/watch?v=..." \
 --style educational \
 --complexity expert \
 --language french \
 --exchanges 7 \
 --music cinematic \
 --volume 0.15 \
 --speaker1-emotion professional \
 --speaker2-emotion excited \
 --save-audio \
 --format mp3 \
 --output transcript.txt

πŸ—οΈ Project Structure

src/
β”œβ”€β”€ models/ # Data models (Podcast, Config, Audio)
β”œβ”€β”€ services/ # Core services (LLM, Audio, File, Content)
β”œβ”€β”€ utils/ # Utility functions (Config)
└── cli.py # CLI interface

🔧 Configuration

The following environment variables can be configured in .env:

  • OPENROUTER_API_KEY: Your OpenRouter API key for accessing the DeepSeek model
  • FAL_KEY: Your FAL.ai API key for text-to-speech conversion
  • LOG_LEVEL: Optional logging level (default: INFO)
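
Before a long generation run, it can help to confirm that the keys are actually set in your shell. The following is a minimal sketch, assuming both API keys are required and LOG_LEVEL is optional. `check_deepcast_env` is a hypothetical helper, not something deepcast provides.

```shell
# Hypothetical helper (not part of deepcast): report any required key that is unset or empty.
check_deepcast_env() {
  missing=0
  for name in OPENROUTER_API_KEY FAL_KEY; do
    # Indirect lookup of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when every required key is set, 1 otherwise
  return "$missing"
}
```

One possible use, after exporting the variables from .env with `set -a; . ./.env; set +a`, is `check_deepcast_env && deepcast generate "Quantum Computing"`.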

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • OpenRouter for providing access to the DeepSeek model
  • FAL.ai for the text-to-speech capabilities
  • PlayHT for voice synthesis
  • Pixabay for background music
  • All our contributors and users
