This project is a local Retrieval Augmented Generation (RAG) chatbot application built with Streamlit, LangChain, and Ollama. It allows you to upload multiple PDF documents and chat with an AI model that can answer questions based on the content of those documents. The application features both voice input (speech-to-text) and output (text-to-speech) capabilities, all running completely offline.
- Complete Offline Operation: Runs entirely on your local machine using Ollama and Vosk
- Multi-PDF Support: Upload and query information from multiple PDF files simultaneously
- Contextual Answers: AI responds based only on the content of uploaded documents
- Offline Voice Input: Speak your questions using Vosk for speech recognition (no internet required)
- Offline Voice Output: AI responses can be spoken aloud using pyttsx3
- Chat History Persistence: Saves and loads your chat history locally
- Flexible Configuration: Customize models, prompts, and behavior through the code
- Easy Management: Clear buttons for uploaded PDFs and chat history
Before you begin, ensure you have the following:
- Python 3.8+: python.org/downloads
- Ollama: ollama.com/download (must be running)
- Required Models: Run these commands after installing Ollama:
ollama pull mistral:7b ollama pull nomic-embed-text:latest
- Vosk Model: Download the small English model from alphacephei.com/vosk/models and place it in a vosk-models directory
- Microphone and Speakers: Required for voice features
- Clone or download the repository
- Create and activate a virtual environment:
python -m venv .venv python3.10 -m venv .venv source .venv/bin/activate # Linux/Mac .venv\Scripts\activate # Windows
- install all the dependency pip install -r requirements.txt
Ensure Ollama is running in the background
Start the application:
streamlit run main.py
The app will open in your default browser at http://localhost:8501- Upload PDFs via the sidebar
- Wait for processing to complete (you'll see a confirmation message)
- Ask questions about the document content:
- Type in the chat box, OR
- Click the microphone button to speak your question
- Enable/disable voice input and output via the sidebar checkboxes
- For voice input, click the microphone button and speak clearly
- Voice output will automatically read the AI's responses when enabled
- Clear Uploaded PDFs: Removes all documents from the knowledge base
- Clear Chat History: Starts a new conversation
- Save Current Chat: Manually persists the conversation to disk
Modify these constants in app.py to customize behavior:
python DEFAULT_LLM_MODEL = "" # Change the language model DEFAULT_EMBEDDING_MODEL = "nomic-embed-text:latest" # Change embedding model DEFAULT_SYSTEM_PROMPT = "..." # Customize AI personality VOSK_MODEL_PATH = "vosk-models/..." # Path to Vosk model
π Troubleshooting
Ollama Connection Problems: Ensure Ollama is running (ollama serve) Verify models are downloaded (ollama list)
Check microphone permissions Ensure Vosk model is in the correct location Reduce background noise
Try different PDF files Check for password protection or corruption
On Linux, install libespeak1:
sudo apt-get install libespeak1 Check your audio output settings
For additional help, check the terminal output when running the app as it often contains detailed error messages.