Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Pratham1603/Text_2_SQL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

History

3 Commits

Repository files navigation


Text_2_SQL: RAG-based Natural Language to SQL

A robust Retrieval-Augmented Generation (RAG) system designed to bridge the gap between natural language and relational databases. This project allows users to query SQL databases using plain English, leveraging LangChain for orchestration, Qdrant for schema context retrieval, and Ollama for private, local LLM inference.

πŸš€ Features

  • Local-First: Runs entirely on your machine using Ollamaβ€”no API keys required for inference.
  • RAG-Enhanced: Uses a vector database (Qdrant) to store database metadata, ensuring the LLM understands your specific schema before generating queries.
  • Natural Language Interface: Translate complex business questions into accurate SQL joins and aggregations.
  • Secure: Since it runs locally, your sensitive database schema and data never leave your infrastructure.

πŸ› οΈ Tech Stack

  • Framework: LangChain
  • Vector Database: Qdrant
  • LLM Engine: Ollama (e.g., Llama 3, Mistral, or SQLCoder)
  • Database Support: SQLite, PostgreSQL, or MySQL (via SQLAlchemy)
  • Language: Python 3.10+

πŸ“‹ Prerequisites

  1. Ollama: Download and install Ollama.
  2. Model: Pull a coding or general-purpose model:
ollama pull llama3
  1. Qdrant: Run Qdrant locally via Docker:
docker run -p 6333:6333 qdrant/qdrant

βš™οΈ Installation & Setup

  1. Clone the Repository:
git clone https://github.com/Pratham1603/Text_2_SQL.git
cd Text_2_SQL
  1. Create a Virtual Environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
  1. Install Dependencies:
pip install -r requirements.txt
  1. Configuration: Create a .env file in the root directory and add your database credentials:
DB_URL=sqlite:///./example.db
QDRANT_HOST=localhost
QDRANT_PORT=6333
OLLAMA_MODEL=llama3

πŸ“– How It Works

  1. Schema Indexing: The system parses your SQL schema (tables, columns, types) and generates descriptive embeddings. These are stored in Qdrant.
  2. Contextual Retrieval: When a user asks a question, the system retrieves the most relevant table schemas from the vector store.
  3. Prompt Engineering: LangChain constructs a prompt containing the user's question and the retrieved schema context.
  4. SQL Generation: The Ollama LLM generates a syntactically correct SQL query.
  5. Execution: The system executes the query against the database and returns the result in human-readable format.

πŸ–₯️ Usage

Run the main application:

python main.py

Example Query:

"Show me the top 5 customers who spent the most money in 2023."


🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

πŸ“„ License

Distributed under the MIT License. See LICENSE for more information.


About

RAG-based Text-to-SQL system using LangChain, Qdrant, and local LLMs (Ollama) to query relational databases in natural language.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

Contributors

Languages

AltStyle γ«γ‚ˆγ£γ¦ε€‰ζ›γ•γ‚ŒγŸγƒšγƒΌγ‚Έ (->γ‚ͺγƒͺγ‚ΈγƒŠγƒ«) /