Name	Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitignore	.gitignore
LICENSE	LICENSE
README.md	README.md
app.py	app.py
ingestion.py	ingestion.py
my_database.db	my_database.db

Text_2_SQL: RAG-based Natural Language to SQL

A robust Retrieval-Augmented Generation (RAG) system designed to bridge the gap between natural language and relational databases. This project allows users to query SQL databases using plain English, leveraging LangChain for orchestration, Qdrant for schema context retrieval, and Ollama for private, local LLM inference.

🚀 Features

Local-First: Runs entirely on your machine using Ollama—no API keys required for inference.
RAG-Enhanced: Uses a vector database (Qdrant) to store database metadata, ensuring the LLM understands your specific schema before generating queries.
Natural Language Interface: Translate complex business questions into accurate SQL joins and aggregations.
Secure: Since it runs locally, your sensitive database schema and data never leave your infrastructure.

🛠️ Tech Stack

Framework: LangChain
Vector Database: Qdrant
LLM Engine: Ollama (e.g., Llama 3, Mistral, or SQLCoder)
Database Support: SQLite, PostgreSQL, or MySQL (via SQLAlchemy)
Language: Python 3.10+

📋 Prerequisites

Ollama: Download and install Ollama.
Model: Pull a coding or general-purpose model:

ollama pull llama3

Qdrant: Run Qdrant locally via Docker:

docker run -p 6333:6333 qdrant/qdrant

⚙️ Installation & Setup

Clone the Repository:

git clone https://github.com/Pratham1603/Text_2_SQL.git
cd Text_2_SQL

Create a Virtual Environment:

python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

Install Dependencies:

pip install -r requirements.txt

Configuration: Create a .env file in the root directory and add your database credentials:

DB_URL=sqlite:///./example.db
QDRANT_HOST=localhost
QDRANT_PORT=6333
OLLAMA_MODEL=llama3

📖 How It Works

Schema Indexing: The system parses your SQL schema (tables, columns, types) and generates descriptive embeddings. These are stored in Qdrant.
Contextual Retrieval: When a user asks a question, the system retrieves the most relevant table schemas from the vector store.
Prompt Engineering: LangChain constructs a prompt containing the user's question and the retrieved schema context.
SQL Generation: The Ollama LLM generates a syntactically correct SQL query.
Execution: The system executes the query against the database and returns the result in human-readable format.

🖥️ Usage

Run the main application:

python main.py

Example Query:

"Show me the top 5 customers who spent the most money in 2023."

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

📄 License

Distributed under the MIT License. See LICENSE for more information.

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pratham1603/Text_2_SQL

Folders and files

Latest commit

History

Repository files navigation

Text_2_SQL: RAG-based Natural Language to SQL

🚀 Features

🛠️ Tech Stack

📋 Prerequisites

⚙️ Installation & Setup

📖 How It Works

🖥️ Usage

🤝 Contributing

📄 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Text_2_SQL: RAG-based Natural Language to SQL

🚀 Features

🛠️ Tech Stack

📋 Prerequisites

⚙️ Installation & Setup

📖 How It Works

🖥️ Usage

🤝 Contributing

📄 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages