Sdinzsh/RAG

🌲 Vectorless RAG: Document Intelligence

No embeddings. No vector DB. Pure LLM reasoning.


🧠 How It Works

Traditional Vector RAG
 PDF → chunks → embeddings → cosine similarity → hope it's relevant → answer
Vectorless RAG (this project)
 PDF → document TREE → LLM reasons over tree → picks exact sections → answer
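
To make the "document TREE" idea concrete, here is a minimal sketch of what such a tree and its compact, table-of-contents style rendering could look like. The `Section` class, its field names, and the outline format are assumptions made for illustration; the actual structures in rag_engine.py may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node of a hypothetical document tree (names are illustrative)."""
    section_id: str
    title: str
    page: int
    text: str = ""
    children: list["Section"] = field(default_factory=list)


def render_outline(nodes: list[Section], depth: int = 0) -> list[str]:
    """Flatten the tree into indented '[id] title (p. N)' lines."""
    lines = []
    for node in nodes:
        indent = "  " * depth
        lines.append(f"{indent}[{node.section_id}] {node.title} (p. {node.page})")
        lines.extend(render_outline(node.children, depth + 1))
    return lines


# A tiny hand-built tree standing in for a parsed PDF.
tree = [
    Section("s1", "Introduction", 1),
    Section("s2", "Methods", 3, children=[
        Section("s2.1", "Data Collection", 3),
        Section("s2.2", "Analysis", 5),
    ]),
]
print("\n".join(render_outline(tree)))
```

Only titles, IDs, and page numbers go into the outline, so the text the LLM has to reason over stays small even when the section bodies are long.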

The 3-step pipeline

Step 1: PARSE
 PyMuPDF extracts text page by page.
 Font-size analysis detects headings → builds a hierarchical Document Tree.
Step 2: TREE SEARCH (the key insight)
 LLM receives the compact tree (like a Table of Contents with section IDs).
 LLM reasons: "Which sections most likely contain the answer?"
 Returns a JSON array of section_ids.
 → No similarity math. No embeddings. Just reasoning.
Step 3: ANSWER
 Full text of the chosen sections is retrieved (with page numbers).
 LLM synthesises a cited, markdown-formatted answer.
 Response is streamed token-by-token to the UI.
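
The tree search in Step 2 comes down to two small pieces: a prompt that shows the outline to the model, and a parser that turns the model's reply into a validated list of section IDs. The sketch below is one possible shape for both, not the project's actual code. The function names, the prompt wording, and the choice to silently drop unknown IDs are all assumptions.

```python
import json
import re


def build_search_prompt(outline: str, question: str, top_k: int = 4) -> str:
    """Ask the model to pick section IDs from the outline (wording is illustrative)."""
    return (
        "You are given a document outline with section IDs.\n"
        f"{outline}\n\n"
        f"Question: {question}\n"
        f"Reply with a JSON array of at most {top_k} section IDs and nothing else."
    )


def parse_section_ids(reply: str, valid_ids: set[str], top_k: int = 4) -> list[str]:
    """Extract the first flat JSON array from the reply and keep only known IDs."""
    match = re.search(r"\[.*?\]", reply, re.DOTALL)
    if not match:
        return []
    try:
        candidates = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    picked = []
    for item in candidates:
        # Ignore non-strings, IDs that do not exist in the tree, and duplicates.
        if isinstance(item, str) and item in valid_ids and item not in picked:
            picked.append(item)
    return picked[:top_k]


known = {"s1", "s2", "s2.1", "s2.2"}
reply = 'Sure! The relevant sections are ["s2.1", "s9", "s2.2"].'
print(parse_section_ids(reply, known))
```

Validating against the known IDs matters because local models sometimes wrap the array in extra prose or invent IDs that are not in the outline.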

🚀 Quick Start

Prerequisites

  • Python 3.10+
  • Ollama installed and running
  • At least one model pulled:
ollama pull llama3.2 # recommended (fast, good reasoning)
# or
ollama pull mistral
ollama pull phi3
ollama pull gemma2

1. Clone / download

git clone https://github.com/Sdinzsh/RAG.git
cd RAG

2. Install dependencies

pip install -r requirements.txt

3. Start Ollama

ollama serve # if not already running as a service

4. Run the app

python app.py

Open http://localhost:5000 in your browser.


🖥️ Web UI Features

| Feature | Detail |
| --- | --- |
| PDF Upload | Drag-and-drop or click to browse (up to 50 MB) |
| Document Tree | Sidebar shows all sections/pages with page numbers |
| Section Highlight | Sections used for each answer are highlighted in the tree |
| Streaming Chat | Answers stream token-by-token like ChatGPT |
| Source Badges | Each answer shows which sections/pages were used |
| Multi-turn Chat | Conversation history maintained per session |
| Model Selector | Switch between any Ollama model at the top |
| Clear History | Reset conversation context without re-uploading |

📁 Project Structure

RAG/
├── app.py                          ← Flask backend + REST API
├── rag_engine.py                   ← Core RAG logic
│   ├── build_document_tree()       ← PDF → Section tree
│   ├── find_relevant_sections()    ← LLM tree search
│   └── answer_query_stream()       ← Full RAG pipeline
├── templates/
│   └── index.html                  ← Web UI (dark industrial theme)
├── uploads/                        ← Uploaded PDFs (auto-created)
└── requirements.txt
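
As a rough illustration of the kind of work build_document_tree() is described as doing, the sketch below turns font-size information into a section hierarchy. It is not the project's implementation. It assumes the PDF has already been reduced to `(page, font_size, text)` tuples in reading order (PyMuPDF can supply per-span sizes through `page.get_text("dict")`), and the `Node` class, the `build_tree` name, and the "most common size is body text" heuristic are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Node:
    section_id: str
    title: str
    page: int
    level: int
    text: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)


def build_tree(spans: list[tuple[int, float, str]]) -> list[Node]:
    """Build a heading tree from (page, font_size, text) spans in reading order."""
    if not spans:
        return []
    # Heuristic: the size covering the most characters is the body text size.
    weight = Counter()
    for _, size, text in spans:
        weight[round(size, 1)] += len(text)
    body_size = weight.most_common(1)[0][0]
    # Anything larger counts as a heading; the largest size becomes level 1.
    heading_sizes = sorted((s for s in weight if s > body_size), reverse=True)
    level_of = {s: i + 1 for i, s in enumerate(heading_sizes)}

    roots, stack, next_id = [], [], 1
    for page, size, text in spans:
        level = level_of.get(round(size, 1))
        if level is None:
            # Body text attaches to the most recent heading (dropped if none yet).
            if stack:
                stack[-1].text.append(text)
            continue
        node = Node(f"s{next_id}", text, page, level)
        next_id += 1
        # Pop back up until the top of the stack is a shallower heading.
        while stack and stack[-1].level >= level:
            stack.pop()
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)
    return roots


spans = [
    (1, 18.0, "Introduction"),
    (1, 11.0, "This report covers the 2023 field survey."),
    (2, 18.0, "Methods"),
    (2, 14.0, "Sampling"),
    (2, 11.0, "We sampled 40 sites."),
    (3, 14.0, "Analysis"),
    (3, 11.0, "Results were averaged across all sites."),
]
for root in build_tree(spans):
    print(root.section_id, root.title, [c.title for c in root.children])
```

A heuristic like this depends on headings being set in a visibly larger font, which is why well-structured reports and textbooks tend to parse more cleanly than free-form documents.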

⚙️ Configuration

Edit the top of rag_engine.py:

OLLAMA_BASE = "http://localhost:11434" # Ollama server URL
DEFAULT_MODEL = "llama3.2" # Default model
TOP_K_SECTIONS = 4 # Max sections per query
MAX_SECTION_CHARS = 3000 # Chars per section in context
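
To show how the last two settings interact, here is a small sketch of context assembly under those limits. With the defaults above, at most 4 × 3000 = 12,000 characters of section text would reach the model per query. The `build_context` function and the dictionary keys (`id`, `title`, `page`, `text`) are hypothetical and only illustrate the idea; rag_engine.py may apply the limits differently.

```python
TOP_K_SECTIONS = 4        # value from the configuration above
MAX_SECTION_CHARS = 3000  # value from the configuration above


def build_context(sections: list[dict], chosen_ids: list[str]) -> str:
    """Join the chosen sections into one prompt context, capped by both limits."""
    by_id = {s["id"]: s for s in sections}
    parts = []
    for sid in chosen_ids[:TOP_K_SECTIONS]:
        sec = by_id.get(sid)
        if sec is None:
            continue  # skip IDs that are not in the document
        body = sec["text"][:MAX_SECTION_CHARS]  # truncate long sections
        parts.append(f"[{sec['title']} | page {sec['page']}]\n{body}")
    return "\n\n".join(parts)


sections = [
    {"id": "s1", "title": "Intro", "page": 1, "text": "x" * 5000},
    {"id": "s2", "title": "Methods", "page": 3, "text": "short"},
]
context = build_context(sections, ["s2", "s1", "missing"])
print(len(context))
```

Raising either value gives the model more material to cite, at the cost of a longer prompt and slower responses on small local models.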

🔌 API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | / | Web UI |
| GET | /api/models | List available Ollama models |
| POST | /api/upload | Upload PDF + build document tree |
| POST | /api/chat | Chat query → SSE stream |
| GET | /api/session/<id> | Session info |
| POST | /api/session/<id>/clear | Clear chat history |
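
Since /api/chat responds with a Server-Sent Events (SSE) stream, a client needs to split the response into events. The sketch below parses the `data:` lines of an SSE stream using only the standard library. The request body and the JSON shape of each event are not documented here, so the `{"token": ...}` payload in the example is purely an assumption used to exercise the parser.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event from text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            # The SSE format strips one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (starting with ':') and other fields are ignored here.


# Fake stream standing in for the HTTP response body of POST /api/chat.
fake_stream = [
    'data: {"token": "Hel"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"token": "lo"}\n',
    "\n",
]
tokens = [json.loads(payload)["token"] for payload in iter_sse_data(fake_stream)]
print("".join(tokens))
```

In real use the lines would come from the decoded HTTP response rather than a list, and the payload keys should be checked against what app.py actually emits.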

💡 Tips

  • Best models for reasoning: llama3.2, mistral, gemma2, phi3
  • Large PDFs: The tree search is O(sections), not O(pages), so it works well on big docs
  • Structured PDFs (reports, textbooks): heading detection works best
  • Scanned PDFs: Won't work without OCR pre-processing, because there is no text layer to extract

About

A vectorless RAG that runs fully locally (no internet needed), with a web UI.
