# Insights Assistant — User Guide (RAG + MCP Host/Server + LLM Summarizer)

Date: 2025-09-03

## Summary

This project is a local Insights Assistant. You ingest PDF/CSV files into a vector database (Chroma) via a FastAPI server. A client (CLI or Streamlit) searches those embeddings and then calls an LLM summarizer (Ollama or OpenAI) to produce a concise, cited answer grounded in retrieved snippets.

- **Server (`mcp_server.py`):** exposes `/tools/ingest_pdf`, `/tools/ingest_csv`, `/tools/search_docs`. Uses LangChain loaders → splitter → embeddings → Chroma (persistent in `./db`).
- **Client library (`mcp_host.py`):** tiny HTTP wrapper with retries/backoff and friendly exceptions (see the sketch after this list).
- **Clients:**
  - CLI (`client_app.py`): `ingest-*` and `ask`.
  - Streamlit (`streamlit_app.py`): tabs for Ingest and Ask plus a preview panel.
- **Summarizer (`summarizer.py`):** calls Ollama (local) or OpenAI to turn top-k snippets into a clean, cited answer. (Required in this setup.)
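
The client library's retry behavior can be pictured with a small sketch: retries with exponential backoff, wrapped in a friendly exception. The names here (`call_tool`, `MCPHostError`) are hypothetical, not the project's actual `mcp_host.py` API:

```python
import time
import requests

class MCPHostError(RuntimeError):
    """Raised once all retries are exhausted."""

def call_tool(base_url: str, tool: str, payload: dict, retries: int = 3) -> dict:
    """POST to /tools/<tool>, retrying transient failures with backoff."""
    for attempt in range(retries):
        try:
            r = requests.post(f"{base_url}/tools/{tool}", json=payload, timeout=30)
            r.raise_for_status()
            return r.json()
        except requests.RequestException as exc:
            if attempt == retries - 1:
                raise MCPHostError(f"{tool} failed after {retries} attempts: {exc}") from exc
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
```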

## ASCII Flow

```
+------+          +--------------------+        +-------------------------+
| User |--uses--->|    Streamlit UI    |        |        CLI Client       |
+------+          | (streamlit_app.py) |        |     (client_app.py)     |
                  +----------+---------+        +-----------+-------------+
                             \                              /
                              \  via MCP Host (HTTP client)
                               v                           v
                  +------------+---------------------------+
                  |         MCP Host (mcp_host.py)         |
                  |  Retries • Backoff • Friendly errors   |
                  +-------------------+--------------------+
                                      |
                                      |  POST /tools/*
                                      v
                        +-------------+------------+
                        |      FastAPI Server      |
                        |      (mcp_server.py)     |
                        | ingest_pdf / ingest_csv  |
                        |       search_docs        |
                        +-------------+------------+
                                      |
     +--------------------------------+---------------------------+
     |           LangChain RAG pipeline (server-side)             |
     |    Loaders -> Splitter -> Embeddings -> Chroma (./db)      |
     +-----------+---------------+---------------+----------------+
                 | docs          | chunks        | vectors
                 v               v               v
              [files]        [chunks]     [persisted index]

REQUIRED (client-side):
+---------------------------------------------------------------+
|             summarizer.py → Ollama/OpenAI (REST)              |
|  Produces the ONLY answer shown to users (cited, grounded).   |
+---------------------------------------------------------------+
```
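
For the required summarizer step, here is a minimal sketch of the Ollama path. The `/api/generate` endpoint and payload are standard Ollama REST; the function name and prompt shape are illustrative, not `summarizer.py`'s actual code:

```python
import requests

OLLAMA_URL = "http://127.0.0.1:11434"
OLLAMA_MODEL = "llama3.1:8b"

def summarize(question: str, snippets: list[str]) -> str:
    # Number the retrieved snippets so the model can cite them as [1], [2], ...
    context = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    prompt = (
        "Answer the question using ONLY the snippets below and cite them "
        f"by number.\n\n{context}\n\nQuestion: {question}"
    )
    r = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": OLLAMA_MODEL, "prompt": prompt, "stream": False},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["response"]  # non-streamed responses put the text here
```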

Dependencies & Ports

  • Python: 3.11+ (3.12 works; on 3.13 remove or place any from __future__ lines at the very top)
  • Pip: fastapi, uvicorn[standard], langchain, langchain-community, chromadb, pypdf, sentence-transformers, python-dotenv, requests, streamlit, pandas
  • External: Ollama (ollama serve; ollama pull llama3.1:8b)
  • Ports: Server 8799 • Streamlit 8501 • Ollama 11434
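
The packages above can be installed in one command inside the virtual environment (quoting `uvicorn[standard]` keeps shells from treating the brackets as glob patterns):

```powershell
pip install fastapi "uvicorn[standard]" langchain langchain-community chromadb pypdf sentence-transformers python-dotenv requests streamlit pandas
```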

## Environment Variables (.env)

```ini
DB_DIR=./db
EMBED_MODEL=sentence-transformers/all-MiniLM-L6-v2

SUMMARIZER_PROVIDER=ollama
OLLAMA_URL=http://127.0.0.1:11434
OLLAMA_MODEL=llama3.1:8b

# If using OpenAI instead:
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o-mini
```

Load in Python near the top:

```python
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())
```
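
Once `load_dotenv()` has run, the settings are ordinary environment variables. A small sketch of reading them with the defaults from the `.env` above (`os.getenv` is standard; whether this project applies these exact fallbacks is an assumption):

```python
import os

DB_DIR = os.getenv("DB_DIR", "./db")
EMBED_MODEL = os.getenv("EMBED_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
SUMMARIZER_PROVIDER = os.getenv("SUMMARIZER_PROVIDER", "ollama")
OLLAMA_URL = os.getenv("OLLAMA_URL", "http://127.0.0.1:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.1:8b")
```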
## Run Order

1) Start Ollama

   ```powershell
   ollama serve
   ollama pull llama3.1:8b
   ```

2) Start the FastAPI server (in your `.venv`)

   ```powershell
   .\.venv\Scripts\Activate.ps1
   python -m uvicorn mcp_server:app --host 127.0.0.1 --port 8799 --reload
   ```

3) Start a client

   - Streamlit UI:

     ```powershell
     .\.venv\Scripts\Activate.ps1
     python -m streamlit run streamlit_app.py   # http://localhost:8501
     ```

   - CLI:

     ```powershell
     .\.venv\Scripts\Activate.ps1
     python client_app.py ingest-pdf .\data\doc.pdf
     python client_app.py ask "Paste a phrase from your PDF"
     ```
## Sanity Checks

- `where python; python -V`
- `curl http://127.0.0.1:8799/`
- `curl http://127.0.0.1:11434/api/tags`
- `python client_app.py ingest-pdf .\data\doc.pdf`
- `python client_app.py ask "your phrase"`
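
To hit the search endpoint directly, something like the call below should work from PowerShell (`curl.exe` bypasses the `Invoke-WebRequest` alias on older PowerShell). The payload shape (`query`, `k`) is an assumption; check `mcp_server.py` for the actual request fields:

```powershell
curl.exe -X POST http://127.0.0.1:8799/tools/search_docs -H "Content-Type: application/json" -d "{\"query\": \"your phrase\", \"k\": 5}"
```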
## Common Issues (and fixes)

- **Cannot reach server** → start uvicorn; test `/`; check port 8799; firewall/port conflict (`netstat -ano | findstr :8799`)
- **Wrong Python (global vs `.venv`)** → activate `.venv` or run `.\.venv\Scripts\python -m ...`
- **`from __future__` SyntaxError** → remove it or move it to the top of the file (3.11+ doesn't need it)
- **Few/zero chunks** → short or scanned PDF; pre-OCR the file or use an OCR loader; reduce chunk size to 500/100
- **LLM summary missing (required)** → `ollama serve`, `ollama pull llama3.1:8b`, set the `OLLAMA_*` env vars
- **File not found (400)** → pass a correct absolute/relative path
- **Proxy interferes with localhost** → `NO_PROXY=127.0.0.1,localhost`
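
For the proxy fix, the variable can be set for the current PowerShell session as shown below; persisting it in your profile or system environment also works:

```powershell
$env:NO_PROXY = "127.0.0.1,localhost"
```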