diorwave/Patentpath

PatentPath — Agentic RAG for USPTO Briefings (Starter)

This is a small, runnable demo that:

  • Ingests a sample patent dataset (CSV)
  • Builds a hybrid retrieval index (BM25 + vectors via Chroma + Sentence Transformers)
  • Generates a cited brief with inline [doc_id] references
  • Shows a Streamlit UI to test queries
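The hybrid retrieval and citation steps listed above can be illustrated with a minimal sketch. It is not taken from this repository: it assumes the BM25 and vector rankings are merged with reciprocal rank fusion (one common choice; the demo may combine scores differently), and the helper names `rrf_fuse`, `cite`, and `unknown_citations`, as well as the document IDs, are hypothetical.

```python
import re
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge ranked lists of doc IDs with reciprocal rank fusion (assumed method)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); higher ranks contribute more.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def cite(sentence, doc_id):
    """Append an inline [doc_id] reference to a sentence."""
    return f"{sentence} [{doc_id}]"


def unknown_citations(brief, retrieved_ids):
    """Return cited IDs that are not among the retrieved documents."""
    cited = set(re.findall(r"\[([^\[\]]+)\]", brief))
    return cited - set(retrieved_ids)


# Hypothetical rankings from a keyword (BM25) search and a vector search.
bm25_ranking = ["US-001", "US-003", "US-002"]
vector_ranking = ["US-003", "US-004", "US-001"]

fused = rrf_fuse([bm25_ranking, vector_ranking])
brief = cite("Watermarks can be embedded at decoding time.", fused[0])
print(fused)
print(brief)
print(unknown_citations(brief, fused))
```

Rank-based fusion avoids having to normalize BM25 scores against vector similarities, which live on different scales. The citation check is a simple guard: any ID that appears in the brief but not in the retrieved set is flagged.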

Quick Start

# 1) Create and activate a virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# 2) Install dependencies
python -m pip install --upgrade pip
pip install -r requirements.txt
# 3) Build the index (uses sample CSV in data/raw)
python -m src.ingest
# 4) Run the demo UI
streamlit run app/streamlit_app.py

Try these sample queries

  • LLM watermarking methods
  • drone swarming computer vision
  • synthetic data generation patents
  • transformer optimization energy efficiency

Notes

  • If nltk reports missing data, the code falls back to its own sentence splitter, so no download or internet access is required.
  • If chromadb fails to install, upgrade pip, setuptools, and wheel first, then retry: pip install --upgrade pip setuptools wheel.
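The nltk fallback mentioned in the notes can be pictured roughly as follows. This is a sketch under assumptions, not the repository's code: the function names `regex_split` and `split_sentences` are hypothetical, and the regular expression is a deliberately naive stand-in for whatever splitter the project actually uses.

```python
import re


def regex_split(text):
    """Naive fallback: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_sentences(text):
    """Prefer nltk's tokenizer; fall back if nltk or its data is unavailable."""
    try:
        from nltk.tokenize import sent_tokenize
        # Raises LookupError when the tokenizer data has not been downloaded.
        return sent_tokenize(text)
    except (ImportError, LookupError):
        return regex_split(text)


sample = "A method is claimed. It uses a sensor! Does it work?"
print(split_sentences(sample))
```

The regex version mis-splits abbreviations such as "U.S." or "Fig. 2", which is usually acceptable for a demo but worth knowing when reading patent text.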

Security: This starter uses public, synthetic sample data. Do not ingest client or restricted data.

About

Agentic RAG demo for public patent text (Streamlit + hybrid retrieval + verification)
