document-indexing

Here are 18 public repositories matching this topic...

Context Search Engine is an AI-powered semantic document search platform built for learning, experimentation, and real-world prototyping. It demonstrates the full lifecycle of modern vector-based search — from document ingestion to chunking, embedding, indexing, and contextual query matching.

  • Updated Dec 24, 2025
  • Python

A highly efficient, isomorphic, full-featured, multilingual text search engine library, providing full-text search, fuzzy matching, phonetic scoring, document indexing and more, with micro JSON state hydration/dehydration in-browser and server-side.

  • Updated Jul 21, 2023
  • TypeScript

This repository highlights my learning journey in building Retrieval-Augmented Generation (RAG) pipelines using DeepSeek on Lightning AI, covering document ingestion, retrieval, and integration with generative AI. It showcases fine-tuning, evaluation, and optimization for accurate open-domain QA and knowledge management.

  • Updated Jan 24, 2025
  • Jupyter Notebook

An AI-powered document intelligence platform for educators with Google Drive integration, enabling seamless processing of diverse document formats. It uses Qdrant vectorization and Azure OpenAI gpt-4o-mini to provide a question answering system with optimized search capabilities, transparent citations, and direct source navigation.

  • Updated Apr 17, 2025
  • Python

Atlas - Enterprise document indexing plugin for OpenClaw. Vectorless RAG using PageIndex with async indexing, incremental updates, and smart caching. Scales from 10 to 5000+ documents. Perfect for financial reports, legal docs, technical manuals, and research papers.

  • Updated Feb 9, 2026
  • TypeScript

A real-time personal document intelligence system that uses Java filesystem monitoring and Python RAG orchestration with Google Gemini to automatically index and semantically query local documents.

  • Updated Feb 5, 2026
  • Python
