Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

A learning project for building local knowledge bases from PDFs using LangChain, supporting multiple LLMs (DeepSeek, OpenAI). Features include PDF processing, knowledge graph construction, and natural language Q&A interface.一个基于 LangChain 的学习项目,用于构建 PDF 本地知识库,支持多种大语言模型(DeepSeek、OpenAI)。功能包括 PDF 处理、知识图谱构建和自然语言问答界面。

License

Notifications You must be signed in to change notification settings

tomy145/DataGraphX_Learn

Repository files navigation

DataGraphX (Learning Edition)

English | 中文

⚠️ Note: This is a learning edition. For commercial use, please contact us for customized solutions!

⚠️ 注意: 这是学习版本。商业用途请联系我们定制解决方案!

🌟 DataGraphX

An intelligent document analysis system that combines LangChain, Neo4j graph database, and large language models to create a knowledge graph-based RAG (Retrieval-Augmented Generation) application.

🖼️ Project Demo

Q&A System Interface

Q&A System

Knowledge Graph Visualization

Knowledge Graph

🚀 Features

  • 📊 Automatic Knowledge Graph Construction

    • PDF document processing and analysis
    • Intelligent text segmentation
    • Relationship extraction
    • Interactive graph visualization
  • 🤖 Natural Language Q&A

    • Context-aware responses
    • Knowledge graph-based retrieval
    • Multi-LLM support (DeepSeek, OpenAI)
    • Real-time graph exploration

📦 Project Structure

DataGraphX/
├── app.py # Main application file
├── api_utils.py # API utilities
├── config.py # Configuration settings
├── data_persistence_utils.py # Data persistence helpers
├── knowledge_graph_utils.py # Knowledge graph functions
├── requirements.txt # Project dependencies
├── cache/ # Cache directory
├── logo.png # Project logo
├── kg.jpg # Knowledge graph demo
└── qa.jpg # Q&A interface demo

🔧 Installation

  1. Clone repository:
git clone https://github.com/adoresever/DataGraphX_Learn.git
cd DataGraphX_Learn
  1. Create and activate conda environment:
conda create -n datagraphx python=3.10
conda activate datagraphx
  1. Install dependencies:
pip install -r requirements.txt
  1. Start application:
streamlit run app.py

🛠️ Requirements

  • Python 3.10+
  • Neo4j Database Server
  • DeepSeek/OpenAI API access
  • CUDA-compatible GPU (recommended)

🌟 DataGraphX 学习版

一个智能文档分析系统,结合了 LangChain、Neo4j 图数据库和大型语言模型,创建了一个基于知识图谱的 RAG(检索增强生成)应用。

🖼️ 项目展示

知识图谱可视化

知识图谱

问答系统界面

问答系统

🚀 功能特点

  • 📊 自动知识图谱构建

    • PDF文档处理与分析
    • 智能文本分段
    • 关系抽取
    • 交互式图谱可视化
  • 🤖 自然语言问答

    • 上下文感知响应
    • 基于知识图谱的检索
    • 多LLM支持(DeepSeek、OpenAI)
    • 实时图谱探索

📦 项目结构

DataGraphX/
├── app.py # 主应用程序文件
├── api_utils.py # API工具
├── config.py # 配置设置
├── data_persistence_utils.py # 数据持久化助手
├── knowledge_graph_utils.py # 知识图谱功能
├── requirements.txt # 项目依赖
├── cache/ # 缓存目录
├── logo.png # 项目标志
├── kg.jpg # 知识图谱演示
└── qa.jpg # 问答界面演示

🔧 安装步骤

  1. 克隆仓库:
git clone https://github.com/adoresever/DataGraphX_Learn.git
cd DataGraphX_Learn
  1. 创建并激活conda环境:
conda create -n datagraphx python=3.10
conda activate datagraphx
  1. 安装依赖:
pip install -r requirements.txt
  1. 启动应用:
streamlit run app.py

🛠️ 环境要求

  • Python 3.10+
  • Neo4j 数据库服务器
  • DeepSeek/OpenAI API 访问权限
  • CUDA兼容GPU(推荐)

👥 作者

王宇 (Yu Wang) - wywelljob@gmail.com

📝 致谢

2025新年快乐!

📄 许可证

CC BY-NC-SA 4.0 - 详见 LICENSE 文件


🔒 商业定制

如需商业版本或定制开发,请联系:wywelljob@gmail.com

About

A learning project for building local knowledge bases from PDFs using LangChain, supporting multiple LLMs (DeepSeek, OpenAI). Features include PDF processing, knowledge graph construction, and natural language Q&A interface.一个基于 LangChain 的学习项目,用于构建 PDF 本地知识库,支持多种大语言模型(DeepSeek、OpenAI)。功能包括 PDF 处理、知识图谱构建和自然语言问答界面。

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%

AltStyle によって変換されたページ (->オリジナル) /