hanjiale/Temporal-GraphRAG

Temporal-GraphRAG (TG-RAG)

Official implementation of "RAG Meets Temporal Graphs: Time-Sensitive Modeling and Retrieval for Evolving Knowledge".

Overview

Temporal-GraphRAG (TG-RAG) addresses the temporal blindness in conventional RAG systems by modeling knowledge as a bi-level temporal graph. This enables precise time-aware retrieval and efficient incremental updates as corpora evolve.

Key Advantages:

  • 🕐 Explicit temporal fact representation
  • 📊 Multi-granularity temporal summaries
  • 🔄 Efficient incremental updates
  • 🎯 Dynamic time-aware retrieval
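
To make "explicit temporal facts" and "multi-granularity" more concrete, here is a purely illustrative sketch. It is not the repository's data model: the `TemporalFact` class, the year/quarter fields, the `facts_in_period` helper, and the sample values are all hypothetical and made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemporalFact:
    """A fact tagged with the period it holds for (hypothetical structure)."""
    subject: str
    relation: str
    obj: str
    year: int
    quarter: int  # 1-4


def facts_in_period(facts, year, quarter=None):
    """Return facts for a whole year, or for one quarter if `quarter` is given.

    Leaving `quarter` unset spans all four quarters, which mimics retrieval
    at a coarser time granularity.
    """
    return [
        f for f in facts
        if f.year == year and (quarter is None or f.quarter == quarter)
    ]


# Made-up sample facts, for illustration only.
facts = [
    TemporalFact("Company X", "revenue_trend", "up", 2023, 2),
    TemporalFact("Company X", "revenue_trend", "flat", 2023, 3),
    TemporalFact("Company X", "revenue_trend", "down", 2024, 1),
]

q3_2023 = facts_in_period(facts, 2023, quarter=3)  # fine-grained lookup
all_2023 = facts_in_period(facts, 2023)            # coarse-grained lookup
```

The point of the sketch is only that a time-stamped fact can be addressed at more than one granularity; how TG-RAG actually stores and summarizes time is described in the paper.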

Installation

git clone https://github.com/hanjiale/Temporal-GraphRAG.git
cd Temporal-GraphRAG
# Create virtual environment
python3.12 -m venv venv
source venv/bin/activate 
# Install dependencies
pip install -r requirements.txt

Quick Start

1. Set up API keys (required for LLM and embedding providers):

# Create .env file or set environment variables
export OPENAI_API_KEY="your-openai-key-here" # For OpenAI provider
export GOOGLE_API_KEY="your-google-key-here" # For Gemini provider (or use GEMINI_API_KEY)

2. Build and query:

# Build a graph from documents
python build_graph.py --output_dir ./graph_output --corpus_path ./my_documents/
# Query the graph
python query_graph.py --question "Your question here" --working_dir ./graph_output --mode global
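
If you want to drive these two scripts from Python (for example in a batch job), the commands above can be assembled as argument lists. This is a minimal sketch that uses only the flags shown in this README; the helper names `build_command`, `query_command`, and `run` are hypothetical, and it assumes you run from the repository root.

```python
import subprocess
import sys

QUERY_MODES = ("local", "global", "naive")


def build_command(output_dir, corpus_path, num_docs=None):
    """Assemble the argv list for build_graph.py."""
    cmd = [
        sys.executable, "build_graph.py",
        "--output_dir", str(output_dir),
        "--corpus_path", str(corpus_path),
    ]
    if num_docs is not None:
        cmd += ["--num_docs", str(num_docs)]
    return cmd


def query_command(question, working_dir, mode="global"):
    """Assemble the argv list for query_graph.py, rejecting unknown modes."""
    if mode not in QUERY_MODES:
        raise ValueError(f"mode must be one of {QUERY_MODES}, got {mode!r}")
    return [
        sys.executable, "query_graph.py",
        "--question", question,
        "--working_dir", str(working_dir),
        "--mode", mode,
    ]


def run(cmd):
    """Run a command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)
```

A typical sequence would be `run(build_command("./graph_output", "./my_documents/"))` followed by `run(query_command("Your question here", "./graph_output"))`. Passing the question as a list element avoids shell quoting problems.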

Configurations

Entity Types

Customize which entity types are extracted by editing tgrag/configs/prompts.yaml:

defaults:
 entity_types:
 - "financial concept"
 - "business segment"
 - "event"
 - "company"
 - "person" 
 - "product"
 - "location"

The system will only extract entities matching these configured types.
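
The effect of this list can be pictured as a whitelist filter. The sketch below is an assumption about the behavior, not the code in tgrag: `ALLOWED_TYPES` mirrors the YAML above, while `filter_entities` and the `(name, type)` tuples are hypothetical.

```python
ALLOWED_TYPES = {
    "financial concept", "business segment", "event",
    "company", "person", "product", "location",
}


def filter_entities(candidates, allowed=ALLOWED_TYPES):
    """Keep only (name, type) pairs whose type is in the configured list.

    The comparison is case-insensitive and ignores surrounding whitespace;
    that normalization is a choice made for this sketch.
    """
    return [
        (name, etype) for name, etype in candidates
        if etype.strip().lower() in allowed
    ]


candidates = [
    ("Company X", "company"),
    ("Q3 earnings call", "Event"),
    ("blue", "color"),  # not a configured type, so it is dropped
]
kept = filter_entities(candidates)
```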

LLM and Embedding Providers

Configure in tgrag/configs/config.yaml:

building:
 provider: "gemini" # Options: openai, azure, bedrock, gemini, ollama
 model: "gemini-2.5-flash-lite"
 embedding_provider: "openai"

Supported Providers:

  • OpenAI - Requires OPENAI_API_KEY
  • Azure OpenAI - Requires Azure credentials (set via Azure SDK)
  • Amazon Bedrock - Requires AWS credentials and aioboto3
  • Google Gemini - Requires GOOGLE_API_KEY or GEMINI_API_KEY
  • Ollama - Requires local Ollama server (default: http://localhost:11434)

Set API keys via environment variables or .env file:

export OPENAI_API_KEY="your-key-here"
export GOOGLE_API_KEY="your-key-here" # or GEMINI_API_KEY
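
Before starting a long build it can help to check that the key for the chosen provider is actually set. The following is a minimal sketch using only the variable names listed above; the `REQUIRED_ENV` table and `missing_credentials` helper are hypothetical and not part of tgrag. Azure, Bedrock, and Ollama are left without entries because this README does not name specific variables for them.

```python
import os

# Each inner tuple lists alternatives: any one of them being set is enough.
REQUIRED_ENV = {
    "openai": [("OPENAI_API_KEY",)],
    "gemini": [("GOOGLE_API_KEY", "GEMINI_API_KEY")],
    "azure": [],    # credentials come from the Azure SDK
    "bedrock": [],  # credentials come from the AWS configuration
    "ollama": [],   # local server, no key named in the README
}


def missing_credentials(provider, env=None):
    """Return the unmet requirements for `provider` as a list of tuples."""
    env = os.environ if env is None else env
    if provider not in REQUIRED_ENV:
        raise ValueError(f"unknown provider: {provider!r}")
    return [
        alternatives for alternatives in REQUIRED_ENV[provider]
        if not any(env.get(name) for name in alternatives)
    ]
```

For example, `missing_credentials("gemini", {"GEMINI_API_KEY": "x"})` returns an empty list, while `missing_credentials("openai", {})` reports the missing `OPENAI_API_KEY`.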

Usage Examples

Building the Graph

The build_graph.py script detects the input type automatically from --corpus_path:

ECT-QA corpus (JSONL.gz):

python build_graph.py --output_dir ./graph_output --corpus_path ./ect-qa/corpus/base.jsonl.gz --num_docs 10

Single text file:

python build_graph.py --output_dir ./graph_output --corpus_path ./my_document.txt

Directory of text files (recursive):

python build_graph.py --output_dir ./graph_output --corpus_path ./my_documents/

Supported text formats: .txt, .md, .rst, .text, .log, and files without extensions.
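
The three input cases and the extension list above can be mirrored in a few lines of Python, which is handy for previewing what a directory build would pick up. This is a sketch under assumptions: `detect_input_type` and `collect_text_files` are hypothetical helpers, and the actual detection logic in build_graph.py may differ.

```python
from pathlib import Path

# Suffixes from the list above; "" stands for files without an extension.
TEXT_SUFFIXES = {".txt", ".md", ".rst", ".text", ".log", ""}


def detect_input_type(path):
    """Classify a corpus path as 'jsonl_gz', 'directory', or 'text_file'."""
    p = Path(path)
    if p.name.endswith(".jsonl.gz"):
        return "jsonl_gz"
    if p.is_dir():
        return "directory"
    return "text_file"


def collect_text_files(directory):
    """Recursively list files whose suffix is a supported text format."""
    return sorted(
        f for f in Path(directory).rglob("*")
        if f.is_file() and f.suffix.lower() in TEXT_SUFFIXES
    )
```

Note that `Path.suffix` is empty for dotfiles such as `.gitignore`, so this sketch would treat them as extensionless text files.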

Query Modes

# Local mode - for specific facts
python query_graph.py --question "What was Company X's revenue in Q3 2023?" --mode local
# Global mode - for trends and summarization
python query_graph.py --question "How did tech companies navigate 2023 challenges?" --mode global
# Naive mode - simple RAG
python query_graph.py --question "What is artificial intelligence?" --mode naive

Python API Examples

from tgrag import create_temporal_graphrag_from_config
# Build the graph
graph_rag = create_temporal_graphrag_from_config(
 config_path="tgrag/configs/config.yaml",
 config_type="building"
)
# Insert documents
graph_rag.insert([{"title": "Doc 1", "doc": "content..."}])
# Query the graph (re-create the instance with the querying configuration)
graph_rag = create_temporal_graphrag_from_config(
 config_path="tgrag/configs/config.yaml",
 config_type="querying"
)
answer = graph_rag.query("Your question here", mode="global")
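
To compare how the three modes answer the same question, the `query` call can be wrapped in a small loop. This sketch relies only on the `query(question, mode=...)` call shown above; `ask_all_modes` is a hypothetical helper, and `EchoRAG` is a stand-in so the example runs without a built graph.

```python
def ask_all_modes(rag, question, modes=("local", "global", "naive")):
    """Query `rag` once per mode and return {mode: answer}.

    `rag` is any object exposing query(question, mode=...), such as the
    instance created with config_type="querying".
    """
    return {mode: rag.query(question, mode=mode) for mode in modes}


class EchoRAG:
    """Stand-in used only to demonstrate the helper."""

    def query(self, question, mode="global"):
        return f"[{mode}] {question}"


answers = ask_all_modes(EchoRAG(), "Your question here")
```

With a real graph, pass the `graph_rag` object instead of `EchoRAG()`.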

ECT-QA Dataset

ECT-QA is a high-quality benchmark for time-sensitive question answering:

  • Corpus: 480 earnings call transcripts (24 companies, 2020-2024)
  • Questions: 1,005 specific + 100 abstract temporal queries

The dataset is also available on Hugging Face: austinmyc/ECT-QA

You can load it using:

from datasets import load_dataset
# Load questions dataset
questions = load_dataset("austinmyc/ECT-QA", "questions")
# Load corpus dataset
corpus = load_dataset("austinmyc/ECT-QA", "corpus")
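
The corpus is also stored in the repository as gzipped JSONL (for example ect-qa/corpus/base.jsonl.gz). If you prefer to read those files directly, a standard-library sketch could look like this. `read_jsonl_gz` is a hypothetical helper, and it makes no assumption about the field names inside each record.

```python
import gzip
import json
from itertools import islice


def read_jsonl_gz(path, limit=None):
    """Yield one parsed JSON object per non-empty line of a gzipped JSONL file.

    `limit` caps the number of records; None reads the whole file.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        lines = (line for line in handle if line.strip())
        for line in islice(lines, limit):
            yield json.loads(line)
```

Calling `next(read_jsonl_gz("ect-qa/corpus/base.jsonl.gz"))` and inspecting the keys of the returned dict is a quick way to see the record schema before building a graph.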

Repository Structure

Temporal-GraphRAG/
├── tgrag/
│   ├── configs/
│   │   ├── config.yaml          # Main configuration
│   │   └── prompts.yaml         # Prompts for indexing and querying
│   └── src/
│       ├── temporal_graphrag.py
│       └── ...
├── ect-qa/                      # ECT-QA dataset
│   ├── corpus/
│   │   ├── base.jsonl.gz        # 2020 - 2023
│   │   └── new.jsonl.gz         # 2024
│   └── questions/
│       ├── local_base.jsonl
│       ├── local_new.jsonl
│       ├── global_base.jsonl
│       └── global_new.jsonl
├── graph_storage/
│   └── ...                      # Output graphs
├── build_graph.py               # Script to build knowledge graph
├── query_graph.py               # Script to query the graph
├── requirements.txt
├── README.md
├── LICENSE
└── .gitignore

Citation

@article{han2025rag,
 title={RAG Meets Temporal Graphs: Time-Sensitive Modeling and Retrieval for Evolving Knowledge},
 author={Han, Jiale and Cheung, Austin and Wei, Yubai and Yu, Zheng and Wang, Xusheng and Zhu, Bing and Yang, Yi},
 journal={arXiv preprint arXiv:2510.13590},
 year={2025}
}

Acknowledgments

Paper available at: arXiv:2510.13590

