Research Assistant Agent

An AI-powered research assistant for collecting and analyzing academic papers from ArXiv and Semantic Scholar. Built with async Python, FAISS vector search, and LLM integration for intelligent paper analysis.

Features

🔍 Multi-source paper collection from ArXiv and Semantic Scholar APIs
⚡ Async/await architecture for efficient concurrent API calls
🚦 Intelligent rate limiting with adaptive backoff strategies
🧠 LLM-powered analysis for extracting insights from papers
📊 Vector similarity search using FAISS for finding related papers
🖥️ Rich CLI with colorful tables and progress tracking

Installation

# Clone the repository
git clone https://github.com/davidburton/ResearchAssistantAgent.git
cd ResearchAssistantAgent
# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e .

Quick Start

Command Line Interface

Search for papers across both ArXiv and Semantic Scholar:

# Basic search
research-assistant search "transformer neural networks"
# Search only ArXiv
research-assistant search "quantum computing" --source arxiv --limit 20
# Search by author
research-assistant advanced-search --author "Yoshua Bengio" --limit 10
# Search by category (ArXiv)
research-assistant advanced-search --category cs.AI --limit 15
# Store results in vector database (requires OpenAI API key for embeddings)
research-assistant search "large language models" --store

Python API

import asyncio
from research_assistant import ArxivCollector, SemanticScholarCollector

async def search_papers():
    # Search ArXiv
    async with ArxivCollector() as arxiv:
        papers = await arxiv.search("cat:cs.LG transformer", max_results=5)
        for paper in papers:
            print(f"{paper.title} - {paper.arxiv_id}")

    # Search Semantic Scholar
    async with SemanticScholarCollector() as s2:
        papers = await s2.search("deep learning", limit=5)
        for paper in papers:
            print(f"{paper.title} - Citations: {paper.citation_count}")

asyncio.run(search_papers())
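
The example above queries the two sources one after the other. Since both collectors are async, the searches could also run concurrently. The following is a minimal, self-contained sketch of that pattern using `asyncio.gather`; the `fake_arxiv_search` and `fake_s2_search` coroutines are hypothetical stand-ins (not part of this package) so the snippet runs without network access or API keys.

```python
import asyncio


# Hypothetical stand-ins for the real collectors: each one sleeps briefly to
# simulate network latency, then returns canned result strings.
async def fake_arxiv_search(query: str) -> list[str]:
    await asyncio.sleep(0.05)
    return [f"arxiv: {query} #1", f"arxiv: {query} #2"]


async def fake_s2_search(query: str) -> list[str]:
    await asyncio.sleep(0.05)
    return [f"s2: {query} #1"]


async def search_all(query: str) -> list[str]:
    # gather() schedules both coroutines concurrently and returns their
    # results in the same order the coroutines were passed in.
    arxiv_hits, s2_hits = await asyncio.gather(
        fake_arxiv_search(query),
        fake_s2_search(query),
    )
    return arxiv_hits + s2_hits


if __name__ == "__main__":
    for hit in asyncio.run(search_all("transformers")):
        print(hit)
```

With real collectors, the two stub calls would presumably be replaced by `arxiv.search(...)` and `s2.search(...)` inside their respective `async with` blocks.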

Architecture

The project follows a modular architecture:

src/research_assistant/
├── collectors/          # API clients for paper sources
│   ├── arxiv_collector.py
│   └── semantic_scholar_collector.py
├── analyzers/           # LLM-based paper analysis
│   └── paper_analyzer.py
├── vector_store/        # FAISS similarity search
│   └── faiss_store.py
└── utils/               # Rate limiting and helpers
    └── rate_limiter.py

Configuration

Set environment variables for API keys:

export OPENAI_API_KEY="your-api-key" # For paper analysis and embeddings
export SEMANTIC_SCHOLAR_API_KEY="your-key" # Optional, for higher rate limits
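
For scripts that build on the package, it can help to read both variables in one place and treat an unset or empty value as "no key". The sketch below is illustrative only: the `Settings` dataclass and `load_settings` helper are hypothetical names and are not claimed to exist in `research_assistant`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two API keys described above."""
    openai_api_key: Optional[str]
    semantic_scholar_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # An unset or empty variable is normalized to None so callers can
    # simply check `if settings.openai_api_key:`.
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        semantic_scholar_api_key=env.get("SEMANTIC_SCHOLAR_API_KEY") or None,
    )


if __name__ == "__main__":
    settings = load_settings()
    print("OpenAI key set:", settings.openai_api_key is not None)
    print("Semantic Scholar key set:", settings.semantic_scholar_api_key is not None)
```

Passing the environment as a parameter keeps the helper easy to test with a plain dictionary.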

Development

# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest
# Format code
black src/ tests/
# Type checking
mypy src/

API Rate Limits

The tool respects API rate limits:

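ArXiv: Max 3 requests/second by default (configurable). Note that arXiv's API terms of use ask for no more than one request every 3 seconds, so consider lowering this for sustained use.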
Semantic Scholar: 100 requests per 5 minutes (anonymous)
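
A limit such as "100 requests per 5 minutes" is commonly enforced with a sliding window over recent call timestamps. The sketch below shows that general technique; it is not the package's actual implementation, and `SlidingWindowLimiter` with its `try_acquire` method is a hypothetical name chosen for illustration. The clock is injectable so the behavior can be checked without waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `time_window` seconds (a sketch)."""

    def __init__(self, max_calls: int, time_window: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.time_window = time_window
        self._clock = clock
        self._calls = deque()  # timestamps of accepted calls, oldest first

    def try_acquire(self) -> bool:
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.time_window:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False


if __name__ == "__main__":
    # Mirrors the anonymous Semantic Scholar limit quoted above.
    limiter = SlidingWindowLimiter(max_calls=100, time_window=300)
    accepted = sum(limiter.try_acquire() for _ in range(150))
    print(f"accepted {accepted} of 150 immediate calls")
```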

Advanced Usage

Using the Rate Limiter

import asyncio
from research_assistant import RateLimiter, AdaptiveRateLimiter

# Fixed rate limiting
limiter = RateLimiter(max_calls=10, time_window=60)  # 10 calls per minute

# Adaptive rate limiting (adjusts based on server responses)
adaptive = AdaptiveRateLimiter(
    initial_rate=10.0,
    min_rate=1.0,
    max_rate=50.0,
    backoff_factor=0.5,
)

# Use as an async context manager (must run inside a coroutine)
async def main():
    async with limiter:
        # Your API call here
        pass

asyncio.run(main())
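
To give a feel for what "adaptive" can mean here, the sketch below models one plausible policy: multiply the rate by `backoff_factor` when the server throttles a request, and raise it by a fixed step after a success, always clamped to `[min_rate, max_rate]`. The parameter names mirror the constructor above, but the policy itself, the `AimdRate` class, and the `recovery_step` parameter are assumptions for illustration, not a description of how `AdaptiveRateLimiter` actually behaves.

```python
class AimdRate:
    """Additive-increase / multiplicative-decrease rate tracker (a sketch)."""

    def __init__(self, initial_rate: float, min_rate: float, max_rate: float,
                 backoff_factor: float, recovery_step: float = 1.0):
        self.rate = initial_rate
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.backoff_factor = backoff_factor
        self.recovery_step = recovery_step  # hypothetical parameter

    def on_throttled(self) -> None:
        # e.g. after an HTTP 429: cut the rate, but never below min_rate.
        self.rate = max(self.min_rate, self.rate * self.backoff_factor)

    def on_success(self) -> None:
        # Recover slowly, but never above max_rate.
        self.rate = min(self.max_rate, self.rate + self.recovery_step)


if __name__ == "__main__":
    tracker = AimdRate(initial_rate=10.0, min_rate=1.0, max_rate=50.0,
                       backoff_factor=0.5)
    tracker.on_throttled()
    print("after one throttle:", tracker.rate)
    tracker.on_success()
    print("after one success:", tracker.rate)
```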

Paper Analysis with LLMs

import asyncio
from research_assistant import PaperAnalyzer, AnalysisType

async def main():
    analyzer = PaperAnalyzer(api_key="your-openai-key")
    # Analyze a paper (analyze_paper is a coroutine, so it must be awaited)
    analysis = await analyzer.analyze_paper(
        paper_text="Paper abstract or full text...",
        paper_id="arxiv.2301.00001",
        paper_title="Attention Is All You Need",
        analysis_type=AnalysisType.METHODOLOGY,
    )
    print(analysis.methodology)
    print(analysis.key_contributions)

asyncio.run(main())

Vector Store Operations

from research_assistant import FAISSVectorStore, Document

# Initialize vector store
store = FAISSVectorStore(dimension=1536, index_type="flat")

# Add documents
doc = Document(
    id="paper_001",
    text="Paper content...",
    metadata={"title": "Paper Title", "authors": ["Author 1"]},
    embedding=[0.1, 0.2, ...],  # 1536-dimensional vector
)
store.add_documents([doc])

# Search similar documents (query_embedding is a 1536-dimensional query vector)
results = store.search(query_embedding, k=10)

# Save and load
store.save("./my_index")
loaded_store = FAISSVectorStore.load("./my_index")
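
Conceptually, a flat index performs an exhaustive search: it compares the query against every stored vector and returns the `k` closest. The pure-Python sketch below illustrates that idea with squared L2 distance on tiny 2-dimensional vectors. It assumes `index_type="flat"` corresponds to this kind of exact search, which is not confirmed here, and the `flat_search` helper is a hypothetical name, not part of the package or of FAISS.

```python
import heapq
from typing import Mapping, Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    # Squared Euclidean distance; smaller means more similar.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(query: Sequence[float],
                vectors: Mapping[str, Sequence[float]],
                k: int) -> list[tuple[float, str]]:
    # Score every stored vector against the query, then keep the k smallest
    # distances. Results are (distance, doc_id) pairs, nearest first.
    scored = ((squared_l2(query, vec), doc_id) for doc_id, vec in vectors.items())
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    toy_index = {
        "paper_a": [0.0, 0.0],
        "paper_b": [1.0, 0.0],
        "paper_c": [3.0, 4.0],
    }
    for distance, doc_id in flat_search([0.9, 0.0], toy_index, k=2):
        print(doc_id, round(distance, 4))
```

Exhaustive search is exact but scales linearly with the number of stored vectors, which is the usual trade-off against approximate index types.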

Testing

Run the test suite:

# Run all tests
pytest
# Run with coverage
pytest --cov=research_assistant tests/
# Run specific test file
pytest tests/unit/utils/test_rate_limiter.py

Project Status

This is an actively developed research tool. Current status (✅ done, 🚧 in progress, 📋 planned):

✅ Core API collectors (ArXiv, Semantic Scholar)
✅ Rate limiting and async architecture
✅ FAISS vector store integration
✅ CLI
🚧 Full paper content extraction
🚧 Advanced LLM analysis pipelines
📋 Web UI dashboard
📋 Citation graph analysis

Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

License

MIT License - see LICENSE file for details.

Acknowledgments

Built with:

aiohttp for async HTTP
FAISS for vector search
Click for CLI
Rich for beautiful terminal output
