rhowardstone/Claude-Code-Scientist

Claude Code Scientist

Fully open-source autonomous scientific research capabilities for Claude Code.

NOTE: This project has NO official ties to Anthropic or Claude. It is a completely independent, open-source project.

What This Is

Claude Code Scientist transforms Claude Code into a semi-autonomous, self-improving research system. It provides:

  • Research Director logic via CLAUDE.md
  • Specialized subagents for literature review, synthesis, peer review, experiments
  • Skills for orchestrating multi-step research workflows
  • Provenance tracking ensuring every claim has a source

Prerequisites

  • Python 3.9+ with pip
  • Claude Code CLI installed and authenticated
  • ~2GB disk space for models and caches
  • Claude Code subscription: Required - Pro or Max (run /login inside Claude Code)

Installation

git clone https://github.com/rhowardstone/Claude-Code-Scientist.git
cd Claude-Code-Scientist
# Install Python dependencies
pip install -r requirements.txt
# Download spaCy model
python -m spacy download en_core_web_sm

Optional: Install to Other Projects

# Install to another project's .claude/ directory
./install.sh /path/to/your/project
# Or install globally to ~/.claude/
./install.sh --global

Quick Start

./session.sh new "Your research goal here"

Or, even more simply:

./session.sh new

That's it! This creates a session and launches Claude Code, which will automatically:

  1. Decompose the goal into Research Questions
  2. Search literature across multiple databases (OpenAlex, PubMed, Semantic Scholar)
  3. Extract evidence with full provenance (DOI + quote + page)
  4. Synthesize findings into a LaTeX paper
  5. Run peer review with three specialized reviewers
  6. Iterate until unanimous acceptance

Session Management

Each research project gets its own isolated session:

./session.sh new "goal" # Create new session
./session.sh list # List all sessions
./session.sh resume <id> # Resume a session
./session.sh current # Check active session

Sessions store all artifacts in workspace/sessions/session_<id>/.
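
As a rough illustration of that layout, the sketch below lists session IDs by scanning the sessions directory. It is not part of the project: the function name list_session_ids is hypothetical, and the only thing it assumes is the workspace/sessions/session_<id>/ naming described above.

```python
from pathlib import Path


def list_session_ids(workspace="workspace"):
    """Return the <id> part of every session_<id>/ directory, sorted."""
    root = Path(workspace) / "sessions"
    if not root.is_dir():
        return []  # no sessions have been created yet
    prefix = "session_"
    return sorted(
        entry.name[len(prefix):]
        for entry in root.iterdir()
        if entry.is_dir() and entry.name.startswith(prefix)
    )
```

For day-to-day use, ./session.sh list remains the supported way to see sessions; a helper like this is only handy inside your own scripts.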

What You Get

A completed session produces:

workspace/sessions/session_abc123/
├── synthesis/
│   ├── paper.tex               # LaTeX paper with citations
│   ├── paper.pdf               # Compiled PDF
│   └── references.bib          # Bibliography with DOIs
├── literature/
│   └── preread_papers.json     # Discovered papers with abstracts
├── peer_review/
│   ├── methodology_review.json
│   ├── statistics_review.json
│   └── impact_review.json
├── experiments/                # If experiments were run
│   ├── results.json
│   └── figures/
└── world_model.json            # Research state
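
To check whether a session actually produced everything in the tree above, a small script can compare the directory against the expected file list. This is a sketch under assumptions: EXPECTED_ARTIFACTS and missing_artifacts are hypothetical names, the list is copied from the tree, and experiments/ is left out because it is only present when experiments were run.

```python
from pathlib import Path

# File list copied from the tree above; experiments/ is optional and omitted.
EXPECTED_ARTIFACTS = [
    "synthesis/paper.tex",
    "synthesis/paper.pdf",
    "synthesis/references.bib",
    "literature/preread_papers.json",
    "peer_review/methodology_review.json",
    "peer_review/statistics_review.json",
    "peer_review/impact_review.json",
    "world_model.json",
]


def missing_artifacts(session_dir):
    """Return the expected files that do not exist under session_dir."""
    base = Path(session_dir)
    return [rel for rel in EXPECTED_ARTIFACTS if not (base / rel).is_file()]
```

An empty result means every listed file is present. A missing paper.pdf on its own may simply mean LaTeX is not installed.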

Self-Improvement (Optional)

After completing research sessions, you can run CORTEX to analyze what went well and what could be improved:

./session.sh cortex # Launch cortex session

Then run /cortex to start the self-improvement cycle. CORTEX traces the narrative flow of prior sessions, diagnoses issues, and generates fixes. It's how this system improves itself.

Architecture

Claude Code Scientist
│
├── CLAUDE.md (Research Director prompt)
│
├── .claude/
│   ├── agents/                 # Specialized subagent configs (7)
│   │   ├── lit-scout.md
│   │   ├── synthesizer.md
│   │   ├── reviewer-*.md       # 3 reviewers
│   │   ├── experimentalist.md
│   │   └── tool-acquirer.md
│   │
│   ├── skills/                 # Orchestration workflows (24)
│   │   ├── literature-search/
│   │   ├── peer-review/
│   │   ├── goal-decomposition/
│   │   ├── synthesizer/
│   │   └── ...
│   │
│   ├── hooks/                  # Validation automation (8)
│   │   ├── validate-claims.py
│   │   ├── validate-doi.py
│   │   └── verify-provenance.py
│   │
│   └── rules/                  # Conventions (3)
│       ├── provenance-tracking.md
│       ├── world-model.md
│       └── workflow.md
│
├── craig/                      # Python utilities (137 files)
│   ├── world_model.py          # Research state management
│   ├── doi_fetcher.py          # DOI validation
│   ├── latex_compiler.py       # Paper compilation
│   ├── literature/             # Database clients
│   │   ├── openalex_client.py
│   │   ├── pubmed_client.py
│   │   └── semantic_scholar_client.py
│   ├── pipeline/               # Phase implementations
│   └── experiment_harness_templates/  # Experiment scaffolding
│
└── mcp-servers/literature/     # Literature search MCP

Capabilities

Literature Acquisition

  • Triple-search strategy (keywords, semantic, "googling the question")
  • Multi-database support (OpenAlex, PubMed, Semantic Scholar)
  • Citation graph expansion (forward + reverse)
  • Relevance filtering with provenance
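
Citation graph expansion can be pictured as a bounded breadth-first walk that follows citation links in both directions from a set of seed papers. The sketch below is illustrative only and does not reproduce citation_expander.py: expand_citations, the cites mapping, and max_hops are hypothetical, and the graph is a plain in-memory dict rather than a live database query.

```python
from collections import deque


def expand_citations(seeds, cites, max_hops=1):
    """Collect papers within max_hops citation links of the seeds.

    cites maps a paper id to the ids it references. Links are followed
    both ways: to the papers a paper cites and to the papers citing it.
    """
    # Build the inverse index: paper id -> ids of the papers that cite it.
    cited_by = {}
    for source, references in cites.items():
        for ref in references:
            cited_by.setdefault(ref, set()).add(source)

    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        paper, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        neighbours = set(cites.get(paper, ())) | cited_by.get(paper, set())
        for neighbour in neighbours - seen:
            seen.add(neighbour)
            queue.append((neighbour, hops + 1))
    return seen
```

Keeping max_hops small matters in practice, since the number of reachable papers tends to grow very quickly with each extra hop.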

Evidence Extraction

  • Rigorous claim extraction with DOI + quote + page
  • Confidence scoring with justification
  • Conflict detection across papers
  • Gap identification for experimental follow-up
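
The DOI + quote + page requirement maps naturally onto a small record type with a validity check. The following is a minimal sketch, not the project's schema: Claim, DOI_SHAPE, and provenance_problems are hypothetical names, and the regular expression only checks the general shape of a DOI (it does not confirm that the DOI resolves).

```python
import re
from dataclasses import dataclass

# Loose syntactic check: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")


@dataclass(frozen=True)
class Claim:
    text: str   # the claim as it will appear in the paper
    doi: str    # source identifier
    quote: str  # verbatim supporting passage
    page: int   # page on which the quote appears


def provenance_problems(claim):
    """Return a list of human-readable problems; empty means acceptable."""
    problems = []
    if not DOI_SHAPE.match(claim.doi):
        problems.append("doi is missing or malformed")
    if not claim.quote.strip():
        problems.append("quote is empty")
    if claim.page < 1:
        problems.append("page must be 1 or greater")
    return problems
```

Returning a list of problems rather than a boolean makes it easy to report every gap at once, which fits the "missing evidence over false evidence" stance: a claim with problems is flagged, not silently patched.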

Synthesis

  • Academic paper generation (LaTeX)
  • Proper citations with DOIs
  • Narrative flow (not database dump)
  • Separation of direct vs analogical evidence

Peer Review

  • Three reviewers: methodology, statistics, impact
  • Actionable feedback with specific locations
  • Revision cycles until unanimous acceptance
  • External validation (Codex) when available
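
The revision cycle can be sketched as a loop that stops only when all three reviewers accept. This is an assumption-laden outline, not the project's implementation: the verdict string "accept", the review and revise callables, and the max_rounds safety cap are all hypothetical (the description above does not state a round limit).

```python
REVIEWERS = ("methodology", "statistics", "impact")


def is_unanimous(verdicts):
    """True only if every expected reviewer returned 'accept'."""
    return all(verdicts.get(name) == "accept" for name in REVIEWERS)


def review_until_accepted(draft, review, revise, max_rounds=5):
    """Alternate review and revision until all reviewers accept.

    review(draft) returns {reviewer_name: verdict};
    revise(draft, verdicts) returns an updated draft.
    Returns (accepted_draft, rounds_used).
    """
    for round_no in range(1, max_rounds + 1):
        verdicts = review(draft)
        if is_unanimous(verdicts):
            return draft, round_no
        draft = revise(draft, verdicts)
    raise RuntimeError(f"no unanimous acceptance after {max_rounds} rounds")
```

Note that a missing verdict counts as a rejection here, so two acceptances plus one absent review never pass. That is the difference between unanimity and a majority vote.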

Experimentation

  • Phased design → implementation → validation → execution
  • Real data only (no mock/simulated)
  • Timing validation before full runs
  • Incremental saves and checkpointing
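
Incremental saves and checkpointing usually come down to two habits: write results to disk after each unit of work, and skip units that are already recorded when restarting. The sketch below shows one common way to do that and is not taken from the experiment harness; save_checkpoint and run_with_checkpoints are hypothetical names, and items are assumed to be strings so they can serve as JSON keys.

```python
import json
import os
import tempfile
from pathlib import Path


def save_checkpoint(path, state):
    """Write state as JSON atomically: temp file first, then rename."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_name, path)  # a crash never leaves a half-written file


def run_with_checkpoints(items, process, path):
    """Process each item once, saving after every item so runs can resume."""
    path = Path(path)
    done = json.loads(path.read_text()) if path.exists() else {}
    for item in items:
        if item in done:
            continue  # finished in an earlier run
        done[item] = process(item)
        save_checkpoint(path, done)
    return done
```

Saving after every item costs some I/O, but it bounds the work lost to an interruption at a single item, which is usually the right trade for long experiment runs.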

Key Principles

  1. Provenance is everything - Every claim needs DOI + quote + page
  2. No simulation trap - Run actual tools, not simulations
  3. Writing code ≠ running code - Always execute to verify
  4. Honesty over completion - Missing evidence > false evidence
  5. Unanimous peer review - Not majority vote

Hardware Requirements

Minimum (Literature-Only Research)

  • RAM: 8GB (32GB+ recommended)
  • Storage: 10GB for caches and session data
  • CPU: Any modern multi-core
  • GPU: Not required (CPU embeddings work fine)

Recommended (Full Pipeline with Experiments)

  • RAM: 32GB+ (genomics/scRNA-seq data can be large)
  • Storage: 50GB+ for paper PDFs and knowledge graphs
  • GPU: NVIDIA with CUDA for faster embeddings (optional)

What Gets Installed

  • Python packages: ~500MB
  • Embedding models: ~400MB (downloaded on first use)
  • spaCy models: ~50MB
  • FAISS: ~10MB

API Costs

  • Claude Code subscription: Required - Pro or Max (run /login inside Claude Code)
  • Literature search: Free (OpenAlex, PubMed, Semantic Scholar are free APIs)
  • PDF access: Free (uses open access sources only)

Customization

Adding MCP Servers

Create .mcp.json to add literature databases:

{
  "mcpServers": {
    "openalex": {
      "type": "stdio",
      "command": "python",
      "args": ["mcp-servers/literature/server.py"]
    }
  }
}
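
If a project already has an .mcp.json, a new server should be merged into it instead of overwriting the file. The helper below is a hypothetical sketch of that merge: with_stdio_server is not part of this repository, and it assumes only the mcpServers / type / command / args structure shown above.

```python
import copy


def with_stdio_server(config, name, command, args):
    """Return a copy of an .mcp.json dict with one stdio server added or replaced."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    updated.setdefault("mcpServers", {})[name] = {
        "type": "stdio",
        "command": command,
        "args": list(args),
    }
    return updated
```

To apply it, load the existing file with json.loads, pass the result through the helper, and write it back with json.dumps(updated, indent=2).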

Adding Skills

Create .claude/skills/my-skill/SKILL.md:

---
name: my-skill
description: What it does and when to use it
user-invocable: true
---
# Skill instructions here

Adding Agents

Create .claude/agents/my-agent.md:

---
name: my-agent
description: Specialized agent description
model: sonnet
---
# Agent instructions here
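
Both skill and agent files start with a front-matter block delimited by --- lines, followed by the instructions. To sanity-check such a file before committing it, a few lines of Python are enough. This is a deliberately minimal sketch: parse_front_matter is a hypothetical helper that handles only flat key: value pairs like the ones shown above, keeps every value as a string, and is not a full YAML parser.

```python
def parse_front_matter(text):
    """Split a file into (metadata dict, body).

    Returns ({}, text) when there is no complete '---' delimited block.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter found
            return meta, "\n".join(lines[index + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # opening delimiter without a closing one
```

A quick check might then confirm that name and description are both present and non-empty, since those are the fields every example above defines.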

Workspace

Research artifacts are stored in workspace/:

workspace/
├── world_model.json # Research state
├── literature/ # Search results, papers
├── synthesis/ # Paper drafts
├── peer_review/ # Review feedback
└── experiments/ # Experimental artifacts

Python Utilities

The craig/ directory contains 137 Python files providing:

Core Utilities

  • world_model.py - Research state management (papers, claims, RQs)
  • doi_fetcher.py - DOI validation and metadata retrieval
  • latex_compiler.py - LaTeX paper compilation
  • conflict_detector.py - Detect contradictions across sources
  • data_provenance.py - Track evidence chains

Literature Clients

  • openalex_client.py - 200M+ open access works
  • pubmed_client.py - Biomedical literature
  • semantic_scholar_client.py - CS/AI papers with embeddings
  • citation_expander.py - Forward/reverse citation traversal

Experiment Harness

The craig/experiment_harness_templates/ provides scaffolding for experiments:

  • run.sh - Master experiment runner
  • steps/ - Modular experiment phases
  • lib/ - Utilities for checkpointing, scaling, validation

LaTeX (Optional)

For compiling the generated papers to PDF:

apt install texlive-latex-base texlive-latex-extra # Debian/Ubuntu
# or: brew install --cask mactex # macOS

GPU Acceleration (Optional)

For faster embeddings during knowledge graph ingestion:

# Replace faiss-cpu with faiss-gpu
pip uninstall faiss-cpu
pip install faiss-gpu
# Use CUDA for embeddings
python -m craig.literature.knowledge_graph.ingest --device cuda --batch-size 128

Troubleshooting

Duplicate Hooks Running

If you see hooks running twice (e.g., PostToolUse:Write hook succeeded appearing 6 times instead of 3), you have hooks configured in both:

  • Global: ~/.claude/settings.json
  • Local: .claude/settings.json

Claude Code merges both, so they stack. Solutions:

  1. Remove global hooks if you only use this project
  2. Remove local hooks if you prefer global configuration
  3. Accept duplicates - they're harmless, just verbose
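
To find out which hooks are likely stacking, you can compare the hook event names in the two settings files. The sketch below assumes each settings.json keeps its hooks under a top-level "hooks" object keyed by event name (for example PostToolUse); load_settings and overlapping_hook_events are hypothetical helpers, not part of this project or of Claude Code.

```python
import json
from pathlib import Path


def load_settings(path):
    """Read a settings.json file; a missing file counts as empty settings."""
    file_path = Path(path).expanduser()
    return json.loads(file_path.read_text()) if file_path.is_file() else {}


def overlapping_hook_events(global_settings, local_settings):
    """Return hook event names configured in both settings dicts, sorted."""
    global_events = set(global_settings.get("hooks", {}))
    local_events = set(local_settings.get("hooks", {}))
    return sorted(global_events & local_events)
```

Call it as overlapping_hook_events(load_settings("~/.claude/settings.json"), load_settings(".claude/settings.json")). A shared event name is only a hint: the two files may register different commands for the same event, so inspect the entries before deleting anything.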

API Key Warnings

Messages like NCBI_API_KEY not set are informational. The pipeline works without API keys but may hit rate limits. To add keys:

cp .env.example .env
# Edit .env with your keys

Get keys at:

"All source phases exhausted" Errors

This means the paper couldn't be downloaded from any open access source. It's normal for paywalled papers. The pipeline continues with abstract-only data.

Contributing

We welcome contributions! The most valuable way to contribute is:

  1. Run Cortex between your research sessions
  2. Submit improvements as pull requests

# After a research session, run:
./session.sh cortex
# Then in Claude:
/cortex

Cortex analyzes past sessions, diagnoses issues, and generates fixes. Submit the improvements back!

See CONTRIBUTING.md for full guidelines.

License

MIT
