DS-STAR (Data Science - Structured Thought and Action) is a Python-based agentic framework for automating data science tasks. It leverages a multi-agent system powered by Google's Gemini models to analyze data, devise a plan, write and execute code, and iteratively refine the solution to answer a user's query.
This project is an implementation of the paper from Google Research: DS-STAR: A State-of-the-Art Versatile Data Science Agent. Paper
- Agentic Workflow: Implements a pipeline of specialized AI agents (Analyzer, Planner, Coder, Verifier, Router, Debugger, Finalyzer) that collaborate to solve data science problems.
- Reproducibility: Every step of the pipeline is saved, including prompts, generated code, execution results, and metadata. This allows for complete auditability and reproducibility of results.
- Interactive & Resume-able: Runs can be paused and resumed. The interactive mode allows for step-by-step execution.
- Code Editing & Debugging: Allows users to manually edit the generated code during a run and features an auto-debug agent to fix execution errors.
- Configuration-driven: Project settings, model parameters, and run configurations are managed through a
config.yamlfile.
The DS-STAR pipeline is composed of several phases and agents:
- Analysis: The
Analyzeragent inspects the initial data files and generates summaries. - Iterative Planning & Execution:
- The
Plannercreates an initial plan to address the user's query. - The
Codergenerates Python code to execute the current step of the plan. - The code is executed, and the result is captured.
- An automatic
Debuggeragent attempts to fix any code that fails. - The
Verifierchecks if the result sufficiently answers the query. - The
Routerdecides what to do next: either finalize the plan or add a new step for refinement. - This loop continues until the plan is deemed sufficient or the maximum number of refinement rounds is reached.
- The
- Finalization: The
Finalyzeragent takes the final code and results and formats them into a clean, specified output format (e.g., JSON).
All artifacts for each run are stored in the runs/ directory, organized by run_id.
/
├─── dsstar.py # Main script containing the agent logic and CLI
├─── config.yaml # Main configuration file
├─── prompt.yaml # Prompts for the different AI agents
├─── pyproject.toml # Project metadata and dependencies (uv format)
├─── uv.lock # Locked dependency versions for reproducibility
├─── .python-version # Python version specification for uv
├─── data/ # Directory for your data files
└─── runs/ # Directory where all experiment runs and artifacts are stored
- Python 3.11+
- An API key for Google's Gemini models.
- uv package manager (recommended)
-
Clone the repository:
git clone <repository-url> cd DS-Star
-
Install uv (if not already installed):
curl -LsSf https://astral.sh/uv/install.sh | sh -
Install dependencies with uv:
uv sync
-
Set your API Key: The application requires a Gemini API key. You can set it as an environment variable:
export GEMINI_API_KEY='your-api-key'
Alternatively, you can add it to the
config.yamlfile. -
Customize
config.yaml: Create aconfig.yamlfile in the root of the project and customize the settings. See the "Configuration" section below for details.# config.yaml model_name: 'gemini-1.5-flash' max_refinement_rounds: 5 interactive: false # api_key: 'your-api-key' # Alternatively, place it here # Optional: Configure specific models for different agents agent_models: PLANNER: 'gpt-4' CODER: 'gemini-1.5-pro' VERIFIER: 'gemini-1.5-flash'
Place your data files (e.g., .xlsx, .csv) in the data/ directory.
To start a new analysis, you need to provide the data files and a query.
Using uv:
uv run python dsstar.py --data-files file1.xlsx file2.xlsx --query "What is the total sales for each department?"If a run was interrupted, you can resume it using its run_id.
uv run python dsstar.py --resume <run_id>
You can manually edit the last generated piece of code and re-run it. This is useful for manual debugging or tweaking the agent's logic.
uv run python dsstar.py --edit-last --resume <run_id>
This will open the last code file in your default text editor (nano, vim, etc.). After you save and close the editor, the script will re-execute the modified code.
To review each step before proceeding, use the interactive flag.
uv run python dsstar.py --interactive --data-files ... --query "..."This project uses uv for fast and reliable dependency management. Here are some useful commands:
- Install dependencies:
uv sync - Add a new dependency:
uv add package-name - Remove a dependency:
uv remove package-name - Update dependencies:
uv sync --upgrade - Run a command in the virtual environment:
uv run python script.py - Show installed packages:
uv pip list
- Speed: uv is 10-100x faster than pip
- Reliability: Consistent dependency resolution with lock files
- No virtual environment activation needed: Use
uv runto execute commands directly - Better dependency resolution: Automatically resolves complex dependency conflicts
The following options are available in config.yaml and can be overridden by CLI arguments:
run_id(string): The ID of a run to resume.max_refinement_rounds(int): The maximum number of times the agent will try to refine its plan.api_key(string): Your Gemini API key.model_name(string): The Gemini model to use (e.g.,gemini-1.5-flash).interactive(bool): If true, waits for user input before executing each step.auto_debug(bool): If true, theDebuggeragent will automatically try to fix failing code.execution_timeout(int): Timeout in seconds for code execution.execution_timeout(int): Timeout in seconds for code execution.preserve_artifacts(bool): If true, all step artifacts are saved to therunsdirectory.agent_models(dict): A dictionary mapping agent names (e.g.,PLANNER,CODER) to specific model names. If not specified,model_nameis used.
Contributions are welcome! Please feel free to submit a pull request or open an issue for any bugs or feature requests.