A Python tool for compressing and organizing code files into a single, LLM-friendly text file. This tool is designed to help prepare codebases for analysis by Large Language Models by removing unnecessary content while preserving important semantic information.
Compression:
- Preserves docstrings and important comments
- Removes redundant whitespace and formatting
- Maintains code structure and readability
- Handles multiple programming languages
Supported file types:
- Python (with AST-based compression)
- JavaScript
- Java
- C/C++
- Shell scripts
- HTML/CSS
- Configuration files (JSON, YAML, TOML, INI)
- Markdown
Output format:
- XML-style semantic markers
- File metadata and type information
- Organized imports section
- Clear file boundaries
- Consistent formatting
Automation:
- GitHub Actions integration
- Automatic updates on code changes
- CI/CD friendly
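The AST-based compression listed for Python can be pictured with the standard-library `ast` module. This is a minimal sketch, not the tool's actual implementation: `compress_python` is a hypothetical helper, and it relies on `ast.unparse`, which needs Python 3.9 or later. Round-tripping through the syntax tree drops every comment and normalizes whitespace, while docstrings survive because they are part of the tree. Keeping "important comments" would need extra handling that this sketch does not attempt.

```python
import ast


def compress_python(source: str) -> str:
    """Round-trip source through the AST (hypothetical helper, not the llmstxt API).

    Comments and redundant blank lines are lost because they are not part of
    the syntax tree; docstrings are kept because they are string expressions.
    """
    tree = ast.parse(source)
    return ast.unparse(tree)  # ast.unparse requires Python 3.9+


sample = '''
def example():
    """Docstring preserved."""

    # this comment is dropped
    return True
'''

print(compress_python(sample))
```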
This project uses uv for dependency management, but can also be installed directly with pip.
    # Using pip
    pip install git+https://github.com/ngmisl/llmstxt.git

    # Using uv (recommended for development)
    curl -LsSf https://astral.sh/uv/install.sh | sh
    uv pip install .

    # For development
    uv pip install -e ".[dev]"
    # Generate llms.txt from current directory
    python -m llmstxt

    # Or import and use in your code
    from llmstxt import generate_llms_txt
    generate_llms_txt()
The script will:
- Scan the current directory recursively
- Process files according to .gitignore rules
- Generate llms.txt with compressed content
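The scan-and-filter steps can be sketched with the standard library alone. The helper names `read_ignore_patterns`, `is_ignored`, and `iter_files` are hypothetical, and the matching is a deliberate simplification built on `fnmatch`: it ignores negation (`!pattern`), anchored patterns, and nested .gitignore files, so it approximates real .gitignore semantics and makes no claim about how the tool does it.

```python
import fnmatch
import os
from pathlib import Path
from typing import Iterator, List


def read_ignore_patterns(root: Path) -> List[str]:
    """Return non-empty, non-comment lines from root/.gitignore, if it exists."""
    ignore_file = root / ".gitignore"
    if not ignore_file.is_file():
        return []
    patterns = []
    for line in ignore_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Simplified check: match the whole relative path or any single component."""
    parts = rel_path.split("/")
    for pattern in patterns:
        pattern = pattern.rstrip("/")  # treat "build/" like "build"
        if fnmatch.fnmatch(rel_path, pattern):
            return True
        if any(fnmatch.fnmatch(part, pattern) for part in parts):
            return True
    return False


def iter_files(root: Path) -> Iterator[Path]:
    """Walk root recursively, skipping .git and paths matching the patterns."""
    patterns = read_ignore_patterns(root)
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = sorted(d for d in dirnames if d != ".git")
        for name in sorted(filenames):
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()
            if not is_ignored(rel, patterns):
                yield path
```

Calling `list(iter_files(Path(".")))` from a project root yields the files that would be candidates for inclusion under these simplified rules.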
To install the tool as a command for the current user, run this from the repository root:
    pip install --user .
Now you can use the llmstxt command from your terminal.
There are two ways to use this tool with GitHub Actions:
- For Your Own Repository
Create .github/workflows/update-llms.yml with:
    name: Update llms.txt

    on:
      push:
        branches: [main, master]
      pull_request:
        branches: [main, master]
      workflow_dispatch: # Allow manual triggering

    permissions:
      contents: write

    jobs:
      update-llms:
        runs-on: ubuntu-latest
        steps:
          - name: Checkout repository
            uses: actions/checkout@v4

          - name: Set up Python
            uses: actions/setup-python@v4
            with:
              python-version: "3.12"
              cache: "pip"

          - name: Install llmstxt tool
            run: |
              python -m venv .venv
              . .venv/bin/activate
              python -m pip install --upgrade pip
              pip install git+https://github.com/ngmisl/llmstxt.git

          - name: Generate llms.txt
            run: |
              . .venv/bin/activate
              rm -f llms.txt
              python -c "from llmstxt import generate_llms_txt; generate_llms_txt()"

          - name: Configure Git
            run: |
              git config --local user.email "github-actions[bot]@users.noreply.github.com"
              git config --local user.name "github-actions[bot]"

          - name: Commit and push changes
            run: |
              git add llms.txt
              if git diff --staged --quiet; then
                echo "No changes to commit"
              else
                git commit -m "chore: update llms.txt"
                git push
              fi
The workflow will:
- Run on push to main/master
- Run on pull requests
- Allow manual triggering
- Generate and commit llms.txt automatically
- For Remote Repositories
You can trigger the action for any repository using the GitHub API:
    curl -X POST \
      -H "Authorization: token $GITHUB_TOKEN" \
      -H "Accept: application/vnd.github.v3+json" \
      https://api.github.com/repos/ngmisl/llmstxt/dispatches \
      -d '{"event_type": "update-llms", "client_payload": {"repository": "https://github.com/user/repo.git"}}'
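For scripts that already run Python, the same dispatch call can be built with `urllib` from the standard library. This is a sketch that mirrors the curl command field by field. `build_dispatch_request` is a hypothetical helper, and the token value and target repository URL are placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_dispatch_request(token: str, target_repo_url: str) -> urllib.request.Request:
    """Build (but do not send) the repository dispatch request from the curl example."""
    payload = {
        "event_type": "update-llms",
        "client_payload": {"repository": target_repo_url},
    }
    return urllib.request.Request(
        "https://api.github.com/repos/ngmisl/llmstxt/dispatches",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github.v3+json",
        },
        method="POST",
    )


# Placeholder values; in practice the token would come from the environment.
request = build_dispatch_request("YOUR_TOKEN", "https://github.com/user/repo.git")
```

Passing `request` to `urllib.request.urlopen` would send it. Building it separately makes the headers and payload easy to inspect first.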
The generated llms.txt file follows this structure:
    # Project: llmstxt

    ## Project Structure
    This file contains the compressed and processed contents of the project.

    ### File Types
    - .py
    - .js
    - .java
    ...

    <file>src/main.py</file>
    <metadata>
    path: src/main.py
    type: py
    size: 1234 bytes
    </metadata>
    <imports>
    import ast
    from typing import Optional
    </imports>
    <code lang='python'>
    def example():
        """Docstring preserved."""
        return True
    </code>

    <file>src/utils.js</file>
    <metadata>
    path: src/utils.js
    type: js
    size: 567 bytes
    </metadata>
    <code lang='javascript'>
    function helper() {
        return true;
    }
    </code>
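To make the per-file layout concrete, here is a small sketch that assembles one entry in the marker format shown above. `wrap` and `render_entry` are hypothetical helpers written for illustration, and the exact spacing and ordering the tool emits may differ.

```python
from typing import List, Sequence


def wrap(tag: str, body: Sequence[str], attrs: str = "") -> List[str]:
    """Surround body lines with an opening and a closing marker line."""
    return [f"<{tag}{attrs}>", *body, f"</{tag}>"]


def render_entry(
    path: str,
    file_type: str,
    size: int,
    lang: str,
    code: str,
    imports: Sequence[str] = (),
) -> str:
    """Render one file entry: file marker, metadata, optional imports, code."""
    lines = [f"<file>{path}</file>"]
    lines += wrap(
        "metadata",
        [f"path: {path}", f"type: {file_type}", f"size: {size} bytes"],
    )
    if imports:  # the imports section is omitted when there is nothing to list
        lines += wrap("imports", list(imports))
    lines += wrap("code", [code], attrs=f" lang='{lang}'")
    return "\n".join(lines)


print(
    render_entry(
        "src/main.py",
        "py",
        1234,
        "python",
        'def example():\n    """Docstring preserved."""\n    return True',
        imports=["import ast", "from typing import Optional"],
    )
)
```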
The tool can be configured through function parameters:
    generate_llms_txt(
        output_file="llms.txt",       # Output filename
        max_file_size=100 * 1024,     # Max file size (100KB)
        allowed_extensions=(          # Supported file types
            ".py", ".js", ".java", ".c", ".cpp", ".h", ".hpp",
            ".sh", ".txt", ".md", ".json", ".xml", ".yaml", ".yml",
            ".toml", ".ini"
        )
    )
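How `max_file_size` and `allowed_extensions` might act as a filter can be sketched as follows. `should_include` is a hypothetical function, not part of the llmstxt API. Treating the size limit as inclusive and comparing extensions case-insensitively are both assumptions.

```python
from pathlib import Path
from typing import Tuple

# Defaults copied from the configuration call above.
MAX_FILE_SIZE = 100 * 1024
ALLOWED_EXTENSIONS: Tuple[str, ...] = (
    ".py", ".js", ".java", ".c", ".cpp", ".h", ".hpp", ".sh",
    ".txt", ".md", ".json", ".xml", ".yaml", ".yml", ".toml", ".ini",
)


def should_include(
    path: Path,
    max_file_size: int = MAX_FILE_SIZE,
    allowed_extensions: Tuple[str, ...] = ALLOWED_EXTENSIONS,
) -> bool:
    """Hypothetical filter: keep a file only if its type and size are allowed."""
    # Assumption: extensions are compared case-insensitively.
    if path.suffix.lower() not in allowed_extensions:
        return False
    # Assumption: the size limit is inclusive.
    return path.stat().st_size <= max_file_size
```

Under these assumptions, a 2 KB `notes.md` passes the default filter, while a 5 MB `data.json` or any `.png` file is skipped.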
Requirements:
- Python 3.8+
- uv for dependency management (recommended)
    # Clone the repository
    git clone https://github.com/ngmisl/llmstxt.git
    cd llmstxt

    # Install development dependencies
    uv pip install -e ".[dev]"

    # Run type checking
    mypy llmstxt

    # Run linting and formatting
    ruff check llmstxt
    ruff format llmstxt
MIT License - See LICENSE file for details