Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
/ llmstxt Public

A Python tool for compressing and organizing code files into a single, LLM-friendly text file. This tool is designed to help prepare codebases for analysis by Large Language Models by removing unnecessary content while preserving important semantic information.

License

Notifications You must be signed in to change notification settings

ngmisl/llmstxt

Repository files navigation

CodeQL Advanced Update llms.txt

llmstxt

A Python tool for compressing and organizing code files into a single, LLM-friendly text file. This tool is designed to help prepare codebases for analysis by Large Language Models by removing unnecessary content while preserving important semantic information.

Features

Smart Code Compression

  • Preserves docstrings and important comments
  • Removes redundant whitespace and formatting
  • Maintains code structure and readability
  • Handles multiple programming languages

Language Support

  • Python (with AST-based compression)
  • JavaScript
  • Java
  • C/C++
  • Shell scripts
  • HTML/CSS
  • Configuration files (JSON, YAML, TOML, INI)
  • Markdown

LLM-Friendly Output

  • XML-style semantic markers
  • File metadata and type information
  • Organized imports section
  • Clear file boundaries
  • Consistent formatting

Automation

  • GitHub Actions integration
  • Automatic updates on code changes
  • CI/CD friendly

Installation

This project uses uv for dependency management, but can also be installed directly with pip.

# Using pip
pip install git+https://github.com/ngmisl/llmstxt.git
# Using uv (recommended for development)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv pip install .
# For development
uv pip install -e ".[dev]"

Usage

Local Usage

# Generate llms.txt from current directory
python -m llmstxt
# Or import and use in your code
from llmstxt import generate_llms_txt
generate_llms_txt()

The script will:

  1. Scan the current directory recursively
  2. Process files according to .gitignore rules
  3. Generate llms.txt with compressed content

Install Locally

pip install --user .

Now you can use the llmstxt command from your terminal.

GitHub Actions Integration

There are two ways to use this tool with GitHub Actions:

  1. For Your Own Repository

Create .github/workflows/update-llms.yml with:

name: Update llms.txt
on:
 push:
 branches: [main, master]
 pull_request:
 branches: [main, master]
 workflow_dispatch: # Allow manual triggering
permissions:
 contents: write
jobs:
 update-llms:
 runs-on: ubuntu-latest
 steps:
 - name: Checkout repository
 uses: actions/checkout@v4
 - name: Set up Python
 uses: actions/setup-python@v4
 with:
 python-version: "3.12"
 cache: "pip"
 - name: Install llmstxt tool
 run: |
 python -m venv .venv
 . .venv/bin/activate
 python -m pip install --upgrade pip
 pip install git+https://github.com/ngmisl/llmstxt.git

 - name: Generate llms.txt
 run: |
 . .venv/bin/activate
 rm -f llms.txt
 python -c "from llmstxt import generate_llms_txt; generate_llms_txt()"

 - name: Configure Git
 run: |
 git config --local user.email "github-actions[bot]@users.noreply.github.com"
 git config --local user.name "github-actions[bot]"

 - name: Commit and push changes
 run: |
 git add llms.txt
 if git diff --staged --quiet; then
 echo "No changes to commit"
 else
 git commit -m "chore: update llms.txt"
 git push
 fi

The workflow will:

  • Run on push to main/master
  • Run on pull requests
  • Can be triggered manually
  • Generate and commit llms.txt automatically
  1. For Remote Repositories You can trigger the action for any repository using the GitHub API:
curl -X POST \
 -H "Authorization: token $GITHUB_TOKEN" \
 -H "Accept: application/vnd.github.v3+json" \
 https://api.github.com/repos/ngmisl/llmstxt/dispatches \
 -d '{"event_type": "update-llms", "client_payload": {"repository": "https://github.com/user/repo.git"}}'

Output Format

The generated llms.txt file follows this structure:

# Project: llmstxt
## Project Structure
This file contains the compressed and processed contents of the project.
### File Types
- .py
- .js
- .java
...
<file>src/main.py</file>
<metadata>
path: src/main.py
type: py
size: 1234 bytes
</metadata>
<imports>
import ast
from typing import Optional
</imports>
<code lang='python'>
def example():
 """Docstring preserved."""
 return True
</code>
<file>src/utils.js</file>
<metadata>
path: src/utils.js
type: js
size: 567 bytes
</metadata>
<code lang='javascript'>
function helper() {
 return true;
}
</code>

Configuration

The tool can be configured through function parameters:

generate_llms_txt(
 output_file="llms.txt", # Output filename
 max_file_size=100 * 1024, # Max file size (100KB)
 allowed_extensions=( # Supported file types
 ".py", ".js", ".java",
 ".c", ".cpp", ".h", ".hpp",
 ".sh", ".txt", ".md",
 ".json", ".xml", ".yaml",
 ".yml", ".toml", ".ini"
 )
)

Development

Requirements:

  • Python 3.8+
  • uv for dependency management (recommended)
# Clone the repository
git clone https://github.com/ngmisl/llmstxt.git
cd llmstxt
# Install development dependencies
uv pip install -e ".[dev]"
# Run type checking
mypy llmstxt
# Run linting and formatting
ruff check llmstxt
ruff format llmstxt

License

MIT License - See LICENSE file for details

About

A Python tool for compressing and organizing code files into a single, LLM-friendly text file. This tool is designed to help prepare codebases for analysis by Large Language Models by removing unnecessary content while preserving important semantic information.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

Languages

AltStyle によって変換されたページ (->オリジナル) /