High-performance FASTQ compression for the sequencing era
fq-compressor is a high-performance FASTQ compression tool that leverages Assembly-based Compression (ABC) and Statistical Context Mixing (SCM) to achieve near-entropy compression ratios while maintaining O(1) random access to compressed data.
Key highlights:
- 🧪 Evidence-first benchmarking with `./scripts/benchmark.sh` for tracked evidence and `./scripts/benchmark_v2.sh` for local comparison runs
- 📊 Generated peer standing for compression ratio, compression speed, and decompression speed
- 🎯 Random access without full decompression
- 🚀 Intel oneTBB parallel pipeline
- 📦 Transparent support for .gz, .bz2, .xz inputs
Linux (x86_64, static binary):
```bash
wget https://github.com/LessUp/fq-compressor/releases/download/v0.2.0/fq-compressor-v0.2.0-linux-x86_64-musl.tar.gz
tar -xzf fq-compressor-v0.2.0-linux-x86_64-musl.tar.gz
sudo mv fq-compressor-v0.2.0-linux-x86_64-musl/fqc /usr/local/bin/
```
macOS (Homebrew):
```bash
# Coming soon
```
Other platforms: See Installation Guide
```bash
git clone https://github.com/LessUp/fq-compressor.git
cd fq-compressor

# Install dependencies via Conan
conan install . --build=missing -of=build/gcc-release \
    -s build_type=Release -s compiler.cppstd=23

# Build
cmake --preset gcc-release
cmake --build --preset gcc-release -j$(nproc)

# Binary: build/gcc-release/src/fqc
```
Requirements: GCC 14+ or Clang 18+, CMake 3.28+, Conan 2.x
```bash
# Compress FASTQ to FQC format
fqc compress -i reads.fastq -o reads.fqc

# Verify archive integrity
fqc verify reads.fqc

# Full decompression
fqc decompress -i reads.fqc -o restored.fastq
```
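The three commands above can be chained into a round-trip check. The following is a minimal sketch, not part of the project: `FQC_BIN` and `fqc_roundtrip` are hypothetical names, and it assumes that an uncompressed FASTQ input restores byte-for-byte, which this README does not state explicitly.

```bash
#!/usr/bin/env bash
# Round-trip sketch built only from the subcommands shown in this README.
# FQC_BIN and fqc_roundtrip are hypothetical names used for illustration.
FQC_BIN="${FQC_BIN:-fqc}"

fqc_roundtrip() {
    local input="$1"
    local workdir
    workdir="$(mktemp -d)" || return 1
    local archive="$workdir/roundtrip.fqc"
    local restored="$workdir/restored.fastq"
    local status=0

    # Compress, verify, decompress, then compare against the original bytes.
    # Assumption: a plain FASTQ input is restored byte-identically.
    "$FQC_BIN" compress -i "$input" -o "$archive" &&
        "$FQC_BIN" verify "$archive" &&
        "$FQC_BIN" decompress -i "$archive" -o "$restored" &&
        cmp -s "$input" "$restored" || status=1

    rm -rf "$workdir"
    return "$status"
}
```

Calling `fqc_roundtrip reads.fastq` returns 0 only if every step succeeds and the restored file matches. For `.gz`, `.bz2`, or `.xz` inputs the byte comparison would need the decompressed original instead.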
```bash
# Random access - extract reads 1000-2000
fqc decompress -i reads.fqc --range 1000:2000 -o subset.fastq

# Multi-threaded compression (8 threads)
fqc compress -i reads.fastq -o reads.fqc -t 8 -v

# Paired-end data
fqc compress -i reads_1.fastq -2 reads_2.fastq \
    -o paired.fqc --paired

# Archive inspection
fqc info reads.fqc
```
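To see what a `--range` extraction actually returned, it can help to count the records in the output. This is a small sketch with a hypothetical helper, `fastq_count`, which assumes the common four-lines-per-record FASTQ layout with no wrapped sequence or quality lines.

```bash
# Count FASTQ records in a file.
# Assumption: each record occupies exactly 4 lines (header, sequence, '+', quality).
# fastq_count is a hypothetical helper, not part of fqc.
fastq_count() {
    awk 'END { print int(NR / 4) }' "$1"
}
```

Running `fastq_count subset.fastq` after the range extraction above shows how many reads came back. This README does not say whether the `--range` bounds are inclusive, so checking the count is safer than assuming a particular value.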
- Measured compression density should be read from generated benchmark reports, with O(1) random access remaining part of the system contract
- Latest tracked benchmark evidence is generated by `./scripts/benchmark.sh`, backed by the canonical `benchmark_v2/` runner and report stack
- Peer standing should be read from generated reports instead of hard-coded README constants
- Archive inspection and verification via `fqc info` and `fqc verify`
- Transparent input handling for `.gz`, `.bz2`, and `.xz` FASTQ inputs
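Because compressed inputs are accepted directly, a directory of mixed `.fastq`, `.fastq.gz`, `.fastq.bz2`, and `.fastq.xz` files can be compressed in one loop. The sketch below is illustrative only: `FQC_BIN`, `fqc_archive_name`, and `fqc_compress_all` are hypothetical names, and handling of the `.fq` extension is an assumption not taken from this README.

```bash
# Batch-compression sketch using only `fqc compress -i ... -o ...` from this README.
# All function and variable names here are hypothetical.
FQC_BIN="${FQC_BIN:-fqc}"

# Derive an archive name: drop the directory, then one compression suffix
# (.gz, .bz2, .xz), then .fastq or .fq (the .fq case is an assumption).
fqc_archive_name() {
    local name="${1##*/}"
    case "$name" in
        *.gz)  name="${name%.gz}" ;;
        *.bz2) name="${name%.bz2}" ;;
        *.xz)  name="${name%.xz}" ;;
    esac
    case "$name" in
        *.fastq) name="${name%.fastq}" ;;
        *.fq)    name="${name%.fq}" ;;
    esac
    printf '%s.fqc\n' "$name"
}

# Compress every listed input into outdir, stopping at the first failure.
fqc_compress_all() {
    local outdir="$1"
    shift
    local input
    for input in "$@"; do
        "$FQC_BIN" compress -i "$input" \
            -o "$outdir/$(fqc_archive_name "$input")" || return 1
    done
}
```

A call such as `fqc_compress_all archives raw/*.fastq.gz` would write one `.fqc` file per input. Inputs that differ only by compression suffix (for example `reads.fastq` and `reads.fastq.gz`) map to the same archive name, so they should not be mixed in one call.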
For deeper benchmark data, algorithm notes, and file-format details, use the maintained docs rather than this repository entry page.
| Surface | Role |
|---|---|
| 📖 GitHub Pages | Public landing page and EN/ZH entry paths |
| 🚀 English docs | Whitepaper, academy, architecture, evidence |
| 简体中文文档 (Simplified Chinese docs) | Whitepaper, academy, architecture notes, evidence chain |
| 📦 Releases | Prebuilt binaries |
| 🤝 Contributing Guide | Closeout-oriented development workflow |
fq-compressor is in closeout mode. Simple development workflow:
```bash
./scripts/build.sh clang-debug
./scripts/lint.sh format-check
./scripts/test.sh clang-debug
```
Contributors should use the single acceptance runner:
./scripts/acceptance.sh
Release-check command surface (kept in sync with the acceptance runner):
```bash
./scripts/lint.sh format-check
./scripts/test.sh clang-debug
bash tests/e2e/cli_smoke_test.sh
bash tests/e2e/benchmark_v2_smoke_test.sh
bash tests/e2e/devcontainer_validate_test.sh
bash tests/e2e/devcontainer_host_sync_test.sh
bash tests/e2e/devcontainer_sshd_lib_test.sh
bash tests/e2e/devcontainer_adapter_contract_test.sh
(cd docs && npm ci && npm run build)
bash scripts/devcontainer-validate.sh
```
Generate reproducible tracked benchmark evidence with:
```bash
./scripts/benchmark.sh \
    --dataset err091571-local-supported \
    --build \
    --tools fqc,gzip,xz,bzip2,spring \
    --threads 1 \
    --runs 1
```
Use ./scripts/benchmark_v2.sh for local comparison runs and smoke-scale exploratory workloads.
See AGENTS.md for full project rules and architecture.
Focused contributions are welcome, especially for:
- documentation cleanup and ownership tightening
- evidence-driven bug fixes with regression coverage
- workflow and tooling simplification
- archive-readiness polish
See the Contributing Guide for the repository workflow.
- Project Code: MIT License — see LICENSE
- vendor/spring-core/: Spring's original research license (not MIT)
- Spring (Chandak et al., 2019) — ABC algorithm inspiration
- fqzcomp5 (Bonfield) — Quality compression reference
- Intel oneTBB — Parallel computing framework
- Contributors — Everyone who has helped improve this project