API Reference

shijiashuai edited this page Mar 9, 2026 · 1 revision

FastQTools provides a clear public C++ API and can be integrated into other projects as a library.


Entry header

#include <fqtools/fq.h>

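fq.h aggregates all public interfaces:

| Module | Header | Namespace | Description |
| --- | --- | --- | --- |
| I/O | <fqtools/io/...> | fq::io | FastqReader, FastqWriter, FastqRecord, FastqBatch |
| Processing | <fqtools/processing/...> | fq::processing | Pipeline, Predicate, Mutator |
| Statistics | <fqtools/statistics/...> | fq::statistic | Statistics calculation interface |
| Core | <fqtools/core/core.h> | fq::core | Sequence utility functions |
| Configuration | <fqtools/config/config.h> | fq::config | Configuration management |
| Errors | <fqtools/error/error.h> | fq::error | Exception handling framework |
| Logging | <fqtools/logging.h> | fq::logging | Logger initialization and level control |
| Common | <fqtools/common/common.h> | fq::common | Timer, IDGenerator, and similar helpers |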

Module relationships

fq.h (aggregate entry point)
 ├── io/ → FastqReader / FastqWriter / FastqRecord / FastqBatch
 ├── processing/ → ProcessingPipelineInterface / Predicate / Mutator
 ├── statistics/ → StatisticCalculatorInterface
 ├── core/ → SequenceUtils
 ├── config/ → Configuration
 ├── error/ → FastQException exception hierarchy
 ├── logging/ → init / setLevel
 └── common/ → Timer / IDGenerator

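I/O module: fq::io

FastqRecord

A zero-copy view of a FASTQ record. Each field is a std::string_view into the contiguous memory owned by a FastqBatch, so a record is only valid while that batch's buffer is alive and unchanged.

Fields:

| Field | Type | Description |
| --- | --- | --- |
| id | std::string_view | Record identifier |
| sequence | std::string_view | DNA sequence |
| quality | std::string_view | Quality score string |
| separator | std::string_view | Separator line (usually +) |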

Methods:

auto averageQuality(int qualityEncoding = 33) const -> double;
auto length() const -> size_t;
auto gcContent() const -> double;
auto nRatio() const -> double;
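
To make the meaning of these metrics concrete, here is a minimal standalone sketch of how they could be computed from string views. The free functions gcContentOf, nRatioOf, and averageQualityOf are hypothetical stand-ins, not FastQTools code; in particular, dividing by the full sequence length and treating lowercase bases like uppercase are assumptions.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins for the FastqRecord helpers; not FastQTools code.

// Fraction of G/C bases over the whole sequence; 0.0 for an empty sequence.
inline double gcContentOf(std::string_view seq) {
    if (seq.empty()) return 0.0;
    std::size_t gc = 0;
    for (char c : seq) {
        if (c == 'G' || c == 'C' || c == 'g' || c == 'c') ++gc;
    }
    return static_cast<double>(gc) / static_cast<double>(seq.size());
}

// Fraction of ambiguous N bases; 0.0 for an empty sequence.
inline double nRatioOf(std::string_view seq) {
    if (seq.empty()) return 0.0;
    std::size_t n = 0;
    for (char c : seq) {
        if (c == 'N' || c == 'n') ++n;
    }
    return static_cast<double>(n) / static_cast<double>(seq.size());
}

// Mean Phred score: each quality character minus the encoding offset
// (33 for the common Phred+33 encoding).
inline double averageQualityOf(std::string_view quality, int qualityEncoding = 33) {
    if (quality.empty()) return 0.0;
    long long sum = 0;
    for (char c : quality) {
        sum += static_cast<int>(c) - qualityEncoding;
    }
    return static_cast<double>(sum) / static_cast<double>(quality.size());
}
```

With Phred+33, the character 'I' (ASCII 73) decodes to a score of 40, so a quality string made only of 'I' averages 40.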

FastqBatch

A container that stores multiple FASTQ records as a batch and maintains a contiguous memory buffer for them.

auto records() const -> const std::vector<FastqRecord>&;
auto size() const -> size_t;
auto empty() const -> bool;
void clear();
void reserve(size_t count);

Memory model:

FastqBatch
├── buffer_ contiguous memory block (stores the raw text)
└── records_ array of FastqRecord (string_view pointing into buffer_)
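
The layout above can be illustrated with a small standalone sketch. MiniBatch and MiniRecord are hypothetical types written for this page, not the FastQTools implementation. The sketch stores offsets while appending and only builds the views afterwards, because growing the buffer can reallocate it and invalidate earlier string_views.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical miniature of the layout above; not the FastQTools implementation.
struct MiniRecord {
    std::string_view id;
    std::string_view sequence;
};

class MiniBatch {
public:
    // Append raw text to the buffer and remember where each field lives.
    void add(std::string_view id, std::string_view sequence) {
        spans_.push_back({buffer_.size(), id.size(), sequence.size()});
        buffer_.append(id);
        buffer_.append(sequence);
    }

    // Build the views only after all appends are done: growing buffer_
    // may reallocate and invalidate string_views created earlier.
    std::vector<MiniRecord> records() const {
        std::vector<MiniRecord> out;
        out.reserve(spans_.size());
        const std::string_view all(buffer_);
        for (const Span& s : spans_) {
            out.push_back({all.substr(s.offset, s.idLen),
                           all.substr(s.offset + s.idLen, s.seqLen)});
        }
        return out;
    }

    const std::string& buffer() const { return buffer_; }

private:
    struct Span {
        std::size_t offset;
        std::size_t idLen;
        std::size_t seqLen;
    };
    std::string buffer_;
    std::vector<Span> spans_;
};
```

The returned views point directly into the batch's buffer, so no field text is copied; the views must not outlive the batch or survive a later add().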

FastqReader

fq::io::FastqReader reader("input.fastq.gz");
fq::io::FastqBatch batch;
while (reader.nextBatch(batch, 10000)) {
    for (const auto& record : batch.records()) {
        // Process each record
    }
}

Performance parameters: readChunkBytes, zlibBufferBytes, maxBufferBytes

FastqWriter

fq::io::FastqWriter writer("output.fastq.gz");
for (const auto& record : batch.records()) {
 writer.write(record);
}

Performance parameters: zlibBufferBytes, outputBufferBytes

FastqBatchPool

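Uses the object pool pattern to reduce frequent allocations inside the TBB pipeline:

auto pool = fq::io::createFastqBatchPool(initialSize, maxSize);
auto batch = pool->acquire(); // Take a batch from the pool
// The batch is returned to the pool automatically when the shared_ptr is destroyed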
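
One common way to get this "return on destruction" behavior is a shared_ptr with a custom deleter. The sketch below shows that general technique with a hypothetical SimplePool class; it is an assumption about how such a pool can work, not the actual FastqBatchPool, and it leaves out the maxSize cap and any resetting of returned objects.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical pool illustrating acquire + automatic return; the real
// FastqBatchPool may differ in every detail.
template <typename T>
class SimplePool : public std::enable_shared_from_this<SimplePool<T>> {
public:
    static std::shared_ptr<SimplePool> create(std::size_t initialSize) {
        std::shared_ptr<SimplePool> pool(new SimplePool());
        for (std::size_t i = 0; i < initialSize; ++i) {
            pool->free_.push_back(std::make_unique<T>());
        }
        return pool;
    }

    // Hand out an object whose deleter puts it back into the pool.
    std::shared_ptr<T> acquire() {
        std::unique_ptr<T> obj;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!free_.empty()) {
                obj = std::move(free_.back());
                free_.pop_back();
            }
        }
        if (!obj) obj = std::make_unique<T>();  // pool exhausted: allocate

        std::weak_ptr<SimplePool> weak = this->weak_from_this();
        return std::shared_ptr<T>(obj.release(), [weak](T* p) {
            std::unique_ptr<T> owned(p);
            if (auto pool = weak.lock()) {
                std::lock_guard<std::mutex> lock(pool->mutex_);
                pool->free_.push_back(std::move(owned));
            }
            // If the pool is already gone, `owned` frees the object here.
        });
    }

    // Number of idle objects currently held by the pool.
    std::size_t available() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    SimplePool() = default;
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<T>> free_;
};
```

Holding only a weak_ptr to the pool inside the deleter means an outstanding object never keeps the pool alive and is simply freed if the pool has already been destroyed.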

Processing pipeline: fq::processing

ProcessingPipelineInterface

Created through a factory function:

auto pipeline = fq::processing::createProcessingPipeline();

Interface methods:

class ProcessingPipelineInterface {
public:
 virtual void setInputPath(const std::string& path) = 0;
 virtual void setOutputPath(const std::string& path) = 0;
 virtual void setProcessingConfig(const ProcessingConfig& config) = 0;
 virtual void addReadPredicate(std::unique_ptr<ReadPredicateInterface> predicate) = 0;
 virtual void addReadMutator(std::unique_ptr<ReadMutatorInterface> mutator) = 0;
 virtual auto run() -> ProcessingStats = 0;
};

ProcessingConfig

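| Parameter | Type | Description |
| --- | --- | --- |
| batchSize | size_t | Number of reads per batch |
| threadCount | size_t | Number of parallel threads |
| readChunkBytes | size_t | Read chunk size |
| zlibBufferBytes | size_t | zlib buffer size |
| writerBufferBytes | size_t | Writer buffer size |
| batchCapacityBytes | size_t | Memory limit per batch |
| memoryLimitBytes | size_t | Total memory limit |
| maxInFlightBatches | size_t | Number of batches in flight concurrently |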

ProcessingStats

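| Field | Type | Description |
| --- | --- | --- |
| totalReads | uint64_t | Total number of input reads |
| passedReads | uint64_t | Number of reads that passed the filters |
| filteredReads | uint64_t | Number of reads filtered out |
| errorReads | uint64_t | Number of reads with errors |
| inputBytes | uint64_t | Input size in bytes |
| outputBytes | uint64_t | Output size in bytes |
| elapsedMs | uint64_t | Total elapsed time (milliseconds) |
| throughputMbps | double | Throughput (MB/s) |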

ReadPredicateInterface: filter predicates

class ReadPredicateInterface {
public:
 virtual auto evaluate(const fq::io::FastqRecord& read) const -> bool = 0;
};

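Built-in implementations:

| Class | Description |
| --- | --- |
| MinQualityPredicate | Filters by minimum average quality |
| MinLengthPredicate | Filters by minimum read length |
| MaxLengthPredicate | Filters by maximum read length |
| MaxNRatioPredicate | Filters by maximum ratio of N bases |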
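
A custom predicate follows the same shape as the built-in ones. The sketch below uses stand-in types (SketchRecord, SketchPredicate) that mirror the documented interface so it compiles on its own; in real code the base class would be ReadPredicateInterface from the library. GcRangePredicate and passesAll are hypothetical, and the assumption that a read must satisfy every registered predicate (AND semantics) is not confirmed by this page.

```cpp
#include <cstddef>
#include <memory>
#include <string_view>
#include <vector>

// Stand-ins mirroring the documented shapes; not the library types.
struct SketchRecord {
    std::string_view sequence;
};

class SketchPredicate {
public:
    virtual ~SketchPredicate() = default;
    virtual bool evaluate(const SketchRecord& read) const = 0;
};

// Hypothetical custom predicate: keep reads whose GC fraction is in [lo, hi].
class GcRangePredicate : public SketchPredicate {
public:
    GcRangePredicate(double lo, double hi) : lo_(lo), hi_(hi) {}

    bool evaluate(const SketchRecord& read) const override {
        if (read.sequence.empty()) return false;
        std::size_t gc = 0;
        for (char c : read.sequence) {
            if (c == 'G' || c == 'C') ++gc;
        }
        const double frac =
            static_cast<double>(gc) / static_cast<double>(read.sequence.size());
        return frac >= lo_ && frac <= hi_;
    }

private:
    double lo_;
    double hi_;
};

// Assumed AND semantics: a read passes only if every predicate accepts it.
inline bool passesAll(const std::vector<std::unique_ptr<SketchPredicate>>& preds,
                      const SketchRecord& read) {
    for (const auto& p : preds) {
        if (!p->evaluate(read)) return false;
    }
    return true;
}
```

Because evaluate is const and takes the record by const reference, a predicate written this way holds no per-read state and can be shared across worker threads.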

ReadMutatorInterface: read mutators

class ReadMutatorInterface {
public:
 virtual void process(fq::io::FastqRecord& read) = 0;
};

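Built-in implementations:

| Class | Description |
| --- | --- |
| QualityTrimmer | Quality trimming (modes: Both / FivePrime / ThreePrime) |
| LengthTrimmer | Length trimming (modes: FixedLength / MaxLength / FromStart / FromEnd) |
| AdapterTrimmer | Adapter trimming |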
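
Because record fields are string views, a trimming mutator can work by shrinking the views instead of copying bases. The sketch below shows a simple 3'-end quality trim on a hypothetical ReadView stand-in. It illustrates the general idea only and is not the algorithm used by QualityTrimmer; it also assumes the sequence and quality strings have equal length.

```cpp
#include <cstddef>
#include <string_view>

// Stand-in for the record view; not the library type.
struct ReadView {
    std::string_view sequence;
    std::string_view quality;
};

// Hypothetical 3'-end trim: drop trailing bases whose Phred score is below
// minQuality. Only the views shrink; the underlying text is untouched.
inline void trimThreePrime(ReadView& read, int minQuality, int qualityEncoding = 33) {
    std::size_t keep = read.quality.size();
    while (keep > 0 &&
           (static_cast<int>(read.quality[keep - 1]) - qualityEncoding) < minQuality) {
        --keep;
    }
    read.sequence = read.sequence.substr(0, keep);
    read.quality = read.quality.substr(0, keep);
}
```

For example, with Phred+33 the character '#' decodes to a score of 2, so a read ending in two '#' qualities loses its last two bases when trimmed at a threshold of 20.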

Statistics: fq::statistic

StatisticCalculatorInterface

fq::statistic::StatisticOptions options;
options.inputFastqPath = "input.fastq.gz";
options.outputStatPath = "output.stat.txt";
options.threadCount = 4;
auto calculator = fq::statistic::createStatisticCalculator(options);
calculator->run();

StatisticOptions

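| Field | Type | Description |
| --- | --- | --- |
| inputFastqPath | std::string | Path of the input FASTQ file |
| outputStatPath | std::string | Path of the output statistics file |
| threadCount | size_t | Number of threads |
| batchSize | size_t | Batch size |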

FqStatisticResult

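| Field | Type | Description |
| --- | --- | --- |
| readCount | uint64_t | Total number of reads |
| totalBases | uint64_t | Total number of bases |
| maxReadLength | uint32_t | Maximum read length |
| posQualityDist | vector<vector<uint64_t>> | Per-position quality distribution |
| posBaseDist | vector<vector<uint64_t>> | Per-position base distribution |

operator+= is supported for merging the statistics results of multiple batches.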
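
Merging per-batch results typically means summing the counters, taking the maximum of the read lengths, and adding the per-position tables element by element. The sketch below shows that on a hypothetical MiniStatResult with a subset of the fields above; how FqStatisticResult actually implements operator+= is not shown on this page.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reduced version of the result type; not the library class.
struct MiniStatResult {
    std::uint64_t readCount = 0;
    std::uint64_t totalBases = 0;
    std::uint32_t maxReadLength = 0;
    std::vector<std::vector<std::uint64_t>> posBaseDist;  // [position][base index]

    MiniStatResult& operator+=(const MiniStatResult& other) {
        readCount += other.readCount;
        totalBases += other.totalBases;
        maxReadLength = std::max(maxReadLength, other.maxReadLength);

        // Grow to cover the longer table, then add counts element-wise.
        if (posBaseDist.size() < other.posBaseDist.size()) {
            posBaseDist.resize(other.posBaseDist.size());
        }
        for (std::size_t pos = 0; pos < other.posBaseDist.size(); ++pos) {
            std::vector<std::uint64_t>& dst = posBaseDist[pos];
            const std::vector<std::uint64_t>& src = other.posBaseDist[pos];
            if (dst.size() < src.size()) dst.resize(src.size(), 0);
            for (std::size_t i = 0; i < src.size(); ++i) {
                dst[i] += src[i];
            }
        }
        return *this;
    }
};
```

Since every step is an addition or a maximum, the merge is order-independent, which is what lets worker threads accumulate partial results and combine them at the end.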


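Core utilities: fq::core

SequenceUtils

Utility class for DNA/RNA sequence processing. Template parameters are constrained with concepts (std::ranges::range, available since C++20):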

namespace fq::core {
class SequenceUtils {
public:
 template <std::ranges::range R>
 static auto gcContent(const R& sequence) -> double;
 template <std::ranges::range R>
 static auto nRatio(const R& sequence) -> double;
 static auto reverseComplement(std::string_view sequence) -> std::string;
 static auto isValidBase(char base) -> bool;
};
}
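
As an illustration of what reverseComplement computes, here is a minimal standalone sketch. complementOf and reverseComplementOf are hypothetical names, and mapping every unrecognized character to 'N' is an assumption; the library's handling of ambiguous or invalid bases is not documented on this page.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: complement of a single DNA base.
// Anything that is not A/C/G/T (in either case) maps to 'N' (assumption).
inline char complementOf(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'a': return 't';
        case 't': return 'a';
        case 'c': return 'g';
        case 'g': return 'c';
        default:  return 'N';
    }
}

// Walk the input back to front and complement each base.
inline std::string reverseComplementOf(std::string_view sequence) {
    std::string out;
    out.reserve(sequence.size());
    for (auto it = sequence.rbegin(); it != sequence.rend(); ++it) {
        out.push_back(complementOf(*it));
    }
    return out;
}
```

For the input AAGC, reading from the back gives C, G, A, A, whose complements are G, C, T, T, so the result is GCTT.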

Configuration management: fq::config

namespace fq::config {
class Configuration {
public:
 void loadFromFile(const std::string& configFile);
 void loadFromArgs(int argc, const char* argv[]);
 void loadFromEnv();
 template <typename T> auto get(const std::string& key) const -> T;
 template <typename T> auto getOr(const std::string& key, const T& def) const -> T;
 template <typename T> void set(const std::string& key, const T& value);
 auto hasKey(const std::string& key) const -> bool;
 void validate() const;
};
}

Configuration precedence, from lowest to highest: defaults → configuration file → environment variables → command-line arguments
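
A layered lookup like this is often implemented by merging the sources in order, so that a key set by a later source replaces the same key from an earlier one. The sketch below shows that idea with plain string maps; Layer and mergeLayers are hypothetical, and this is an assumption about the behavior rather than the Configuration class's actual implementation.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical representation of one configuration source.
using Layer = std::map<std::string, std::string>;

// Merge layers from lowest to highest priority: a key in a later layer
// replaces the same key from any earlier layer.
inline Layer mergeLayers(const std::vector<Layer>& layersLowToHigh) {
    Layer merged;
    for (const Layer& layer : layersLowToHigh) {
        for (const auto& [key, value] : layer) {
            merged[key] = value;
        }
    }
    return merged;
}
```

With defaults {threads=1, level=info}, a file that sets threads=4, an environment that sets level=debug, and a command line that sets threads=8, the merged result is threads=8 and level=debug.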


Error handling: fq::error

Exception hierarchy

FastQException
├── IOError: file I/O errors
├── FormatError: FASTQ format errors
├── ConfigurationError: configuration errors
└── ValidationError: validation errors
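
A hierarchy like this lets callers handle a specific failure first and fall back to the base type for everything else. The sketch below reproduces the pattern with hypothetical Sketch* classes derived from std::runtime_error; whether FastQException itself derives from std::exception, and what its constructors look like, is not stated on this page.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the hierarchy; not the library classes.
class SketchException : public std::runtime_error {
public:
    explicit SketchException(const std::string& msg) : std::runtime_error(msg) {}
};

class SketchIOError : public SketchException {
public:
    explicit SketchIOError(const std::string& msg) : SketchException(msg) {}
};

class SketchFormatError : public SketchException {
public:
    explicit SketchFormatError(const std::string& msg) : SketchException(msg) {}
};

// Catch the most specific type first, then the common base.
inline std::string classify(int which) {
    try {
        if (which == 0) throw SketchIOError("cannot open file");
        if (which == 1) throw SketchFormatError("missing '+' separator line");
        return "ok";
    } catch (const SketchIOError& e) {
        return std::string("io: ") + e.what();
    } catch (const SketchException& e) {
        return std::string("other: ") + e.what();
    }
}
```

The order of the catch clauses matters: placing the base-class handler first would swallow the I/O error before the specific handler could see it.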

ErrorCategory / ErrorSeverity

enum class ErrorCategory { IO, Format, Validation, Processing, Resource, Configuration };
enum class ErrorSeverity { Info, Warning, Error, Critical };

Convenience macros

FQ_THROW_CONFIG_ERROR("Required key 'input' is missing");
FQ_THROW_IO_ERROR("Failed to open file: " + path);

Logging: fq::logging

fq::logging::LogOptions options;
options.level = "info"; // trace/debug/info/warn/error
options.colored = true;
fq::logging::init(options);
fq::logging::info("Processing {} reads", readCount);
fq::logging::warn("Quality below threshold: {}", quality);
fq::logging::setLevel("debug");

CMake integration

find_package(FastQTools REQUIRED)
target_link_libraries(my_app PRIVATE FastQTools::FastQTools)

Complete example

#include <fqtools/fq.h>

#include <iostream>
#include <memory>  // std::make_unique

int main() {
    // Create the processing pipeline
    auto pipeline = fq::processing::createProcessingPipeline();
    pipeline->setInputPath("input.fastq");
    pipeline->setOutputPath("output.fastq");

    // Configure batching and threading
    fq::processing::ProcessingConfig config;
    config.batchSize = 10000;
    config.threadCount = 4;
    pipeline->setProcessingConfig(config);

    // Add filter predicates
    pipeline->addReadPredicate(
        std::make_unique<fq::processing::MinQualityPredicate>(20.0, 33));
    pipeline->addReadPredicate(
        std::make_unique<fq::processing::MinLengthPredicate>(50));

    // Add a trimmer
    pipeline->addReadMutator(
        std::make_unique<fq::processing::QualityTrimmer>(
            20.0, 50, fq::processing::QualityTrimmer::TrimMode::Both, 33));

    // Run the pipeline and print the statistics
    auto stats = pipeline->run();
    std::cout << stats.toString() << std::endl;
    return 0;
}
