Command line tools for improved ad-hoc text file data processing

biogrep


A collection of command-line tools for text file processing, particularly for bioinformatics text files (e.g. BED or GFF format).

These tools are (so far) most helpful for ad-hoc prototyping and data exploration. For permanent code, tools such as bedtools, pybedtools or gffutils are recommended instead.

Requires Python; tested with Python 3.7 and Python 3.9.

Overview

The command tablegrep.py provides generic, versatile processing of text files that are assumed to contain rows of comma-separated values. The operations on these files are akin to simple database operations. Use it for basic operations on files in BEDGRAPH, BED or GFF3 format.
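
To illustrate what "akin to simple database operations" means, the following minimal sketch applies a row filter and a column selection to BED-like rows in plain Python. It does not use or reproduce the tablegrep.py interface; the function name select_rows, the sample data and the tab delimiter are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical BED-like sample (chrom, start, end, name); tab-delimited by assumption.
SAMPLE = "chr1\t100\t200\tgeneA\nchr1\t300\t450\tgeneB\nchr2\t50\t80\tgeneC\n"


def select_rows(handle, predicate, columns, delimiter="\t"):
    """Yield the chosen columns of every row for which predicate(row) is true."""
    for row in csv.reader(handle, delimiter=delimiter):
        if predicate(row):
            yield [row[i] for i in columns]


# Roughly: SELECT name, start FROM rows WHERE chrom = 'chr1' AND end - start > 100
result = list(
    select_rows(
        io.StringIO(SAMPLE),
        predicate=lambda r: r[0] == "chr1" and int(r[2]) - int(r[1]) > 100,
        columns=[3, 1],
    )
)
print(result)
```

The filter and the column selection are passed in as arguments, so the same loop covers many ad-hoc queries; for the actual options of tablegrep.py, consult the command reference.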

The commands extract_gff3_attributes.py and gff_schema_dump.py provide GFF-specific functions.

The full command reference is found in biogrep_reference.md.

Command line examples are found in biogrep_example_usage.md.

The implementation limits memory (RAM) usage, acknowledging that big files can be involved. The code is reasonably optimized for speed, but there is surely room for further optimization. Contributions (e.g. via code diffs or pull requests) are always welcome.
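
As a general illustration of memory-limited processing (not the actual biogrep code), the sketch below reads a BED-like input one line at a time and keeps only a small per-chromosome accumulator in memory. The function name summed_length_per_chrom and the sample data are hypothetical.

```python
import io
from collections import defaultdict


def summed_length_per_chrom(handle):
    """Sum (end - start) per chromosome while reading one line at a time."""
    totals = defaultdict(int)
    for line in handle:  # file objects are iterated lazily, line by line
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        totals[chrom] += int(end) - int(start)
    return dict(totals)


# Hypothetical sample; an open file handle would be used the same way.
sample = io.StringIO("chr1\t100\t200\nchr1\t300\t450\nchr2\t50\t80\n")
totals = summed_length_per_chrom(sample)
print(totals)
```

Memory use here grows with the number of distinct chromosomes rather than with the file size, which is the property that matters for large inputs.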

Installation

Download the latest wheel (see the releases directory) and install it with pip: pip install biogrep-0.x.y.z-py3-none-any.whl, where 0.x.y.z stands for the version of the downloaded wheel. This makes the scripts available for direct command-line use.

Changelog

See Changelog.md.

License

Files are individually licensed; see the header of each file. The following applies to all files: the origin must not be misrepresented; logos cannot be used in derived work (for example, in forks); distributions of modified versions of the software must be marked as such.