A collection of command-line tools for processing text files, particularly bioinformatics text formats such as BED or GFF.
These tools are (so far) most helpful for ad-hoc prototyping and data exploration. For permanent code, established tools such as bedtools, pybedtools, or gffutils are recommended instead.
Requires Python; tested with Python 3.7 and 3.9.
Overview
The command tablegrep.py provides generic and versatile processing of text files, which
are assumed to contain rows of comma-separated values. Operations on these files are akin to simple
database operations. Use it for basic operations on files in BEDGRAPH, BED or GFF3 format.
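To illustrate what a "simple database operation" on a delimited text file can look like, here is a minimal Python sketch of a select-with-condition over BED-like rows. It is not the implementation or the command-line interface of tablegrep.py; the function name `select_where`, its parameters, and the sample data are assumptions made up for this example. The delimiter is a parameter and defaults to a tab because BED files are tab-delimited.

```python
import csv
import io

def select_where(lines, columns, predicate, delimiter="\t"):
    """Yield the chosen columns of every row for which predicate(row) is true.

    Hypothetical helper for illustration only; not part of biogrep.
    """
    for row in csv.reader(lines, delimiter=delimiter):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if predicate(row):
            yield [row[i] for i in columns]

# Made-up BED-like sample: chrom, start, end, name
bed = io.StringIO(
    "chr1\t100\t200\tfeatA\n"
    "chr2\t150\t400\tfeatB\n"
    "chr1\t300\t900\tfeatC\n"
)

# Roughly: SELECT name, start FROM bed WHERE chrom = 'chr1' AND end - start > 200
hits = list(
    select_where(
        bed,
        columns=[3, 1],
        predicate=lambda r: r[0] == "chr1" and int(r[2]) - int(r[1]) > 200,
    )
)
print(hits)
```

Because `select_where` is a generator, rows are produced one at a time; only the final `list(...)` call collects the matches in memory.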
The commands extract_gff3_attributes.py and gff_schema_dump.py provide GFF-specific
functions.
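For readers unfamiliar with GFF3, the ninth column holds attributes as semicolon-separated `tag=value` pairs, where a value may contain several comma-separated entries and reserved characters are percent-encoded. The following sketch shows one way to parse that column in Python. It does not describe how extract_gff3_attributes.py works internally; the function name `parse_gff3_attributes` and the sample line are assumptions for illustration.

```python
from urllib.parse import unquote

def parse_gff3_attributes(field):
    """Turn a GFF3 column-9 string into a dict of tag -> list of values.

    Hypothetical helper for illustration only; not part of biogrep.
    """
    attrs = {}
    if field in ("", "."):
        return attrs  # "." marks an empty attribute column
    for pair in field.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # Split on commas first, then decode percent-escapes in each value
        attrs[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attrs

# Made-up GFF3 record with nine tab-separated columns
line = "chr1\tsrc\tgene\t1000\t2000\t.\t+\t.\tID=gene1;Name=abc%3B1;Alias=x,y"
columns = line.split("\t")
attrs = parse_gff3_attributes(columns[8])
print(attrs)
```

Splitting on commas before decoding matters: an escaped comma (`%2C`) inside a value then stays part of that value instead of being treated as a separator.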
Full command reference is found here.
Command line examples are found here.
The implementation is designed to limit RAM usage, since large files can be involved. The code is reasonably optimized for speed, but there is surely room for further optimization. Contributions (e.g. via code diffs or pull requests) are always welcome.
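One common way to keep memory usage low on large files is to read them line by line and keep only a small running summary. The sketch below shows that general technique in Python; it is an assumption about the style of processing, not a description of the actual biogrep code. The function name `count_by_column` and the sample data are hypothetical.

```python
import io
from collections import Counter

def count_by_column(handle, column=0, delimiter="\t"):
    """Count rows per distinct value of one column, reading one line at a time.

    Hypothetical helper for illustration only; not part of biogrep.
    """
    counts = Counter()
    for line in handle:  # file-like objects yield lines lazily
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        counts[line.split(delimiter)[column]] += 1
    return counts

# Made-up BED-like sample; a real file opened with open(...) works the same way
sample = io.StringIO("chr1\t0\t10\nchr2\t5\t20\nchr1\t30\t40\n")
counts = count_by_column(sample)
print(counts)
```

With this approach, memory grows with the number of distinct keys (here, chromosome names) rather than with the size of the file.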
Installation
Download the latest wheel and install it with the standard command
pip install biogrep-0.x.y.z-py3-none-any.whl. This makes the scripts available for direct command-line
usage.
Changelog
License
Files are individually licensed; see the header of each file. The following holds for all files: the origin must not be misrepresented; logos cannot be used in derived work (e.g. in forks); distributions of modified versions of the software must be marked as such.