A universal compressor for genomic files
Genozip is a universal compressor for genomic files. It is optimized to compress FASTQ, SAM/BAM/CRAM, VCF/BCF, FASTA, GVF, PHYLIP, Chain and 23andMe files, but it can also compress any other file (including non-genomic files).
Typically, a 2X-5X improvement over the existing compression is achieved when compressing already-compressed files like .fastq.gz, .bam and .vcf.gz, and much higher ratios are achieved in some other cases.
Yes, Genozip can compress already-compressed files (.gz .bz2 .xz .bam .cram).
The compression is lossless - the decompressed file is 100% identical to the original file (some exceptions apply).
Genozip consists of four command line tools:
- genozip compresses files
- genounzip decompresses files
- genols shows metadata of compressed files and directories
- genocat is the workhorse for using genozip in analytical pipelines:
  - Display the contents of a compressed file, possibly piping it into a downstream tool
  - Subset a compressed file to show a specific part of its contents
  - Translate a compressed file to another format (e.g. BAM to FASTQ or Multi-FASTA to Phylip)
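As a rough illustration of how the tools fit together, the sketch below compresses a file with genozip and then uses genocat to confirm that the reconstructed contents match the original byte for byte. The function name `roundtrip_check` is hypothetical. The sketch assumes that the input is not already compressed, that genozip writes `<input>.genozip` next to the input and leaves the original in place, and that genocat given only a file name prints its contents to standard output. Verify each of these against your installed version before relying on it.

```shell
# Sketch only: round-trip one uncompressed file through genozip and genocat.
# Assumptions (not confirmed by this page): the compressed file is named
# "<input>.genozip" and the original input is left untouched.
roundtrip_check() {
    input="$1"
    compressed="${input}.genozip"

    # Compress; stop here if genozip reports a failure.
    genozip "$input" || return 1

    # genocat writes the reconstructed contents to stdout;
    # cmp reads them from "-" and compares them with the original.
    if genocat "$compressed" | cmp -s - "$input"; then
        echo "lossless: $input"
    else
        echo "MISMATCH: $input" >&2
        return 1
    fi
}
```

Calling `roundtrip_check sample.vcf` (a placeholder file name) would print `lossless: sample.vcf` when the two byte streams agree. Since some exceptions to exact reproduction apply, a mismatch does not necessarily indicate a bug.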
- From Conda (Linux & Mac):
  conda config --add channels conda-forge
  conda install genozip
- Using a Windows installer
- From Github (tested on Linux, Mac and Windows):
  git clone https://github.com/divonlan/genozip
  cd genozip
  make
  (requires: gcc or clang, make)
- From Docker Hub
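After installing by any of these routes, a quick sanity check is to confirm that all four command line tools are reachable on the PATH. The sketch below is one way to do that in a POSIX shell. The function name `check_genozip_tools` is hypothetical, and the check only looks up the command names. It does not run the tools or inspect their versions.

```shell
# Sketch only: report which of the four Genozip tools the shell can find.
check_genozip_tools() {
    missing=0
    for tool in genozip genounzip genocat genols; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Exit status is 0 only when every tool was found.
    return "$missing"
}
```

Running `check_genozip_tools` prints one line per tool, and its exit status can be used in a setup script to stop early when the installation is incomplete.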
Publications & Citing
Lan, D., et al. (2021) Genozip: a universal extensible genomic data compressor. Bioinformatics, btab102, https://doi.org/10.1093/bioinformatics/btab102
Lan, D., et al. (2020) genozip: a fast and efficient compression tool for VCF files. Bioinformatics, 36, 4091–4092
Bug reports and feature requests: email@example.com
Commercial license inquiries: firstname.lastname@example.org
Requests for support for compression of additional public or proprietary file formats: email@example.com
THIS SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.