genounzip

Uncompress files compressed with genozip.

Usage: genounzip [options]… [files]…

One or more file names must be given.


Examples

genounzip file1.vcf.genozip file2.sam.genozip

genounzip file.vcf.genozip --output file.vcf.gz

genounzip bound.vcf.genozip --unbind

Options

-e, --reference filename.  Load a reference file prior to decompressing. Required only for files compressed with --reference.

-c, --stdout  Send output to standard output instead of a file.

-f, --force  Force overwrite of the output file.

-z, --bgzf level.  Compress the output to the BGZF format (.gz extension) using libdeflate at the specified compression level, from 0 (no compression) to 12 (best but slowest compression). If you are not sure which value to choose, 6 is a popular option. Note: by default (absent this option) genozip will attempt to re-create the same BGZF compression as in the original file. Whether it succeeds in re-creating exactly the same BGZF compression ratio depends on the compression library used by the application that generated the original file.

-^, --replace  Replace the source file with the result file rather than leaving it unchanged.

-o, --output output-filename.  Output to this filename.

-p, --password password.  Provide password to access file(s) that were compressed with --password.

-x, --index  Create an index file alongside the decompressed file. The index file is created using the tool listed below for each data type:

Data type      Tool used
SAM/BAM        samtools index
FASTQ          samtools faidx
FASTA          samtools faidx
VCF            bcftools index
Other types    --index not supported


-u, --unbind[=prefix]  Split a bound file back into its original components. If the --unbind=prefix form is used, the prefix is added to the name of each component file. A prefix may include a directory.

-m, --md5  Show the digest of the decompressed file - MD5 if the file was compressed with --md5 or --test and Adler32 if not. Note: for compressed files (e.g. myfile.vcf.gz) the digest calculated is that of the original uncompressed file.

-t, --test  Decompress in memory (i.e. without writing the decompressed file to disk) and use the digest (MD5 or Adler32) to verify that the resulting decompressed file is identical to the original file.
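To check the digest reported by --md5 or --test independently, the same two checksums can be computed on the original uncompressed file. The following is a minimal sketch using only the Python standard library; the function name file_digests is hypothetical, and how genounzip formats the digests it prints is not specified here, so the comparison may need a conversion (for example, hex versus decimal).

```python
import hashlib
import zlib


def file_digests(path, chunk_size=1 << 20):
    """Return (md5_hex, adler32_int) for the file at `path`.

    The file is read in chunks so that large genomic files do not
    need to fit in memory.
    """
    md5 = hashlib.md5()
    adler = 1  # Adler-32 starting value, as defined by zlib
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            adler = zlib.adler32(chunk, adler)
    return md5.hexdigest(), adler & 0xFFFFFFFF
```

As the note on --md5 says, for a file such as myfile.vcf.gz the digest refers to the uncompressed data, so the sketch should be run on the uncompressed content rather than on the .gz file itself.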

-q, --quiet  Don't show the progress indicator or warnings.

-Q, --noisy  The --quiet option is turned on by default when outputting to the terminal. --noisy stops the suppression of warnings.

-@, --threads number.  Specify the maximum number of threads. By default genozip uses all the threads it needs to maximize usage of all available cores.

-w, --show-stats   Show the internal structure of a genozip file and the associated compression statistics.

-W, --SHOW-STATS   Show more detailed statistics.

Translation options (conversion from one format to another)

--bam  (SAM and BAM only) Output as BAM. Note: this option is implicit if --output specifies a filename ending with .bam.

--sam  (SAM and BAM only) Output as SAM. This option is the default in genocat on SAM and BAM data.

--no-PG  (SAM and BAM only) When converting a file from SAM to BAM or vice versa Genozip normally adds a @PG line in the header. With this option it doesn't.

--fastq  (SAM and BAM only) Output as FASTQ. The alignments are output as FASTQ reads in the order they appear in the SAM/BAM file. Alignments with FLAG 16 (reverse complemented) have their SEQ reverse complemented and their QUAL reversed. Alignments with FLAG 4 (unmapped) or 256 (secondary) are dropped. Alignments with FLAG 64 (or 128), i.e. the first (or last) segment in the template, have a '1' (or '2') added after the read name. Usually (if the original order of the SAM/BAM file has not been tampered with) this results in a valid interleaved FASTQ file. Note: this option is implicit if --output specifies a filename ending with .fq[.gz] or .fastq[.gz].
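The FLAG rules described for --fastq can be illustrated with a short Python sketch. This is not Genozip's implementation; the function name alignment_to_fastq and the mate_sep parameter are hypothetical. In particular, the text only says that a '1' or '2' is added after the read name, so the "/" separator used here is an assumption. Only A, C, G, T and N are complemented; any other character passes through unchanged.

```python
# Bit values of the SAM FLAG field named in the --fastq description.
FLAG_UNMAPPED = 4
FLAG_REVERSE = 16
FLAG_FIRST = 64
FLAG_LAST = 128
FLAG_SECONDARY = 256

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def alignment_to_fastq(qname, flag, seq, qual, mate_sep="/"):
    """Return a 4-line FASTQ record, or None if the alignment is dropped.

    mate_sep is an assumption: the option text does not say what, if
    anything, separates the read name from the added '1' or '2'.
    """
    # Unmapped and secondary alignments are dropped.
    if flag & (FLAG_UNMAPPED | FLAG_SECONDARY):
        return None
    # Reverse-strand alignments: reverse-complement SEQ, reverse QUAL.
    if flag & FLAG_REVERSE:
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    # First / last segment in the template gets '1' / '2' after the name.
    if flag & FLAG_FIRST:
        qname += mate_sep + "1"
    elif flag & FLAG_LAST:
        qname += mate_sep + "2"
    return f"@{qname}\n{seq}\n+\n{qual}\n"
```

For example, an alignment with FLAG 80 (16 + 64) has its sequence reverse complemented, its qualities reversed, and a '1' appended to its name, while an alignment with FLAG 4 yields no record at all.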

--bcf  (VCF only) Output as BCF. Note: bcftools needs to be installed for this option to work.

--phylip  (FASTA only) Output a Multi-FASTA in Phylip format. All sequences must be the same length.

--fasta  (Phylip only) Output as Multi-FASTA.

--vcf  (23andMe only) Output as VCF. --vcf must be used in combination with --reference to specify the reference file as listed in the header of the 23andMe file (usually this is GRCh37). Note: INDEL genotypes ('DD' 'DI' 'II') as well as uncalled sites ('--') are discarded.
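The notes on --bam and --fastq above say that these options are implicit when --output ends with certain extensions. A minimal sketch of that rule, for SAM and BAM data only, might look like the following. The function name implied_translation is hypothetical, and matching the extensions case-sensitively is an assumption.

```python
# Output-name endings that imply --fastq: .fq[.gz] or .fastq[.gz]
FASTQ_SUFFIXES = (".fq", ".fq.gz", ".fastq", ".fastq.gz")


def implied_translation(output_name):
    """Return the translation option implied by an --output filename.

    Returns '--bam', '--fastq', or None when the filename implies
    no translation. Applies to SAM and BAM data only.
    """
    if output_name.endswith(".bam"):
        return "--bam"
    if output_name.endswith(FASTQ_SUFFIXES):
        return "--fastq"
    return None
```

Under these assumptions, implied_translation("reads.fq.gz") gives '--fastq', and a name such as "file.vcf.gz" implies no translation.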

Other actions - uses other than uncompressing a file

-h, --help[=topic]  Show this help page. Optional topic can be:

topic        
genozip      list of genozip options
genounzip    list of genounzip options
genocat      list of genocat options
genols       list of genols options
dev          list of developer options
input        list of possible arguments of --input


-L, --license, --licence  Show the license terms and conditions for this product.

-V, --version  Display Genozip's version number.