Difference between revisions of "Petagene"

 

Latest revision as of 14:34, 23 November 2015

Petagene install

  1. Download the GeneCodeq binary (the commands below use genecodeq.1.0.1.centos6)
  2. Download corpus.tar.gz from the following page: http://www.petagene.com/eval/genecodeq/ (please note that this is a 21 GiB file)
  3. Unpack corpus.tar.gz to some path (e.g. /var/genecodeq/)
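
The unpack step can be sketched as a small shell helper. This is only a sketch: the helper name unpack_corpus and the CORPUS_ARCHIVE / CORPUS_DIR variables are made up for illustration, and it assumes corpus.tar.gz is a gzip-compressed tar archive whose files should land directly under the target path. Check the archive layout (tar -tzf corpus.tar.gz) before relying on it.

```shell
# Hypothetical defaults; adjust to where corpus.tar.gz was downloaded.
CORPUS_ARCHIVE="${CORPUS_ARCHIVE:-corpus.tar.gz}"
CORPUS_DIR="${CORPUS_DIR:-/var/genecodeq}"

# unpack_corpus ARCHIVE DEST
# Creates DEST if needed and extracts the gzip-compressed tar archive into it.
unpack_corpus() {
    archive="$1"
    dest="$2"
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Usage (may need root for /var):
#   unpack_corpus "$CORPUS_ARCHIVE" "$CORPUS_DIR"
```

The archive is large, so make sure the target filesystem has enough free space for both the download and the unpacked files.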

Using GeneCodeq for FASTQ files

One can compress FASTQ input files (e.g. in.fastq.gz) in the following way:

genecodeq.1.0.1.centos6 -r /var/genecodeq/hs37d5.seq -v /var/genecodeq/1kGenomeProjectT16.vcf -i in.fastq.gz -o out.fastq.gz

Note that this takes a compressed FASTQ file (in.fastq.gz) and produces another compressed FASTQ file (out.fastq.gz).
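
When there are many FASTQ files, the same command can be wrapped in a loop. The sketch below reuses only the -r, -v, -i and -o options shown above; the function names, the GENECODEQ and CORPUS_DIR variables, and the .gcq.fastq.gz output naming are assumptions made for illustration, not part of GeneCodeq.

```shell
# Assumed binary name and corpus location, taken from the example command.
GENECODEQ="${GENECODEQ:-genecodeq.1.0.1.centos6}"
CORPUS_DIR="${CORPUS_DIR:-/var/genecodeq}"

# Derive an output name: sample.fastq.gz -> sample.gcq.fastq.gz
# (this naming scheme is hypothetical).
gcq_out_name() {
    printf '%s\n' "${1%.fastq.gz}.gcq.fastq.gz"
}

# compress_fastq_dir DIR
# Runs GeneCodeq on every *.fastq.gz in DIR, skipping earlier outputs,
# and stops at the first failure.
compress_fastq_dir() {
    dir="$1"
    for in_file in "$dir"/*.fastq.gz; do
        [ -e "$in_file" ] || continue
        case "$in_file" in *.gcq.fastq.gz) continue ;; esac
        "$GENECODEQ" -r "$CORPUS_DIR/hs37d5.seq" \
                     -v "$CORPUS_DIR/1kGenomeProjectT16.vcf" \
                     -i "$in_file" -o "$(gcq_out_name "$in_file")" || return 1
    done
}

# Usage:
#   compress_fastq_dir /path/to/fastq
```

Writing to a separate output name keeps the original inputs untouched, which makes it easy to compare sizes afterwards.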

Using GeneCodeq for BAM files

samtools must be installed in order to use GeneCodeq with BAM files. Your client's systems will most likely have it installed already; if not, it can be downloaded from http://www.htslib.org/download/
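
A quick way to check whether samtools is already on a system is to look it up on the PATH. This is a minimal sketch; the helper names have_tool and require_samtools are made up for illustration.

```shell
# have_tool NAME: succeeds if NAME resolves to a command on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# require_samtools: prints a hint and fails if samtools is missing.
require_samtools() {
    if have_tool samtools; then
        return 0
    fi
    echo "samtools not found on PATH; see http://www.htslib.org/download/" >&2
    return 1
}

# Usage:
#   require_samtools || exit 1
```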

Once samtools is installed, you can use GeneCodeq to compress BAM files (e.g. in.bam) in the following way:

samtools view -h in.bam | genecodeq.1.0.1.centos6 -r /var/genecodeq/hs37d5.seq -v /var/genecodeq/1kGenomeProjectT16.vcf -t SAM | samtools view -b - -o out.bam

In both cases (FASTQ or BAM), the output file should be considerably smaller than the input (compare in.fastq.gz with out.fastq.gz, and in.bam with out.bam).
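
To put a number on the size difference, the input and output files can be compared directly. The sketch below uses only standard shell tools; the helper names size_bytes and percent_saved are hypothetical, and the result is an integer percentage (rounded toward zero).

```shell
# size_bytes FILE: prints the file size in bytes.
size_bytes() {
    wc -c < "$1" | tr -d '[:space:]'
}

# percent_saved IN OUT: prints how much smaller OUT is than IN, in percent.
# Fails if IN is empty, to avoid dividing by zero.
percent_saved() {
    in_size=$(size_bytes "$1")
    out_size=$(size_bytes "$2")
    [ "$in_size" -gt 0 ] || return 1
    echo $(( (in_size - out_size) * 100 / in_size ))
}

# Usage:
#   percent_saved in.fastq.gz out.fastq.gz
#   percent_saved in.bam out.bam
```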