Reading from unix pipes
From Biopython
Revision as of 12:01, 5 June 2009 by Giles.weaver (Talk | contribs)
Problem
There are many circumstances in which reading data from a Unix pipe is preferable to reading it from a file. One example is reading sequences from a compressed file: piping the decompressed data straight into a script avoids having to write an uncompressed copy to disk first.
Solution
This example script reads Solexa/Illumina FASTQ data from stdin, converts it to Sanger FASTQ and writes the result to stdout.
import sys
from Bio import SeqIO

recs = SeqIO.parse(sys.stdin, "fastq-solexa")
SeqIO.write(recs, sys.stdout, "fastq")
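For readers wondering what the conversion involves, here is a minimal standard-library sketch of how a single quality string might be re-encoded. It assumes the usual conventions: Solexa scores stored with an ASCII offset of 64, Sanger (PHRED) scores stored with an offset of 33, and the standard Solexa-to-PHRED mapping. The function names are hypothetical, and Biopython's own implementation may differ in details such as rounding.

```python
import math

SOLEXA_OFFSET = 64  # assumed ASCII offset for Solexa / early Illumina FASTQ
SANGER_OFFSET = 33  # assumed ASCII offset for Sanger FASTQ


def solexa_to_phred(solexa_q):
    """Map a Solexa score to the equivalent PHRED score (as a float)."""
    return 10 * math.log10(10 ** (solexa_q / 10.0) + 1)


def solexa_to_sanger_quality(quality):
    """Re-encode a Solexa quality string as a Sanger quality string.

    Hypothetical helper for illustration; scores are rounded to the
    nearest integer before being re-encoded.
    """
    return "".join(
        chr(int(round(solexa_to_phred(ord(c) - SOLEXA_OFFSET))) + SANGER_OFFSET)
        for c in quality
    )
```

In practice the SeqIO calls in the script take care of this for every record, so a helper like this is only useful for understanding or spot-checking the output.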
The following bash command decompresses the sequence file and pipes the result to the script (saved here as solexa2sanger_fq.py).
gunzip -c some_solexa.fastq.gz | python solexa2sanger_fq.py
This writes the sequences in Sanger FASTQ format to stdout, which in this case is the screen.
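Where gunzip is not available, the decompression step could in principle be done by a second small Python script at the head of the pipe. The following is only a sketch using the standard gzip and shutil modules; the function name stream_gzip_text and the script name used below are hypothetical.

```python
import gzip
import shutil
import sys


def stream_gzip_text(path, out_handle):
    """Decompress a gzip file and copy its text to out_handle in chunks."""
    # "rt" opens the compressed file in text mode, so str data is copied.
    with gzip.open(path, "rt") as in_handle:
        shutil.copyfileobj(in_handle, out_handle)


# When run as a script with one argument, stream that file to stdout.
if __name__ == "__main__" and len(sys.argv) == 2:
    stream_gzip_text(sys.argv[1], sys.stdout)
```

Saved as, say, gunzip_to_stdout.py (a hypothetical name), it would take the place of gunzip -c in the pipe: python gunzip_to_stdout.py some_solexa.fastq.gz | python solexa2sanger_fq.py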