Bio.GFF.easy: some functions to ease the use of Biopython (DEPRECATED)

This is part of the "old" Bio.GFF module by Michael Hoffman, which offered access to a MySQL database holding GFF data loaded by BioPerl. This code has now been deprecated, and will probably be removed in order to free the Bio.GFF namespace for a new GFF parser in Biopython (including GFF3 support).

Some of the more useful ideas of Bio.GFF.easy may be reworked for Bio.GenBank, using the standard SeqFeature objects used elsewhere in Biopython.
|
|||
|
FeatureDict: JH: accessing feature.qualifiers as a list is stupid. |
|||
|
Location: this is really best interfaced through LocationFromString.
fuzzy: < or >
join: {0 = no join, 1 = join, 2 = order} |
|||
|
LocationJoin
>>> join = LocationJoin([LocationFromCoords(339, 564, 1), LocationFromString("complement(100..339)")])... |
|||
|
LocationFromCoords
>>> print LocationFromCoords(339, 564)... |
|||
|
LocationFromString
>>> # here are some tests from http://www.ncbi.nlm.nih.gov/collab/FT/index.html#location
>>> print LocationFromString("467")
467
>>> print LocationFromString("340..565")
340..565
>>> print LocationFromString("<345..500")
<345..500
>>> print LocationFromString("<1..888")
<1..888
>>> # (102.110) and 123^124 syntax unimplemented
>>> print LocationFromString("join(12..78,134..202)")
join(12..78,134..202)
>>> print LocationFromString("complement(join(2691..4571,4918..5163))")
complement(join(2691..4571,4918..5163))
>>> print LocationFromString("join(complement(4918..5163),complement(2691..4571))")
join(complement(4918..5163),complement(2691..4571))
>>> print LocationFromString("order(complement(4918..5163),complement(2691..4571))")
order(complement(4918..5163),complement(2691..4571))
>>> print LocationFromString("NC_001802x.fna:73..78")
NC_001802x.fna:73..78
>>> print LocationFromString("J00194:100..202")
J00194:100..202
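The location syntax exercised by these tests can be parsed recursively. The following is a minimal Python 3 sketch of one way to do that, not the Bio.GFF.easy implementation: the names `Loc`, `parse_location` and `_split_top_level`, and the three regular expressions, are all hypothetical and written for illustration only. Like the original, it does not handle the `(102.110)` and `123^124` forms.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical patterns for this sketch; not the module's own re_* variables.
_RE_WRAPPER = re.compile(r"^(complement|join|order)\((.*)\)$")
_RE_SEQNAME = re.compile(r"^([^():,]+):(.*)$")
_RE_RANGE = re.compile(r"^([<>]?)(\d+)(?:\.\.([<>]?)(\d+))?$")


@dataclass
class Loc:
    kind: str                                  # "range", "complement", "join" or "order"
    parts: List["Loc"] = field(default_factory=list)
    start: str = ""                            # kept as text so "<" / ">" survive
    end: Optional[str] = None                  # None for a single position such as "467"
    seqname: Optional[str] = None              # e.g. "J00194" in "J00194:100..202"

    def __str__(self) -> str:
        if self.kind == "range":
            text = self.start if self.end is None else f"{self.start}..{self.end}"
            return f"{self.seqname}:{text}" if self.seqname else text
        return f"{self.kind}({','.join(str(p) for p in self.parts)})"


def _split_top_level(text: str) -> List[str]:
    """Split on commas that are not nested inside parentheses."""
    pieces, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            pieces.append("".join(current))
            current = []
        else:
            current.append(ch)
    pieces.append("".join(current))
    return pieces


def parse_location(text: str) -> Loc:
    text = text.strip()
    wrapper = _RE_WRAPPER.match(text)
    if wrapper:
        # complement(...), join(...) or order(...): recurse into each operand.
        kind, inner = wrapper.groups()
        return Loc(kind, [parse_location(p) for p in _split_top_level(inner)])
    seqname = None
    named = _RE_SEQNAME.match(text)
    if named:
        seqname, text = named.groups()
    rng = _RE_RANGE.match(text)
    if not rng:
        raise ValueError(f"unsupported location syntax: {text!r}")
    fuzzy_start, start, fuzzy_end, end = rng.groups()
    return Loc("range", start=fuzzy_start + start,
               end=None if end is None else fuzzy_end + end, seqname=seqname)


print(parse_location("complement(join(2691..4571,4918..5163))"))
```

Keeping the coordinates as text is a simplification so that the string form round-trips unchanged; a real implementation would more likely store integers plus separate fuzzy flags.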
|||
|
|||
re_complement = re.compile(r'^complement\(
|
|||
re_seqname = re.compile(r'^
|
|||
re_join = re.compile(r'^
|
|||
re_dotdot = re.compile(r'^
|
|||
re_fuzzy = re.compile(r'^
|
|||
|
|||
>>> record = fasta_single(string='''
... >gi|9629360|ref|NP_057850.1| Gag [Human immunodeficiency virus type 1]
... MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQT
... GSEELRSLYNTVATLYCVHQRIEIKDTKEALDKIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQG
... QMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAA
... EWDRVHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPT
... SILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPAATLEEMMTAC
... QGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGHTARNCRAPRKKGCWKCGKEG
... HQMKDCTERQANFLGKIWPSYKGRPGNFLQSRPEPTAPPEESFRSGVETTTPPQKQEPIDKELYPLTSLR
... SLFGNDPSSQ
... ''')
>>> record.id
'gi|9629360|ref|NP_057850.1|'
>>> record.description
'gi|9629360|ref|NP_057850.1| Gag [Human immunodeficiency virus type 1]'
>>> record.seq[0:5]
Seq('MGARA', SingleLetterAlphabet())
|
>>> records = fasta_readrecords('GFF/multi.fna')
>>> records[0].id
'test1'
>>> records[2].seq
Seq('AAACACAC', SingleLetterAlphabet())
|
>>> record = genbank_single("GFF/NC_001422.gbk")
>>> record.taxonomy
['Viruses', 'ssDNA viruses', 'Microviridae', 'Microvirus']
>>> cds = record.features[-4]
>>> cds.key
'CDS'
>>> location = LocationFromString(cds.location)
>>> print location
2931..3917
>>> subseq = record_subseq(record, location)
>>> subseq[0:20]
Seq('ATGTTTGGTGCTATTGCTGG', Alphabet())
|
>>> from Bio.Seq import Seq
>>> from Bio.SeqRecord import SeqRecord
>>> record = SeqRecord(Seq("gagttttatcgcttccatga"),
... "ref|NC_001422",
... "Coliphage phiX174, complete genome",
... "bases 1-11")
>>> record_subseq(record, LocationFromString("1..4")) # one-based
Seq('GAGT', Alphabet())
>>> record_subseq(record, LocationFromString("complement(1..4)")) # one-based
Seq('ACTC', Alphabet())
>>> record_subseq(record, LocationFromString("join(complement(1..4),1..4)")) # what an idea!
Seq('ACTCGAGT', Alphabet())
>>> loc = LocationFromString("complement(join(complement(5..7),1..4))")
>>> print loc
complement(join(complement(5..7),1..4))
>>> record_subseq(record, loc)
Seq('ACTCTTT', Alphabet())
>>> print loc
complement(join(complement(5..7),1..4))
>>> loc.reverse()
>>> record_subseq(record, loc)
Seq('AAAGAGT', Alphabet())
>>> record_subseq(record, loc, upper=1)
Seq('AAAGAGT', Alphabet())
|
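The doctests above show how nested `complement` and `join` operators combine when a subsequence is extracted with one-based, inclusive coordinates. As a rough illustration, here is a minimal Python 3 sketch that works on plain strings and on a hypothetical nested-tuple representation of a location; `extract` and `reverse_complement` are names invented for this example and are not part of Bio.GFF.easy.

```python
# Hypothetical helpers for illustration; plain strings stand in for Seq objects.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the result."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract(seq: str, loc) -> str:
    """Extract the subsequence described by a nested-tuple location.

    Assumed location forms (one-based, inclusive coordinates):
      ("range", start, end)
      ("complement", loc)
      ("join", [loc, loc, ...])
    """
    kind = loc[0]
    if kind == "range":
        _, start, end = loc
        return seq[start - 1:end]          # one-based inclusive -> Python slice
    if kind == "complement":
        return reverse_complement(extract(seq, loc[1]))
    if kind == "join":
        return "".join(extract(seq, part) for part in loc[1])
    raise ValueError(f"unknown location kind: {kind!r}")


genome = "gagttttatcgcttccatga"
# complement(join(complement(5..7),1..4))
nested = ("complement", ("join", [("complement", ("range", 5, 7)),
                                  ("range", 1, 4)]))
print(extract(genome, nested).upper())
```

Applying the outer `complement` to the joined pieces reverse-complements the whole concatenation, which is why the order of the parts flips in the result.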
Returns the sequence of a record. The record can be a Bio.SeqRecord.SeqRecord or a Bio.GenBank.Record.Record. |
>>> from Bio.Seq import Seq
>>> from Bio.SeqRecord import SeqRecord
>>> record = SeqRecord(Seq("gagttttatcgcttccatga"),
... "ref|NC_001422",
... "Coliphage phiX174, complete genome",
... "bases 1-11")
>>> record_coords(record, 0, 4) # zero-based
Seq('GAGT', Alphabet())
>>> record_coords(record, 0, 4, "-") # zero-based
Seq('ACTC', Alphabet())
>>> record_coords(record, 0, 4, "-", upper=1) # zero-based
Seq('ACTC', Alphabet())
|
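These examples use zero-based, half-open coordinates, whereas the location strings shown earlier are one-based and inclusive. The short Python 3 sketch below illustrates the difference on a plain string; `coords_subseq` and `one_based_to_zero_based` are hypothetical helper names, and the `"-"` strand convention is assumed from the doctest above.

```python
# Hypothetical helpers for illustration; plain strings stand in for Seq objects.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def coords_subseq(seq: str, start: int, end: int, strand: str = "+") -> str:
    """Slice with zero-based, half-open coordinates; reverse-complement on "-"."""
    piece = seq[start:end]
    if strand == "-":
        piece = piece.translate(_COMPLEMENT)[::-1]
    return piece


def one_based_to_zero_based(start: int, end: int) -> tuple:
    """Convert a one-based inclusive range (as in "1..4") to a Python slice."""
    return start - 1, end


genome = "gagttttatcgcttccatga"
start, end = one_based_to_zero_based(1, 4)
print(coords_subseq(genome, start, end).upper())
print(coords_subseq(genome, start, end, "-").upper())
```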
Run the Bio.GFF.easy module's doctests (PRIVATE). This will try to locate the unit tests directory and run the doctests from there, so that the relative paths used in the examples work. |
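The general pattern of running doctests from a particular directory, so that relative paths such as `GFF/multi.fna` resolve, can be sketched with the standard `doctest` and `os` modules. This is only an illustration in Python 3: `run_doctests_from` is a hypothetical helper, and how the real function locates the tests directory is not shown here.

```python
import doctest
import os


def run_doctests_from(directory, module):
    """Run a module's doctests with `directory` as the working directory.

    The previous working directory is restored afterwards, even if the
    doctest run raises an exception.
    """
    previous = os.getcwd()
    os.chdir(directory)
    try:
        return doctest.testmod(module)
    finally:
        os.chdir(previous)
```

`doctest.testmod` returns a result with `failed` and `attempted` counts, which a caller can inspect to decide whether the run succeeded.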
| Generated by Epydoc 3.0.1 on Thu Aug 18 17:53:32 2011 | http://epydoc.sourceforge.net |