Package Bio :: Package AlignIO :: Module StockholmIO :: Class StockholmIterator

Class StockholmIterator


                  object --+    
                           |    
Interfaces.AlignmentIterator --+
                               |
                              StockholmIterator

Loads a Stockholm file from PFAM into MultipleSeqAlignment objects.

The file may contain multiple concatenated alignments, which are loaded and returned incrementally.
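As a rough illustration of incremental loading, the sketch below yields one block of lines per alignment, relying on the "//" line that terminates each alignment in the Stockholm format. The function name split_alignments is hypothetical, and this is not the parser's actual implementation.

```python
def split_alignments(lines):
    """Yield one list of lines per alignment, ending at each '//' terminator."""
    block = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "//":
            if block:
                yield block
            block = []
        elif line.strip():
            block.append(line)
    if block:
        # Assumption: tolerate a final alignment that lacks its terminator.
        yield block


text = """# STOCKHOLM 1.0
seq1 ACGT
//
# STOCKHOLM 1.0
seq2 TTGA
//
"""
blocks = list(split_alignments(text.splitlines()))
print(len(blocks))
```

Because the function is a generator, a caller can process each alignment before the rest of the file has been read.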

This parser detects whether the Stockholm file follows the PFAM conventions for sequence-specific meta-data (lines starting with #=GS and #=GR) and, if so, populates the SeqRecord fields accordingly.

Any annotation which does not follow the PFAM conventions is currently ignored.

If an accession is provided for an entry in the meta-data, IT WILL NOT be used as the record.id (it will be recorded in the record's annotations instead). This is because some files have (sub)sequences from different parts of the same accession (differentiated by different start-end positions).

Wrap-around alignments are not supported: each sequence must be on a single line. However, interlaced sequences should work.
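A minimal sketch of how interlaced input could be merged, assuming "interlaced" means that the same sequence names recur in successive blocks and their chunks are concatenated. The helper join_interleaved is hypothetical and is not taken from the parser.

```python
def join_interleaved(lines):
    """Concatenate sequence chunks that share a name, in first-seen order."""
    sequences = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines, markup lines (#...) and the '//' terminator.
        if not line or line.startswith("#") or line == "//":
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # a name with no sequence text; ignored in this sketch
        name, chunk = parts
        sequences[name] = sequences.get(name, "") + chunk.replace(" ", "")
    return sequences


interleaved = [
    "# STOCKHOLM 1.0",
    "seqA  ACGT",
    "seqB  AC-T",
    "",
    "seqA  GGCC",
    "seqB  GG-C",
    "//",
]
joined = join_interleaved(interleaved)
print(joined)
```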

For more information on the file format, please see:

http://www.bioperl.org/wiki/Stockholm_multiple_alignment_format
http://www.cgb.ki.se/cgb/groups/sonnhammer/Stockholm.html

For consistency with BioPerl and EMBOSS we call this the "stockholm" format.

Instance Methods
 
__next__(self)
Return the next alignment in the file.
 
_identifier_split(self, identifier)
Returns (name, start, end) string tuple from an identifier.
 
_get_meta_data(self, identifier, meta_dict)
Takes an identifier and returns a dict of all meta-data matching it.
 
_populate_meta_data(self, identifier, record)
Adds meta-data to a SeqRecord's annotations dictionary.

Inherited from Interfaces.AlignmentIterator: __init__, __iter__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Class Variables
  pfam_gr_mapping = {'AS': 'active_site', 'IN': 'intron', 'LI': ...
  pfam_gs_mapping = {'LO': 'look', 'OC': 'organism_classificatio...
Properties

Inherited from object: __class__

Method Details

__next__(self)

Return the next alignment in the file.

This method should be replaced by any derived class to do something
useful.

Overrides: Interfaces.AlignmentIterator.__next__
(inherited documentation)

_get_meta_data(self, identifier, meta_dict)


Takes an identifier and returns a dict of all meta-data matching it.

For example, given "Q9PN73_CAMJE/149-220", it will return all matches to this or to "Q9PN73_CAMJE", which is the identifier without its /start-end suffix.

In the example below, the suffix is required to match the AC, but must be removed to match the OS and OC meta-data:

   # STOCKHOLM 1.0
   #=GS Q9PN73_CAMJE/149-220  AC Q9PN73
   ...
   Q9PN73_CAMJE/149-220               NKA...
   ...
   #=GS Q9PN73_CAMJE OS Campylobacter jejuni
   #=GS Q9PN73_CAMJE OC Bacteria

This function will return an empty dictionary if no data is found.
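The lookup described above can be sketched as follows. The function names split_identifier and get_meta_data are hypothetical stand-ins, and the layout of meta_dict (identifier -> feature code -> list of values) is an assumption made for illustration; this is not the module's own source.

```python
def split_identifier(identifier):
    """Split 'NAME/START-END' into a (name, start, end) tuple of strings."""
    name, slash, coords = identifier.rpartition("/")
    start, dash, end = coords.partition("-")
    if slash and dash and start.isdigit() and end.isdigit():
        return name, start, end
    # No usable /start-end suffix: the whole identifier is the name.
    return identifier, None, None


def get_meta_data(identifier, meta_dict):
    """Collect meta-data stored under the full identifier or its bare name."""
    name, _start, _end = split_identifier(identifier)
    keys = [identifier] if name == identifier else [identifier, name]
    answer = {}
    for key in keys:
        answer.update(meta_dict.get(key, {}))
    return answer


# Assumed layout: identifier -> feature code -> list of values.
meta = {
    "Q9PN73_CAMJE/149-220": {"AC": ["Q9PN73"]},
    "Q9PN73_CAMJE": {"OS": ["Campylobacter jejuni"], "OC": ["Bacteria"]},
}
found = get_meta_data("Q9PN73_CAMJE/149-220", meta)
print(sorted(found))
```

Looking up both keys is what lets the AC line (stored with the suffix) and the OS and OC lines (stored without it) end up in the same result.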

_populate_meta_data(self, identifier, record)


Adds meta-data to a SeqRecord's annotations dictionary.

This function applies the PFAM conventions.
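As a hedged sketch of what applying the conventions might look like, the code below translates #=GS feature codes into annotation keys using the pfam_gs_mapping values listed under Class Variable Details. The function annotate, the "accession" key, the joining of values with ", ", and the choice to skip unmapped codes are all assumptions for illustration, not the documented behaviour of this method.

```python
# Values copied from the pfam_gs_mapping class variable.
PFAM_GS_MAPPING = {
    "LO": "look",
    "OC": "organism_classification",
    "OS": "organism",
}


def annotate(gs_fields):
    """Build an annotations dict from {feature_code: [values]} (hypothetical)."""
    annotations = {}
    for code, values in gs_fields.items():
        if code == "AC":
            # Assumption: the accession goes into the annotations under this
            # key rather than replacing the record's id.
            annotations["accession"] = values[0]
        elif code in PFAM_GS_MAPPING:
            annotations[PFAM_GS_MAPPING[code]] = ", ".join(values)
        # Codes outside the mapping are skipped in this sketch.
    return annotations


example = annotate(
    {"AC": ["Q9PN73"], "OS": ["Campylobacter jejuni"], "XX": ["unmapped"]}
)
print(example)
```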


Class Variable Details

pfam_gr_mapping

Value:
{'AS': 'active_site',
 'IN': 'intron',
 'LI': 'ligand_binding',
 'PP': 'posterior_probability',
 'SA': 'surface_accessibility',
 'SS': 'secondary_structure',
 'TM': 'transmembrane'}

pfam_gs_mapping

Value:
{'LO': 'look', 'OC': 'organism_classification', 'OS': 'organism'}