
Class CompoundLocation

object --+
         |
        CompoundLocation

For handling joins etc. where a feature location has several parts.
Instance Methods

__init__(self, parts, operator='join')
Create a compound location with several parts.

__str__(self)
Returns a representation of the location (with Python counting).

__repr__(self)
String representation of the location for debugging.

_get_strand(self)

_set_strand(self, value)

__add__(self, other)
Combine locations, or shift the location by an integer offset.

__radd__(self, other)
Combine locations.

__contains__(self, value)
Check if an integer position is within the location.

__nonzero__(self)
Returns True regardless of the length of the feature.

__len__(self)

__iter__(self)

_shift(self, offset)
Returns a copy of the location shifted by the offset (PRIVATE).

_flip(self, length)
Returns a copy of the location after the parent is reversed (PRIVATE).

extract(self, parent_sequence)
Extract feature sequence from the supplied parent sequence.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Properties
  strand
Overall strand of the compound location.
  start
Start location (integer like, possibly a fuzzy position, read only).
  end
End location (integer like, possibly a fuzzy position, read only).
  nofuzzy_start
Start position (integer, approximated if fuzzy, read only) (OBSOLETE).
  nofuzzy_end
End position (integer, approximated if fuzzy, read only) (OBSOLETE).
  ref
CompoundLocations don't have a ref (dummy method for API compatibility).
  ref_db
CompoundLocations don't have a ref_db (dummy method for API compatibility).

Inherited from object: __class__

Method Details

__init__(self, parts, operator='join')
(Constructor)

Create a compound location with several parts.

>>> from Bio.SeqFeature import FeatureLocation, CompoundLocation
>>> f1 = FeatureLocation(10, 40, strand=+1)
>>> f2 = FeatureLocation(50, 59, strand=+1)
>>> f = CompoundLocation([f1, f2])
>>> len(f) == len(f1) + len(f2) == 39 == len(list(f))
True
>>> print(f.operator)
join
>>> 5 in f
False
>>> 15 in f
True
>>> f.strand
1

Notice that the strand of the compound location is computed automatically. If the sub-locations have mixed strands, the overall strand is set to None.

>>> f = CompoundLocation([FeatureLocation(3, 6, strand=+1),
...                       FeatureLocation(10, 13, strand=-1)])
>>> print(f.strand)
None
>>> len(f)
6
>>> list(f)
[3, 4, 5, 12, 11, 10]

Calling list(f), as in the example above, iterates over the coordinates within the feature. This lets you use min and max on the location to find the range covered:

>>> min(f)
3
>>> max(f)
12

More generally, you can use the compound location's start and end which give the full range covered, 0 <= start <= end <= full sequence length.

>>> f.start == min(f)
True
>>> f.end == max(f) + 1
True

This is consistent with the behaviour of the simple FeatureLocation for a single region, where again the 'start' and 'end' do not necessarily give the biological start and end, but rather the 'minimal' and 'maximal' coordinate boundaries.
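
The relationship between the iterated coordinates and the start and end boundaries can be sketched without Biopython. The following is only an illustration, assuming each part is a plain (start, end, strand) tuple with Python-style half-open coordinates. The helper names iter_positions and span are hypothetical and are not part of Bio.SeqFeature.

```python
def iter_positions(parts):
    """Yield every covered coordinate, part by part.

    Reverse-strand parts are walked from their highest coordinate down.
    """
    for start, end, strand in parts:
        positions = range(start, end)
        if strand == -1:
            positions = reversed(positions)
        for pos in positions:
            yield pos


def span(parts):
    """Return the (start, end) boundaries that cover all the parts."""
    return min(s for s, _, _ in parts), max(e for _, e, _ in parts)


# Same coordinates as the mixed-strand example above (hypothetical tuples).
parts = [(3, 6, +1), (10, 13, -1)]
coords = list(iter_positions(parts))
start, end = span(parts)
print(coords)
print(start, end)
```

Because the end boundary is exclusive, it is one more than the largest coordinate produced by the iteration.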

Note that adding locations provides a more intuitive method of construction:

>>> f = FeatureLocation(3, 6, strand=+1) + FeatureLocation(10, 13, strand=-1)
>>> len(f)
6
>>> list(f)
[3, 4, 5, 12, 11, 10]
Overrides: object.__init__

__str__(self)
(Informal representation operator)

Returns a representation of the location (with Python counting).
Overrides: object.__str__

__repr__(self)
(Representation operator)

String representation of the location for debugging.
Overrides: object.__repr__

__add__(self, other)
(Addition operator)

Combine locations, or shift the location by an integer offset.

>>> from Bio.SeqFeature import FeatureLocation, CompoundLocation
>>> f1 = FeatureLocation(15, 17) + FeatureLocation(20, 30)
>>> print(f1)
join{[15:17], [20:30]}

You can add another FeatureLocation:

>>> print(f1 + FeatureLocation(40, 50))
join{[15:17], [20:30], [40:50]}
>>> print(FeatureLocation(5, 10) + f1)
join{[5:10], [15:17], [20:30]}

You can also add another CompoundLocation:

>>> f2 = FeatureLocation(40, 50) + FeatureLocation(60, 70)
>>> print(f2)
join{[40:50], [60:70]}
>>> print(f1 + f2)
join{[15:17], [20:30], [40:50], [60:70]}

Also, as with the FeatureLocation, adding an integer shifts the location's co-ordinates by that offset:

>>> print(f1 + 100)
join{[115:117], [120:130]}
>>> print(200 + f1)
join{[215:217], [220:230]}
>>> print(f1 + (-5))
join{[10:12], [15:25]}
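
The three cases above (adding a single location, adding another compound location, and adding an integer offset) can be modelled with ordinary operator overloading. This is a rough sketch and not the Biopython source: the class SketchCompound is hypothetical, and it assumes parts are plain (start, end) tuples with no strand.

```python
class SketchCompound:
    """Hypothetical stand-in for a compound location; parts are (start, end) tuples."""

    def __init__(self, parts):
        self.parts = list(parts)

    def __add__(self, other):
        if isinstance(other, int):
            # Integer: shift every part by the offset.
            return SketchCompound((s + other, e + other) for s, e in self.parts)
        if isinstance(other, SketchCompound):
            # Compound: concatenate the parts, keeping their order.
            return SketchCompound(self.parts + other.parts)
        if isinstance(other, tuple):
            # Single (start, end) part: append it.
            return SketchCompound(self.parts + [other])
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, int):
            # Shifting is symmetric, so reuse __add__.
            return self + other
        if isinstance(other, tuple):
            # A single part on the left goes in front.
            return SketchCompound([other] + self.parts)
        return NotImplemented

    def __str__(self):
        return "join{%s}" % ", ".join("[%i:%i]" % part for part in self.parts)


f1 = SketchCompound([(15, 17), (20, 30)])
print(f1 + (40, 50))
print((5, 10) + f1)
print(f1 + 100)
print(200 + f1)
```

The reflected method matters because an integer or a plain tuple on the left does not know how to add a compound object, so Python falls back to the right-hand operand.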

__nonzero__(self)
(Boolean test operator)

Returns True regardless of the length of the feature.

This behaviour is for backwards compatibility, since until the __len__ method was added, a FeatureLocation always evaluated as True.

Note that in comparison, Seq objects, strings, lists, etc, will all evaluate to False if they have length zero.

WARNING: The FeatureLocation may in future evaluate to False when its length is zero (in order to better match normal Python behaviour)!
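
The difference from built-in containers can be shown with a toy class. This is a minimal sketch, assuming nothing about the Biopython implementation beyond what is described above. The class name AlwaysTrue is hypothetical.

```python
class AlwaysTrue:
    """Hypothetical zero-length object that still evaluates as True."""

    def __len__(self):
        return 0

    def __bool__(self):
        # Python 3 name for the boolean test hook.
        return True

    # Python 2 name for the same hook, as used in this documentation.
    __nonzero__ = __bool__


empty_location_like = AlwaysTrue()
# The explicit boolean hook takes priority over __len__,
# whereas an empty string or list is False.
print(bool(empty_location_like), bool(""), bool([]))
```

Code that needs to detect an empty location should therefore test len(location) == 0 rather than rely on truthiness.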

_flip(self, length)

Returns a copy of the location after the parent is reversed (PRIVATE).

Note that the order of the parts is NOT reversed too. Consider a CDS on the forward strand with exons small, medium and large (in length). Once we change the frame of reference to the reverse complement strand, the start codon is still part of the small exon, and the stop codon still part of the large exon - so the part order remains the same!

Here is an artificial example, where the features map to the two upper case regions and the lower case runs of n are not used:

>>> from Bio.Seq import Seq
>>> from Bio.SeqFeature import FeatureLocation
>>> dna = Seq("nnnnnAGCATCCTGCTGTACnnnnnnnnGAGAMTGCCATGCCCCTGGAGTGAnnnnn")
>>> small = FeatureLocation(5, 20, strand=1)
>>> large = FeatureLocation(28, 52, strand=1)
>>> location = small + large
>>> print(small)
[5:20](+)
>>> print(large)
[28:52](+)
>>> print(location)
join{[5:20](+), [28:52](+)}
>>> for part in location.parts:
...     print(len(part))
...
15
24

As you can see, this is a silly example where each "exon" is a word:

>>> print(small.extract(dna).translate())
SILLY
>>> print(large.extract(dna).translate())
EXAMPLE*
>>> print(location.extract(dna).translate())
SILLYEXAMPLE*
>>> for part in location.parts:
...     print(part.extract(dna).translate())
...
SILLY
EXAMPLE*

Now, let's look at this from the reverse strand frame of reference:

>>> flipped_dna = dna.reverse_complement()
>>> flipped_location = location._flip(len(dna))
>>> print(flipped_location.extract(flipped_dna).translate())
SILLYEXAMPLE*
>>> for part in flipped_location.parts:
...     print(part.extract(flipped_dna).translate())
...
SILLY
EXAMPLE*

The key point here is that the first part of the CompoundLocation is still the small exon, while the second part is still the large exon:

>>> for part in flipped_location.parts:
...     print(len(part))
...
15
24
>>> print(flipped_location)
join{[37:52](-), [5:29](-)}

Notice the parts are not reversed. However, there was a bug here in older versions of Biopython which would have given join{[5:29](-), [37:52](-)} and the translation would have wrongly been "EXAMPLE*SILLY" instead.
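
The coordinate arithmetic behind the flipped output can be reproduced by hand. The sketch below is an illustration only: it assumes parts are plain (start, end, strand) tuples whose strand is +1 or -1, and the function name flip_parts is hypothetical, not a Biopython API.

```python
def flip_parts(parts, length):
    """Map parts onto the reverse complement of a parent of the given length.

    Each part is mirrored (start and end swap roles) and its strand is
    negated, but the order of the parts in the list is kept.
    Assumes every strand is +1 or -1.
    """
    return [(length - end, length - start, -strand) for start, end, strand in parts]


parts = [(5, 20, +1), (28, 52, +1)]  # small exon, then large exon
flipped = flip_parts(parts, 57)      # 57 is len(dna) in the example above
print(flipped)
```

Flipping twice with the same length restores the original parts, which is a quick sanity check on the arithmetic.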


Property Details

strand

Overall strand of the compound location.

If all the parts have the same strand, that is returned. Otherwise for mixed strands, this returns None.

>>> from Bio.SeqFeature import FeatureLocation, CompoundLocation
>>> f1 = FeatureLocation(15, 17, strand=1)
>>> f2 = FeatureLocation(20, 30, strand=-1)
>>> f = f1 + f2
>>> f1.strand
1
>>> f2.strand
-1
>>> f.strand
>>> f.strand is None
True

If you set the strand of a CompoundLocation, this is applied to all the parts - use with caution:

>>> f.strand = 1
>>> f1.strand
1
>>> f2.strand
1
>>> f.strand
1
Get Method:
_get_strand(self)
Set Method:
_set_strand(self, value)
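
The rule for the overall strand can be stated in a few lines. This is a sketch of the rule as described above, not the Biopython source. The function name overall_strand is hypothetical, and it takes a plain list of strand values rather than location objects.

```python
def overall_strand(strands):
    """Return the strand shared by all parts, or None if they disagree."""
    unique = set(strands)
    return unique.pop() if len(unique) == 1 else None


print(overall_strand([+1, +1]))
print(overall_strand([+1, -1]))
```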

start

Start location (integer like, possibly a fuzzy position, read only).

end

End location (integer like, possibly a fuzzy position, read only).

nofuzzy_start

Start position (integer, approximated if fuzzy, read only) (OBSOLETE).

This is an alias for int(feature.start), which should be used in preference -- unless you are trying to support old versions of Biopython.


nofuzzy_end

End position (integer, approximated if fuzzy, read only) (OBSOLETE).

This is an alias for int(feature.end), which should be used in preference -- unless you are trying to support old versions of Biopython.


ref

CompoundLocations don't have a ref (dummy method for API compatibility).

ref_db

CompoundLocations don't have a ref_db (dummy method for API compatibility).