Class _BitString

object --+        
         |        
basestring --+    
             |    
           str --+
                 |
                _BitString

Helper class for binary string data (PRIVATE).

_BitString is a subclass of str that accepts only the two characters '0' and '1' and adds bitwise operators (&, |, ^) for binary-style manipulation. It is used to store and count compatible clades across multiple trees during consensus tree searching. During counting, two clades are considered the same if their terminals (compared by their name attribute) are the same.

For example, suppose the following two trees are provided to search for their strict consensus tree:

tree1: (((A, B), C),(D, E))
tree2: ((A, (B, C)),(D, E))

For both trees, the _BitString object '11111' represents the root clade. Each '1' stands for one terminal clade in the list [A, B, C, D, E] (the order may differ; it is determined by the get_terminals method of the first tree provided). The clades ((A, B), C) in tree1 and (A, (B, C)) in tree2 are both represented by '11100'. Similarly, '11000' represents clade (A, B) in tree1, '01100' represents clade (B, C) in tree2, and '00011' represents clade (D, E) in both trees.
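As a rough illustration of this encoding, the sketch below derives the bit strings for tree1 without using Bio.Phylo. The tree is modelled as nested tuples, and terminal_names and clade_bits are hypothetical helpers written for this example, not functions of the module.

```python
# Hypothetical stand-in: a tree is a nested tuple, a terminal is a plain string.
tree1 = ((("A", "B"), "C"), ("D", "E"))


def terminal_names(clade):
    """Return the terminal names under a clade, left to right."""
    if isinstance(clade, str):
        return [clade]
    names = []
    for child in clade:
        names.extend(terminal_names(child))
    return names


def clade_bits(clade, term_names):
    """Encode a clade as a '0'/'1' string over the reference terminal order."""
    members = set(terminal_names(clade))
    return "".join("1" if name in members else "0" for name in term_names)


term_names = terminal_names(tree1)            # reference order: A, B, C, D, E
root_bits = clade_bits(tree1, term_names)     # the root covers every terminal
abc_bits = clade_bits(tree1[0], term_names)   # clade ((A, B), C)
ab_bits = clade_bits(tree1[0][0], term_names) # clade (A, B)
de_bits = clade_bits(tree1[1], term_names)    # clade (D, E)
```

With the terminal order taken from tree1, these calls give '11111', '11100', '11000' and '00011', matching the values described above.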

With the _count_clades function in this module, we obtain the following clade counts and their _BitString representations (the root and terminals are omitted):

clade   _BitString   count
ABC     '11100'     2
DE      '00011'     2
AB      '11000'     1
BC      '01100'     1
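
The counts in this table can be reproduced with a small stand-alone sketch. It is not the module's _count_clades function: trees are modelled as nested tuples, plain strings stand in for _BitString objects, and terminal_names and internal_clades are hypothetical helpers.

```python
from collections import Counter

# Hypothetical stand-ins for the two example trees.
tree1 = ((("A", "B"), "C"), ("D", "E"))
tree2 = (("A", ("B", "C")), ("D", "E"))


def terminal_names(clade):
    """Return the terminal names under a clade, left to right."""
    if isinstance(clade, str):
        return [clade]
    return [name for child in clade for name in terminal_names(child)]


def internal_clades(clade, is_root=True):
    """Yield every clade that is neither the root nor a terminal."""
    if isinstance(clade, str):
        return
    if not is_root:
        yield clade
    for child in clade:
        yield from internal_clades(child, is_root=False)


# The terminal order is fixed by the first tree, as described above.
term_names = terminal_names(tree1)

counts = Counter()
for tree in (tree1, tree2):
    for clade in internal_clades(tree):
        members = set(terminal_names(clade))
        bits = "".join("1" if name in members else "0" for name in term_names)
        counts[bits] += 1
```

Because each clade is reduced to the set of its terminal names, ((A, B), C) and (A, (B, C)) fall under the same key '11100' even though their internal nesting differs.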

To get the _BitString representation of a clade, we can use the following code snippet:

# suppose we are provided with a tree list, the first thing to do is
# to get all the terminal names in the first tree
term_names = [term.name for term in trees[0].get_terminals()]
# for a specific clade in any of the trees, also get its terminal names
clade_term_names = [term.name for term in clade.get_terminals()]
# then create a boolean list
boolvals = [name in clade_term_names for name in term_names]
# create the string version and pass it to _BitString
bitstr = _BitString(''.join(map(str, map(int, boolvals))))
# or, equivalently:
bitstr = _BitString.from_bool(boolvals)

To convert back:

from Bio.Phylo import BaseTree

# get all the terminal clades of the first tree
terms = trees[0].get_terminals()
# get the indices of the terminal clades marked '1' in bitstr
index_list = bitstr.index_one()
# get all terminal clades by index
clade_terms = [terms[i] for i in index_list]
# create a new clade and append all the terminal clades
new_clade = BaseTree.Clade()
new_clade.clades.extend(clade_terms)
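
The index lookup at the heart of this conversion can be tried without any tree objects. The sketch below uses a plain string and a list of names as stand-ins; the terminal order is an assumption carried over from the example trees.

```python
term_names = ["A", "B", "C", "D", "E"]  # assumed reference terminal order
bits = "01100"                          # clade (B, C) from the example

# Positions holding '1', mirroring what index_one() is documented to return.
index_list = [i for i, bit in enumerate(bits) if bit == "1"]

# Map the positions back to terminal names.
clade_term_names = [term_names[i] for i in index_list]

# Re-encoding the recovered names reproduces the original bit string.
round_trip = "".join(
    "1" if name in clade_term_names else "0" for name in term_names
)
```

The bit string records only which terminals belong to a clade, so the rebuilt clade is flat: any nesting inside the original clade cannot be recovered from the bit string alone.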

Example

>>> from Bio.Phylo.Consensus import _BitString
>>> bitstr1 = _BitString('11111')
>>> bitstr2 = _BitString('11100')
>>> bitstr3 = _BitString('01101')
>>> bitstr4 = _BitString('00011')  # assumed value: the (D, E) clade above
>>> bitstr1
_BitString('11111')
>>> bitstr2 & bitstr3
_BitString('01100')
>>> bitstr2 | bitstr3
_BitString('11101')
>>> bitstr2 ^ bitstr3
_BitString('10001')
>>> bitstr2.index_one()
[0, 1, 2]
>>> bitstr3.index_one()
[1, 2, 4]
>>> bitstr3.index_zero()
[0, 3]
>>> bitstr1.contains(bitstr2)
True
>>> bitstr2.contains(bitstr3)
False
>>> bitstr2.independent(bitstr3)
False
>>> bitstr2.independent(bitstr4)
True
>>> bitstr1.iscompatible(bitstr2)
True
>>> bitstr2.iscompatible(bitstr3)
False
>>> bitstr2.iscompatible(bitstr4)
True
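
For readers curious how a str subclass can support these operators, here is a minimal sketch. BitStr is a hypothetical class written for this illustration; it is not the module's _BitString, and the input check and zero-padding shown are assumptions about one reasonable way to do it.

```python
class BitStr(str):
    """Hypothetical stand-in for illustration; not Bio.Phylo's _BitString."""

    def __new__(cls, strdata):
        # Assumption: reject anything other than the characters '0' and '1'.
        if set(strdata) - {"0", "1"}:
            raise TypeError("expected a string of '0' and '1' only")
        return super().__new__(cls, strdata)

    def _bitwise(self, other, op):
        # Convert both operands to integers, combine them, convert back.
        value = op(int(self, 2), int(other, 2))
        # Zero-pad to the original width so leading zeros are preserved.
        return BitStr(format(value, "b").zfill(len(self)))

    def __and__(self, other):
        return self._bitwise(other, lambda a, b: a & b)

    def __or__(self, other):
        return self._bitwise(other, lambda a, b: a | b)

    def __xor__(self, other):
        return self._bitwise(other, lambda a, b: a ^ b)


bitstr2 = BitStr("11100")
bitstr3 = BitStr("01101")
and_result = bitstr2 & bitstr3
or_result = bitstr2 | bitstr3
xor_result = bitstr2 ^ bitstr3
```

Padding matters here: without it, a result such as '01100' would lose its leading zero and no longer line up with the terminal list.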

Instance Methods

__and__(self, other)
__or__(self, other)
__xor__(self, other)
__rand__(self, other)
__ror__(self, other)
__rxor__(self, other)
__repr__(self)
    repr(x)
index_one(self)
    Return a list of positions where the element is '1'
index_zero(self)
    Return a list of positions where the element is '0'
contains(self, other)
    Check if current bitstr1 contains another one bitstr2.
independent(self, other)
    Check if current bitstr1 is independent of another one bitstr2.
iscompatible(self, other)
    Check if current bitstr1 is compatible with another bitstr2.

Inherited from str: __add__, __contains__, __eq__, __format__, __ge__, __getattribute__, __getitem__, __getnewargs__, __getslice__, __gt__, __hash__, __le__, __len__, __lt__, __mod__, __mul__, __ne__, __rmod__, __rmul__, __sizeof__, __str__, capitalize, center, count, decode, encode, endswith, expandtabs, find, format, index, isalnum, isalpha, isdigit, islower, isspace, istitle, isupper, join, ljust, lower, lstrip, partition, replace, rfind, rindex, rjust, rpartition, rsplit, rstrip, split, splitlines, startswith, strip, swapcase, title, translate, upper, zfill

Inherited from str (private): _formatter_field_name_split, _formatter_parser

Inherited from object: __delattr__, __init__, __reduce__, __reduce_ex__, __setattr__, __subclasshook__

Class Methods

from_bool(cls, bools)

Static Methods

__new__(cls, strdata)
    Initialize from binary string data.
    Returns: a new object with type S, a subtype of T

Properties

Inherited from object: __class__

Method Details

__new__(cls, strdata)
Static Method

Initialize from binary string data.
Returns:
a new object with type S, a subtype of T

Overrides: object.__new__

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)

contains(self, other)

Check if the current bit string (bitstr1) contains another one (bitstr2).

That is, bitstr2.index_one() is a subset of bitstr1.index_one().

Examples:
"011011" contains "011000", "011001", "000011"

Note that "011011" also contains "000000". In fact, every _BitString object contains the all-zero _BitString of the same length.
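
The subset rule can be checked with plain strings. The contains function below is a hypothetical stand-alone helper that follows the description above; it is not the method's actual implementation.

```python
def contains(bits1, bits2):
    """Return True if every '1' position in bits2 is also '1' in bits1."""
    ones1 = {i for i, bit in enumerate(bits1) if bit == "1"}
    ones2 = {i for i, bit in enumerate(bits2) if bit == "1"}
    return ones2 <= ones1  # subset test on the index sets


# The examples from the text, plus the all-zero case.
contained = [
    contains("011011", other)
    for other in ("011000", "011001", "000011", "000000")
]

# A '1' in position 0 is not present in "011011".
not_contained = contains("011011", "100000")
```

The all-zero string is contained in everything because the empty set is a subset of every set.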

independent(self, other)

Check if the current bit string (bitstr1) is independent of another one (bitstr2).

That is, bitstr1.index_one() and bitstr2.index_one() have no intersection.

Note that every _BitString object is independent of the all-zero _BitString of the same length.
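
A small sketch of the disjointness test, again on plain strings. The independent function here is a hypothetical helper built from the description above, not the method's actual code.

```python
def independent(bits1, bits2):
    """Return True if bits1 and bits2 share no '1' position."""
    ones1 = {i for i, bit in enumerate(bits1) if bit == "1"}
    ones2 = {i for i, bit in enumerate(bits2) if bit == "1"}
    return ones1.isdisjoint(ones2)


disjoint = independent("11100", "00011")      # (A, B, C) versus (D, E)
overlapping = independent("11100", "01101")   # positions 1 and 2 are shared
with_all_zero = independent("11100", "00000") # the all-zero case
```

In clade terms, two independent bit strings describe groups that have no terminal in common.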

iscompatible(self, other)

Check if the current bit string (bitstr1) is compatible with another one (bitstr2).

Two bit strings are considered compatible if either condition holds:

  1. bitstr1.contains(bitstr2) or vice versa;
  2. bitstr1.independent(bitstr2).
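
The two conditions combine into a short check. The sketch below works on plain strings; ones and iscompatible are hypothetical helpers that follow the definitions above rather than the module's own code.

```python
def ones(bits):
    """Return the set of positions that hold '1'."""
    return {i for i, bit in enumerate(bits) if bit == "1"}


def iscompatible(bits1, bits2):
    """Return True if one string contains the other or they are independent."""
    a, b = ones(bits1), ones(bits2)
    nested = a <= b or b <= a      # condition 1: containment either way
    disjoint = a.isdisjoint(b)     # condition 2: independence
    return nested or disjoint


nested_case = iscompatible("11111", "11100")    # root contains (A, B, C)
conflict_case = iscompatible("11100", "01101")  # partial overlap only
disjoint_case = iscompatible("11100", "00011")  # no shared terminals
```

A partial overlap such as '11100' against '01101' is the one case that fails both conditions: the two groups share some terminals, yet neither is nested inside the other.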