Package Bio :: Package SearchIO :: Package _model :: Module query :: Class QueryResult

Class QueryResult

             object --+    
                      |    
_base._BaseSearchObject --+
                          |
                         QueryResult

Class representing search results from a single query.

QueryResult is the container object that stores all search hits from a
single search query. It is the top-level object returned by SearchIO's two
main functions, `read` and `parse`. Depending on the search results and
search output format, a QueryResult object will contain zero or more Hit
objects (see Hit).

You can take a quick look at a QueryResult's contents and attributes by
invoking `print` on it:

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> print(qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description                                          
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...
            8      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...
            9      2  gi|301171447|ref|NR_035871.1|  Pan troglodytes microRNA...
           10      1  gi|301171276|ref|NR_035852.1|  Pan troglodytes microRNA...
           11      1  gi|262205290|ref|NR_030188.1|  Homo sapiens microRNA 51...
...

If you just want to know how many hits a QueryResult has, you can invoke
`len` on it. Alternatively, you can simply type its name in the interpreter:

>>> len(qresult)
100
>>> qresult
QueryResult(id='33211', 100 hits)

QueryResult behaves like a hybrid of Python's built-in list and dictionary.
You can retrieve its items (Hit objects) using the integer index of the
item, just like regular Python lists:

>>> first_hit = qresult[0]
>>> first_hit
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

You can slice QueryResult objects as well. Slicing will return a new
QueryResult object containing only the sliced hits:

>>> sliced_qresult = qresult[:3]    # slice the first three hits
>>> len(qresult)
100
>>> len(sliced_qresult)
3
>>> print(sliced_qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description                                          
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...

Like Python dictionaries, you can also retrieve hits using the hit's ID.
This is useful for retrieving hits that you know should exist in a given
search:

>>> hit = qresult['gi|262205317|ref|NR_030195.1|']
>>> hit
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
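
The list/dictionary hybrid behavior described above can be illustrated with a
small stand-alone sketch. `ToyResult` and its dict-based "hits" are
hypothetical stand-ins invented for this illustration; they are not
Biopython's implementation and only show how one `__getitem__` can serve
integer, slice, and string-ID lookups.

```python
from collections import OrderedDict


class ToyResult:
    """Hypothetical stand-in for QueryResult; keeps items in insertion order."""

    def __init__(self, items=(), key_function=lambda item: item["id"]):
        self._key_function = key_function
        self._items = OrderedDict((key_function(item), item) for item in items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        if isinstance(key, str):
            # dictionary-style lookup by ID
            return self._items[key]
        values = list(self._items.values())
        if isinstance(key, slice):
            # slicing returns a new container, not a plain list
            return ToyResult(values[key], self._key_function)
        # list-style lookup by integer position
        return values[key]


# made-up hit records, used only for this illustration
hits = [{"id": "hit_a"}, {"id": "hit_b"}, {"id": "hit_c"}]
result = ToyResult(hits)
first = result[0]         # by position
same = result["hit_a"]    # by ID
head = result[:2]         # new ToyResult holding the first two items
```

Here `first` and `same` refer to the same object, and `head` is another
`ToyResult` rather than a list, mirroring the slicing behavior shown earlier.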

You can also replace a Hit in QueryResult with another Hit using either the
integer index or hit key string. Note that the replacing object must be a
Hit that has the same `query_id` property as the QueryResult object.

If you're not sure whether a QueryResult contains a particular hit, you can
use the hit ID to check for membership first:

>>> 'gi|262205317|ref|NR_030195.1|' in qresult
True
>>> 'gi|262380031|ref|NR_023426.1|' in qresult
False

Or, if you just want to know the rank / position of a given hit, you can
use the hit ID as an argument for the `index` method. Note that the values
returned will be zero-based. So zero (0) means the hit is the first in the
QueryResult, three (3) means the hit is the fourth item, and so on. If the
hit does not exist in the QueryResult, a `ValueError` will be raised.

>>> qresult.index('gi|262205317|ref|NR_030195.1|')
0
>>> qresult.index('gi|262205330|ref|NR_030198.1|')
5
>>> qresult.index('gi|262380031|ref|NR_023426.1|')
Traceback (most recent call last):
...
ValueError: ...
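
If a missing hit should not stop your script, the `ValueError` can be caught
and turned into a sentinel value. The helper below is a hypothetical
convenience function, not part of SearchIO; it relies only on the `index`
behavior described above and is demonstrated on a plain list of made-up IDs
so that it runs without any search output file.

```python
def safe_rank(container, hit_key):
    """Return the zero-based position of hit_key, or None when it is absent."""
    try:
        return container.index(hit_key)
    except ValueError:
        return None


# made-up IDs in rank order, standing in for a QueryResult
keys = ["hit_a", "hit_b", "hit_c"]
rank_b = safe_rank(keys, "hit_b")
rank_missing = safe_rank(keys, "not_there")
```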

To ease working with a large number of hits, QueryResult has several
`filter` and `map` methods, analogous to Python's built-in functions with
the same names. There are `filter` and `map` methods available for
operations over both Hit objects and HSP objects. As an example, here we
use the `hit_map` method to rename all hit IDs within a QueryResult:

>>> def renamer(hit):
...     hit.id = hit.id.split('|')[3]
...     return hit
>>> mapped_qresult = qresult.hit_map(renamer)
>>> print(mapped_qresult)
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description                                          
         ----  -----  ----------------------------------------------------------
            0      1  NR_030195.1  Homo sapiens microRNA 520b (MIR520B), micr...
            1      1  NR_035856.1  Pan troglodytes microRNA mir-520b (MIR520B...
            2      1  NR_032573.1  Macaca mulatta microRNA mir-519a (MIR519A)...
...

The principle behind the other `map` and `filter` methods is similar: each
accepts a function, applies it, and returns a new QueryResult object.
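
The "accept a function, apply it, return a new object" pattern can be
sketched without any search output at hand. `ToyResult`, its dict-based hits,
and the IDs below are hypothetical and exist only for this illustration; the
copying inside `hit_map` is a design choice of the sketch so that the
original container is left unchanged.

```python
import copy


class ToyResult:
    """Hypothetical stand-in for QueryResult, holding hits as plain dicts."""

    def __init__(self, hits=(), program=None):
        self.hits = list(hits)
        self.program = program  # example of a non-hit attribute

    def hit_filter(self, func):
        # keep only the hits for which func returns a true value
        kept = [hit for hit in self.hits if func(hit)]
        return ToyResult(kept, self.program)

    def hit_map(self, func):
        # apply func to a copy of each hit so the original stays untouched
        mapped = [func(copy.deepcopy(hit)) for hit in self.hits]
        return ToyResult(mapped, self.program)


def shorten_id(hit):
    hit["id"] = hit["id"].split("|")[3]
    return hit


# made-up records, used only for this illustration
result = ToyResult(
    [
        {"id": "gi|0001|ref|FAKE_1.1|", "description": "Homo sapiens microRNA"},
        {"id": "gi|0002|ref|FAKE_2.1|", "description": "Pan troglodytes microRNA"},
    ],
    program="blastn",
)
human_only = result.hit_filter(
    lambda hit: hit["description"].startswith("Homo sapiens")
)
renamed = result.hit_map(shorten_id)
```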

There are also other methods useful for working with list-like objects:
`append`, `pop`, and `sort`. More details and examples are available in
their respective documentation.

Finally, just like Python lists and dictionaries, QueryResult objects are
iterable. Iteration over QueryResults will yield Hit objects:

>>> for hit in qresult[:4]:     # iterate over the first four items
...     hit
...
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)
Hit(id='gi|301171311|ref|NR_035856.1|', query_id='33211', 1 hsps)
Hit(id='gi|270133242|ref|NR_032573.1|', query_id='33211', 1 hsps)
Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)

If you need access to all the hits in a QueryResult object, you can get
them in a list using the `hits` property. Similarly, access to all hit IDs is
available through the `hit_keys` property.

>>> qresult.hits
[Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps), ...]
>>> qresult.hit_keys
['gi|262205317|ref|NR_030195.1|', 'gi|301171311|ref|NR_035856.1|', ...]

Instance Methods

__init__(self, hits=[], id=None, hit_key_function=<function <lambda>>)
Initializes a QueryResult object.

__iter__(self)

iterhits(self)
Returns an iterator over the Hit objects.

iterhit_keys(self)
Returns an iterator over the ID of the Hit objects.

iteritems(self)
Returns an iterator yielding tuples of Hit ID and Hit objects.

__contains__(self, hit_key)

__len__(self)

__bool__(self)

__nonzero__(self)

__repr__(self)
repr(x)

__str__(self)
str(x)

__getitem__(self, hit_key)

__setitem__(self, hit_key, hit)

__delitem__(self, hit_key)

absorb(self, hit)
Adds a Hit object to the end of QueryResult.

append(self, hit)
Adds a Hit object to the end of QueryResult.

hit_filter(self, func=None)
Creates a new QueryResult object whose Hit objects pass the filter function.

hit_map(self, func=None)
Creates a new QueryResult object, mapping the given function to its Hits.

hsp_filter(self, func=None)
Creates a new QueryResult object whose HSP objects pass the filter function.

hsp_map(self, func=None)
Creates a new QueryResult object, mapping the given function to its HSPs.

pop(self, hit_key=-1, default=object())
Removes the specified hit key and returns the Hit object.

index(self, hit_key)
Returns the index of a given hit key, zero-based.

sort(self, key=None, reverse=False, in_place=True)
Sorts the Hit objects.

Inherited from _base._BaseSearchObject (private): _transfer_attrs

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

Class Variables
  _NON_STICKY_ATTRS = ('_items')
  __marker = object()

Properties
  hits
Hit objects contained in the QueryResult.
  hit_keys
Hit IDs of the Hit objects contained in the QueryResult.
  items
List of tuples of Hit IDs and Hit objects.
  id
QueryResult ID string
  description
QueryResult description
  hsps
HSP objects contained in the QueryResult.
  fragments
HSPFragment objects contained in the QueryResult.

Inherited from object: __class__

Method Details

__init__(self, hits=[], id=None, hit_key_function=<function <lambda>>)
(Constructor)

Initializes a QueryResult object.

Arguments:
id -- String of query sequence ID.
hits -- Iterator returning Hit objects.
hit_key_function -- Function to define hit keys, defaults to a function
                    that returns Hit object IDs.

Overrides: object.__init__

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)

__str__(self)
(Informal representation operator)

str(x)

Overrides: object.__str__
(inherited documentation)

absorb(self, hit)

Adds a Hit object to the end of QueryResult. If the QueryResult
already has a Hit with the same ID, the new Hit's HSPs are appended to
the existing Hit instead.

Arguments:
hit -- Hit object to absorb.

This method is used for file formats that may output the same Hit in
separate places, such as BLAT or Exonerate. In both formats, Hits on
different strands are reported in different places. SearchIO nevertheless
treats them as the same Hit, because a Hit object should represent all
database entries with the same ID, regardless of strand orientation.
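
The merging rule can be sketched with plain dictionaries. The function below
is a hypothetical illustration, not Biopython's code: hits are modeled as a
mapping from hit ID to a list of HSP placeholders, and all IDs and HSP names
are made up.

```python
def absorb(hits_by_id, hit_id, hsps):
    """Merge hsps into the entry with the same ID, or add a new entry."""
    if hit_id in hits_by_id:
        # same ID seen before: extend the existing entry's HSP list
        hits_by_id[hit_id].extend(hsps)
    else:
        # new ID: added at the end (dicts preserve insertion order)
        hits_by_id[hit_id] = list(hsps)


hits_by_id = {}
absorb(hits_by_id, "chr1", ["plus_strand_hsp"])
absorb(hits_by_id, "chr2", ["other_hsp"])
absorb(hits_by_id, "chr1", ["minus_strand_hsp"])  # merged, not duplicated
```

After the three calls there are still only two entries, and the `chr1` entry
holds both of its HSP placeholders.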

append(self, hit)

Adds a Hit object to the end of QueryResult.

Arguments:
hit -- Hit object to append.

Any Hit object appended must have the same `query_id` property as the
QueryResult's `id` property. If the hit key already exists, a
`ValueError` will be raised.
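
A minimal sketch of these two checks is shown below. `ToyResult` and its
dict-based hits are hypothetical. The documentation above only states that a
duplicate hit key raises `ValueError`; raising the same exception type for a
mismatched `query_id` is an assumption of this sketch.

```python
class ToyResult:
    """Hypothetical stand-in for QueryResult; hits are plain dicts."""

    def __init__(self, id):
        self.id = id
        self._items = {}  # hit key -> hit, in insertion order

    def append(self, hit):
        if hit["query_id"] != self.id:
            # assumption: this sketch uses ValueError for the mismatch too
            raise ValueError("query_id does not match %r" % self.id)
        if hit["id"] in self._items:
            raise ValueError("hit key %r already present" % hit["id"])
        self._items[hit["id"]] = hit


result = ToyResult(id="query_1")
result.append({"id": "hit_a", "query_id": "query_1"})

try:
    result.append({"id": "hit_a", "query_id": "query_1"})  # duplicate key
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```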

hit_filter(self, func=None)

Creates a new QueryResult object whose Hit objects pass the filter
function.

Arguments:
func -- Callback function that accepts a Hit object as its parameter,
        does a boolean check, and returns True or False

Here is an example of using `hit_filter` to select Hits whose
description begins with the string 'Homo sapiens', case sensitive:

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> def desc_filter(hit):
...     return hit.description.startswith('Homo sapiens')
...
>>> len(qresult)
100
>>> filtered = qresult.hit_filter(desc_filter)
>>> len(filtered)
39
>>> print(filtered[:4])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description                                          
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            2      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            3      1  gi|262205451|ref|NR_030222.1|  Homo sapiens microRNA 51...

Note that instance attributes (other than the hits) from the unfiltered
QueryResult are retained in the filtered object.

>>> qresult.program == filtered.program
True
>>> qresult.target == filtered.target
True

hit_map(self, func=None)

Creates a new QueryResult object, mapping the given function to its
Hits.

Arguments:
func -- Callback function that accepts a Hit object as its parameter and
        also returns a Hit object.

Here is an example of using `hit_map` with a function that discards all
HSPs in a Hit except for the first one:

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> print(qresult[:8])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description                                          
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      2  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      2  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

>>> top_hsp = lambda hit: hit[:1]
>>> mapped_qresult = qresult.hit_map(top_hsp)
>>> print(mapped_qresult[:8])
Program: blastn (2.2.27+)
  Query: 33211 (61)
         mir_1
 Target: refseq_rna
   Hits: ----  -----  ----------------------------------------------------------
            #  # HSP  ID + description                                          
         ----  -----  ----------------------------------------------------------
            0      1  gi|262205317|ref|NR_030195.1|  Homo sapiens microRNA 52...
            1      1  gi|301171311|ref|NR_035856.1|  Pan troglodytes microRNA...
            2      1  gi|270133242|ref|NR_032573.1|  Macaca mulatta microRNA ...
            3      1  gi|301171322|ref|NR_035857.1|  Pan troglodytes microRNA...
            4      1  gi|301171267|ref|NR_035851.1|  Pan troglodytes microRNA...
            5      1  gi|262205330|ref|NR_030198.1|  Homo sapiens microRNA 52...
            6      1  gi|262205302|ref|NR_030191.1|  Homo sapiens microRNA 51...
            7      1  gi|301171259|ref|NR_035850.1|  Pan troglodytes microRNA...

hsp_filter(self, func=None)

Creates a new QueryResult object whose HSP objects pass the filter
function.

`hsp_filter` is the same as `hit_filter`, except that it filters
directly on each HSP object in every Hit. If the filtering removes
all HSP objects in a given Hit, the entire Hit will be discarded. This
will result in the QueryResult having fewer Hits after filtering.
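
The rule that a Hit disappears once all of its HSPs are filtered out can be
sketched as follows. This is a hypothetical illustration with made-up data,
not Biopython's implementation: hits are modeled as a mapping from hit ID to
a list of HSP-like dicts, and the `evalue` key is chosen only for the
example.

```python
def hsp_filter(hits, func):
    """Return a new mapping keeping only the HSPs for which func is true."""
    filtered = {}
    for hit_id, hsps in hits.items():
        kept = [hsp for hsp in hsps if func(hsp)]
        if kept:  # a hit with no surviving HSPs is dropped entirely
            filtered[hit_id] = kept
    return filtered


# made-up values, used only for this illustration
hits = {
    "hit_a": [{"evalue": 1e-30}, {"evalue": 0.5}],
    "hit_b": [{"evalue": 2.0}],
}
significant = hsp_filter(hits, lambda hsp: hsp["evalue"] < 1e-5)
```

Here `hit_a` keeps one of its two HSPs, while `hit_b` is removed because none
of its HSPs pass the check; the input mapping itself is not modified.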

hsp_map(self, func=None)

Creates a new QueryResult object, mapping the given function to its
HSPs.

`hsp_map` is the same as `hit_map`, except that it applies the given
function to all HSP objects in every Hit, instead of the Hit objects.

pop(self, hit_key=-1, default=object())

Removes the Hit object with the specified hit key and returns it.

Arguments:
hit_key -- Integer index or string of hit key that points to a Hit
           object.
default -- Value that will be returned if the Hit object with the
           specified index or hit key is not found.

By default, `pop` will remove and return the last Hit object in the
QueryResult object. To remove a specific Hit object, you can use its
integer index or hit key.

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> len(qresult)
100
>>> for hit in qresult[:5]:
...     print(hit.id)
... 
gi|262205317|ref|NR_030195.1|
gi|301171311|ref|NR_035856.1|
gi|270133242|ref|NR_032573.1|
gi|301171322|ref|NR_035857.1|
gi|301171267|ref|NR_035851.1|

# remove the last hit
>>> qresult.pop()
Hit(id='gi|397513516|ref|XM_003827011.1|', query_id='33211', 1 hsps)

# remove the first hit
>>> qresult.pop(0)
Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps)

# remove hit with the given ID
>>> qresult.pop('gi|301171322|ref|NR_035857.1|')
Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps)
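
The role of the `default` argument can be sketched with a sentinel object,
in the spirit of the `default=object()` signature above. `pop_hit` is a
hypothetical stand-alone function operating on a list of (key, hit) pairs
with made-up values; it is not Biopython's implementation, and the exception
types it re-raises are choices of this sketch.

```python
_MARKER = object()  # sentinel meaning "no default supplied"


def pop_hit(items, hit_key=-1, default=_MARKER):
    """Remove and return one hit from a list of (key, hit) pairs."""
    try:
        if isinstance(hit_key, int):
            # list-style removal by position
            return items.pop(hit_key)[1]
        for position, (key, _hit) in enumerate(items):
            if key == hit_key:
                # dictionary-style removal by key
                return items.pop(position)[1]
        raise KeyError(hit_key)
    except (IndexError, KeyError):
        if default is _MARKER:
            raise
        return default


items = [("a", "hit_a"), ("b", "hit_b"), ("c", "hit_c")]
last = pop_hit(items)                       # removes the last pair
first = pop_hit(items, 0)                   # removes the first pair
missing = pop_hit(items, "zzz", default=None)  # key absent, default returned
```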

index(self, hit_key)

Returns the index of a given hit key, zero-based.

Arguments:
hit_key -- Hit ID string to look up.

This method is useful for finding out the integer index (usually
correlated with search rank) of a given hit key.

>>> from Bio import SearchIO
>>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml'))
>>> qresult.index('gi|301171259|ref|NR_035850.1|')
7

sort(self, key=None, reverse=False, in_place=True)

Sorts the Hit objects.

Arguments:
key -- Function used to sort the Hit objects.
reverse -- Boolean, whether to reverse the sorting or not.
in_place -- Boolean, whether to perform sorting in place (in the same
            object) or not (creating a new object).

`sort` defaults to sorting in place, to mimic Python's `list.sort`
method. If you set the `in_place` argument to False, it will
return a new, sorted QueryResult object and keep the initial one
unsorted.
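
The difference between the two modes can be sketched with a toy container.
`ToyResult` is a hypothetical stand-in holding made-up string "hits"; it only
illustrates the in-place versus new-object behavior described above and is
not Biopython's implementation.

```python
class ToyResult:
    """Hypothetical stand-in for QueryResult, holding hits in a list."""

    def __init__(self, hits=()):
        self.hits = list(hits)

    def sort(self, key=None, reverse=False, in_place=True):
        if in_place:
            # like list.sort: reorder this object and return None
            self.hits.sort(key=key, reverse=reverse)
            return None
        # otherwise leave this object alone and return a sorted copy
        return ToyResult(sorted(self.hits, key=key, reverse=reverse))


original = ToyResult(["hit_c", "hit_a", "hit_b"])
sorted_copy = original.sort(in_place=False)
order_after_copy = list(original.hits)  # still in the initial order
```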


Property Details

hits

Hit objects contained in the QueryResult.

Get Method:
unreachable.hits(self) - Hit objects contained in the QueryResult.

hit_keys

Hit IDs of the Hit objects contained in the QueryResult.

Get Method:
unreachable.hit_keys(self) - Hit IDs of the Hit objects contained in the QueryResult.

items

List of tuples of Hit IDs and Hit objects.

Get Method:
unreachable.items(self) - List of tuples of Hit IDs and Hit objects.

id

QueryResult ID string

Get Method:
unreachable.getter(self)
Set Method:
unreachable.setter(self, value)

description

QueryResult description

Get Method:
unreachable.getter(self)
Set Method:
unreachable.setter(self, value)

hsps

HSP objects contained in the QueryResult.

Get Method:
unreachable.hsps(self) - HSP objects contained in the QueryResult.

fragments

HSPFragment objects contained in the QueryResult.

Get Method:
unreachable.fragments(self) - HSPFragment objects contained in the QueryResult.