Package Bio :: Package SearchIO :: Package _model :: Module query
[hide private]
[frames] | no frames]

Source Code for Module Bio.SearchIO._model.query

  1  # Copyright 2012 by Wibowo Arindrarto.  All rights reserved. 
  2  # This code is part of the Biopython distribution and governed by its 
  3  # license.  Please see the LICENSE file that should have been included 
  4  # as part of this package. 
  5   
  6  """Bio.SearchIO object to model search results from a single query.""" 
  7   
  8  from __future__ import print_function 
  9  from Bio._py3k import basestring 
 10   
 11  from copy import deepcopy 
 12  from itertools import chain 
 13   
 14  from Bio._py3k import OrderedDict 
 15  from Bio._py3k import filter 
 16   
 17  from Bio._utils import trim_str 
 18  from Bio.SearchIO._utils import optionalcascade 
 19   
 20  from ._base import _BaseSearchObject 
 21  from .hit import Hit 
 22   
 23   
 24  __docformat__ = "restructuredtext en" 
25 26 27 -class QueryResult(_BaseSearchObject):
28 29 """Class representing search results from a single query. 30 31 QueryResult is the container object that stores all search hits from a 32 single search query. It is the top-level object returned by SearchIO's two 33 main functions, ``read`` and ``parse``. Depending on the search results and 34 search output format, a QueryResult object will contain zero or more Hit 35 objects (see Hit). 36 37 You can take a quick look at a QueryResult's contents and attributes by 38 invoking ``print`` on it:: 39 40 >>> from Bio import SearchIO 41 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 42 >>> print(qresult) 43 Program: blastn (2.2.27+) 44 Query: 33211 (61) 45 mir_1 46 Target: refseq_rna 47 Hits: ---- ----- ---------------------------------------------------------- 48 # # HSP ID + description 49 ---- ----- ---------------------------------------------------------- 50 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 51 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 52 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 53 3 2 gi|301171322|ref|NR_035857.1| Pan troglodytes microRNA... 54 4 1 gi|301171267|ref|NR_035851.1| Pan troglodytes microRNA... 55 5 2 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 56 6 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 57 7 1 gi|301171259|ref|NR_035850.1| Pan troglodytes microRNA... 58 8 1 gi|262205451|ref|NR_030222.1| Homo sapiens microRNA 51... 59 9 2 gi|301171447|ref|NR_035871.1| Pan troglodytes microRNA... 60 10 1 gi|301171276|ref|NR_035852.1| Pan troglodytes microRNA... 61 11 1 gi|262205290|ref|NR_030188.1| Homo sapiens microRNA 51... 62 ... 63 64 If you just want to know how many hits a QueryResult has, you can invoke 65 ``len`` on it. Alternatively, you can simply type its name in the interpreter:: 66 67 >>> len(qresult) 68 100 69 >>> qresult 70 QueryResult(id='33211', 100 hits) 71 72 QueryResult behaves like a hybrid of Python's built-in list and dictionary. 73 You can retrieve its items (Hit objects) using the integer index of the 74 item, just like regular Python lists:: 75 76 >>> first_hit = qresult[0] 77 >>> first_hit 78 Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps) 79 80 You can slice QueryResult objects as well. Slicing will return a new 81 QueryResult object containing only the sliced hits:: 82 83 >>> sliced_qresult = qresult[:3] # slice the first three hits 84 >>> len(qresult) 85 100 86 >>> len(sliced_qresult) 87 3 88 >>> print(sliced_qresult) 89 Program: blastn (2.2.27+) 90 Query: 33211 (61) 91 mir_1 92 Target: refseq_rna 93 Hits: ---- ----- ---------------------------------------------------------- 94 # # HSP ID + description 95 ---- ----- ---------------------------------------------------------- 96 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 97 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 98 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 99 100 Like Python dictionaries, you can also retrieve hits using the hit's ID. 101 This is useful for retrieving hits that you know should exist in a given 102 search:: 103 104 >>> hit = qresult['gi|262205317|ref|NR_030195.1|'] 105 >>> hit 106 Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps) 107 108 You can also replace a Hit in QueryResult with another Hit using either the 109 integer index or hit key string. Note that the replacing object must be a 110 Hit that has the same ``query_id`` property as the QueryResult object. 111 112 If you're not sure whether a QueryResult contains a particular hit, you can 113 use the hit ID to check for membership first:: 114 115 >>> 'gi|262205317|ref|NR_030195.1|' in qresult 116 True 117 >>> 'gi|262380031|ref|NR_023426.1|' in qresult 118 False 119 120 Or, if you just want to know the rank / position of a given hit, you can 121 use the hit ID as an argument for the ``index`` method. Note that the values 122 returned will be zero-based. So zero (0) means the hit is the first in the 123 QueryResult, three (3) means the hit is the fourth item, and so on. If the 124 hit does not exist in the QueryResult, a ``ValueError`` will be raised. 125 126 >>> qresult.index('gi|262205317|ref|NR_030195.1|') 127 0 128 >>> qresult.index('gi|262205330|ref|NR_030198.1|') 129 5 130 >>> qresult.index('gi|262380031|ref|NR_023426.1|') 131 Traceback (most recent call last): 132 ... 133 ValueError: ... 134 135 To ease working with a large number of hits, QueryResult has several 136 ``filter`` and ``map`` methods, analogous to Python's built-in functions with 137 the same names. There are ``filter`` and ``map`` methods available for 138 operations over both Hit objects or HSP objects. As an example, here we are 139 using the ``hit_map`` method to rename all hit IDs within a QueryResult:: 140 141 >>> def renamer(hit): 142 ... hit.id = hit.id.split('|')[3] 143 ... return hit 144 >>> mapped_qresult = qresult.hit_map(renamer) 145 >>> print(mapped_qresult) 146 Program: blastn (2.2.27+) 147 Query: 33211 (61) 148 mir_1 149 Target: refseq_rna 150 Hits: ---- ----- ---------------------------------------------------------- 151 # # HSP ID + description 152 ---- ----- ---------------------------------------------------------- 153 0 1 NR_030195.1 Homo sapiens microRNA 520b (MIR520B), micr... 154 1 1 NR_035856.1 Pan troglodytes microRNA mir-520b (MIR520B... 155 2 1 NR_032573.1 Macaca mulatta microRNA mir-519a (MIR519A)... 156 ... 157 158 The principle for other ``map`` and ``filter`` methods are similar: they accept 159 a function, applies it, and returns a new QueryResult object. 160 161 There are also other methods useful for working with list-like objects: 162 ``append``, ``pop``, and ``sort``. More details and examples are available in 163 their respective documentations. 164 165 Finally, just like Python lists and dictionaries, QueryResult objects are 166 iterable. Iteration over QueryResults will yield Hit objects:: 167 168 >>> for hit in qresult[:4]: # iterate over the first four items 169 ... hit 170 ... 171 Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps) 172 Hit(id='gi|301171311|ref|NR_035856.1|', query_id='33211', 1 hsps) 173 Hit(id='gi|270133242|ref|NR_032573.1|', query_id='33211', 1 hsps) 174 Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps) 175 176 If you need access to all the hits in a QueryResult object, you can get 177 them in a list using the ``hits`` property. Similarly, access to all hit IDs is 178 available through the ``hit_keys`` property. 179 180 >>> qresult.hits 181 [Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps), ...] 182 >>> qresult.hit_keys 183 ['gi|262205317|ref|NR_030195.1|', 'gi|301171311|ref|NR_035856.1|', ...] 184 185 """ 186 187 # attributes we don't want to transfer when creating a new QueryResult class 188 # from this one 189 _NON_STICKY_ATTRS = ('_items', '__alt_hit_ids', ) 190
191 - def __init__(self, hits=[], id=None, 192 hit_key_function=lambda hit: hit.id):
193 """Initializes a QueryResult object. 194 195 :param id: query sequence ID 196 :type id: string 197 :param hits: iterator yielding Hit objects 198 :type hits: iterable 199 :param hit_key_function: function to define hit keys 200 :type hit_key_function: callable, accepts Hit objects, returns string 201 202 """ 203 # default values 204 self._id = id 205 self._hit_key_function = hit_key_function 206 self._items = OrderedDict() 207 self._description = None 208 self.__alt_hit_ids = {} 209 self.program = '<unknown program>' 210 self.target = '<unknown target>' 211 self.version = '<unknown version>' 212 213 # validate Hit objects and fill up self._items 214 for hit in hits: 215 # validation is handled by __setitem__ 216 self.append(hit)
217 218 # handle Python 2 OrderedDict behavior 219 if hasattr(OrderedDict, 'iteritems'): 220
221 - def __iter__(self):
222 return self.iterhits()
223 224 @property
225 - def hits(self):
226 """Hit objects contained in the QueryResult.""" 227 return self._items.values()
228 229 @property
230 - def hit_keys(self):
231 """Hit IDs of the Hit objects contained in the QueryResult.""" 232 return self._items.keys()
233 234 @property
235 - def items(self):
236 """List of tuples of Hit IDs and Hit objects.""" 237 return self._items.items()
238
239 - def iterhits(self):
240 """Returns an iterator over the Hit objects.""" 241 for hit in self._items.itervalues(): 242 yield hit
243
244 - def iterhit_keys(self):
245 """Returns an iterator over the ID of the Hit objects.""" 246 for hit_id in self._items: 247 yield hit_id
248
249 - def iteritems(self):
250 """Returns an iterator yielding tuples of Hit ID and Hit objects.""" 251 for item in self._items.iteritems(): 252 yield item
253 254 else: 255
256 - def __iter__(self):
257 return iter(self.hits)
258 259 @property
260 - def hits(self):
261 """Hit objects contained in the QueryResult.""" 262 return list(self._items.values())
263 264 @property
265 - def hit_keys(self):
266 """Hit IDs of the Hit objects contained in the QueryResult.""" 267 return list(self._items.keys())
268 269 @property
270 - def items(self):
271 """List of tuples of Hit IDs and Hit objects.""" 272 return list(self._items.items())
273
274 - def iterhits(self):
275 """Returns an iterator over the Hit objects.""" 276 for hit in self._items.values(): 277 yield hit
278
279 - def iterhit_keys(self):
280 """Returns an iterator over the ID of the Hit objects.""" 281 for hit_id in self._items: 282 yield hit_id
283
284 - def iteritems(self):
285 """Returns an iterator yielding tuples of Hit ID and Hit objects.""" 286 for item in self._items.items(): 287 yield item
288
289 - def __contains__(self, hit_key):
290 if isinstance(hit_key, Hit): 291 return self._hit_key_function(hit_key) in self._items 292 return hit_key in self._items or hit_key in self.__alt_hit_ids
293
294 - def __len__(self):
295 return len(self._items)
296 297 # Python 3:
298 - def __bool__(self):
299 return bool(self._items)
300 301 # Python 2: 302 __nonzero__= __bool__ 303
304 - def __repr__(self):
305 return "QueryResult(id=%r, %r hits)" % (self.id, len(self))
306
307 - def __str__(self):
308 lines = [] 309 310 # set program and version line 311 lines.append('Program: %s (%s)' % (self.program, self.version)) 312 313 # set query id line 314 qid_line = ' Query: %s' % self.id 315 if hasattr(self, 'seq_len'): 316 qid_line += ' (%i)' % self.seq_len 317 if self.description: 318 qid_line += trim_str('\n %s' % self.description, 80, '...') 319 lines.append(qid_line) 320 321 # set target line 322 lines.append(' Target: %s' % self.target) 323 324 # set hit lines 325 if not self.hits: 326 lines.append(' Hits: 0') 327 else: 328 lines.append(' Hits: %s %s %s' % ('-'*4, '-'*5, '-'*58)) 329 pattern = '%13s %5s %s' 330 lines.append(pattern % ('#', '# HSP', 'ID + description')) 331 lines.append(pattern % ('-'*4, '-'*5, '-'*58)) 332 for idx, hit in enumerate(self.hits): 333 if idx < 30: 334 hid_line = '%s %s' % (hit.id, hit.description) 335 if len(hid_line) > 58: 336 hid_line = hid_line[:55] + '...' 337 lines.append(pattern % (idx, str(len(hit)), hid_line)) 338 elif idx > len(self.hits) - 4: 339 hid_line = '%s %s' % (hit.id, hit.description) 340 if len(hid_line) > 58: 341 hid_line = hid_line[:55] + '...' 342 lines.append(pattern % (idx, str(len(hit)), hid_line)) 343 elif idx == 30: 344 lines.append('%14s' % '~~~') 345 346 return '\n'.join(lines)
347
348 - def __getitem__(self, hit_key):
349 # retrieval using slice objects returns another QueryResult object 350 if isinstance(hit_key, slice): 351 # should we return just a list of Hits instead of a full blown 352 # QueryResult object if it's a slice? 353 hits = list(self.hits)[hit_key] 354 obj = self.__class__(hits, self.id, self._hit_key_function) 355 self._transfer_attrs(obj) 356 return obj 357 358 # if key is an int, then retrieve the Hit at the int index 359 elif isinstance(hit_key, int): 360 length = len(self) 361 if 0 <= hit_key < length: 362 for idx, item in enumerate(self.iterhits()): 363 if idx == hit_key: 364 return item 365 elif -1 * length <= hit_key < 0: 366 for idx, item in enumerate(self.iterhits()): 367 if length + hit_key == idx: 368 return item 369 raise IndexError("list index out of range") 370 371 # if key is a string, then do a regular dictionary retrieval 372 # falling back on alternative hit IDs 373 try: 374 return self._items[hit_key] 375 except KeyError: 376 return self._items[self.__alt_hit_ids[hit_key]]
377
378 - def __setitem__(self, hit_key, hit):
379 # only accept string keys 380 if not isinstance(hit_key, basestring): 381 raise TypeError("QueryResult object keys must be a string.") 382 # hit must be a Hit object 383 if not isinstance(hit, Hit): 384 raise TypeError("QueryResult objects can only contain Hit objects.") 385 qid = self.id 386 hqid = hit.query_id 387 # and it must have the same query ID as this object's ID 388 # unless it's the query ID is None (default for empty objects), in which 389 # case we want to use the hit's query ID as the query ID 390 if qid is not None: 391 if hqid != qid: 392 raise ValueError("Expected Hit with query ID %r, found %r " 393 "instead." % (qid, hqid)) 394 else: 395 self.id = hqid 396 # same thing with descriptions 397 qdesc = self.description 398 hqdesc = hit.query_description 399 if qdesc is not None: 400 if hqdesc != qdesc: 401 raise ValueError("Expected Hit with query description %r, " 402 "found %r instead." % (qdesc, hqdesc)) 403 else: 404 self.description = hqdesc 405 406 # remove existing alt_id references, if hit_key already exists 407 if hit_key in self._items: 408 for alt_key in self._items[hit_key].id_all[1:]: 409 del self.__alt_hit_ids[alt_key] 410 411 # if hit_key is already present as an alternative ID 412 # delete it from the alternative ID dict 413 if hit_key in self.__alt_hit_ids: 414 del self.__alt_hit_ids[hit_key] 415 416 self._items[hit_key] = hit 417 for alt_id in hit.id_all[1:]: 418 self.__alt_hit_ids[alt_id] = hit_key
419
420 - def __delitem__(self, hit_key):
421 # if hit_key an integer or slice, get the corresponding key first 422 # and put it into a list 423 if isinstance(hit_key, int): 424 hit_keys = [list(self.hit_keys)[hit_key]] 425 # the same, if it's a slice 426 elif isinstance(hit_key, slice): 427 hit_keys = list(self.hit_keys)[hit_key] 428 # otherwise put it in a list 429 else: 430 hit_keys = [hit_key] 431 432 for key in hit_keys: 433 deleted = False 434 if key in self._items: 435 del self._items[key] 436 deleted = True 437 if key in self.__alt_hit_ids: 438 del self._items[self.__alt_hit_ids[key]] 439 del self.__alt_hit_ids[key] 440 deleted = True 441 if not deleted: 442 raise KeyError('%r'.format(key)) 443 return
444 445 # properties # 446 id = optionalcascade('_id', 'query_id', """QueryResult ID string""") 447 description = optionalcascade('_description', 'query_description', 448 """QueryResult description""") 449 450 @property
451 - def hsps(self):
452 """HSP objects contained in the QueryResult.""" 453 return [hsp for hsp in chain(*self.hits)]
454 455 @property
456 - def fragments(self):
457 """HSPFragment objects contained in the QueryResult.""" 458 return [frag for frag in chain(*self.hsps)]
459 460 # public methods #
461 - def absorb(self, hit):
462 """Adds a Hit object to the end of QueryResult. If the QueryResult 463 already has a Hit with the same ID, append the new Hit's HSPs into 464 the existing Hit. 465 466 :param hit: object to absorb 467 :type hit: Hit 468 469 This method is used for file formats that may output the same Hit in 470 separate places, such as BLAT or Exonerate. In both formats, Hit 471 with different strands are put in different places. However, SearchIO 472 considers them to be the same as a Hit object should be all database 473 entries with the same ID, regardless of strand orientation. 474 475 """ 476 try: 477 self.append(hit) 478 except ValueError: 479 assert hit.id in self 480 for hsp in hit: 481 self[hit.id].append(hsp)
482
483 - def append(self, hit):
484 """Adds a Hit object to the end of QueryResult. 485 486 :param hit: object to append 487 :type hit: Hit 488 489 Any Hit object appended must have the same ``query_id`` property as the 490 QueryResult's ``id`` property. If the hit key already exists, a 491 ``ValueError`` will be raised. 492 493 """ 494 # if a custom hit_key_function is supplied, use it to define th hit key 495 if self._hit_key_function is not None: 496 hit_key = self._hit_key_function(hit) 497 else: 498 hit_key = hit.id 499 500 if hit_key not in self and all([pid not in self for pid in hit.id_all[1:]]): 501 self[hit_key] = hit 502 else: 503 raise ValueError("The ID or alternative IDs of Hit %r exists in " 504 "this QueryResult." % hit_key)
505
506 - def hit_filter(self, func=None):
507 """Creates a new QueryResult object whose Hit objects pass the filter 508 function. 509 510 :param func: filter function 511 :type func: callable, accepts Hit, returns bool 512 513 Here is an example of using ``hit_filter`` to select Hits whose 514 description begins with the string 'Homo sapiens', case sensitive:: 515 516 >>> from Bio import SearchIO 517 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 518 >>> def desc_filter(hit): 519 ... return hit.description.startswith('Homo sapiens') 520 ... 521 >>> len(qresult) 522 100 523 >>> filtered = qresult.hit_filter(desc_filter) 524 >>> len(filtered) 525 39 526 >>> print(filtered[:4]) 527 Program: blastn (2.2.27+) 528 Query: 33211 (61) 529 mir_1 530 Target: refseq_rna 531 Hits: ---- ----- ---------------------------------------------------------- 532 # # HSP ID + description 533 ---- ----- ---------------------------------------------------------- 534 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 535 1 2 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 536 2 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 537 3 1 gi|262205451|ref|NR_030222.1| Homo sapiens microRNA 51... 538 539 Note that instance attributes (other than the hits) from the unfiltered 540 QueryResult are retained in the filtered object. 541 542 >>> qresult.program == filtered.program 543 True 544 >>> qresult.target == filtered.target 545 True 546 547 """ 548 hits = list(filter(func, self.hits)) 549 obj = self.__class__(hits, self.id, self._hit_key_function) 550 self._transfer_attrs(obj) 551 return obj
552
553 - def hit_map(self, func=None):
554 """Creates a new QueryResult object, mapping the given function to its 555 Hits. 556 557 :param func: map function 558 :type func: callable, accepts Hit, returns Hit 559 560 Here is an example of using ``hit_map`` with a function that discards all 561 HSPs in a Hit except for the first one:: 562 563 >>> from Bio import SearchIO 564 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 565 >>> print(qresult[:8]) 566 Program: blastn (2.2.27+) 567 Query: 33211 (61) 568 mir_1 569 Target: refseq_rna 570 Hits: ---- ----- ---------------------------------------------------------- 571 # # HSP ID + description 572 ---- ----- ---------------------------------------------------------- 573 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 574 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 575 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 576 3 2 gi|301171322|ref|NR_035857.1| Pan troglodytes microRNA... 577 4 1 gi|301171267|ref|NR_035851.1| Pan troglodytes microRNA... 578 5 2 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 579 6 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 580 7 1 gi|301171259|ref|NR_035850.1| Pan troglodytes microRNA... 581 582 >>> top_hsp = lambda hit: hit[:1] 583 >>> mapped_qresult = qresult.hit_map(top_hsp) 584 >>> print(mapped_qresult[:8]) 585 Program: blastn (2.2.27+) 586 Query: 33211 (61) 587 mir_1 588 Target: refseq_rna 589 Hits: ---- ----- ---------------------------------------------------------- 590 # # HSP ID + description 591 ---- ----- ---------------------------------------------------------- 592 0 1 gi|262205317|ref|NR_030195.1| Homo sapiens microRNA 52... 593 1 1 gi|301171311|ref|NR_035856.1| Pan troglodytes microRNA... 594 2 1 gi|270133242|ref|NR_032573.1| Macaca mulatta microRNA ... 595 3 1 gi|301171322|ref|NR_035857.1| Pan troglodytes microRNA... 596 4 1 gi|301171267|ref|NR_035851.1| Pan troglodytes microRNA... 597 5 1 gi|262205330|ref|NR_030198.1| Homo sapiens microRNA 52... 598 6 1 gi|262205302|ref|NR_030191.1| Homo sapiens microRNA 51... 599 7 1 gi|301171259|ref|NR_035850.1| Pan troglodytes microRNA... 600 601 """ 602 hits = [deepcopy(hit) for hit in self.hits] 603 if func is not None: 604 hits = [func(x) for x in hits] 605 obj = self.__class__(hits, self.id, self._hit_key_function) 606 self._transfer_attrs(obj) 607 return obj
608
609 - def hsp_filter(self, func=None):
610 """Creates a new QueryResult object whose HSP objects pass the filter 611 function. 612 613 ``hsp_filter`` is the same as ``hit_filter``, except that it filters 614 directly on each HSP object in every Hit. If the filtering removes 615 all HSP objects in a given Hit, the entire Hit will be discarded. This 616 will result in the QueryResult having less Hit after filtering. 617 618 """ 619 hits = [x for x in (hit.filter(func) for hit in self.hits) if x] 620 obj = self.__class__(hits, self.id, self._hit_key_function) 621 self._transfer_attrs(obj) 622 return obj
623
624 - def hsp_map(self, func=None):
625 """Creates a new QueryResult object, mapping the given function to its 626 HSPs. 627 628 ``hsp_map`` is the same as ``hit_map``, except that it applies the given 629 function to all HSP objects in every Hit, instead of the Hit objects. 630 631 """ 632 hits = [x for x in (hit.map(func) for hit in list(self.hits)[:]) if x] 633 obj = self.__class__(hits, self.id, self._hit_key_function) 634 self._transfer_attrs(obj) 635 return obj
636 637 # marker for default self.pop() return value 638 # this method is adapted from Python's built in OrderedDict.pop 639 # implementation 640 __marker = object() 641
642 - def pop(self, hit_key=-1, default=__marker):
643 """Removes the specified hit key and return the Hit object. 644 645 :param hit_key: key of the Hit object to return 646 :type hit_key: int or string 647 :param default: return value if no Hit exists with the given key 648 :type default: object 649 650 By default, ``pop`` will remove and return the last Hit object in the 651 QueryResult object. To remove specific Hit objects, you can use its 652 integer index or hit key. 653 654 >>> from Bio import SearchIO 655 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 656 >>> len(qresult) 657 100 658 >>> for hit in qresult[:5]: 659 ... print(hit.id) 660 ... 661 gi|262205317|ref|NR_030195.1| 662 gi|301171311|ref|NR_035856.1| 663 gi|270133242|ref|NR_032573.1| 664 gi|301171322|ref|NR_035857.1| 665 gi|301171267|ref|NR_035851.1| 666 667 # remove the last hit 668 >>> qresult.pop() 669 Hit(id='gi|397513516|ref|XM_003827011.1|', query_id='33211', 1 hsps) 670 671 # remove the first hit 672 >>> qresult.pop(0) 673 Hit(id='gi|262205317|ref|NR_030195.1|', query_id='33211', 1 hsps) 674 675 # remove hit with the given ID 676 >>> qresult.pop('gi|301171322|ref|NR_035857.1|') 677 Hit(id='gi|301171322|ref|NR_035857.1|', query_id='33211', 2 hsps) 678 679 """ 680 # if key is an integer (index) 681 # get the ID for the Hit object at that index 682 if isinstance(hit_key, int): 683 # raise the appropriate error if there is no hit 684 if not self: 685 raise IndexError("pop from empty list") 686 hit_key = list(self.hit_keys)[hit_key] 687 688 try: 689 hit = self._items.pop(hit_key) 690 # remove all alternative IDs of the popped hit 691 for alt_id in hit.id_all[1:]: 692 try: 693 del self.__alt_hit_ids[alt_id] 694 except KeyError: 695 pass 696 return hit 697 except KeyError: 698 if hit_key in self.__alt_hit_ids: 699 return self.pop(self.__alt_hit_ids[hit_key], default) 700 # if key doesn't exist and no default is set, raise a KeyError 701 if default is self.__marker: 702 raise KeyError(hit_key) 703 # if key doesn't exist but a default is set, return the default value 704 return default
705
706 - def index(self, hit_key):
707 """Returns the index of a given hit key, zero-based. 708 709 :param hit_key: hit ID 710 :type hit_key: string 711 712 This method is useful for finding out the integer index (usually 713 correlated with search rank) of a given hit key. 714 715 >>> from Bio import SearchIO 716 >>> qresult = next(SearchIO.parse('Blast/mirna.xml', 'blast-xml')) 717 >>> qresult.index('gi|301171259|ref|NR_035850.1|') 718 7 719 720 """ 721 if isinstance(hit_key, Hit): 722 return list(self.hit_keys).index(hit_key.id) 723 try: 724 return list(self.hit_keys).index(hit_key) 725 except ValueError: 726 if hit_key in self.__alt_hit_ids: 727 return self.index(self.__alt_hit_ids[hit_key]) 728 raise
729
730 - def sort(self, key=None, reverse=False, in_place=True):
731 # no cmp argument to make sort more Python 3-like 732 """Sorts the Hit objects. 733 734 :param key: sorting function 735 :type key: callable, accepts Hit, returns key for sorting 736 :param reverse: whether to reverse sorting results or no 737 :type reverse: bool 738 :param in_place: whether to do in-place sorting or no 739 :type in_place: bool 740 741 ``sort`` defaults to sorting in-place, to mimick Python's ``list.sort`` 742 method. If you set the ``in_place`` argument to False, it will treat 743 return a new, sorted QueryResult object and keep the initial one 744 unsorted. 745 746 """ 747 if key is None: 748 # if reverse is True, reverse the hits 749 if reverse: 750 sorted_hits = list(self.hits)[::-1] 751 # otherwise (default options) make a copy of the hits 752 else: 753 sorted_hits = list(self.hits)[:] 754 else: 755 sorted_hits = sorted(self.hits, key=key, reverse=reverse) 756 757 # if sorting is in-place, don't create a new QueryResult object 758 if in_place: 759 new_hits = OrderedDict() 760 for hit in sorted_hits: 761 new_hits[self._hit_key_function(hit)] = hit 762 self._items = new_hits 763 # otherwise, return a new sorted QueryResult object 764 else: 765 obj = self.__class__(sorted_hits, self.id, self._hit_key_function) 766 self._transfer_attrs(obj) 767 return obj
768 769 770 # if not used as a module, run the doctest 771 if __name__ == "__main__": 772 from Bio._utils import run_doctest 773 run_doctest() 774