PopGen dev

From Biopython

Latest revision as of 10:40, 24 March 2010

Development page for the PopGen module.

Introduction

The PopGen module contains modules to handle population genetics data, applications and algorithms.

History and philosophy

Most of the existing Bio.PopGen features cover non-core population genetics functionality. This was seen as a feature (and not as a bug): it let us start building the module with functionality where crass beginner errors would not have dramatic consequences. Now, with the experience accumulated, it is possible and desirable to concentrate on core population genetics functionality (i.e., statistics).

It is also worth noting that we wrap existing functionality whenever possible. For instance, we don't provide our own coalescent simulator; instead we provide wrappers to an existing one that is established and widely used (SIMCOAL2).
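
As an illustration of the wrapping approach, here is a minimal sketch of a thin wrapper around an external command-line program. The class name ExternalToolWrapper, its parameters and the stand-in command are hypothetical and do not reflect the actual Bio.PopGen wrapper for SIMCOAL2; the Python interpreter itself plays the role of the external program so that the sketch runs anywhere.

```python
import subprocess
import sys


class ExternalToolWrapper:
    """Hypothetical thin wrapper around an external command-line program."""

    def __init__(self, executable):
        # Path to the external binary (an assumption; e.g. a simulator).
        self.executable = executable

    def run(self, args, timeout=60):
        """Run the program with the given arguments and return its stdout."""
        completed = subprocess.run(
            [self.executable, *args],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        return completed.stdout


# Stand-in for a real external tool: the Python interpreter itself.
wrapper = ExternalToolWrapper(sys.executable)
output = wrapper.run(["-c", "print('simulated output')"])
```

The point of the design is that the wrapper only builds the command line and collects the output; the simulation logic stays in the established external program.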

Future Goals

The fundamental goal is to have support for "classic" population genetics operations (statistics). This should be provided in an extensible, easy-to-use and future-proof framework. Code exists (see below on how to find it), but will probably be refactored. Below there is also a wish list where you can add your desired features.
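
To make "classic statistics" concrete, the following is a minimal sketch of one such statistic, expected heterozygosity (He) at a single locus. The function name, its signature and the sample data are assumptions made for illustration; this is not the Bio.PopGen implementation.

```python
from collections import Counter


def expected_heterozygosity(alleles, unbiased=True):
    """Expected heterozygosity (gene diversity) at one locus.

    `alleles` is a flat sequence of allele labels, one per gene copy
    (so a diploid sample of N individuals contributes 2N entries).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two gene copies")
    counts = Counter(alleles)
    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    he = 1.0 - homozygosity
    if unbiased:
        # The usual small-sample correction, n / (n - 1).
        he *= n / (n - 1)
    return he


# Made-up sample of eight gene copies, purely for illustration.
sample = ["A", "A", "A", "B", "B", "C", "A", "B"]
biased = expected_heterozygosity(sample, unbiased=False)
corrected = expected_heterozygosity(sample)
```

Working on a flat list of gene copies keeps the sketch independent of any particular file format or parser.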

In the pipeline

Currently (i.e., for the near term) the following new functionality can be expected:

  • STRUCTURE support
  • LDNe support
  • Statistics (see the PopGen_dev_Statistics page)

Code and contributing

The official production code is available on CVS.

If you would like to contribute, we suggest the following:

  1. Please have a look at the General Biopython contribution guidelines (the Contributing wiki page).
  2. Join us on the biopython-dev mailing list and tell us about your ideas, so that we know who is working on what and can discuss the viability of including your contribution in the official release.
  3. Current development of Bio.PopGen takes place on GitHub. For Biopython's introduction to Git, see the GitUsage wiki page. Most probably you will want to fork from the main development line at http://github.com/tiagoantao/biopython-popgen-test/tree/master (I don't like this being associated with my personal account; any suggestions?)
  4. You are completely free to work on your own branch (but if you want your changes to go into the official distribution, don't forget to go to biopython-dev and discuss what you are doing).
  5. When you feel your contribution is ready and you would like to propose it for the official distribution, your branch will have to be merged with the main development one. Contact the mailing list for help with doing this. You are expected to have production-quality code (this includes unit tests and documentation). If you have doubts about unit testing or producing documentation, don't hesitate to contact the mailing list.
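
As a rough idea of what "unit tests and documentation" can look like for a small contribution, here is a minimal sketch using Python's standard unittest module. The function allele_frequencies and the test class are hypothetical examples, not existing Bio.PopGen code, and the project's own test layout may differ.

```python
import unittest


def allele_frequencies(alleles):
    """Return a dict mapping each allele label to its relative frequency.

    Hypothetical helper, used only to illustrate a documented, tested function.
    """
    if not alleles:
        raise ValueError("empty allele list")
    total = len(alleles)
    counts = {}
    for allele in alleles:
        counts[allele] = counts.get(allele, 0) + 1
    return {allele: count / total for allele, count in counts.items()}


class AlleleFrequenciesTest(unittest.TestCase):
    def test_frequencies_sum_to_one(self):
        freqs = allele_frequencies(["A", "A", "B", "C"])
        self.assertAlmostEqual(sum(freqs.values()), 1.0)

    def test_known_values(self):
        freqs = allele_frequencies(["A", "A", "B", "C"])
        self.assertEqual(freqs, {"A": 0.5, "B": 0.25, "C": 0.25})

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            allele_frequencies([])


# Run the tests programmatically so the sketch works in any context.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AlleleFrequenciesTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```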

Existing development branches

While the fundamental branch for development is http://github.com/tiagoantao/biopython-popgen-test/tree/master (this is the real starting point if you want to develop new functionality), we would like to keep track of who is working on what, to avoid overlap and allow for coordination.

Existing development branches are documented here. These branches are informal places where developers are creating new functionality, correcting bugs, etc. Feel free to add yours (or fork from existing ones). If you are interested in any of them, contact the author directly or go to the mailing list.


{| class="wikitable"
|-
! Purpose
! URL
! Who
|-
| Statistics (He, Fst, Tajima's D, ...)
| http://github.com/tiagoantao/biopython/tree/stats
| Tiago Antao
|-
| Genepop (parser and application)
| http://github.com/tiagoantao/biopython/tree/genepop
| Tiago Antao
|}

Wish list

  • support for a binary format: for example HDF5 (http://www.pytables.org) or snpfile (http://lists.open-bio.org/pipermail/biopython/2008-December/004830.html)
  • support for databases: analyses are often carried out on a large scale, so it is common to store the data in a database
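
To give the database wish a concrete shape, here is a minimal sketch that stores diploid genotypes in SQLite (via Python's standard sqlite3 module) and counts alleles per population. The table layout, the column names, the helper allele_counts and the data are all assumptions made for illustration; nothing like this exists in Bio.PopGen.

```python
import sqlite3

# In-memory database for the sketch; a real project would use a file or server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE genotype (
        individual TEXT NOT NULL,
        population TEXT NOT NULL,
        locus      TEXT NOT NULL,
        allele_1   TEXT NOT NULL,
        allele_2   TEXT NOT NULL
    )
    """
)

# Made-up diploid genotypes, purely for illustration.
rows = [
    ("ind1", "pop1", "locusA", "A", "A"),
    ("ind2", "pop1", "locusA", "A", "B"),
    ("ind3", "pop2", "locusA", "B", "B"),
    ("ind4", "pop2", "locusA", "A", "B"),
]
conn.executemany("INSERT INTO genotype VALUES (?, ?, ?, ?, ?)", rows)


def allele_counts(connection, population, locus):
    """Count alleles at one locus in one population, pooling both gene copies."""
    query = """
        SELECT allele, COUNT(*) FROM (
            SELECT allele_1 AS allele FROM genotype
            WHERE population = ? AND locus = ?
            UNION ALL
            SELECT allele_2 AS allele FROM genotype
            WHERE population = ? AND locus = ?
        )
        GROUP BY allele
    """
    params = (population, locus, population, locus)
    return dict(connection.execute(query, params).fetchall())


pop1_counts = allele_counts(conn, "pop1", "locusA")
pop2_counts = allele_counts(conn, "pop2", "locusA")
```

Letting the database do the counting means only the small per-population summary has to be loaded into memory, which is the main attraction for large data sets.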