This is the developer portal of eQTNMiner. Here you will find all the material needed to understand how eQTNMiner works and how to improve or debug it. If you have an account on SourceForge:


eQTNMiner package

The package uses the GNU Autotools (automake and autoconf) to generate machine-specific Makefiles and to check that the dependencies required by the package are satisfied. Therefore, to incorporate a new program, a developer only has to edit the file to add the relevant information.

The package organization is simple and follows a one-file-per-program rule, although some complex programs related to the hierarchical model are split into multiple files to keep the amount of code per file reasonable. Note that all the program files include the following two files:

In most cases, adding a new program consists of creating a file eqmr-xxx.c (replace "xxx" with the keyword defining your new analysis), using eqmr-db.c and eqmr-fcr.c as models, for instance. If the new analysis has several steps, it is clearer to write one function per step and to keep a single program for the whole analysis. Several useful data structures already exist, for example in the igdb module, but it is usually necessary to add new ones tailored to the new analysis. In that case, you will have to look at the GDL more closely (see below).
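
The "one function per step, one program for the whole analysis" layout described above can be sketched as follows. This is only an illustration under stated assumptions: the step functions, the status codes and the dispatcher run_step() are hypothetical names, and no GDL function is called because the GDL API is not described here. A real eqmr-xxx.c would parse its command-line options in main() and then call such a dispatcher.

```c
#include <stdio.h>

/* Hypothetical status codes for this sketch; the real programs use
   GDL status codes such as GDL_FAILURE, whose values are not assumed here. */
enum { EQMR_XXX_SUCCESS = 0, EQMR_XXX_FAILURE = 1 };

/* Step 1 (hypothetical): prepare the data needed by the analysis. */
static int
prepare_input_data (const char *input_file)
{
  if (input_file == NULL)
  {
    fprintf (stderr, "Error: no input file given\n");
    return EQMR_XXX_FAILURE;
  }
  printf ("step 1: preparing data from %s\n", input_file);
  return EQMR_XXX_SUCCESS;
}

/* Step 2 (hypothetical): run the analysis on the prepared data. */
static int
run_analysis (const char *input_file)
{
  if (input_file == NULL)
  {
    fprintf (stderr, "Error: no input file given\n");
    return EQMR_XXX_FAILURE;
  }
  printf ("step 2: analyzing data from %s\n", input_file);
  return EQMR_XXX_SUCCESS;
}

/* Dispatch on the step number: one function per step, one program overall. */
int
run_step (int step, const char *input_file)
{
  switch (step)
  {
    case 1:
      return prepare_input_data (input_file);
    case 2:
      return run_analysis (input_file);
    default:
      fprintf (stderr, "Error: unknown step %d\n", step);
      return EQMR_XXX_FAILURE;
  }
}
```

Keeping each step in its own function means a new step only adds one function and one case to the dispatcher, while users still deal with a single program.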

Genetic Data analysis Library

Each program in the eQTNMiner package uses a C dynamic library called the Genetic Data analysis Library (GDL). This library contains and uses some code from the GNU Scientific Library (GSL), but also implements various functions and data structures for handling and analyzing both genetic and genomic datasets.

The high-level GDL modules used by eQTNMiner are the following:

Of course, programs in eQTNMiner make use of other low-level GDL modules, especially when dealing with strings and hash tables. For details on these modules, please refer to the GDL homepage.

Coding conventions

In terms of indentation, the main guideline is to follow the so-called Allman style. If you develop under Emacs, you can add the following lines to the file $HOME/.emacs:

(setq c-default-style "bsd"
          c-basic-offset 2
          tab-width 2
          indent-tabs-mode t)

More importantly, to increase the chance of your code being reused or improved by others, always prefer explicit names. For variables, use ploidy instead of P, for instance, and for functions, prefer get_approx_bayes_factor() to a mere abf(). This also saves time: as long as the code is self-explanatory, it needs little additional documentation.
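
To make the naming guideline concrete, here is a small contrast between a terse and an explicit version of the same function. Both functions are hypothetical and written only for this illustration; they are not part of eQTNMiner or the GDL.

```c
/* Terse version: the reader has to guess what "na", "N" and "P" mean. */
unsigned long
na (unsigned long N, unsigned int P)
{
  return N * P;
}

/* Explicit version: same computation, but the names document the intent. */
unsigned long
get_number_of_allele_copies (unsigned long number_of_individuals,
                             unsigned int ploidy)
{
  return number_of_individuals * ploidy;
}
```

A call such as get_number_of_allele_copies (number_of_individuals, ploidy) can be read without opening the function body, which is the point of the guideline.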

To document the code, we are starting to use Doxygen, as in this example:

/** \brief Brief description.
 *         Brief description continued.
 *  Detailed description starts here.
 * \param filename name of the input file.
 * \return n number of lines in the input file.
 */
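
As a complete illustration, the sketch below attaches a Doxygen comment of this kind to a function written in the Allman style with an explicit name. The function count_lines_in_file() is a hypothetical example written for this page, not an existing GDL or eQTNMiner function, and the convention of returning -1 on failure is an assumption of the sketch.

```c
#include <stdio.h>

/** \brief Count the lines of a text file.
 *
 *  One line is counted per newline character, plus one more if the
 *  file ends with a non-empty line that lacks a trailing newline.
 * \param filename name of the input file.
 * \return number of lines in the input file, or -1 if it cannot be opened.
 */
long
count_lines_in_file (const char *filename)
{
  FILE *stream = fopen (filename, "r");
  long number_of_lines = 0;
  int current_char;
  int previous_char = '\n';

  if (stream == NULL)
    return -1;

  while ((current_char = fgetc (stream)) != EOF)
  {
    if (current_char == '\n')
      number_of_lines++;
    previous_char = current_char;
  }

  /* Count a last line that is not terminated by a newline. */
  if (previous_char != '\n')
    number_of_lines++;

  fclose (stream);
  return number_of_lines;
}
```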

Working with CVS

All the code from the eQTNMiner package (i.e. the programs and the library) is versioned with CVS. If you have write permission (i.e. the right to commit changes), you need to give your SSH passphrase at each CVS command. You will certainly find it easier to avoid this by asking your computer to remember your passphrase, with the following commands:

 $ exec ssh-agent /bin/bash
 $ ssh-add ~/.ssh/id_rsa

Then add the following line to the file $HOME/.bash_profile (replace XXX with your SourceForge login):


Moreover, as several people may develop simultaneously, it is recommended to check often whether your own code is up to date. The following command lists the local files that differ from the central repository (it does not update these files, it only lists them):

 $ cvs -q -n update
 M eqtnminer/eqmr-meta.c
 M gdl/igdb/reg.c

In this case, the current developer modified two files. To see the differences, one can issue the following command:

 $ cvs diff -r HEAD eqtnminer/eqmr-meta.c
 Index: eqtnminer/eqmr-meta.c
 RCS file: /cvsroot/eqtnminer/eqtnminer/eqmr-meta.c,v
 retrieving revision 1.10
 diff -r1.10 eqmr-meta.c
 >   if (STEP == 3 && gdl_file_exists(GRID) == 0)
 >   {
 >     fprintf(stderr, "Error: grid file %s doesn't exist (-g)\n\n", GRID);
 >     status = GDL_FAILURE;
 >   }

To commit them, one can then run this:

 $ cvs commit -m "check grid file is present for step 3"

Table of file dependencies with respect to each program in the eQTNMiner package:

Main source file Has source dependency
Eqmr-db Eqmr-db.c
Eqmr-dbget Eqmr-dbget.c
Eqmr-fcr Eqmr-fcr.c
Eqmr-fcrout Eqmr-fcrout.c
Eqmr-fdb Eqmr-fdb.c
Eqmr-hmdb3 Eqmr-hmdb3.c
Eqmr-prbannot Eqmr-prbannot.c
HapMapUtil HMUgeno.c