Read indexing

Nicolas Philippe; Mikael Salson; Thierry Lecroq; Martine Leonard; Therese Commes; Eric Rivals

doi:10.14806/ej.17.B.289

Abstract

http://www.lirmm.fr/~rivals

The question of read indexing remains largely unexplored. However, increasing sequencing throughput calls for new algorithmic solutions to query large read collections efficiently. We propose a solution, named Gk arrays, to index large collections of reads, together with an algorithm to build the structure and procedures to query it. Once constructed, the index is kept in main memory and accessed repeatedly to answer various types of queries. We compare our data structure with other possible solutions to investigate its scalability and computational efficiency. Gk arrays are implemented in a general-purpose library, which may prove useful for assembly, for evaluating expression levels in RNA-seq, and for other high-throughput sequencing applications.
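To make the notion of querying an in-memory read index concrete, the following minimal sketch builds a plain hash-based k-mer index over a toy read collection. It is not the Gk arrays structure and does not use the Gk arrays library. It is only a naive baseline of the kind such a structure could be compared against. The function names, the toy reads, and the choice of k are all assumptions made for illustration.

```python
from collections import defaultdict


def build_kmer_index(reads, k):
    """Map every k-mer to the list of (read_id, offset) pairs where it occurs.

    Naive hash-based baseline (an illustrative assumption), not Gk arrays.
    """
    index = defaultdict(list)
    for read_id, read in enumerate(reads):
        # Slide a window of length k over the read.
        for offset in range(len(read) - k + 1):
            index[read[offset:offset + k]].append((read_id, offset))
    return index


def reads_containing(index, kmer):
    """Return the sorted ids of reads that contain the k-mer at least once."""
    return sorted({read_id for read_id, _ in index.get(kmer, [])})


def occurrence_count(index, kmer):
    """Return the total number of occurrences of the k-mer over all reads."""
    return len(index.get(kmer, []))


# Toy read collection (hypothetical data).
reads = ["ACGTAC", "GTACGT", "TTTTTT"]
index = build_kmer_index(reads, k=3)

print(reads_containing(index, "GTA"))
print(occurrence_count(index, "TTT"))
```

The index is built once and then queried repeatedly, which mirrors the usage pattern described above. A hash table of this kind stores every k-mer string explicitly, so its memory footprint grows quickly with the number of reads; that cost is the sort of limitation a dedicated read index aims to reduce.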

References
1. Querying large read collections in main memory: a versatile data structure. N. Philippe, M. Salson, T. Lecroq, M. Leonard, T. Commes and E. Rivals. BMC Bioinformatics, Vol. 12, article 242, doi:10.1186/1471-2105-12-242, 2011.

Relevant Web sites
2. http://crac.gforge.inria.fr/gkarrays/
3. http://www.atgc-montpellier.fr/ngs/

Keywords

next generation sequencing; COST; read indexing
