Algo Seminar - Rayan Chikhi
22 Nov. 2016, 14:30
Rayan Chikhi

Efficient analysis of large DNA sequencing datasets

Seminar room (4B05R) - Bâtiment Copernic


This talk presents a recent algorithm for processing large DNA sequencing datasets. More specifically, the algorithm constructs a de Bruijn graph, a data structure widely used in bioinformatics. The graph consists of billions of vertices, each of which is a short DNA string (a k-mer) taken from the dataset. Compacting such a graph is a simple but important data reduction step, in which each long simple (non-branching) path is merged into a single vertex. Constructing the compacted graph has recently become the bottleneck in many software pipelines, so reducing its running time and memory usage is an important problem.
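
To make the compaction step concrete, here is a minimal in-memory sketch in Python. It is not the algorithm presented in the talk: it assumes the whole k-mer set fits in memory, ignores reverse complements, and uses illustrative function names (kmer_set, compact, and the helpers) that do not come from any existing tool.

```python
def kmer_set(reads, k):
    """Vertices of the graph: every distinct length-k substring of the reads."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}

def successors(u, nodes):
    # Edge u -> v exists when the last k-1 letters of u are the first k-1 of v.
    return [u[1:] + c for c in "ACGT" if u[1:] + c in nodes]

def predecessors(u, nodes):
    return [c + u[:-1] for c in "ACGT" if c + u[:-1] in nodes]

def next_on_simple_path(u, nodes):
    """Return v if u -> v is the only edge out of u and the only edge into v."""
    succ = successors(u, nodes)
    if len(succ) == 1 and succ[0] != u and len(predecessors(succ[0], nodes)) == 1:
        return succ[0]
    return None

def prev_on_simple_path(u, nodes):
    """Mirror image of next_on_simple_path."""
    pred = predecessors(u, nodes)
    if len(pred) == 1 and pred[0] != u and len(successors(pred[0], nodes)) == 1:
        return pred[0]
    return None

def compact(nodes):
    """Merge each maximal non-branching path into one string."""
    visited, merged = set(), []

    def walk(start):
        path, cur = start, start
        visited.add(start)
        while True:
            nxt = next_on_simple_path(cur, nodes)
            if nxt is None or nxt in visited:
                return path
            path += nxt[-1]  # consecutive k-mers overlap by k-1 letters
            visited.add(nxt)
            cur = nxt

    # First pass: start only at vertices that cannot be extended to the left.
    for u in sorted(nodes):
        if u not in visited and prev_on_simple_path(u, nodes) is None:
            merged.append(walk(u))
    # Second pass: whatever remains lies on isolated cycles.
    for u in sorted(nodes):
        if u not in visited:
            merged.append(walk(u))
    return merged

reads = ["ACGTT", "ACGTA"]
print(compact(kmer_set(reads, 3)))  # ['ACGT', 'GTA', 'GTT']
```

In this toy example the four 3-mers ACG, CGT, GTT and GTA become three vertices, because ACG and CGT lie on a simple path that ends at the branch after CGT. On real datasets with billions of k-mers, such a naive in-memory traversal is exactly what becomes too slow and too memory-hungry.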

