Abstract

Background: De Bruijn graphs are widely used in bioinformatics for processing next-generation sequencing (NGS) data. Because NGS datasets are very large, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently.

Results: In this work, we show how to reduce the memory required by the data structure of Chikhi and Rizk (WABI'12), which represents de Bruijn graphs using Bloom filters. Our method requires 30% to 40% less memory than theirs, with insignificant impact on construction time. At the same time, our experiments showed a better query time compared to the method of Chikhi and Rizk.

Conclusion: The proposed data structure constitutes, to our knowledge, currently the most efficient practical representation of de Bruijn graphs.

  • Publication date: 2014-2-24