A scalable and memory-efficient algorithm for de novo transcriptome assembly of non-model organisms

Sze Sing Hoi<sup>*</sup>; Pimsler Meaghan L; Tomberlin Jeffery K; Jones Corbin D; Tarone Aaron M

doi:10.1186/s12864-017-3735-1

Abstract

Background: With the increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. While algorithms are available that are specifically designed for performing transcriptome assembly from high-throughput sequencing data, they are very memory-intensive, limiting their applications to small data sets with few libraries. Results: We develop a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while utilizing as many RNA-Seq libraries as possible that contain hundreds of gigabases of data. New techniques are developed so that computations can be performed on a computing cluster with a moderate amount of physical memory. Conclusions: Our strategy minimizes memory consumption while simultaneously obtaining comparable or improved accuracy over existing algorithms. It provides support for incremental updates of assemblies when new libraries become available.

Publication date: 2017-5-24
