ADaCGH2: parallelized analysis of (big) CNA data

Diaz Uriarte Ramon

doi:10.1093/bioinformatics/btu099

Abstract

Motivation: Studies of genomic DNA copy number alteration can involve datasets with several million probes and thousands of subjects. Analyzing these data with currently available software (e.g. as available from BioConductor) can be extremely slow and may not be feasible because of memory requirements.

Results: We have developed a BioConductor package, ADaCGH2, that parallelizes the main segmentation algorithms (using forking on multicore computers, or parallelization via the Message Passing Interface and similar mechanisms on clusters of computers) and uses ff objects for reading and storing data. We show examples of data with 6 million probes per array; we can analyze data that would otherwise not fit in memory, and compared with the non-parallelized versions we achieve speed-ups of 25 to 40 times on a 64-core machine.
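The parallelization idea in the abstract, one independent segmentation job per array spread over several worker processes, can be illustrated with a minimal sketch. This is written in Python rather than R and is not ADaCGH2's API. The names `call_state`, `segment_array`, and `segment_all`, and the 0.3 threshold, are hypothetical, and the "segmentation" is a deliberately trivial stand-in for the real algorithms. The disk-backed storage that ff objects provide is not modeled here.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import groupby
from statistics import fmean


def call_state(log_ratio, threshold=0.3):
    """Toy call per probe: +1 gain, -1 loss, 0 neutral.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    if log_ratio > threshold:
        return 1
    if log_ratio < -threshold:
        return -1
    return 0


def segment_array(log_ratios):
    """Toy stand-in for a segmentation algorithm (hypothetical).

    Collapses runs of consecutive probes with the same call into
    (start, end, state, mean) tuples, with `end` exclusive.
    """
    segments = []
    start = 0
    for state, run in groupby(log_ratios, key=call_state):
        values = list(run)
        end = start + len(values)
        segments.append((start, end, state, fmean(values)))
        start = end
    return segments


def segment_all(arrays, workers=1):
    """Segment every array; use a pool of worker processes if workers > 1.

    Arrays are independent of each other, so each one is a separate task.
    """
    if workers <= 1:
        return [segment_array(a) for a in arrays]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(segment_array, arrays))
```

Calling `segment_all(arrays, workers=4)` from a script's `if __name__ == "__main__":` block would distribute the arrays over four processes. Whether those processes are forked or spawned depends on the platform and Python version, so this only loosely mirrors the forking approach the abstract describes.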

Publication date: 2014-06-15
