Nanopore sequencing and assembly of a human genome with ultra-long reads

Authors: Jain, Miten; Koren, Sergey; Miga, Karen H.; Quick, Josh; Rand, Arthur C.; Sasani, Thomas A.; Tyson, John R.; Beggs, Andrew D.; Dilthey, Alexander T.; Fiddes, Ian T.; Malla, Sunir; Marriott, Hannah; Nieto, Tom; O'Grady, Justin; Olsen, Hugh E.; Pedersen, Brent S.; Rhie, Arang; Richardson, Hollian; Quinlan, Aaron R.; Snutch, Terrance P.; Tee, Louise; Paten, Benedict; Phillippy, Adam M.; Simpson, Jared T.; Loman, Nicholas J.*; Loose, Matthew*
Source: Nature Biotechnology, 2018, 36(4): 338-+.
DOI: 10.1038/nbt.4060

Abstract

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ~30x theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ~3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb). Incorporating an additional 5x coverage of these ultra-long reads more than doubled the assembly contiguity (NG50 ~6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
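The abstract reports read N50, assembly NG50, and theoretical coverage without defining them. The following is a minimal sketch of how these summary statistics are conventionally computed; it is not the authors' pipeline. The function names and the 3.1 Gb genome size used in the usage note are assumptions made for illustration.

```python
def nx(lengths, total, fraction=0.5):
    """Return the largest length L such that sequences of length >= L
    together cover at least `fraction` of `total` bases.
    Returns 0 if the sequences never reach that threshold."""
    threshold = total * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return 0


def n50(lengths):
    """N50: the threshold is half of the summed sequence lengths."""
    return nx(lengths, sum(lengths))


def ng50(lengths, genome_size):
    """NG50: the threshold is half of an assumed genome size,
    which makes assemblies of different total size comparable."""
    return nx(lengths, genome_size)


def theoretical_coverage(total_bases, genome_size):
    """Average depth if all sequenced bases were spread evenly over the genome."""
    return total_bases / genome_size
```

Under the assumed genome size of 3.1 Gb, `theoretical_coverage(91.2e9, 3.1e9)` gives roughly 29.4, in line with the ~30x stated in the abstract. NG50 uses the genome size as its denominator, unlike N50, so an assembly cannot raise its NG50 simply by omitting short contigs.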

  • Publication date: 2018-4