Sequencing and de novo assembly of 150 genomes from Denmark as a population reference

Publication: Contribution to journal; Letter; peer-reviewed

  • Lasse Maretty Sørensen
  • Jacob Malte Jensen
  • Siyang Liu
  • Palle Villesen
  • Laurits Skov
  • Kirstine G Belling
  • Christian Theil Have
  • Jose M. G. Izarzugaza
  • Marie Grosjean
  • Jette Bork-Jensen
  • Jakob Grove
  • Thomas D. Als
  • Shujia Huang
  • Yuqi Chang
  • Ruiqi Xu
  • Weijian Ye
  • Junhua Rao
  • Xiaosen Guo
  • Jihua Sun
  • Hongzhi Cao
  • Chen Ye
  • Johan van Beusekom
  • Thomas Espeseth
  • Esben Flindt
  • Rune M. Friborg
  • Anders E. Halager
  • Stephanie Le Hellard
  • Christina M. Hultman
  • Francesco Lescai
  • Shengting Li
  • Ole Lund
  • Peter Løngren
  • Thomas Mailund
  • Maria Luisa Matey-Hernandez
  • Ole Mors
  • Christian N. S. Pedersen
  • Thomas Sicheritz-Pontén
  • Patrick Sullivan
  • Ali Syed
  • Rachita Yadav
  • Ning Li
  • Xun Xu
  • Lars Bolund
  • Ramneek Gupta
  • Søren Besenbacher
  • Anders D. Børglum
  • Jun Wang
  • Mikkel Heide Schierup
Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits [1, 2, 3, 4]. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly [2, 5, 6, 7]. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology [4, 8, 9, 10, 11, 12, 13]. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which now are being launched in many countries including Denmark.
Original language: English
Journal: Nature
Volume: 548
Issue number: 7665
Pages (from-to): 87-91
Number of pages: 5
ISSN: 0028-0836
DOI
Status: Published - 2017

ID: 183010683