A haplotype-resolved genome assembly of the Nile rat facilitates exploration of the genetic basis of diabetes

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Documents

  • Fulltext

    Final published version, 3.05 MB, PDF document

  • Huishi Toh
  • Chentao Yang
  • Giulio Formenti
  • Kalpana Raja
  • Lily Yan
  • Alan Tracey
  • William Chow
  • Kerstin Howe
  • Bettina Haase
  • Jacquelyn Mountcastle
  • Olivier Fedrigo
  • John Fogg
  • Bogdan Kirilenko
  • Chetan Munegowda
  • Michael Hiller
  • Aashish Jain
  • Daisuke Kihara
  • Arang Rhie
  • Adam M. Phillippy
  • Scott A. Swanson
  • Peng Jiang
  • Dennis O. Clegg
  • Erich D. Jarvis
  • James A. Thomson
  • Ron Stewart
  • Mark J. P. Chaisson
  • Yury V. Bukhman

Background: The Nile rat (Arvicanthis niloticus) is an important animal model because of its robust diurnal rhythm, cone-rich retina, and propensity to develop diet-induced diabetes without chemical or genetic modifications. A closer similarity to humans in these aspects, compared to the widely used Mus musculus and Rattus norvegicus models, holds the promise of better translation of research findings to the clinic.

Results: We report a 2.5 Gb, chromosome-level reference genome assembly with fully resolved parental haplotypes, generated with the Vertebrate Genomes Project (VGP). The assembly is highly contiguous, with a contig N50 of 11.1 Mb, a scaffold N50 of 83 Mb, and 95.2% of the sequence assigned to chromosomes. We used a novel workflow to identify 3613 segmental duplications and quantify duplicated genes. Comparative analyses revealed unique genomic features of the Nile rat, including some that affect genes associated with type 2 diabetes and metabolic dysfunctions. We discuss 14 genes that are heterozygous in the Nile rat or highly diverged from the house mouse.

Conclusions: Our findings reflect the exceptional level of genomic resolution present in this assembly, which will greatly expand the potential of the Nile rat as a model organism.
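The contig and scaffold N50 values reported in the abstract are standard contiguity statistics: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are hypothetical and are not taken from the Nile rat assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (hypothetical name): sort lengths from longest to
    shortest and return the length at which the running total first
    reaches at least half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy contig lengths (arbitrary units), made up for illustration only.
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))
```

For the toy input, the lengths sum to 30, and the two longest sequences (10 and 6) already cover 16 of that, so the N50 is 6. The same calculation applies to scaffolds by passing scaffold lengths instead of contig lengths.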

Original language: English
Article number: 245
Journal: BMC Biology
Volume: 20
Number of pages: 21
ISSN: 1741-7007
DOI
Status: Published - 2022

Bibliographical note

Funding Information:
MJPC is funded by NSF Grant Number 2046753.

Funding Information:
Woori Kwak of C&K Genomics submitted data to NCBI. Michael Collins set up software tools and the computing environment at Morgridge Institute for Research. Francoise Thibaud-Nissen led the genome annotation effort at NCBI. Amy Freitag helped edit the text. We also thank Mark Springer, Ben-Yang Liao, David Thybert, Masa Roller, Cecile Ane, Noga Kronfeld-Schor, and Francoise Thibaud-Nissen for stimulating discussions and advice, and Alice Young and the NIH Intramural Sequencing Center (NISC) for assistance with sequencing.

Data collection software was supplied by instrument vendors. Free open source software was used for all types and stages of data analysis. See references in the Methods.

  • Genome assembly and scaffolding: TrioCanu, Arrow, purge_dups, scaff10x, Bionano Solve, Salsa2, Longranger, Freebayes
  • Mitochondrial genome assembly: mitoVGP
  • Gene Ontology (GO) terms prediction: Phylo-PFP
  • Segmental duplications workflow: https://github.com/ChaissonLab/SegDupAnnotation/releases/tag/vNR
  • Tools used by the segmental duplications workflow: windowmasker v1.0.0; RepeatMasker 4.1.1 with the parameter “-species rodentia”; SEDEF version 1.1-37-gd14abac-dirty with default parameters; CENSOR repeat masking server, https://www.girinst.org/cgi-bin/censor/censor.cgi
  • Mapping gene models and Iso-seq transcripts to the genome: minimap2
  • Mutation rate analysis: GATK 4.0.7
  • Heterozygosity spectrum: Mummer v3.23, Assemblytics v1.2, SyRi v1.0
  • Branch-site test analysis: exonerate v2.4, PAML v4.9j
  • Text mining: KinderMiner, https://www.kinderminer.org/ and SKiM, https://skim.morgridge.org/
  • TOGA: https://github.com/hillerlab/TOGA
  • R: R version 4.1.0 (2021-05-18); Platform: x86_64-apple-darwin17.0 (64-bit); Running under: macOS Big Sur 10.16; Matrix products: default; LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

Funding Information:
AR and AMP were supported by the Intramural Research Program of the National Human Genome Research Institute.

Funding Information:
RS and JAT acknowledge a grant from Marv Conney.

Funding Information:
This study was supported by The Garland Initiative for Vision funded by William K. Bowes Jr. Foundation.

Funding Information:
B Haase, J Mountcastle, O Fedrigo, and E.D. Jarvis’ contributions were supported by the Howard Hughes Medical Institute and the Rockefeller University.

Funding Information:
MH acknowledges support from the LOEWE-Centre for Translational Biodiversity Genomics (TBG), funded by the Hessen State Ministry of Higher Education, Research and the Arts (HMWK).

Publisher Copyright:
© 2022, The Author(s).

ID: 326727294