Accurate and Effective Detection of Recurrent Copy Number Variants in Large SNP Genotype Datasets

Publication: Contribution to journal, Journal article, Research, peer-reviewed

Documents

  • Fulltext

    Publisher's published version, 950 KB, PDF document

  • Simone Montalbano
  • Xabier Calle Sánchez
  • Morteza Vaez
  • Dorte Helenius
  • Thomas Werge
  • Andrés Ingason

Structural variations, including recurrent Copy Number Variants (CNVs) at specific genomic loci, have been found to be associated with increased risk of several diseases and syndromes. CNV carrier status can be determined in large collections of samples using SNP arrays and, more recently, sequencing data. Although there is some consensus among researchers about the essential steps required in such analysis (i.e., CNV calling, filtering of putative carriers, and visual validation using intensity data plots of the genomic region), standard methodologies and processes to control the quality and consistency of the results are lacking. Here, we present a comprehensive and user-friendly protocol that we have refined from our extensive research experience in the field. We cover every aspect of the analysis, from input data curation to final results. For each step, we highlight which parameters affect the analysis the most and how different settings may lead to different results. We provide a pipeline to run the complete analysis with effective (but customizable) pre-sets. We present software that we developed to better handle and filter putative CNV carriers and perform visual inspection to validate selected candidates. Finally, we describe methods to evaluate the critical sections and actions to counterbalance potential problems. The current implementation is focused on Illumina SNP array data. All the presented software is freely available and provided in a ready-to-use docker container.

Original language: English
Article number: e621
Journal: Current Protocols
Volume: 2
Issue number: 12
Number of pages: 21
DOI
Status: Published - 2022

Bibliographical note

Funding Information:
Our research was supported by the NIH (R01 MH124789-01), but the funding source had no involvement in the design of the protocol, nor in the collection, analysis, or interpretation of data. While the protocol is designed as general-purpose software, it was initially conceptualized in conjunction with CNV analysis of data from The Lundbeck Foundation Initiative for Integrative Psychiatric Research (iPSYCH).

Publisher Copyright:
© 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.

ID: 340552503