Leveraging Genomic Annotations and Pleiotropic Enrichment for Improved Replication Rates in Schizophrenia GWAS
Publication: Contribution to journal › Journal article › Research › peer-reviewed
Standard
Leveraging Genomic Annotations and Pleiotropic Enrichment for Improved Replication Rates in Schizophrenia GWAS. / Wang, Yunpeng; Thompson, Wesley K; Schork, Andrew J; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R; Djurovic, Srdjan; O'Donovan, Michael C; Visscher, Peter M; Andreassen, Ole A.; Dale, Anders M.
In: P L o S Genetics, Vol. 12, No. 1, e1005803, 2016.
RIS
TY - JOUR
T1 - Leveraging Genomic Annotations and Pleiotropic Enrichment for Improved Replication Rates in Schizophrenia GWAS
AU - Wang, Yunpeng
AU - Thompson, Wesley K
AU - Schork, Andrew J
AU - Holland, Dominic
AU - Chen, Chi-Hua
AU - Bettella, Francesco
AU - Desikan, Rahul S
AU - Li, Wen
AU - Witoelar, Aree
AU - Zuber, Verena
AU - Devor, Anna
AU - Nöthen, Markus M
AU - Rietschel, Marcella
AU - Chen, Qiang
AU - Werge, Thomas
AU - Cichon, Sven
AU - Weinberger, Daniel R
AU - Djurovic, Srdjan
AU - O'Donovan, Michael C
AU - Visscher, Peter M
AU - Andreassen, Ole A.
AU - Dale, Anders M
PY - 2016
Y1 - 2016
N2 - Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic ("z-score") of the SNP. We use a multiple logistic regression on z-scores to combine the auxiliary information and derive a "relative enrichment score" for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences between the scale-mixture model and the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores, even when both are genome-wide significant (p < 5×10⁻⁸). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3.
KW - Genetic Predisposition to Disease
KW - Genome, Human
KW - Genome-Wide Association Study
KW - Genomics
KW - Humans
KW - Linkage Disequilibrium
KW - Polymorphism, Single Nucleotide
KW - Schizophrenia
KW - Journal Article
KW - Research Support, N.I.H., Extramural
KW - Research Support, Non-U.S. Gov't
U2 - 10.1371/journal.pgen.1005803
DO - 10.1371/journal.pgen.1005803
M3 - Journal article
C2 - 26808560
VL - 12
JO - P L o S Genetics
JF - P L o S Genetics
SN - 1553-7390
IS - 1
M1 - e1005803
ER -
ID: 177426392
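
The abstract describes fitting a scale mixture of two Gaussians to the z-scores within each enrichment stratum. As a rough illustration only, the sketch below assumes a simplified parameterization that is not taken from the paper: a null component N(0, 1) and a non-null component N(0, 1 + s2), mixed with non-null proportion pi1. The function names and the stratum parameter values are hypothetical. It shows how the same discovery z-score can yield different posterior quantities in a low-enrichment and a high-enrichment stratum.

```python
import math


def posterior_nonnull(z, pi1, s2):
    """Posterior probability that a SNP with z-score z is non-null.

    Assumed model (illustrative, not the paper's exact form):
      null component:     z ~ N(0, 1)
      non-null component: z ~ N(0, 1 + s2)
    with prior non-null proportion pi1 (0 < pi1 <= 1).
    """
    # Ratio of weighted null density to weighted non-null density,
    # written in closed form so that large |z| cannot cause 0/0.
    ratio = ((1.0 - pi1) / pi1) * math.sqrt(1.0 + s2) \
        * math.exp(-0.5 * z * z * s2 / (1.0 + s2))
    return 1.0 / (1.0 + ratio)


def posterior_mean_effect(z, pi1, s2):
    """Posterior expected (noise-free) z-score given the observed z.

    Under the non-null component the effect is shrunk by s2 / (1 + s2);
    under the null component the expected effect is zero.
    """
    shrink = s2 / (1.0 + s2)
    return posterior_nonnull(z, pi1, s2) * shrink * z


# Hypothetical stratum parameters, chosen only for illustration.
strata = {
    "low enrichment": {"pi1": 0.01, "s2": 4.0},
    "high enrichment": {"pi1": 0.20, "s2": 4.0},
}

for name, params in strata.items():
    z = 4.0
    p = posterior_nonnull(z, **params)
    m = posterior_mean_effect(z, **params)
    print(f"{name}: P(non-null | z={z}) = {p:.3f}, E[effect | z] = {m:.3f}")
```

Under these assumptions, a SNP in the stratum with the larger non-null proportion receives a higher posterior non-null probability and a less heavily shrunk expected effect at the same z-score, which is the qualitative behavior the abstract reports for high versus low enrichment scores. The sketch does not compute a replication probability, which would additionally depend on the replication sample size and the replication criterion.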