An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies. / Thompson, Wesley K.; Wang, Yunpeng; Schork, Andrew J.; Witoelar, Aree; Zuber, Verena; Xu, Shujing; Werge, Thomas; Holland, Dominic; Andreassen, Ole A.; Dale, Anders M.

In: P L o S Genetics, Vol. 11, No. 12, e1005717, 29.12.2015, pp. 1-21.

Harvard

Thompson, WK, Wang, Y, Schork, AJ, Witoelar, A, Zuber, V, Xu, S, Werge, T, Holland, D, Andreassen, OA & Dale, AM 2015, 'An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies', P L o S Genetics, vol. 11, no. 12, e1005717, pp. 1-21. https://doi.org/10.1371/journal.pgen.1005717

APA

Thompson, W. K., Wang, Y., Schork, A. J., Witoelar, A., Zuber, V., Xu, S., Werge, T., Holland, D., Andreassen, O. A., & Dale, A. M. (2015). An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies. P L o S Genetics, 11(12), 1-21. [e1005717]. https://doi.org/10.1371/journal.pgen.1005717

Vancouver

Thompson WK, Wang Y, Schork AJ, Witoelar A, Zuber V, Xu S et al. An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies. P L o S Genetics. 2015 Dec 29;11(12):1-21. e1005717. https://doi.org/10.1371/journal.pgen.1005717

Author

Thompson, Wesley K. ; Wang, Yunpeng ; Schork, Andrew J. ; Witoelar, Aree ; Zuber, Verena ; Xu, Shujing ; Werge, Thomas ; Holland, Dominic ; Andreassen, Ole A. ; Dale, Anders M. / An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies. In: P L o S Genetics. 2015 ; Vol. 11, No. 12. pp. 1-21.

Bibtex

@article{56d5de3b447448be8fe1b2c3d1dc97c2,

title = "An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies",

abstract = "Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn{\textquoteright}s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. 
We discuss the implications of pervasive small but replicating effects in CD and SZ on genomic control and power. Finally, we conclude that, despite having very similar estimates of variance explained by genotyped SNPs, CD and SZ have a broadly dissimilar genetic architecture, due to differing mean effect size and proportion of non-null loci.",

author = "Thompson, {Wesley K.} and Yunpeng Wang and Schork, {Andrew J.} and Aree Witoelar and Verena Zuber and Shujing Xu and Thomas Werge and Dominic Holland and Andreassen, {Ole A.} and Dale, {Anders M.}",

year = "2015",

month = dec,

day = "29",

doi = "10.1371/journal.pgen.1005717",

language = "English",

volume = "11",

pages = "1--21",

journal = "P L o S Genetics",

issn = "1553-7390",

publisher = "Public Library of Science",

number = "12",

}

RIS

TY - JOUR

T1 - An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies

AU - Thompson, Wesley K.

AU - Wang, Yunpeng

AU - Schork, Andrew J.

AU - Witoelar, Aree

AU - Zuber, Verena

AU - Xu, Shujing

AU - Werge, Thomas

AU - Holland, Dominic

AU - Andreassen, Ole A.

AU - Dale, Anders M.

PY - 2015/12/29

Y1 - 2015/12/29

N2 - Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. 
We discuss the implications of pervasive small but replicating effects in CD and SZ on genomic control and power. Finally, we conclude that, despite having very similar estimates of variance explained by genotyped SNPs, CD and SZ have a broadly dissimilar genetic architecture, due to differing mean effect size and proportion of non-null loci.

AB - Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. 
We discuss the implications of pervasive small but replicating effects in CD and SZ on genomic control and power. Finally, we conclude that, despite having very similar estimates of variance explained by genotyped SNPs, CD and SZ have a broadly dissimilar genetic architecture, due to differing mean effect size and proportion of non-null loci.

U2 - 10.1371/journal.pgen.1005717

DO - 10.1371/journal.pgen.1005717

M3 - Journal article

C2 - 26714184

VL - 11

SP - 1

EP - 21

JO - P L o S Genetics

JF - P L o S Genetics

SN - 1553-7390

IS - 12

M1 - e1005717

ER -

ID: 160800056