On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets

Publication: Contribution to journal › Journal article › Research › Peer-reviewed

Standard

On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets. / Nørgaard, Sarah Kristine; Linder-Steinlein, Kristoffer; Eliasen, Anders Ulrik; Stokholm, Jakob; Chawes, Bo L.; Bønnelykke, Klaus; Bisggard, Hans; Smilde, Age K.; Rasmussen, Morten Arendt.

In: Journal of Chemometrics, Vol. 35, No. 10, e3324, 2021.

Harvard

Nørgaard, SK, Linder-Steinlein, K, Eliasen, AU, Stokholm, J, Chawes, BL, Bønnelykke, K, Bisggard, H, Smilde, AK & Rasmussen, MA 2021, 'On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets', Journal of Chemometrics, vol. 35, no. 10, e3324. https://doi.org/10.1002/cem.3324

APA

Nørgaard, S. K., Linder-Steinlein, K., Eliasen, A. U., Stokholm, J., Chawes, B. L., Bønnelykke, K., Bisggard, H., Smilde, A. K., & Rasmussen, M. A. (2021). On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets. Journal of Chemometrics, 35(10), [e3324]. https://doi.org/10.1002/cem.3324

Vancouver

Nørgaard SK, Linder-Steinlein K, Eliasen AU, Stokholm J, Chawes BL, Bønnelykke K, et al. On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets. Journal of Chemometrics. 2021;35(10). e3324. https://doi.org/10.1002/cem.3324

Author

Nørgaard, Sarah Kristine ; Linder-Steinlein, Kristoffer ; Eliasen, Anders Ulrik ; Stokholm, Jakob ; Chawes, Bo L. ; Bønnelykke, Klaus ; Bisggard, Hans ; Smilde, Age K. ; Rasmussen, Morten Arendt. / On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets. In: Journal of Chemometrics. 2021 ; Vol. 35, No. 10.

Bibtex

@article{2b1b59f645aa4f318f2e00e517c3da29,
title = "On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets",
abstract = "Integration of unstructured and very diverse data is often required for a deeper understanding of the complex biological systems. In order to uncover communalities between heterogeneous data, the data are often harmonized by constructing a kernel and perform numerical integration. In this study, we propose a method for data integration in the framework of an undirected graphical model, where the nodes represent individual data sources of varying nature in terms of complexity and underlying distribution and where the edges represent the partial correlations between two blocks of data. We propose a modified GLASSO for estimation of the graph, with a combination of cross-validation and extended Bayes Information Criterion for sparsity tuning. Furthermore, hierarchical clustering on the weighted consensus kernels from a fixed network is used to partitioning the samples into different classes. Simulations show increasing ability to uncover true edges with increasing sample size and signal to noise. Likewise, identification of nonexisting edges towards disconnected nodes is feasible. The framework is demonstrated for integration of longitudinal symptom burden data, from the second and third year of life, combined with 21 diseases precursors and information of the development of asthma and eczema at the age of 6 years, from 403 children from the COPSAC2010 mother-child cohort. This suggests that maternal predisposition as well as being born preterm indirectly lead to a higher risk of asthma via an increased respiratory symptom burden.",
keywords = "dual-primal optimization, GLASSO, heterogeneous data integration, kernelization, undirected graphical models",
author = "N{\o}rgaard, {Sarah Kristine} and Kristoffer Linder-Steinlein and Eliasen, {Anders Ulrik} and Jakob Stokholm and Chawes, {Bo L.} and Klaus B{\o}nnelykke and Hans Bisggard and Smilde, {Age K.} and Rasmussen, {Morten Arendt}",
note = "SPECIAL ISSUE Funding information   COPSAC is funded by private and public research funds all listed on https://www.copsac.com. The Lundbeck Foundation; The Danish Ministry of Health; Danish Council for Strategic Research and The Capital Region Research Foundation have provided core support for COPSAC",
year = "2021",
doi = "10.1002/cem.3324",
language = "English",
volume = "35",
journal = "Journal of Chemometrics",
issn = "0886-9383",
publisher = "Wiley",
number = "10",

}

RIS

TY - JOUR

T1 - On using kernel integration by graphical LASSO to study partial correlations between heterogeneous data sets

AU - Nørgaard, Sarah Kristine

AU - Linder-Steinlein, Kristoffer

AU - Eliasen, Anders Ulrik

AU - Stokholm, Jakob

AU - Chawes, Bo L.

AU - Bønnelykke, Klaus

AU - Bisggard, Hans

AU - Smilde, Age K.

AU - Rasmussen, Morten Arendt

N1 - SPECIAL ISSUE Funding information   COPSAC is funded by private and public research funds all listed on https://www.copsac.com. The Lundbeck Foundation; The Danish Ministry of Health; Danish Council for Strategic Research and The Capital Region Research Foundation have provided core support for COPSAC

PY - 2021

Y1 - 2021

N2 - Integration of unstructured and very diverse data is often required for a deeper understanding of the complex biological systems. In order to uncover communalities between heterogeneous data, the data are often harmonized by constructing a kernel and perform numerical integration. In this study, we propose a method for data integration in the framework of an undirected graphical model, where the nodes represent individual data sources of varying nature in terms of complexity and underlying distribution and where the edges represent the partial correlations between two blocks of data. We propose a modified GLASSO for estimation of the graph, with a combination of cross-validation and extended Bayes Information Criterion for sparsity tuning. Furthermore, hierarchical clustering on the weighted consensus kernels from a fixed network is used to partitioning the samples into different classes. Simulations show increasing ability to uncover true edges with increasing sample size and signal to noise. Likewise, identification of nonexisting edges towards disconnected nodes is feasible. The framework is demonstrated for integration of longitudinal symptom burden data, from the second and third year of life, combined with 21 diseases precursors and information of the development of asthma and eczema at the age of 6 years, from 403 children from the COPSAC2010 mother-child cohort. This suggests that maternal predisposition as well as being born preterm indirectly lead to a higher risk of asthma via an increased respiratory symptom burden.

AB - Integration of unstructured and very diverse data is often required for a deeper understanding of the complex biological systems. In order to uncover communalities between heterogeneous data, the data are often harmonized by constructing a kernel and perform numerical integration. In this study, we propose a method for data integration in the framework of an undirected graphical model, where the nodes represent individual data sources of varying nature in terms of complexity and underlying distribution and where the edges represent the partial correlations between two blocks of data. We propose a modified GLASSO for estimation of the graph, with a combination of cross-validation and extended Bayes Information Criterion for sparsity tuning. Furthermore, hierarchical clustering on the weighted consensus kernels from a fixed network is used to partitioning the samples into different classes. Simulations show increasing ability to uncover true edges with increasing sample size and signal to noise. Likewise, identification of nonexisting edges towards disconnected nodes is feasible. The framework is demonstrated for integration of longitudinal symptom burden data, from the second and third year of life, combined with 21 diseases precursors and information of the development of asthma and eczema at the age of 6 years, from 403 children from the COPSAC2010 mother-child cohort. This suggests that maternal predisposition as well as being born preterm indirectly lead to a higher risk of asthma via an increased respiratory symptom burden.

KW - dual-primal optimization

KW - GLASSO

KW - heterogeneous data integration

KW - kernelization

KW - undirected graphical models

U2 - 10.1002/cem.3324

DO - 10.1002/cem.3324

M3 - Journal article

AN - SCOPUS:85097781764

VL - 35

JO - Journal of Chemometrics

JF - Journal of Chemometrics

SN - 0886-9383

IS - 10

M1 - e3324

ER -
