Biomarker evaluation under imperfect nested case-control design

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

Biomarker evaluation under imperfect nested case-control design. / Wang, Xuan; Zheng, Yingye; Jensen, Majken Karoline; He, Zeling; Cai, Tianxi.

In: Statistics in Medicine, Vol. 40, No. 18, 2021, pp. 4035-4052.

Harvard

Wang, X, Zheng, Y, Jensen, MK, He, Z & Cai, T 2021, 'Biomarker evaluation under imperfect nested case-control design', Statistics in Medicine, vol. 40, no. 18, pp. 4035-4052. https://doi.org/10.1002/sim.9012

APA

Wang, X., Zheng, Y., Jensen, M. K., He, Z., & Cai, T. (2021). Biomarker evaluation under imperfect nested case-control design. Statistics in Medicine, 40(18), 4035-4052. https://doi.org/10.1002/sim.9012

Vancouver

Wang X, Zheng Y, Jensen MK, He Z, Cai T. Biomarker evaluation under imperfect nested case-control design. Statistics in Medicine. 2021;40(18):4035-4052. https://doi.org/10.1002/sim.9012

Author

Wang, Xuan ; Zheng, Yingye ; Jensen, Majken Karoline ; He, Zeling ; Cai, Tianxi. / Biomarker evaluation under imperfect nested case-control design. In: Statistics in Medicine. 2021 ; Vol. 40, No. 18. pp. 4035-4052.

Bibtex

@article{2296535ac92d4fb6af3419e8330bc2b9,
title = "Biomarker evaluation under imperfect nested case-control design",
abstract = "The nested case-control (NCC) design has been widely adopted as a cost-effective sampling design for biomarker research. Under the NCC design, markers are only measured for the NCC subcohort consisting of all cases and a fraction of the controls selected randomly from the matched risk sets of the cases. Robust methods for evaluating prediction performance of risk models have been derived under the inverse probability weighting framework. The probabilities of samples being included in the NCC cohort can be calculated based on the study design ``a previous study'' or estimated non-parametrically ``a previous study''. Neither strategy works well due to model mis-specification and the curse of dimensionality in practical settings where the sampling does not entirely follow the study design or depends on many factors. In this paper, we propose an alternative strategy to estimate the sampling probabilities based on a varying coefficient model, which attains a balance between robustness and the curse of dimensionality. The complex correlation structure induced by repeated finite risk set sampling makes the standard resampling procedure for variance estimation fail. We propose a perturbation resampling procedure that provides valid interval estimation for the proposed estimators. Simulation studies show that the proposed method performs well in finite samples. We apply the proposed method to the Nurses' Health Study II to develop and evaluate prediction models using clinical biomarkers for cardiovascular risk.",
keywords = "finite population sampling, inverse probability weighting, nonparametric smoothing, resampling, risk prediction",
author = "Xuan Wang and Yingye Zheng and Jensen, {Majken Karoline} and Zeling He and Tianxi Cai",
note = "Publisher Copyright: {\textcopyright} 2021 John Wiley \& Sons Ltd.",
year = "2021",
doi = "10.1002/sim.9012",
language = "English",
volume = "40",
pages = "4035--4052",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley \& Sons Ltd",
number = "18",

}

RIS

TY - JOUR

T1 - Biomarker evaluation under imperfect nested case-control design

AU - Wang, Xuan

AU - Zheng, Yingye

AU - Jensen, Majken Karoline

AU - He, Zeling

AU - Cai, Tianxi

N1 - Publisher Copyright: © 2021 John Wiley & Sons Ltd.

PY - 2021

Y1 - 2021

N2 - The nested case-control (NCC) design has been widely adopted as a cost-effective sampling design for biomarker research. Under the NCC design, markers are only measured for the NCC subcohort consisting of all cases and a fraction of the controls selected randomly from the matched risk sets of the cases. Robust methods for evaluating prediction performance of risk models have been derived under the inverse probability weighting framework. The probabilities of samples being included in the NCC cohort can be calculated based on the study design ``a previous study'' or estimated non-parametrically ``a previous study''. Neither strategy works well due to model mis-specification and the curse of dimensionality in practical settings where the sampling does not entirely follow the study design or depends on many factors. In this paper, we propose an alternative strategy to estimate the sampling probabilities based on a varying coefficient model, which attains a balance between robustness and the curse of dimensionality. The complex correlation structure induced by repeated finite risk set sampling makes the standard resampling procedure for variance estimation fail. We propose a perturbation resampling procedure that provides valid interval estimation for the proposed estimators. Simulation studies show that the proposed method performs well in finite samples. We apply the proposed method to the Nurses' Health Study II to develop and evaluate prediction models using clinical biomarkers for cardiovascular risk.

AB - The nested case-control (NCC) design has been widely adopted as a cost-effective sampling design for biomarker research. Under the NCC design, markers are only measured for the NCC subcohort consisting of all cases and a fraction of the controls selected randomly from the matched risk sets of the cases. Robust methods for evaluating prediction performance of risk models have been derived under the inverse probability weighting framework. The probabilities of samples being included in the NCC cohort can be calculated based on the study design ``a previous study'' or estimated non-parametrically ``a previous study''. Neither strategy works well due to model mis-specification and the curse of dimensionality in practical settings where the sampling does not entirely follow the study design or depends on many factors. In this paper, we propose an alternative strategy to estimate the sampling probabilities based on a varying coefficient model, which attains a balance between robustness and the curse of dimensionality. The complex correlation structure induced by repeated finite risk set sampling makes the standard resampling procedure for variance estimation fail. We propose a perturbation resampling procedure that provides valid interval estimation for the proposed estimators. Simulation studies show that the proposed method performs well in finite samples. We apply the proposed method to the Nurses' Health Study II to develop and evaluate prediction models using clinical biomarkers for cardiovascular risk.

KW - finite population sampling

KW - inverse probability weighting

KW - nonparametric smoothing

KW - resampling

KW - risk prediction

U2 - 10.1002/sim.9012

DO - 10.1002/sim.9012

M3 - Journal article

C2 - 33915597

AN - SCOPUS:85105234240

VL - 40

SP - 4035

EP - 4052

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 18

ER -

ID: 286492347