The case-cohort design is widely used in large cohort studies when measuring some exposures for every subject in the full cohort is prohibitively costly, especially when the disease rate is low. To investigate the effect of a risk factor on different diseases, multiple case-cohort studies that share the same subcohort are usually conducted. To compare the effect of a risk factor across different types of diseases, the times to the different disease events need to be modeled simultaneously. Existing case-cohort estimators for multiple disease outcomes use the relevant covariate information only from cases and subcohort controls, even though many covariates are measured for everyone in the full cohort. Intuitively, making full use of the relevant covariate information can improve efficiency. To this end, we consider a class of doubly-weighted estimators for both regular and generalized case-cohort studies with multiple disease outcomes. We derive the asymptotic properties of the proposed estimators, and our simulation studies show that a gain in efficiency can be achieved with a properly chosen weight function. We apply the proposed method to re-analyze a data set from the Atherosclerosis Risk in Communities (ARIC) study to showcase the gain in efficiency. Concluding remarks and directions for future research are also discussed.
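The sampling scheme described in the abstract can be illustrated with a minimal sketch. It assumes Bernoulli sampling of the subcohort and a standard inverse-probability weight (cases weighted 1, sampled non-cases weighted by the inverse of the subcohort sampling probability). It is not the doubly-weighted estimator proposed in the article, and the function name, event rates, and sampling probability are all hypothetical choices made for illustration.

```python
import random


def draw_case_cohort(n, event_rates, subcohort_prob, seed=0):
    """Simulate a cohort with several disease outcomes and a shared subcohort.

    Hypothetical helper: returns one record per subject with event indicators,
    a subcohort flag, and a standard inverse-probability weight.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        # One event indicator per disease (censoring and timing are ignored here).
        events = [rng.random() < rate for rate in event_rates]
        in_subcohort = rng.random() < subcohort_prob
        if any(events):
            weight = 1.0                   # every case is sampled with certainty
        elif in_subcohort:
            weight = 1.0 / subcohort_prob  # each sampled control stands for 1/p non-cases
        else:
            weight = 0.0                   # expensive exposure is never measured
        records.append(
            {"events": events, "in_subcohort": in_subcohort, "weight": weight}
        )
    return records


# Assumed values: 20,000 subjects, two rare diseases, a 10% subcohort.
cohort = draw_case_cohort(n=20000, event_rates=[0.03, 0.02], subcohort_prob=0.1)
sampled = [r for r in cohort if r["weight"] > 0]

# The weighted count of sampled non-cases should approximate the true count.
true_noncases = sum(1 for r in cohort if not any(r["events"]))
weighted_noncases = sum(r["weight"] for r in sampled if not any(r["events"]))
print(len(sampled), true_noncases, round(weighted_noncases))
```

Only the sampled subjects would need the expensive exposure measured, while covariates recorded for the full cohort remain available for everyone. That unused full-cohort information is what the article's estimators aim to exploit.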
Published in | American Journal of Applied Mathematics (Volume 9, Issue 6) |
DOI | 10.11648/j.ajam.20210906.11 |
Page(s) | 192-210 |
Creative Commons | This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright | Copyright © The Author(s), 2021. Published by Science Publishing Group |
Keywords: Case-cohort Study, Multiple Disease Outcomes, Survival Analysis
APA Style
Hongtao Zhang, Haibo Zhou, David Couper, Jianwen Cai. (2021). Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies. American Journal of Applied Mathematics, 9(6), 192-210. https://doi.org/10.11648/j.ajam.20210906.11
ACS Style
Hongtao Zhang; Haibo Zhou; David Couper; Jianwen Cai. Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies. Am. J. Appl. Math. 2021, 9(6), 192-210. doi: 10.11648/j.ajam.20210906.11
AMA Style
Hongtao Zhang, Haibo Zhou, David Couper, Jianwen Cai. Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies. Am J Appl Math. 2021;9(6):192-210. doi: 10.11648/j.ajam.20210906.11
@article{10.11648/j.ajam.20210906.11,
  author = {Hongtao Zhang and Haibo Zhou and David Couper and Jianwen Cai},
  title = {Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies},
  journal = {American Journal of Applied Mathematics},
  volume = {9},
  number = {6},
  pages = {192-210},
  doi = {10.11648/j.ajam.20210906.11},
  url = {https://doi.org/10.11648/j.ajam.20210906.11},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajam.20210906.11},
  year = {2021}
}
TY  - JOUR
T1  - Using Full Cohort Information to Improve the Estimation Efficiency of Marginal Hazard Model for Multivariate Failure Times in Case-Cohort Studies
AU  - Hongtao Zhang
AU  - Haibo Zhou
AU  - David Couper
AU  - Jianwen Cai
Y1  - 2021/12/24
PY  - 2021
N1  - https://doi.org/10.11648/j.ajam.20210906.11
DO  - 10.11648/j.ajam.20210906.11
T2  - American Journal of Applied Mathematics
JF  - American Journal of Applied Mathematics
JO  - American Journal of Applied Mathematics
SP  - 192
EP  - 210
PB  - Science Publishing Group
SN  - 2330-006X
UR  - https://doi.org/10.11648/j.ajam.20210906.11
VL  - 9
IS  - 6
ER  -