Cohort-derived machine learning models for individual prediction of chronic kidney disease in people living with HIV: a prospective multicentre cohort study.   Journal of Infectious Diseases
Roth et al. aimed to evaluate different machine learning algorithms and modeling strategies for individual chronic kidney disease (CKD) prediction, and thereby to show whether machine learning models can be readily trained in a high-dimensional cohort setting.
The authors included people living with HIV in the prospective Swiss HIV Cohort Study whose first estimated glomerular filtration rate (eGFR) after 1 January 2002 was >60 mL/minute/1.73 m². The primary outcome was CKD, defined as a confirmed decrease in eGFR to ≤60 mL/minute/1.73 m² on measurements more than 3 months apart. They split the cohort data into a training set (80%), a validation set (10%), and a test set (10%), stratified by CKD status and follow-up length.
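The stratified 80/10/10 split described above can be sketched as follows. This is a minimal illustration on synthetic records, not the authors' pipeline: the function names `stratified_split` and `followup_bin`, the follow-up bin boundaries, and the record fields are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(records, key, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split records into train/validation/test within each stratum.

    `key` maps a record to its stratum label. The first two fractions are
    applied per stratum; whatever remains goes to the test set.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    train, val, test = [], [], []
    for label in sorted(strata):
        group = strata[label]
        rng.shuffle(group)
        n_train = round(fractions[0] * len(group))
        n_val = round(fractions[1] * len(group))
        train.extend(group[:n_train])
        val.extend(group[n_train:n_train + n_val])
        test.extend(group[n_train + n_val:])
    return train, val, test


def followup_bin(years):
    """Hypothetical follow-up bins; the source does not state the binning."""
    if years < 5:
        return "short"
    if years < 10:
        return "medium"
    return "long"


# Synthetic cohort (assumed): roughly 9% CKD, follow-up of 1 to 17 years.
data_rng = random.Random(42)
patients = [
    {"id": i, "ckd": data_rng.random() < 0.09,
     "followup_years": data_rng.uniform(1, 17)}
    for i in range(10000)
]

train, val, test = stratified_split(
    patients,
    key=lambda p: (p["ckd"], followup_bin(p["followup_years"])),
)


def ckd_rate(rows):
    return sum(r["ckd"] for r in rows) / len(rows)
```

Because the fractions are applied inside each stratum, the CKD proportion in `train`, `val`, and `test` stays close to the proportion in the full synthetic cohort, which is the point of stratifying on a rare outcome.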
Of 12,761 eligible individuals (median baseline eGFR, 103 mL/minute/1.73 m²), 1,192 (9%) developed CKD after a median of 8 years. The authors used 64 static and 502 time-changing variables. Across prediction horizons and algorithms, and in contrast to expert-based standard models, most machine learning models achieved state-of-the-art predictive performance, with areas under the receiver operating characteristic curve ranging from 0.926 to 0.996 and areas under the precision-recall curve ranging from 0.631 to 0.956.
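To make the two reported metrics concrete, the sketch below computes the area under the receiver operating characteristic curve (via the rank-sum identity) and average precision, a common step-wise summary of the area under the precision-recall curve. The labels and scores are invented toy values, and the source does not say which estimator the authors used for the precision-recall area. For context, a classifier with no skill has an expected precision-recall area near the outcome prevalence (about 9% here), whereas its ROC area is 0.5.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity.

    Equals the probability that a random positive outranks a random
    negative, with tied scores counted as one half.
    """
    pairs = sorted(zip(scores, y_true))
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive.

    Assumes no tied scores; ties are broken by input order here.
    """
    order = sorted(range(len(scores)), key=lambda k: -scores[k])
    n_pos = sum(y_true)
    true_pos = 0
    total = 0.0
    for rank, k in enumerate(order, start=1):
        if y_true[k]:
            true_pos += 1
            total += true_pos / rank
    return total / n_pos


# Toy example (assumed values): 3 CKD cases among 10 individuals.
labels = [0, 0, 1, 0, 1, 1, 0, 0, 0, 0]
risk = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70, 0.05, 0.30, 0.60, 0.15]

roc_area = auroc(labels, risk)
pr_area = average_precision(labels, risk)
```

In this toy example 19 of the 21 positive-negative pairs are ranked correctly, so `roc_area` is 19/21 (about 0.905), while `pr_area` is (1 + 1 + 0.6)/3 (about 0.867).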
In summary, different machine learning algorithms achieved state-of-the-art performance in forecasting individual CKD onset in people living with HIV. The underlying machine learning methods may help to advance personalized prediction of comorbidities in various populations.