Healthc Inform Res. 2020 Apr;26(2):112-118. doi: 10.4258/hir.2020.26.2.112.

Prediction of Serum Creatinine in Hemodialysis Patients Using a Kernel Approach for Longitudinal Data

Affiliations
  • 1 Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan,
  • 2 Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan,
  • 3 Department of Biostatistics and Epidemiology, School of Public Health, Hamadan University of Medical Sciences, Hamadan,

Abstract

Objectives

Longitudinal data are prevalent in clinical research; because repeated measurements on the same patient are correlated, they require analysis methods that account for this correlation. Creatinine is an important marker for predicting end-stage renal disease, and it is recorded longitudinally. This study compared the performance of linear regression (LR), linear mixed-effects model (LMM), least-squares support vector regression (LS-SVR), and mixed-effects least-squares support vector regression (MLS-SVR) methods in predicting serum creatinine as a longitudinal outcome.

Methods

We used a longitudinal dataset of hemodialysis patients in Hamadan city collected between 2013 and 2016. To evaluate the performance of the methods in predicting serum creatinine, the data were divided into training and testing sets, and LR, LMM, LS-SVR, and MLS-SVR were fitted. Prediction performance was assessed and compared in terms of mean squared error (MSE), mean absolute error (MAE), mean absolute prediction error (MAPE), and the coefficient of determination (R²). Variable importance was calculated using the best model to identify the most important predictors.
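The four performance measures named above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `prediction_metrics` is hypothetical, and MAPE is assumed here to be the mean of absolute errors relative to the observed values, which the abstract does not define explicitly.

```python
from statistics import fmean


def prediction_metrics(y_true, y_pred):
    """Return MSE, MAE, MAPE, and R^2 for observed and predicted values.

    Assumption: MAPE is computed as mean(|y - y_hat| / |y|); the abstract
    does not give its exact definition.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = fmean([e * e for e in errors])
    mae = fmean([abs(e) for e in errors])
    # Relative error is undefined for an observed value of zero.
    mape = fmean([abs(e) / abs(t) for e, t in zip(errors, y_true)])

    # R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the observed mean.
    mean_true = fmean(y_true)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MSE": mse, "MAE": mae, "MAPE": mape, "R2": r2}
```

Calling the function once on the training-set predictions and once on the testing-set predictions of each fitted model would give a comparison of the kind reported in the Results.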

Results

The MLS-SVR outperformed the other methods, with the lowest prediction error: MSE = 1.280, MAE = 0.833, and MAPE = 0.129 for the training set and MSE = 3.275, MAE = 1.319, and MAPE = 0.159 for the testing set. The MLS-SVR also had the highest R²: 0.805 for the training set and 0.654 for the testing set. Blood urea nitrogen was the most important predictor of creatinine.

Conclusions

The MLS-SVR achieved the best serum creatinine prediction performance in comparison to LR, LMM, and LS-SVR.


Keywords

Creatinine; Support Vector Machine; Longitudinal Studies; Renal Dialysis; Machine Learning

Figures

  • Figure 1 Framework of data pre-processing. LR: linear regression, LMM: linear mixed-effects model, LS-SVR: least-squares support vector regression, MLS-SVR: mixed-effects least-squares support vector regression.

  • Figure 2 Comparison of predicted and observed values for MLS-SVR and LMM for three patients: (A) patient #1, (B) patient #2, and (C) patient #3. The extended line is the bisector. MLS-SVR: mixed-effects least-squares support vector regression, LMM: linear mixed-effects model.

  • Figure 3 Variable importance (VIMP) of each factor in the prediction of creatinine using the MLS-SVR method. Values are the mean change in MAE after each permutation (standard error). BUN: blood urea nitrogen, FBS: fasting blood sugar, HCT: hematocrit, P: phosphorus, HB: hemoglobin, Ca: calcium, K: potassium, MLS-SVR: mixed-effects least-squares support vector regression, MAE: mean absolute error.
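
The permutation-based variable importance summarized in the Figure 3 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the names `permutation_importance` and `predict`, the number of repeats, and the use of the standard error of the mean across repeats are all hypothetical choices, and any fitted model exposing a prediction function could be substituted.

```python
import random
from statistics import fmean, stdev


def mean_absolute_error(y_true, y_pred):
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean change in MAE (and its standard error) after shuffling each column.

    predict: callable mapping a list of feature rows to a list of predictions
             (hypothetical stand-in for a fitted model).
    X:       list of feature rows; y: list of observed outcomes.
    """
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, predict(X))
    importance = {}
    for j in range(len(X[0])):
        changes = []
        for _ in range(n_repeats):
            # Shuffle only column j, breaking its link with the outcome.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [list(row[:j]) + [v] + list(row[j + 1:])
                      for row, v in zip(X, column)]
            changes.append(mean_absolute_error(y, predict(X_perm)) - baseline)
        std_err = stdev(changes) / len(changes) ** 0.5 if len(changes) > 1 else 0.0
        importance[j] = (fmean(changes), std_err)
    return importance
```

A predictor whose permutation leaves the MAE unchanged contributes nothing to the model's predictions, while a large increase in MAE marks an important predictor.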


Cited by 1 article

Estimating the Optimal Dexketoprofen Pharmaceutical Formulation with Machine Learning Methods and Statistical Approaches
Atakan Başkor, Yağmur Pirinçci Tok, Burcu Mesut, Yıldız Özsoy, Tamer Uçar
Healthc Inform Res. 2021;27(4):279-286.    doi: 10.4258/hir.2021.27.4.279.

