Acute Crit Care. 2022 Nov;37(4):654-666. doi: 10.4266/acc.2022.00976.

Multicenter validation of a deep-learning-based pediatric early-warning system for prediction of deterioration events

Affiliations
  • 1 VUNO Inc., Seoul, Korea
  • 2 Department of Pediatrics, Seoul National University Children's Hospital, Seoul, Korea
  • 3 Department of Pediatrics, Severance Children's Hospital, Yonsei University College of Medicine, Seoul, Korea
  • 4 Department of Pediatrics, Kyungpook National University Children's Hospital, School of Medicine, Kyungpook National University, Daegu, Korea
  • 5 Department of Pediatrics, Pusan National University Children’s Hospital, Yangsan, Korea
  • 6 Department of Critical Care Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
  • 7 Department of Pediatrics, Asan Medical Center Children’s Hospital, University of Ulsan College of Medicine, Seoul, Korea

Abstract

Background
Early recognition of deterioration events is crucial to improve clinical outcomes. For this purpose, we developed a deep-learning-based pediatric early-warning system (pDEWS) and aimed to validate its clinical performance.

Methods
This was a retrospective multicenter cohort study at five tertiary-care academic children’s hospitals. All pediatric patients younger than 19 years admitted to the general ward from January 2019 to December 2019 were included. Using patient electronic medical records, we evaluated the clinical performance of the pDEWS for identifying deterioration events, defined as in-hospital cardiac arrest (IHCA) and unexpected general ward-to-pediatric intensive care unit transfer (UIT), within 24 hours before event occurrence. We also compared pDEWS performance with that of the modified pediatric early-warning score (PEWS) and of prediction models using logistic regression (LR) and random forest (RF).

Results
The study population consisted of 28,758 patients, with 34 cases of IHCA and 291 cases of UIT. Compared with the modified PEWS, LR, and RF models, pDEWS predicted deterioration events better, with a larger area under the receiver operating characteristic curve, fewer false alarms, a lower mean alarm count per day, and a smaller number of cases needed to examine, regardless of site, event occurrence time, age group, or sex.

Conclusions
The pDEWS outperformed the modified PEWS, LR, and RF models for early and accurate prediction of deterioration events regardless of clinical situation. This study demonstrated the potential of pDEWS as an efficient screening tool for efferent operation of rapid response teams.
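The comparison metrics named in the Results can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not define the metrics, so the sketch assumes common definitions: AUROC as the probability that a randomly chosen event case scores higher than a randomly chosen non-event case, number needed to examine as alarms per true positive (1/precision), and mean alarm count per day as total alarms divided by observed days. The function names and toy data are hypothetical.

```python
from typing import Sequence, Tuple


def confusion_counts(scores: Sequence[float], labels: Sequence[int],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) when an alarm fires for score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        alarm = score >= threshold
        if alarm and label:
            tp += 1
        elif alarm:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via pairwise comparison of event and non-event scores (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def number_needed_to_examine(scores: Sequence[float], labels: Sequence[int],
                             threshold: float) -> float:
    """Alarms raised per true event detected (assumed definition: 1 / precision)."""
    tp, fp, _, _ = confusion_counts(scores, labels, threshold)
    return float("inf") if tp == 0 else (tp + fp) / tp


def mean_alarm_count_per_day(scores: Sequence[float], threshold: float,
                             n_days: float) -> float:
    """Total alarms divided by the number of observed days (assumed definition)."""
    alarms = sum(1 for s in scores if s >= threshold)
    return alarms / n_days


# Toy data, purely illustrative: 2 deterioration events among 8 observations.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0]

print(auroc(toy_scores, toy_labels))
print(number_needed_to_examine(toy_scores, toy_labels, threshold=0.5))
print(mean_alarm_count_per_day(toy_scores, threshold=0.5, n_days=2))
```

Under these assumed definitions, a model with fewer false alarms at the same sensitivity has both a smaller number needed to examine and a lower mean alarm count per day, which is the direction of improvement reported for pDEWS.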

Keywords

critical care; deep learning; early warning score; in-hospital cardiac arrest; pediatrics

Figures

  • Figure 1. A flowchart for patient inclusion and exclusion.

  • Figure 2. Areas under the receiver operating characteristic curves (AUROC) for predicting deterioration events. CI: confidence interval; pDEWS: deep-learning-based pediatric early-warning system; PEWS: pediatric early-warning score.

  • Figure 3. Comparison of (A) mean alarm count per day at the same sensitivity and (B) sensitivity at the same number needed to examine for deterioration events. MACPD: mean alarm count per day; pDEWS: deep-learning-based pediatric early-warning system; PEWS: pediatric early-warning score.

  • Figure 4. Cumulative percentages of deteriorating patients. The cutoffs of the models for each figure were set at threshold points with the same specificity as (A) PEWS≥4, (B) PEWS≥5, and (C) PEWS≥6. pDEWS: deep-learning-based pediatric early-warning system; PEWS: modified pediatric early-warning score.

  • Figure 5. Areas under the receiver operating characteristic curves (AUROC) for the prediction of (A) in-hospital cardiac arrest (IHCA) and (B) unexpected ward-to-pediatric intensive care unit transfer (UIT). CI: confidence interval; pDEWS: deep-learning-based pediatric early-warning system; PEWS: pediatric early-warning score.

  • Figure 6. Areas under the receiver operating characteristic curves (AUROC) for prediction of deterioration events by hospital: (A) hospital A, (B) hospital B, (C) hospital C, (D) hospital D, and (E) hospital E. CI: confidence interval; pDEWS: deep-learning-based pediatric early-warning system; PEWS: pediatric early-warning score.

  • Figure 7. Areas under the receiver operating characteristic curves (AUROC) for prediction of deterioration events by subgroup analysis: (A) age group, (B) event occurring time, and (C) sex. CI: confidence interval; pDEWS: deep-learning-based pediatric early-warning system; PEWS: pediatric early-warning score.

  • Figure 8. Deep-learning-based pediatric early-warning system (pDEWS) model calibration analysis.

  • Figure 9. Deep-learning-based pediatric early-warning system (pDEWS) model feature importance analysis. (A) Sequence-wise feature importance and (B) average feature importance. RR: respiratory rate; HR: heart rate; DBP: diastolic blood pressure; SBP: systolic blood pressure; BT: body temperature.


Cited by 1 article

An advanced pediatric early warning system: a reliable sentinel, not annoying extra work
Young Joo Han
Acute Crit Care. 2022;37(4):667-668. doi: 10.4266/acc.2022.01445.

