Healthc Inform Res.  2022 Oct;28(4):319-331. 10.4258/hir.2022.28.4.319.

Discovery of Intentional Self-Harm Patterns from Suicide and Self-Harm Surveillance Reports

  • Hardware-Human Interface and Communications (H2I-Comm) Laboratory, College of Computing, Khon Kaen University, Khon Kaen, Thailand


The purpose of this study was to identify patterns of self-harm risk factors from suicide and self-harm surveillance reports in Thailand.
This study analyzed data from suicide and self-harm surveillance reports submitted to Khon Kaen Rajanagarindra Psychiatric Hospital, Thailand. Patterns of self-harm risk factors were identified in five steps: (1) data preprocessing, comprising data preparation and cleaning, missing data management using listwise deletion and expectation-maximization techniques, subgrouping of factors, determination of the target factors, and data correlation for learning; (2) classification of the risk of self-harm (severe or mild) using 10-fold cross-validation with the support vector machine, random forest, multilayer perceptron, decision tree, k-nearest neighbors, and ensemble techniques; (3) data filtering; (4) identification of patterns of self-harm risk factors using 10-fold cross-validation with the classification and regression trees (CART) technique; and (5) evaluation of the patterns of self-harm risk factors.
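The listwise-deletion and 10-fold cross-validation steps named above can be illustrated with a minimal, standard-library-only Python sketch. The record fields (`age`, `method`, `severity`), the helper names `listwise_delete` and `k_fold_indices`, and the toy data are hypothetical and are not taken from the surveillance reports. The expectation-maximization imputation step is not shown, and the study's actual tooling is not stated here.

```python
import random


def listwise_delete(records):
    """Drop every record that has at least one missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f_no, fold in enumerate(folds) if f_no != i for j in fold]
        yield train_idx, test_idx


# Hypothetical toy records; None marks a missing value.
records = [
    {"age": 20 + i, "method": "A" if i % 2 else "B", "severity": "mild"}
    for i in range(23)
]
records[3]["age"] = None
records[7]["method"] = None

complete = listwise_delete(records)  # keeps only fully observed records
splits = list(k_fold_indices(len(complete), k=10))
```

Each record appears in exactly one test fold, so every complete case is used once for evaluation and nine times for training. A stratified split, which keeps the severe/mild ratio similar across folds, would be a reasonable variation when classes are imbalanced.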
The random forest technique was the most accurate for classifying the risk of self-harm, with specificity, sensitivity, and F-score of 92.84%, 93.12%, and 91.46%, respectively. The CART technique identified 53 patterns of self-harm risk, consisting of 16 severe self-harm risk patterns and 37 mild self-harm risk patterns, with an accuracy of 92.85%. In addition, we discovered that the type of hospital was a new risk factor for severe self-harm.
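For readers unfamiliar with the reported metrics, the following minimal Python sketch shows how specificity, sensitivity, F-score, and accuracy are commonly derived from a binary confusion matrix. It assumes that "severe" is treated as the positive class and that the F-score is the F1 variant; neither assumption is confirmed by the text above, and the labels below are made-up toy values, not study data.

```python
def classification_metrics(y_true, y_pred, positive="severe"):
    """Compute common binary classification metrics from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate (recall)
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0  # F1 (assumed)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Toy labels (hypothetical): 4 severe cases and 6 mild cases.
y_true = ["severe"] * 4 + ["mild"] * 6
y_pred = ["severe", "severe", "severe", "mild"] + ["mild"] * 5 + ["severe"]
metrics = classification_metrics(y_true, y_pred)
```

Under 10-fold cross-validation, these values would typically be computed per fold and then averaged, or computed once over the pooled predictions; the text does not say which was used.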
The procedure presented herein could identify patterns of self-harm risk factors and assist psychiatrists in making decisions related to self-harm among patients visiting hospitals in Thailand.


Data Adjustment; Machine Learning; Data Analysis; Self-Injurious Behavior; Suicide


  • Figure 1 Conceptual framework of this research. LD: listwise deletion, EM: expectation-maximization, ML: machine learning, DT: decision tree, RF: random forest, SVM: support vector machine, MLP: multilayer perceptron, kNN: k-nearest neighbor, CART: classification and regression trees.

  • Figure 2 Example of a decision tree from the CART technique. CART: classification and regression trees.
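As background for the decision tree in Figure 2, the sketch below illustrates how a CART-style tree can choose a single binary split by minimizing weighted Gini impurity, the criterion usually associated with CART classification trees. The split criterion actually used in the study is not stated here, so this is an assumption. The feature names (`factor_a`, `factor_b`), the helper names, and the rows are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, target):
    """Return (feature, value, weighted_gini) for the best 'feature == value' split."""
    best = None
    n = len(rows)
    features = [name for name in rows[0] if name != target]
    for feature in features:
        for value in sorted({row[feature] for row in rows}):
            left = [row[target] for row in rows if row[feature] == value]
            right = [row[target] for row in rows if row[feature] != value]
            if not left or not right:
                continue  # a split must send rows to both branches
            score = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
            if best is None or score < best[2]:
                best = (feature, value, score)
    return best


# Hypothetical toy rows; 'risk' is the target (severe or mild).
rows = [
    {"factor_a": "x", "factor_b": "p", "risk": "severe"},
    {"factor_a": "x", "factor_b": "q", "risk": "severe"},
    {"factor_a": "x", "factor_b": "p", "risk": "severe"},
    {"factor_a": "y", "factor_b": "q", "risk": "mild"},
    {"factor_a": "y", "factor_b": "p", "risk": "mild"},
    {"factor_a": "y", "factor_b": "q", "risk": "mild"},
]
split = best_binary_split(rows, "risk")
```

A full tree applies this selection recursively to each branch. Each root-to-leaf path then reads as one rule, which is one plausible way to interpret a "pattern" of risk factors.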


