J Korean Acad Oral Health. 2021 Dec;45(4):218-226. doi: 10.11149/jkaoh.2021.45.4.218.

A step-by-step guide to random forest model using orange data mining in the field of periodontitis

Affiliations
  • Department of Dental Education, Dental Science Research Institute, School of Dentistry, Chonnam National University, Gwangju, Korea

Abstract


Objectives
This study demonstrates a procedure for random forest (RF) analysis that predicts periodontal disease status using R and Orange Data Mining software, with the aim of helping researchers understand how to apply the RF technique in dental research.
Methods
Oral examination data from the 7th Korea National Health and Nutrition Examination Survey were used. An RF model was fitted with periodontal disease status as the target variable and the following features: gender, age, education level, marital status, alcohol consumption level, smoking status, brushing before sleep, hypertension, and diabetes-related variables.
Results
In descending order of importance, the leading features in the RF analysis were age, marital status, and the prevalence of hypertension and diabetes. The accuracy of the RF model was 73%, which is not high enough for clinical use.
Conclusions
The RF technique, used here to predict periodontal disease status, is an ensemble method that produces more accurate outputs than a single method. This study provides a step-by-step guide to Orange Data Mining for researchers who want to study machine learning techniques.
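
As a rough illustration of the ensemble idea behind RF, the sketch below builds a toy forest of one-split trees in plain Python. Each tree is trained on a bootstrap sample and may only split on a random subset of the features, and the forest predicts by majority vote. The data are synthetic, and the labelling rule, the three feature names, and the parameter values (`make_data`, `n_trees=51`, `max_features=2`) are assumptions made for illustration only. This is not the study's R or Orange Data Mining workflow, and it does not use the survey data.

```python
import random
from collections import Counter

def make_data(n, rng):
    """Synthetic rows of ((age, smoker, brushes_before_sleep), label)."""
    rows = []
    for _ in range(n):
        age = rng.randint(19, 80)
        smoker = rng.randint(0, 1)
        brushes = rng.randint(0, 1)
        # Hypothetical labelling rule with 10% label noise (not from the survey).
        label = 1 if age + 10 * smoker - 5 * brushes > 50 else 0
        if rng.random() < 0.1:
            label = 1 - label
        rows.append(((age, smoker, brushes), label))
    return rows

def fit_stump(sample, features):
    """Pick the single (feature, threshold) split with the best training accuracy."""
    best = None
    for feature in features:
        for threshold in sorted({x[feature] for x, _ in sample}):
            left = [y for x, y in sample if x[feature] <= threshold]
            right = [y for x, y in sample if x[feature] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            correct = left.count(left_label) + right.count(right_label)
            if best is None or correct > best[0]:
                best = (correct, feature, threshold, left_label, right_label)
    return best[1:]

def predict_stump(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label

def fit_forest(train, n_trees, max_features, rng):
    n_features = len(train[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(train) for _ in train]               # bootstrap sample
        features = rng.sample(range(n_features), max_features)    # random feature subset
        forest.append(fit_stump(sample, features))
    return forest

def predict_forest(forest, x):
    """Majority vote over all trees."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]

def accuracy(forest, rows):
    return sum(predict_forest(forest, x) == y for x, y in rows) / len(rows)

rng = random.Random(0)
data = make_data(600, rng)
train, test = data[:400], data[400:]
forest = fit_forest(train, n_trees=51, max_features=2, rng=rng)
acc = accuracy(forest, test)
print(f"held-out accuracy: {acc:.2f}")
```

A full RF grows deeper trees and re-draws the feature subset at every split; the one-split trees here only keep the example short.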

Keywords

Machine learning; Periodontitis; Random forest

Figures

  • Fig. 1 Association of artificial intelligence, machine learning and deep learning.

  • Fig. 2 Bias-variance trade-off.

  • Fig. 3 Voting classifiers. (A) Hard voting. (B) Soft voting.

  • Fig. 4 Random forest.

  • Fig. 5 Workflow.

  • Fig. 6 (A) Select columns. (B) Impute.

  • Fig. 7 Widget options in Orange Data Mining. (A) CSV file import. (B) Feature constructor. (C) Data table.

  • Fig. 8 Edit domain.

  • Fig. 9 Rank.

  • Fig. 10 Widget options in Orange Data Mining. (A) Random forest. (B) Tree.

  • Fig. 11 Test and score.
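
The hard and soft voting schemes named in Fig. 3 can be contrasted with a small sketch. Hard voting takes the majority of the predicted labels, while soft voting averages the predicted probabilities before thresholding. The three probabilities below are hypothetical values chosen so that the two schemes disagree; they do not come from the study.

```python
from collections import Counter

# Hypothetical class-1 probabilities from three classifiers for one subject.
probabilities = [0.45, 0.40, 0.90]

def hard_vote(probs, threshold=0.5):
    """Each classifier casts one label; the majority label wins."""
    labels = [1 if p >= threshold else 0 for p in probs]
    return Counter(labels).most_common(1)[0][0]

def soft_vote(probs, threshold=0.5):
    """Average the probabilities, then threshold the mean."""
    mean = sum(probs) / len(probs)
    return 1 if mean >= threshold else 0

print(hard_vote(probabilities))  # 0: two of the three labels are 0
print(soft_vote(probabilities))  # 1: the mean probability is about 0.58
```

One confident classifier can outweigh two weakly opposed ones under soft voting, which is why the two schemes can disagree on the same inputs.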

