A comparative study of ensemble methods in the field of education: Bagging and Boosting algorithms



Şevgin H.

INTERNATIONAL JOURNAL OF ASSESSMENT TOOLS IN EDUCATION, vol.10, no.3, pp.546-562, 2023 (ESCI)

  • Publication Type: Article / Full Article
  • Volume: 10 Issue: 3
  • Publication Date: 2023
  • DOI: 10.21449/ijate.1167705
  • Journal Name: INTERNATIONAL JOURNAL OF ASSESSMENT TOOLS IN EDUCATION
  • Journal Indexes: Emerging Sources Citation Index (ESCI), Central & Eastern European Academic Source (CEEAS), ERIC (Education Resources Information Center), TR DİZİN (ULAKBİM)
  • Pages: pp.546-562
  • Affiliated with Van Yüzüncü Yıl University: Yes

Abstract

This study compares the Bagging and Boosting algorithms among ensemble methods by evaluating the classification performance of two methods built on them, Random Forest (Bagging) and TreeNet (Boosting), on educational data from the ABİDE application. These methods were chosen because both are ensemble methods that combine decision trees through Bagging or Boosting and produce a single outcome by aggregating the outputs of the individual trees. The data set consists of mathematics scores from the 2016 implementation of ABİDE (Academic Skills Monitoring and Evaluation) and various demographic variables about the students. The study group comprised 5000 randomly selected students; after the deletion of missing data and the imputation procedures, this number decreased to 4568. The analyses showed that, across sample sizes, the TreeNet method performed better in terms of classification accuracy, sensitivity, F1-score and AUC value, whereas the Random Forest method performed better on specificity and accuracy. TreeNet also produced lower values than Random Forest on all numerical estimation error rates at every sample size. Considering all conditions together, including sample size, cross-validation and the performance criteria, TreeNet can be said to exhibit higher classification performance than Random Forest on the ABİDE data. In contrast to relying on a single classifier or predictive method, classification or prediction with multiple methods through Boosting and Bagging algorithms is considered important for the results obtained in education.
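The performance criteria named in the abstract (accuracy, sensitivity, specificity, F1-score) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the study's own analysis code. The function names, the label coding (1 as the positive class) and the toy labels are assumptions made for illustration only. AUC is left out because it requires predicted scores rather than hard class labels.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Cells of a binary confusion matrix."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def count_outcomes(y_true, y_pred, positive=1):
    """Tally confusion-matrix cells; `positive` is the assumed positive label."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c):
    """Compute the standard label-based criteria from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    sensitivity = safe_div(c.tp, c.tp + c.fn)  # true positive rate (recall)
    specificity = safe_div(c.tn, c.tn + c.fp)  # true negative rate
    precision = safe_div(c.tp, c.tp + c.fp)    # positive predictive value
    accuracy = safe_div(c.tp + c.tn, total)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Toy labels (hypothetical, not ABİDE data): 4 positive and 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(count_outcomes(y_true, y_pred))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Running the same function on the predictions of two classifiers for the same test cases gives a like-for-like comparison on each criterion, which is the kind of comparison the abstract reports for TreeNet and Random Forest.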