Classification of Variables Affecting Birth Weight by Decision Trees and K-Nearest Neighbor Methods

Creative Commons License


International Journal of Scientific and Technological Research, vol.5, no.12, pp.112-119, 2019 (Peer-Reviewed Journal)


Objective: The aim of this study was to determine, with high accuracy, the factors affecting the birth weight of infants using several decision tree algorithms and the k-nearest neighbor method, and to evaluate the performance of these algorithms in classifying low birth weight. Material and Methods: Classification algorithms can generally be grouped under two headings, “unsupervised” and “supervised”. “Decision trees” and “k-nearest neighbor” are supervised data mining algorithms; they are nonparametric methods with predictive capability. These algorithms were applied for classification to identify the explanatory variables with the greatest effect on the birth weight of babies. Among decision trees, the “CART, CHAID, exhaustive CHAID, QUEST, Random Forest and C4.5” algorithms were used. For the k-nearest neighbor algorithm, the “Euclidean” and “Manhattan” distance measures were applied. Results: The highest sensitivity, 88.4%, was observed with the “CART” algorithm. The highest specificity, 98.2%, was observed with the “Random Forest” algorithm. The highest accuracy, 94.5%, was observed with the “C4.5” algorithm. The lowest risk estimate, 5.6%, was also observed with the “C4.5” algorithm. Conclusion: The results indicate that all of the algorithms achieved “good classification, high estimation and low error rate”. This study may contribute to the early investigation of whether newborn babies have low birth weight, and thus to the taking of preventive measures.
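The distance measures and evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example points, and the confusion-matrix counts are hypothetical, and the risk estimate is assumed here to be the misclassification rate (1 minus accuracy), which the abstract does not define.

```python
import math


def euclidean(a, b):
    """Euclidean (straight-line) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classification_metrics(tp, fn, tn, fp):
    """Evaluation criteria from confusion-matrix counts.

    tp/fn: low-birth-weight cases classified correctly/incorrectly.
    tn/fp: normal-weight cases classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    return {
        "sensitivity": tp / (tp + fn),  # share of low-birth-weight cases detected
        "specificity": tn / (tn + fp),  # share of normal-weight cases detected
        "accuracy": accuracy,           # share of all cases classified correctly
        "risk": 1 - accuracy,           # assumption: risk = misclassification rate
    }


# Hypothetical two-feature points, not data from the study.
d_euclid = euclidean((0, 0), (3, 4))
d_manhat = manhattan((0, 0), (3, 4))

# Hypothetical confusion-matrix counts, not results from the study.
metrics = classification_metrics(tp=40, fn=10, tn=90, fp=10)

print(d_euclid, d_manhat)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The two distance functions differ only in how per-feature differences are combined, which is why the choice of measure can change which neighbors the k-nearest neighbor algorithm selects.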