Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya

Victor Opiyo; Emma Anyika

doi:10.11648/j.ajai.20250902.22

Research Article | Peer-Reviewed

Published in American Journal of Artificial Intelligence (Volume 9, Issue 2)

Received: 14 September 2025 Accepted: 24 September 2025 Published: 27 October 2025

Abstract

Low birth weight (LBW) is a prevalent public health challenge in low- and middle-income countries, including Kenya, where approximately 11.5% of newborns are affected. LBW is linked to heightened infant mortality, infections, and long-term developmental issues. While machine learning (ML), particularly ensemble learning, has demonstrated potential in improving LBW risk prediction, its application in resource-limited settings like Kenya remains underexplored. Prior research has largely focused on developed countries with limited adoption in sub-Saharan Africa, highlighting a crucial gap this study aims to address. This research develops and evaluates ensemble machine learning models to predict LBW risk using nationally representative data from the 2022 Kenya Demographic and Health Survey. The study integrates traditional clinical indicators with advanced computational methods, employing base classifiers such as Support Vector Machines and Logistic Regression alongside ensemble methods including Random Forest, Gradient Boosting, and Extreme Gradient Boosting. Meta-ensemble approaches such as bagging, voting, and stacking were also assessed. Data preprocessing included treatment of missing values, encoding categorical variables, and addressing class imbalance through the Synthetic Minority Over-sampling Technique (SMOTE). Models were trained and validated using stratified cross-validation and independent testing, with evaluation metrics comprising ROC AUC, accuracy, F1-score, Matthews Correlation Coefficient, and Brier score, emphasizing both discrimination and calibration. Results indicate that Random Forest outperformed other models, achieving a high ROC AUC of 0.957 and PR AUC of 0.971, with excellent calibration (Brier score 0.089), evidencing its strong predictive capability for LBW risk in the Kenyan context. 
Important predictors identified were gestational age, maternal height and weight, antenatal care utilization, and socioeconomic factors, consistent with known biological and contextual determinants. Ethical considerations regarding patient privacy, algorithmic fairness, and transparency were incorporated to promote responsible AI use in healthcare. The findings demonstrate that tailored ensemble learning models provide robust, interpretable, and practical tools for LBW prediction in low-resource settings. This work fills a critical research gap by applying advanced ML methods to Kenyan maternal-child health data, offering potential to enhance clinical decision-making and improve maternal and neonatal outcomes. The study underscores the importance of contextualized AI solutions and ethical governance for sustainable healthcare innovation.
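
Two of the evaluation metrics named above, the Brier score (calibration) and the Matthews Correlation Coefficient (classification quality under class imbalance), can be written out directly from their definitions. The following is a minimal sketch, not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all hypothetical and are not drawn from the 2022 Kenya Demographic and Health Survey.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and the 0/1 outcome.

    Lower is better; 0.0 means perfectly confident, correct predictions.
    """
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """Matthews Correlation Coefficient for binary labels (1 = LBW, 0 = not LBW).

    Ranges from -1 to 1; returns 0.0 when the denominator is zero
    (a common convention, assumed here).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: true labels and predicted LBW probabilities.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.6, 0.1, 0.3, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
print(f"MCC:         {matthews_corrcoef(y_true, y_pred):.3f}")
```

The Brier score is computed on the predicted probabilities themselves, while the MCC depends on the chosen threshold, so the two metrics capture different aspects of model quality (calibration versus thresholded classification).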

Published in	American Journal of Artificial Intelligence (Volume 9, Issue 2)
DOI	10.11648/j.ajai.20250902.22
Page(s)	198-209
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2025. Published by Science Publishing Group

Keywords

Low Birth Weight, Ensemble Learning, Machine Learning, Predictive Modelling, Kenya

Cite This Article

APA Style

Opiyo, V., Anyika, E. (2025). Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya. American Journal of Artificial Intelligence, 9(2), 198-209. https://doi.org/10.11648/j.ajai.20250902.22

ACS Style

Opiyo, V.; Anyika, E. Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya. Am. J. Artif. Intell. 2025, 9(2), 198-209. doi: 10.11648/j.ajai.20250902.22

AMA Style

Opiyo V, Anyika E. Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya. Am J Artif Intell. 2025;9(2):198-209. doi: 10.11648/j.ajai.20250902.22

@article{10.11648/j.ajai.20250902.22,
  author = {Victor Opiyo and Emma Anyika},
  title = {Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya},
  journal = {American Journal of Artificial Intelligence},
  volume = {9},
  number = {2},
  pages = {198-209},
  doi = {10.11648/j.ajai.20250902.22},
  url = {https://doi.org/10.11648/j.ajai.20250902.22},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajai.20250902.22},
 year = {2025}
}

TY  - JOUR
T1  - Leveraging Ensemble Models for Optimizing Predictive Accuracy of Low Birthweight Risk in Kenya
AU  - Victor Opiyo
AU  - Emma Anyika
Y1  - 2025/10/27
PY  - 2025
N1  - https://doi.org/10.11648/j.ajai.20250902.22
DO  - 10.11648/j.ajai.20250902.22
T2  - American Journal of Artificial Intelligence
JF  - American Journal of Artificial Intelligence
JO  - American Journal of Artificial Intelligence
SP  - 198
EP  - 209
PB  - Science Publishing Group
SN  - 2639-9733
UR  - https://doi.org/10.11648/j.ajai.20250902.22
VL  - 9
IS  - 2
ER  -

Author Information

Victor Opiyo

Department of Computer Science and Information Technology, Cooperative University of Kenya, Nairobi, Kenya

http://orcid.org/0009-0006-8445-6671

Emma Anyika

Department of Computer Science and Information Technology, Cooperative University of Kenya, Nairobi, Kenya

http://orcid.org/0000-0001-6418-5305
