Study goals
To assess the performance of XGBoost in predicting financial distress among Latin American utility companies, identify the most relevant accounting and market variables, and compare the results with those of traditional models.
Relevance / originality
The study applies machine learning with a sector-specific focus, enhancing accuracy and interpretability in emerging-market contexts. It analyzes predictive metrics and variable importance, reinforcing the case for sector-specific approaches, which remain underexplored in corporate financial risk prediction in Latin America.
Methodology / approach
The dataset covers 103 companies from six countries (2000–2019), with multiple definitions of financial distress applied (FD1–FD4). The approach uses a temporal train/test split, undersampling, and an XGBoost implementation in Python. Evaluation covers AUC, accuracy, Brier score, Type I/II errors, and variable importance.
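The split and resampling steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the firm-year panel is synthetic, the field names (`firm`, `year`, `distress`), the 2014 cutoff, and the helper functions `temporal_split` and `undersample` are hypothetical, and undersampling is applied to the training period only, which the abstract does not specify.

```python
import random
from collections import Counter


def temporal_split(rows, cutoff_year):
    """Train on firm-years up to cutoff_year, test on strictly later ones."""
    train = [r for r in rows if r["year"] <= cutoff_year]
    test = [r for r in rows if r["year"] > cutoff_year]
    return train, test


def undersample(rows, label="distress", seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for r in rows:
        by_class[r[label]].append(r)
    n = min(len(by_class[0]), len(by_class[1]))
    balanced = rng.sample(by_class[0], n) + rng.sample(by_class[1], n)
    rng.shuffle(balanced)
    return balanced


# Synthetic firm-year panel (illustrative only, not the study's data):
# 20 hypothetical firms observed yearly from 2000 to 2019, ~10% distressed.
rows = [
    {"firm": f, "year": y, "distress": int((f * 7 + y) % 10 == 0)}
    for f in range(20)
    for y in range(2000, 2020)
]

# Assumed cutoff: fit on 2000-2014, evaluate on 2015-2019.
train, test = temporal_split(rows, cutoff_year=2014)

# Balance only the training period; the test period keeps its natural class mix.
train_balanced = undersample(train)
class_counts = Counter(r["distress"] for r in train_balanced)
```

In a setup like this, the balanced training rows would then be passed to a classifier such as `xgboost.XGBClassifier`, while the untouched test period provides the out-of-time evaluation.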
Main results
XGBoost outperformed logistic regression and random forest, especially under FD1 and FD4 (accuracy > 94%, Type II error = 0). The key variables were profitability (FD1, FD4) and market valuation and sales growth (FD2, FD3). Sector specialization improved predictive consistency.
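To make the reported metrics concrete, the sketch below computes accuracy, Type I/II error rates, Brier score, and AUC from predicted probabilities. It rests on assumptions not stated in the abstract: the positive class (1) is taken to mean "distressed", the 0.5 threshold is arbitrary, and Type I is read as a false positive (healthy firm flagged) and Type II as a false negative (distressed firm missed). Part of the distress-prediction literature uses the opposite labeling, so the mapping should be checked against the full paper. The labels and probabilities are made up, and `evaluate` and `pairwise_auc` are hypothetical helper names.

```python
def evaluate(y_true, y_prob, threshold=0.5):
    """Return accuracy, Type I/II error rates and Brier score for binary labels."""
    y_pred = [int(p >= threshold) for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Assumed convention: Type I = healthy firm flagged as distressed.
        "type_i": fp / (fp + tn) if (fp + tn) else 0.0,
        # Assumed convention: Type II = distressed firm classified as healthy.
        "type_ii": fn / (fn + tp) if (fn + tp) else 0.0,
        # Brier score: mean squared gap between probability and outcome.
        "brier": sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true),
    }


def pairwise_auc(y_true, y_prob):
    """AUC as the share of (distressed, healthy) pairs ranked correctly; ties count 0.5."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities (illustrative only).
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_prob = [0.9, 0.7, 0.2, 0.6, 0.1, 0.3, 0.8, 0.4]

metrics = evaluate(y_true, y_prob)
auc = pairwise_auc(y_true, y_prob)
```

Under the assumed convention, a Type II error of 0 means that no distressed firm in the test period was classified as healthy at the chosen threshold.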
Theoretical / methodological contributions
The study demonstrates that sector-specific machine learning models improve performance and interpretability in financial risk prediction. It integrates predictive metrics and variable analysis, expanding the literature on XGBoost applications in emerging markets and proposing a replicable approach for future studies.
Social / management contributions
The model supports analysts, investors, and regulators in the early detection of financial deterioration, helping to optimize credit decisions, risk mitigation, and capital allocation. It enhances predictability and transparency, strengthening strategic management and stability in the Latin American utility sector.