Mortgage default prediction is a standing concern for financial institutions. Banks need it for provision planning, while regulators monitor the systemic risk that the mortgage sector may pose. This research focuses on predicting defaults on a one-year horizon by applying machine-learning methods to data from the Ukrainian credit registry. It is useful not only for academia but also for policymakers, since it helps to assess the need for implementing macroprudential instruments. We tested and compared two data-balancing techniques: weighting the original sample and the synthetic minority oversampling technique (SMOTE). Random forest and extreme gradient-boosting decision trees proved to be better classifiers in terms of both accuracy and precision. These models struck the necessary balance between precision in identifying actual defaults and minimizing false default predictions. We also tested neural networks, linear discriminant analysis, support vector machines with linear kernels, and decision trees, but their results were similar to those of logistic regression. The results suggest that real gross domestic product (GDP) growth and the debt-service-to-income ratio (DSTI) are good predictors of default. This means that a realistic GDP forecast, together with a proper assessment of the borrower's DSTI over the loan history, can help predict default on a one-year horizon. Adding other variables, such as the borrower's age and the loan interest rate, can also be beneficial. However, the residual maturity of a mortgage loan does not contribute to default probability, which means that banks should treat new borrowers and those who have nearly repaid their loans equally.
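The two balancing approaches named above can be illustrated with a minimal sketch. It is not the implementation used in this research: the toy observations (DSTI, real GDP growth) are made-up values, the function names `class_weights` and `smote_like` are hypothetical, the weighting formula is one common inverse-frequency convention, and the oversampler is a simplified stand-in for SMOTE that interpolates between a minority point and one of its nearest minority neighbours.

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by linear interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, other)))
    return synthetic


# Hypothetical toy sample: (DSTI, real GDP growth in %); label 1 = default.
X = [(0.20, 3.0), (0.30, 2.5), (0.25, 3.1), (0.35, 2.0), (0.28, 2.8),
     (0.22, 3.3), (0.70, -4.0), (0.80, -6.0), (0.75, -5.0)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

# Approach 1: keep the original sample and weight the rare class more heavily.
weights = class_weights(y)

# Approach 2: add synthetic defaults until the classes are the same size.
minority = [x for x, label in zip(X, y) if label == 1]
synthetic = smote_like(minority, n_new=3)
X_balanced = X + synthetic
y_balanced = y + [1] * len(synthetic)
```

Weighting leaves the sample untouched and changes only how much each observation counts in the loss, whereas oversampling changes the sample itself; in either case the balancing would normally be applied to the training data only, so that evaluation reflects the original class proportions.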
This work is licensed under a Creative Commons Attribution 4.0 International License.