Operational efficiency of debt collection companies

ML in debt recovery
Artificial Intelligence
Team Outsourcing
Payment model:
Fixed price
6 months


One of the largest debt collection companies in Poland, serving more than 4 million customers, decided to work with us on two AI/ML projects. The client’s confidence stemmed from our earlier successful collaboration on consulting and training, and from our experience in applying mathematical methods to debt management.


Valuing debt packages before purchase is one of the core activities of a debt collection company. Done carefully, the full valuation process is time-consuming, yet market conditions rarely leave that much time to value a package. The client had only limited information on receivables from new clients but could draw on historical repayment data for known clients, which gave it a basis for building its own valuation model tailored to the needs and specifics of its clients.

The client’s goal was not only to value individual portfolios more accurately but also to provide more precise information explaining why a particular valuation was made. This allows analysts valuing debt packages to pass additional information to those making the final decision on the purchase price. The result is more effective decisions in the debt purchase process.


The main idea of the project was to develop an alternative method to the existing one for valuing debt packages before their purchase. The goal was to obtain the most accurate, repeatable, and fast valuation of a debt package. Its key element was to predict the recovery rate of the purchased debt. The project involved a four-person team consisting of three analysts and a programmer supervised by an expert in data analytics.

Product workshops 

The first stage of the project was a series of workshops to identify:

– The scope of the project (including the types of debt portfolios to be valued).

– The functionality of the solution.

Evaluation of data model validation criteria

We proposed a rather complex validation scheme to reflect the conditions under which the model will be used. To this end, we used custom methods based on historical data and multivariate statistics (with optimization). These were prepared to ensure both the flexibility of the method and the accuracy of the recovery forecast.
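
The validation scheme itself is not described in detail here, so the following is only a minimal sketch of one common way to mimic real usage conditions: each historical package is valued using only packages bought in earlier periods, and the prediction is compared with the recovery rate actually observed. The function names, the period field, the mean-rate predictor, and all figures are hypothetical and are not taken from the project.

```python
def backtest(packages, predict):
    """Mean absolute error of an out-of-time backtest.

    packages: list of (purchase_period, features, actual_rate), in any order.
    predict(features, history) -> predicted recovery rate, where history is a
    list of (features, actual_rate) pairs from strictly earlier periods.
    Assumption: recovery of earlier packages is fully observed by then.
    """
    ordered = sorted(packages, key=lambda p: p[0])
    errors = []
    for period, features, actual in ordered:
        # Only packages bought before this one may inform its valuation.
        history = [(f, r) for t, f, r in ordered if t < period]
        if not history:
            continue  # nothing to learn from yet, so skip this package
        errors.append(abs(predict(features, history) - actual))
    return sum(errors) / len(errors) if errors else None


def mean_rate(features, history):
    """Placeholder predictor: average recovery rate of all earlier packages."""
    return sum(rate for _, rate in history) / len(history)


# Made-up packages: (purchase period, features, observed recovery rate).
PACKAGES = [
    (1, (1000.0,), 0.30),
    (2, (1100.0,), 0.20),
    (3, (900.0,), 0.25),
]

mae = backtest(PACKAGES, mean_rate)
print(f"mean absolute error: {mae:.3f}")
```

Any candidate model with the same `predict(features, history)` shape can be passed to `backtest`, which makes it possible to compare approaches on equal terms.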

Our goal was to provide computational engines that are fed with data and, as a result, provide predictions. The applications used for pricing were built at the client’s site, feeding our engine with data and receiving the results.

Building a package pricing model 

The purpose of creating the model was to achieve the most accurate, reproducible, and rapid valuation of the debt package.

There are several approaches to modeling based on ML/AI that can be used in such a task. The choice depends on the size of the database, the number of debtors, the willingness to use methods that require longer calculations, and other factors. Choosing an approach is not a simple task, as it requires an understanding of the client’s needs and business constraints, as well as those arising from the size and quality of the data.

The approaches we chose are listed below.

  • Distance methods – A debt package was valued based on the historical behavior of the most similar packages the debt collection company had previously bought. Applying this approach required preparing a (micro)segmentation of packages, defining a similarity measure suited to the purpose, optimizing parameters in multiple dimensions (e.g., the number of similar packages used for the valuation), and developing measures that allowed reliable validation of the solution.
  • Multidimensional look-up table – Historical cases were divided into small parcels using the features describing them, with numerical features split into ranges. The package of cases being valued was then divided in the same way. The behavior of each resulting parcel of valued cases was predicted from the behavior of the corresponding parcel of historical cases.
  • Regression approach – A regression machine learning model predicted the recovery percentage for each case in a package based on its characteristics; these per-case predictions fed each iteration of the package pricing model.
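
The first two approaches can be illustrated with a minimal sketch. It is not the client's implementation: the features (debt amount, debt age in months), the scaled Euclidean distance, the bin edges, the amount-weighted aggregation, and all figures are assumptions chosen only to make the ideas concrete. The regression approach is left out because it would normally rely on a machine learning library.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical historical packages: ((avg amount, avg age in months), recovery rate).
HISTORY = [
    ((1000.0, 12.0), 0.30),
    ((1200.0, 14.0), 0.28),
    ((5000.0, 36.0), 0.10),
    ((5200.0, 40.0), 0.08),
]


def distance_valuation(features, history, k=2, scales=(1000.0, 12.0)):
    """Distance method: average recovery rate of the k most similar packages."""
    def dist(a, b):
        # Scaled Euclidean distance; the scales are an illustrative choice.
        return sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))

    nearest = sorted(history, key=lambda item: dist(features, item[0]))[:k]
    return sum(rate for _, rate in nearest) / len(nearest)


# Hypothetical historical cases: (amount, age in months, recovered fraction).
CASES = [
    (500.0, 6.0, 0.50),
    (800.0, 10.0, 0.40),
    (3000.0, 30.0, 0.10),
    (4000.0, 28.0, 0.20),
]
AMOUNT_EDGES = (1000.0,)  # two ranges: below 1000, and 1000 or more
AGE_EDGES = (24.0,)       # two ranges: below 24 months, and 24 or more


def bin_index(value, edges):
    """Index of the range that value falls into."""
    return sum(value >= edge for edge in edges)


def cell(amount, age):
    """Look-up table key for a single case."""
    return (bin_index(amount, AMOUNT_EDGES), bin_index(age, AGE_EDGES))


def build_lookup(cases):
    """Look-up table: mean recovered fraction per cell of historical cases."""
    sums = defaultdict(lambda: [0.0, 0])
    for amount, age, recovered in cases:
        entry = sums[cell(amount, age)]
        entry[0] += recovered
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def lookup_valuation(package, table, default):
    """Amount-weighted recovery rate of a package of (amount, age) cases."""
    total = sum(amount for amount, _ in package)
    expected = sum(
        amount * table.get(cell(amount, age), default) for amount, age in package
    )
    return expected / total


table = build_lookup(CASES)
print(distance_valuation((1100.0, 13.0), HISTORY))
print(lookup_valuation([(600.0, 8.0), (2000.0, 26.0)], table, default=0.3))
```

The `default` argument covers cells with no historical cases, which is one place where a manual adjustment by an analyst may be needed.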

In subsequent stages, we delivered gradually improved models for integration into the client’s system. After initial testing and analysis of the results, it turned out that some assumptions made at the start of the project needed changes that were minor in themselves but had considerable consequences. This is typical of projects that use mathematical methods. Because the model-building code had been prepared with such changes in mind and models were delivered continuously, the changes were incorporated quickly in the next modeling iteration.


The project was successful, and the solution is now being used to price portfolios. Reports designed for valuation experts help the client’s teams understand the reasons behind the algorithms’ pricing and facilitate manual adjustments (if necessary, e.g., due to insufficient historical data).

The client greatly appreciated the successful cooperation and good communication within the team, especially the ability to adapt flexibly to requirements. In addition, we suggested how to present the results and additional analysis in a report for analysts valuing debt portfolios. This proved to be a great help in the valuation process.

“We were impressed with the development of dedicated algorithms for our data and needs.” Board Member & CIO, Debt Collection Firm 

