Harnessing AI and machine learning for credit decisioning

Gordon Campbell is the co-founder and chief customer officer of Rich Data Co. (RDC). Over a 25-year career he has led business strategy, product strategy and digital transformation in a variety of organizations. He is passionate about applying data and AI to enable financial inclusion and social impact. His experience has given him insight into how financial institutions can use technology to drive innovation and enhance existing processes.

“New AI-enabled techniques are always emerging, so enterprises must position themselves to understand and take advantage of them,” he says.

How do AI and machine learning differ from other forms of automated prediction already in use?

By their nature, traditional statistical models can handle only a few data attributes, or features, which are the building blocks of datasets. This usually means modelers must engineer features that are meaningful, and that ends up being a very hands-on process.

Not only can AI and machine learning handle a much broader set of data attributes, but they also enable us to incorporate what are called weak attributes: those that have a lesser impact on the model’s final score or prediction. We’re still able to gain insights from such attributes with models driven by AI and machine learning. For some use cases this means you will get a better, more accurate prediction. University studies have backed this up: you can generate more accurate results with a machine-learned model.

What is AI explainability, and what role does it play when implementing AI?

Explainability is the ability to understand and explain the data used in building a model, and the features built from that data. You should always be clear on the data you selected, the way you engineer features from that data, how you trained and tested the model that uses those features and how you monitor that model in production. When I build a model, I need to understand that I’ve used the right data, but also what that data is telling me inside the model when I’m building and experimenting. And every time I run that model in production with real customer data, I should be able to understand what that data is saying in that specific instance.

At RDC, we ensure there’s explainability throughout the model life cycle. That includes monitoring the features that go into the model so we can determine how they have changed over time. For example, we build a “model rule explainer” for decision tree models, which looks at the rules that have been “learned from the data” and incorporated into the model. This helps data scientists understand and, if necessary, tune the resulting models. We also do a lot of data testing during each phase, so we know that we’re not bringing any unexplained bias into the process, and we use people to identify any unintentional bias.
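One common way to check how a feature has changed over time is the population stability index (PSI), which compares the feature's distribution at training time with its distribution in production. The sketch below is illustrative only: the interview does not say which drift measure RDC uses, and the function name, bin count, and sample data are assumptions made for the example.

```python
import math

def population_stability_index(baseline, current, n_bins=5):
    """PSI between two samples of one numeric feature (higher means more drift)."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so values outside the baseline range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    base, cur = bin_shares(baseline), bin_shares(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, cur))

# Hypothetical feature values: the production sample has shifted upward.
training_sample = list(range(100))
production_sample = [v + 30 for v in training_sample]

psi_same = population_stability_index(training_sample, training_sample)
psi_shifted = population_stability_index(training_sample, production_sample)
```

An unchanged distribution gives a PSI of zero, and larger values indicate a larger shift. A frequently quoted rule of thumb treats values above roughly 0.25 as a significant shift, but any threshold should be set to suit the portfolio being monitored.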

How does interpretability relate to explainability, and what are some of the ways financial services organizations can ensure interpretability?

Interpretability is a key part of explainability. Since models can’t explain themselves, we need to be able to understand them and why they arrive at a prediction. Interpretability is also important for validating your model and identifying any issues that need to be addressed. It can be ensured in several ways.

One approach is XGBoost (eXtreme Gradient Boosting), a decision tree-based model (i.e., if this, then that). XGBoost can assign importance scores to features, which helps data scientists understand which features contribute most to predictions. Other interpretability approaches include SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations), both available as popular Python libraries for model explainability. Both examine model results, at design time and at run time, to interpret model outcomes. SHAP uses Shapley values to score each feature’s influence on a prediction.
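To make the idea behind Shapley values concrete, the sketch below computes them exactly for one prediction by enumerating every coalition of features. It is a minimal illustration, not the SHAP library's implementation (which relies on faster, model-specific or approximate algorithms). The scoring function `toy_score`, the feature names, and the baseline applicant are all hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction, by enumerating feature coalitions."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the applicant's values; the rest stay at baseline.
        mixed = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Weight of coalitions of size k in the Shapley average.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi

def toy_score(x):
    """Hypothetical credit score with one interaction between two features."""
    score = 0.5 - 0.3 * x["debt_to_income"] + 0.002 * x["months_in_business"]
    if x["missed_payments"] > 0 and x["debt_to_income"] > 0.4:
        score -= 0.2
    return score

applicant = {"debt_to_income": 0.5, "months_in_business": 24, "missed_payments": 2}
reference = {"debt_to_income": 0.3, "months_in_business": 60, "missed_payments": 0}
contributions = shapley_values(toy_score, applicant, reference)
```

The contributions always sum to the difference between the applicant's score and the reference score, which is what makes them usable as a per-decision explanation. Exact enumeration grows exponentially with the number of features, so it is practical only for small examples like this one.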

To cite a real-life example of why the interpretability of a decision matters, there’s a well-known case of a self-driving Uber hitting and killing a pedestrian. The system first detected something on the road six seconds before the collision. It oscillated in its predictions but correctly reported that there was something in front of the car; however, the system failed to act properly. A rule that ran over the sensor data to enforce the safe stopping distance did not trigger in time to alert the safety driver to intervene. So, it’s not just a matter of the model generating a prediction; it’s a combination of the prediction, the accurate interpretation of what it means, and the rule, tuned by human expertise, that acts on it. That’s why predictions and rules go hand-in-hand, accompanied by an auditable lineage of the decision, to determine why an outcome occurred.

What are some of the challenges involved in ensuring data quality?

The data quality challenge remains the same whether you’re running a statistical model or a machine-learned one. Organizations have become more adept at data governance in general and are actively considering best practices for the collection, labeling and storage of data. When we start our modeling work, we begin by understanding the data lineage and how we’re using the data from that point forward, to ensure we don’t accidentally inherit any quality issues.

One good thing about AI is that some techniques let you work with slightly less-than-perfect data, so you can address quality issues within the model or by using the explainability approaches I previously mentioned.

Another factor in data quality is that the tooling and techniques available have evolved in the last five or so years as more organizations invest in data quality and related governance practices. We are much more adept at handling missing values, duplicate values, badly formatted values, outliers and so on.
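The kinds of fixes mentioned above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of any particular tool: the record layout, the `revenue` field, the median fill, and the cap value are all hypothetical, and the code assumes at least one record has a parseable value.

```python
from statistics import median

def parse_amount(raw):
    """Convert strings like '$1,200.50' to float; return None if missing or unparseable."""
    if raw is None:
        return None
    text = str(raw).replace("$", "").replace(",", "").strip()
    try:
        return float(text)
    except ValueError:
        return None

def clean_records(records, key="id", field="revenue", cap=None):
    seen, cleaned = set(), []
    for rec in records:
        if rec[key] in seen:  # drop duplicate rows, keeping the first one seen
            continue
        seen.add(rec[key])
        cleaned.append({**rec, field: parse_amount(rec.get(field))})

    known = [r[field] for r in cleaned if r[field] is not None]
    fill = median(known)  # impute missing values with the median of known values
    for r in cleaned:
        # Keep a flag so the imputation stays visible to anyone explaining the model.
        r[field + "_was_missing"] = r[field] is None
        if r[field] is None:
            r[field] = fill
        if cap is not None:
            r[field] = min(r[field], cap)  # cap extreme outliers
    return cleaned

# Hypothetical raw input with a duplicate, gaps, odd formatting and an outlier.
raw_records = [
    {"id": 1, "revenue": "$1,200.50"},
    {"id": 2, "revenue": None},
    {"id": 2, "revenue": "900"},
    {"id": 3, "revenue": "n/a"},
    {"id": 4, "revenue": "800"},
    {"id": 5, "revenue": "9,000,000"},
]
cleaned = clean_records(raw_records, cap=5000)
```

Recording which values were imputed, rather than silently overwriting them, keeps the cleaning step consistent with the explainability and lineage goals discussed earlier.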

What recommendations do you have for financial services organizations starting their AI journey?

Organizations looking to leverage AI will need to experiment until they understand how to manage it. Experimenting can alert you to the skills you’ll need inside the organization to take advantage of better datasets, whether it’s people skills or certain capabilities. All of that’s part of the learning process. It’s also important to think through the model management and governance approach because these differ for AI versus statistical models. The quality of AI models deteriorates more quickly than that of statistical models, so while enterprises already have model governance and management processes in place, they’ll have to determine which processes they need to adapt for an AI-driven model.

At RDC, we think AI should be something that becomes very natural in financial services organizations. Many of the enterprises we work with are initially apprehensive about using AI-driven models, but once we start implementing, they recognize a lot of the processes they’re already familiar with. So, we demystify the process through the education we do with our customers.

How should enterprises approach choosing an AI partner?

You’ve got to take great care when picking an AI partner. While you may find a vendor that’s well-versed in AI and machine learning, that doesn’t mean they understand credit. When searching for an AI partner, it’s vital to balance technology and domain expertise. Choosing a partner that understands the domain, its problems and the way you might work through some of those problems is key. In the credit domain, there are governance processes and procedures that a credit team will need to follow. This is so they can demonstrate internally, and externally to regulators, that they have followed the appropriate processes to build, test and run models.

We work directly with credit teams, speak their language and involve them in all parts of the modeling process to help them gain confidence that our approach is safe and responsible.

What does responsible AI deployment look like in the credit risk arena?

We want to ensure responsible lending decisions, and we need to be accurate. We also need to be sure we’re not generating biased predictions. That’s why explainability and interpretability are so important. Understanding the model’s data and why it has made certain predictions helps ensure that we’re not including a characteristic that could unfairly impact a certain demographic or group. This is one reason, for example, that we don’t include, or need, personally identifiable information (PII) in our models, as it does not contribute to a financial understanding of that person or business.

The bank is ultimately responsible for ensuring that the use of that data is aligned with its own credit policy and that the policy is ethically aligned. We provide the evidence to give the bank comfort that it is achieving this outcome. We ensure that documentation exists on the data, features, models and rules, and we monitor the decisions with complete transparency.


Read more AI best practice recommendations in the BAI Special Report “Leveraging AI with Human Capital.”
