Sampling Bias and Scorecard Evaluation

In this talk, we introduce a Bayesian evaluation framework that leverages unlabeled data to predict more accurately how a credit scorecard will perform once it is put into production.

Requirements

  • Familiarity with credit scoring and supervised machine learning is beneficial.

Description

We revisit the problem of sample selection bias in application scoring. Unlike most prior research, our focus is not on how sampling bias impacts the development of credit scorecards. Instead, we study how using a biased sample (i.e., one containing only accepted credit applicants) impacts the evaluation of scorecards. We demonstrate that standard practices for scorecard evaluation substantially underestimate the true bad rate. More importantly, we propose a Bayesian evaluation framework to mitigate the adverse effect of sampling bias on scorecard evaluation. Our framework is agnostic to the employed performance measure and supports a range of standard indicators including, but not limited to, the AUC, the Kolmogorov-Smirnov (KS) statistic, the Brier score, and any measure derived from the confusion matrix such as precision, recall, or the F-score.
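
The description above does not spell out the mechanics of the framework, but its core idea, namely that the unlabeled rejected applicants should enter the evaluation instead of being discarded, can be sketched with a small Monte-Carlo simulation. The snippet below is an illustrative sketch only: the self-consistent Bernoulli imputation prior, the synthetic scores, and the helper name bayesian_metric_estimate are assumptions made for demonstration, not the method presented in the talk. Because the metric is passed in as a callable, the same sketch accommodates the AUC, the Brier score, or any confusion-matrix-based indicator, mirroring the measure-agnostic property claimed above.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def bayesian_metric_estimate(scores_acc, y_acc, scores_rej, metric, n_draws=1000):
    # Monte-Carlo estimate of a scorecard metric on the full applicant
    # population (accepts + rejects). Labels of rejected applicants are
    # unknown, so each draw imputes them from a Bernoulli whose parameter
    # is the scorecard's predicted default probability for that reject.
    # This is a simple illustrative prior, not necessarily the talk's posterior.
    draws = []
    for _ in range(n_draws):
        y_rej = rng.binomial(1, scores_rej)            # sampled unknown labels
        y_all = np.concatenate([y_acc, y_rej])
        s_all = np.concatenate([scores_acc, scores_rej])
        draws.append(metric(y_all, s_all))
    draws = np.asarray(draws)
    return draws.mean(), np.percentile(draws, [2.5, 97.5])

# Toy data: labeled accepts (lower risk) and unlabeled rejects (higher risk).
scores_acc = rng.uniform(0.02, 0.30, size=500)         # predicted PDs, accepts
y_acc      = rng.binomial(1, scores_acc)               # observed outcomes
scores_rej = rng.uniform(0.25, 0.80, size=300)         # predicted PDs, rejects (no labels)

print("AUC on accepts only:", round(roc_auc_score(y_acc, scores_acc), 3))
mean_auc, ci = bayesian_metric_estimate(scores_acc, y_acc, scores_rej, roc_auc_score)
print("Monte-Carlo AUC on accepts + rejects:", round(mean_auc, 3), "95% interval:", ci.round(3))
```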

Extensive empirical experiments using synthetic and real-world microlending data provide strong evidence that our Bayesian framework predicts the true performance of a scorecard in production much better than standard practices and benchmarks taken from the literature. 

Lastly, we propose an approach to shed light on the business value of better practices for scorecard performance prediction. Considering a setting in which an institution aims to optimize its acceptance policy and seeks to determine the profit-maximizing acceptance rate, we find that Bayesian evaluation outperforms alternative approaches by a notable margin across a wide range of LGDs. For example, assuming an average loan amount of $375 and an average LGD of 50% (based on our microlending data), we show that the incremental profit of Bayesian evaluation over the baseline of using only data from accepted clients is roughly $30 per loan. This increase in profit stems from the fact that Bayesian evaluation recommends a better (i.e., more profitable) acceptance rate.
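
To make the acceptance-rate argument concrete, the sketch below evaluates a deliberately simplified profit model at several acceptance rates. The loan amount ($375) and LGD (50%) are taken from the description; the 20% interest margin, the Beta-distributed default probabilities, and the function expected_profit_per_applicant are illustrative assumptions rather than figures from the talk. The sketch shows how the chosen acceptance rate drives expected profit through the bad rate of the accepted pool, which is exactly the quantity that an accepts-only evaluation misjudges.

```python
import numpy as np

rng = np.random.default_rng(1)

LOAN_AMOUNT = 375.0   # average loan amount quoted in the description
LGD         = 0.50    # average loss given default quoted in the description
MARGIN      = 0.20    # assumed interest margin on a repaid loan (illustrative only)

def expected_profit_per_applicant(pd_scores, accept_rate):
    # Accept the `accept_rate` share of applicants with the lowest predicted
    # default probability, then compute expected profit under a simplified
    # model: margin earned on repaid loans, LGD * amount lost on defaults.
    cutoff = np.quantile(pd_scores, accept_rate)
    accepted = pd_scores[pd_scores <= cutoff]
    p_bad = accepted.mean()                      # scores treated as calibrated PDs
    profit_per_loan = (1 - p_bad) * MARGIN * LOAN_AMOUNT - p_bad * LGD * LOAN_AMOUNT
    return accept_rate * profit_per_loan         # scaled by the share of applicants funded

pd_scores = rng.beta(2, 5, size=10_000)          # synthetic default probabilities
for rate in (0.3, 0.5, 0.7, 0.9):
    print(f"accept {rate:.0%} -> expected profit "
          f"{expected_profit_per_applicant(pd_scores, rate):6.2f} USD per applicant")
```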

Meet the Instructors!

About the Instructor

Stefan received a PhD from the University of Hamburg in 2007, where he also completed his habilitation on decision analysis and support using ensemble forecasting models in 2012. He then joined Humboldt University of Berlin in 2014, where he heads the Chair of Information Systems at the School of Business and Economics. He serves as an associate editor for the International Journal of Business Analytics, Digital Finance, and the International Journal of Forecasting, and as a department editor of Business & Information Systems Engineering (BISE). Stefan has secured substantial research funding and published several papers in leading international journals and conference proceedings. His research concerns the support of managerial decision-making using quantitative empirical methods. He specializes in applications of (deep) machine learning techniques in the broad areas of marketing and risk analytics. Stefan actively participates in knowledge transfer and consulting projects with industry partners, from start-up companies to global players and not-for-profit organizations.