Long-Term Analysis of Data Drift in Online Lending Credit Risk Models
Abstract
Online lending platforms continuously adjust credit policies in response to market fluctuations and borrower behavior. However, long-term changes in data distributions often weaken the effectiveness of static risk assessment models. Transaction records from a regional lending platform, spanning 2018 to 2023, were analyzed, covering 1.27 million loan applications and 94,000 default events. Feature distributions, approval rates, and model outputs were tracked over 18 consecutive quarters. Significant concept drift was observed in income verification, repayment delay, and credit utilization indicators. A rolling retraining strategy combined with window-based feature recalibration was applied to mitigate performance degradation. After this strategy was deployed, the area under the ROC curve increased from 0.71 to 0.79, while false rejection rates declined by approximately 12%. Model stability also improved during periods of economic volatility, particularly in 2020 and 2022. These findings illustrate how unmanaged data drift can gradually undermine financial risk systems in real-world operations.
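As an illustration of how a feature distribution might be tracked from one quarter to the next, the sketch below computes a population stability index (PSI) between a reference window and a later window. The abstract does not state which drift statistic was used, so PSI is an assumption chosen here because it is common in credit risk monitoring. The function name `population_stability_index`, the ten-bin quantile scheme, and the synthetic credit utilization values are all hypothetical and do not come from the study's data.

```python
import numpy as np


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """Hypothetical helper: PSI between a reference sample and a later sample.

    Bins are defined by quantiles of the reference sample. Values in the
    later sample that fall outside the reference range are counted in the
    first or last bin.
    """
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)

    # Quantile edges of the reference window; drop duplicates from ties.
    edges = np.unique(np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1)))
    interior = edges[1:-1]  # only interior cut points are needed
    n_used = interior.size + 1

    # Assign each value to a bin index in [0, n_used - 1].
    ref_idx = np.searchsorted(interior, reference, side="right")
    cur_idx = np.searchsorted(interior, current, side="right")

    ref_frac = np.bincount(ref_idx, minlength=n_used) / reference.size
    cur_frac = np.bincount(cur_idx, minlength=n_used) / current.size

    # Floor the fractions so empty bins do not produce log(0).
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)

    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))


# Synthetic example only: a credit utilization ratio in a baseline quarter,
# a later quarter with the same distribution, and one with a shifted mean.
rng = np.random.default_rng(0)
baseline_quarter = rng.normal(0.30, 0.10, 5000)
stable_quarter = rng.normal(0.30, 0.10, 5000)
shifted_quarter = rng.normal(0.45, 0.10, 5000)

psi_stable = population_stability_index(baseline_quarter, stable_quarter)
psi_shifted = population_stability_index(baseline_quarter, shifted_quarter)
print(f"PSI stable: {psi_stable:.4f}, PSI shifted: {psi_shifted:.4f}")
```

A common rule of thumb in industry practice treats PSI below 0.1 as stable and above 0.25 as a substantial shift, although these cutoffs are conventions rather than results from this study. In a rolling retraining setup, a statistic of this kind could serve as one possible trigger for refitting the model on a more recent window.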