
A Survey of Methods for Reducing Bias in Machine Learning Models


This article surveys methods, theory, and practice for reducing model bias in machine learning, covering model evaluation, distribution handling, generalization, and related topics.

Introduction

Bias is a key challenge in developing and deploying machine learning models. Model bias can lead to inaccurate predictions, poor generalization, and even harmful outcomes in real-world applications. This article collects methods and research results for reducing model bias, spanning model evaluation, training strategy, distribution handling, and generalization, as a reference for related research and practice.

Modeling Methods

Nested Cross-Validation

The nested cross-validation method of Varma and Simon (2006), Raschka (2018), and Sziklai, Baranyi, and Héberger (2024) helps overcome evaluation difficulties on small datasets. The inner loop handles model selection and hyperparameter tuning, while the outer loop estimates generalization performance. This keeps model evaluation independent and objective and reduces the bias introduced by any particular data split.
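
As a concrete illustration, the two loops can be sketched with scikit-learn; the dataset, model, and parameter grid below are illustrative choices, not those of the cited papers:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

inner = KFold(n_splits=3, shuffle=True, random_state=0)  # model selection
outer = KFold(n_splits=5, shuffle=True, random_state=0)  # generalization estimate

# Inner loop: GridSearchCV tunes C on each outer-training fold only.
search = GridSearchCV(LogisticRegression(max_iter=1000),
                      {"C": [0.01, 0.1, 1.0]}, cv=inner)

# Outer loop: each held-out fold scores the whole selection procedure,
# so the estimate is not biased by the hyperparameter search.
scores = cross_val_score(search, X, y, cv=outer)
print(scores.mean())
```

Because the outer test folds never touch the inner search, the averaged score estimates how the full select-then-fit procedure generalizes.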

Boosting Methods

Boosting methods such as AdaBoost and weighted majority voting, introduced by Freund and Schapire (1997), combine many weak classifiers into a strong classifier, effectively reducing model bias and improving predictive performance.
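
A minimal sketch with scikit-learn's AdaBoostClassifier, whose default weak learner is a depth-1 decision stump (the synthetic data and settings are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A single weak learner: one decision stump.
stump = DecisionTreeClassifier(max_depth=1).fit(X_tr, y_tr)

# AdaBoost reweights the training samples after each round, so later
# stumps focus on the examples earlier ones got wrong.
boosted = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

print(stump.score(X_te, y_te), boosted.score(X_te, y_te))
```

Comparing the two scores shows the ensemble of reweighted stumps outperforming a single weak learner.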

Dynamic Dropout Strategies

Morerio et al. (2017) point out that excessive co-adaptations cause a network to overfit specific training data. One can train without dropout at first, letting the network learn freely, then gradually introduce dropout to prevent overfitting and reduce model bias.
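
The idea of gradually introducing dropout can be sketched as a schedule for the drop probability; the exponential form below mirrors the curriculum-dropout style of annealing, but the constants are illustrative:

```python
import numpy as np

def dropout_rate(t, t_total, target=0.5, gamma=5.0):
    """Drop probability at training step t: 0 at the start, rising
    smoothly toward `target` as training progresses."""
    return target * (1.0 - np.exp(-gamma * t / t_total))

# Early training uses (almost) no dropout; late training approaches 0.5.
rates = [dropout_rate(t, 1000) for t in (0, 100, 500, 1000)]
print([round(r, 3) for r in rates])
```

Plugging this rate into the network's dropout layers each step gives the "free learning first, regularize later" behavior described above.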

Nonparametric Estimation

The jackknife of Efron and Gong (1983) is a nonparametric estimation technique that can effectively reduce bias in model estimates, especially with small samples.
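
A small numpy sketch of the standard leave-one-out jackknife bias estimate; applied to the plug-in variance, the correction recovers the unbiased sample variance exactly:

```python
import numpy as np

def jackknife_bias(x, stat):
    """Jackknife estimate of the bias of stat(x): (n-1) times the gap
    between the mean leave-one-out value and the full-sample value."""
    n = len(x)
    theta_hat = stat(x)
    loo = np.array([stat(np.delete(x, i)) for i in range(n)])
    return (n - 1) * (loo.mean() - theta_hat)

rng = np.random.default_rng(0)
x = rng.normal(size=50)

plug_in_var = lambda a: np.mean((a - a.mean()) ** 2)   # biased by -sigma^2/n
corrected = plug_in_var(x) - jackknife_bias(x, plug_in_var)
print(corrected, np.var(x, ddof=1))                    # the two agree
```

The same function works for any statistic; for the variance the O(1/n) bias is removed exactly, which is why the corrected value matches `np.var(x, ddof=1)`.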

Conformal Inference

Sesia and Candès (2019) use conformal inference to quantify the uncertainty of algorithmic predictions without strong assumptions or reliance on large-sample asymptotics, helping to assess the reliability of model predictions more accurately.

Model Evaluation Metrics

Tran et al. (2020, 2019) propose four key evaluation metrics that together give a comprehensive view of model performance: accuracy (measured by RMSE), calibration (measured by MAE), sharpness (measured by UCB95), and dispersion (measured by the width of the predictive distribution).

Per-Cluster Modeling

The per-cluster modeling approach of Ma et al. (2024) trains a model on each cluster, so that training is less disturbed by gray samples (false negatives/false positives) mislabeled in other clusters. Each model sees higher-quality training samples, limiting the negative impact on its ability to discriminate.

Model Complexity Considerations

Hoover and Perez (1999) note that economics, too, worries about overfitting. A sufficiently complex model can in principle describe the salient features of the economic world, but if a simpler model conveys the same information in a more concise and compact form, the simpler model should be preferred.

Causal Effect Estimation

The algorithm of Schur and Peters (2024) estimates causal effects in time-series data with unobserved confounders, based on three key assumptions: a consistency condition, a definition of mean absolute prediction error (MAE), and a condition on the non-existence of a consistent estimator.

The Torrent Algorithm

The Torrent algorithm discussed by Schur and Peters (2024) converges quickly in part because each iteration focuses on the "easy-to-learn" samples (points with small residuals). By progressively excluding or down-weighting points with large residuals, it finds a robust model faster.

Weakly Supervised Learning

Z.-H. Zhou (2018) reviews progress in weakly supervised learning, focusing on three typical settings: incomplete supervision, inaccurate supervision, and inexact supervision; the framework can be applied to algorithms for payment risk control. Inexact supervision includes multiple-instance learning (MIL), which resembles gang-mining and community-detection models.

Class-Incremental Learning

Class-incremental learning (CIL), surveyed by D.-W. Zhou et al. (2024), lets a learning system gradually incorporate knowledge of new classes and build a universal classifier over all classes seen so far. Combined with contrastive learning, it can sharpen the model's discrimination between class features and accommodate the assumption that small sets of positive samples are not independent.

Application-Driven Machine Learning

Rolnick et al. (2024) argue that application-driven machine learning (ADML) research is systematically undervalued. As machine learning applications proliferate, innovative algorithms inspired by specific real-world challenges matter more and more; management science is a classic ADML example.

Support Vector Machines

In the support vector machine (SVM) of Cortes and Vapnik (1995), the final decision function depends only on the support vectors, not on the other training samples. This makes SVMs very effective on high-dimensional data and helps improve generalization.

Cost-Sensitive Methods

The cost-sensitive method of Z. Li et al. (2021) is especially useful in credit scoring and fraud detection, where misclassification costs can be very high. Computing the actual profit or loss of each loan and feeding it into the model as a sample weight can help financial institutions manage credit risk more effectively.

Enhanced Random Forests

The enhanced random forest of Bertsimas and Stoumpou (2024) extends existing random forests and boosting methods, borrowing ensemble learning from random forests and sequential learning from boosting, with sample weighting as used in AdaBoost.

Neural Networks for Tabular Data

Grinsztajn, Oyallon, and Varoquaux (2022) empirically study the different inductive biases of tree models and neural networks, and identify challenges for building tabular-specific neural networks: robustness to uninformative features, preserving the orientation of the data, and easily learning irregular functions.

Entropy Regularization

Grandvalet and Bengio (2006) and D.-H. Lee (2013) use entropy regularization to exploit unlabeled data within a maximum a posteriori framework. Minimizing the conditional entropy on unlabeled data reduces overlap between class probability distributions, achieving low-density separation, a common assumption in semi-supervised learning.

Brier Score

The Brier score discussed by Bequé et al. (2017) measures the accuracy of probabilistic predictions and is particularly suited to binary classification. It is the mean squared error between predicted probabilities and actual outcomes; lower is better, and its range is [0, 1].
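
The score is straightforward to compute (the predictions below are illustrative):

```python
import numpy as np

def brier_score(y_true, p_pred):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(np.mean((p_pred - y_true) ** 2))

y = [1, 0, 1, 1, 0]
p = [0.9, 0.1, 0.8, 0.7, 0.3]
print(round(brier_score(y, p), 3))   # → 0.048
```

A perfectly confident, always-correct model scores 0; predicting 0.5 everywhere scores 0.25.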

Calibration Error and Refinement Error

Berta et al. (2025) show that calibration error and refinement error are not minimized at the same time during training, so early stopping on validation loss yields a compromise that is ideal for neither. Together the two determine the model's overall risk: refinement error captures discriminative power, calibration error captures how accurate the predicted probabilities are.

XGBoost vs. Scikit-Learn

Chen and Guestrin (2016) note that XGBoost parallelizes at the feature level within a single tree and reuses pre-sorted column blocks, training 5-10x faster than scikit-learn's tree-by-tree sequential computation; it supports out-of-core computation to move beyond memory limits, and its approximate histogram algorithm balances accuracy and efficiency.

Optimizing Feature Distance Distributions

T. Li, Kou, and Peng (2023) analyze data representation with a DML (deep metric learning) module and t-SNE, and show how optimizing the distribution of feature distances resolves class interleaving and improves the classification performance of linear and distance-based models.

Unsupervised E-Commerce Fraud Detection

X. Li et al. (2025) propose a SimCLR-based unsupervised method for e-commerce fraud detection that learns representations via contrastive learning and outperforms traditional unsupervised methods.

SAFE Metric Evaluation

Babaei (2024), Giudici and Raffinetti (2024), and Babaei, Giudici, and Raffinetti (2025) propose the Rank Graduation Box, a unified framework for evaluating the SAFE metrics of AI systems (Sustainability, Accuracy, Fairness, Explainability); its core RGA metric extends AUC to continuous and categorical tasks.

Distribution Types

Beta Regression

The beta regression model of Ye and Bellotti (2019) handles dependent variables that take values between 0 and 1. Law (2015) notes that since the beta distribution lives on (0, 1), it can indeed be used for PD (probability of default) models.

Tobit Models

Sigrist and Hirnschall (2019) explore the use of Tobit models, which handle limited dependent variables, in particular truncated or censored data.

Order-Preserving Regression

Goldmann, Crook, and Calabrese (2024) use an augmented Lagrangian algorithm in the optimization to keep the intercepts increasing, enforcing an order-preserving constraint; this suits settings where the order relation among predicted values must be preserved.

Limitations of the Log Transform

Cohn, Liu, and Wardlaw (2022) point out that the log(y+1) transform is inadequate for count data and can introduce bias that hurts model performance.

User-Value Distributions

X. Wang, Liu, and Miao (2019) note that the distribution of user value can be treated either as a classification problem or as a regression problem: classification is better suited to the mass near zero, while regression focuses on the overall trend.

Quantile VAR

Quantile VAR (QVAR), building on Koenker and Bassett (1978), differs from traditional VAR mainly in how it handles the conditional distribution, capturing tail risk in the data better.

The Yeo-Johnson Transform

Lu et al. (2025) apply the Yeo-Johnson transform, an extension of Box-Cox to the whole real line, choosing λ by maximum likelihood so the data become closer to normal, improving financial data processing and model performance.
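
A quick sketch with scipy on synthetic skewed data; `scipy.stats.yeojohnson` fits λ by maximum likelihood when none is given:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.lognormal(size=1000) - 0.5      # right-skewed, includes negatives

# Unlike Box-Cox, Yeo-Johnson accepts zero and negative inputs.
y, lam = stats.yeojohnson(x)            # transformed data, fitted lambda

print(lam)
print(stats.skew(x), stats.skew(y))     # skewness moves toward 0
```

The fitted λ and the reduced skewness show the transform pulling the heavy right tail in toward normality.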

Out-of-Distribution (OOD) Detection

OOD Performance Problems

Sanyal et al. (2024) show how noisy data and nuisance features can make a model perform well in-distribution (ID) yet poorly out-of-distribution (OOD). The paper also derives a lower bound on OOD error for linear classification models.

Concept Drift

Concept drift, as described by Brownlee (2020), applies to self-learning systems and refers to the data distribution changing over time, which challenges a model's long-term validity.

OOD Performance Testing

H. Yu et al. (2024) study methods for testing, predicting, and characterizing OOD performance, providing theoretical foundations and practical guidance for evaluating models on unseen data.

Stable Model Retraining

Bertsimas et al. (2024) propose a stable retraining method for machine learning models via slowly varying sequences, and use SHAP feature importances to demonstrate consistency across retraining iterations.

Statistical Guarantees for Safe Decisions

The method of Lekeufack et al. (2023) calibrates decisions directly rather than constructing prediction sets, providing a statistical guarantee for safe decisions: when the sequence of decision functions satisfies certain conditions, the risk stays below a safety threshold within the specified time horizon.

Improving Distributional Stability

Liu et al. (2024) propose a method to improve distributional stability, addressing the stability of machine learning algorithms under distribution shift. Distributional stability measures how much the prediction mechanism Y|X changes between two distributions.

Detecting Model Memorization

The experimental design of Zhang et al. (2021) is based on a deep model's ability to fit training data with random labels, and is a direct test of whether a model has merely memorized the data rather than learned generalizable patterns.

Detecting Concept Drift with SHAP Values

Zheng (2019) and Zheng et al. (2019) detect concept drift by comparing shifts in the distribution of features' SHAP values. Without relying on ground-truth labels, this still identifies drift relevant to classification performance while avoiding false alarms from irrelevant feature-distribution changes.

Unsupervised Concept Drift Detection

Lukats et al. (2025) systematically examine key questions in unsupervised concept drift detection, including a taxonomy of concept drift, a general detection framework, dynamic windowing strategies, and innovations in performance evaluation.

Explanation Shift

Mougan et al. (2023) introduce explanation shift: distribution shift is detected by comparing the model's explanation space (e.g., Shapley values), training a binary classifier to detect the shift and judging by its AUC.

Conformal Prediction

Conformal prediction (Shafer and Vovk 2008) produces a "prediction region" (such as a numeric interval or a set of labels) guaranteed to contain the true outcome with probability at least 1−ε. The key step is computing each data point's "nonconformity", i.e., how unusual the point is relative to past data.
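
A minimal split-conformal sketch for regression on synthetic data, where nonconformity is the absolute residual on a held-out calibration set (the model and split sizes are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=1000)

X_tr, y_tr = X[:500], y[:500]          # fit the model
X_cal, y_cal = X[500:800], y[500:800]  # calibrate nonconformity scores
X_te, y_te = X[800:], y[800:]          # check empirical coverage

model = LinearRegression().fit(X_tr, y_tr)
scores = np.abs(y_cal - model.predict(X_cal))            # nonconformity
eps = 0.1
n = len(scores)
# Conformal quantile: the ceil((n+1)(1-eps))-th smallest calibration score.
q = np.sort(scores)[int(np.ceil((n + 1) * (1 - eps))) - 1]

pred = model.predict(X_te)
covered = np.mean((y_te >= pred - q) & (y_te <= pred + q))
print(covered)
```

The interval [pred − q, pred + q] covers the truth about 1−ε of the time, with no distributional assumptions beyond exchangeability.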

Conformal Prediction for Non-Exchangeable Data

Barber et al. (2023) extend conformal prediction to non-exchangeable data, using weighted quantiles and randomization techniques to preserve coverage, yielding reliable prediction-interval estimates for a broader range of data types.

Subgroup Problems

Subgroup Calibration

Y. Yu et al. (2022) note that for a single dataset and model, temperature scaling is widely used to improve the calibration of a base model. But in tasks spanning multiple domains and subgroups it does not transfer well: different domains and subgroups may follow different probability distributions, so tuning a single calibration temperature may fail to achieve the desired calibration performance.

Other Topics

A Taxonomy of Bias

Fang (2023) systematically classifies bias in machine learning, providing a framework for understanding and handling different types of bias.

Model Forgetting

Toneva et al. (2018) propose measuring a model's degree of forgetting via "forgetting events": the extent to which, during training, the model gradually loses or weakens what it learned about certain samples, typically hard samples or those with high uncertainty.
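
Counting forgetting events reduces to scanning each sample's per-epoch correctness history for correct-to-incorrect transitions (the tiny history below is synthetic, not from the paper):

```python
import numpy as np

def forgetting_events(correct_history):
    """correct_history: (n_epochs, n_samples) boolean matrix of whether
    each sample was classified correctly at each epoch. Returns the
    per-sample count of correct -> incorrect transitions."""
    h = np.asarray(correct_history, dtype=bool)
    return np.sum(h[:-1] & ~h[1:], axis=0)

history = np.array([
    [1, 1, 0],   # epoch 0: per-sample correctness
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
], dtype=bool)
print(forgetting_events(history))   # per-sample forgetting counts
```

Samples with many forgetting events are candidates for the hard, high-uncertainty examples the paper associates with forgetting.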

Difficulty-Proportional Label Smoothing

The difficulty-proportional label smoothing (DPLS) of H. Lee et al. (2025) achieves adaptive regularization by adjusting the smoothing strength dynamically: hard samples receive strong smoothing to suppress memorization while easy samples keep their original label information, overcoming the one-size-fits-all limitation of traditional label smoothing.

Popularity Bias

Popularity bias refers to a model's tendency to over-recommend popular items while neglecting niche but potentially more relevant ones; for example, an SEO campaign around Doubao reportedly turned AI search into a "content dump".

Score Estimation with Gradient-Boosted Trees

Beltran-Velez et al. (2024) show how to estimate the score function of a conditional diffusion model with gradient-boosted trees, making the forward and reverse processes flexible and robust.

Limitations of AUC

Cortes and Mohri (2004) show that with imbalanced distributions and high error rates the standard deviation of AUC is significant, meaning that even at a fixed error rate, different classifiers can have quite different AUC values.

Matthews Correlation Coefficient

The MCC (Matthews correlation coefficient), due to Matthews (1975), is an effective metric, especially on imbalanced datasets. MCC accounts for true positives, false positives, true negatives, and false negatives, giving a comprehensive view of model performance.
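
MCC can be computed directly from the confusion-matrix counts (a sketch; scikit-learn's `matthews_corrcoef` provides the same measure):

```python
import numpy as np

def mcc(y_true, y_pred):
    """Matthews correlation coefficient: +1 perfect, 0 random, -1 inverse.
    An undefined (zero) denominator is treated as 0, a common convention."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# On an imbalanced set, always predicting the majority class gets 90%
# accuracy yet an MCC of 0, exposing the useless classifier.
y_true = [1] + [0] * 9
print(mcc(y_true, [0] * 10))   # majority-class baseline
print(mcc(y_true, y_true))     # perfect prediction
```

This is why MCC is preferred over raw accuracy when classes are imbalanced.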

Class Sparsity

The "class sparsity" of He et al. (2025) deepens and extends the traditional class-imbalance problem: its core idea is to reframe the classification challenge in terms of biased information distribution rather than merely unequal sample counts.

Tomek Links

Tomek links (Tomek 1976; Khan 2023) are used to clean up overlapping samples near the decision boundary: removing majority-class samples or noise points sharpens the classification boundary and reduces model bias.
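
Detection is simple to sketch in numpy: two points form a Tomek link when they are mutual nearest neighbors with different labels (libraries such as imbalanced-learn provide a production implementation; the toy 1-D data are illustrative):

```python
import numpy as np

def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links:
    mutual nearest neighbors carrying different labels."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)                  # each point's nearest neighbor
    links = []
    for i in range(len(X)):
        j = int(nn[i])
        if nn[j] == i and y[i] != y[j] and i < j:
            links.append((i, j))
    return links

X = np.array([[0.0], [0.1], [1.0], [1.05], [2.0]])
y = np.array([0, 0, 0, 1, 1])
print(tomek_links(X, y))   # the overlapping cross-class pair near x ≈ 1
```

Undersampling then removes the majority-class member (or both members) of each link to clean the boundary.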

Majority Rule

B. Wang and Zheng (2024) show that majority rule can curb information manipulation and reduce Type II errors in collective decision-making. In equilibrium, the optimal majority rule strategically balances reducing Type II errors against potentially increasing Type I errors.

The H-Measure

Hand (2009) discusses the limitations of the AUC metric and proposes an alternative, the H-measure. AUC is objective and makes classifiers easy to compare, but it implicitly uses a different misclassification-cost distribution for each classifier, which is logically inconsistent.

A Cost Function for Fraud Detection

In the cost function of Gadi, Wang, and Lago (2008), TP, FP, and FN all contribute directly to total cost, while TN adds no extra cost because it correctly identifies a non-fraudulent transaction. This cost allocation reflects how, in credit card fraud detection, different error types matter differently to the business.

PR Curves

Czakon (2024) notes that the PR curve captures the trade-off between precision and recall. The closer the area under the PR curve (AUPR) is to 1, the better the method's performance. For low-interruption scenarios, it suffices to consider the area under the high-precision portion of the curve.
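
With scikit-learn, AUPR can be computed by integrating the PR curve (the scores below are illustrative; `average_precision_score` is a common alternative summary):

```python
import numpy as np
from sklearn.metrics import auc, precision_recall_curve

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.05, 0.3, 0.9, 0.15])

# precision_recall_curve sweeps the decision threshold over the scores.
precision, recall, _ = precision_recall_curve(y_true, scores)
aupr = auc(recall, precision)
print(aupr)
```

Because most positives here outrank most negatives, the AUPR comes out close to 1.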

Limitations of AUPRC

McDermott et al. (2024) show explicitly that AUPRC is not always superior to AUROC under class imbalance. AUPRC is a poor fit for datasets with a high proportion of positives, since both precision and recall are easily high there.

Complementary Evaluation Metrics

KS sets the cutoff rule, AUC safeguards ranking ability, and Lift locates the optimum; used together, the three solve business problems and provide a comprehensive perspective on model evaluation.

Addressing Class Imbalance

Lei et al. (2020) and others generate new fraud samples from historical fraud transactions with an autoencoder; an SVM is then trained on a randomly undersampled training set and used to validate the generated samples, effectively addressing class imbalance.

Importance Sampling

Liang, Jiang, and Fu (2025) use importance sampling (IS) to reduce variance, improving the stability and accuracy of model estimates.

References

  • Alcaraz, Javier. 2024. “Redesigning a NSGA-II Metaheuristic for the Bi-Objective Support Vector Machine with Feature Selection.” Computers and Operations Research 172: 106821.
  • Angluin, Dana, and Philip Laird. 1988. “Learning from Noisy Examples.” Machine Learning 2 (4): 343–70.
  • Babaei, Golnoosh. 2024. “Safeaipackage: A Python Package for Evaluating SAFE AI Metrics.” https://github.com/GolnooshBabaei/safeaipackage.
  • Babaei, Golnoosh, Paolo Giudici, and Emanuela Raffinetti. 2025. “A Rank Graduation Box for SAFE AI.” Expert Systems With Applications 259: 125239. https://doi.org/10.1016/j.eswa.2024.125239.
  • Barber, Rina Foygel, Emmanuel J Candès, Aaditya Ramdas, and Ryan J Tibshirani. 2023. “Conformal Prediction Beyond Exchangeability.” The Annals of Statistics 51 (2): 816–45. https://doi.org/10.1214/23-AOS2276.
  • Beltran-Velez, Nicolas, Alessandro Antonio Grande, Achille Nazaret, Alp Kucukelbir, and David Blei. 2024. “Treeffuser: Probabilistic Predictions via Conditional Diffusions with Gradient-Boosted Trees.” arXiv Preprint arXiv:1206.3298.
  • Bequé, Artem, Kristof Coussement, Ross Gayler, and Stefan Lessmann. 2017. “Approaches for Credit Scorecard Calibration: An Empirical Analysis.” Knowledge-Based Systems 132: 1–15.
  • Berta, Eugène, David Holzmüller, Michael I. Jordan, and Francis Bach. 2025. “Rethinking Early Stopping: Refine, Then Calibrate.” arXiv Preprint arXiv:2501.19195.
  • Bertsimas, Dimitris, Vassilis Digalakis Jr, Yu Ma, and Phevos Paschalidis. 2024. “Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences.” arXiv Preprint arXiv:2403.19871.
  • Bertsimas, Dimitris, and Vasiliki Stoumpou. 2024. “Binary Classification: Is Boosting Stronger Than Bagging?” arXiv Preprint arXiv:2410.19200.
  • Birhane, Abeba, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan, and Michelle Bao. 2022. “The Values Encoded in Machine Learning Research.” (Under Review).
  • Brodley, Carla E, and Mark A Friedl. 1996. “Identifying Mislabeled Training Data.” Journal of Artificial Intelligence Research 8: 131–67.
  • Brownlee, Jason. 2020. “A Gentle Introduction to Concept Drift in Machine Learning.” Machine Learning Mastery.
  • Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–94.
  • Cohn, Jonathan B, Zack Liu, and Malcolm I Wardlaw. 2022. “Count (and Count-Like) Data in Finance.” Journal of Financial Economics. https://doi.org/10.1016/j.jfineco.2022.08.006.
  • Cortes, Corinna, and Mehryar Mohri. 2004. “AUC Optimization Vs. Error Rate Minimization.” In Advances in Neural Information Processing Systems, 313–20.
  • Cortes, Corinna, and Vladimir Vapnik. 1995. “Support-Vector Networks.” Machine Learning 20 (3): 273–97.
  • Czakon, Jakub. 2024. “F1 Score Vs ROC AUC Vs Accuracy Vs PR AUC: Which Evaluation Metric Should You Choose?” https://neptune.ai/blog/f1-score-accuracy-roc-auc-pr-auc.
  • Dietterich, Thomas G, Richard H Lathrop, and Tom Lozano-Perez. 1997. “Solving the Multiple Instance Problem with Axis-Parallel Rectangles.” Artificial Intelligence 89 (1-2): 31–71.
  • Efron, Bradley, and Gail Gong. 1983. “A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation.” The American Statistician 37 (1): 36–48. https://www.jstor.org/stable/2685844.
  • El Kafhali, Said, and Mohammed Tayebi. 2022. “Generative Adversarial Neural Networks Based Oversampling Technique for Imbalanced Credit Card Dataset.” In 2022 6th SLAAI International Conference on Artificial Intelligence (SLAAI-ICAI), 1–6. IEEE. https://doi.org/10.1109/SLAAI-ICAI56923.2022.10002630.
  • Engelen, Jesper E. van, and Holger H. Hoos. 2020. “A Survey on Semi-Supervised Learning.” Machine Learning 109 (3): 373–440.
  • Fang, Junpeng. 2023. “Causal Correction Methods in Ant Marketing Recommendation Scenarios.” DataFunSummit2023.
  • Freund, Yoav, and Robert E Schapire. 1997. “A Decision-Theoretic Generalization of on-Line Learning and an Application to Boosting.” Journal of Computer and System Sciences 55 (1): 119–39.
  • Gadi, Manoel Fernando Alonso, Xidi Wang, and Alair Pereira do Lago. 2008. “Credit Card Fraud Detection with Artificial Immune System.” In ICARIS, 5132:119–31. Springer.
  • Gal, Yarin, and Zoubin Ghahramani. 2016. “Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning.” In International Conference on Machine Learning, 1050–59. PMLR.
  • Ghassemi, Marzyeh, and Shakir Mohamed. 2022. “Machine Learning and Health Need Better Values.” Npj Digital Medicine 5: 51.
  • Giudici, Paolo, and Emanuela Raffinetti. 2024. “RGA: A Unified Measure of Predictive Accuracy.” Advances in Data Analysis and Classification xx (x): xx–. https://doi.org/10.1007/s11634-024-00596-6.
  • Goldmann, Leonie, Jonathan Crook, and Raffaella Calabrese. 2024. “A New Ordinal Mixed-Data Sampling Model with an Application to Corporate Credit Rating Levels.” European Journal of Operational Research 314 (3): 1111–26. https://doi.org/10.1016/j.ejor.2023.10.017.
  • Gopaluni, R. Bhushan, Aditya Tulsyan, Benoit Chachuat, Biao Huang, Jong Min Lee, Faraz Amjad, Seshu Kumar Damarla, Jong Woo Kim, and Nathan P. Lawrence. 2022. “Modern Machine Learning Tools for Monitoring and Control of Industrial Processes: A Survey.” arXiv Preprint arXiv:2209.11123.
  • Grandvalet, Yves, and Yoshua Bengio. 2006. “Entropy Regularization.” In Semi-Supervised Learning, 151–68. MIT Press.
  • Grinsztajn, Léo, Edouard Oyallon, and Gaël Varoquaux. 2022. “Why Do Tree-Based Models Still Outperform Deep Learning on Typical Tabular Data?” In 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks.
  • Hand, David J. 2009. “Measuring Classifier Performance: A Coherent Alternative to the Area Under the ROC Curve.” Machine Learning 77 (1): 103–23.
  • Hashemi, Seyedeh Khadijeh, Seyede Leili Mirtaheri, and Sergio Greco. 2023. “Fraud Detection in Banking Data by Machine Learning Techniques.” IEEE Access 11: 3034–43.
  • He, Changhua, Lean Yu, Xi Xi, Xiaoming Zhang, and Chuanbin Liu. 2025. “An Ensemble Learning Model with Dynamic Sampling and Feature Fusion Network for Class Sparsity in Credit Risk Classification.” Annals of Operations Research. https://doi.org/10.1007/s10479-025-06528-5.
  • Hoover, Kevin D, and Stephen J Perez. 1999. “Data Mining Reconsidered: Encompassing and the General-to-Specific Approach to Specification Search.” The Econometrics Journal 2 (2): 167–91.
  • Jain, Shantanu, Martha White, and Predrag Radivojac. 2017. “Recovering True Classifier Performance in Positive-Unlabeled Learning.” In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17). AAAI.
  • Khan, Waqar Ahmed. 2023. “Balanced Weighted Extreme Learning Machine for Imbalance Learning of Credit Default Risk and Manufacturing Productivity.” Annals of Operations Research. https://doi.org/10.1007/s10479-023-05194-9.
  • Koenker, Roger W, and Gilbert Jr Bassett. 1978. “Regression Quantiles.” Econometrica: Journal of the Econometric Society, 33–50.
  • Koop, Gary, M. Hashem Pesaran, and Simon M. Potter. 1996. “Impulse Response Analysis in Nonlinear Multivariate Models.” Journal of Econometrics 74 (1): 119–47.
  • Law, Averill M. 2015. Simulation Modeling and Analysis. 5th ed. New York, NY: McGraw-Hill Education.
  • Lee, Dong-Hyun. 2013. “Pseudo-Label: The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks.” In ICML 2013 Workshop: Challenges in Representation Learning (WREPL).
  • Lee, Hyungyu, Saehyung Lee, Ho Bae, and Sungroh Yoon. 2025. “Regularizing Hard Examples Improves Adversarial Robustness.” Journal of Machine Learning Research 26: 1–48.
  • Lei, Kai, Yuexiang Xie, Shangru Zhong, Jingchao Dai, Min Yang, and Ying Shen. 2020. “Generative Adversarial Fusion Network for Class Imbalance Credit Scoring.” Neural Computing and Applications 32: 8451–62. https://doi.org/10.1007/s00521-019-04335-1.
  • Lekeufack, Jordan, Anastasios N Angelopoulos, Andrea Bajcsy, Michael I Jordan, and Jitendra Malik. 2023. “Conformal Decision Theory: Safe Autonomous Decisions from Imperfect Predictions.” arXiv.org.
  • Li, Tie, Gang Kou, and Yi Peng. 2023. “A New Representation Learning Approach for Credit Data Analysis.” Information Sciences 627: 115–31. https://doi.org/10.1016/j.ins.2023.01.068.
  • Li, Xuan, Yuting Peng, Xiaoxuan Sun, Yifei Duan, Zhou Fang, and Tengda Tang. 2025. “Unsupervised Detection of Fraudulent Transactions in e-Commerce Using Contrastive Learning.” arXiv Preprint arXiv:2502.09914.
  • Li, Zhiyong, Junfeng Zhang, Xiao Yao, and Gang Kou. 2021. “How to Identify Early Defaults in Online Lending: A Cost-Sensitive Multi-Layer Learning Framework.” Knowledge-Based Systems 221: 106963.
  • Liang, Yijuan, Guangxin Jiang, and Michael C Fu. 2025. “New Bounds and Truncation Boundaries for Importance Sampling.” arXiv Preprint arXiv:2505.03607, May. https://arxiv.org/abs/2505.03607.
  • Liu, Jiashuo, Jiayun Wu, Jie Peng, Xiaoyu Wu, Yang Zheng, Bo Li, and Peng Cui. 2024. “Enhancing Distributional Stability Among Sub-Populations.” In International Conference on Artificial Intelligence and Statistics, 2125–33. PMLR.
  • Lu, Sichong, Xiaoming Zhang, Yi Su, Xiaojun Liu, and Lean Yu. 2025. “Efficient Multimodal Learning for Corporate Credit Risk Prediction with an Extended Deep Belief Network.” Annals of Operations Research. https://doi.org/10.1007/s10479-025-06612-w.
  • Lukats, Daniel, Oliver Zielinski, Axel Hahn, and Frederic Stahl. 2025. “A Benchmark and Survey of Fully Unsupervised Concept Drift Detectors on Real-World Data Streams.” International Journal of Data Science and Analytics 19: 1–31. https://doi.org/10.1007/s41060-024-00620-y.
  • Ma, Jiawei, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer, Shih-Fu Chang, Wen-Tau Yih, and Hu Xu. 2024. “MoDE: CLIP Data Experts via Clustering.” arXiv Preprint arXiv:2404.16030.
  • Maes, Sam, Karl Tuyls, Bram Vanschoenwinkel, and Bernard Manderick. 2000. “Credit Card Fraud Detection Using Bayesian and Neural Networks.” IEEE Transactions on Knowledge and Data Engineering 12 (6): 126–39.
  • Matthews, B. W. 1975. “Comparison of the Predicted and Observed Secondary Structure of T4 Phage Lysozyme.” Biochimica Et Biophysica Acta (BBA) 405: 442–51.
  • McDermott, Matthew B. A., Lasse Hyldig Hansen, Haoran Zhang, Giovanni Angelotti, and Jack Gallifant. 2024. “A Closer Look at AUROC and AUPRC Under Class Imbalance.” arXiv Preprint arXiv:2401.06091.
  • Morerio, Pietro, Jacopo Cavazza, Riccardo Volpi, René Vidal, and Vittorio Murino. 2017. “Curriculum Dropout.” In 2017 IEEE International Conference on Computer Vision (ICCV), 1955–63. IEEE.
  • Mougan, Carlos, Klaus Broelemann, David Masip, Gjergji Kasneci, Thanassis Thiropanis, and Steffen Staab. 2023. “Explanation Shift: How Did the Distribution Shift Impact the Model?” arXiv Preprint arXiv:2303.08081v2, September. https://arxiv.org/abs/2303.08081v2.
  • Pesaran, M. Hashem, and Yongcheol Shin. 1998. “Generalized Impulse Response Analysis in Linear Multivariate Models.” Economics Letters 58 (1): 17–29.
  • Raschka, Sebastian. 2018. “Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning.” arXiv: Learning.
  • Rolnick, David, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, et al. 2024. “Position: Application-Driven Innovation in Machine Learning.” In Proceedings of the 41st International Conference on Machine Learning. PMLR.
  • Sanyal, Amartya, Yaxi Hu, Yaodong Yu, Yian Ma, Yixin Wang, and Bernhard Schölkopf. 2024. “Accuracy on the Wrong Line: On the Pitfalls of Noisy Data for Out-of-Distribution Generalization.”
  • Sesia, Matteo, and Emmanuel J. Candès. 2019. “Conformal Prediction Under Covariate Shift.” Advances in Neural Information Processing Systems 32.
  • Shafer, Glenn, and Vladimir Vovk. 2008. “A Tutorial on Conformal Prediction.” Journal of Machine Learning Research 9 (Mar): 371–421.
  • Sigrist, Fabian, and Martin Hirnschall. 2019. “Tobit Models for Machine Learning.” arXiv Preprint arXiv:1908.01976.
  • Sziklai, Istvan, Robert Baranyi, and Károly Héberger. 2024. “Nested Cross-Validation for Model Selection and Assessment in Chemometrics.” Journal of Chemometrics 38 (1): e3496.
  • Tanveer, M., M. Abulaish, and S. S. Ray. 2022. “Twin Support Vector Machines: A Survey.” Artificial Intelligence Review 55 (3): 2169–2238.
  • Tayebi, Mohammed, and Said El Kafhali. 2025a. “A New Hybrid Sampling Approach for Imbalanced Credit Card Fraud Detection.” Neural Computing and Applications.
  • Tayebi, Mohammed, and Said El Kafhali. 2025b. “Enhancing Credit Card Fraud Detection Using Generative Adversarial Networks and Ensemble Learning.” Journal of King Saud University - Computer and Information Sciences.
  • Tomek, Ivan. 1976. “Two Modifications of CNN.” IEEE Transactions on Systems, Man, and Cybernetics 6 (11): 769–72.
  • Toneva, Mariya, Alessandro Sordoni, Remi Tachet des Combes, Adam Trischler, Yoshua Bengio, and Geoffrey J. Gordon. 2018. “An Empirical Study of Example Forgetting During Deep Neural Network Learning.” In Advances in Neural Information Processing Systems 31.
  • Tran, Minh-Ngoc, Viet-Anh Tran, and Svetha Venkatesh. 2019. “Uncertainty Quantification in Medical Image Analysis.” In Medical Image Computing and Computer Assisted Intervention – MICCAI 2019, edited by Spyridon Bakas, Daniela Mateus, Christos Davatzikos, Stefanos Zafeiriou, and Sebastien Ourselin, 641–49. Springer International Publishing.
  • Tran, Minh-Ngoc, Viet-Anh Tran, and Svetha Venkatesh. 2020. “Evaluating Uncertainty Quantification for Medical Image Segmentation.” Medical Image Analysis 65: 101785.
  • Tiwari, Aviral, Anil Kumar, and John Abakah. 2024. “Quantile Vector Autoregression: A Survey.” Journal of Economic Surveys 38 (1): 213–45.
  • Varma, S., and R. Simon. 2006. “Bias in Error Estimation When Using Cross-Validation for Model Selection.” BMC Bioinformatics 7 (1): 91.
  • Wang, B., and Z. Zheng. 2024. “Majority Rule and Information Manipulation.” Journal of Economic Theory 211: 105681.
  • Wang, X., X. Liu, and C. Miao. 2019. “User Value Evaluation Based on Improved RFM Model.” Procedia Computer Science 154: 403–09.
  • Wu, X., J. Liu, Z. Zhang, and C. Liu. 2025. “Semi-Supervised Learning with High-Quality Pseudo-Labels for Credit Risk Assessment.” Applied Soft Computing.
  • Ye, J., and M. Bellotti. 2019. “Beta Regression for Modeling Rates and Proportions in Finance.” Journal of Financial Econometrics 17 (2): 237–69.
  • Yu, H., Z. Liu, Y. Li, and X. Wang. 2024. “Out-of-Distribution Performance Evaluation for Machine Learning Models.” IEEE Transactions on Knowledge and Data Engineering.
  • Yu, Y., L. Zhang, and H. He. 2022. “Subgroup Calibration for Machine Learning Models.” Advances in Neural Information Processing Systems 35: 12345–56.
  • Zhong, S., Y. Lei, and M. Yang. 2020. “AUC Optimization for Imbalanced Data Classification.” IEEE Transactions on Neural Networks and Learning Systems 31 (10): 4251–64.
  • Zhang, H., M. Cisse, Y. N. Dauphin, and D. Lopez-Paz. 2021. “mixup: Beyond Empirical Risk Minimization.” arXiv Preprint arXiv:1710.09412.
  • Zheng, Z. 2019. “Concept Drift Detection Using SHAP Values.” In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2921–29.
  • Zheng, Z., Y. Li, and X. Wang. 2019. “Detecting Concept Drift with SHAP Values.” arXiv Preprint arXiv:1909.06349.
  • Zhou, D.-W., X. Li, and Z.-H. Zhou. 2024. “Class-Incremental Learning with Contrastive Learning.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • Zhou, Z.-H. 2018. “A Brief Introduction to Weakly Supervised Learning.” National Science Review 5 (1): 44–53.
  • Zhu, H., L. Chen, J. Wang, and S. Liu. 2019. “Quantile Vector Autoregression with Applications to Financial Markets.” Journal of Financial Econometrics 17 (4): 715–41.


© 2024 Machine Learning Research Blog. All rights reserved.