Accuracy-fairness trade-off in ML for healthcare: A quantitative evaluation of bias mitigation strategies

Document Type

Article

Publication Title

Information and Software Technology

Abstract

Context: Although machine learning (ML) has significant potential to improve healthcare decision-making, embedded biases in algorithms and datasets risk exacerbating health disparities across demographic groups. To address this challenge, it is essential to rigorously evaluate bias mitigation strategies to ensure fairness and reliability across patient populations.
Objective: The aim of this research is to propose a comprehensive evaluation framework that systematically assesses a wide range of bias mitigation techniques at pre-processing, in-processing, and post-processing stages, using both single- and multi-stage intervention approaches.
Methods: This study evaluates bias mitigation strategies across three clinical prediction tasks: breast cancer diagnosis, stroke prediction, and Alzheimer’s disease detection. Our evaluation employs group- and individual-level fairness metrics, contextualized for specific sensitive attributes relevant to each dataset. Beyond fairness-accuracy trade-offs, we demonstrate how metric selection must align with clinical goals (e.g., parity metrics for equitable access, confusion-matrix metrics for diagnostics).
Results: Our results reinforce that no single classifier or mitigation strategy is universally optimal, underscoring the value of our proposed framework for evaluating fairness and accuracy throughout the bias mitigation process. According to the results, Adversarial Debiasing improved fairness by 95% in breast cancer diagnosis without compromising accuracy. Reweighing was most effective in stroke prediction, boosting fairness by 41%, and Reject Option Classification yielded nearly 50% fairness improvement in Alzheimer’s detection. Multi-stage bias mitigation did not consistently lead to better outcomes, and in many cases, fairness gains came at the expense of accuracy.
Conclusion: These findings provide practical guidance for selecting fairness-aware machine learning strategies in healthcare, aiding both model development and benchmarking across diverse clinical applications.
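
The distinction the abstract draws between parity metrics and confusion-matrix metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels, and the encoding of the sensitive attribute (1 for the privileged group, 0 for the unprivileged group) are assumptions made here for illustration, and the sign convention (unprivileged minus privileged) is one common choice. Statistical parity difference compares how often each group receives a positive prediction, while equal opportunity difference compares true positive rates and therefore depends on the ground-truth labels.

```python
from typing import Sequence


def selection_rate(y_pred: Sequence[int], mask: Sequence[bool]) -> float:
    """Share of positive predictions among the rows selected by mask."""
    chosen = [p for p, m in zip(y_pred, mask) if m]
    return sum(chosen) / len(chosen)


def true_positive_rate(
    y_true: Sequence[int], y_pred: Sequence[int], mask: Sequence[bool]
) -> float:
    """Share of actual positives predicted positive among the selected rows."""
    hits = [p for t, p, m in zip(y_true, y_pred, mask) if m and t == 1]
    return sum(hits) / len(hits)


def statistical_parity_difference(
    y_pred: Sequence[int], group: Sequence[int]
) -> float:
    """Parity metric: selection rate of group 0 minus that of group 1."""
    unpriv = [g == 0 for g in group]
    priv = [g == 1 for g in group]
    return selection_rate(y_pred, unpriv) - selection_rate(y_pred, priv)


def equal_opportunity_difference(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> float:
    """Confusion-matrix metric: TPR of group 0 minus TPR of group 1."""
    unpriv = [g == 0 for g in group]
    priv = [g == 1 for g in group]
    return true_positive_rate(y_true, y_pred, unpriv) - true_positive_rate(
        y_true, y_pred, priv
    )


# Hypothetical toy data (not taken from the study's datasets).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

spd = statistical_parity_difference(y_pred, group)
eod = equal_opportunity_difference(y_true, y_pred, group)
print(f"statistical parity difference: {spd:+.2f}")
print(f"equal opportunity difference:  {eod:+.2f}")
```

A value of zero on either metric indicates no measured disparity between the two groups. The two metrics happen to coincide on this toy data but can diverge in general, which is why the choice between them depends on the clinical goal.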

DOI

10.1016/j.infsof.2025.107896

Publication Date

12-1-2025

