APAROS: A Clustering-Based Hybrid Approach for Handling Overlapped Regions in Imbalanced Datasets
Document Type
Article
Publication Title
IEEE Access
Abstract
In many real-world datasets, the class distribution is highly imbalanced and minority class samples are located within majority regions, leading to significant class overlap. These challenges produce biased models and misclassifications, particularly for minority classes. To address these issues, we introduce a novel technique that integrates Affinity Propagation with Adaptive Synthetic Sampling (ADASYN) and Random Oversampling (ROS), collectively termed APAROS. Using Affinity Propagation Clustering (APC), the proposed method partitions the dataset into Overlapping Clusters (OLC), Pure minority Clusters (PmC), and Pure Majority Clusters (PMC). We then apply existing oversampling methods, ADASYN in OLC and ROS in PmC, balancing the dataset while minimizing the risk of misclassification in overlapped regions. The proposed method was evaluated on ten benchmark imbalanced datasets and compared against existing resampling techniques such as ADASYN, ROS, and Kmeans-SMOTE. Seven machine learning classifiers were employed to evaluate model performance using accuracy, precision, recall, F1-score, Matthews Correlation Coefficient (MCC), and Cohen’s Kappa coefficient. Experimental results consistently demonstrate that APAROS performs significantly better than traditional resampling techniques and produces superior classification performance, particularly in highly overlapped and imbalanced scenarios.
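The cluster categorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes cluster assignments are already available (for example from an Affinity Propagation run) and that a cluster counts as overlapping whenever it contains samples of both classes; the abstract does not state the exact criterion. The function name `categorize_clusters`, the `SAMPLER_FOR` mapping, and the toy data are hypothetical.

```python
from collections import defaultdict

# Sampler applied to each cluster type, following the abstract:
# ADASYN in overlapping clusters, ROS in pure minority clusters.
# Leaving pure majority clusters untouched is an assumption.
SAMPLER_FOR = {"OLC": "ADASYN", "PmC": "ROS", "PMC": None}


def categorize_clusters(cluster_ids, class_labels, minority_label):
    """Group sample indices by cluster and tag each cluster by its class mix."""
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    categories = {}
    for cid, idxs in members.items():
        n_minority = sum(1 for i in idxs if class_labels[i] == minority_label)
        if n_minority == 0:
            categories[cid] = "PMC"  # only majority samples
        elif n_minority == len(idxs):
            categories[cid] = "PmC"  # only minority samples
        else:
            categories[cid] = "OLC"  # both classes present (assumed criterion)
    return categories, dict(members)


# Toy data (hypothetical): 8 samples in 3 clusters, class 1 is the minority.
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
class_labels = [0, 0, 1, 1, 1, 0, 0, 0]

categories, members = categorize_clusters(cluster_ids, class_labels, minority_label=1)
plan = {cid: SAMPLER_FOR[cat] for cid, cat in categories.items()}
print(categories)
print(plan)
```

In a fuller pipeline, the cluster assignments would typically come from a clustering library and the per-cluster plan would drive the actual resampling calls; both are omitted here to keep the sketch dependency-free.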
First Page
170705
Last Page
170722
DOI
10.1109/ACCESS.2025.3614993
Publication Date
1-1-2025
Recommended Citation
Nandini, Annam; Kumar Mishra, Tapas; Pujahari, Abinash; and Mishra, Sanket, "APAROS: A Clustering-Based Hybrid Approach for Handling Overlapped Regions in Imbalanced Datasets" (2025). Open Access archive. 14440.
https://impressions.manipal.edu/open-access-archive/14440