WBIN-Tree: A Single Scan Based Complete, Compact and Abstract Tree for Discovering Rare and Frequent Itemset Using Parallel Technique

Document Type

Article

Publication Title

IEEE Access

Abstract

Data analytics is an integral part of strategic decision making in various fields, including but not limited to business, education and healthcare systems. Existing research focuses on the discovery of itemsets with rare antecedents and consequents or frequent antecedents and consequents. Analysis of associations among itemsets with rare antecedents and frequent consequents is equally important for gaining valuable insights before making crucial decisions. Mining these itemsets from large datasets is a time- and resource-intensive process. Speeding up the mining process aids quick decision making, and doing so requires the entire dataset to be stored in RAM. In this paper, a novel Weighted Binary Count Tree (WBIN-Tree) is proposed and implemented in CUDA to exploit the power of the GPU and discover rules with a rare antecedent and a frequent consequent using a parallel approach. WBIN-Tree stores the entire dataset in an abstract, complete and compact form in RAM using a single database scan. WBIN-Tree is compared with existing sequential and parallel algorithms by varying the data size and dimension. The performance evaluation of WBIN-Tree showed promising results: among the algorithms compared, it was the most time- and space-efficient at storing the entire large dataset in RAM. However, depending on the size of the GPU, performance drops on datasets with large dimensions, which could be handled by processing the attributes in batches. Additionally, a case study on a breast cancer dataset is included to illustrate the importance of mining association rules with a rare antecedent and a frequent consequent.

First Page

6281

Last Page

6297

DOI

10.1109/ACCESS.2024.3350737

Publication Date

1-1-2024
