Binary Count Tree: An Efficient and Compact Structure for Mining Rare and Frequent Itemsets
Document Type
Article
Publication Title
Engineered Science
Abstract
Rare and frequent itemsets can be discovered efficiently when the dataset to be processed fits within main memory. In recent years, various data structures have been developed to represent a large dataset in a compact form when it cannot otherwise be stored as a whole in main memory. This paper proposes the Binary Count Tree (BIN-Tree), a tree data structure that represents the entire dataset in a compact and complete form without any information loss. Each transaction is encoded and stored as a node in the tree, in contrast to existing approaches that store each item as a node. The efficiency of BIN-Tree for datasets of varying size and dimensions was evaluated against the Single Scan Pattern Tree (SSP-Tree) and the Weighted Count Tree (WC-Tree). The results showed BIN-Tree to be 95% and 75% more space-efficient than SSP-Tree and WC-Tree, respectively. BIN-Tree construction and itemset discovery from a large dataset were found to be 93% and 22% more time-efficient than with SSP-Tree and WC-Tree, respectively. BIN-Tree is equally efficient at discovering rare and frequent itemsets from a small dataset held in main memory.
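The abstract does not specify how a transaction is encoded or how the tree is laid out, so the following is only a minimal sketch of the general idea of storing one encoded entry per transaction rather than one entry per item. It assumes a bitmask encoding (one bit per item) and uses a flat counter in place of a tree; the function names, the item ordering, and the toy dataset are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def encode(transaction, item_index):
    """Pack a set of items into an integer bitmask (assumed encoding)."""
    mask = 0
    for item in transaction:
        mask |= 1 << item_index[item]
    return mask


def build_counts(transactions, item_index):
    """Map each distinct encoded transaction to its occurrence count."""
    return Counter(encode(t, item_index) for t in transactions)


def support(itemset, counts, item_index):
    """Count the transactions that contain every item in `itemset`."""
    target = encode(itemset, item_index)
    return sum(c for mask, c in counts.items() if mask & target == target)


# Hypothetical toy dataset, for illustration only.
items = ["a", "b", "c", "d"]
item_index = {item: i for i, item in enumerate(items)}
transactions = [
    {"a", "b"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "c"},
]
counts = build_counts(transactions, item_index)
```

Under this sketch, duplicate transactions collapse into a single entry with a count, and the support of any itemset is recovered with a bitwise test per stored entry. An itemset can then be labelled frequent or rare by comparing its support with a user-chosen threshold.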
First Page
185
Last Page
194
DOI
10.30919/es8d602
Publication Date
1-1-2022
Recommended Citation
Rai, Shwetha; Geetha, M.; Kumar, Preetham; and Giridhar, B., "Binary Count Tree: An Efficient and Compact Structure for Mining Rare and Frequent Itemsets" (2022). Open Access archive. 5123.
https://impressions.manipal.edu/open-access-archive/5123