A Hybrid Multiple Indefinite Kernel Learning Framework for Disease Classification from Gene Expression Data

Document Type

Article

Publication Title

International Journal of Advanced Computer Science and Applications

Abstract

In recent years, several researchers have used Machine Learning (ML) techniques to classify diseases from gene expression data. Disease categorization from heterogeneous gene expression data is often applied to critical problems such as cancer analysis. Gene expression data gathered from DNA microarrays are characterized by a large number of measured factors, known as genes, which can be examined simultaneously. Accurate classification of genetic data is essential for providing accurate treatment to patients. However, processing this data is limited by noise, redundant data, frequent errors, increased complexity, small sample sizes with high dimensionality, and difficult interpretation. To make accurate predictions, a model must be able to distinguish the features in such heterogeneous data with high accuracy. This paper presents a model intended to overcome these issues. The proposed model includes an effective multiple indefinite kernel learning based model for analyzing gene expression microarray data, an optimized kernel principal component analysis (OKPCA) to select the best features, and a hybrid flow-directed arithmetic support vector machine (SVM)-based multiple indefinite kernel learning (FDASVM-MIKL) model for classification. Flow direction and arithmetic optimization algorithms are combined with the SVM to increase classification accuracy. The proposed technique achieves accuracies of 99.95%, 99.63%, 99.60%, 99.51%, and 99.79% on the colon, Isolet, ALLAML, Lung_cancer, and Snp2 graph datasets, respectively.
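
The general shape of the pipeline described in the abstract (kernel PCA for feature reduction, followed by an SVM trained on a combination of kernels) can be illustrated with a minimal sketch. This is not the authors' OKPCA or FDASVM-MIKL method: it assumes scikit-learn's standard `KernelPCA` and `SVC`, uses synthetic data in place of a microarray dataset, and fixes the kernel weights by hand instead of tuning them with flow direction or arithmetic optimization. The names `combined_kernel` and `run_sketch` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.metrics.pairwise import rbf_kernel, sigmoid_kernel
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def combined_kernel(A, B, weights=(0.5, 0.5), gamma=1.0):
    """Fixed-weight sum of an RBF kernel and a sigmoid (tanh) kernel.

    The sigmoid kernel is not positive semidefinite in general, so the
    combination may be indefinite. The weights are hand-picked here
    (assumption); the paper tunes its model with optimization algorithms.
    """
    w_rbf, w_sig = weights
    return (w_rbf * rbf_kernel(A, B, gamma=gamma)
            + w_sig * sigmoid_kernel(A, B, gamma=gamma, coef0=0.0))


def run_sketch(seed=0):
    # Synthetic stand-in for microarray data: few samples, many features.
    X, y = make_classification(n_samples=80, n_features=500,
                               n_informative=20, random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    # Standardize using training statistics only.
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

    # Plain kernel PCA (not the optimized variant from the paper).
    kpca = KernelPCA(n_components=10, kernel="rbf",
                     gamma=1.0 / X_tr.shape[1])
    Z_tr = kpca.fit_transform(X_tr)
    Z_te = kpca.transform(X_te)

    # SVM on a precomputed Gram matrix built from the combined kernel.
    clf = SVC(kernel="precomputed", C=1.0)
    clf.fit(combined_kernel(Z_tr, Z_tr), y_tr)
    pred = clf.predict(combined_kernel(Z_te, Z_tr))
    return pred, y_te


pred, y_te = run_sketch()
accuracy = float(np.mean(pred == y_te))
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed by this sketch comes from synthetic data and is not comparable to the figures reported in the abstract. Passing a precomputed Gram matrix to `SVC` is what allows an arbitrary, possibly indefinite, kernel combination to be used; with an indefinite kernel the underlying solver is no longer guaranteed to find a global optimum.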

First Page

844

Last Page

855

DOI

10.14569/IJACSA.2023.0140690

Publication Date

1-1-2023
