Summary of "Knowledge Discovery in a Recommender System: The Matrix Factorization Approach"

Document Type

Article

Abstract

Study Background: The article "Knowledge Discovery in a Recommender System: The Matrix Factorization Approach" by Dr Murchhana Tripathy, Dr Santilata Champati, and Dr Hemanta Kumar Bhuyan shows how two matrix factorization methods, SVD (singular value decomposition) and NMF (nonnegative matrix factorization), can serve multiple objectives in a recommender system. The research explores different aspects of knowledge discovery through these methods. It presents clustering with SVD and NMF and shows how the resulting clusters support out-of-sample extension and help address the cold-start problem of recommender systems. SVD is also used to solve the missing data problem of a recommender system dataset. The analysis shows that, in recommender systems, SVD can perform dimensionality reduction, missing value imputation, clustering, and out-of-sample extension.

Similarly, NMF can perform dimensionality reduction, clustering, and out-of-sample extension. The processes are more autonomous with SVD than with NMF. The work presents algorithms for missing value imputation and for out-of-sample extension, both for a single user and for multiple users.

Research Goals and Hypotheses: The primary goal is to analyze different aspects of knowledge discovery for a collaborative filtering recommender system (CFRS) through matrix factorization methods.

Methodological Approach:

  1. Impute missing data in a CFRS dataset.
  2. Perform SVD, cluster using SVD, and perform out-of-sample extension for a single user and for multiple users.
  3. Perform NMF, cluster using NMF, perform out-of-sample extension for a single user, and outline the steps for multiple users.
  4. Evaluate clustering quality using NMI (Normalized Mutual Information) and purity.
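
The evaluation step can be illustrated with a small sketch of the two measures. This is not the authors' code: the function names are hypothetical, and the NMI here is normalized by the arithmetic mean of the two entropies, which is only one of several common conventions (the summary does not say which one the paper uses).

```python
import math
from collections import Counter


def purity(true_labels, cluster_labels):
    """Fraction of points that belong to the majority true class of their cluster."""
    per_cluster = {}
    for t, c in zip(true_labels, cluster_labels):
        per_cluster.setdefault(c, Counter())[t] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a labelling."""
    n = len(labels)
    return -sum((k / n) * math.log(k / n) for k in Counter(labels).values())


def nmi(true_labels, cluster_labels):
    """NMI = 2 * I(T; C) / (H(T) + H(C)). Assumed normalization; others exist."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    t_counts = Counter(true_labels)
    c_counts = Counter(cluster_labels)
    mi = 0.0
    for (t, c), k in joint.items():
        mi += (k / n) * math.log(k * n / (t_counts[t] * c_counts[c]))
    denom = entropy(true_labels) + entropy(cluster_labels)
    # Convention chosen here: two single-cluster labellings count as identical.
    return 2 * mi / denom if denom > 0 else 1.0


truth = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]   # same grouping, different cluster ids
imperfect = [0, 0, 1, 1, 1, 1]    # one user placed in the wrong cluster
print(purity(truth, relabelled), nmi(truth, relabelled))
print(purity(truth, imperfect), nmi(truth, imperfect))
```

Both measures ignore the cluster ids themselves, so a clustering that recovers the right groups under different labels still scores as a perfect match.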

Results and Discoveries

This work explores different aspects of knowledge discovery using two matrix factorization techniques, SVD and NMF, in the context of a recommender system. Through algorithms and examples, the authors show that SVD and NMF can be used for clustering and out-of-sample extension. The clustering capabilities of both methods are demonstrated on suitably constructed rating matrices. For out-of-sample extension with SVD, two new algorithms are developed: one for a single new user and another for more than one new user. For NMF, out-of-sample extension is carried out by vector projection for a single user, and the steps to follow for multiple users are discussed.
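
To make out-of-sample extension with SVD concrete, here is a minimal sketch of the standard "fold-in" projection, in which a new user's rating row r is mapped to r V_k S_k^-1. It is not necessarily the algorithm developed in the paper. The toy matrix, the choice k = 2, and the sign-based two-cluster rule (`sign_cluster`) are all assumptions made for illustration.

```python
import numpy as np

# Toy users-by-items rating matrix with two obvious taste groups (made-up data).
R = np.array([
    [5, 5, 4, 1, 1],
    [4, 5, 5, 1, 2],
    [1, 1, 2, 5, 4],
    [2, 1, 1, 4, 5],
], dtype=float)

k = 2  # number of latent dimensions kept (illustrative choice)
U, s, Vt = np.linalg.svd(R, full_matrices=False)
U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T


def fold_in(new_ratings, V_k, s_k):
    """Project one or more new rating rows into the latent space: u = r V_k S_k^-1."""
    rows = np.atleast_2d(np.asarray(new_ratings, dtype=float))
    return rows @ V_k / s_k


def sign_cluster(latent_rows):
    """Hypothetical two-cluster rule: split users by the sign of the second coordinate."""
    return (latent_rows[:, 1] > 0).astype(int)


existing = sign_cluster(U_k)                                # clusters of known users
single = sign_cluster(fold_in([5, 4, 5, 1, 1], V_k, s_k))   # one new user
several = sign_cluster(fold_in([[5, 4, 5, 1, 1],
                                [1, 2, 1, 5, 5]], V_k, s_k))  # several new users
print(existing, single, several)
```

The same function handles one new user or a batch, because the projection is a single matrix product and needs no refactorization. Which group receives label 0 or 1 depends on the sign convention of the SVD routine, so only agreement between labels is meaningful.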

Further, the capability of SVD for missing data imputation is explored. In this context, the authors give a detailed analysis of the type of data used and of the imputation methods that can be considered. They present the reasoning for using mode substitution as the imputation technique for recommender system rating matrices, and they explain why NMF cannot be used for missing data imputation. Finally, cluster quality is evaluated using NMI and purity. The authors conclude that either method can be used for clustering, depending on the needs of the application. However, SVD is the more autonomous method: NMF requires the number of clusters to be specified, whereas SVD needs nothing other than the dataset.
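
As an illustration of mode substitution, the sketch below fills each missing rating with the most frequent observed rating for the same item. Several details are assumptions rather than the paper's procedure: missing cells are marked with `None`, the mode is taken per item (column), ties go to the smallest rating, and an item with no ratings at all falls back to the global mode. The function names are hypothetical.

```python
from collections import Counter


def mode_of(values):
    """Most frequent value; ties go to the smallest value (an arbitrary choice)."""
    counts = Counter(values)
    top = max(counts.values())
    return min(v for v, c in counts.items() if c == top)


def impute_mode(ratings):
    """Return a copy of the users-by-items matrix with None cells filled by item modes."""
    n_items = len(ratings[0])
    observed_all = [r for row in ratings for r in row if r is not None]
    global_mode = mode_of(observed_all)  # fallback for items nobody has rated
    col_modes = []
    for j in range(n_items):
        column = [row[j] for row in ratings if row[j] is not None]
        col_modes.append(mode_of(column) if column else global_mode)
    return [
        [col_modes[j] if row[j] is None else row[j] for j in range(n_items)]
        for row in ratings
    ]


ratings = [
    [5, None, 1],
    [5, 3, None],
    [None, 3, 1],
    [4, None, 2],
]
filled = impute_mode(ratings)
print(filled)
```

One property of the mode worth noting is that every filled value is a rating that already occurs on the scale, which is not guaranteed when a mean is substituted.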

Citation to the base paper - Tripathy, M., Champati, S., & Bhuyan, H. K. (2022). Knowledge Discovery in a Recommender System: The Matrix Factorization Approach. Journal of Information & Knowledge Management, 21(04), 2250051.

Publication Date - June, 2022
