Research Article

Neural matrix factorization++ based recommendation system

[version 1; peer review: 1 approved, 2 approved with reservations]
PUBLISHED 25 Oct 2021


Abstract

In recent years, Recommender System (RS) research has covered a wide variety of Artificial Intelligence techniques, ranging from traditional Matrix Factorization (MF) to complex Deep Neural Networks (DNN). Traditional Collaborative Filtering (CF) recommendation methods such as MF have limited learning capability, as they only consider the linear combination of user and item vectors. To learn non-linear relationships, methods like Neural Collaborative Filtering (NCF) incorporate DNNs into CF methods. However, CF methods still suffer from cold start and data sparsity. This paper proposes an improved hybrid-based RS, namely Neural Matrix Factorization++ (NeuMF++), which effectively learns user and item features to improve recommendation accuracy and alleviate cold start and data sparsity. NeuMF++ incorporates effective latent representations into NeuMF via Stacked Denoising Autoencoders (SDAE). NeuMF++ can also be seen as the fusion of GMF++ and MLP++. NeuMF is an NCF framework that combines Generalized Matrix Factorization (GMF) and Multilayer Perceptrons (MLP), and it achieves state-of-the-art results by integrating the linearity of GMF with the non-linearity of MLP. Concurrently, incorporating latent representations has shown tremendous improvement in GMF and MLP, resulting in GMF++ and MLP++. The latent representations obtained through the SDAEs' latent space allow NeuMF++ to learn user and item features effectively, significantly enhancing its learning capability. However, sharing feature extraction between GMF++ and MLP++ in NeuMF++ might hinder its performance. Hence, allowing GMF++ and MLP++ to learn separate features provides more flexibility and greatly improves performance. Experiments performed on a real-world dataset demonstrate that NeuMF++ achieves an outstanding test root-mean-square error of 0.8681. In future work, NeuMF++ can be extended with other auxiliary information such as text or images, and different neural network building blocks can be integrated into NeuMF++ to form a more robust recommendation model.

Keywords

Recommender System, Matrix Factorization, Collaborative Filtering, Deep Neural Networks, Neural Collaborative Filtering.

Introduction

Collaborative Filtering (CF) based Recommender Systems (RS) typically suggest items based on user-item interactions: a user's interests are predicted by analyzing the tastes and preferences of other users in the system. Matrix Factorization (MF),1 popularized by the Netflix Prize,2 has emerged as a powerful CF recommendation tool. However, its simple interaction function, the inner product, has hindered its performance. Moreover, CF methods also suffer from the cold start and data sparsity problems.

Much effort has been devoted to improving MF's accuracy over the years, and one approach that has attracted much attention is deep learning (DL). DL has drastically improved MF's accuracy by exploiting deep neural networks (DNN). Many researchers have also suggested incorporating side information into CF methods, forming hybrid-based (HB) methods that alleviate CF's cold start and data sparsity problems.3

In this paper, we propose a novel hybrid-based RS named Neural Matrix Factorization++ (NeuMF++). NeuMF++ is an improved version of NeuMF that incorporates an effective latent representation of side information via Stacked Denoising Autoencoders (SDAEs). NeuMF achieved outstanding results in its original work, yet surprisingly little prior work has been done to enhance it. In NeuMF++, SDAEs extract high-level representations from side information, which are later incorporated as latent feature vectors. Incorporating user-item features in the learning process enhances the model's learning capability and improves its recommendation performance. Experiments on a real-world dataset demonstrate the effectiveness of side information in NeuMF++, yielding state-of-the-art results.

The rest of the paper is organized as follows. Section 2 discusses the related work. Section 3 introduces our proposed framework, NeuMF++, in detail. Section 4 discusses the result. Finally, section 5 summarizes the paper and briefly introduces our future work.

Related work

There are various DL models, ranging from the standard Multilayer Perceptron (MLP) to the Convolutional Neural Network (CNN). DL models like MLP are utilized to add non-linear transformations to existing linear techniques, interpreting them as neural extensions.4,5 The NCF frameworks,2 which include Generalized MF (GMF), MLP and NeuMF, incorporate DNNs into traditional MF to further enhance its recommendation performance and quality. The three models differ in their interaction functions. GMF uses a linear kernel, taking the user and item latent vectors and multiplying them element by element (element-wise product). In contrast, MLP uses a non-linear kernel, concatenating the user and item latent vectors and feeding them into a fully connected MLP. Lastly, NeuMF integrates the linearity of GMF and the non-linearity of MLP by combining both of their outputs with a single-layer MLP.
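To make the contrast concrete, below is a minimal PyTorch sketch of the three interaction functions (the experiments later in this paper use PyTorch). All layer sizes here are illustrative assumptions, not the paper's settings.

import torch
import torch.nn as nn

d = 8                                        # embedding dimension (illustrative)
p_u, q_i = torch.randn(1, d), torch.randn(1, d)   # user/item latent vectors

# GMF: linear kernel via element-wise product
phi_gmf = p_u * q_i                          # shape (1, d)

# MLP: non-linear kernel via concatenation followed by hidden layers
mlp = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(),
                    nn.Linear(d, d // 2), nn.ReLU())
phi_mlp = mlp(torch.cat([p_u, q_i], dim=1))  # shape (1, d // 2)

# NeuMF: fuse both interaction outputs with a single-layer MLP
neumf_layer = nn.Linear(d + d // 2, 1)
r_hat = neumf_layer(torch.cat([phi_gmf, phi_mlp], dim=1))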

Another popular DL model is the Autoencoder (AE). The AE is a powerful tool for dimensionality reduction and can be considered a strict generalization of Principal Component Analysis; it aims to reconstruct its input data as output. Many popular MF techniques can be thought of as a form of dimensionality reduction,3 so the AE can be adapted to this task as well, as in AutoRec.6 Subsequently,7 further enhanced AutoRec by training it much deeper, which helps the network generalize better. Later,8 proposed the Collaborative Denoising Autoencoder (CDAE), which utilizes a Denoising Autoencoder (DAE) to perform CF tasks: noise is intentionally added to the rating input, and the network reconstructs the original ratings as output. This makes the network more noise-resistant and helps it learn more stable features.
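As a sketch of this denoising idea, one can corrupt a rating vector and train the network to reconstruct the clean ratings. The vector length, hidden size and noise level below are illustrative assumptions.

import torch
import torch.nn as nn

n_items = 3706                               # illustrative rating-vector length
dae = nn.Sequential(nn.Linear(n_items, 64), nn.ReLU(),   # encoder
                    nn.Linear(64, n_items))              # decoder

ratings = torch.rand(32, n_items)            # a batch of rating vectors
noisy = ratings + 0.1 * torch.randn_like(ratings)        # intentional noise
loss = nn.functional.mse_loss(dae(noisy), ratings)       # reconstruct original
loss.backward()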

Most studies focus only on ratings, but ratings alone cannot fully reveal the user-item relation. Additionally, most CF methods suffer from cold start and data sparsity. Hence, several researchers have suggested incorporating side information into the model, forming an HB method.3 For instance,8 proposed a new HB method known as the CF Network (CFN): instead of only adding the side information to the first layer, the authors injected that information into every layer of the network except the output layer.

However, most AE-based CF methods only utilize side information as regularization in their models. Due to the sparse nature of the rating matrix together with the side information, the learned latent vectors might not be very effective. Therefore,9 introduced Collaborative Deep Learning (CDL), in which a DAE learns item features that are then utilized as the item latent vectors for MF. Subsequently,10 proposed the marginalized DAE for CF (mDA-CF), an extension of CDL that adds user latent vectors learned by another AE. The key idea of mDA-CF is to extract user and item features from mDAs and combine them in a joint framework.

Even though both CDL and mDA-CF utilize DNNs to improve recommendation performance, their CF core is still a linear MF. Therefore,3 proposed two models, GMF++ and MLP++, which enhance the GMF and MLP of the NCF framework by incorporating user and item latent vectors extracted from SDAEs into neural collaborative filtering.

Methods

The real-world dataset was obtained from the GroupLens Research Project, a research group in the Department of Computer Science and Engineering at the University of Minnesota. The MovieLens-1M dataset is available at: https://grouplens.org/datasets/movielens/1m/.

Ethical Approval Number: EA1572021

Ethical Approval Body: Research Ethic Committee 2021, Multimedia University

First, we present NeuMF++ as a general framework. Then, we describe feature extraction and neural collaborative filtering in detail. Lastly, we explain the learning and optimization of NeuMF++. Table 1 lists the frequently used notations.

Table 1. Frequently used notations.

Notation | Description
m | Number of users
n | Number of items
d | Embedding dimension
p | User feature dimension
q | Item feature dimension
X ∈ ℝ^{m×p} | User side information
V ∈ ℝ^{n×q} | Item side information
P ∈ ℝ^{m×d} | User embedding matrix
Q ∈ ℝ^{n×d} | Item embedding matrix
r | Rating
σ_l | Non-linear activation function at layer l
W_l | Weight matrix at layer l
b_l | Bias vector at layer l
p_u | User latent vector
q_i | Item latent vector

NeuMF++: A general framework

In this section, the proposed NeuMF++ is introduced in general. As illustrated in Figure 1, NeuMF++ is a hybrid model that bridges multiple SDAEs to a NeuMF. NeuMF++ contains two major components: feature extraction and neural collaborative filtering.


Figure 1. NeuMF++ architecture

In feature extraction, the user features and the item features are each assigned an SDAE. As discussed earlier, recommendation performance and accuracy can be improved by incorporating side information. NeuMF++ utilizes SDAEs to learn user and item features by minimizing the error between the reconstructed and the original input features. Compressed high-level features can then be extracted from the bottleneck layer, the middle-most layer of the network.

In neural collaborative filtering, NeuMF has been chosen as our framework due to its outstanding performance. As mentioned earlier, NeuMF combines the outputs of the GMF and MLP interaction functions; similarly, NeuMF++ combines the outputs of the GMF++ and MLP++ interaction functions. First, the user and item latent vectors are formed by concatenating the user and item embeddings of GMF and MLP with the learned user and item latent feature vectors extracted from the SDAEs. The user and item latent vectors are then fed into the respective GMF++ and MLP++ interaction functions. Finally, the outputs of GMF++ and MLP++ are concatenated and fed into a single-layer MLP, the NeuMF layer, to generate ratings.

NeuMF++: Feature extraction

An SDAE is formed by stacking multiple DAEs on top of one another. Side information (features) is usually composed of subject attributes, such as a user's age and occupation or an item's shape and size. In NeuMF++, the SDAEs take user features X and item features V as input, encode them in a low-dimensional latent space, and then reconstruct X̂ and V̂ in the output space. At the same time, noise is intentionally added between layers during training.

For example, given a set of features X ∈ ℝ^{m×p}, the SDAE minimizes the reconstruction error,

(1)
l_u = \| X - \hat{X} \|_F^2 + \lambda_\omega \| \omega \|_F^2

where ω denotes the model parameters, λ_ω the regularization term, and X̂ the reconstruction of X ∈ ℝ^{m×p}, computed as

(2)
\hat{X} = \sigma_L\big( \cdots \sigma_1\big( \bar{X} W_1^X + b_1^X \big) \cdots W_L^X + b_L^X \big)

where X̄ denotes the noise-corrupted input. During inference, the values of the bottleneck layer can be extracted as in Eq. (3).

(3)
X_{L/2}^X = \sigma_{L/2}\big( \cdots \sigma_1\big( X W_1^X + b_1^X \big) \cdots W_{L/2}^X + b_{L/2}^X \big)
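The following is a minimal PyTorch sketch of this feature extractor, under two stated assumptions: the user-feature dimension and layer sizes are illustrative, and the between-layer corruption is simplified to input-level Gaussian noise (the λ_ω term is typically handled as optimizer weight decay).

import torch
import torch.nn as nn

p, bottleneck = 30, 8                        # assumed feature/latent dimensions
encoder = nn.Sequential(nn.Linear(p, 16), nn.ReLU(),
                        nn.Linear(16, bottleneck), nn.ReLU())
decoder = nn.Sequential(nn.Linear(bottleneck, 16), nn.ReLU(),
                        nn.Linear(16, p))

X = torch.rand(64, p)                        # user side information
X_noisy = X + 0.1 * torch.randn_like(X)      # corrupted input, i.e. X-bar
X_hat = decoder(encoder(X_noisy))            # Eq. (2)
l_u = ((X - X_hat) ** 2).sum()               # Eq. (1), without the λ_ω term

with torch.no_grad():
    latent = encoder(X)                      # Eq. (3): bottleneck features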

NeuMF++: Neural collaborative filtering

NeuMF++ can be seen as the combination of GMF++ and MLP++, where the ++ suffix denotes that side information is appended to the model. First, the user and item IDs are one-hot encoded to obtain the user and item embeddings. Then, the user and item latent feature vectors extracted from the SDAEs are concatenated with their respective embeddings to form the user and item latent vectors p_u and q_i, where ⊕ denotes vector concatenation:

(4)
p_u = P_u \oplus X_{L/2,u}^X
(5)
q_i = Q_i \oplus V_{L/2,i}^V

As discussed earlier, GMF++ and MLP++ use different computations and layers in their interaction functions. GMF++ performs an element-wise product (denoted ⊙) between p_u and q_i, as shown in Eq. (6). In contrast, MLP++ utilizes a standard MLP, adding several hidden layers on top of the concatenated latent vectors, as shown in Eq. (7).

(6)
\phi^{GMF++} = p_u \odot q_i
(7)
\phi^{MLP++} = \sigma_L\big( \cdots \sigma_1\big( (p_u \oplus q_i) W_1 + b_1 \big) \cdots W_L + b_L \big)

Finally, the NeuMF layer, a single-layer MLP, is introduced to combine the GMF++ and MLP++ interaction outputs. Specifically, the integration of GMF++ and MLP++ through a single-layer MLP is formulated in Eq. (8).

(8)
\hat{r} = \sigma\big( \big[ (p_u \odot q_i) \oplus \sigma_L\big( \cdots \sigma_1\big( (p_u \oplus q_i) W_1 + b_1 \big) \cdots \big) \big] W + b \big)

From Eq. (8), we can see that GMF++ and MLP++ share the same p_u and q_i, which are extracted from the same user and item SDAEs. This might limit the performance and learning capability of NeuMF++; for example, the optimal hyperparameters and latent vector sizes for GMF++ and MLP++ might differ. Hence, we allow GMF++ and MLP++ to perform user-item feature extraction separately, which gives NeuMF++ more flexibility. The final NeuMF++ model can thus be written as

(9)
\phi^{GMF++} = p_u^{GMF++} \odot q_i^{GMF++}
(10)
\phi^{MLP++} = \sigma_L\big( \cdots \sigma_1\big( (p_u^{MLP++} \oplus q_i^{MLP++}) W_1 + b_1 \big) \cdots W_L + b_L \big)
(11)
\hat{r} = \sigma\big( (\phi^{GMF++} \oplus \phi^{MLP++}) W + b \big)
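A compact PyTorch sketch of Eqs. (9)-(11) is given below, with separate embeddings and separately extracted SDAE features for the two branches. The module names and sizes are illustrative assumptions, not the authors' exact implementation, and the output scaling/activation is omitted.

import torch
import torch.nn as nn

class NeuMFpp(nn.Module):
    def __init__(self, m, n, d=8, feat_dim=8, mlp_dims=(32, 16, 8)):
        super().__init__()
        # separate embeddings for each branch, as in the original NeuMF
        self.P_gmf, self.Q_gmf = nn.Embedding(m, d), nn.Embedding(n, d)
        self.P_mlp, self.Q_mlp = nn.Embedding(m, d), nn.Embedding(n, d)
        layers, in_dim = [], 2 * (d + feat_dim)
        for out_dim in mlp_dims:
            layers += [nn.Linear(in_dim, out_dim), nn.SELU()]
            in_dim = out_dim
        self.mlp = nn.Sequential(*layers)
        self.neumf = nn.Linear(d + feat_dim + mlp_dims[-1], 1)  # NeuMF layer

    def forward(self, u, i, xu_gmf, vi_gmf, xu_mlp, vi_mlp):
        # Eqs. (4)-(5): embeddings concatenated with SDAE bottleneck features
        pu_g = torch.cat([self.P_gmf(u), xu_gmf], dim=1)
        qi_g = torch.cat([self.Q_gmf(i), vi_gmf], dim=1)
        pu_m = torch.cat([self.P_mlp(u), xu_mlp], dim=1)
        qi_m = torch.cat([self.Q_mlp(i), vi_mlp], dim=1)
        phi_gmf = pu_g * qi_g                                    # Eq. (9)
        phi_mlp = self.mlp(torch.cat([pu_m, qi_m], dim=1))       # Eq. (10)
        return self.neumf(torch.cat([phi_gmf, phi_mlp], dim=1))  # Eq. (11)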

NeuMF++: Learning and optimization

The NeuMF++ objective function consists of the user-item feature reconstruction error from feature extraction and the prediction error from neural collaborative filtering. The loss function of the user and item SDAEs is given in Eq. (1). Since NeuMF++ is a rating prediction model, its output r̂_ui ranges between 0 and N, where N is the maximum rating. Hence, the prediction loss function is defined in Eq. (12),

(12)
l_r = \| r_{ui} - \hat{r}_{ui} \|_F^2 + \lambda_\theta \| \theta \|_F^2

where θ denotes the model parameters and λ_θ the regularization term.

Therefore, the general loss function for optimizing NeuMF++ is formulated in Eq. (13).

(13)
l = l_r + \alpha\, l_u^{GMF++} + \beta\, l_i^{GMF++} + \gamma\, l_u^{MLP++} + \delta\, l_i^{MLP++}

where α, β, γ, δ are trade-off parameters for each reconstruction loss.
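Assuming the rating predictions and the four SDAE reconstruction losses come from the modules sketched above, the joint objective can be assembled as follows (placeholder tensors stand in for the real outputs):

import torch

alpha = beta = gamma = delta = 1e-6          # trade-off parameters

r = torch.rand(32)                           # observed ratings (placeholder)
r_hat = torch.rand(32, requires_grad=True)   # predictions (placeholder)
l_u_gmf = l_i_gmf = l_u_mlp = l_i_mlp = torch.tensor(0.5, requires_grad=True)

l_r = ((r - r_hat) ** 2).sum()               # Eq. (12), without the λ_θ term
loss = (l_r + alpha * l_u_gmf + beta * l_i_gmf
            + gamma * l_u_mlp + delta * l_i_mlp)   # Eq. (13)
loss.backward()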

Results

Experimental settings

This paper uses the public MovieLens-1M dataset.11 The dataset contains approximately 1 million ratings from 6040 unique users across 3706 unique movies, with 95.8% sparsity. Concurrently, we also use the side information provided by the dataset. The user side information consists of age, occupation and gender attributes, while the item side information consists of 18 different movie genres. All features are preprocessed and encoded as one-hot numeric arrays.
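As a hedged illustration of this preprocessing step, the user attributes can be one-hot encoded with pandas; the file path and column names below follow the public MovieLens-1M file layout but are assumptions as far as the authors' code is concerned.

import pandas as pd

# UserID::Gender::Age::Occupation::Zip-code, per the public dataset layout
users = pd.read_csv("ml-1m/users.dat", sep="::", engine="python",
                    names=["user_id", "gender", "age", "occupation", "zip"])
user_feats = pd.get_dummies(users[["gender", "age", "occupation"]],
                            columns=["gender", "age", "occupation"])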

The evaluation index used in this paper is the root mean square error (RMSE), shown in Eq. (14). RMSE is directly related to our loss function; the smaller the RMSE, the better the recommendation accuracy.

(14)
\mathrm{RMSE} = \sqrt{ \frac{ \sum_{u=1}^{m} \sum_{i=1}^{n} ( r_{ui} - \hat{r}_{ui} )^2 }{ N } }

where the sums run over the observed test ratings and N denotes their count.
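Computationally, this is the standard RMSE over the observed test ratings, e.g.:

import torch

r_true = torch.tensor([4.0, 3.0, 5.0])       # placeholder test ratings
r_pred = torch.tensor([3.8, 3.4, 4.6])       # placeholder predictions
rmse = torch.sqrt(((r_true - r_pred) ** 2).mean())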

We compared our proposed NeuMF++ with related baseline models, including MF, GMF, MLP, NeuMF, GMF++ and MLP++.1-3

All the experiments were implemented using PyTorch, a deep learning framework built on the Python programming language. We used the Adam optimizer with a batch size of 1024, a regularization term of 0.001 and a learning rate of 0.001. We split the dataset into a 70:30 ratio, where 70% is used for training and the remaining 30% for testing. The hyperparameters used for the baseline models follow their respective publications.2,3
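A minimal sketch of this setup is shown below; the dataset and model objects are placeholders for the real ones described above.

import torch
from torch.utils.data import TensorDataset, random_split, DataLoader

dataset = TensorDataset(torch.arange(1000), torch.rand(1000))  # placeholder
model = torch.nn.Linear(8, 1)                                  # placeholder

n_train = int(0.7 * len(dataset))            # 70:30 train-test split
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])
train_loader = DataLoader(train_set, batch_size=1024, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-3)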

As mentioned previously, we used different hyperparameters for the GMF++ and MLP++ feature extraction. The GMF++ user and item SDAEs use a single hidden layer of 8 neurons, while the MLP++ user and item SDAEs use three hidden layers of 16, 8 and 16 neurons. Hence, the latent vector dimension of every SDAE is 8. Gaussian noise is also injected into each SDAE layer. In neural collaborative filtering, the embedding dimension d is set to 8. We used ReLU as the GMF++ activation function and SELU as the MLP++ activation function. The MLP++ interaction layers are composed of [32, 16, 8] neurons. Finally, we set all the trade-off parameters α, β, γ, δ to 0.000001.

Experimental result and analysis

Table 2 shows that NeuMF++ outperforms all the other baseline models, with a training RMSE of 0.7964 and a testing RMSE of 0.8681. NeuMF++ achieves a 1.37% improvement over its predecessor NeuMF and a 2% improvement over traditional MF. These results demonstrate the effectiveness of employing DNNs and side information for rating prediction.

Table 2. RMSE of the compared models on the MovieLens-1M dataset with a 70:30 train-test split.

Method | Training RMSE | Testing RMSE
MF | 0.8010 | 0.8958
GMF | 0.7835 | 0.8928
GMF++ | 0.7738 | 0.8894
MLP | 0.8696 | 0.8879
MLP++ | 0.8686 | 0.8864
NeuMF | 0.8152 | 0.8725
NeuMF++ (Ours) | 0.7964 | 0.8681

Figures 2 and 3 show that most models converge very quickly, except for MF and GMF. This indicates that models with DNNs learn much faster than models without DNNs on this dataset. Also, MLP++ does not converge as much as MLP, suggesting that side information has little effect on MLP.


Figure 2. Training loss of compared models over 100 iterations/epochs.


Figure 3. Testing loss of compared models over 100 iterations/epochs.

To demonstrate the effectiveness of separate feature extraction and pre-trained weights for NeuMF++, we compared the performance of three versions of NeuMF++, as shown in Table 3. As expected, NeuMF++ with pre-trained weights and with feature extraction separated between the GMF++ and MLP++ branches achieves the best performance.

Table 3. RMSE of NeuMF++ variants on the MovieLens-1M dataset with a 70:30 train-test split.

Method | Training RMSE | Testing RMSE
NeuMF | 0.8152 | 0.8725
NeuMF++ | 0.8686 | 0.8865
NeuMF++ (separate) | 0.9007 | 0.9108
NeuMF++ (separate + pre-train) | 0.7964 | 0.8681

Concurrently, we also observed that NeuMF++ with feature extraction shared between the GMF++ and MLP++ branches overfits in the early iterations, as shown in Figure 4.


Figure 4. Training and testing loss of different NeuMF variations over 100 iterations/epochs.

At first, we found that NeuMF++ did not perform as well as NeuMF. Hence, inspired by the pre-training method from,2 we loaded and froze pre-trained GMF++ and MLP++ weights in NeuMF++. As a result, we observed an 8.11% improvement, as shown in Table 3. This pre-training method updates the weights within the NeuMF layer but not within the GMF++ and MLP++ layers. Consequently, NeuMF++ with pre-trained weights performed much better than NeuMF++ without them, justifying the usefulness of the pre-training method for initializing NeuMF++.
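In PyTorch terms, this initialization amounts to loading the pre-trained weights and freezing everything except the NeuMF layer, roughly as sketched below. The module names follow the NeuMFpp sketch earlier and the checkpoint path is hypothetical.

import torch

model = NeuMFpp(m=6040, n=3706)
state = torch.load("gmfpp_mlppp_pretrained.pt")      # hypothetical checkpoint
model.load_state_dict(state, strict=False)

for name, param in model.named_parameters():
    if not name.startswith("neumf"):         # freeze all but the NeuMF layer
        param.requires_grad = False

optimizer = torch.optim.Adam((p for p in model.parameters() if p.requires_grad),
                             lr=1e-3)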

Conclusion

In this paper, we proposed an HB recommendation model, NeuMF++, an enhanced version of NeuMF that incorporates effective latent representations of side information. Throughout the experiments, we found that incorporating side information into neural collaborative filtering can improve recommendation performance and alleviate CF cold start and data sparsity.

NeuMF++ is also not limited to categorical or numerical information; it can be extended with other information types such as text or even images. For example, pre-trained word embedding models such as word2vec, ELMo or BERT can transform textual information into input feature representations. Likewise, a CNN can learn features from images to aid feature extraction or neural collaborative filtering.

DL’s flexibility also allows different neural network building blocks to be integrated. This concept can also be applied to NeuMF++ to form a more robust recommendation model and further improve its recommendation precision.

Author contributions

Ong, Ng and Haw conceived the presented idea. Ong carried out the experiment and wrote the manuscript. Ng and Haw supervised the project and provided critical feedback.

Data availability

None.
