# INCORPORATING USER-ITEM SIMILARITY IN HYBRID NEIGHBORHOOD-BASED RECOMMENDATION SYSTEM

**Anonymous authors**
Paper under double-blind review

ABSTRACT

Modern hybrid recommendation systems require a sufficient amount of data. However, several internet privacy issues make users skeptical about sharing their personal information with online service providers. This work introduces various novel methods utilizing the baseline estimate to learn user interests from their interactions. Subsequently, the extracted user feature vectors are used to estimate the user-item correlations, providing an additional fine-tuning factor for neighborhood-based collaborative filtering systems. Comprehensive experiments show that utilizing the user-item similarity can boost the accuracy of hybrid neighborhood-based systems by at least 2.11% while minimizing the need for tracking users' digital footprints.

1 INTRODUCTION

The continuously accelerating growth of communication technology and data storage in the past decades has given customers access to an enormous amount of online multimedia content such as movies, music, news, and articles, creating billion-dollar industries. Following this evolution, recommendation systems (RSs) have been widely developed to automatically help users filter redundant information and suggest only suitable products that fit their needs. Such systems are used in a variety of domains and have become a part of our daily online experience (Ricci et al., 2015).

RSs are commonly classified into three main types (Adomavicius & Tuzhilin, 2005): the content-based technique, the collaborative filtering technique, and the hybrid technique. The content-based approach, as in Lang (1995); Pazzani & Billsus (1997); Lops et al. (2011); Narducci et al. (2016), learns to recommend items that are similar to the ones that a user liked based on the items' features.
The main weakness of this approach is the lack of available and reliable metadata associated with items (Mooney & Roy, 2000). Meanwhile, the collaborative filtering (CF) approach does not require product information but relies only on users' interaction history, which can be either explicit or implicit feedback. CF systems can be divided into two major categories: i) neighborhood-based models, which suggest items that are most similar to the item that a user is interested in (Herlocker et al., 2000; Tintarev & Masthoff, 2007), and ii) matrix factorization models, which explore the latent factors connecting items to users in order to make accurate recommendations (Ricci et al., 2015; Koren et al., 2009; Koren, 2008). However, it is often the case that there is not enough transaction data to make accurate recommendations for a new user or item. To tackle this so-called cold-start problem, hybrid methods have been proposed that incorporate auxiliary information into CF models (Singh & Gordon, 2008; Agarwal & Chen, 2009; Wang & Blei, 2011; Li et al., 2011; Rendle, 2010).

Motivated by the advantages of the hybrid approach, our study attempts to improve typical neighborhood-based RSs by utilizing available content-related knowledge. The main contributions of this work are summarized as follows:

- Introducing new methods to represent user preference by combining the user's interaction data and the items' content-based information, which helps to estimate the similarity between a user and an item.
- Integrating the user-item similarity degree into the baseline estimate of neighborhood-based RSs to provide more precise recommendations, surpassing competitive hybrid models.

The remainder of this paper is organized as follows. Section 2 reviews the basic knowledge on neighborhood-based CF systems, including hybrid models. Section 3 describes the experimental setting. Detailed descriptions of our proposed methods are presented in Section 4. Section 5 gives experimental results and in-depth analysis.
Finally, we conclude this study in Section 6.

2 PRELIMINARIES

In this paper, $u, v$ denote users and $i, j$ denote items. $r_{ui}$ denotes the preference of user $u$ for item $i$, also known as the rating, where high values indicate strong preference, and all the $(u, i)$ pairs with known ratings are stored in the set $K = \{(u, i) \mid r_{ui} \text{ is known}\}$. Meanwhile, $R(u)$ denotes the set of all items rated by user $u$. In the rating prediction task, the objective is to predict the unknown rating $\hat{r}_{ui}$ where user $u$ has not rated item $i$ yet. Popular neighborhood-based CF techniques for the rating prediction task and an existing hybrid variant are briefly reviewed as follows.

2.1 NEIGHBORHOOD-BASED MODELS

The neighborhood-based approach is one of the most popular techniques in CF, and it relies only on the similarity between users or items to give recommendations. There are currently two methods for implementing neighborhood-based CF models: i) the user-oriented (or user-user) model, which predicts a user's preference based on similar users, and ii) the item-oriented (or item-item) model, which finds items similar to the ones a user liked and recommends these items to her. Of the two methods, the latter, introduced by Sarwar et al. (2001), has become dominant. This is because the number of users in real-life systems is often orders of magnitude larger than the number of items, which makes the user-oriented model inefficient. Furthermore, the item-oriented model is capable of providing a rational explanation for recommendations (Ricci et al., 2015). Therefore, our implementations in this work adopt the item-item approach as the base model.

The foundation of neighborhood-based models is the similarity measure. By computing the similarity degree $s_{ij}$ between all pairs of items $i$ and $j$ using popular similarity measures such as Cosine similarity (Cos) or Pearson Correlation Coefficients (PCC), we can identify the set of $k$ neighbors $S^k(i, u)$, which consists of the $k$ items most similar to $i$ that were rated by user $u$.
Then, $\hat{r}_{ui}$ can be predicted as a weighted average of the ratings of similar items:

$$\hat{r}_{ui} = \frac{\sum_{j \in S^k(i,u)} s_{ij} r_{uj}}{\sum_{j \in S^k(i,u)} s_{ij}} \tag{1}$$

Even though Equation (1) can capture the user-item interactions, much of the variation in the observed ratings is due to bias effects associated with either users or items, independently of their interactions. In detail, some items usually receive higher ratings than others, and some users tend to give higher ratings than others. The kNNBaseline model proposed by Koren (2010) adjusts the above formula through a baseline estimate which accounts for the user and item effects as follows.

$$\hat{r}_{ui}^{\text{kNNBaseline}} = b_{ui} + \frac{\sum_{j \in S^k(i,u)} s_{ij} (r_{uj} - b_{uj})}{\sum_{j \in S^k(i,u)} s_{ij}} \tag{2}$$

where $b_{ui} = \mu + b_u + b_i$ denotes the baseline estimate, $\mu$ denotes the mean of all ratings, and $b_u$ and $b_i$ correspond to the bias of user $u$ and item $i$, respectively, which can be trained using popular optimization algorithms such as Stochastic Gradient Descent (SGD) or Alternating Least Squares (ALS).

2.2 INTEGRATING CONTENT-BASED INFORMATION INTO NEIGHBORHOOD-BASED MODELS

A number of problems regarding kNN models that apply a similarity measure to the rating information were noticed by Duong et al. (2019). The first problem is the sparsity of the rating matrix, which might yield an inaccurate similarity score between two items that share only a few common users. Secondly, filtering the common users who rated both items to calculate the similarity score is a time-consuming task due to the large number of users.

To address these problems, a novel similarity measure was proposed using item content-based information. Assume that each item $i$ is characterized by a feature vector $\mathbf{q}_i = \{q_{i1}, q_{i2}, ..., q_{if}\} \in \mathbb{R}^f$, where $f$ is the number of features, and that these vectors are stored in a matrix $Q \in \mathbb{R}^{n \times f}$ with one row per item. The value of each element encodes how strongly an item exhibits a particular property. The similarity score $s_{ij}$ between movies $i$ and $j$ is calculated as follows.
$$s_{ij}^{\mathrm{Cos}^{\mathrm{content}}} = \frac{\sum_{k=1}^{f} q_{ik} q_{jk}}{\sqrt{\sum_{k=1}^{f} q_{ik}^2} \sqrt{\sum_{k=1}^{f} q_{jk}^2}} \tag{3}$$

or

$$s_{ij}^{\mathrm{PCC}^{\mathrm{content}}} = \frac{\sum_{k=1}^{f} (q_{ik} - \bar{q}_i)(q_{jk} - \bar{q}_j)}{\sqrt{\sum_{k=1}^{f} (q_{ik} - \bar{q}_i)^2} \sqrt{\sum_{k=1}^{f} (q_{jk} - \bar{q}_j)^2}} \tag{4}$$

where $\bar{q}_i$ and $\bar{q}_j$ are the means of the feature vectors $\mathbf{q}_i$ and $\mathbf{q}_j$, respectively. Experiments showed that the item-oriented CF models using $\mathrm{Cos}^{\mathrm{content}}$ and $\mathrm{PCC}^{\mathrm{content}}$ provide equivalent accuracy to the state-of-the-art CF models using rating information whilst performing at least 2 times faster. Hereafter, the kNNBaseline model using one of these similarity measures is referred to as kNNContent.

3 EXPERIMENTAL SETTING

3.1 MOVIELENS DATASET AND EVALUATION CRITERIA

In this work, the MovieLens 20M dataset is used as a benchmark. This is a widely used dataset in the study of RSs which originally contains 20,000,263 ratings and 465,564 tag applications across 27,278 movies. The ratings are float values ranging from 0.5 to 5.0 with a step of 0.5. Tag Genome data, which is computed on user-contributed content including tags, ratings, and textual reviews, was first introduced in this version of the MovieLens dataset (Harper & Konstan, 2016).

To utilize this kind of data, a cleaning process is applied to the dataset. Specifically, the movies without genome tags are excluded from the dataset. Then, only movies and users with at least 20 ratings are kept. Table 1 summarizes the result of the cleaning stage.

Table 1: Summary of the original MovieLens 20M and the preprocessed dataset.

|Dataset|# Ratings|# Users|# Movies|Sparsity|
|---|---|---|---|---|
|Original|20,000,263|138,493|27,278|99.47%|
|Preprocessed|19,793,342|138,185|10,239|98.97%|

The preprocessed dataset is split into 2 distinct parts: 80% as the training set and the remaining 20% as the testing set.
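The cleaning stage described above can be sketched as follows. This is a minimal illustration rather than the preprocessing code used in the experiments: the function name `clean_ratings`, the tuple layout, and the toy data are hypothetical, and the order in which the two count-based filters are applied is an assumption, since the text does not specify it.

```python
from collections import Counter


def clean_ratings(ratings, movies_with_genome, min_ratings=20):
    """Filter a list of (user, movie, rating) tuples.

    Assumption: each count-based filter is applied once, movies first and
    then users; the text does not state the order or whether the filtering
    is repeated until it stabilizes.
    """
    # Step 1: exclude movies that have no genome tags.
    kept = [r for r in ratings if r[1] in movies_with_genome]

    # Step 2: keep only movies with at least `min_ratings` ratings.
    movie_counts = Counter(movie for _, movie, _ in kept)
    kept = [r for r in kept if movie_counts[r[1]] >= min_ratings]

    # Step 3: keep only users with at least `min_ratings` ratings.
    user_counts = Counter(user for user, _, _ in kept)
    return [r for r in kept if user_counts[r[0]] >= min_ratings]


# Toy data with a lowered threshold so that the effect is visible.
toy_ratings = [
    ("u1", "m1", 4.0), ("u1", "m2", 3.0),
    ("u2", "m1", 5.0), ("u2", "m2", 2.5),
    ("u3", "m3", 4.5), ("u3", "m1", 1.0),
]
cleaned = clean_ratings(toy_ratings, movies_with_genome={"m1", "m2"}, min_ratings=2)
print(len(cleaned), "ratings remain")
```

In the toy data, movie `m3` has no genome tags, so its rating is dropped first; user `u3` is then left with a single rating and is removed by the user filter.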
To evaluate the performance of the proposed models, three commonly used indicators in the field of rating prediction are used: RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error) for accuracy evaluation, where smaller values indicate better performance, and Time [s] for timing evaluation. Here, RMSE and MAE are defined as follows.

$$\text{RMSE} = \sqrt{\sum_{(u,i) \in \text{TestSet}} (\hat{r}_{ui} - r_{ui})^2 / |\text{TestSet}|} \tag{5}$$

$$\text{MAE} = \sum_{(u,i) \in \text{TestSet}} |\hat{r}_{ui} - r_{ui}| / |\text{TestSet}| \tag{6}$$

where $|\text{TestSet}|$ is the size of the testing set. The total duration of the model's learning process on the training set and of predicting all samples in the testing set is measured as Time [s]. All experiments are carried out using Google Colaboratory with 25GB RAM and no GPU.

3.2 BASELINE MODELS

In this paper, several popular models are selected as baselines to evaluate the proposed methods. Firstly, we implement two competitive neighborhood-based models: kNNBaseline (Koren, 2010) and kNNContent (Duong et al., 2019). Besides, SVD (Funk, 2006) and SVD++ (Koren, 2008), two well-known representatives of the matrix factorization technique, are also evaluated due to their superior accuracy and flexible scalability with extremely sparse data (Funk, 2006). Error rates and the time to make predictions are measured for comparison.

The optimal hyperparameters for each baseline model are carefully chosen using 5-fold cross-validation. In particular, the error rates of the neighborhood-based models are calculated with the neighborhood size $k \in \{10, 15, 20, 25, 30, 35, 40\}$. Due to their better performance compared to Cos and $\mathrm{Cos}^{\mathrm{content}}$, PCC and $\mathrm{PCC}^{\mathrm{content}}$ are chosen as the similarity measures in the kNNBaseline and kNNContent models. The SVD and SVD++ models are trained using 40 hidden factors with 100 iterations and a step size of 0.002.
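For reference, the two error measures of Equations (5) and (6), which are reported for every model in this paper, can be computed as in the following minimal sketch. The function names and the sample ratings are hypothetical and serve only to illustrate the definitions.

```python
import math


def rmse(predicted, actual):
    """Equation (5): square root of the mean squared prediction error."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, actual):
    """Equation (6): mean of the absolute prediction errors."""
    absolute = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(absolute) / len(absolute)


# Hypothetical test-set ratings and the corresponding predictions.
actual = [4.0, 3.5, 5.0, 2.0]
predicted = [3.5, 3.5, 4.0, 3.0]
print(rmse(predicted, actual), mae(predicted, actual))
```

Because RMSE squares the errors before averaging, it penalizes the two one-star misses in this sample more heavily than MAE does, which is why both measures are usually reported together.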
4 PROPOSED SYSTEMS

So far, kNNBaseline models have successfully applied the item-item similarity, exploiting rating information and the available metadata representing item features provided by users. In contrast, user-user correlation is difficult to deploy in practical applications due to its modest performance and high memory requirement (Ricci et al., 2015). Besides, to the best of our knowledge, the interest of a user in individual characteristics of an item has not received careful consideration, which is a major problem restricting the growth of RSs. One of the main reasons is that the user-item correlation is commonly defined as a similarity degree between a user's interest in individual item features and an item feature vector, which requires a customer to provide as many of her personal preferences as possible for accurate recommendations. In reality, this is impractical due to a variety of data privacy concerns (Jeckmans et al., 2013).

This study first tackles this problem by introducing various novel methods to represent a user's preference in the form of a vector, utilizing her past interactions with items and the feature vectors of those items. We then propose a modification to the baseline estimate of the kNNBaseline model and its variants by integrating the user-item similarity score, which boosts the precision of the conventional kNNBaseline model.

4.1 ESTIMATING USER INTERESTS FOR USER-ITEM SIMILARITY MEASURE

In RSs, there are two main sources of information: the interaction records (such as transaction history, ratings, ...) and the item content information (item catalog, movie genres, or the Tag Genome data in the MovieLens 20M dataset). User personal data, however, is not stored or included in publicly available datasets due to the risk of exposing user identities. Therefore, in most datasets for research, there is rarely any data or statistic that directly specifies user interest in each item feature.
In this section, we present 3 different methods to characterize a user's interest based on the ratings and metadata of the movies she watched.

The most straightforward approach to estimate a user's interest is via a weighted average of the feature vectors $\mathbf{q}_i$ of the items that she rated as follows.

$$\mathbf{p}_u^{\text{norm}} = \frac{\sum_{i \in R(u)} r_{ui}^{\text{norm}} \cdot \mathbf{q}_i}{|R(u)|} \tag{7}$$

where $r_{ui}^{\text{norm}}$ is the rating of user $u$ for item $i$ normalized to the range $[0, 1]$. As a result, the normalized feature vector $\mathbf{p}_u^{\text{norm}}$ of user $u$ has the same dimension and range of element values as an item vector $\mathbf{q}_i$. More importantly, each user is now described in an explainable way: elements with higher values indicate that the user has a greater preference for the corresponding item attributes and vice versa.

Although this method provides a simple shortcut to understanding user preferences, all users are treated in the same way. Specifically, users' ratings are all normalized using the minimum and maximum values of the system's rating scale. In practice, however, different users have a variety of tendencies when rating an item, according to their characters. For example, easy-going people often rate movies a little higher than they really feel, and conversely, strict users often give lower scores than the others. That means if two users have conflicting views after watching a movie but both give it a 3-star rating, for example, then the system will implicitly assume they have the same weight of opinion. This observation motivated research that takes into account the user and item biases, which have a considerable impact on kNNBaseline and biased SVD models (Koren et al., 2009; Koren, 2010). Therefore, a modification of Equation (7) incorporating the effect of biases is proposed as follows.

$$\mathbf{p}_u^{\text{biased}} = \frac{\sum_{i \in R(u)} z_{ui} \cdot \mathbf{q}_i}{\sum_{i \in R(u)} |z_{ui}|} \tag{8}$$

where the residual rating $z_{ui} = r_{ui} - b_{ui}$ denotes how much more or less than her expected rating user $u$ gives to movie $i$. In more detail, this formula applies the residual ratings as weighting factors to the corresponding item feature vectors, which helps to eliminate the restrictions of $r_{ui}^{\text{norm}}$. The resulting biased user feature vector $\mathbf{p}_u^{\text{biased}}$ now has its elements in the value range of $[-1, +1]$, where $-1$ / $+1$ indicates that she totally hates / loves the respective item attribute, and 0 indicates a neutral preference. It is expected that $\mathbf{p}_u^{\text{biased}}$ could measure the user's interest in each item attribute more precisely than its normalized version $\mathbf{p}_u^{\text{norm}}$.

However, both of the above methods treat all items equally in profiling a user's interest. For example, consider user Janet and two movies "Titanic" and "Mad Max". The scores of "Titanic" and "Mad Max" for the romantic genre are 0.90 and 0.05, respectively, which means "Titanic" is a romantic movie while "Mad Max" has almost no romantic scenes. Assume that Janet's normalized ratings for these movies are $r^{\text{norm}}_{\text{Janet,Titanic}} = 0.7$ and $r^{\text{norm}}_{\text{Janet,Mad Max}} = 0.72$, which are almost identical. Thus, the romantic genre score of Janet calculated by Equation (7) is quite low: $(0.7 \times 0.9 + 0.72 \times 0.05)/2 = 0.333$. The fact that "Mad Max" has no romantic element does not mean that Janet doesn't like romantic movies. Equation (8) encounters the same problem. This might lead to misunderstanding the character of a user in some cases.

This problem can be solved by reducing the influence of low-score features whilst primarily focusing on features with high values. Accordingly, the simplest method is to use the feature scores themselves as weights alongside the normalized ratings when estimating user feature vectors.
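The three ways of building a user profile can be compared on the Janet example with the short sketch below. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, only the romantic-genre column of the item vectors is used, and returning 0 when a denominator vanishes is a choice made here for robustness.

```python
def profile_average(weights, features):
    """Equation (7): rating-weighted sum of item vectors over the item count."""
    n_items, n_features = len(features), len(features[0])
    return [sum(w * q[k] for w, q in zip(weights, features)) / n_items
            for k in range(n_features)]


def profile_biased(residuals, features):
    """Equation (8): residual-weighted sum over the sum of |residuals|."""
    n_features = len(features[0])
    total = sum(abs(z) for z in residuals)
    if total == 0:  # assumption: a user with no residual signal is neutral
        return [0.0] * n_features
    return [sum(z * q[k] for z, q in zip(residuals, features)) / total
            for k in range(n_features)]


def profile_feature_weighted(weights, features):
    """Feature scores used as extra weights: sum(w * q^2) / sum(|w| * q)."""
    profile = []
    for k in range(len(features[0])):
        numerator = sum(w * q[k] ** 2 for w, q in zip(weights, features))
        denominator = sum(abs(w) * q[k] for w, q in zip(weights, features))
        # Assumption: a feature absent from every rated item scores 0.
        profile.append(numerator / denominator if denominator else 0.0)
    return profile


# Janet's normalized ratings and the romantic-genre scores of the two movies.
janet_ratings = [0.70, 0.72]      # "Titanic", "Mad Max"
romance_scores = [[0.90], [0.05]]

plain = profile_average(janet_ratings, romance_scores)[0]
weighted = profile_feature_weighted(janet_ratings, romance_scores)[0]
print(round(plain, 3), round(weighted, 3))
```

Passing residual ratings $z_{ui}$ instead of normalized ratings to `profile_feature_weighted` yields the bias-aware variant of the same weighting.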
Specifically for the above example, the score of Janet for the romantic genre is equal to $\frac{(0.7 \times 0.9 \times 0.9) + (0.72 \times 0.05 \times 0.05)}{(0.7 \times 0.9) + (0.72 \times 0.05)} = 0.854$, which is much more reasonable than measuring the affection of a user for a genre based on items that are not relevant to that genre. The biased feature vector of user $u$ weighted by the item feature vectors, $\mathbf{p}_u^{\text{w-biased}}$, can be formulated as follows.

$$\mathbf{p}_u^{\text{w-biased}} = \frac{\sum_{i \in R(u)} z_{ui} \cdot \mathbf{q}_i^2}{\sum_{i \in R(u)} |z_{ui}| \cdot \mathbf{q}_i} \tag{9}$$

where the square and the division are taken element-wise.

From the user feature vectors calculated using one of the above methods, it is noticeable that each value describing user interests has comparatively the same meaning as the corresponding value in the item feature vector. Therefore, the strength of the relevance between a user and an item can be evaluated using common similarity measures such as Cos or PCC (Section 2.2), which eventually yields a user-item similarity matrix. In the next section, we demonstrate the effectiveness of these vector representations by integrating the user-item similarity into the popular kNNBaseline model and its variants.

4.2 INTEGRATING THE USER-ITEM CORRELATIONS INTO THE BASELINE ESTIMATE

In item-oriented kNNBaseline, the baseline estimate takes the main role of predicting the coarse ratings while the analogy between items serves as a fine-tuning term to improve the accuracy of the final predicted ratings. It follows that the closer the baseline estimate is to the targeted rating, the better the kNNBaseline models become in terms of prediction accuracy (Duong Tan et al., 2020). However, a conventional baseline estimate only considers the biases of users and items separately, ignoring the user-item correlations, which might lead to a rudimentary evaluation approach. For example, an RS needs to estimate the ratings of user James for two movies, "Titanic" and "Mad Max".
Assume that the average rating, $\mu$, is 3.7 stars. Furthermore, "Titanic" is better than an ordinary movie, so it tends to be rated 0.5 stars above the average. Meanwhile, James is a critical user, who usually rates 0.3 stars lower than a moderate user. Thus, the baseline estimate of "Titanic" rated by James would be 3.9 stars (= 3.7 − 0.3 + 0.5). On the other hand, "Mad Max" tends to be rated 0.6 stars higher than the mean rating; hence, the baseline estimate of James for "Mad Max" would be 4.0 stars (= 3.7 − 0.3 + 0.6). However, from James's past interactions with other movies, the system estimates James's interests using one of the methods described in Section 4.1 and discovers that a romantic drama like "Titanic" seems to be very suitable for James, while an action thriller like "Mad Max" runs contrary to his tastes. Consequently, the above predicted ratings of James now turn out to be rather irrational.

Figure 1: The residual rating of several users with respect to the user-item similarity degree. The $s_{ui}$ values are calculated using the PCC similarity measure to compare the Tag Genome data of the movies in the MovieLens 20M dataset with the $\mathbf{p}_u^{\text{w-biased}}$ matrix. The red trendlines are determined using linear regression.

As illustrated in Figure 1, there is an approximately linear relationship between the user-item correlations and the residual ratings: the more interested a user is in a movie (i.e., the larger the user-item similarity score), the higher she tends to rate that movie. In order to take the analogy between user and item into account, we propose a revised version of the baseline estimate by integrating the user-item similarity score as follows.

$$b_{ui} = \mu + b_u + b_i + \omega \times s_{ui} \tag{10}$$

where $s_{ui}$ is the similarity degree between user $u$ and item $i$, and $\omega$ serves as the weight that adjusts the contribution of the user-item correlation term to fit the rating information.
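To make the effect of Equation (10) concrete, the sketch below revisits the James example. The biases and the mean rating are the ones quoted above, whereas the weight `omega = 0.5` and the two similarity scores are hypothetical values chosen purely for illustration; they are not taken from the experiments.

```python
def enhanced_baseline(mu, b_u, b_i, omega, s_ui):
    """Equation (10): conventional baseline plus a weighted similarity term."""
    return mu + b_u + b_i + omega * s_ui


mu, b_james = 3.7, -0.3
b_titanic, b_mad_max = 0.5, 0.6

# Hypothetical values: James matches "Titanic" well and "Mad Max" poorly.
omega = 0.5
s_titanic, s_mad_max = 0.6, -0.4

titanic = enhanced_baseline(mu, b_james, b_titanic, omega, s_titanic)
mad_max = enhanced_baseline(mu, b_james, b_mad_max, omega, s_mad_max)
print(round(titanic, 2), round(mad_max, 2))
```

With these assumed values the ordering of the two estimates is reversed relative to the conventional baseline (3.9 and 4.0 stars), which is the kind of correction the similarity term is meant to provide. Setting `omega = 0` recovers the conventional estimate.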
By introducing $\omega$, the least squares problem of the enhanced baseline estimate is updated to the following function.

$$b_u^*, b_i^*, \omega^* = \arg\min_{b_u, b_i, \omega} \sum_{(u,i) \in K} \left(r_{ui} - (\mu + b_u + b_i + \omega s_{ui})\right)^2 + \lambda \left( \sum_{u} b_u^2 + \sum_{i} b_i^2 + \sum_{(u,i) \in K} \omega^2 \right) \tag{11}$$

In this paper, two common optimization techniques, namely SGD and ALS, are evaluated for solving this problem. An SGD optimizer minimizes the sum of the squared errors in Equation (11) using the following update rule.

$$\begin{aligned} b_u &\leftarrow b_u + \alpha (e_{ui} - \lambda b_u) \\ b_i &\leftarrow b_i + \alpha (e_{ui} - \lambda b_i) \\ \omega &\leftarrow \omega + \alpha (e_{ui} s_{ui} - \lambda \omega) \end{aligned} \tag{12}$$

where $e_{ui} = r_{ui} - \hat{r}_{ui}$ is the prediction error, $\alpha$ is the learning rate, and $\lambda$ is the L2 regularization term.

Different from SGD, the ALS technique decouples the calculation of one parameter from the others (Koren, 2010). One iteration of the ALS process can be described as follows. First, for each item $i$, the optimizer fixes the $b_u$'s and $\omega$ to solve for the $b_i$'s.

$$b_i = \frac{\sum_{u \mid (u,i) \in K} (r_{ui} - \mu - b_u - \omega s_{ui})}{\lambda_i + |\{u \mid (u,i) \in K\}|} \tag{13}$$

Then, for each user $u$, the optimizer fixes the $b_i$'s and $\omega$ to solve for the $b_u$'s.

$$b_u = \frac{\sum_{i \mid (u,i) \in K} (r_{ui} - \mu - b_i - \omega s_{ui})}{\lambda_u + |\{i \mid (u,i) \in K\}|} \tag{14}$$

Finally, the optimizer fixes both the $b_u$'s and the $b_i$'s to solve for $\omega$.

$$\omega = \frac{\sum_{(u,i) \in K} s_{ui} (r_{ui} - \mu - b_u - b_i)}{\lambda_\omega + |K|} \tag{15}$$

Here, the regularization terms $\lambda_i$, $\lambda_u$, and $\lambda_\omega$ act as shrinkage factors, and their suitable values depend on the number of ratings that affect each parameter. Therefore, the $b_u$'s, the $b_i$'s, and $\omega$ each need a distinct value of $\lambda$, which can be determined by cross-validation. By applying a learnable weighting factor $\omega$ to the user-item similarity term, the new kNNBaseline model is capable of exploiting auxiliary information to achieve more precise predictions.
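The ALS iteration of Equations (13) to (15) can be sketched in a few lines of plain Python. This is a minimal illustration under assumptions, not the implementation used in the experiments: the function name, the data layout, and the toy data are hypothetical, the defaults for `lam_u` and `lam_i` follow the typical values quoted in Section 5.1, and `lam_w = 0.0` is only a placeholder because that parameter is tuned by grid search.

```python
def fit_enhanced_baseline_als(ratings, sim, n_epochs=10,
                              lam_u=15.0, lam_i=10.0, lam_w=0.0):
    """ALS for b_ui = mu + b_u + b_i + omega * s_ui.

    ratings: list of (user, item, rating) tuples (the set K).
    sim:     dict mapping (user, item) to the similarity s_ui.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)
    b_u = {u: 0.0 for u, _, _ in ratings}
    b_i = {i: 0.0 for _, i, _ in ratings}
    omega = 0.0

    for _ in range(n_epochs):
        # Equation (13): item biases, with user biases and omega fixed.
        total = dict.fromkeys(b_i, 0.0)
        count = dict.fromkeys(b_i, 0)
        for u, i, r in ratings:
            total[i] += r - mu - b_u[u] - omega * sim[(u, i)]
            count[i] += 1
        for i in b_i:
            b_i[i] = total[i] / (lam_i + count[i])

        # Equation (14): user biases, with item biases and omega fixed.
        total = dict.fromkeys(b_u, 0.0)
        count = dict.fromkeys(b_u, 0)
        for u, i, r in ratings:
            total[u] += r - mu - b_i[i] - omega * sim[(u, i)]
            count[u] += 1
        for u in b_u:
            b_u[u] = total[u] / (lam_u + count[u])

        # Equation (15): similarity weight, with all biases fixed.
        weighted = sum(sim[(u, i)] * (r - mu - b_u[u] - b_i[i])
                       for u, i, r in ratings)
        omega = weighted / (lam_w + len(ratings))

    return mu, b_u, b_i, omega


# Toy data: ratings rise and fall with the user-item similarity.
toy_sim = {("a", "x"): 0.8, ("a", "y"): -0.8,
           ("b", "x"): -0.8, ("b", "y"): 0.8}
toy_ratings = [("a", "x", 4.5), ("a", "y", 2.5),
               ("b", "x", 2.5), ("b", "y", 4.5)]
mu, b_u, b_i, omega = fit_enhanced_baseline_als(toy_ratings, toy_sim)
estimate = mu + b_u["a"] + b_i["x"] + omega * toy_sim[("a", "x")]
print(round(omega, 3), round(estimate, 3))
```

In this toy set the user and item biases cancel out, so all of the structure in the ratings is captured by the similarity term; a conventional baseline would predict the global mean for every pair.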
5 PERFORMANCE EVALUATION

To assess the new methods of characterizing user preferences and the proposed baseline estimate in Section 4, the Tag Genome data in the MovieLens 20M dataset is used to construct a movie feature vector: $\mathbf{q}_i = \{g_{i1}, g_{i2}, ..., g_{ik}, ...\}$, where $g_{ik}$ is the genome score of the $k$-th genome tag. In our experiments, $\mathbf{p}_u^{\text{norm}}$, $\mathbf{p}_u^{\text{biased}}$, and $\mathbf{p}_u^{\text{w-biased}}$ are first integrated into the traditional baseline estimate to find the best technique for profiling user interest in terms of prediction accuracy. The enhanced baseline estimate is then incorporated into several neighborhood-based models to comprehensively evaluate its impact on the final rating prediction.

5.1 ACCURACY OF THE BASELINE ESTIMATE UTILIZING THE USER-ITEM CORRELATION

The enhanced baseline estimates are learned using both optimization algorithms, SGD and ALS, for comparison. For SGD, the baselines are trained using the learning rate $\alpha = 0.005$ and the regularization $\lambda = 0.02$. For ALS, typical values for $\lambda_u$ and $\lambda_i$ in the MovieLens dataset are 15 and 10, respectively (Hug, 2020). However, the number of training points in the set $K$ is much larger than the number of appearances of each user or item, so the appropriate value of $\lambda_\omega$ differs greatly from those of $\lambda_u$ and $\lambda_i$. Therefore, a grid search is performed on $\lambda_\omega$, which finds that $\lambda_\omega = -9{,}500{,}000$ is the best choice.

Table 2 shows that utilizing the user-item correlation helps to improve the accuracy of the traditional baseline estimate at the price of increased complexity. Empirical results also demonstrate the superiority of $\mathbf{p}_u^{\text{w-biased}}$ over its counterparts for both similarity measures being used. Specifically, calculating the user-item similarity with PCC achieves a coarse rating prediction with 6.46% lower RMSE and 6.71% lower MAE but takes approximately 3.6 times as much time as the original baseline estimate (optimized via ALS).
A noteworthy point here is that ALS achieves consistently lower error rates than SGD in all cases, at the expense of requiring an additional hyperparameter tuning process (and thus further computational complexity). However, this trade-off is acceptable at this stage because the time needed to determine the baseline estimate is negligible compared to the total time needed to make the final prediction. Hence, ALS is selected as the optimizer for the proposed baseline estimate hereafter.

Table 2: Performance of the enhanced baseline estimates with different types of user feature vectors. The conventional baseline estimate without the user-item similarity is included for comparison. Values in parentheses are relative to the conventional baseline estimate trained with the same optimizer.

|User feature vectors|Similarity measure|SGD RMSE|SGD MAE|SGD Time [s]|ALS RMSE|ALS MAE|ALS Time [s]|
|---|---|---|---|---|---|---|---|
|None||0.8593|0.6595|24|0.8576|0.6590|34|
|$\mathbf{p}_u^{\text{norm}}$|Cos|0.8553 (-0.47%)|0.6567 (-0.42%)|71 (×3.0)|0.8351 (-2.62%)|0.6348 (-3.67%)|114 (×3.4)|
|$\mathbf{p}_u^{\text{norm}}$|PCC|0.8432 (-1.87%)|0.6474 (-1.83%)|75 (×3.1)|0.8184 (-4.80%)|0.6274 (-4.79%)|121 (×3.6)|
|$\mathbf{p}_u^{\text{biased}}$|Cos|0.8153 (-5.12%)|0.6239 (-5.40%)|73 (×3.3)|0.8129 (-5.21%)|0.6228 (-5.49%)|117 (×3.4)|
|$\mathbf{p}_u^{\text{biased}}$|PCC|0.8096 (-5.78%)|0.6201 (-5.97%)|79 (×3.3)|0.8072 (-5.88%)|0.6186 (-6.13%)|126 (×3.7)|
|$\mathbf{p}_u^{\text{w-biased}}$|Cos|0.8149 (-5.17%)|0.6235 (-5.46%)|74 (×3.1)|0.8057 (-6.05%)|0.6172 (-6.34%)|119 (×3.5)|
|$\mathbf{p}_u^{\text{w-biased}}$|PCC|0.8069 (-6.10%)|0.6171 (-6.43%)|80 (×3.3)|**0.8022** (-6.46%)|**0.6148** (-6.71%)|**122** (×3.6)|

5.2 PERFORMANCE OF THE UNIFIED NEIGHBORHOOD-BASED SYSTEM

Finally, the advanced baseline estimates are integrated into the kNNBaseline and kNNContent models to refine the ultimate rating predictions. For calculating the item-item similarity in the kNNBaseline model, two common measures, Cos and PCC, are examined. The same goes for the kNNContent model, where $\mathrm{Cos}^{\mathrm{content}}$ and $\mathrm{PCC}^{\mathrm{content}}$ are both implemented for comparison.

Figure 2: Error rates of kNNBaseline and kNNContent models when incorporating the user-item correlations with different sizes of the neighborhood.

As shown in Figure 2, the advantage of the modified baseline estimates over the original one translates into a clear improvement in prediction accuracy: the newly proposed neighborhood-based models outperform their original versions in all cases. It is noticeable that even though incorporating $s_{ui}$ calculated by PCC is clearly better than using Cos with $\mathbf{p}_u^{\text{norm}}$, the difference between these two similarity measures becomes much smaller in the case of $\mathbf{p}_u^{\text{biased}}$ and almost disappears with $\mathbf{p}_u^{\text{w-biased}}$. This is because the user feature vector generated by Equation (8) or Equation (9) is built from ratings from which the baseline estimate has been subtracted, which brings the mean of the resulting vector close to 0. Therefore, applying Cos or PCC to the approximately zero-mean vectors produces nearly identical results. In the following experiments, $\mathbf{p}_u^{\text{w-biased}}$ and PCC are therefore chosen for calculating the user-item similarity to obtain the best accuracy.

Table 3 shows a comparison between the neighborhood-based models incorporating the user-item correlations and several common CF ones.
The most accurate model, kNNContent with $s_{ui}$, gains:

- 4.80% lower RMSE and 4.88% lower MAE than the original kNNBaseline.
- 2.11% lower RMSE and 2.03% lower MAE than the original kNNContent.
- 2.56% lower RMSE and 2.91% lower MAE than SVD.
- 2.22% lower RMSE and 2.10% lower MAE than SVD++.

Table 3: Performance of the neighborhood-based models utilizing the user-item correlations against popular CF models.

|Model|RMSE|MAE|Time [s]|
|---|---|---|---|
|kNNBaseline (k = 40)|0.8108|0.6167|565|
|kNNContent (k = 20)|0.7885|0.5988|293|
|SVD (40 factors)|0.7922|0.6042|292|
|SVD++ (40 factors)|0.7894|0.5992|27,387|
|kNNBaseline incorporating $s_{ui}$ (k = 40)|0.7853|0.5981|659|
|kNNContent incorporating $s_{ui}$ (k = 25)|**0.7719**|**0.5866**|**392**|

These improvements in prediction accuracy are achieved at the expense of additional complexity. However, in practice, evaluating the user-item similarity matrix from fixed-length vectors could be performed in parallel at a low computational cost. Hence, we consider this trade-off worthwhile in real-life applications.

6 CONCLUSION

In this paper, we first introduced various techniques to characterize user preferences utilizing both rating data and item content information. The new user representations not only help to understand user interests in each item attribute but also make it possible to measure the user-item correlations. An innovative method was then proposed to adjust the baseline estimate of the kNNBaseline model so that it takes the user-item similarity into account. As a result, the resulting hybrid models achieve at least 2.11% lower RMSE and 2.03% lower MAE compared to their neighborhood-based counterparts.
This leads to the conclusion that neighborhood-based RSs could be greatly improved by integrating both the item-item and user-item correlations in the predicting model.

REFERENCES

Gediminas Adomavicius and Alexander Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. _IEEE Transactions on Knowledge and Data Engineering_, 17(6):734–749, 2005.

Deepak Agarwal and Bee-Chung Chen. Regression-based latent factor models. In _Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining_, pp. 19–28. ACM, 2009.

Tan Nghia Duong, Viet Duc Than, Tuan Anh Vuong, Trong Hiep Tran, Quang Hieu Dang, Duc Minh Nguyen, and Hung Manh Pham. A novel hybrid recommendation system integrating content-based and rating information. In _International Conference on Network-Based Information Systems_, pp. 325–337. Springer, 2019.

Nghia Duong Tan, Tuan Anh Vuong, Duc Minh Nguyen, and Quang Hieu Dang. Utilizing an autoencoder-generated item representation in hybrid recommendation system. _IEEE Access_, PP: 1–1, 04 2020. doi: 10.1109/ACCESS.2020.2989408.

Simon Funk. Netflix update: Try this at home, 2006.

F Maxwell Harper and Joseph A Konstan. The MovieLens datasets: History and context. _ACM Transactions on Interactive Intelligent Systems (TiiS)_, 5(4):19, 2016.

Jonathan L Herlocker, Joseph A Konstan, and John Riedl. Explaining collaborative filtering recommendations. In _Proceedings of the 2000 ACM conference on Computer supported cooperative work_, pp. 241–250. ACM, 2000.

Nicolas Hug. Surprise: A Python library for recommender systems. _Journal of Open Source Software_, 5(52):2174, 2020. doi: 10.21105/joss.02174. URL https://doi.org/10.21105/joss.02174.

Arjan JP Jeckmans, Michael Beye, Zekeriya Erkin, Pieter Hartel, Reginald L Lagendijk, and Qiang Tang. Privacy in recommender systems. In _Social media retrieval_, pp. 263–281. Springer, 2013.

Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In _Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining_, pp. 426–434. ACM, 2008.

Yehuda Koren. Factor in the neighbors: Scalable and accurate collaborative filtering. _ACM Transactions on Knowledge Discovery from Data (TKDD)_, 4(1):1, 2010.

Yehuda Koren, Robert Bell, and Chris Volinsky. Matrix factorization techniques for recommender systems. _Computer_, 42(8):30–37, 2009.

Ken Lang. Newsweeder: Learning to filter netnews. In _Machine Learning Proceedings 1995_, pp. 331–339. Elsevier, 1995.

Wu-Jun Li, Dit-Yan Yeung, and Zhihua Zhang. Generalized latent factor models for social network analysis. In _Twenty-Second International Joint Conference on Artificial Intelligence_, 2011.

Pasquale Lops, Marco De Gemmis, and Giovanni Semeraro. Content-based recommender systems: State of the art and trends. In _Recommender systems handbook_, pp. 73–105. Springer, 2011.

Raymond J Mooney and Loriene Roy. Content-based book recommending using learning for text categorization. In _Proceedings of the fifth ACM conference on Digital libraries_, pp. 195–204. ACM, 2000.

Fedelucio Narducci, Pierpaolo Basile, Cataldo Musto, Pasquale Lops, Annalina Caputo, Marco de Gemmis, Leo Iaquinta, and Giovanni Semeraro. Concept-based item representations for a cross-lingual content-based recommendation process. _Information Sciences_, 374:15–31, 2016.

Michael Pazzani and Daniel Billsus. Learning and revising user profiles: The identification of interesting web sites. _Machine learning_, 27(3):313–331, 1997.

Steffen Rendle. Factorization machines. In _2010 IEEE International Conference on Data Mining_, pp. 995–1000. IEEE, 2010.

Francesco Ricci, Lior Rokach, and Bracha Shapira. Recommender systems: introduction and challenges. In _Recommender systems handbook_, pp. 1–34. Springer, 2015.
Badrul Munir Sarwar, George Karypis, Joseph A Konstan, John Riedl, et al. Item-based collaborative filtering recommendation algorithms. _WWW_, 1:285–295, 2001.

Ajit P Singh and Geoffrey J Gordon. Relational learning via collective matrix factorization. In _Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining_, pp. 650–658. ACM, 2008.

Nava Tintarev and Judith Masthoff. A survey of explanations in recommender systems. In _2007 IEEE 23rd international conference on data engineering workshop_, pp. 801–810. IEEE, 2007.

Chong Wang and David M Blei. Collaborative topic modeling for recommending scientific articles. In _Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining_, pp. 448–456. ACM, 2011.