4.1 Problem definition
Consider an online shopping network in which \(U = \{u_1, u_2, \ldots , u_n\}\) denotes the set of users (or accounts) and \(C = \{c_1, c_2, \ldots , c_m\}\) denotes the set of product categories. The problem is to rank all categories for each target user according to the rating score P(u, c), which denotes user u's preference for category c.
The historical purchase behavior \(H_u\), although very sparse, must be considered in purchase prediction because it is direct ground truth about the user. We use this information as one predicting dimension, denoted \(P_H(u, c)\). Additionally, we use user profiles \(D_u\) and posted messages \(M_u\) from social network sites as two further predicting dimensions, \(P_D(u, c)\) and \(P_M(u, c)\), respectively, which enriches the evidence available for prediction.
We merge all of the aforementioned information into a single prediction by assigning the weight parameters \(\alpha , \beta , \gamma \) to the three predicting dimensions, as shown in Eq. (2). The following subsections describe the three predicting dimensions in detail and present the method for calculating the weight parameters.
$$\begin{aligned} P(u, c) = \alpha P_H(u, c) + \beta P_D(u, c) + \gamma P_M(u, c) \end{aligned}$$
(2)
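As an illustration of Eq. (2), the following minimal sketch fuses three per-category score tables and ranks the categories. The function names `fuse_scores` and `rank_categories`, the example scores, and the weight values are assumptions made for illustration only; they are not taken from the method or data described here.

```python
def fuse_scores(p_h, p_d, p_m, alpha, beta, gamma):
    """Weighted sum of the three predicting dimensions, as in Eq. (2).

    Each argument p_* maps a category name to a score; a category that is
    missing from one dimension contributes 0 for that dimension.
    """
    categories = set(p_h) | set(p_d) | set(p_m)
    return {
        c: alpha * p_h.get(c, 0.0) + beta * p_d.get(c, 0.0) + gamma * p_m.get(c, 0.0)
        for c in categories
    }


def rank_categories(scores):
    """Return categories sorted by descending score (ties broken by name)."""
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical scores for one user (illustrative values only).
p_h = {"books": 0.6, "toys": 0.1}
p_d = {"books": 0.2, "toys": 0.5, "shoes": 0.3}
p_m = {"books": 0.1, "shoes": 0.8}

scores = fuse_scores(p_h, p_d, p_m, alpha=0.5, beta=0.3, gamma=0.2)
ranking = rank_categories(scores)
print(ranking)
```

With these illustrative weights, a category that is absent from the purchase history can still be ranked through the two social network dimensions.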
4.2 Prediction with purchase history and product categories
Users' purchase history is one of the key factors to take into consideration [25], because it is first-hand evidence of what users like, even though data sparsity may limit the reliability of its predictions. Moreover, purchase behaviors are ordered in time, so recent purchases are more informative than older ones.
We split the historical purchase behaviors of a user u into periods of length t. Within a period, we define \(N_u^t\) as the total number of purchases of user u, \(N_u^t(c)\) as the number of purchases of category c by user u, \(N^t(c)\) as the number of purchases of category c by all users, and \(N^t\) as the total number of purchases by all users. Then u's preference for \(c_i\) can be denoted as \(p_u^t(c_i|L) = N_u^t(c_i)/N_u^t\), the purchase ratio of u within period t is \(p_u^t(L) = N_u^t/N^t\), and the purchase ratio of \(c_i\) within period t is \(p^t(c_i) = N^t(c_i)/N^t\). According to the Bayes model, we can denote the preference of user u for category \(c_i\) as
$$\begin{aligned} p_u^t(L|c_i) = \frac{p_u^t(c_i|L) \cdot p_u^t(L)}{p^t(c_i)} \end{aligned}$$
(3)
To emphasize recent behavior, we use a Gaussian function \(\mathcal {N}(\mu , \delta ^2)\) to weight the preference according to t, where \(\mu \) is the target time and \(\delta \) controls the smoothness of the Gaussian. The predicting function \(P_H(u, c_i)\) can therefore be expressed as Eq. (4), with the sums taken over all periods t.
$$\begin{aligned} P_H(u, c_i) = \frac{\sum \mathcal {N}_t(\mu , \delta ^2) \cdot N_u^t \cdot p_u^t(L|c_i)}{\sum \mathcal {N}_t(\mu , \delta ^2) \cdot N_u^t} \end{aligned}$$
(4)
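Equations (3) and (4) can be sketched as follows. This is a minimal illustration under stated assumptions: purchases are given as hypothetical `(user, category, period)` tuples, the category ratio \(p^t(c_i)\) is taken to be the share of all purchases in period t that fall in \(c_i\), and the Gaussian weight is left unnormalized because its normalizing constant cancels between the numerator and denominator of Eq. (4). The function names are illustrative.

```python
import math
from collections import Counter


def gaussian_weight(t, mu, delta):
    """Unnormalized Gaussian weight of period t around the target time mu."""
    return math.exp(-((t - mu) ** 2) / (2 * delta ** 2))


def history_score(purchases, user, category, mu, delta):
    """Time-weighted purchase-history score P_H(u, c_i), following Eq. (4).

    purchases: iterable of (user, category, period) tuples (assumed format).
    """
    total = Counter()        # N^t: all purchases in period t
    by_cat = Counter()       # purchases of the category in period t (all users)
    by_user = Counter()      # N_u^t: purchases of the user in period t
    by_user_cat = Counter()  # N_u^t(c_i): user's purchases of the category
    for u, c, t in purchases:
        total[t] += 1
        if c == category:
            by_cat[t] += 1
        if u == user:
            by_user[t] += 1
            if c == category:
                by_user_cat[t] += 1

    numerator = denominator = 0.0
    for t, n_u in by_user.items():
        p_c_given_l = by_user_cat[t] / n_u   # p_u^t(c_i | L)
        p_l = n_u / total[t]                 # p_u^t(L)
        p_c = by_cat[t] / total[t]           # p^t(c_i), assumed category share
        posterior = p_c_given_l * p_l / p_c if p_c > 0 else 0.0  # Eq. (3)
        w = gaussian_weight(t, mu, delta)
        numerator += w * n_u * posterior
        denominator += w * n_u
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical purchase log: two users, two periods (0 is older, 1 is recent).
log = [
    ("u1", "books", 0), ("u1", "toys", 0), ("u2", "books", 0),
    ("u1", "books", 1), ("u2", "toys", 1),
]
score = history_score(log, user="u1", category="books", mu=1, delta=1.0)
print(round(score, 4))
```

In this toy log, the recent period, in which `u1` accounts for every purchase of `books`, receives the larger weight, so the score lies closer to that period's posterior than to the older one.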
4.3 Prediction with social network information
Based on the matched accounts, or on the strong relationship between social network sites and online shopping sites, we can use both user profiles and published statuses to predict future purchase behaviors.
Social network demographic characteristics When someone signs up for an account on a social network site, he or she will probably provide personal information to the service provider, e.g., name, gender, age, or even religious belief. Authentic personal information helps the new user find his or her real-life friends, and makes it easy for other users with similar hobbies or interests to find him or her [26]. For gender, we merely match users with the same gender. Let \(B(u_j)\) be user \(u_j\)'s match result, where 1 represents the same gender and 0 otherwise. We also assume that users who are closer in age or location are likely to have closer preferences. For age, we use a Gaussian function to calculate the distance between the target user and the training users, as shown in Eq. (6), and sum the results to assign a score to the target user. Similarly, we can calculate the distance between locations. Finally, we sum all three characteristics with weights, as depicted in Eq. (8).
In more detail, we first calculate the relation between social networks and online shopping sites based on the training set. The simple version counts the shopping behaviors for each social network feature, \(L(c, b_j) = \sum _{k=1}^n w_k^{b_j}\), where c denotes a social network feature, \(b_j\) is a kind of behavior on online shopping sites, n is the number of users in the training set, and \(w_k^{b_j}\) is the weight of user k's behavior. The advanced version adds this relation into the learning of social network features with respect to the user's likes, \(B(u) = \sum _i^k w_u^{c_i} L(c, b)\), where k denotes the number of likes of the user, \(w_u^{c_i}\) denotes the weight of \(c_i\) for user u, and \(L(c, b)\) is the correlation between likes and behaviors. In Eq. (5), n denotes the number of users in the training set, \(B(u_j, c_i)\) indicates whether \(u_j\) has feature \(c_i\) (1 if yes, 0 otherwise), \(w(c_i)\) denotes the weight of social network feature \(c_i\), M is the number of features \(u_j\) has, and \(w(c_k)\) is the weight of social network feature \(c_k\) of user \(u_j\). In Eq. (8), \(\theta _i\) denotes the weight of each aspect; the equation fuses the three aspects \(p_G(u, c_i)\), \(p_A(u, c_i)\) and \(p_L(u, c_i)\), where \(p_L(u, c_i)\) is the prediction according to locations.
$$\begin{aligned} p_G(u, c_i)= & {} \sum _j^n B(u_j, c_i) \frac{w(c_i)}{\sum _k^M w(c_k)} \end{aligned}$$
(5)
$$\begin{aligned} D(u_i, u_j)= & {} \mathcal {N}(\mu , \delta ^2) \end{aligned}$$
(6)
$$\begin{aligned} p_A(u, c_i)= & {} \sum _j^n D(u, u_j) \frac{w(c_i)}{\sum _k^M w(c_k)} \end{aligned}$$
(7)
$$\begin{aligned} p_D(u, c_i)= & {} \theta _1 p_G(u, c_i) + \theta _2 p_A(u, c_i) + \theta _3 p_L(u, c_i) \end{aligned}$$
(8)
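One possible reading of Eqs. (5) to (8) is sketched below. Several details are assumptions made for illustration and are not specified in the text: each training user is a hypothetical record with `gender`, `age`, a precomputed `distance_km` to the target user, and per-category `weights`; the fraction \(w(c_i)/\sum _k w(c_k)\) is computed per training user; and the Gaussian of Eq. (6) is applied, unnormalized, to the age difference and to the location distance with hypothetical smoothness values.

```python
import math


def gaussian_similarity(x, mu, delta):
    """Unnormalized Gaussian: 1.0 when x == mu, decaying as x moves away."""
    return math.exp(-((x - mu) ** 2) / (2 * delta ** 2))


def category_share(weights, category):
    """w(c_i) / sum_k w(c_k) for one training user's category weights."""
    total = sum(weights.values())
    return weights.get(category, 0.0) / total if total > 0 else 0.0


def demographic_score(target, training_users, category, thetas,
                      age_delta=5.0, loc_delta=50.0):
    """Fuse gender, age and location aspects with weights thetas, as in Eq. (8)."""
    p_g = p_a = p_l = 0.0
    for other in training_users:
        share = category_share(other["weights"], category)
        # Gender aspect, Eq. (5): 1 if same gender as the target, else 0.
        p_g += (1.0 if other["gender"] == target["gender"] else 0.0) * share
        # Age aspect, Eqs. (6)-(7): Gaussian closeness in age.
        p_a += gaussian_similarity(other["age"], target["age"], age_delta) * share
        # Location aspect: same form, applied to an assumed distance field.
        p_l += gaussian_similarity(other["distance_km"], 0.0, loc_delta) * share
    theta_1, theta_2, theta_3 = thetas
    return theta_1 * p_g + theta_2 * p_a + theta_3 * p_l


# Hypothetical target and training users (illustrative values only).
target = {"gender": "F", "age": 30}
training = [
    {"gender": "F", "age": 30, "distance_km": 0.0,
     "weights": {"books": 3.0, "toys": 1.0}},
    {"gender": "M", "age": 30, "distance_km": 0.0,
     "weights": {"books": 1.0, "toys": 1.0}},
]
demo = demographic_score(target, training, "books", thetas=(0.5, 0.25, 0.25))
print(demo)
```

As in the equations, the sums over training users are not normalized, so the resulting score is useful for comparing categories for one target user rather than as a probability.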
Social network user statuses Social networks host billions of user interactions, most of which are short text messages [24]. We can use these messages to build a model that describes user preferences, by which we can recommend friends of the user. We use the open-source knowledge base Freebase to interpret the short messages and to learn the preference model. For semantic recommendation strategies, methods based on matrix factorization models are the state-of-the-art approaches in recommender systems.
Matrix Factorization (or MF) [27] is one of the common methods for model-based recommendation. MF has been proposed to perform predictions for a single user-item rating matrix. In MF, each user and each item is associated with a K-dimensional latent factor vector: the latent factor of user u is denoted as \(U_u\) and is stored as the uth row of the user factor matrix U. The latent factor of item i is denoted as \(V_i\) and is stored as the ith row of the item factor matrix V. To learn the latent factors of users and items, [28] employs probabilistic matrix factorization to factor the user-item matrix into the product of user and item latent factors. The conditional probability of the observed ratings is defined as the following equation:
$$\begin{aligned} P_M(u, c) = p(R \mid U, V, \delta _r^2) = \prod _{u=1}^N \prod _{i=1}^M [\mathcal {N}_t(R_{u,i}|U_u^T V_i, \delta _r^2)]^{I_{u,i}^R} \end{aligned}$$
(9)
where \(\mathcal {N}_t(x|\mu , \delta ^2)\) is the normal distribution with mean \(\mu \) and variance \(\delta ^2\), and \(I_{u,i}^R\) is the indicator function that is equal to 1 if u has rated i, and is equal to 0 otherwise.
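To make Eq. (9) concrete, the sketch below evaluates its logarithm for given factor matrices. The logarithm is used because the product of many small densities underflows numerically. This is a minimal illustration, not the training procedure of [28]: the function name, the dictionary format for observed ratings, and the toy factors are assumptions. Only observed entries (those with \(I_{u,i}^R = 1\)) are stored, so unobserved entries contribute nothing to the sum.

```python
import math


def pmf_log_likelihood(ratings, U, V, delta_r):
    """Log of Eq. (9): sum of Gaussian log-densities over observed ratings.

    ratings: dict mapping (user_index, item_index) -> observed rating.
    U, V:    lists of K-dimensional latent factor rows for users and items.
    delta_r: standard deviation of the rating noise.
    """
    log_norm = -0.5 * math.log(2 * math.pi * delta_r ** 2)
    total = 0.0
    for (u, i), r in ratings.items():
        prediction = sum(a * b for a, b in zip(U[u], V[i]))  # U_u^T V_i
        total += log_norm - (r - prediction) ** 2 / (2 * delta_r ** 2)
    return total


# Toy factors with K = 2 (illustrative values only).
U = [[1.0, 0.0], [0.0, 2.0]]
V = [[3.0, 1.0], [1.0, 2.0]]

exact = {(0, 0): 3.0, (1, 1): 4.0}   # ratings equal to U_u^T V_i
noisy = {(0, 0): 2.0, (1, 1): 5.0}   # ratings that miss by 1 each

ll_exact = pmf_log_likelihood(exact, U, V, delta_r=1.0)
ll_noisy = pmf_log_likelihood(noisy, U, V, delta_r=1.0)
print(ll_exact > ll_noisy)
```

Factors that reproduce the observed ratings more closely yield a higher log-likelihood, which is the quantity that learning the latent factors seeks to increase.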