1 Introduction
1.1 Contributions
-
We propose a novel SLP training scheme that can derive \(s\) predictive models for \(s\) different patterns simultaneously. Furthermore, based on the mini-batch technique of Mohassel and Zhang (2017), the trained model \(\mathbf w \) converges smoothly and rapidly to the optimal value. Compared with the scheme in Zhang et al. (2018), the computational complexity is dramatically reduced from \(O(n^{3})\) to \(O(n^{2})\).
-
We are the first to introduce a verification mechanism into an SLP training scheme. If the cloud server cheats the client by returning an incorrect value, the dishonest behavior will certainly be detected by the client.
-
We design a lightweight privacy-preserving predictive algorithm based on secure two-party computation (Malek and Miri 2006). With this method, both the predictive model \(\mathbf w \) and the new data record are well protected. Moreover, the final classification result is held privately by the requester alone.
1.2 Related work
2 Preliminaries
2.1 Mini-batch SLP training algorithm
2.2 Privacy-preserving method for outsourcing matrix multiplication
-
KeyGen: On input the security parameter \(\lambda \), the client randomly chooses three sets \(\left\{ \alpha _{1}, \alpha _{2},\ldots ,\alpha _{n}\right\} \), \(\{\beta _{1}, \beta _{2},\ldots , \beta _{n}\}\) and \(\left\{ \gamma _{1}, \gamma _{2},\ldots , \gamma _{n}\right\} \) from a specific key space. Using the same method as in Lei et al. (2014), the client generates three random permutations \(\pi _{1}, \pi _{2}, \pi _{3}\). Similarly, the client generates three sparse matrices \(\mathbf F _{1}(i,j)=\alpha _{i}\delta _{\pi _{1}(i),j}\), \(\mathbf F _{2}(i,j)=\beta _{i}\delta _{\pi _{2}(i),j}\), \(\mathbf F _{3}(i,j)=\gamma _{i}\delta _{\pi _{3}(i),j}\), where \(\delta _{x,y}\) denotes the Kronecker delta function: \(\delta _{x,y}=\left\{ \begin{aligned} 1,&\quad x=y\\ 0,&\quad x\ne y\\ \end{aligned} \right. \)
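To make the structure of these keys concrete, the following minimal Python sketch (our illustration, not code from the paper) builds one such scaled-permutation matrix \(\mathbf F (i,j)=\alpha _{i}\delta _{\pi (i),j}\) together with its inverse; the uniform key space and the floating-point numpy representation are assumptions made only for demonstration.

```python
import numpy as np

def keygen(k, rng, low=1.0, high=10.0):
    """Build one sparse blinding matrix F with F[i, j] = alpha_i * delta(pi(i), j),
    i.e., a scaled permutation matrix, plus its inverse.
    The key space [low, high) is an illustrative assumption."""
    alpha = rng.uniform(low, high, size=k)   # nonzero scaling factors alpha_i
    pi = rng.permutation(k)                  # random permutation pi
    F = np.zeros((k, k))
    F[np.arange(k), pi] = alpha              # row i holds alpha_i at column pi(i)
    F_inv = np.zeros((k, k))
    F_inv[pi, np.arange(k)] = 1.0 / alpha    # closed-form inverse
    return F, F_inv

rng = np.random.default_rng(0)
F1, F1_inv = keygen(4, rng)
assert np.allclose(F1 @ F1_inv, np.eye(4))
```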
-
MMEnc: Given two large-scale matrices \(\mathbf X , \mathbf Y \), the resource-constrained client needs to compute their product. In order to protect his private information, the client encrypts the inputs before uploading them to the cloud server. Using the matrix blinding technique, the client locally computes \(\hat{\mathbf{X }}=\mathbf F _{1}\mathbf X \mathbf F _{2}^{-1}\) and \(\hat{\mathbf{Y }}=\mathbf F _{2}\mathbf Y \mathbf F _{3}\) and sends the blinded inputs \(\hat{\mathbf{X }}, \hat{\mathbf{Y }}\) to the cloud server.
-
Compute: After receiving the two matrices \(\hat{\mathbf{X }}, \hat{\mathbf{Y }}\) from the client, the cloud server runs this algorithm to compute \(\hat{\mathbf{T }}=\hat{\mathbf{X }}\hat{\mathbf{Y }}\) and sends the blinded result \(\hat{\mathbf{T }}\) back to the client.
-
MMDec: On input the returned result \(\hat{\mathbf{T }}\), the client decrypts it as \(\mathbf T =\mathbf F _{1}^{-1}\hat{\mathbf{T }}\mathbf F _{3}^{-1}=\mathbf X \mathbf Y \) and thereby obtains the final result.
-
Verify: Considering that the cloud server may return an incorrect result, after decryption the client should check the correctness of the calculation result \(\mathbf T \). The client first randomly selects a vector \(\mathbf r =\left\{ r_{1}, r_{2},\ldots , r_{n}\right\} \) and checks the equation \(\mathbf T \mathbf r \overset{?}{=}\mathbf X (\mathbf Y \mathbf r )\), which requires only matrix–vector products. If the equation holds, the result \(\mathbf T \) passes verification; otherwise, this algorithm outputs \(\bot \).
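Putting the five algorithms together, the whole outsourcing round trip can be sketched end to end in numpy as follows; the matrix dimension and Gaussian test data are hypothetical choices of ours, and the verification touches only matrix–vector products.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

def blinding_matrix(k):
    alpha = rng.uniform(1.0, 10.0, size=k)
    pi = rng.permutation(k)
    F = np.zeros((k, k)); F[np.arange(k), pi] = alpha
    return F

# KeyGen: three secret scaled-permutation matrices
F1, F2, F3 = (blinding_matrix(n) for _ in range(3))

# Client's private inputs
X, Y = rng.normal(size=(n, n)), rng.normal(size=(n, n))

# MMEnc: blind the inputs before outsourcing (np.linalg.inv is used for
# brevity; the inverse has the closed form shown in the KeyGen sketch)
X_hat = F1 @ X @ np.linalg.inv(F2)
Y_hat = F2 @ Y @ F3

# Compute: performed by the cloud server on blinded data only
T_hat = X_hat @ Y_hat

# MMDec: the client strips the blinding locally
T = np.linalg.inv(F1) @ T_hat @ np.linalg.inv(F3)

# Verify: check T r == X (Y r) with a random vector r, using only
# matrix-vector products; repeating with fresh r strengthens the check
r = rng.normal(size=n)
assert np.allclose(T @ r, X @ (Y @ r))
assert np.allclose(T, X @ Y)   # sanity check of the whole round trip
```

Because each blinding matrix has a single nonzero entry per row, applying it or its inverse to an \(n\times n\) operand costs only \(O(n^{2})\) multiplications, which is where the client-side saving over computing \(\mathbf X \mathbf Y \) directly comes from.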
2.3 Secure dot-product protocol
-
For \(\alpha , \beta \in \mathbb {F}_{p^{n}}\), \(T(\alpha +\beta )=T(\alpha )+T(\beta )\);
-
For \(\alpha \in \mathbb {F}_{p^{n}}\), \(c\in \mathbb {F}_{p}\), \(T(c\alpha )=cT(\alpha )\);
-
For \(a\in \mathbb {F}_{p}\), \(T(a)=na\);
-
For \(\alpha \in \mathbb {F}_{p^{n}}\), \(T(\alpha ^{p})=T(\alpha )\). These four properties are verified numerically in the toy sketch below.
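For \(\alpha \in \mathbb {F}_{p^{n}}\), the trace is \(T(\alpha )=\alpha +\alpha ^{p}+\cdots +\alpha ^{p^{n-1}}\). The toy Python sketch below (our own illustration) models \(\mathbb {F}_{49}=\mathbb {F}_{7}[x]/(x^{2}+1)\), where \(n=2\) and \(T(\alpha )=\alpha +\alpha ^{7}\), and checks all four properties on sample elements; the concrete field and elements are assumptions chosen for readability.

```python
# Elements of F_49 = F_7[x]/(x^2 + 1) are pairs (a0, a1) meaning a0 + a1*x.
p = 7

def add(u, v):
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

def mul(u, v):
    # (u0 + u1*x)(v0 + v1*x) reduced with x^2 = -1
    return ((u[0]*v[0] - u[1]*v[1]) % p, (u[0]*v[1] + u[1]*v[0]) % p)

def frob(u):
    # Frobenius map alpha -> alpha^p; here x^7 = -x, so the x-part is negated
    return (u[0], (-u[1]) % p)

def trace(u):
    # T(alpha) = alpha + alpha^p always lands in the base field F_7
    t = add(u, frob(u))
    assert t[1] == 0
    return t[0]

a, b = (3, 5), (2, 6)
assert trace(add(a, b)) == (trace(a) + trace(b)) % p   # T(a+b) = T(a)+T(b)
assert trace(mul((4, 0), a)) == (4 * trace(a)) % p     # T(c*a) = c*T(a)
assert trace((5, 0)) == (2 * 5) % p                    # T(a) = n*a with n = 2
assert trace(frob(a)) == trace(a)                      # T(a^p) = T(a)
```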
3 System and security models
3.1 System model
-
The client The main task of the client is to train s prediction models for s different patterns. The client takes as inputs the training cases \(\left\{ x_{i,j}\right\} \) \((1\le i\le n, 1\le j\le m)\), the random initial weights \(\left\{ w_{j,k}\right\} \) \((1\le j\le m, 1\le k\le s)\), the learning rate \(\eta \), the mini-batch size n and the predetermined number of iteration rounds p, and outputs a final weight matrix \(\mathbf W \) for the s different patterns.
-
The cloud server A cloud server possesses substantial computation and storage resources. With its help, the client can outsource the heavy computational operations in a pay-per-use manner so as to save local resources. Generally speaking, the cloud server is honest but curious: it follows the protocol honestly, but it will try its best to dig up sensitive information beyond what it already knows.
-
The requester A requester who owns a new data record wants to learn the classification result under a specific prediction model. On the one hand, the new data record is privately held by the requester. On the other hand, the prediction model is the client’s own asset, which cost the client substantial resources to obtain. Therefore, the requester should learn nothing about the prediction model other than the final result.
3.2 Security model
-
Privacy In the training stage, we require that the client’s data be secure against the cloud server: given the encrypted sample cases, the cloud server cannot recover the client’s original data, and the training result is also hidden from the server. In the predictive stage, both the new query record and the prediction model should be well protected; that is, neither participant can learn anything beyond what it already knows.
-
Verifiability Since the cloud server cannot be fully trusted, the client should have the ability to detect errors. That is to say, any incorrect result returned by a cheating cloud server cannot pass the verification.
-
Efficiency In the training process, the client’s computation cost for preparing the outsourced task for the cloud server and for extracting the results from the returned values should be less than that of performing the computation task on its own.
4 The proposed VPSPT scheme
4.1 High-level description
4.2 Verifiable privacy-preserving SLP training scheme
-
Initialization Firstly, in order to protect the client’s sensitive information, the inputs must be encrypted before being uploaded to the cloud server. Therefore, the client runs the KeyGen algorithm to generate three secret sparse matrices \(\mathbf F _{1}\in \mathbb {R}^{n\times n}, \mathbf F _{2}\in \mathbb {R}^{m\times m}\) and \(\mathbf F _{3}\in \mathbb {R}^{s\times s}\), which are used to blind the input matrices. Secondly, the client randomly selects a weight matrix \(\mathbf W \in \mathbb {R}^{m\times s}\) whose elements are small random numbers.
-
Training stage In the following, the complete protocol is depicted. The privacy-preserving and verifiable SLP training scheme is described in Algorithm 2.
-
Step 1 Following the mini-batch idea in Mohassel and Zhang (2017), the client randomly selects a small batch of samples instead of a single data item per iteration. The client chooses n pieces of training data \(\left\{ \mathbf x _{1}, \mathbf x _{2},\ldots , \mathbf x _{n} \right\} \) with the associated desired outputs \(\left\{ \mathbf o _{1}, \mathbf o _{2},\ldots , \mathbf o _{n} \right\} \), where each piece of training data has m feature values; we denote these training data by the matrix \(\mathbf X \in \mathbb {R}^{n\times m}\). In order to protect the client’s sensitive data \(\mathbf X \) and the training models \(\mathbf W \), the client runs the MMEnc algorithm to obtain \(\hat{\mathbf{X }}=\mathbf F _{1}\mathbf X \mathbf F _{2}\) and \(\hat{\mathbf{W }}=\mathbf F _{2}^{-1}\mathbf W \mathbf F _{3}\), and then uploads the ciphertext tuple \(\left\langle \hat{\mathbf{X }}, \hat{\mathbf{W }} \right\rangle \) to the cloud server.
-
Step 2 Upon receiving the ciphertext tuple \(\left\langle \hat{\mathbf{X }},\hat{\mathbf{W }} \right\rangle \) from the client, the cloud server executes the matrix multiplication, i.e., \(\hat{\mathbf{Y }}=\hat{\mathbf{X }}\times \hat{\mathbf{W }}\), and then sends the blinded training result \(\hat{\mathbf{Y }}\) to the client.
-
Step 3 In this step, the client performs the decryption operation \(\mathbf Y =\mathbf F _{1}^{-1} \hat{\mathbf{Y }} \mathbf F _{3}^{-1} \) and derives the result matrix \(\mathbf Y \). Furthermore, in order to build confidence in the outsourced computation, the client checks the correctness of the result \(\mathbf Y \) returned by the cloud server. Firstly, the client randomly selects a vector \(\mathbf r =\left\{ r_{1}, r_{2},\ldots , r_{s} \right\} \) whose elements are not all zero. Secondly, the client locally calculates \(\mathbf Y \mathbf r \) and \(\mathbf X \mathbf W \mathbf r \) and checks whether they are equal. If so, the computation result \(\mathbf Y \) passes the verification; otherwise, the client terminates the algorithm.
-
Step 4 To simplify the presentation, we select one column of the matrix \(\mathbf Y \), denoted by \(\mathbf y _{k}\); that is, we discuss the training process of one specific model in detail, and the other models are trained in the same way. For each element of the vector \(\mathbf y _{k}\), the client applies the sign function
$$\begin{aligned} t_{i,k}=sign(y_{i,k}) \quad (1\le i\le n), \end{aligned}$$
and then compares each element \(t_{i,k}\) with the desired classification result \(o_{i,k}\). For the samples with \(t_{i,k}\ne o_{i,k}\) \((1\le i\le n)\), the client updates the weight vector \(\mathbf w _{k}\) as
$$\begin{aligned} \mathbf w _{k}=\mathbf w _{k}+\frac{\eta }{n}\sum \limits _{t_{i,k}\ne o_{i,k}}\mathbf x _{i}o_{i,k}. \end{aligned}$$
If the weight vector \(\mathbf w _{k}\) satisfies one of the two terminating conditions, i.e., the number of iteration rounds exceeds the preset value or all classification results for this model are correct, the SLP training algorithm terminates and goes to Step 5. Otherwise, the client repeats from Step 1 with the help of the cloud server.
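For concreteness, the purely client-side part of this step might be sketched as follows in Python; the shapes and names are our assumptions, and the vector y_k stands for the k-th column of the verified product \(\mathbf Y =\mathbf X \mathbf W \) obtained via the cloud.

```python
import numpy as np

def update_model(w_k, X, y_k, o_k, eta):
    """One mini-batch update of a single model w_k (sketch of Step 4).
    X   : (n, m) mini-batch of training samples
    y_k : (n,)   column k of Y = X W returned (blinded) by the cloud
    o_k : (n,)   desired outputs in {-1, +1}
    eta : learning rate
    Returns the updated weights and a convergence flag for model k."""
    t_k = np.sign(y_k)            # t_{i,k} = sign(y_{i,k})
    wrong = t_k != o_k            # indices with t_{i,k} != o_{i,k}
    if not wrong.any():
        return w_k, True          # all samples classified correctly
    n = X.shape[0]
    # w_k <- w_k + (eta / n) * sum over misclassified i of x_i * o_{i,k}
    w_k = w_k + (eta / n) * (X[wrong].T @ o_k[wrong])
    return w_k, False
```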
-
Step 5 In this paper, we assume that the s models reach the convergence condition synchronously or share the same preset iteration threshold. After several iterations of the above training process, the client finally obtains s prediction models for the s different patterns from one set of samples.
-
Predictive stage To predict a new data record for the requester, based on the main idea in Malek and Miri (2006), we design a lightweight privacy-preserving predictive algorithm to obtain the classification result. The requester, who owns the new data tuple \( \mathbf x =\left\{ x_{1}, x_{2},\ldots , x_{n} \right\} \), cooperates with the client, who owns the predictive model \( \mathbf w =\left\{ w_{1}, w_{2},\ldots ,w_{n} \right\} \), to run the predictive algorithm. Finally, only the requester learns the final classification result. Moreover, the client learns nothing about the requester’s input, and the requester learns nothing about the model during the whole process. Specifically, the predictive algorithm consists of the following three steps.
-
Step 1 Assume that \(\left\{ \alpha _{1}, \alpha _{2},\ldots , \alpha _{n} \right\} \) is a basis of \(\mathbb {F}_{p^{n}}\) over \(\mathbb {F}_{p}\) and \(\left\{ \beta _{1}, \beta _{2},\ldots , \beta _{n} \right\} \) is its dual basis. The two vectors \(\mathbf x \) and \(\mathbf w \) can then be represented over \(\mathbb {F}_{p^{n}}\) as
$$\begin{aligned} \mathbf X= & {} x_{1}\alpha _{1}+x_{2}\alpha _{2}+\cdots +x_{n}\alpha _{n}\\ \mathbf W= & {} w_{1}\beta _{1}+w_{2}\beta _{2}+\cdots +w_{n}\beta _{n}. \end{aligned}$$
The requester randomly chooses \(\mathbf Z \in \mathbb {F}_{p^{n}}\) and \(a,b,c,d\in \mathbb {F}_{p}\) such that \(ad-bc\ne 0\), and locally computes the two messages
$$\begin{aligned} \mathbf M= & {} a\mathbf X +b\mathbf Z \\ \mathbf N= & {} c\mathbf X +d\mathbf Z . \end{aligned}$$
Then, the requester sends the ciphertext tuple \(\left\langle \mathbf M , \mathbf N \right\rangle \) to the client for prediction.
-
Step 2 On receiving the ciphertext tuple \(\left\langle \mathbf M , \mathbf N \right\rangle \) from the requester, the client who owns the model \(\mathbf W \) computes
$$\begin{aligned} \mathbf W \mathbf M =\mathbf W (a\mathbf X +b\mathbf Z )\\ \mathbf W \mathbf N =\mathbf W (c\mathbf X +d\mathbf Z ). \end{aligned}$$
The client then computes the trace values \(T(\mathbf W \mathbf M )\) and \(T(\mathbf W \mathbf N )\) and sends them to the requester.
-
Step 3 After receiving the trace values from the client, the requester, who holds the random numbers a, b, c, d, computes
$$\begin{aligned} o=(ad-bc)^{-1}(d\, T(\mathbf W \mathbf M )-b\, T(\mathbf W \mathbf N )). \end{aligned}$$
Subsequently, the requester applies the activation function, i.e., \(t=sign(o)\). Therefore, the requester obtains the final classification result t in a secure manner, without any leakage of private information. The detailed predictive algorithm is depicted in Algorithm 3.
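The three steps can be exercised end to end in a toy field. The self-contained sketch below (our illustration) instantiates \(\mathbb {F}_{49}=\mathbb {F}_{7}[x]/(x^{2}+1)\) with the basis \(\{1,x\}\) and its dual basis \(\{4,3x\}\), and checks that the requester recovers exactly \(\mathbf x \cdot \mathbf w \bmod p\); the tiny prime and the concrete vectors are assumptions, and a real deployment would work in a much larger field.

```python
import random

p = 7  # F_49 = F_7[x]/(x^2 + 1); an element a0 + a1*x is the pair (a0, a1)

def add(u, v): return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)
def mul(u, v): return ((u[0]*v[0] - u[1]*v[1]) % p, (u[0]*v[1] + u[1]*v[0]) % p)
def trace(u):  return (2 * u[0]) % p      # T(a0 + a1*x) = 2*a0 in this field
def scalar(c): return (c % p, 0)          # embed c in F_p into the field

alpha = [(1, 0), (0, 1)]   # basis {1, x}
beta  = [(4, 0), (0, 3)]   # dual basis {4, 3x}: T(alpha_i * beta_j) = delta_ij

def encode(coords, basis):
    acc = (0, 0)
    for c, e in zip(coords, basis):
        acc = add(acc, mul(scalar(c), e))
    return acc

x_vec, w_vec = [2, 5], [6, 3]             # requester's record, client's model
X, W = encode(x_vec, alpha), encode(w_vec, beta)

# Step 1 (requester): blind X with random Z and a, b, c, d, ad - bc != 0
Z = (random.randrange(p), random.randrange(p))
while True:
    a, b, c, d = (random.randrange(p) for _ in range(4))
    if (a*d - b*c) % p:
        break
M = add(mul(scalar(a), X), mul(scalar(b), Z))
N = add(mul(scalar(c), X), mul(scalar(d), Z))

# Step 2 (client): multiply by the model W and return only the two traces
tWM, tWN = trace(mul(W, M)), trace(mul(W, N))

# Step 3 (requester): o = (ad - bc)^{-1} (d*T(WM) - b*T(WN)) = x . w mod p
inv = pow((a*d - b*c) % p, p - 2, p)      # modular inverse via Fermat
o = (inv * (d*tWM - b*tWN)) % p
assert o == sum(xi*wi for xi, wi in zip(x_vec, w_vec)) % p   # 27 mod 7 = 6
```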
-
4.3 Correctness
-
Training stage In Steps 2 and 3, on receiving the blinded result \(\hat{\mathbf{Y }}\), the client who possesses the secret keys \(\mathbf F _{1}^{-1}\) and \(\mathbf F _{3}^{-1}\) conducts the following decryption operations:
$$\begin{aligned} \mathbf F _{1}^{-1}\hat{\mathbf{Y }}\mathbf F _{3}^{-1}&= \mathbf F _{1}^{-1}\hat{\mathbf{X }}\hat{\mathbf{W }}\mathbf F _{3}^{-1} \\&=\mathbf F _{1}^{-1}\mathbf F _{1}\mathbf X \mathbf F _{2}\mathbf F _{2}^{-1}\mathbf W \mathbf F _{3}\mathbf F _{3}^{-1}\\&= \mathbf X \mathbf W =\mathbf Y . \end{aligned}$$
By selecting a random vector \(\mathbf r \), the client checks \(\mathbf Y \mathbf r \overset{?}{=}\mathbf X \mathbf W \mathbf r \). If the result passes this verification, the client has derived the correct computational result. The rest of the training tasks in each round are conducted by the client locally.
-
Predictive stage Next, we illustrate the correctness of the predictive algorithm. After receiving the two trace values \(T(\mathbf W \mathbf M )\) and \(T(\mathbf W \mathbf N )\) from the client, the requester privately computes
$$\begin{aligned} o&= (ad-bc)^{-1}(d\, T(\mathbf W \mathbf M )-b\, T(\mathbf W \mathbf N ))\\&=(ad-bc)^{-1}(d\, T(\mathbf W (a\mathbf X +b\mathbf Z ))-b\, T(\mathbf W (c\mathbf X +d\mathbf Z )))\\&=(ad-bc)^{-1}((ad-bc)T(\mathbf X \mathbf W )+(bd-bd)T(\mathbf W \mathbf Z ))\\&= T(\mathbf X \mathbf W ) \bmod p\\&= \mathbf x \cdot \mathbf w , \end{aligned}$$
where the third equality uses the \(\mathbb {F}_{p}\)-linearity of the trace, so the masking term \(T(\mathbf W \mathbf Z )\) cancels out. Later, the requester carries out the sign function to obtain the final classification result \(t=sign(o)\).
5 Security and efficiency analysis
5.1 Security analysis
Table 1 Computation and communication overhead of the proposed scheme

Phase | Step | Entity | Computation cost | Communication cost
---|---|---|---|---
Initialization | – | Client | \((n+m+s)\)G | –
Training | Step 1 | Client | \((nm+ms)\)M | \(nm+ms\)
Training | Step 2 | Server | \(nms\)M | \(ns\)
Training | Step 3 | Client | \(5ns\)M | –
Training | Step 4 | Client | \(\le n\)M | –
Prediction | Step 1 | Requester | \(4n\)M | \(2n\)
Prediction | Step 2 | Client | \(2n\)M + \(2n\)E | 2
Prediction | Step 3 | Requester | 5M + 1I | –
Table 2 Comparison between the scheme in Zhang et al. (2018) and our scheme

 | Party | Scheme in Zhang et al. (2018) | Our scheme
---|---|---|---
Computation (initialization) | Hospital (client) | \(n^{3}\)M | \((n+m+s)\)G
Computation (training) | Hospital (client) | \(n^{3}\)M | \((nm+ms+5ns)\)M
Computation (training) | Cloud server | \(n^{4}\)M | \(nms\)M
Verification | – | No | Yes
Privacy in prediction | – | No | Yes
5.2 Efficiency analysis
-
Computation overhead Table 1 summarizes the computation cost of the three phases: initialization, training and prediction. We denote by G the generation of a random number, by M a multiplication, by E an exponentiation and by I an inversion over the finite field.

In the initialization phase, the client calls the KeyGen algorithm to generate three secret sparse matrices \(\mathbf F _{1}\in \mathbb {R}^{n\times n}\), \(\mathbf F _{2}\in \mathbb {R}^{m\times m}\) and \(\mathbf F _{3}\in \mathbb {R}^{s\times s}\), which costs \((n+m+s)\)G in total.

In Step 1 of the training phase, in order to protect the sensitive information in the training samples \(\mathbf X \) and the s training models \(\mathbf W \), the client performs the encryption operations, which cost \((nm+ms)\)M. In Step 2, after receiving the blinded inputs, the cloud server carries out the outsourced task according to the protocol: it multiplies \(\hat{\mathbf{X }}\) by \(\hat{\mathbf{W }}\) to obtain the blinded result \(\hat{\mathbf{Y }}\), which costs \(nms\)M. In Step 3, the client extracts the final result \(\mathbf Y \) from the blinded result \(\hat{\mathbf{Y }}\) by computing \(\mathbf F _{1}^{-1}\hat{\mathbf{Y }}\mathbf F _{3}^{-1}\), which costs \(2ns\)M. Since the cloud server is not fully trusted, the client must also check whether the returned result is correct; the verification algorithm costs a further \(3ns\)M. In Step 4, the client applies the sign function to obtain the classification result \(t_{i,k}\) for training model k and updates the current model for the incorrectly classified samples. The overhead of this step ranges from 0 to nM, corresponding to the number of incorrect classification results ranging from 0 to n. The remaining training rounds before the terminating condition is reached repeat the same process.

Next, we analyze the privacy-preserving predictive algorithm. Before submitting the data record to the client, the requester performs the encryption operations, which cost \(4n\)M. The client then multiplies his predictive model \(\mathbf W \) with the incoming blinded record \(\left\langle \mathbf M , \mathbf N \right\rangle \), which costs \(2n\)M, and spends \(2n\)E to compute the two trace values \(T(\mathbf W \mathbf M )\) and \(T(\mathbf W \mathbf N )\). Finally, the requester computes the final result locally, which costs 5M + 1I.
-
Communication overhead The communication overhead of the three stages is also listed in Table 1. As the table shows, both the training stage and the predictive stage involve only a single round-trip interaction. In the training process, the client off-loads the heavy computation to the cloud server by uploading the blinded input matrices \(\hat{\mathbf{X }}\) and \(\hat{\mathbf{W }}\), which costs \(nm+ms\); the cloud server computes the matrix multiplication and sends back the blinded result \(\hat{\mathbf{Y }}\), which costs ns. In the predictive stage, the requester submits his blinded data at a cost of 2n, and the client returns the two messages \(T(\mathbf W \mathbf M )\) and \(T(\mathbf W \mathbf N )\) at a cost of 2.

Firstly, compared with the scheme in Zhang et al. (2018), our VPSPT scheme has clear advantages in computational cost across all three phases. Table 2 presents the computation comparison between the two schemes; we have analyzed the entire computational overhead of our scheme above, and readers can refer to Zhang et al. (2018) for the details of theirs. We remark that the dimension of the training sample vector in Zhang et al. (2018) is also denoted by n. Secondly, in the VPSPT scheme, the result returned by the cloud server can be checked by the client in order to prevent cheating by a malicious attacker. Finally, in the predictive stage, the privacy of both participants is protected in our scheme.
6 Performance evaluation
Table 3 Time cost for the training stage (ms)

Dataset A: number of attributes | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13
---|---|---|---|---|---|---|---|---|---|---|---
200 rounds | 180 | 214 | 238 | 265 | 302 | 307 | 336 | 352 | 379 | 407 | 435
500 rounds | 454 | 535 | 594 | 732 | 709 | 803 | 826 | 892 | 977 | 1026 | 1060

Dataset B: number of attributes | 4 | 10 | 16 | 22 | 28 | 34 | 40 | 46 | 52 | 58 | 64
---|---|---|---|---|---|---|---|---|---|---|---
200 rounds | 314 | 476 | 643 | 800 | 974 | 1166 | 1317 | 1495 | 1617 | 1860 | 2046
500 rounds | 804 | 1210 | 1593 | 1996 | 2419 | 2939 | 3327 | 3716 | 4187 | 4697 | 5190