Digital provenance: Enabling secure data forensics in cloud computing

https://doi.org/10.1016/j.future.2013.10.006

Highlights

  • We propose a practical secure provenance scheme with fine-grained access control.

  • A broadcast encryption technique is utilized to decrease the data owner’s computational overhead.

  • An attribute-based signature is applied to realize efficient anonymous authentication.

Abstract

Secure provenance that records the ownership and process history of data objects is vital to the success of data forensics in cloud computing. In this paper, we propose a new secure provenance scheme based on group signature and attribute-based signature techniques. The proposed provenance scheme provides confidentiality of sensitive documents stored in a cloud, unforgeability of the provenance record, anonymous authentication to cloud servers, fine-grained access control on documents, and provenance tracking on disputed documents. Furthermore, it is assumed that the cloud server has huge computation capacity, while users are regarded as devices with low computation capability. To this end, we show how to outsource computation to the cloud server and thereby decrease the user’s computational overhead during the process of provenance. With provable security techniques, we formally demonstrate the security of the proposed scheme under standard assumptions.

Introduction

Cloud computing is a promising next-generation computing paradigm which integrates multiple existing and new technologies such as virtualization and distributed computing. It provides unlimited “virtualized” resources to users as services across the Internet while abstracting the details from users. With the emergence of commercial cloud computing platforms such as Amazon’s EC2 and S3  [1], Google’s App Engine  [2], and Microsoft’s Azure  [3], cloud computing has become more a reality than just a concept  [4].

As in any existing application and system, security and privacy play an extremely important role for the success of cloud computing, and certainly raise a lot of challenges among the many others that cloud computing is confronted with. It is hard to imagine that a cloud customer, say a company, would like to store all its sensitive information on cloud computing platforms, e.g., Amazon’s S3, and put the security protection of the information at the mercy of the cloud computing operator.

Besides the confidentiality of this sensitive information, the user’s identity privacy, a fundamental right to privacy, is also expected in cloud computing. If access to a cloud discloses a user’s real identity, the user could still be unwilling to accept this paradigm. Thus, anonymous authentication [5] is desirable in cloud computing. Although anonymous authentication can protect the privacy of a user’s identity, it should provide only conditional anonymity. For example, when a group of users is authorized to access a document and a dispute arises over a modification, a designated party must be able to track the real user.

Provenance systems [6], [7], [8] have been developed to record provenance metadata. Given its provenance, a data object can report who created it and who modified its contents. Practical provenance systems use a specialized recording instrument to collect information about data processing at runtime. The instrument annotates data with information on the relevant operations performed on it. The ordered collection of provenance annotations becomes an unalterable record of data evolution called a provenance chain. Therefore, once a dispute arises over a document stored in a cloud, provenance is important for data forensics because it provides digital evidence for the subsequent investigation. Provenance information has a wide range of critical application areas. For example, scientific data processing needs to keep track of data ownership and processing workflow to ensure the trust assigned to the output data. In business environments, provenance of documents is even more critical for regulatory and legal reasons. A company’s financial reports are required to contain provenance information on the path the data took during various stages of processing and the principals who performed various actions on it.
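The idea of a provenance chain can be illustrated with a minimal sketch. It assumes a plain hash chain, which is not the construction of this paper (the proposed scheme is based on group signature and attribute-based signature techniques). The names `append_record`, `verify_chain`, and `GENESIS` are hypothetical and chosen only for illustration. A hash chain makes tampering detectable rather than impossible.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder meaning "no previous record"


def _digest(body):
    """Hash a record body in a canonical (sorted-key) JSON encoding."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_record(chain, actor, operation):
    """Append a provenance annotation linked to the previous record's digest."""
    prev = chain[-1]["digest"] if chain else GENESIS
    body = {"actor": actor, "operation": operation, "prev": prev}
    chain.append({**body, "digest": _digest(body)})


def verify_chain(chain):
    """Recompute every digest and link; an edit to any record is detected."""
    prev = GENESIS
    for record in chain:
        body = {key: record[key] for key in ("actor", "operation", "prev")}
        if record["prev"] != prev or record["digest"] != _digest(body):
            return False
        prev = record["digest"]
    return True


chain = []
append_record(chain, "owner", "create")
append_record(chain, "user-1", "modify")
```

Because each record commits to the digest of its predecessor, changing an earlier annotation invalidates that record and every later link unless the rest of the chain is recomputed, which is why practical schemes additionally sign the records.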

Therefore, cloud computing should also provide provenance [9] to record the ownership and process history of data objects in the cloud in order to gain wide acceptance by the public. However, there are many challenges to provenance in cloud computing [8]. In particular, we need to protect the security of provenance information, i.e., provenance must not violate information confidentiality and user privacy in cloud computing. Specifically, these requirements [9] include confidentiality of documents, unforgeability of the provenance record, and conditional anonymity of the user’s identity.

Though secure provenance is vital to the success of data forensics in cloud computing, two critical issues have to be addressed before its deployment, namely, (1) fine-grained access control: when a document is being created, the data owner can specify a fine-grained access control policy for the documents stored remotely in the cloud servers; (2) low computation and communication overhead at the data owner/user side: in cloud computing, high computational ability is required only of the cloud server. In practice, user devices are assumed to have low computational capability. Thus, a provenance system with low computation for data owners and users is preferred in cloud computing.

To this end, we propose a practical secure provenance scheme with fine-grained access control based on the bilinear pairing technique, which can provide trusted evidence for data forensics in cloud computing. Our contributions in this paper are as follows.

  • (1) The computation and communication overhead for the data owner is low. Compared with the previous work [5], two new techniques are utilized here to decrease the data owner’s computational overhead. The first is broadcast encryption, which is used by the cloud server to control the user’s access. The other is the attribute-based signature, which is computed by users, instead of data owners, as part of their access requests.

  • (2) The computational overhead for the data owner/user has been significantly reduced by outsourcing the cryptographic operation of exponentiation in a bilinear group. More specifically, the computation is moved from the data owner/user side to the cloud server by using the following two techniques. The first is to use the two-server model [10] to compute the exponentiation cooperatively. The second is to use the proxy re-signature method [11]. As a result, we significantly reduce the complexity at the user/data owner side with respect to the computation of modular exponentiation from O(k) to O(1) in terms of the number of modular multiplications required [12], where k represents the number of bits of the exponent.
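To see where the O(k) local cost comes from, the following sketch counts the modular multiplications used by textbook left-to-right square-and-multiply exponentiation. It illustrates only the baseline that outsourcing aims to avoid; it does not implement the two-server model or the proxy re-signature method, and the function name `modexp_count` is hypothetical.

```python
def modexp_count(base, exponent, modulus):
    """Left-to-right square-and-multiply.

    Returns (result, number of modular multiplications performed).
    For a k-bit exponent this performs k squarings plus one extra
    multiplication per 1-bit, i.e. between k and 2k multiplications.
    """
    result = 1
    mults = 0
    for bit in bin(exponent)[2:]:      # most significant bit first
        result = (result * result) % modulus
        mults += 1
        if bit == "1":
            result = (result * base) % modulus
            mults += 1
    return result, mults


value, mults = modexp_count(3, 65537, 1000003)
```

The exponent 65537 has 17 bits, two of which are set, so the loop performs 17 squarings and 2 further multiplications. The count grows linearly with the bit length of the exponent, which is the O(k) term in the text.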

The rest of the paper is organized as follows. In Section  2, we present the related work for a secure provenance system. In Section  3, the architecture and the security model for a secure provenance system are given. In Section  4, we show some basic tools which will be used in this paper, which include the attribute-based signature scheme and the group signature scheme. In Section  5, a new and efficient secure provenance scheme is given, as well as its security analysis. We also discuss how to provide fine-grained access control and better efficiency in this section. Finally, we draw our conclusions in Section  6.

Section snippets

Related work

Provenance has been studied extensively in archival theory for the purpose of asserting authenticity. In recent years, provenance has also gained importance in digital realms and e-Science  [13], [14]. However, most schemes require trustworthiness of the server. Provenance systems that do not rely on a trusted server have also been developed  [15].

Although provenance of workflow and documents has been studied extensively in the past, very little work has been done on securing the provenance

System model

In this paper, we consider a cloud data system consisting of data owners W, data users U, a cloud server, attribute authorities $A_1, A_2, \ldots, A_N$, and a third-party auditor TPA; see Fig. 1. W stores his/her sensitive data on the cloud server. U is issued attributes from $A_1, A_2, \ldots, A_N$. To access and operate on the remotely stored documents shared by W, user U needs to show his/her access privilege to the cloud server. The cloud server is always online and is operated by the Cloud Service Provider (CSP).

Pairing

Let $G_1 = \langle g_1 \rangle$ and $G_T$ be multiplicative cyclic groups of prime order $p$. A pairing $\hat{e}: G_1 \times G_1 \to G_T$ is a bilinear map with the following properties.

  • Bilinearity: $\hat{e}(g_1^a, g_1^b) = \hat{e}(g_1, g_1)^{ab}$ for all $a, b \in \mathbb{Z}_p$.

  • Non-degeneracy: $\hat{e}(g_1, g_1) \neq 1$.

  • Computability: It is efficient to compute $\hat{e}(g_1, g_2)$ for all $g_1, g_2 \in G_1$.
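The first two properties can be checked numerically on a deliberately insecure toy map. The sketch below assumes tiny parameters (the order-11 subgroup of the integers modulo 23) and evaluates the map by brute-force discrete logarithms, so it is not a real pairing: practical pairings are defined on elliptic curves and are efficiently computable without discrete logs. All names and constants (`P`, `Q`, `G1_GEN`, `GT_GEN`, `dlog`, `pairing`) are hypothetical.

```python
P = 11        # prime order of the toy groups
Q = 23        # modulus; the order-11 subgroup of Z_23^* hosts both groups
G1_GEN = 2    # generator g1 of the order-11 subgroup (2^11 = 1 mod 23)
GT_GEN = 3    # another generator of that subgroup, playing the role of e(g1, g1)


def dlog(x):
    """Brute-force discrete log to base G1_GEN; feasible only because P is tiny."""
    for a in range(P):
        if pow(G1_GEN, a, Q) == x:
            return a
    raise ValueError("element is not in the order-P subgroup")


def pairing(x, y):
    """Toy map sending (g1^a, g1^b) to GT_GEN^(a*b)."""
    return pow(GT_GEN, (dlog(x) * dlog(y)) % P, Q)


def check_bilinearity():
    """Verify e(g1^a, g1^b) == e(g1, g1)^(a*b) for every a, b in Z_p."""
    base = pairing(G1_GEN, G1_GEN)
    return all(
        pairing(pow(G1_GEN, a, Q), pow(G1_GEN, b, Q)) == pow(base, a * b, Q)
        for a in range(P)
        for b in range(P)
    )
```

The toy map fails the spirit of the computability property on purpose: it is efficient here only because the group is small enough to enumerate.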

Attribute-based signature with multi-authority

An ABS scheme consists of four algorithms: a setup algorithm Setup, private key extraction algorithm Extract, signing algorithm Sign, and verification algorithm Verify. We describe the ABS scheme with multiple

The construction

In this section, we show the construction of the proposed provenance system based on ABS. One of the reasons that we use ABS instead of ABE is to reduce the computational overhead of the data owner, which will be explained in the next section. For simplicity, we provide the construction with a basic $(d_k, m_k)$-threshold access control policy, where $d_k$ is some prefixed number for each authority $A_k$. We will also show how to improve this construction and support a more fine-grained tree structure. In
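As a rough illustration of a threshold policy, the sketch below checks the policy in the clear. It assumes one plausible reading that the text above does not spell out: for each authority k, the policy names a set of m_k attributes, and a user qualifies only if he/she holds at least d_k of them, for every authority. In the actual scheme this would be proven cryptographically through an attribute-based signature rather than by comparing sets; the function name and data layout are hypothetical.

```python
def satisfies_policy(user_attrs, policy):
    """Check an assumed per-authority threshold policy.

    user_attrs: dict mapping authority name -> set of attributes the user holds
    policy:     dict mapping authority name -> (d_k, set of m_k policy attributes)
    """
    for authority, (d_k, required) in policy.items():
        held = user_attrs.get(authority, set()) & required
        if len(held) < d_k:
            return False
    return True


# Hypothetical example: two authorities with (2, 3) and (1, 2) thresholds.
policy = {
    "A1": (2, {"doctor", "cardiology", "senior"}),
    "A2": (1, {"hospital-x", "hospital-y"}),
}
alice = {"A1": {"doctor", "senior"}, "A2": {"hospital-x"}}
bob = {"A1": {"doctor"}, "A2": {"hospital-x", "hospital-y"}}
```

Under this reading, `alice` meets both thresholds, while `bob` holds only one of the three attributes named for the first authority and is rejected.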

Conclusion

In this paper, we have proposed a new provenance system with fine-grained access control based on an ABS scheme. In this new provenance system, the anonymity of the user is guaranteed by using the techniques of group signature and ABS. Furthermore, because the user’s attribute private key is issued from multiple attribute authorities with an anonymous key-issuing protocol, the user’s privacy is also protected from the attribute authorities. The computation and communication overhead for the

Acknowledgments

We are grateful to the anonymous referees for their invaluable suggestions. This work is supported by the National Natural Science Foundation of China (Nos. 61100224, 61272455).

Jin Li received his B.S. (2002) and M.S. (2004) from Southwest University and Sun Yat-sen University, respectively, both in Mathematics. He obtained his Ph.D. degree in information security from Sun Yat-sen University in 2007. Currently, he is working at Guangzhou University. His research interests include applied cryptography and security in cloud computing (secure outsourcing computation and cloud storage).

References (29)

  • R. Buyya et al., Cloud computing and emerging IT platforms: vision, hype, and reality for delivering computing as the 5th utility, Future Generation Computer Systems (2009)
  • Jin Li et al., Hidden attribute-based signatures without anonymity revocation, Information Sciences (2010)
  • Amazon Web Services, AWS. Online at:...
  • Google App Engine. Online at:...
  • Microsoft Azure....
  • Rongxing Lu et al., Secure provenance: the essential of bread and butter of data forensics in cloud computing
  • R. Aldeco-Perez, L. Moreau, Provenance-based auditing of private data use, in: Proceedings of the 2008 international...
  • R.S. Barga et al., Automatic capture and efficient storage of e-Science experiment provenance, Concurrency and Computation: Practice and Experience (2008)
  • R. Hasan, R. Sion, M. Winslett, Introducing secure provenance: problems and challenges, in: Proceedings of ACM workshop...
  • C.A. Lynch, When documents deceive: trust and provenance as new factors for information retrieval in a tangled Web, Journal of the American Society for Information Science and Technology (2001)
  • Susan Hohenberger et al., How to securely outsource cryptographic computations
  • Giuseppe Ateniese et al., Proxy re-signatures: new definitions, algorithms, and applications
  • D.M. Gordon, A survey of fast exponentiation methods, Journal of Algorithms (1998)
  • Y.L. Simmhan et al., A survey of data provenance in e-Science, SIGMOD Record (2005)

    Xiaofeng Chen received his B.S. and M.S. degrees in Mathematics from Northwest University, China. He obtained his Ph.D. degree in Cryptography from Xidian University in 2003. Currently, he is working at Xidian University as a professor. His research interests include applied cryptography and cloud security.

    Qiong Huang received his B.S. and M.S. degrees from Fudan University in 2003 and 2006, respectively, and he obtained his Ph.D. degree from City University of Hong Kong in 2010. He is currently working at the College of Informatics, South China Agricultural University. His research interests include cryptography and information security, in particular, cryptographic protocols design and analysis.

    Duncan S. Wong received his B.Eng. degree in Electrical and Electronic Engineering with first class honors from the University of Hong Kong in 1994, his M.Phil. degree in Information Engineering from the Chinese University of Hong Kong in 1998, and his Ph.D. degree in Computer Science from Northeastern University, Boston, MA, USA, in 2002. After graduation, he was a visiting assistant professor at the Chinese University of Hong Kong for one year before joining City University of Hong Kong in September 2003. He is now an associate professor in the Department of Computer Science.
