
2010 | Book


Secure Data Management

7th VLDB Workshop, SDM 2010, Singapore, September 17, 2010. Proceedings

Edited by: Willem Jonker, Milan Petković

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

The VLDB Secure Data Management Workshop was held for the 7th time this year. The topic of data security remains an important area of research, especially due to the growing proliferation of data in open environments as a result of emerging data services such as cloud computing, location-based services, and health-related services. Confidentiality is the main driving force behind the research that covers topics such as privacy-enhancing technologies, access control, and search in encrypted data. We received 20 submissions, from which the program committee selected 10 papers to be presented at the workshop and included in the proceedings (50% acceptance rate). In addition, we are proud that Elisa Bertino accepted our invitation to give a keynote, for which she selected the topic of data trustworthiness. We hope the papers collected in this volume will stimulate your research in this area. The regular papers in the proceedings have been grouped into two sections. The first section focuses on privacy. The papers in this section present a balanced mix of theoretical work on anonymity and application-oriented work. The second section focuses on data security in open environments. The papers address issues related to the management of confidential data that is stored in or released to open environments, such as, for example, in cloud computing. We wish to thank all the authors of submitted papers for their high-quality submissions. We would also like to thank the program committee members as well as additional referees for doing an excellent review job. Finally, let us acknowledge the work of Luan Ibraimi, who helped in the technical preparation of the proceedings.

Table of Contents

Frontmatter

Keynote Paper

Assuring Data Trustworthiness - Concepts and Research Challenges

Abstract

Today, more than ever, there is a critical need to share data within and across organizations so that analysts and decision makers can analyze and mine the data, and make effective decisions. However, in order for analysts and decision makers to produce accurate analysis, make effective decisions, and take actions, data must be trustworthy. Therefore, it is critical that data trustworthiness issues, which also include data quality, provenance and lineage, be investigated for organizational data sharing, situation assessment, multi-sensor data integration and numerous other functions to support decision makers and analysts. The problem of providing trustworthy data to users is an inherently difficult problem that requires articulated solutions combining different methods and techniques. In the paper we first elaborate on the data trustworthiness challenge and discuss a framework to address this challenge. We then present an initial approach for assessing the trustworthiness of streaming data and discuss open research directions.

Elisa Bertino, Hyo-Sang Lim

Privacy Protection

On-the-Fly Hierarchies for Numerical Attributes in Data Anonymization

Abstract

We present in this paper a method for dynamically creating hierarchies for quasi-identifier numerical attributes. The resulting hierarchies can be used for generalization in microdata k-anonymization, or for allowing users to define generalization boundaries for constrained k-anonymity. The construction of a new numerical hierarchy for a numerical attribute is performed as a hierarchical agglomerative clustering of that attribute’s values in the dataset to anonymize. Therefore, the resulting tree hierarchy reflects well the closeness and clustering tendency of the attribute’s values in the dataset. Due to this characteristic of the hierarchies created on-the-fly for quasi-identifier numerical attributes, the quality of the microdata anonymized through generalization based on these hierarchies is well preserved, and the information loss in the anonymization process remains in reasonable bounds, as proved experimentally.

Alina Campan, Nicholas Cooper
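
As a rough illustration of the idea in the abstract above, the following sketch builds a binary hierarchy over one numerical attribute by agglomerative clustering and then uses it to generalize a value to an interval. It is not the authors' algorithm: the centroid-distance merge rule, the `Node` type, and the `generalize` helper with its `min_size` parameter are assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    """A hierarchy node covering the closed interval [low, high]."""
    low: float
    high: float
    size: int        # number of records whose value falls in the interval
    total: float     # sum of those values, used to compute the mean
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def mean(self) -> float:
        return self.total / self.size


def build_hierarchy(values: Sequence[float]) -> Node:
    """Agglomerative clustering of 1-D values (assumed centroid linkage).

    In one dimension the two clusters with the closest means are always
    neighbours in sorted order, so only adjacent pairs are compared.
    """
    if not values:
        raise ValueError("need at least one value")
    counts = Counter(values)
    clusters: List[Node] = [Node(v, v, c, v * c) for v, c in sorted(counts.items())]
    while len(clusters) > 1:
        i = min(range(len(clusters) - 1),
                key=lambda j: clusters[j + 1].mean - clusters[j].mean)
        a, b = clusters[i], clusters[i + 1]
        clusters[i:i + 2] = [Node(a.low, b.high, a.size + b.size,
                                  a.total + b.total, a, b)]
    return clusters[0]


def generalize(root: Node, value: float, min_size: int) -> Tuple[float, float]:
    """Return the narrowest hierarchy interval that contains `value`
    and still covers at least `min_size` records."""
    if not root.low <= value <= root.high:
        raise ValueError("value lies outside the hierarchy")
    node = root
    while True:
        deeper = None
        for child in (node.left, node.right):
            if (child is not None and child.low <= value <= child.high
                    and child.size >= min_size):
                deeper = child
        if deeper is None:
            return (node.low, node.high)
        node = deeper
```

For the ages `[20, 21, 22, 40, 41, 60]`, `generalize(build_hierarchy(ages), 21, 2)` yields the interval `(20, 21)`, because the tree follows the natural grouping of the values rather than fixed-width ranges.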

eM²: An Efficient Member Migration Algorithm for Ensuring k-Anonymity and Mitigating Information Loss

Abstract

Privacy preservation (PP) has become an important issue in the information age to prevent exposure and abuse of personal information. This has attracted much research, and k-anonymity is a well-known and promising model invented for PP. Based on the k-anonymity model, this paper introduces a novel and efficient member migration algorithm, called eM², to ensure k-anonymity and avoid information loss as much as possible, which is the crucial weakness of the model. In eM², we do not use the existing generalization and suppression technique. Instead we propose a member migration technique that inherits advantages and avoids disadvantages of existing k-anonymity-based techniques. Experimental results with real-world datasets show that eM² is superior to other k-anonymity algorithms by an order of magnitude.

Phuong Huynh Van Quoc, Tran Khanh Dang
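
The k-anonymity property that the abstract above builds on can be stated compactly in code. The sketch below only checks the property and lists the groups that fall short of k; it does not implement eM² or its member migration step. The function names and the record layout (dictionaries keyed by attribute name) are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Sequence, Tuple


def equivalence_classes(records: Iterable[Mapping[str, str]],
                        quasi_identifiers: Sequence[str]) -> Counter:
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[a] for a in quasi_identifiers) for r in records)


def is_k_anonymous(records: Iterable[Mapping[str, str]],
                   quasi_identifiers: Sequence[str], k: int) -> bool:
    """A table is k-anonymous if every equivalence class has at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(count >= k for count in classes.values())


def violating_groups(records: Iterable[Mapping[str, str]],
                     quasi_identifiers: Sequence[str],
                     k: int) -> Dict[Tuple[str, ...], int]:
    """Groups smaller than k; these are the ones an anonymizer has to repair."""
    classes = equivalence_classes(records, quasi_identifiers)
    return {key: count for key, count in classes.items() if count < k}
```

A group reported by `violating_groups` is one that needs additional members (or needs to give its members away) before the table can be released, which is the situation a migration-based approach addresses.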

Constrained Anonymization of Production Data: A Constraint Satisfaction Problem Approach

Abstract

The use of production data which contains sensitive information in application testing requires that the production data be anonymized first. The task of anonymizing production data becomes difficult since it usually consists of constraints which must also be satisfied in the anonymized data. We propose a novel approach to anonymize constrained production data based on the concept of constraint satisfaction problems. Due to the generality of the constraint satisfaction framework, our approach can support a wide variety of mandatory integrity constraints as well as constraints which ensure the similarity of the anonymized data to the production data. Our approach decomposes the constrained anonymization problem into independent sub-problems which can be represented and solved as constraint satisfaction problems (CSPs). Since production databases may contain many records that are associated by vertical constraints, the resulting CSPs may become very large. Such CSPs are further decomposed into dependent sub-problems that are solved iteratively by applying local modifications to the production data. Simulations on synthetic production databases demonstrate the feasibility of our method.

Ran Yahalom, Erez Shmueli, Tomer Zrihen

Privacy Preserving Event Driven Integration for Interoperating Social and Health Systems

Abstract

Processes in healthcare and socio-assistive domains typically span multiple institutions and require cooperation and information exchange among multiple IT systems. In most cases this cooperation today is handled "manually" via document exchange (by email, post, or fax) and in a point-to-point fashion. One of the reasons that make it difficult to implement an integrated solution is privacy, as health information is often sensitive and there needs to be tight control over which information is sent to whom and over the purpose for which it is requested and used. In this paper we report on how we approached this problem and on the lessons learned from designing and deploying a solution for monitoring multi-organization healthcare processes in Italy. The key idea lies in combining a powerful monitoring and integration paradigm, that of event bus and publish/subscribe systems on top of service-oriented architectures, with a simple but flexible privacy mechanism based on publication of event summaries and then on explicit requests for details by all interested parties. This approach was the first to overcome the privacy limitations defined by the laws while allowing publish/subscribe event-based integration.

Giampaolo Armellin, Dario Betti, Fabio Casati, Annamaria Chiasera, Gloria Martinez, Jovan Stevovic
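
The two-step pattern described in the abstract above (publish a summary, then release details only on an explicit, purpose-bound request) can be sketched as a small in-memory bus. This is an illustrative toy, not the deployed system: the `PrivacyAwareBus` class, its method names, and the policy model of (requester, event type, purpose) triples are all assumptions.

```python
from typing import Callable, Dict, List, Set, Tuple

Summary = Dict[str, str]


class PrivacyAwareBus:
    """Toy event bus: subscribers see summaries, details need authorization."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Summary], None]]] = {}
        self._details: Dict[str, Tuple[str, dict]] = {}    # event_id -> (type, details)
        self._allowed: Set[Tuple[str, str, str]] = set()   # (requester, type, purpose)

    def subscribe(self, event_type: str, callback: Callable[[Summary], None]) -> None:
        self._subscribers.setdefault(event_type, []).append(callback)

    def allow(self, requester: str, event_type: str, purpose: str) -> None:
        """Record that `requester` may obtain details of `event_type` for `purpose`."""
        self._allowed.add((requester, event_type, purpose))

    def publish(self, event_id: str, event_type: str,
                summary: Summary, details: dict) -> None:
        """Keep the sensitive details back and broadcast only the summary."""
        self._details[event_id] = (event_type, details)
        notice = {"event_id": event_id, "type": event_type, **summary}
        for callback in self._subscribers.get(event_type, []):
            callback(notice)

    def request_details(self, event_id: str, requester: str, purpose: str) -> dict:
        """Release details only if the requester and stated purpose are authorized."""
        event_type, details = self._details[event_id]
        if (requester, event_type, purpose) not in self._allowed:
            raise PermissionError(f"{requester} may not read {event_type} for {purpose}")
        return dict(details)
```

Because the purpose is part of the authorization key, the same requester can be granted access for one use of the data and refused for another.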

Data Security in Open Environments

Joining Privately on Outsourced Data

Abstract

In an outsourced database framework, clients place data management with specialized service providers. Of essential concern in such frameworks is data privacy. Potential clients are reluctant to outsource sensitive data to a foreign party without strong privacy assurances beyond policy “fine–prints”. In this paper we introduce a mechanism for executing general binary JOIN operations (for predicates that satisfy certain properties) in an outsourced relational database framework with full computational privacy and low overheads – a first, to the best of our knowledge. We illustrate via a set of relevant instances of JOIN predicates, including: range and equality (e.g., for geographical data), Hamming distance (e.g., for DNA matching) and semantics (i.e., in health-care scenarios – mapping antibiotics to bacteria). We experimentally evaluate the main overhead components and show they are reasonable. For example, the initial client computation overhead for 100000 data items is around 5 minutes. Moreover, our privacy mechanisms can sustain theoretical throughputs of over 30 million predicate evaluations per second, even for an un-optimized OpenSSL based implementation.

Bogdan Carbunar, Radu Sion

Computationally Efficient Searchable Symmetric Encryption

Abstract

Searchable encryption is a technique that allows a client to store documents on a server in encrypted form. Stored documents can be retrieved selectively while revealing as little information as possible to the server. In the symmetric searchable encryption domain, the storage and the retrieval are performed by the same client. Most conventional searchable encryption schemes suffer from two disadvantages. First, searching the stored documents takes time linear in the size of the database, and/or uses heavy arithmetic operations. Secondly, the existing schemes do not consider adaptive attackers; a search-query will reveal information even about documents stored in the future. If they do consider this, it is at a significant cost to the performance of updates. In this paper we propose a novel symmetric searchable encryption scheme that offers searching at constant time in the number of unique keywords stored on the server. We present two variants of the basic scheme which differ in the efficiency of search and storage. We show how each scheme could be used in a personal health record system.

Peter van Liesdonk, Saeed Sedghi, Jeroen Doumen, Pieter Hartel, Willem Jonker
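
To make the "constant time in the number of unique keywords" claim in the abstract above concrete, here is a deliberately simplified keyword index in which the server stores one hash-table entry per keyword token. It is not the scheme proposed in the paper and it is not secure as written: document identifiers are stored in the clear and the server learns which token is queried. The class names and the use of HMAC-SHA256 as the token function are assumptions for this sketch.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List


def keyword_token(key: bytes, keyword: str) -> bytes:
    """Deterministic token for a keyword; only the key holder can compute it."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).digest()


class Server:
    """Stores tokens only; it never sees the keywords themselves."""

    def __init__(self) -> None:
        self._index: Dict[bytes, List[str]] = {}

    def store(self, token: bytes, doc_id: str) -> None:
        self._index.setdefault(token, []).append(doc_id)

    def search(self, token: bytes) -> List[str]:
        # One dictionary lookup, independent of how many keywords are indexed.
        return list(self._index.get(token, []))


class Client:
    """Performs both storage and retrieval, as in the symmetric setting."""

    def __init__(self, key: bytes, server: Server) -> None:
        self._key = key
        self._server = server

    def add_document(self, doc_id: str, keywords: Iterable[str]) -> None:
        for keyword in set(keywords):
            self._server.store(keyword_token(self._key, keyword), doc_id)

    def search(self, keyword: str) -> List[str]:
        return self._server.search(keyword_token(self._key, keyword))
```

A client holding a different key derives different tokens and therefore finds nothing, which is the basic property that lets the index live on an untrusted server.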

Towards the Secure Modelling of OLAP Users’ Behaviour

Abstract

Information Security is a crucial aspect for organizations, and must be considered during the development of Information Systems. The data in Data Warehouses (DWs) are highly sensitive since they manage historical information which is used to make strategic decisions, and security constraints should therefore be included in DW modelling within its structural aspects. However, another dynamic security component is also related to the sequences of OLAP (On-Line Analytical Processing) operations, and could be used to access (or infer) unauthorized information. This paper complements the modelling of DWs with state models, which permit the modelling of these dynamic situations in which sensitive information could be inferred. That is, it models queries that include security issues, and controls that their evolution through the application of OLAP operations always leads to authorized states. Finally, our proposal has been applied to a healthcare case study in which a DW manages admissions information with various security constraints.

Carlos Blanco, Eduardo Fernández-Medina, Juan Trujillo, Jan Jurjens

A Formal P3P Semantics for Composite Services

Abstract

As online services are moving from the single service to the composite service paradigm, privacy is becoming an important issue due to the amount of user data being collected and stored. The Platform for Privacy Preferences (P3P) was defined to provide privacy protection by enabling services to express their privacy practices, which in turn helps users decide whether to use the services or not. However, P3P was designed for the single service model, bringing some challenges when employing it with composite services. Moreover, the P3P language may lead to misinterpretation by P3P user agents due to its flexibility and may have internal semantic inconsistencies due to a lack of clear semantics. Therefore, we enhance P3P to be able to support composite services, propose a formal semantics for P3P to preserve semantic consistency, and also define combining methods to obtain the privacy policies of composite services.

Assadarat Khurat, Dieter Gollmann, Joerg Abendroth

A Geometric Approach for Efficient Licenses Validation in DRM

Abstract

In DRM systems, content is distributed from the owner to consumers, often through multiple middle-level distributors. The owner issues redistribution licenses to its distributors. Using the redistribution licenses they have received, the distributors can generate and issue new redistribution licenses to their sub-distributors and new usage licenses to consumers. To detect rights violations, all the newly generated licenses must be validated. The validation process becomes complex when a distributor holds multiple redistribution licenses for the same content. In such cases, validation requires an exponential number of validation equations, which makes the process highly computation-intensive. To perform the validation efficiently, in this paper we propose a method to geometrically derive the relationship between different validation equations in order to identify the redundant ones. These redundant validation equations are then removed using graph theory concepts. Experimental results show that the validation time can be significantly reduced using our proposed approach.

Amit Sachan, Sabu Emmanuel, Mohan S. Kankanhalli

Differentially Private Data Release through Multidimensional Partitioning

Abstract

Differential privacy is a strong notion for protecting individual privacy in privacy preserving data analysis or publishing. In this paper, we study the problem of differentially private histogram release based on an interactive differential privacy interface. We propose two multidimensional partitioning strategies including a baseline cell-based partitioning and an innovative kd-tree based partitioning. In addition to providing formal proofs for differential privacy and usefulness guarantees for linear distributive queries, we also present a set of experimental results and demonstrate the feasibility and performance of our method.

Yonghui Xiao, Li Xiong, Chun Yuan
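
The baseline cell-based strategy mentioned in the abstract above can be approximated with the standard Laplace mechanism: count the records in every cell of the multidimensional domain and add independent Laplace noise of scale 1/ε to each count. The sketch below assumes neighbouring datasets differ by adding or removing one record (so each count has sensitivity 1) and does not model the interactive interface or the kd-tree variant from the paper. The function names and the `domains` parameter are assumptions.

```python
import random
from collections import Counter
from itertools import product
from typing import Dict, Hashable, Iterable, Optional, Sequence, Tuple


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_cell_histogram(records: Iterable[Sequence[Hashable]],
                         domains: Sequence[Sequence[Hashable]],
                         epsilon: float,
                         rng: Optional[random.Random] = None
                         ) -> Dict[Tuple[Hashable, ...], float]:
    """Release a noisy count for every cell in the cross product of `domains`.

    Empty cells receive noise as well; skipping them would reveal which
    attribute combinations are absent from the data.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_counts = Counter(tuple(r) for r in records)
    scale = 1.0 / epsilon  # sensitivity 1 divided by the privacy budget
    return {cell: true_counts.get(cell, 0) + laplace_noise(scale, rng)
            for cell in product(*domains)}
```

The number of cells grows with the product of the domain sizes, so sparse data ends up dominated by noise in empty cells; that cost is one motivation for coarser, data-aware partitions such as the kd-tree strategy the abstract names.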

Backmatter

Title: Secure Data Management
Edited by: Willem Jonker
Milan Petković
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-15546-8
Print ISBN: 978-3-642-15545-1
DOI: https://doi.org/10.1007/978-3-642-15546-8
