2022 | Book

Guide to Data Privacy

Models, Technologies, Solutions

About this book

Data privacy technologies are essential for implementing information systems with privacy by design.

Privacy technologies are clearly needed to ensure not only that data does not lead to disclosure, but also that statistics and data-driven machine learning models do not. For example, can a deep-learning model be attacked to discover that sensitive data has been used for its training? This accessible textbook presents privacy models, computational definitions of privacy, and methods to implement them. Additionally, the book explains and gives plentiful examples of how to implement, among other models, differential privacy, k-anonymity, and secure multiparty computation.

Topics and features:

Provides integrated presentation of data privacy (including tools from statistical disclosure control, privacy-preserving data mining, and privacy for communications)
Discusses privacy requirements and tools for different types of scenarios, including privacy for data, for computations, and for users
Offers characterization of privacy models, comparing their differences, advantages, and disadvantages
Describes some of the most relevant algorithms to implement privacy models
Includes examples of data protection mechanisms

This unique textbook/guide contains numerous examples and succinctly and comprehensively gathers the relevant information. As such, it will be eminently suitable for undergraduate and graduate students interested in data privacy, as well as professionals wanting a concise overview.

Vicenç Torra is Professor with the Department of Computing Science at Umeå University, Umeå, Sweden.

Table of Contents

Frontmatter
1. Introduction
Abstract
Large amounts of data are collected and processed nowadays. Sensitive information is present in these data, or can be inferred from them. Data privacy aims to ensure that sensitive information is not disclosed. In this chapter we give an introduction to the field. We describe the motivations for data privacy, underline the links between data privacy and society, and review terminology and concepts.
Vicenç Torra
2. Machine and Statistical Learning, and Cryptography
Abstract
This chapter reviews the main concepts of machine and statistical learning, as well as of cryptography, that are needed in the rest of the book. Algorithms for supervised and unsupervised learning are described. They include regression, clustering, and association rule mining. Some additional tools, such as comparison indices, are also described. A summary of the most important cryptographic concepts is also included. For example, private-key and public-key cryptography as well as homomorphic encryption are described.
Vicenç Torra
3. Disclosure, Privacy Models, and Privacy Mechanisms
Abstract
This chapter describes the different types of disclosure that can take place in data and data-driven model releases. They are mainly of two types: identity disclosure and attribute disclosure. Then, we formalize privacy models, that is, computational definitions of privacy. These privacy models include, among others, k-anonymity, differential privacy, and secure multiparty computation. Finally, we give a roadmap of the privacy mechanisms and relate them to the privacy models.
Vicenç Torra
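As a rough illustration of the k-anonymity model named in the abstract above, the sketch below checks whether every combination of quasi-identifier values occurs at least k times in a table. The function name `is_k_anonymous`, the attribute names, and the toy records are assumptions made for this example; they are not taken from the book.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if each combination of quasi-identifier values
    is shared by at least k records (illustrative helper)."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical toy table: "age" and "zip" act as quasi-identifiers,
# "diagnosis" is the sensitive attribute.
records = [
    {"age": "30-39", "zip": "901**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "901**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "902**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "902**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(records, ["age", "zip"], k=3))
```

Each quasi-identifier combination in the toy table is shared by exactly two records, so the check succeeds for k=2 and fails for k=3.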
4. Privacy for Users
Abstract
User privacy provides users with tools to protect their sensitive information. In this chapter we consider tools for privacy in communications and for privacy in information retrieval. For each scenario, we consider protecting the identity of the user and protecting their data. We describe concepts such as anonymity systems (e.g. Tor) and private information retrieval.
Vicenç Torra
5. Privacy for Computations, Functions, and Queries
Abstract
This chapter presents different protection mechanisms that apply when we know which function we want to compute, and we want to avoid disclosure from the outcome of this function. Different mechanisms have been developed to provide guarantees for the different privacy models that apply in these scenarios. We structure the chapter in terms of the privacy models, including sections on differential privacy and secure multiparty computation.
Vicenç Torra
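To make the idea of protecting the outcome of a known function concrete, here is a minimal sketch of the Laplace mechanism, a standard way to obtain differential privacy for a counting query. It is not taken from the book: the names `laplace_noise` and `private_count`, the toy `ages` list, and the choice of epsilon are assumptions for illustration. A counting query changes by at most 1 when one record is added or removed, so the noise scale used is 1/epsilon.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two independent
    exponential variables, each with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Noisy count of the values satisfying `predicate`.

    Assumes sensitivity 1 (one record changes the count by at most 1),
    so the Laplace scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: how many people are 40 or older?
ages = [23, 35, 41, 52, 67, 38, 29]
print(private_count(ages, lambda a: a >= 40, epsilon=1.0))
```

Smaller values of epsilon give stronger privacy but noisier answers. The printed value varies from run to run because the noise is random.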
6. Privacy for Data: Masking Methods
Abstract
This chapter describes major methods for protecting databases. This includes perturbative and non-perturbative methods, as well as synthetic data generators. This review includes rank swapping, microaggregation, additive and multiplicative noise, PRAM, and generalization. We also describe the use of GANs to generate synthetic data. The chapter also includes a discussion on methods for achieving k-anonymity and methods appropriate for big data.
Vicenç Torra
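As a hedged sketch of one masking method listed in the abstract above, the code below performs a very simple univariate microaggregation: values are sorted, split into consecutive groups of at least k, and each value is replaced by its group mean. This fixed-size grouping is an assumption chosen for brevity and is not necessarily the algorithm the book presents; the function name `microaggregate` and the sample values are hypothetical.

```python
def microaggregate(values, k):
    """Replace each value by the mean of its group of at least k
    neighbouring values (groups are formed on the sorted data)."""
    if k < 1 or len(values) < k:
        raise ValueError("need k >= 1 and at least k values")
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_groups = len(values) // k
    masked = [0.0] * len(values)
    for g in range(n_groups):
        start = g * k
        # The last group absorbs any leftover values.
        end = (g + 1) * k if g < n_groups - 1 else len(values)
        members = order[start:end]
        mean = sum(values[i] for i in members) / len(members)
        for i in members:
            masked[i] = mean
    return masked


# Hypothetical salaries (in thousands), masked with k = 3.
salaries = [10, 50, 12, 52, 11, 51, 90]
print(microaggregate(salaries, k=3))
# prints [11.0, 60.75, 11.0, 60.75, 11.0, 60.75, 60.75]
```

Because every value is replaced by a group mean, the total of the masked values equals the total of the originals, while each individual value becomes indistinguishable from at least k-1 others.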
7. Selection of a Data Protection Mechanism: Information Loss and Risk
Abstract
Masking methods produce a distorted version of the data. This distortion depends on the method as well as on its parameterization. Data utility, or the information loss caused by the method, can help with method and parameter selection. Disclosure risk is another element to take into account. In this chapter we give an overview of information loss measures and of method selection. Some of the ideas that appear here are useful for protection mechanisms other than masking methods. We complete the chapter with a discussion of data protection in machine learning and federated learning. Federated learning is a very good example to illustrate the difficulties related to data protection mechanism selection.
Vicenç Torra
8. Other Data-Driven Mechanisms
Abstract
In this chapter we describe privacy mechanisms for two additional types of protections. We describe result-driven approaches with examples on rule mining. We describe how we can modify a database so that some rules cannot be extracted once the protected database is published. We introduce tabular data protection. We understand tabular data as aggregates of data in table form. We describe rules to detect when a cell in the table is sensitive and two approaches for tabular data protection.
Vicenç Torra
9. Conclusions
Abstract
This book has given an introduction to data privacy. We have presented the main areas and some of the methods and tools to ensure privacy and avoid disclosure. We have tried to show the difficulties of correctly assessing disclosure risk and provided two examples, already in the first chapter, that are paradigmatic of the problems we encounter when building privacy-aware systems. In this chapter we provide some guidelines for implementing privacy.
Vicenç Torra
Backmatter
Metadata
Title
Guide to Data Privacy
Author
Vicenç Torra
Copyright Year
2022
Electronic ISBN
978-3-031-12837-0
Print ISBN
978-3-031-12836-3
DOI
https://doi.org/10.1007/978-3-031-12837-0
