ABSTRACT
Motion sensors such as accelerometers and gyroscopes measure a device's instantaneous acceleration and rotation in three dimensions. Raw data streams from motion sensors embedded in portable and wearable devices may reveal private information about users without their awareness. For example, motion data might disclose the weight or gender of a user, or enable their re-identification. To address this problem, we propose an on-device transformation of sensor data to be shared for specific applications, such as monitoring selected daily activities, without revealing information that enables user identification. We formulate the anonymization problem using an information-theoretic approach and propose a new multi-objective loss function for training deep autoencoders. This loss function helps minimize both user-identity information and data distortion, so as to preserve the application-specific utility. The training process regulates the encoder to disregard user-identifiable patterns and tunes the decoder to shape the output independently of the users in the training set. The trained autoencoder can be deployed on a mobile or wearable device to anonymize sensor data even for users who are not included in the training dataset. Data from 24 users transformed by the proposed anonymizing autoencoder lead to a promising trade-off between utility and privacy, with activity-recognition accuracy above 92% and user-identification accuracy below 7%.
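The abstract describes a multi-objective loss that trades off three terms: retaining activity information, suppressing identity information, and limiting distortion of the reconstructed signal. The sketch below is an illustrative toy version of such a loss, not the paper's exact formulation: the function names, the weights `w_act`/`w_id`/`w_rec`, and the choice of a KL term that pushes the identity posterior toward the uniform distribution are all assumptions made for clarity.

```python
import numpy as np

def cross_entropy(probs, target_idx, eps=1e-12):
    """Negative log-likelihood of the target class."""
    return -np.log(probs[target_idx] + eps)

def anonymization_loss(x, x_hat, act_probs, act_true,
                       id_probs, n_users,
                       w_act=1.0, w_id=1.0, w_rec=1.0):
    """Toy multi-objective loss combining three goals:
    - utility: low cross-entropy on the true activity class,
    - privacy: identity posterior close to uniform over users
      (KL term is zero iff no user is distinguishable),
    - fidelity: low mean squared reconstruction error.
    """
    l_act = cross_entropy(act_probs, act_true)
    uniform = np.full(n_users, 1.0 / n_users)
    l_id = np.sum(uniform * np.log(uniform / (id_probs + 1e-12)))
    l_rec = np.mean((x - x_hat) ** 2)
    return w_act * l_act + w_id * l_id + w_rec * l_rec
```

In a real training loop, `act_probs` and `id_probs` would come from auxiliary classifiers applied to the autoencoder output, and the loss would be minimized with respect to the autoencoder's parameters; a peaked identity posterior raises the loss, steering the encoder away from user-identifiable patterns.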
Index Terms
- Mobile sensor data anonymization