Abstract
In response to queries asked to a statistical database, the query system should avoid releasing summary statistics that could lead to the disclosure of confidential individual data. Attacks to the security of a statistical database may be direct or indirect and, in order to repel them, the query system should audit queries by controlling the amount of information released by their responses. This paper focuses on sum-queries with a response variable of nonnegative real type and proposes a compact representation of answered sum-queries, called an information model in “normal form,” which allows the query system to decide whether the value of a new sum-query can or cannot be safely answered. If it cannot, then the query system will issue the range of feasible values of the new sum-query consistent with previously answered sum-queries. Both the management of the information model and the answering procedure require solving linear-programming problems and, since standard linear-programming algorithms are not polynomially bounded (despite their good performances in practice), effective procedures that make a parsimonious use of them are stated for the general case. Moreover, in the special case that the information model is “graphical.” It is shown that the answering procedure can be implemented in polynomial time.
- Adam, N. R. and Wortmann, J. C. 1989. Security control-methods for statistical databases: a comparative study. ACM Computing Surveys 21, 515--556. Google Scholar
- Ahuja, R. K., Magnanti, T. L., and Orlin, J. B. 1993. Network Flows. Prentice Hall, Englewood Cliffs, NJ.Google Scholar
- Chang Chen, M. and L. McNamee, L. 1989a. A model of summary data and its applications to statistical databases. In Proceedings of the 4th International Conference on Statistical and Scientific Database Management, G. Goos and J. Hatmanis, Eds. Lecture Notes in Computer Sciences vol. 339. 354--372. Google Scholar
- Chang Chen, M., L. McNamee, L., and Melkanoff, M. 1989b. On the data model and access method of summary data management. IEEE Trans. Knowledge Data Engineering 1, 519--529. Google Scholar
- Chin, F. 1986. Security problems on inference control for SUM, MAX, and MIN queries. J. ACM 33, 451--464. Google Scholar
- Chin, F. Y. and Ozsoyoglu, G. 1982. Auditing and inference control in statistical databases. IEEE Trans. Software Engineering 8, 574--582.Google Scholar
- Chvátal, V. 1983. Linear Programming. Freeman, New York.Google Scholar
- Cox, L. H. 1980. Suppression methodology and statistical disclosure control. J. American Statistical Association 75, 377--385.Google Scholar
- Cox, L. H. and Zayatz, L. V. 1995. An agenda for research on statistical disclosure limitation. J. Official Statistics 11, 205--220.Google Scholar
- Dobra, A. and Fienberg, S. E. 2000. Bounds for cell entries in contingency tables given the marginal totals and decomposable graphs. In Proc. Nat. Acad. Sci. USA 97, 11885--11892.Google Scholar
- Duncan, G. T., Fienberg, S. E., Krishnan, R., Padman, R., and Roehrig, S. F. 2001. Disclosure limitation methods and information loss for tabular data. In Confidentiality, Disclosure and Data Access, P. Doyle, J. Lane, J. Theeuwes, L. Zayatz, Eds. Elsevier, New York, 135--166.Google Scholar
- Gusfield, D. 1988. A graph-theoretic approach to statistical data security. SIAM J. Computing 17, 552--571. Google Scholar
- Kleinberg, J. M., Papadimitriou, C. H., and Raghavan, P. 2000. Auditing Boolean attributes. In Proceedings of the 19th ACM Symposium on Principles of Database Systems. 86--91. Google Scholar
- Maier, D. 1983. The Theory of Relational Databases. Computer Science Press, Rockville, IL. Google Scholar
- Malvestuto, F. M. 1993. A universal-schema approach to statistical databases containing homogeneous summary tables. ACM Trans. Database Systems 18, 678--708. Google Scholar
- Malvestuto, F. M. and Mezzini, M. 2001. On the hardness of protecting sensitive information in a statistical database. In Proceedings of the World Multiconference on Systemics, Cybernetics and Informatics, vol. XIV. 504--509.Google Scholar
- Malvestuto, F. M. and Mezzini, M. 2002. A linear algorithm for finding the invariant edges of an edge-weighted graph. SIAM J. Computing 31, 1438--1455. Google Scholar
- Malvestuto, F. and M., Mezzini, M. 2003. Auditing sum-queries. In Proceedings of the International Conference on Database Theory. Lecture Notes in Computer Sciences. 504--509. Google Scholar
- Malvestuto, F. M. and Mezzini, M. 2004. Privacy preserving and data mining in an on-line statistical database of additive type. In Proceedings of the International Conference on Privacy in Statistical Databases, Barcelona.Google Scholar
- Malvestuto, F. M. and Moscarini, M. 1990. Query evaluability in statistical databases. IEEE Trans. Knowledge and Data Engineering 2, 425--430. Google Scholar
- Malvestuto, F. M. and Moscarini, M. 1999. An audit expert for large statistical databases. In Statistical Data Protection, EUROSTAT. 29--43.Google Scholar
- Malvestuto, F. M. and Moscarini, M. 2003. Privacy in multidimensional databases. In Multidimensional Databases, M. Rafanelli, Ed., Idea Group Pub., Hershey, PA. 310--360. Google Scholar
- Schrijver, A. 1986. Theory of Linear and Integer Programming. Wiley, New York. Google Scholar
- Wang, L., Wijekera, D., and Jajodia, S. 2002. Cardinality-based inference control in sum-only data cubes. In Proc. Europ. Symp. Computer Security (ESORICS 2002). Lecture Notes in Computer Science, vol. 2502. Springer-Verlag, New York. 55--71. Google Scholar
- Wang, L., Wijekera, D., and Jajodia, S. 2004. Cardinality-based inference control in datacubes. J. Comp. Security 12, 655--692. Google Scholar
- Willenborg, L. and de Waal, T. 1996. Statistical Disclosure Control in Practice. Lecture Notes in Statistics, vol. 111. Springer-Verlag, New York.Google Scholar
- Willenborg, L. and de Waal, T. 2000. Elements of Statistical Disclosure. Lecture Notes in Statistics, 155. Springer-Verlag, New York.Google Scholar
- Zhang, N., Zhao, W., and Chen, J. 2004. Cardinality-based inference control in OLAP systems: an information theoretic approach. In Proceedings of the ACM International Workshop on Data Warehousing and OLAP (DOLAP 2004). 59--64. Google Scholar
Index Terms
- Auditing sum-queries to make a statistical database secure
Recommendations
Auditing for secure statistical databases
ACM '81: Proceedings of the ACM '81 conferenceA statistical database (SDB) is an ordinary database that returns statistical information to user queries. The security problem for the SDB is to control the use of the SDB so that only statistical information is available and no sequence of queries is ...
Statistical Database Auditing Without Query Denial Threat
<P>Statistical database auditing is the process of checking aggregate queries that are submitted in a continuous manner, to prevent inference disclosure. Compared to other data protection mechanisms, auditing has the features of flexibility and maximum ...
Statistical Database Auditing Without Query Denial Threat
<P>Statistical database auditing is the process of checking aggregate queries that are submitted in a continuous manner, to prevent inference disclosure. Compared to other data protection mechanisms, auditing has the features of flexibility and maximum ...
Comments