Skip to main content
Top
Published in: Journal of Combinatorial Optimization 3/2021

21-11-2018

On the complexity and approximability of repair position selection problem

Authors: Xianmin Liu, Yingshu Li, Jianzhong Li, Yuqiang Feng

Published in: Journal of Combinatorial Optimization | Issue 3/2021

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Inconsistent data indicates that there is conflicted information in the data, which can be formalized as the violations of given semantic constraints. To improve data quality, repair means to make the data consistent by modifying the original data. Using the feedbacks of users to direct the repair operations is a popular solution. Under the setting of big data, it is unrealistic to let users give their feedbacks on the whole data set. In this paper, the repair position selection problem (RPS for short) is formally defined and studied. Intuitively, the RPS problem tries to find an optimal set of repair positions under the limitation of repairing cost such that we can obtain consistent data as many as possible. First, the RPS problem is formalized. Then, by considering three different repair strategies, the complexities and approximabilities of the corresponding RPS problems are studied.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Footnotes
1
In fact, most of the results in this paper can also apply to other kinds of dependency rules such as conditional functional dependency (Cong et al. 2007) and so on.
 
Literature
go back to reference Abiteboul S, Hull R, Vianu V (eds) (1995) Foundations of databases: the logical level, 1st edn. Addison-Wesley Longman Publishing Co., Inc, Boston Abiteboul S, Hull R, Vianu V (eds) (1995) Foundations of databases: the logical level, 1st edn. Addison-Wesley Longman Publishing Co., Inc, Boston
go back to reference Arenas M, Bertossi L, Chomicki J (1999) Consistent query answers in inconsistent databases. In: Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, PODS ’99, New York, NY, USA, ACM, pp 68–79 Arenas M, Bertossi L, Chomicki J (1999) Consistent query answers in inconsistent databases. In: Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems, PODS ’99, New York, NY, USA, ACM, pp 68–79
go back to reference Beskales G, Ilyas IF, Golab L, Galiullin A (2014) Sampling from repairs of conditional functional dependency violations. VLDB J 23:103–128CrossRef Beskales G, Ilyas IF, Golab L, Galiullin A (2014) Sampling from repairs of conditional functional dependency violations. VLDB J 23:103–128CrossRef
go back to reference Bohannon P, Fan W, Flaster M, Rastogi R (2005) A cost-based model and effective heuristic for repairing constraints by value modification In: Proceedings of the 2005 ACM SIGMOD international conference on management of data, SIGMOD ’05, New York, NY, USA, ACM pp 143–154 Bohannon P, Fan W, Flaster M, Rastogi R (2005) A cost-based model and effective heuristic for repairing constraints by value modification In: Proceedings of the 2005 ACM SIGMOD international conference on management of data, SIGMOD ’05, New York, NY, USA, ACM pp 143–154
go back to reference Bohannon P, Fan W, Geerts F, Jia X, Kementsietsidis A (2007) Conditional functional dependencies for data cleaning. In: 2007 IEEE 23rd international conference on data engineering, pp 746–755 Bohannon P, Fan W, Geerts F, Jia X, Kementsietsidis A (2007) Conditional functional dependencies for data cleaning. In: 2007 IEEE 23rd international conference on data engineering, pp 746–755
go back to reference Cai Z, Heydari M, Lin G (2006) Iterated local least squares microarray missing value imputation. J Bioinform Comput Biol 4:935–958CrossRef Cai Z, Heydari M, Lin G (2006) Iterated local least squares microarray missing value imputation. J Bioinform Comput Biol 4:935–958CrossRef
go back to reference Chiang F, Miller RJ (2011) A unified model for data and constraint repair. In: Proceedings of the 2011 IEEE 27th international conference on data engineering, ICDE ’11. IEEE Computer Society, Washington, DC, USA, pp 446–457 Chiang F, Miller RJ (2011) A unified model for data and constraint repair. In: Proceedings of the 2011 IEEE 27th international conference on data engineering, ICDE ’11. IEEE Computer Society, Washington, DC, USA, pp 446–457
go back to reference Chomicki J, Marcinkowski Jerzy (2005) Minimal-change integrity maintenance using tuple deletions. Inf Comput 197:90–121MathSciNetCrossRef Chomicki J, Marcinkowski Jerzy (2005) Minimal-change integrity maintenance using tuple deletions. Inf Comput 197:90–121MathSciNetCrossRef
go back to reference Cong G, Fan W, Geerts F, Jia X, Ma S (2007) Improving data quality: consistency and accuracy. In: Proceedings of the 33rd international conference on very large data bases, VLDB ’07, VLDB Endowment, pp 315–326 Cong G, Fan W, Geerts F, Jia X, Ma S (2007) Improving data quality: consistency and accuracy. In: Proceedings of the 33rd international conference on very large data bases, VLDB ’07, VLDB Endowment, pp 315–326
go back to reference Cong G, Fan W, Geerts F, Li J, Luo J (2012) On the complexity of view update analysis and its application to annotation propagation. IEEE Trans Knowl Data Eng 24:506–519CrossRef Cong G, Fan W, Geerts F, Li J, Luo J (2012) On the complexity of view update analysis and its application to annotation propagation. IEEE Trans Knowl Data Eng 24:506–519CrossRef
go back to reference Decker H, Martinenghi D (2011) Inconsistency-tolerant integrity checking. IEEE Trans Knowl Data Eng 23:218–234CrossRef Decker H, Martinenghi D (2011) Inconsistency-tolerant integrity checking. IEEE Trans Knowl Data Eng 23:218–234CrossRef
go back to reference Eiter T, Fink M, Greco G, Lembo D (2008) Repair localization for query answering from inconsistent databases. ACM Trans Database Syst 33:10:1–10:51CrossRef Eiter T, Fink M, Greco G, Lembo D (2008) Repair localization for query answering from inconsistent databases. ACM Trans Database Syst 33:10:1–10:51CrossRef
go back to reference Feige U, Seltser M (1997) On the densest k-subgraph problems. Technical report. The Weizmann Institute. Jerusalem, Israel Feige U, Seltser M (1997) On the densest k-subgraph problems. Technical report. The Weizmann Institute. Jerusalem, Israel
go back to reference Greco S, Sirangelo C, Trubitsyna I, Zumpano E (2003) Preferred repairs for inconsistent databases. In: Proceedings of the seventh international database engineering and applications symposium. July 2003, pp 202–211 Greco S, Sirangelo C, Trubitsyna I, Zumpano E (2003) Preferred repairs for inconsistent databases. In: Proceedings of the seventh international database engineering and applications symposium. July 2003, pp 202–211
go back to reference Kimelfeld B, Vondrák J, Woodruff DP (2013) Multi-tuple deletion propagation: approximations and complexity. Proc VLDB Endow 6:1558–1569CrossRef Kimelfeld B, Vondrák J, Woodruff DP (2013) Multi-tuple deletion propagation: approximations and complexity. Proc VLDB Endow 6:1558–1569CrossRef
go back to reference Lechtenbörger J, Vossen G (2003) On the computation of relational view complements. ACM Trans Database Syst 28:175–208CrossRef Lechtenbörger J, Vossen G (2003) On the computation of relational view complements. ACM Trans Database Syst 28:175–208CrossRef
go back to reference Li J, Liu X (2013) An important aspect of big data: data usability. J Comput Res Dev 50:1147–1162 Li J, Liu X (2013) An important aspect of big data: data usability. J Comput Res Dev 50:1147–1162
go back to reference Lopatenko A, Bravo L (2007) Efficient approximation algorithms for repairing inconsistent databases. In: 2007 IEEE 23rd international conference on data engineering, pp 216–225 Lopatenko A, Bravo L (2007) Efficient approximation algorithms for repairing inconsistent databases. In: 2007 IEEE 23rd international conference on data engineering, pp 216–225
go back to reference Lopatenko A, Leopoldo B (2006) Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. In: Proceedings of the 11th international conference on database theory, ICDT’07. Springer, Berlin, pp 179–193 Lopatenko A, Leopoldo B (2006) Complexity of consistent query answering in databases under cardinality-based and incremental repair semantics. In: Proceedings of the 11th international conference on database theory, ICDT’07. Springer, Berlin, pp 179–193
go back to reference Miao D, Liu X, Li J (2016) On the complexity of sampling query feedback restricted database repair of functional dependency violations. Theor Comput Sci 609:594–605MathSciNetCrossRef Miao D, Liu X, Li J (2016) On the complexity of sampling query feedback restricted database repair of functional dependency violations. Theor Comput Sci 609:594–605MathSciNetCrossRef
go back to reference Staworko SS, Chomicki J (2010) Consistent query answers in the presence of universal constraints. Inf Syst 35:1–22CrossRef Staworko SS, Chomicki J (2010) Consistent query answers in the presence of universal constraints. Inf Syst 35:1–22CrossRef
go back to reference Wang Y, Cai Z, Stothard P, Moore S, Goebel R, Wang L, Lin G (2012) Fast accurate missing snp genotype local imputation. BMC Res Notes 5:404CrossRef Wang Y, Cai Z, Stothard P, Moore S, Goebel R, Wang L, Lin G (2012) Fast accurate missing snp genotype local imputation. BMC Res Notes 5:404CrossRef
go back to reference West DB (2001) Introduction to graph theory. Prentice Hall, New York West DB (2001) Introduction to graph theory. Prentice Hall, New York
go back to reference Wijsen J (2002) Condensed representation of database repairs for consistent query answering. In: Proceedings of the 9th international conference on database theory, ICDT ’03, London, Springer, London, UK, pp 378–393 Wijsen J (2002) Condensed representation of database repairs for consistent query answering. In: Proceedings of the 9th international conference on database theory, ICDT ’03, London, Springer, London, UK, pp 378–393
go back to reference Zhao B, Zhou H, Li G, Huang Y (2018) Zenlda: large-scale topic model training on distributed data-parallel platform. Big Data Min Anal 1:57–74CrossRef Zhao B, Zhou H, Li G, Huang Y (2018) Zenlda: large-scale topic model training on distributed data-parallel platform. Big Data Min Anal 1:57–74CrossRef
go back to reference Zhou B, Li J, Wang X, Gu Y, Xu L, Hu Y, Zhu L (2018) Online internet traffic monitoring system using spark streaming. Big Data Min Anal 1:47–56CrossRef Zhou B, Li J, Wang X, Gu Y, Xu L, Hu Y, Zhu L (2018) Online internet traffic monitoring system using spark streaming. Big Data Min Anal 1:47–56CrossRef
Metadata
Title
On the complexity and approximability of repair position selection problem
Authors
Xianmin Liu
Yingshu Li
Jianzhong Li
Yuqiang Feng
Publication date
21-11-2018
Publisher
Springer US
Published in
Journal of Combinatorial Optimization / Issue 3/2021
Print ISSN: 1382-6905
Electronic ISSN: 1573-2886
DOI
https://doi.org/10.1007/s10878-018-0362-y

Other articles of this Issue 3/2021

Journal of Combinatorial Optimization 3/2021 Go to the issue

Premium Partner