Skip to main content
Erschienen in: The VLDB Journal 3/2021

23.02.2021 | Regular Paper

Cleaning timestamps with temporal constraints

verfasst von: Shaoxu Song, Ruihong Huang, Yue Cao, Jianmin Wang

Erschienen in: The VLDB Journal | Ausgabe 3/2021

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Timestamps are often found to be dirty in various scenarios, e.g., in distributed systems with clock synchronization problems or unreliable RFID readers. Without cleaning the imprecise timestamps, temporal-related applications such as provenance analysis or pattern queries are not reliable. To evaluate the correctness of timestamps, temporal constraints could be employed, which declare the distance restrictions between timestamps. Guided by such constraints on timestamps, in this paper, we study a novel problem of repairing inconsistent timestamps that do not conform to the required temporal constraints. Following the same line of data repairing, the timestamp repairing problem is to minimally modify the timestamps towards satisfaction of temporal constraints. This problem is practically challenging, given the huge space of possible timestamps. We tackle the problem by identifying a concise set of promising candidates, where an optimal repair solution can always be found. Repair algorithms with efficient pruning are then devised over the identified candidates. Approximate solutions are also presented including simple heuristic and linear programming (LP) relaxation. Experiments on real datasets demonstrate the superiority of our proposal compared to the state-of-the-art approaches.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
Referring to Proposition 13.
 
Literatur
7.
Zurück zum Zitat Barga, R.S., Goldstein, J., Ali, M.H., Hong, M.: Consistent streaming through time: a vision for event stream processing. In: CIDR 2007, Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, 7–10 Jan 2007, Online Proceedings, pp. 363–374 (2007) Barga, R.S., Goldstein, J., Ali, M.H., Hong, M.: Consistent streaming through time: a vision for event stream processing. In: CIDR 2007, Third Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, 7–10 Jan 2007, Online Proceedings, pp. 363–374 (2007)
8.
Zurück zum Zitat Bentley, J.L.: Multidimensional binary search trees used for associative searching. Commun. ACM 18(9), 509–517 (1975)CrossRef Bentley, J.L.: Multidimensional binary search trees used for associative searching. Commun. ACM 18(9), 509–517 (1975)CrossRef
9.
Zurück zum Zitat Bohannon, P., Flaster, M., Fan, W., Rastogi, R.: A cost-based model and effective heuristic for repairing constraints by value modification. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland, USA, 14–16 June 2005, pp. 143–154 (2005) Bohannon, P., Flaster, M., Fan, W., Rastogi, R.: A cost-based model and effective heuristic for repairing constraints by value modification. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland, USA, 14–16 June 2005, pp. 143–154 (2005)
10.
Zurück zum Zitat Cheng, D., Bahadori, M.T., Liu, Y.: FBLG: a simple and effective approach for temporal dependence discovery from time series data. In: Macskassy, S.A., Perlich, C., Leskovec, J., Wang, W., Ghani, R. (eds.) The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, New York, NY, USA, 24–27 Aug 2014, pp. 382–391. ACM (2014) Cheng, D., Bahadori, M.T., Liu, Y.: FBLG: a simple and effective approach for temporal dependence discovery from time series data. In: Macskassy, S.A., Perlich, C., Leskovec, J., Wang, W., Ghani, R. (eds.) The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, New York, NY, USA, 24–27 Aug 2014, pp. 382–391. ACM (2014)
11.
Zurück zum Zitat Chomicki, J., Marcinkowski, J.: On the computational complexity of minimal-change integrity maintenance in relational databases. In: Inconsistency Tolerance [Result from a Dagstuhl Seminar], pp. 119–150 (2005) Chomicki, J., Marcinkowski, J.: On the computational complexity of minimal-change integrity maintenance in relational databases. In: Inconsistency Tolerance [Result from a Dagstuhl Seminar], pp. 119–150 (2005)
12.
Zurück zum Zitat Chu, X., Ilyas, I.F., Papotti, P.: Discovering denial constraints. PVLDB 6(13), 1498–1509 (2013) Chu, X., Ilyas, I.F., Papotti, P.: Discovering denial constraints. PVLDB 6(13), 1498–1509 (2013)
13.
Zurück zum Zitat Chu, X., Ilyas, I.F., Papotti, P.: Holistic data cleaning: putting violations into context. In: 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, 8–12 April 2013, pp. 458–469 (2013) Chu, X., Ilyas, I.F., Papotti, P.: Holistic data cleaning: putting violations into context. In: 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, 8–12 April 2013, pp. 458–469 (2013)
14.
15.
Zurück zum Zitat Ding, L., Chen, S., Rundensteiner, E.A., Tatemura, J., Hsiung, W., Candan, K.S.: Runtime semantic query optimization for event stream processing. In: Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, 7–12 April 2008, Cancún, Mexico, pp. 676–685 (2008) Ding, L., Chen, S., Rundensteiner, E.A., Tatemura, J., Hsiung, W., Candan, K.S.: Runtime semantic query optimization for event stream processing. In: Proceedings of the 24th International Conference on Data Engineering, ICDE 2008, 7–12 April 2008, Cancún, Mexico, pp. 676–685 (2008)
16.
Zurück zum Zitat Duan, L., Pang, T., Nummenmaa, J., Zuo, J., Zhang, P., Tang, C.: Bus-OLAP: a data management model for non-on-time events query over bus journey data. Data Sci. Eng. 3(1), 52–67 (2018)CrossRef Duan, L., Pang, T., Nummenmaa, J., Zuo, J., Zhang, P., Tang, C.: Bus-OLAP: a data management model for non-on-time events query over bus journey data. Data Sci. Eng. 3(1), 52–67 (2018)CrossRef
17.
Zurück zum Zitat Dyreson, C.E., Snodgrass, R.T.: Supporting valid-time indeterminacy. ACM Trans. Database Syst. 23(1), 1–57 (1998)CrossRef Dyreson, C.E., Snodgrass, R.T.: Supporting valid-time indeterminacy. ACM Trans. Database Syst. 23(1), 1–57 (1998)CrossRef
18.
Zurück zum Zitat Fan, W.: Dependencies revisited for improving data quality. In: Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2008, 9–11 June 2008, Vancouver, BC, Canada, pp. 159–170 (2008) Fan, W.: Dependencies revisited for improving data quality. In: Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2008, 9–11 June 2008, Vancouver, BC, Canada, pp. 159–170 (2008)
19.
Zurück zum Zitat Fan, W.: Constraint-driven database repair, 2nd edn. In: Encyclopedia of Database Systems (2018) Fan, W.: Constraint-driven database repair, 2nd edn. In: Encyclopedia of Database Systems (2018)
20.
Zurück zum Zitat Jin, T., Wang, J., Wen, L.: Efficiently querying business process models with beehivez. In: Proceedings of the Demo Track of the Ninth Conference on Business Process Management 2011, Clermont-Ferrand, France, August 31st, 2011 (2011) Jin, T., Wang, J., Wen, L.: Efficiently querying business process models with beehivez. In: Proceedings of the Demo Track of the Ninth Conference on Business Process Management 2011, Clermont-Ferrand, France, August 31st, 2011 (2011)
21.
Zurück zum Zitat Karp, R.M.: Reducibility among combinatorial problems. In: Proceedings of a symposium on the Complexity of Computer Computations, Held 20–22 March 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, pp. 85–103 (1972) Karp, R.M.: Reducibility among combinatorial problems. In: Proceedings of a symposium on the Complexity of Computer Computations, Held 20–22 March 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, pp. 85–103 (1972)
22.
Zurück zum Zitat Rogge-Solti, A., Mans, R., van der Aalst, W.M.P., Weske, M.: Improving documentation by repairing event logs. In: The Practice of Enterprise Modeling—6th IFIP WG 8.1 Working Conference, PoEM 2013, Riga, Latvia, 6–7 Nov 2013, Proceedings, pp. 129–144 (2013) Rogge-Solti, A., Mans, R., van der Aalst, W.M.P., Weske, M.: Improving documentation by repairing event logs. In: The Practice of Enterprise Modeling—6th IFIP WG 8.1 Working Conference, PoEM 2013, Riga, Latvia, 6–7 Nov 2013, Proceedings, pp. 129–144 (2013)
23.
Zurück zum Zitat Song, S., Cao, Y., Wang, J.: Cleaning timestamps with temporal constraints. PVLDB 9(10), 708–719 (2016) Song, S., Cao, Y., Wang, J.: Cleaning timestamps with temporal constraints. PVLDB 9(10), 708–719 (2016)
24.
Zurück zum Zitat Song, S., Zhang, A., Wang, J., Yu, P.S.: SCREEN: stream data cleaning under speed constraints. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31–June 4, 2015, pp. 827–841 (2015) Song, S., Zhang, A., Wang, J., Yu, P.S.: SCREEN: stream data cleaning under speed constraints. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31–June 4, 2015, pp. 827–841 (2015)
25.
Zurück zum Zitat Sun, P., Liu, Z., Davidson, S.B., Chen, Y.: Detecting and resolving unsound workflow views for correct provenance analysis. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29–July 2, 2009, pp. 549–562 (2009) Sun, P., Liu, Z., Davidson, S.B., Chen, Y.: Detecting and resolving unsound workflow views for correct provenance analysis. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2009, Providence, Rhode Island, USA, June 29–July 2, 2009, pp. 549–562 (2009)
26.
Zurück zum Zitat Tang, L., Li, T., Shwartz, L.: Discovering lag intervals for temporal dependencies. In: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12, Beijing, China, 12–16 Aug 2012, pp. 633–641 (2012) Tang, L., Li, T., Shwartz, L.: Discovering lag intervals for temporal dependencies. In: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12, Beijing, China, 12–16 Aug 2012, pp. 633–641 (2012)
27.
Zurück zum Zitat Yakout, M., Berti-Équille, L., Elmagarmid, A.K.: Don’t be scared: use scalable automatic repairing with maximal likelihood and bounded changes. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, 22–27 June 2013, pp. 553–564 (2013) Yakout, M., Berti-Équille, L., Elmagarmid, A.K.: Don’t be scared: use scalable automatic repairing with maximal likelihood and bounded changes. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, 22–27 June 2013, pp. 553–564 (2013)
28.
Zurück zum Zitat Zhang, H., Diao, Y., Immerman, N.: Recognizing patterns in streams with imprecise timestamps. PVLDB 3(1), 244–255 (2010) Zhang, H., Diao, Y., Immerman, N.: Recognizing patterns in streams with imprecise timestamps. PVLDB 3(1), 244–255 (2010)
Metadaten
Titel
Cleaning timestamps with temporal constraints
verfasst von
Shaoxu Song
Ruihong Huang
Yue Cao
Jianmin Wang
Publikationsdatum
23.02.2021
Verlag
Springer Berlin Heidelberg
Erschienen in
The VLDB Journal / Ausgabe 3/2021
Print ISSN: 1066-8888
Elektronische ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-020-00641-6

Weitere Artikel der Ausgabe 3/2021

The VLDB Journal 3/2021 Zur Ausgabe