Skip to main content
Top
Published in: The VLDB Journal 3/2017

22-02-2017 | Regular Paper

Disjoint interval partitioning

Authors: Francesco Cafagna, Michael H. Böhlen

Published in: The VLDB Journal | Issue 3/2017

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

In databases with time interval attributes, query processing techniques that are based on sort-merge or sort-aggregate deteriorate. This happens because for intervals no total order exists and either the start or end point is used for the sorting. Doing so leads to inefficient solutions with lots of unproductive comparisons that do not produce an output tuple. Even if just one tuple with a long interval is present in the data, the number of unproductive comparisons of sort-merge and sort-aggregate gets quadratic. In this paper we propose disjoint interval partitioning (\(\mathcal {DIP}\)), a technique to efficiently perform sort-based operators on interval data. \(\mathcal {DIP}\) divides an input relation into the minimum number of partitions, such that all tuples in a partition are non-overlapping. The absence of overlapping tuples guarantees efficient sort-merge computations without backtracking. With \(\mathcal {DIP}\) the number of unproductive comparisons is linear in the number of partitions. In contrast to current solutions with inefficient random accesses to the active tuples, \(\mathcal {DIP}\) fetches the tuples in a partition sequentially. We illustrate the generality and efficiency of \(\mathcal {DIP}\) by describing and evaluating three basic database operators over interval data: join, anti-join and aggregation.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
1
Our implementation applies standard DBMS optimization techniques, such as projecting on the attributes required for the aggregation (i.e., \(T\) and \(A\)), so that the full outer join result includes only the needed attributes.
 
Literature
1.
go back to reference Date, C., Darwen, H., Lorentzos, N.: Temporal data and the relational model, pp. 77–86. Morgan Kaufmann Publishers, Burlington (2003)CrossRef Date, C., Darwen, H., Lorentzos, N.: Temporal data and the relational model, pp. 77–86. Morgan Kaufmann Publishers, Burlington (2003)CrossRef
3.
go back to reference Graefe, G.: Query evaluation techniques for large databases. ACM Comput. Surv. 25(2), 73–169 (1993)CrossRef Graefe, G.: Query evaluation techniques for large databases. ACM Comput. Surv. 25(2), 73–169 (1993)CrossRef
4.
go back to reference Li, W., Gao, D., Snodgrass, R.T.: Skew handling techniques in sort-merge join. In: SIGMOD, pp. 169–180. (2002) Li, W., Gao, D., Snodgrass, R.T.: Skew handling techniques in sort-merge join. In: SIGMOD, pp. 169–180. (2002)
5.
go back to reference Chawda, B., Gupta, H., Negi, S., Faruquie, T.A., Subramaniam, L.V., Mohania, M.K.: Processing interval joins on map-reduce. In: EDBT, pp. 463–474. (2014) Chawda, B., Gupta, H., Negi, S., Faruquie, T.A., Subramaniam, L.V., Mohania, M.K.: Processing interval joins on map-reduce. In: EDBT, pp. 463–474. (2014)
6.
go back to reference Soo, M., Snodgrass, R., Jensen, C.S.: Efficient evaluation of the valid-time natural join. In: ICDE, pp. 282–292. (1994) Soo, M., Snodgrass, R., Jensen, C.S.: Efficient evaluation of the valid-time natural join. In: ICDE, pp. 282–292. (1994)
7.
go back to reference Dignös, A., Böhlen, M.H., Gamper, J.: Overlap interval partition join. In: SIGMOD, pp. 1459–1470. (2014) Dignös, A., Böhlen, M.H., Gamper, J.: Overlap interval partition join. In: SIGMOD, pp. 1459–1470. (2014)
8.
go back to reference Leung, T.Y.C., Muntz, R.R.: Temporal query processing and optimization in multiprocessor database machines. In: VLDB, pp. 383–394. (1992) Leung, T.Y.C., Muntz, R.R.: Temporal query processing and optimization in multiprocessor database machines. In: VLDB, pp. 383–394. (1992)
9.
go back to reference Kaufmann, M., Manjili, A.A., Vagenas, P., Fischer, P.M., Kossmann, D., Färber, F., May, N.: Timeline index: a unified data structure for processing queries on temporal data in sap hana. In: SIGMOD, pp. 1173–1184. (2013) Kaufmann, M., Manjili, A.A., Vagenas, P., Fischer, P.M., Kossmann, D., Färber, F., May, N.: Timeline index: a unified data structure for processing queries on temporal data in sap hana. In: SIGMOD, pp. 1173–1184. (2013)
10.
go back to reference Arge, L., Procopiuc, O., Ramaswamy, S., Suel, T., Vitter, J.S.: Scalable sweeping-based spatial join. In: VLDB, pp. 570–581. (1998) Arge, L., Procopiuc, O., Ramaswamy, S., Suel, T., Vitter, J.S.: Scalable sweeping-based spatial join. In: VLDB, pp. 570–581. (1998)
11.
go back to reference Piatov, D., Helmer, S., Dignös, A.: An interval join optimized for modern hardware. In ICDE, pp. 1098–1109. (2016) Piatov, D., Helmer, S., Dignös, A.: An interval join optimized for modern hardware. In ICDE, pp. 1098–1109. (2016)
12.
go back to reference Segev, A., Gunadhi, H.: Event-join optimization in temporal relational databases. In: VLDB, pp. 205–215. (1989) Segev, A., Gunadhi, H.: Event-join optimization in temporal relational databases. In: VLDB, pp. 205–215. (1989)
13.
go back to reference Böhlen, M., Gamper, J., Jensen, C.S.: Multi-dimensional aggregation for temporal data. In: EDBT, pp. 257–275. (2006) Böhlen, M., Gamper, J., Jensen, C.S.: Multi-dimensional aggregation for temporal data. In: EDBT, pp. 257–275. (2006)
14.
go back to reference Lopez, I.F.V., Snodgrass, R.T., Moon, B.: Spatiotemporal aggregate computation: a survey. TKDE 17(2), 271–286 (2005) Lopez, I.F.V., Snodgrass, R.T., Moon, B.: Spatiotemporal aggregate computation: a survey. TKDE 17(2), 271–286 (2005)
15.
go back to reference Dignös, A., Böhlen, M.H., Gamper, J.: Temporal alignment. In: SIGMOD, pp. 433–444. (2012) Dignös, A., Böhlen, M.H., Gamper, J.: Temporal alignment. In: SIGMOD, pp. 433–444. (2012)
16.
go back to reference Agesen, M., Böhlen, M.H., Poulsen, L., Torp, K.: A split operator for now-relative bitemporal databases. In: ICDE, pp. 41–50. (2001) Agesen, M., Böhlen, M.H., Poulsen, L., Torp, K.: A split operator for now-relative bitemporal databases. In: ICDE, pp. 41–50. (2001)
17.
go back to reference Enderle, J., Hampel, M., Seidl, T.: Joining interval data in relational databases. In: SIGMOD, pp. 683–694. (2004) Enderle, J., Hampel, M., Seidl, T.: Joining interval data in relational databases. In: SIGMOD, pp. 683–694. (2004)
18.
go back to reference Kriegel, H.-P., Pötke, M., Seidl, T.: Managing intervals efficiently in object-relational databases. In: VLDB, pp. 407–418. (2000) Kriegel, H.-P., Pötke, M., Seidl, T.: Managing intervals efficiently in object-relational databases. In: VLDB, pp. 407–418. (2000)
19.
go back to reference Stroustrup, B.: Software development for infrastructure. IEEE Comput. 45(1), 47–58 (2012)CrossRef Stroustrup, B.: Software development for infrastructure. IEEE Comput. 45(1), 47–58 (2012)CrossRef
20.
go back to reference Kline, N., Snodgrass, R.: Computing temporal aggregates. In: ICDE, pp. 222–231. (1995) Kline, N., Snodgrass, R.: Computing temporal aggregates. In: ICDE, pp. 222–231. (1995)
21.
go back to reference Moon, B., Lopez, I.F.V., Immanuel, V.: Efficient algorithms for large-scale temporal aggregation. TKDE 15(3), 744–759 (2003) Moon, B., Lopez, I.F.V., Immanuel, V.: Efficient algorithms for large-scale temporal aggregation. TKDE 15(3), 744–759 (2003)
22.
go back to reference Yang, J., Widom, J.: Incremental computation and maintenance of temporal aggregates. VLDB J. 12(3), 262–283 (2003)CrossRef Yang, J., Widom, J.: Incremental computation and maintenance of temporal aggregates. VLDB J. 12(3), 262–283 (2003)CrossRef
23.
go back to reference Gray, J., Chaudhuri, S., Bosworth, A., Layman, A., Reichart, D., Venkatrao, M., Pellow, F., Pirahesh, H.: Data cube: a relational aggregation operator generalizing group-by, cross-tab, and sub totals. Data Min. Knowl. Discov. 1(1), 29–53 (1997)CrossRef Gray, J., Chaudhuri, S., Bosworth, A., Layman, A., Reichart, D., Venkatrao, M., Pellow, F., Pirahesh, H.: Data cube: a relational aggregation operator generalizing group-by, cross-tab, and sub totals. Data Min. Knowl. Discov. 1(1), 29–53 (1997)CrossRef
24.
go back to reference Böhlen, M.H., Jensen, C.S., Snodgrass, R.T.: Temporal statement modifiers. ACM Trans. Database Syst. 25(4), 407–456 (2000)CrossRefMATH Böhlen, M.H., Jensen, C.S., Snodgrass, R.T.: Temporal statement modifiers. ACM Trans. Database Syst. 25(4), 407–456 (2000)CrossRefMATH
25.
go back to reference Snodgrass, R.T. (ed.): The TSQL2 temporal query language. Kluwer, Dordrecht (1995) Snodgrass, R.T. (ed.): The TSQL2 temporal query language. Kluwer, Dordrecht (1995)
26.
go back to reference Taliun, A., Böhlen, M., Bracher, A., Cafagna, F.: A gis-based data analysis platform for analyzing the time-varying quality of animal feed and its impact on the environment. In: iEMSs, pp. 1447–1454. (2012) Taliun, A., Böhlen, M., Bracher, A., Cafagna, F.: A gis-based data analysis platform for analyzing the time-varying quality of animal feed and its impact on the environment. In: iEMSs, pp. 1447–1454. (2012)
27.
go back to reference Kleinberg, J., Tardos, E.: Algorithm Design. Addison-Wesley Longman Publishing Co., Inc, Boston (2005) Kleinberg, J., Tardos, E.: Algorithm Design. Addison-Wesley Longman Publishing Co., Inc, Boston (2005)
28.
go back to reference Gunadhi, H., Segev, A.: Query processing algorithms for temporal intersection joins. In: ICDE, pp. 336–344. (1991) Gunadhi, H., Segev, A.: Query processing algorithms for temporal intersection joins. In: ICDE, pp. 336–344. (1991)
30.
go back to reference Isella, L., Stehlé, J., Barrat, A., Cattuto, C., Pinton, J., Van den Broeck, W.: What’s in a crowd? Analysis of face-to-face behavioral networks. J. Theor. Biol. 271(1), 166–180 (2011)MathSciNetCrossRef Isella, L., Stehlé, J., Barrat, A., Cattuto, C., Pinton, J., Van den Broeck, W.: What’s in a crowd? Analysis of face-to-face behavioral networks. J. Theor. Biol. 271(1), 166–180 (2011)MathSciNetCrossRef
31.
go back to reference Monacchi, A., Egarter, D., Elmenreich, W., D’Alessandro, S., Tonello, A.M.: GREEND: an energy consumption dataset of households in Italy and Austria. In: SmartGridComm, pp. 511–516. (2014) Monacchi, A., Egarter, D., Elmenreich, W., D’Alessandro, S., Tonello, A.M.: GREEND: an energy consumption dataset of households in Italy and Austria. In: SmartGridComm, pp. 511–516. (2014)
Metadata
Title
Disjoint interval partitioning
Authors
Francesco Cafagna
Michael H. Böhlen
Publication date
22-02-2017
Publisher
Springer Berlin Heidelberg
Published in
The VLDB Journal / Issue 3/2017
Print ISSN: 1066-8888
Electronic ISSN: 0949-877X
DOI
https://doi.org/10.1007/s00778-017-0456-7

Other articles of this Issue 3/2017

The VLDB Journal 3/2017 Go to the issue

Premium Partner