Skip to main content
Top

2018 | OriginalPaper | Chapter

Distributed In-Memory Analytics for Big Temporal Data

Authors : Bin Yao, Wei Zhang, Zhi-Jie Wang, Zhongpu Chen, Shuo Shang, Kai Zheng, Minyi Guo

Published in: Database Systems for Advanced Applications

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The temporal data is ubiquitous, and massive amount of temporal data is generated nowadays. Management of big temporal data is important yet challenging. Processing big temporal data using a distributed system is a desired choice. However, existing distributed systems/methods either cannot support native queries, or are disk-based solutions, which could not well satisfy the requirements of high throughput and low latency. To alleviate this issue, this paper proposes an In-memory based Two-level Index Solution in Spark (ITISS) for processing big temporal data. The framework of our system is easy to understand and implement, but without loss of efficiency. We conduct extensive experiments to verify the performance of our solution. Experimental results based on both real and synthetic datasets consistently demonstrate that our solution is efficient and competitive.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
4.
go back to reference Ahn, I., Snodgrass, R.: Performance evaluation of a temporal database management system. In: SIGMOD (1986)CrossRef Ahn, I., Snodgrass, R.: Performance evaluation of a temporal database management system. In: SIGMOD (1986)CrossRef
5.
go back to reference Becker, B., Gschwind, S., Ohler, T., Seeger, B., Widmayer, B.: An asymptotically optimal multiversion B-tree. VLDBJ (1996) Becker, B., Gschwind, S., Ohler, T., Seeger, B., Widmayer, B.: An asymptotically optimal multiversion B-tree. VLDBJ (1996)
6.
go back to reference Bettini, C., Wang, X.S., Bertino, E., Jajodia, S.: Semantic assumptions and query evaluation in temporal databases. In: SIGMOD (1995) Bettini, C., Wang, X.S., Bertino, E., Jajodia, S.: Semantic assumptions and query evaluation in temporal databases. In: SIGMOD (1995)
7.
go back to reference Bliujute, R., Jensen, C.S., Saltenis, S., Slivinskas, G.: R-tree based indexing of now-relative bitemporal data. In: VLDB (1998) Bliujute, R., Jensen, C.S., Saltenis, S., Slivinskas, G.: R-tree based indexing of now-relative bitemporal data. In: VLDB (1998)
8.
go back to reference Böhlen, M., Gamper, J., Jensen, C.S.: Multi-dimensional aggregation for temporal data. In: Ioannidis, Y., Scholl, M.H., Schmidt, J.W., Matthes, F., Hatzopoulos, M., Boehm, K., Kemper, A., Grust, T., Boehm, C. (eds.) EDBT 2006. LNCS, vol. 3896, pp. 257–275. Springer, Heidelberg (2006). https://doi.org/10.1007/11687238_18CrossRef Böhlen, M., Gamper, J., Jensen, C.S.: Multi-dimensional aggregation for temporal data. In: Ioannidis, Y., Scholl, M.H., Schmidt, J.W., Matthes, F., Hatzopoulos, M., Boehm, K., Kemper, A., Grust, T., Boehm, C. (eds.) EDBT 2006. LNCS, vol. 3896, pp. 257–275. Springer, Heidelberg (2006). https://​doi.​org/​10.​1007/​11687238_​18CrossRef
9.
go back to reference Chandramouli, B., Goldstein, J., Duan, S.: Temporal analytics on big data for web advertising. In: ICDE (2012) Chandramouli, B., Goldstein, J., Duan, S.: Temporal analytics on big data for web advertising. In: ICDE (2012)
11.
go back to reference Elmasri, R., Wuu, G.T., Kim, Y.J.: The time index: an access structure for temporal data. In: VLDB (1990) Elmasri, R., Wuu, G.T., Kim, Y.J.: The time index: an access structure for temporal data. In: VLDB (1990)
12.
go back to reference Färber, F., et al.: The SAP HANA database-an architecture overview. IEEE Data Eng. Bull. (2012) Färber, F., et al.: The SAP HANA database-an architecture overview. IEEE Data Eng. Bull. (2012)
13.
go back to reference Gao, D., Jensen, S., Snodgrass, R.T., Soo, D.: Join operations in temporal databases. VLDBJ (2005) Gao, D., Jensen, S., Snodgrass, R.T., Soo, D.: Join operations in temporal databases. VLDBJ (2005)
14.
go back to reference Gendrano, J.A.G., Huang, B.C., Rodrigue, J.M., Moon, B., Snodgrass, R.T., Parallel algorithms for computing temporal aggregates. In: ICDE (1999) Gendrano, J.A.G., Huang, B.C., Rodrigue, J.M., Moon, B., Snodgrass, R.T., Parallel algorithms for computing temporal aggregates. In: ICDE (1999)
15.
go back to reference Gollapudi, S., Sivakumar, D.: Framework and algorithms for trend analysis in massive temporal data sets. In: CIKM (2004) Gollapudi, S., Sivakumar, D.: Framework and algorithms for trend analysis in massive temporal data sets. In: CIKM (2004)
16.
go back to reference Günnemann, S., Kremer, H., Laufkötter, C., Seidl, T.: Tracing evolving subspace clusters in temporal climate data. DMKD 24, 387–410 (2012)MathSciNet Günnemann, S., Kremer, H., Laufkötter, C., Seidl, T.: Tracing evolving subspace clusters in temporal climate data. DMKD 24, 387–410 (2012)MathSciNet
17.
go back to reference Gupta, M., Gao, J., Aggarwal, C.C., Han, J.: Outlier detection for temporal data: a survey. TKDE (2014) Gupta, M., Gao, J., Aggarwal, C.C., Han, J.: Outlier detection for temporal data: a survey. TKDE (2014)
18.
go back to reference Jensen, C.S., Snodgrass, R.T.: Temporal data management. TKDE (1999) Jensen, C.S., Snodgrass, R.T.: Temporal data management. TKDE (1999)
19.
go back to reference Kaufmann, M., Fischer, P.M., May, N., Ge, C., Goel, A.K., Kossmann, D.: Bi-temporal timeline index: a data structure for processing queries on bi-temporal data. In: ICDE (2015) Kaufmann, M., Fischer, P.M., May, N., Ge, C., Goel, A.K., Kossmann, D.: Bi-temporal timeline index: a data structure for processing queries on bi-temporal data. In: ICDE (2015)
20.
go back to reference Kaufmann, M., Manjili, A.A., Vagenas, P., Fischer, P.M., Kossmann, D., Färber, F., May, N.: Timeline index: a unified data structure for processing queries on temporal data in SAP HANA. In: SIGMOD (2013) Kaufmann, M., Manjili, A.A., Vagenas, P., Fischer, P.M., Kossmann, D., Färber, F., May, N.: Timeline index: a unified data structure for processing queries on temporal data in SAP HANA. In: SIGMOD (2013)
21.
go back to reference Kline, N., Snodgrass, R.T.: Computing temporal aggregates. In: ICDE (1995) Kline, N., Snodgrass, R.T.: Computing temporal aggregates. In: ICDE (1995)
22.
go back to reference Kollios, G., Tsotras, V.J.: Hashing methods for temporal data. TKDE (2002) Kollios, G., Tsotras, V.J.: Hashing methods for temporal data. TKDE (2002)
23.
go back to reference Le, W., Li, F., Tao, Y., Christensen, R.: Optimal splitters for temporal and multi-version databases. In: SIGMOD (2013) Le, W., Li, F., Tao, Y., Christensen, R.: Optimal splitters for temporal and multi-version databases. In: SIGMOD (2013)
25.
go back to reference Leung, T.C., Muntz, R.R.: Temporal query processing and optimization in multiprocessor database machines. In: VLDB (1992) Leung, T.C., Muntz, R.R.: Temporal query processing and optimization in multiprocessor database machines. In: VLDB (1992)
26.
go back to reference Li, F., Yi, K., Le, W.: Top-k queries on temporal data. VLDBJ (2010) Li, F., Yi, K., Le, W.: Top-k queries on temporal data. VLDBJ (2010)
28.
go back to reference Lomet, D., et al.: Transaction time support inside a database engine. In: ICDE (2006) Lomet, D., et al.: Transaction time support inside a database engine. In: ICDE (2006)
30.
go back to reference Roddick, J.F., Spiliopoulou, M.: A survey of temporal knowledge discovery paradigms and methods. TKDE (2002) Roddick, J.F., Spiliopoulou, M.: A survey of temporal knowledge discovery paradigms and methods. TKDE (2002)
31.
go back to reference Saracco, C.M., et al.: A matter of time: temporal data management in DB2 10. Technical report, IBM (2012) Saracco, C.M., et al.: A matter of time: temporal data management in DB2 10. Technical report, IBM (2012)
32.
go back to reference Wang, P., Zhang, P., Zhou, C., Li, Z., Yang, H.: Hierarchical evolving Dirichlet processes for modeling nonlinear evolutionary traces in temporal data. DMKD 31, 32–64 (2017)MathSciNetMATH Wang, P., Zhang, P., Zhou, C., Li, Z., Yang, H.: Hierarchical evolving Dirichlet processes for modeling nonlinear evolutionary traces in temporal data. DMKD 31, 32–64 (2017)MathSciNetMATH
33.
go back to reference Wang, X.S., Jajodia, S., Subrahmanian, V.: Temporal modules: an approach toward federated temporal databases. In: SIGMOD (1993) Wang, X.S., Jajodia, S., Subrahmanian, V.: Temporal modules: an approach toward federated temporal databases. In: SIGMOD (1993)
34.
go back to reference Xie, D., Li, F., Yao, B., Li, G., Zhou, L., Guo, M.: Simba: efficient in-memory spatial analytics. In: SIGMOD (2016) Xie, D., Li, F., Yao, B., Li, G., Zhou, L., Guo, M.: Simba: efficient in-memory spatial analytics. In: SIGMOD (2016)
35.
go back to reference Yang, J., Widom, J.: Incremental computation and maintenance of temporal aggregates. In: ICDE (2001) Yang, J., Widom, J.: Incremental computation and maintenance of temporal aggregates. In: ICDE (2001)
36.
go back to reference Yang, Y., Chen, K.: Temporal data clustering via weighted clustering ensemble with different representations. TKDE (2011) Yang, Y., Chen, K.: Temporal data clustering via weighted clustering ensemble with different representations. TKDE (2011)
37.
go back to reference Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: NSDI (2012) Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Stoica, I.: Resilient distributed datasets: a fault-tolerant abstraction for in-memory cluster computing. In: NSDI (2012)
38.
go back to reference Zhang, D., Markowetz, A., Tsotras, V.J., Gunopulos, D., Seeger, B.: On computing temporal aggregates with range predicates. TODS (2008) Zhang, D., Markowetz, A., Tsotras, V.J., Gunopulos, D., Seeger, B.: On computing temporal aggregates with range predicates. TODS (2008)
39.
go back to reference Zhang, S., Yang, Y., Fan, W., Lan, L., Yuan, M.: OceanRT: real-time analytics over large temporal data. In: SIGMOD (2014) Zhang, S., Yang, Y., Fan, W., Lan, L., Yuan, M.: OceanRT: real-time analytics over large temporal data. In: SIGMOD (2014)
Metadata
Title
Distributed In-Memory Analytics for Big Temporal Data
Authors
Bin Yao
Wei Zhang
Zhi-Jie Wang
Zhongpu Chen
Shuo Shang
Kai Zheng
Minyi Guo
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-91452-7_36

Premium Partner