Published in: World Wide Web 6/2019

11 April 2018

Efficient time-interval data extraction in MVCC-based RDBMS

Authors: Haixiang Li, Zhanhao Zhao, Yijian Cheng, Wei Lu, Xiaoyong Du, Anqun Pan


Abstract

Account reconciliation is a core business process in banks and game companies. It regularly checks each user's account balance against the bank or expense statement and reports the daily, weekly, or monthly balance. Once an account imbalance occurs, the transactions that may have corrupted the account balance must be traced efficiently. To support this tracing, in this paper we investigate efficient time-interval data extraction in MVCC-based RDBMS, i.e., extracting the incremental data that are valid within a given time interval. To this end, we propose a snapshot-based method that extracts incremental data by exploiting the fact that each record is inherently associated with a lifetime, which indicates whether the record can be accessed for a given time interval. We describe how to integrate our method into MySQL, an open-source RDBMS, and propose a declarative way to fetch the incremental data. We also propose several optimization techniques to boost extraction performance. Extensive experiments on the standardized Sysbench benchmark show that the proposed method is robust and efficient.
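The lifetime idea in the abstract can be illustrated with a minimal sketch. This is not the authors' MySQL implementation: the `RecordVersion` type, the integer timestamps, and the reading of "valid within an interval" as "lifetime overlaps the interval" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class RecordVersion:
    """Hypothetical record version; field names are illustrative only."""
    key: str
    value: int
    begin_ts: int                 # time at which this version became visible
    end_ts: Optional[int] = None  # time at which it was superseded; None = still live


def overlaps(version: RecordVersion, start: int, end: int) -> bool:
    """True if the lifetime [begin_ts, end_ts) intersects the query interval [start, end]."""
    created_in_time = version.begin_ts <= end
    still_alive = version.end_ts is None or version.end_ts > start
    return created_in_time and still_alive


def extract_interval(versions: Iterable[RecordVersion], start: int, end: int) -> List[RecordVersion]:
    """Return every version whose lifetime overlaps [start, end] (assumed semantics)."""
    if start > end:
        raise ValueError("start must not exceed end")
    return [v for v in versions if overlaps(v, start, end)]


# Toy version history for two accounts.
history = [
    RecordVersion("acct-1", 100, begin_ts=1, end_ts=5),
    RecordVersion("acct-1", 80, begin_ts=5, end_ts=9),
    RecordVersion("acct-1", 120, begin_ts=9),
    RecordVersion("acct-2", 50, begin_ts=2, end_ts=3),
]

result = extract_interval(history, start=4, end=8)
for v in result:
    print(v.key, v.value, v.begin_ts, v.end_ts)
```

For the interval from 4 to 8, only the first two versions of `acct-1` are returned; the version created at time 9 and the `acct-2` version that ended at time 3 fall outside it. A real MVCC engine would read lifetimes from transaction metadata rather than from an in-memory list.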


Metadata
Title
Efficient time-interval data extraction in MVCC-based RDBMS
Authors
Haixiang Li
Zhanhao Zhao
Yijian Cheng
Wei Lu
Xiaoyong Du
Anqun Pan
Publication date
11 April 2018
Publisher
Springer US
Published in
World Wide Web / Issue 6/2019
Print ISSN: 1386-145X
Electronic ISSN: 1573-1413
DOI
https://doi.org/10.1007/s11280-018-0552-7
