Published in: The Journal of Supercomputing 10/2020

11 April 2018

Techniques and guidelines for effective migration from RDBMS to NoSQL

Authors: Ho-Jun Kim, Eun-Jeong Ko, Young-Ho Jeon, Ki-Hoon Lee

Abstract

Migration from RDBMS to NoSQL has become an important topic in the big data era. This paper provides comprehensive techniques and guidelines for effective migration from RDBMS to NoSQL. We discuss the challenges faced in translating SQL queries; the effects of denormalization, column families, secondary indexes, join algorithms, and column name length; and decision support for the migration. We focus on HBase, a column-oriented NoSQL database, because it is widely used by Internet enterprises such as Facebook, Twitter, and LinkedIn. Because HBase does not support SQL, we use Apache Phoenix as an SQL layer on top of HBase. Experimental results using TPC-H show that column-level denormalization with atomicity and grouping columns into column families significantly improve query performance; the use of secondary indexes on foreign keys is not as effective as in RDBMSs; the query optimizer of Phoenix is not very sophisticated; shortened column names significantly reduce the database size and improve query performance; and an SVM classifier can predict whether migration improves query performance. Important open problems in NoSQL research are supporting complex SQL queries, automatic index selection, and optimizing SQL queries for NoSQL.
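
The effect of column name length on database size can be illustrated with a rough back-of-the-envelope sketch. It is not the paper's measurement method; it assumes the uncompressed HBase KeyValue layout, in which every cell stores its row key, column family, and column qualifier alongside the value, and it ignores compression, data block encoding, tags, and any column-name mapping done by the SQL layer. The column names are taken from the TPC-H `lineitem` table; the short names `EP` and `DI`, the row key, the one-byte family name, and the helper functions are hypothetical.

```python
# Fixed bytes per cell in the uncompressed KeyValue layout (assumption stated above):
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1).
FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def cell_size(row_key: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Approximate on-disk size of one cell before compression or encoding."""
    return FIXED_OVERHEAD + len(row_key) + len(family) + len(qualifier) + len(value)


def row_size(row_key: bytes, family: bytes, columns: dict) -> int:
    """Sum the cell sizes of one logical row; each column becomes its own cell."""
    return sum(cell_size(row_key, family, q, v) for q, v in columns.items())


row_key = b"00000001"  # hypothetical 8-byte row key
family = b"0"          # hypothetical one-byte column family name

long_names = {b"L_EXTENDEDPRICE": b"12345.67", b"L_DISCOUNT": b"0.05"}
short_names = {b"EP": b"12345.67", b"DI": b"0.05"}

long_bytes = row_size(row_key, family, long_names)
short_bytes = row_size(row_key, family, short_names)

print(f"long names:  {long_bytes} bytes per row")
print(f"short names: {short_bytes} bytes per row")
print(f"saved:       {long_bytes - short_bytes} bytes per row")
```

Because the qualifier is repeated in every cell of every row, the per-row saving under these assumptions is simply the total number of characters removed from the column names, multiplied across all rows in the table.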

Metadata
Title
Techniques and guidelines for effective migration from RDBMS to NoSQL
Authors
Ho-Jun Kim
Eun-Jeong Ko
Young-Ho Jeon
Ki-Hoon Lee
Publication date
11 April 2018
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 10/2020
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-2361-2
