Published in: SICS Software-Intensive Cyber-Physical Systems 3-4/2017

3 November 2016 | Special Issue Paper

NoSQL database systems: a survey and decision guidance

Authors: Felix Gessert, Wolfram Wingerath, Steffen Friedrich, Norbert Ritter

Abstract

Today, data is generated and consumed at unprecedented scale. This has led to novel approaches for scalable data management, subsumed under the term “NoSQL” database systems, to handle the ever-increasing data volume and request loads. However, the heterogeneity and diversity of the numerous existing systems impede the well-informed selection of a data store appropriate for a given application context. Therefore, this article gives a top-down overview of the field: instead of contrasting the implementation specifics of individual representatives, we propose a comparative classification model that relates functional and non-functional requirements to techniques and algorithms employed in NoSQL databases. This NoSQL Toolbox allows us to derive a simple decision tree to help practitioners and researchers filter potential system candidates based on central application requirements.

Footnotes
1
JavaScript Object Notation (JSON) is a standard format consisting of nested attribute-value pairs and lists.
 
2
In some systems (e.g. Bigtable and HBase), multi-versioning is implemented by adding a timestamp as a third-level key.
 
3
ACID [23]: Atomicity, Consistency, Isolation, Durability.
 
4
BASE [42]: Basically Available, Soft-state, Eventually consistent.
 
5
The FLP theorem states that in a distributed system with asynchronous message delivery, no algorithm can guarantee to reach consensus among the participating nodes if one or more of them can fail by stopping.
 
6
A read/write register is a data structure with only two operations: setting a specific value (set) and returning the latest value that was set (get).
 
7
Therefore, consensus as used for coordination in many NoSQL systems either natively [4] or through coordination services like Chubby and Zookeeper [28] is even harder to achieve with high availability than strong consistency [19].
 
8
Low-end hardware is used because it is substantially more cost-efficient than high-end hardware [27, Sect. 3.1].
 
9
Currently only RethinkDB can perform general \(\theta \)-joins. MongoDB’s aggregation framework supports left-outer equi-joins, and CouchDB allows joins for pre-declared map-reduce views.
 
10
Generalized data processing pipelines are an alternative to MapReduce: the database tries to optimize the flow of data and the locality of computation based on a more declarative query language (e.g. MongoDB’s aggregation framework).
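
Footnote 2 mentions multi-versioning through a timestamp used as a third-level key. The following is a minimal Python sketch of that idea, assuming a purely in-memory nested map (row key, then column key, then timestamp). The class name `VersionedTable` and its `put`/`get` methods are hypothetical and chosen only for illustration; they are not the Bigtable or HBase API.

```python
from collections import defaultdict


class VersionedTable:
    """Toy wide-column table: row key -> column key -> timestamp -> value.

    Hypothetical illustration only; real systems persist and compact versions.
    """

    def __init__(self):
        # Nested maps are created lazily on the first write to a row/column.
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp):
        # A write adds a new version instead of overwriting the old one.
        self._rows[row][column][timestamp] = value

    def get(self, row, column, as_of=None):
        # Return the newest version, or the newest one not later than as_of.
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [ts for ts in versions if as_of is None or ts <= as_of]
        if not candidates:
            return None
        return versions[max(candidates)]


table = VersionedTable()
table.put("user:1", "name", "Ann", timestamp=1)
table.put("user:1", "name", "Anna", timestamp=5)

latest = table.get("user:1", "name")            # newest version
earlier = table.get("user:1", "name", as_of=3)  # version visible at time 3
```

Keeping old versions under their own timestamp key is what allows reads "as of" an earlier point in time; the sketch omits garbage collection of old versions, which a real store would need.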
 
References
1. Abadi D (2012) Consistency tradeoffs in modern distributed database system design: CAP is only part of the story. Computer 45(2):37–42
2. Attiya H, Bar-Noy A et al (1995) Sharing memory robustly in message-passing systems. JACM 42(1)
3. Bailis P, Kingsbury K (2014) The network is reliable. Commun ACM 57(9):48–55
4. Baker J, Bond C, Corbett JC et al (2011) Megastore: providing scalable, highly available storage for interactive services. In: CIDR, pp 223–234
5. Bernstein PA, Cseri I, Dani N et al (2011) Adapting Microsoft SQL Server for cloud computing. In: 27th ICDE, IEEE, pp 1255–1263
6. Boykin O, Ritchie S, O’Connell I, Lin J (2014) Summingbird: a framework for integrating batch and online MapReduce computations. VLDB 7(13)
7. Brewer EA (2000) Towards robust distributed systems
8. Calder B, Wang J, Ogus A et al (2011) Windows Azure Storage: a highly available cloud storage service with strong consistency. In: 23rd SOSP. ACM
9. Chang F, Dean J, Ghemawat S et al (2006) Bigtable: a distributed storage system for structured data. In: 7th OSDI, USENIX Association, pp 15–15
10. Charron-Bost B, Pedone F, Schiper A (2010) Replication: theory and practice. Lecture notes in computer science, vol 5959. Springer
11. Cooper BF, Ramakrishnan R, Srivastava U et al (2008) PNUTS: Yahoo!’s hosted data serving platform. Proc VLDB Endow 1(2):1277–1288
12. Corbett JC, Dean J, Epstein M et al (2012) Spanner: Google’s globally-distributed database. In: Proceedings of OSDI, USENIX Association, pp 251–264
13. Curino C, Jones E, Popa RA et al (2011) Relational Cloud: a database service for the cloud. In: 5th CIDR
14. Das S, Agrawal D, El Abbadi A et al (2010) G-Store: a scalable data store for transactional multi key access in the cloud. In: 1st SoCC, ACM, pp 163–174
15. Davidson SB, Garcia-Molina H, Skeen D et al (1985) Consistency in a partitioned network: a survey. SUR 17(3):341–370
16. Dean J (2009) Designs, lessons and advice from building large distributed systems. Keynote talk at LADIS 2009
17. Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Commun ACM 51(1)
18. DeCandia G, Hastorun D et al (2007) Dynamo: Amazon’s highly available key-value store. In: 21st SOSP, ACM, pp 205–220
19. Fischer MJ, Lynch NA, Paterson MS (1985) Impossibility of distributed consensus with one faulty process. J ACM 32(2):374–382
20. Gessert F, Schaarschmidt M, Wingerath W, Friedrich S, Ritter N (2015) The cache sketch: revisiting expiration-based caching in the age of cloud data management. In: BTW, pp 53–72
21. Gilbert S, Lynch N (2002) Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33(2):51–59
22. Gray J, Helland P (1996) The dangers of replication and a solution. SIGMOD Rec 25(2):173–182
23. Haerder T, Reuter A (1983) Principles of transaction-oriented database recovery. ACM Comput Surv 15(4):287–317
24. Hamilton J (2007) On designing and deploying internet-scale services. In: 21st LISA. USENIX Association
25. Hellerstein JM, Stonebraker M, Hamilton J (2007) Architecture of a database system. Now Publishers Inc
26. Herlihy MP, Wing JM (1990) Linearizability: a correctness condition for concurrent objects. TOPLAS 12
27. Hoelzle U, Barroso LA (2009) The datacenter as a computer: an introduction to the design of warehouse-scale machines. Morgan and Claypool Publishers
28. Hunt P, Konar M, Junqueira FP, Reed B (2010) Zookeeper: wait-free coordination for internet-scale systems. In: USENIXATC. USENIX Association
29. Kallman R, Kimura H, Natkins J et al (2008) H-Store: a high-performance, distributed main memory transaction processing system. VLDB Endowment
30. Karger D, Lehman E, Leighton T et al (1997) Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the world wide web. In: 29th STOC, ACM
31. Kleppmann M (2016) Designing data-intensive applications. O’Reilly, to appear
32. Kraska T, Pang G, Franklin MJ et al (2013) MDCC: multi-data center consistency. In: 8th EuroSys, ACM
33. Kreps J (2014) Questioning the lambda architecture. Accessed: 17 Dec 2015
34. Lakshman A, Malik P (2010) Cassandra: a decentralized structured storage system. SIGOPS Oper Syst Rev 44(2):35–40
35. Laney D (2001) 3D data management: controlling data volume, velocity, and variety. Tech. rep, META Group
36. Lloyd W, Freedman MJ, Kaminsky M et al (2011) Don’t settle for eventual: scalable causal consistency for wide-area storage with COPS. In: 23rd SOSP. ACM
37. Mahajan P, Alvisi L, Dahlin M et al (2011) Consistency, availability, and convergence. University of Texas at Austin Tech Report 11
38. Mao Y, Junqueira FP, Marzullo K (2008) Mencius: building efficient replicated state machines for WANs. OSDI 8:369–384
39. Marz N, Warren J (2015) Big data: principles and best practices of scalable realtime data systems. Manning Publications Co
40. Min C, Kim K, Cho H et al (2012) SFS: random write considered harmful in solid state drives. In: FAST
41. Özsu MT, Valduriez P (2011) Principles of distributed database systems. Springer Science & Business Media
42.
43. Qiao L, Surlaker K, Das S et al (2013) On brewing fresh Espresso: LinkedIn’s distributed data serving platform. In: SIGMOD, ACM, pp 1135–1146
44. Sadalage PJ, Fowler M (2013) NoSQL distilled: a brief guide to the emerging world of polyglot persistence. Addison-Wesley, Upper Saddle River
45. Shapiro M, Preguica N, Baquero C et al (2011) A comprehensive study of convergent and commutative replicated data types. Ph.D. thesis, INRIA
46. Shukla D, Thota S, Raman K et al (2015) Schema-agnostic indexing with Azure DocumentDB. PVLDB 8(12)
47. Sovran Y, Power R, Aguilera MK, Li J (2011) Transactional storage for geo-replicated systems. In: 23rd SOSP, ACM, pp 385–400
48. Stonebraker M, Madden S, Abadi DJ et al (2007) The end of an architectural era: (it’s time for a complete rewrite). In: 33rd VLDB, pp 1150–1160
49. Wiese L et al (2015) Advanced data management: for SQL, NoSQL, cloud and distributed databases. Walter de Gruyter GmbH & Co KG
50. Zhang H, Chen G et al (2015) In-memory big data management and processing: a survey. TKDE
Metadata
Title
NoSQL database systems: a survey and decision guidance
Authors
Felix Gessert
Wolfram Wingerath
Steffen Friedrich
Norbert Ritter
Publication date
3 November 2016
Publisher
Springer Berlin Heidelberg
Published in
SICS Software-Intensive Cyber-Physical Systems / Issue 3-4/2017
Print ISSN: 2524-8510
Electronic ISSN: 2524-8529
DOI
https://doi.org/10.1007/s00450-016-0334-3
