
2017 | OriginalPaper | Book chapter

Fast Log Replication in Highly Available Data Store

Authors: Donghui Wang, Peng Cai, Weining Qian, Aoying Zhou, Tianze Pang, Jing Jiang

Published in: Web and Big Data

Publisher: Springer International Publishing


Abstract

Modern large-scale data stores widely adopt consensus protocols to achieve high availability and throughput. The recently proposed Raft algorithm is easier to understand and has been implemented in a large number of open-source projects. In these consensus algorithms, including Raft, log replication is a common and frequently used operation that has a significant impact on system performance. In particular, since the commit latency is capped by the slowest follower among the majority of followers that responded to the leader, it is important to design a fast scheme for follower nodes to process replicated logs. Based on an analysis of how a follower node handles received log entries in the Raft algorithm, we identify the main factors influencing the time from when the follower receives a log entry to when it acknowledges to the leader that the entry was received. Based on these factors, we propose an effective log replication scheme, referred to as Raft with Fast Followers (FRaft), that optimizes the process of flushing logs to disk and replaying them. Finally, we compare the performance of Raft and FRaft using the YCSB benchmark and the Sysbench test tool; the experimental results demonstrate that FRaft has lower latency and higher throughput than a Raft implementation that uses only straightforward pipelining and batching optimizations for log replication.
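To make the point about commit latency concrete, the following is a minimal sketch, not taken from the chapter, of how the slowest follower within the acknowledging majority bounds the time to commit. The function name `commit_latency` and the millisecond figures are hypothetical, and the sketch assumes the leader has already persisted the entry itself and counts toward the majority.

```python
def commit_latency(follower_ack_ms):
    """Time until a majority of the cluster (leader included) holds the entry.

    follower_ack_ms: hypothetical per-follower delays, in milliseconds, between
    the leader sending an entry and receiving that follower's acknowledgement.
    Assumption: the leader's own write is not the bottleneck.
    """
    cluster_size = len(follower_ack_ms) + 1   # followers plus the leader
    majority = cluster_size // 2 + 1          # smallest strict majority
    acks_needed = majority - 1                # the leader counts as one vote
    if acks_needed == 0:
        return 0.0                            # single-node cluster: no follower needed
    # The entry commits when the acks_needed-th fastest follower has answered.
    return sorted(follower_ack_ms)[acks_needed - 1]


# Five-node cluster: two follower acks are needed, so the second-fastest
# follower (9 ms) determines the commit latency.
print(commit_latency([5, 20, 9, 40]))
```

Under these assumptions, shortening the time a follower spends between receiving an entry and acknowledging it lowers the commit latency directly, which is the part of the path the abstract says FRaft targets.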


Metadata
Title
Fast Log Replication in Highly Available Data Store
Authors
Donghui Wang
Peng Cai
Weining Qian
Aoying Zhou
Tianze Pang
Jing Jiang
Copyright year
2017
DOI
https://doi.org/10.1007/978-3-319-63564-4_20