ABSTRACT
Low latency is critical for interactive networked applications. But while we know how to scale systems to increase capacity, reducing latency --- especially the tail of the latency distribution --- can be much more difficult. In this paper, we argue that the use of redundancy is an effective way to convert extra capacity into reduced latency. By initiating redundant operations across diverse resources and using the first result which completes, redundancy improves a system's latency even under exceptional conditions. We study the tradeoff with added system utilization, characterizing the situations in which replicating all tasks reduces mean latency. We then demonstrate empirically that replicating all operations can result in significant mean and tail latency reduction in real-world systems including DNS queries, database servers, and packet forwarding within networks.
- M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat. Hedera: dynamic ow scheduling for data center networks. In Proceedings of the 7th USENIX conference on Networked systems design and implementation, NSDI'10, pages 19--19, Berkeley, CA, USA, 2010. USENIX Association. Google ScholarDigital Library
- M. Alizadeh, A. Greenberg, D. A. Maltz, J. Padhye, P. Patel, B. Prabhakar, S. Sengupta, and M. Sridharan. Data center TCP (DCTCP). In SIGCOMM, 2010. Google ScholarDigital Library
- M. Alizadeh, S. Yang, S. Katti, N. McKeown, B. Prabhakar, and S. Shenker. Deconstructing datacenter packet transport. In Proceedings of the 11th ACM Workshop on Hot Topics in Networks, HotNets-XI, pages 133--138, New York, NY, USA, 2012. ACM. Google ScholarDigital Library
- G. Ananthanarayanan, A. Ghodsi, S. Shenker, and I. Stoica. Why let resources idle? Aggressive cloning of jobs with Dolly. In USENIX HotCloud, 2012. Google ScholarDigital Library
- D. G. Andersen, H. Balakrishnan, M. F. Kaashoek, and R. N. Rao. Improving web availability for clients with MONET. In USENIX NSDI, pages 115--128, Berkeley, CA, USA, 2005. USENIX Association. Google ScholarDigital Library
- S. Asmussen. Applied Probability and Queues. Wiley, 1987.Google Scholar
- D. Beaver, S. Kumar, H. C. Li, J. Sobel, and P. Vajgel. Finding a needle in haystack: facebook's photo storage. In Proceedings of the 9th USENIX conference on Operating systems design and implementation, OSDI'10, pages 1--8, Berkeley, CA, USA, 2010. USENIX Association. Google ScholarDigital Library
- T. Benson, A. Akella, and D. A. Maltz. Network traffic characteristics of data centers in the wild. In IMC, pages 267--280, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
- J. Brutlag. Speed matters for Google web search, June 2009. http://services.google.com/fh/files/blogs/google_delayexp.pdf.Google Scholar
- Apache Cassandra. http://cassandra.apache.org.Google Scholar
- E. W. Chan, X. Luo, W. Li, W. W. Fok, and R. K. Chang. Measurement of loss pairs in network paths. In IMC, pages 88--101, New York, NY, USA, 2010. ACM. Google ScholarDigital Library
- J. Chu. Tuning TCP parameters for the 21st century. http://www.ietf.org/proceedings/75/slides/tcpm-1.pdf, July 2009.Google Scholar
- J. Dean and L. A. Barroso. The tail at scale. Commun. ACM, 56(2):74--80, Feb. 2013. Google ScholarDigital Library
- P. Dixon. Shopzilla site redesign -- we get what we measure, June 2009. http://www.slideshare.net/shopzilla/shopzillas-you-get-what-you-measure-velocity-2009.Google Scholar
- C. C. Foster and E. M. Riseman. Percolation of code to enhance parallel dispatching and execution. IEEE Trans. Comput., 21(12):1411--1415, Dec. 1972. Google ScholarDigital Library
- Google AppEngine datastore: memcached cache. https://developers.google.com/appengine/docs/python/memcache/usingmemcache#Pattern.Google Scholar
- W. Gray and D. Boehm-Davis. Milliseconds matter: An introduction to microstrategies and to their use in describing and predicting interactive behavior. Journal of Experimental Psychology: Applied, 6(4):322, 2000.Google ScholarCross Ref
- A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim, P. Lahiri, D. A. Maltz, P. Patel, and S. Sengupta. VL2: a scalable and exible data center network. In ACM SIGCOMM, pages 51--62, New York, NY, USA, 2009. ACM. Google ScholarDigital Library
- D. Han, A. Anand, A. Akella, and S. Seshan. RPT: re-architecting loss protection for content-aware networks. In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, NSDI'12, pages 6--6, Berkeley, CA, USA, 2012. USENIX Association. Google ScholarDigital Library
- C. Hopps. Computing TCP's retransmission timer (RFC 6298), 2000.Google Scholar
- S. Jain, M. Demmer, R. Patra, and K. Fall. Using redundancy to cope with failures in a delay tolerant network. In ACM SIGCOMM, 2005. Google ScholarDigital Library
- J. Li, J. Stribling, R. Morris, and M. Kaashoek. Bandwidth-efficient management of DHT routing tables. In NSDI, 2005. Google ScholarDigital Library
- D. S. Myers and M. K. Vernon. Estimating queue length distributions for queues with random arrivals. SIGMETRICS Perform. Eval. Rev., 40(3):77--79, Jan. 2012. Google ScholarDigital Library
- M. Olvera-Cravioto, J. Blanchet, and P. Glynn. On the transition from heavy-traffic to heavy-tails for the m/g/1 queue: The regularly varying case. Annals of Applied Probability, 21:645--668, 2011.Google ScholarCross Ref
- S. Ramachandran. Web metrics: Size and number of resources, May 2010. https://developers.google.com/speed/articles/web-metrics.Google Scholar
- K. Sigman. Appendix: A primer on heavy-tailed distributions. Queueing Systems, 33(1-3):261--275, 1999. Google ScholarDigital Library
- E. Soljanin. Reducing delay with coding in (mobile) multi-agent information transfer. In Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on, pages 1428--1433. IEEE, 2010.Google ScholarCross Ref
- S. Souders. Velocity and the bottom line. http://radar.oreilly.com/2009/07/velocity-making-your-site-fast.html.Google Scholar
- A. Vulimiri, P. B. Godfrey, and S. Shenker. A cost-benefit analysis of low latency via added utilization, June 2013. http://web.engr.illinois.edu/~vulimir1/benchmark.pdf.Google Scholar
- A. Vulimiri, O. Michel, P. B. Godfrey, and S. Shenker. More is less: Reducing latency via redundancy. In Eleventh ACM Workshop on Hot Topics in Networks (HotNets-XI), October 2012. Google ScholarDigital Library
- X. Wu and X. Yang. Dard: Distributed adaptive routing for datacenter networks. In Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, ICDCS '12, pages 32--41, Washington, DC, USA, 2012. IEEE Computer Society. Google ScholarDigital Library
- M. Zaharia, A. Konwinski, A. D. Joseph, R. Katz, and I. Stoica. Improving MapReduce performance in heterogeneous environments. In USENIX OSDI, pages 29--42, Berkeley, CA, USA, 2008. Google ScholarDigital Library
- A. P. Zwart. Queueing Systems With Heavy Tails. PhD thesis, Technische Universiteit Eindhoven, September 2001.Google Scholar
Index Terms
- Low latency via redundancy
Recommendations
Fine-grained latency and loss measurements in the presence of reordering
SIGMETRICS '11: Proceedings of the ACM SIGMETRICS joint international conference on Measurement and modeling of computer systemsModern trading and cluster applications require microsecond latencies and almost no losses in data centers. This paper introduces an algorithm called FineComb that can estimate fine-grain end-to-end loss and latency measurements between edge routers in ...
Fine-grained latency and loss measurements in the presence of reordering
Performance evaluation reviewModern trading and cluster applications require microsecond latencies and almost no losses in data centers. This paper introduces an algorithm called FineComb that can estimate fine-grain end-to-end loss and latency measurements between edge routers in ...
Performance analysis of an ATM switch with multiple paths
ICNP '95: Proceedings of the 1995 International Conference on Network ProtocolsAn ATM switch based on multistage interconnection networks with multiple paths can support higher bandwidth than that of buffered networks with single path by passing multiple packets to the same destination simultaneously. These multiple packets are ...
Comments