DOI: 10.1145/2463676.2465272
research-article

Photon: fault-tolerant and scalable joining of continuous data streams

Published: 22 June 2013

ABSTRACT

Web-based enterprises process events generated by millions of users interacting with their websites. Rich statistical data distilled from combining such interactions in near real-time generates enormous business value. In this paper, we describe the architecture of Photon, a geographically distributed system for joining multiple continuously flowing streams of data in real-time with high scalability and low latency, where the streams may be unordered or delayed. The system fully tolerates infrastructure degradation and datacenter-level outages without any manual intervention. Photon guarantees that there will be no duplicates in the joined output (at-most-once semantics) at any point in time, that most joinable events will be present in the output in real-time (near-exact semantics), and exactly-once semantics eventually.

Photon is deployed within Google Advertising System to join data streams such as web search queries and user clicks on advertisements. It produces joined logs that are used to derive key business metrics, including billing for advertisers. Our production deployment processes millions of events per minute at peak with an average end-to-end latency of less than 10 seconds. We also present challenges and solutions in maintaining large persistent state across geographically distant locations, and highlight the design principles that emerged from our experience.
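The guarantees stated in the abstract can be illustrated with a toy, single-process sketch. It is not Photon's architecture, which the abstract describes as geographically distributed with large persistent state. The class `StreamJoiner`, its methods, and the in-memory `emitted` set are all hypothetical names chosen for illustration. The sketch joins a click stream to a query stream, buffers clicks whose query has not arrived yet (unordered or delayed streams), and records every joined click ID so that retries never produce a second output row (no duplicates at any point in time).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class StreamJoiner:
    """Toy in-memory join of clicks to queries; all names are hypothetical."""
    queries: Dict[str, str] = field(default_factory=dict)   # query_id -> query payload
    pending: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)  # clicks awaiting a query
    emitted: Set[str] = field(default_factory=set)          # click_ids already joined
    output: List[Tuple[str, str, str]] = field(default_factory=list)

    def on_query(self, query_id: str, payload: str) -> None:
        """Store the query, then join any clicks that arrived before it."""
        self.queries[query_id] = payload
        for click_id, click_payload in self.pending.pop(query_id, []):
            self._emit(click_id, query_id, click_payload)

    def on_click(self, click_id: str, query_id: str, payload: str) -> bool:
        """Join the click if possible; return True only if a row was written."""
        if click_id in self.emitted:
            return False  # duplicate delivery: already in the output
        if query_id not in self.queries:
            # The query is delayed: buffer the click and retry when it arrives.
            self.pending.setdefault(query_id, []).append((click_id, payload))
            return False
        return self._emit(click_id, query_id, payload)

    def _emit(self, click_id: str, query_id: str, click_payload: str) -> bool:
        """Write one joined row, guarded by the set of emitted click IDs."""
        if click_id in self.emitted:
            return False
        self.emitted.add(click_id)
        self.output.append((click_id, self.queries[query_id], click_payload))
        return True
```

In this sketch a click that arrives before its query is missing from the output for a while and appears once the query shows up, which mirrors the distinction the abstract draws between near-exact output in real time and exactly-once output eventually. A real deployment would need the `emitted` set to be durable and shared across workers, which is where the challenges of persistent state mentioned above come in.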

Published in

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
June 2013
1322 pages
ISBN: 9781450320375
DOI: 10.1145/2463676

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

SIGMOD '13 Paper Acceptance Rate: 76 of 372 submissions, 20%
Overall Acceptance Rate: 785 of 4,003 submissions, 20%