research-article

Photon: fault-tolerant and scalable joining of continuous data streams

Authors:
Rajagopal Ananthanarayanan

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Venkatesh Basker

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Sumit Das

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Ashish Gupta

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Haifeng Jiang

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Tianhao Qiu

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Alexey Reznichenko

Max Planck Institute for Software Systems, Kaiserslautern, Germany

Max Planck Institute for Software Systems, Kaiserslautern, Germany
View Profile

,
Deomid Ryabkov

Google Inc., Dublin, Ireland

Google Inc., Dublin, Ireland
View Profile

,
Manpreet Singh

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

,
Shivakumar Venkataraman

Google Inc., Mountain View, CA, USA

Google Inc., Mountain View, CA, USA
View Profile

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of DataJune 2013Pages 577–588https://doi.org/10.1145/2463676.2465272

Published:22 June 2013Publication History

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data

Pages 577–588

ABSTRACT

Web-based enterprises process events generated by millions of users interacting with their websites. Rich statistical data distilled from combining such interactions in near real-time generates enormous business value. In this paper, we describe the architecture of Photon, a geographically distributed system for joining multiple continuously flowing streams of data in real-time with high scalability and low latency, where the streams may be unordered or delayed. The system fully tolerates infrastructure degradation and datacenter-level outages without any manual intervention. Photon guarantees that there will be no duplicates in the joined output (at-most-once semantics) at any point in time, that most joinable events will be present in the output in real-time (near-exact semantics), and exactly-once semantics eventually.

Photon is deployed within Google Advertising System to join data streams such as web search queries and user clicks on advertisements. It produces joined logs that are used to derive key business metrics, including billing for advertisers. Our production deployment processes millions of events per minute at peak with an average end-to-end latency of less than 10 seconds. We also present challenges and solutions in maintaining large persistent state across geographically distant locations, and highlight the design principles that emerged from our experience.

References

D. J. Abadi et al. "The Design of the Borealis Stream Processing Engine". Proc. of CIDR 2005, pp.277--289.Google Scholar
J. Baker et al. "Megastore: Providing scalable, highly available storage for interactive devices". Proc. of CIDR 2011, pp.223--234.Google Scholar
S. Blanas et al. "A comparison of join algorithms for log processing in Mapreduce". Proc. of SIGMOD 2010, pp.975--986. Google ScholarDigital Library
T. D. Chandra, R. Griesemer, and J. Redstone. "Paxos made live: an engineering perspective". Proc. of ACM PODC 2007, pp.398--407. Google ScholarDigital Library
S. Chandrasekaran and M. J. Franklin. "Streaming queries over streaming data". Proc. of VLDB 2002, pp.203--214. Google ScholarDigital Library
F. Chang et al. "Bigtable: A Distributed Storage System for Structured Data". ACM TOCS 2008, 26.2, pp.4:1-4:26. Google ScholarDigital Library
E. F. Codd. "A Relational Model of Data for Large Shared Data Banks", Communications of the ACM 13 (6): p377--387, 1970. Google ScholarDigital Library
J. C. Corbett et al. "Spanner: Google's Globally-Distributed Database". Proc. of OSDI 2012. Google ScholarDigital Library
A. Das, J. Gehrke, and M. Riedewald. "Approximate join processing over data streams". Proc. of SIGMOD 2003, pp.40--51. Google ScholarDigital Library
J. Dean and S. Ghemawat. "MapReduce: Simplified data processing on large clusters". Proc. of OSDI 2004, pp.137--149. Google ScholarDigital Library
G. DeCandia et al. "Dynamo: Amazon's Highly Available Key-value Store". Proc. of SOSP. 2007, pp. 205--220. Google ScholarDigital Library
B. Fitzpatrick. "Distributed Caching with Memcached". Linux Journal, Issue 124, 2004, pp.5. Google ScholarDigital Library
B. Gedik, P. S. Yu, and R. R. Bordawekar. "Executing stream joins on the cell processor", Proc. of VLDB 2007, pp.363--374. Google ScholarDigital Library
S. Ghemawat, H. Gobioff, and S-T Leung. "The Google File System". 19th Symposium on Operating Systems Principles 2003, pp.20--43. Google ScholarDigital Library
M. A. Hammad, W. G. Aref, and A. K. Elmagarmid. "Joining multiple data streams with window constraints". Computer Science Technical Reports, #02-115.Google Scholar
J. Kang, J. F. Naughton, and S. D. Viglas. "Evaluating window joins over unbounded streams". Proc. of VLDB 2002, pp.341--352.Google Scholar
D. Karger et al. "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web". Proc. of ACM SOTC 1997, pp.654--663. Google ScholarDigital Library
L. Lamport. "The part-time parliament", ACM TOCS 16.2 1998, pp.133--169. Google ScholarDigital Library
P. Mishra and M. H. Eich. "Join processing in relational databases". ACM Computing Surveys 1992, 24(1), pp.63--113. Google ScholarDigital Library
L. Neumeyer et al. "S4: Distributed Stream Computing Platform". Proc. of KDCloud 2010. Google ScholarDigital Library
J. Rao, E. J. Shekita, and S. Tata. "Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore". Proc. of VLDB 2011, pp.243--254. Google ScholarDigital Library
F. B. Schneider. "Implementing fault-tolerant services using the state machine approach: A tutorial", ACM Computing Surveys 22 1990, pp.299--319 (1990). Google ScholarDigital Library
D. Shasha and P. Bonnet. "Database Tuning: Principles, Experiments, and Troubleshooting Techniques". Proc. of SIGMOD 2004, pp.115--116. Google ScholarDigital Library
J. Teubner and R. Mueller. "How soccer players would do stream joins". Proc. of SIGMOD 2011, pp.625--636. Google ScholarDigital Library
J. Xie and J. Yang. "A survey of join processing in data streams". Data Streams - Models and Algorithms 2007, pp.209--236.Google Scholar
M. Zaharia et al. "Discretized Streams: An Efficient and Fault-tolerant Model for Stream Processing on Large Clusters". Proc. of HotCloud 2012, pp.10--10. Google ScholarDigital Library

Index Terms

Photon: fault-tolerant and scalable joining of continuous data streams
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Parallel and distributed DBMSs
  2. World Wide Web
    1. Web applications
    2. Web services

Recommendations

A load shedding framework for XML stream joins
DEXA'10: Proceedings of the 21st international conference on Database and expert systems applications: Part I

Joining data streams using various types of windows is an established method of stream processing. The limitation of window size due to memory constraint takes a heavy toll on the accuracy of the query result. Through this paper, we propose a unique ...
Read More
Weighted Reservoir Sampling from Distributed Streams
PODS '19: Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems

We consider message-efficient continuous random sampling from a distributed stream, where the probability of inclusion of an item in the sample is proportional to a weight associated with the item. The unweighted version, where all weights are equal, is ...
Read More
Dcell: a scalable and fault-tolerant network structure for data centers

A fundamental challenge in data center networking is how to efficiently interconnect an exponentially increasing number of servers. This paper presents DCell, a novel network structure that has many desirable features for data center networking. DCell ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
June 2013
1322 pages
ISBN:9781450320375
DOI:10.1145/2463676
General Chairs:
Kenneth Ross
Columbia University
,
Divesh Srivastava
AT&T Research
,
Program Chair:
Dimitris Papadias
HKUST
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 June 2013
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
continuous streams
fault-tolerance
paxos
stream joining
Qualifiers
- research-article
Conference

Acceptance Rates
SIGMOD '13 Paper Acceptance Rate76of372submissions,20%Overall Acceptance Rate785of4,003submissions,20%
More
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 110
  Total Citations
  View Citations
- 1,064
  Total Downloads
- Downloads (Last 12 months)33
- Downloads (Last 6 weeks)2
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Photon: fault-tolerant and scalable joining of continuous data streams

SIGMOD '13: Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data

ABSTRACT

References

Cited By

Index Terms

Recommendations

A load shedding framework for XML stream joins

Weighted Reservoir Sampling from Distributed Streams

Dcell: a scalable and fault-tolerant network structure for data centers