ABSTRACT
We introduce Tink, a library for distributed temporal graph analytics. Reasoning about the temporal aspects of graph-structured data is an increasingly important part of analytics. For example, in a communication network, time plays a fundamental role in how information propagates. Whereas existing tools for temporal graph analysis are built as standalone systems, Tink is a library in the Apache Flink ecosystem and thereby leverages Flink's mature features, such as distributed processing and query optimization. Furthermore, because data can be processed and cleaned within Flink itself, little effort is needed to prepare data, and no separate tools are required before analysis. Tink focuses on interval graphs, in which every edge is associated with a starting time and an ending time. The library provides facilities for temporal graph creation and maintenance, as well as standard temporal graph measures and algorithms. It is also designed for ease of use and extensibility.
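To make the interval-graph model concrete, the following is a minimal single-machine sketch in plain Python. It does not use Tink's or Flink's API: the `IntervalEdge` class, the `earliest_arrival` function, and the traversal semantics (an edge can be taken only if its source is reached by the edge's starting time, and it delivers at its ending time) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class IntervalEdge:
    """Directed edge active from `start` to `end` (hypothetical layout, not Tink's API)."""
    src: str
    dst: str
    start: int
    end: int


def earliest_arrival(edges, source, t0=0):
    """Earliest time each reachable vertex can be reached from `source`.

    Assumed semantics: an edge is usable only if we are at `src` no later
    than `start`, and it delivers us to `dst` at time `end`.
    """
    arrival = {source: t0}
    changed = True
    while changed:  # keep relaxing edges until no arrival time improves
        changed = False
        for e in edges:
            reachable_in_time = arrival.get(e.src, inf) <= e.start
            if reachable_in_time and e.end < arrival.get(e.dst, inf):
                arrival[e.dst] = e.end
                changed = True
    return arrival


edges = [
    IntervalEdge("a", "b", 1, 3),
    IntervalEdge("b", "c", 2, 4),  # starts before b can be reached, so unusable
    IntervalEdge("b", "c", 5, 6),
    IntervalEdge("a", "c", 7, 9),
]
print(earliest_arrival(edges, "a"))  # {'a': 0, 'b': 3, 'c': 6}
```

The example shows why time matters for propagation: the edge from b to c over [2, 4] exists in the static graph but cannot carry information that only reaches b at time 3. The repeated relaxation loop is chosen for simplicity, not efficiency; a distributed library would be expected to use a different strategy.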
Index Terms
- Tink: A Temporal Graph Analytics Library for Apache Flink