DOI: 10.1145/2980523.2980527
research-article

Analyzing extended property graphs with Apache Flink

Published: 01 July 2016

ABSTRACT

Graphs are an intuitive way to model complex relationships between real-world data objects, so graph analytics plays an important role in research and industry. Because graphs often reflect heterogeneous domain data, representing them requires an expressive data model that includes the abstraction of graph collections, for example to analyze communities inside a social network. Furthermore, answering complex analytical questions about such graphs entails combining multiple analytical operations. To satisfy these requirements, we propose the Extended Property Graph Model, which is semantically rich, schema-free and supports multiple distinct graphs. Based on this representation, the model provides declarative and combinable operators to analyze both single graphs and graph collections. Our current implementation is based on the distributed dataflow framework Apache Flink. We present the results of a first experimental study showing the scalability of our implementation on social network data with up to 11 billion edges.
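To make the ideas in the abstract concrete, the following is a minimal in-memory sketch of a schema-free property graph with labeled, attributed vertices and edges, several logical graphs, and two combinable operators. It is an illustration only and not the paper's Flink-based implementation: the class names (`Vertex`, `Edge`, `LogicalGraph`), the functions `subgraph` and `select`, and the sample data are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Vertex:
    id: int
    label: str                                   # type label, e.g. "Person"
    properties: Dict[str, Any] = field(default_factory=dict)  # schema-free key/value pairs


@dataclass
class Edge:
    id: int
    label: str
    source: int                                  # id of the source vertex
    target: int                                  # id of the target vertex
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class LogicalGraph:
    id: int
    label: str
    vertices: List[Vertex]
    edges: List[Edge]
    properties: Dict[str, Any] = field(default_factory=dict)


def subgraph(graph: LogicalGraph,
             vertex_pred: Callable[[Vertex], bool],
             edge_pred: Callable[[Edge], bool],
             new_id: int) -> LogicalGraph:
    """Single-graph operator (hypothetical): keep matching vertices and edges.

    Edges whose endpoints were filtered out are dropped so the result
    is a valid graph.
    """
    vertices = [v for v in graph.vertices if vertex_pred(v)]
    kept = {v.id for v in vertices}
    edges = [e for e in graph.edges
             if edge_pred(e) and e.source in kept and e.target in kept]
    return LogicalGraph(new_id, graph.label, vertices, edges, dict(graph.properties))


def select(collection: List[LogicalGraph],
           graph_pred: Callable[[LogicalGraph], bool]) -> List[LogicalGraph]:
    """Collection operator (hypothetical): keep graphs satisfying a predicate."""
    return [g for g in collection if graph_pred(g)]


# Sample data (invented for illustration).
alice = Vertex(1, "Person", {"name": "Alice"})
bob = Vertex(2, "Person", {"name": "Bob", "city": "Leipzig"})  # extra property, no schema needed
forum = Vertex(3, "Forum", {"title": "Graphs"})
knows = Edge(10, "knows", 1, 2)
member = Edge(11, "memberOf", 1, 3)
network = LogicalGraph(100, "Network", [alice, bob, forum], [knows, member])

# Operators compose: the output of one is valid input to the next.
people = subgraph(network, lambda v: v.label == "Person", lambda e: True, new_id=101)
small = select([network, people], lambda g: len(g.vertices) <= 2)
```

Because every operator takes graphs or graph collections and returns graphs or graph collections, results can be chained, which is the sense in which the abstract calls the operators combinable. A distributed implementation would replace the Python lists with partitioned datasets.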


      • Published in

        NDA '16: Proceedings of the 1st ACM SIGMOD Workshop on Network Data Analytics
        July 2016
        40 pages
        ISBN: 9781450345132
        DOI: 10.1145/2980523

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Acceptance Rates

        NDA '16 Paper Acceptance Rate: 4 of 8 submissions, 50%. Overall Acceptance Rate: 4 of 8 submissions, 50%.
