Skip to main content

2017 | OriginalPaper | Buchkapitel

Efficient Compression on Real World Directed Graphs

verfasst von : Guohua Li, Weixiong Rao, Zhongxiao Jin

Erschienen in: Web and Big Data

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Real world graphs typically exhibit power law degrees, and many of them are directed graphs. The growing scale of such graphs has made efficient execution of graph computation very challenging. Reducing graph size to fit in memory, for example by using the technique of lossless compression, is crucial in cutting the cost of large scale graph computation. Unfortunately, literature work on graph compression still suffers from issues including low compression ration and high decompression overhead. To address the above issue, in this paper, we propose a novel compression approach. The basic idea of our graph compression is to first cluster graph adjacency matrix via graph structure information, and then represent the clustered matrix by lists of encoded numbers. Our extensive evaluation on real data sets validates the tradeoff between space cost saving and graph computation time, and verifies the advantages of our work over state of art SlashBurn [1, 2] and Ligra+ [3].

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Kang, U., Faloutsos, C.: Beyond ‘caveman communities’: hubs and spokes for graph compression and mining. In: 11th IEEE International Conference on Data Mining, ICDM 2011, Vancouver, BC, Canada, 11–14 December 2011, pp. 300–309 (2011) Kang, U., Faloutsos, C.: Beyond ‘caveman communities’: hubs and spokes for graph compression and mining. In: 11th IEEE International Conference on Data Mining, ICDM 2011, Vancouver, BC, Canada, 11–14 December 2011, pp. 300–309 (2011)
2.
Zurück zum Zitat Lim, Y., Kang, U., Faloutsos, C.: SlashBurn: graph compression and mining beyond caveman communities. IEEE Trans. Knowl. Data Eng. 26(12), 3077–3089 (2014)CrossRef Lim, Y., Kang, U., Faloutsos, C.: SlashBurn: graph compression and mining beyond caveman communities. IEEE Trans. Knowl. Data Eng. 26(12), 3077–3089 (2014)CrossRef
3.
Zurück zum Zitat Shun, J., Dhulipala, L., Blelloch, G.E.: Smaller and faster: parallel processing of compressed graphs with ligra+. In: 2015 Data Compression Conference, DCC 2015, Snowbird, UT, USA, 7–9 April 2015, pp. 403–412 (2015) Shun, J., Dhulipala, L., Blelloch, G.E.: Smaller and faster: parallel processing of compressed graphs with ligra+. In: 2015 Data Compression Conference, DCC 2015, Snowbird, UT, USA, 7–9 April 2015, pp. 403–412 (2015)
4.
Zurück zum Zitat Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., Hellerstein, J.M.: Distributed graphlab: a framework for machine learning in the cloud. PVLDB 5(8), 716–727 (2012) Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., Hellerstein, J.M.: Distributed graphlab: a framework for machine learning in the cloud. PVLDB 5(8), 716–727 (2012)
5.
Zurück zum Zitat Shun, J., Blelloch, G.E.: Ligra: a lightweight graph processing framework for shared memory. In: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2013, Shenzhen, China, 23–27 February 2013, pp. 135–146 (2013) Shun, J., Blelloch, G.E.: Ligra: a lightweight graph processing framework for shared memory. In: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2013, Shenzhen, China, 23–27 February 2013, pp. 135–146 (2013)
6.
Zurück zum Zitat Shao, B., Wang, H., Li, Y.: Trinity: a distributed graph engine on a memory cloud. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, 22–27 June 2013, pp. 505–516 (2013) Shao, B., Wang, H., Li, Y.: Trinity: a distributed graph engine on a memory cloud. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, 22–27 June 2013, pp. 505–516 (2013)
7.
Zurück zum Zitat Johnson, D.S., Krishnan, S., Chhugani, J., Kumar, S., Venkatasubramanian, S.: Compressing large Boolean matrices using reordering techniques. In: (e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, Toronto, Canada, 31 August–3 September 2004, pp. 13–23 (2004) Johnson, D.S., Krishnan, S., Chhugani, J., Kumar, S., Venkatasubramanian, S.: Compressing large Boolean matrices using reordering techniques. In: (e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, Toronto, Canada, 31 August–3 September 2004, pp. 13–23 (2004)
8.
Zurück zum Zitat Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems. 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, Vancouver, British Columbia, Canada, 3–8 December 2001, pp. 849–856 (2001) Ng, A.Y., Jordan, M.I., Weiss, Y.: On spectral clustering: analysis and an algorithm. In: Advances in Neural Information Processing Systems. 14th International Conference on Neural Information Processing Systems: Natural and Synthetic, NIPS 2001, Vancouver, British Columbia, Canada, 3–8 December 2001, pp. 849–856 (2001)
9.
Zurück zum Zitat Davis, J.V., Kulis, B., Jain, P., Sra, S., Dhillon, I.S.: Information-theoretic metric learning. In: Machine Learning, Proceedings of the Twenty-Fourth International Conference (ICML 2007), Corvallis, Oregon, USA, 20–24 June 2007, pp. 209–216 (2007) Davis, J.V., Kulis, B., Jain, P., Sra, S., Dhillon, I.S.: Information-theoretic metric learning. In: Machine Learning, Proceedings of the Twenty-Fourth International Conference (ICML 2007), Corvallis, Oregon, USA, 20–24 June 2007, pp. 209–216 (2007)
10.
Zurück zum Zitat Chakrabarti, D., Papadimitriou, S., Modha, D.S., Faloutsos, C.: Fully automatic cross-associations. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA, 22–25 August 2004, pp. 79–88 (2004) Chakrabarti, D., Papadimitriou, S., Modha, D.S., Faloutsos, C.: Fully automatic cross-associations. In: Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA, 22–25 August 2004, pp. 79–88 (2004)
11.
Zurück zum Zitat Chierichetti, F., Kumar, R., Lattanzi, S., Mitzenmacher, M., Panconesi, A., Raghavan, P.: On compressing social networks. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, 28 June–1 July 2009, pp. 219–228 (2009) Chierichetti, F., Kumar, R., Lattanzi, S., Mitzenmacher, M., Panconesi, A., Raghavan, P.: On compressing social networks. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, 28 June–1 July 2009, pp. 219–228 (2009)
14.
Zurück zum Zitat Adler, M., Mitzenmacher, M.: Towards compressing web graphs. In: Data Compression Conference, DCC 2001, Snowbird, Utah, USA, 27–29 March 2001, pp. 203–212 (2001) Adler, M., Mitzenmacher, M.: Towards compressing web graphs. In: Data Compression Conference, DCC 2001, Snowbird, Utah, USA, 27–29 March 2001, pp. 203–212 (2001)
15.
Zurück zum Zitat Boldi, P., Vigna, S.: The webgraph framework II: codes for the world-wide web. In: 2004 Data Compression Conference (DCC 2004), 23–25 March 2004, Snowbird, UT, USA, p. 528 (2004) Boldi, P., Vigna, S.: The webgraph framework II: codes for the world-wide web. In: 2004 Data Compression Conference (DCC 2004), 23–25 March 2004, Snowbird, UT, USA, p. 528 (2004)
17.
Zurück zum Zitat Kang, U., Tong, H., Sun, J., Lin, C., Faloutsos, C.: GBASE: an efficient analysis platform for large graphs. VLDB J. 21(5), 637–650 (2012)CrossRef Kang, U., Tong, H., Sun, J., Lin, C., Faloutsos, C.: GBASE: an efficient analysis platform for large graphs. VLDB J. 21(5), 637–650 (2012)CrossRef
18.
Zurück zum Zitat Buluç, A., Fineman, J.T., Frigo, M., Gilbert, J.R., Leiserson, C.E.: Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks. In: Proceedings of the 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2009, Calgary, Alberta, Canada, 11–13 August 2009, pp. 233–244 (2009) Buluç, A., Fineman, J.T., Frigo, M., Gilbert, J.R., Leiserson, C.E.: Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks. In: Proceedings of the 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2009, Calgary, Alberta, Canada, 11–13 August 2009, pp. 233–244 (2009)
20.
Zurück zum Zitat Nilakant, K., Dalibard, V., Roy, A., Yoneki, E.: PrefEdge: SSD prefetcher for large-scale graph traversal. In: International Conference on Systems and Storage, SYSTOR 2014, Haifa, Israel, 30 June–02 July 2014, pp. 4:1–4:12 (2014) Nilakant, K., Dalibard, V., Roy, A., Yoneki, E.: PrefEdge: SSD prefetcher for large-scale graph traversal. In: International Conference on Systems and Storage, SYSTOR 2014, Haifa, Israel, 30 June–02 July 2014, pp. 4:1–4:12 (2014)
21.
Zurück zum Zitat Nourbakhsh, F., Bulò, S.R., Pelillo, M.: A matrix factorization approach to graph compression with partial information. Int. J. Mach. Learn. Cybern. 6(4), 523–536 (2015)CrossRef Nourbakhsh, F., Bulò, S.R., Pelillo, M.: A matrix factorization approach to graph compression with partial information. Int. J. Mach. Learn. Cybern. 6(4), 523–536 (2015)CrossRef
22.
Zurück zum Zitat Kang, U., Tsourakakis, C.E., Faloutsos, C.: PEGASUS: a peta-scale graph mining system. In: The Ninth IEEE International Conference on Data Mining, ICDM 2009, Miami, Florida, USA, 6–9 December 2009, pp. 229–238 (2009) Kang, U., Tsourakakis, C.E., Faloutsos, C.: PEGASUS: a peta-scale graph mining system. In: The Ninth IEEE International Conference on Data Mining, ICDM 2009, Miami, Florida, USA, 6–9 December 2009, pp. 229–238 (2009)
23.
Zurück zum Zitat Buluç, A., Williams, S., Oliker, L., Demmel, J.: Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication. In: 25th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2011, Anchorage, Alaska, USA, 16–20 May 2011 - Conference Proceedings, pp. 721–733 (2011) Buluç, A., Williams, S., Oliker, L., Demmel, J.: Reduced-bandwidth multithreaded algorithms for sparse matrix-vector multiplication. In: 25th IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2011, Anchorage, Alaska, USA, 16–20 May 2011 - Conference Proceedings, pp. 721–733 (2011)
24.
Zurück zum Zitat Karakasis, V., Gkountouvas, T., Kourtis, K., Goumas, G.I., Koziris, N.: An extended compression format for the optimization of sparse matrix-vector multiplication. IEEE Trans. Parallel Distrib. Syst. 24(10), 1930–1940 (2013)CrossRef Karakasis, V., Gkountouvas, T., Kourtis, K., Goumas, G.I., Koziris, N.: An extended compression format for the optimization of sparse matrix-vector multiplication. IEEE Trans. Parallel Distrib. Syst. 24(10), 1930–1940 (2013)CrossRef
25.
Zurück zum Zitat Fan, W., Li, J., Wang, X., Wu, Y.: Query preserving graph compression. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2012, Scottsdale, AZ, USA, 20–24 May 2012, pp. 157–168 (2012) Fan, W., Li, J., Wang, X., Wu, Y.: Query preserving graph compression. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2012, Scottsdale, AZ, USA, 20–24 May 2012, pp. 157–168 (2012)
Metadaten
Titel
Efficient Compression on Real World Directed Graphs
verfasst von
Guohua Li
Weixiong Rao
Zhongxiao Jin
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-63579-8_10