Published in: The Journal of Supercomputing 1/2019

6 February 2018

An efficient parallel similarity matrix construction on MapReduce for collaborative filtering

Authors: Seunghee Kim, Hongyeon Kim, Jun-Ki Min


Abstract

Collaborative filtering has become a popular technique for recommendation systems. However, as data volumes grow rapidly, constructing the similarity matrix becomes a performance bottleneck. The MapReduce framework proposed by Google has recently been widely used for data-intensive applications. In this work, we therefore propose ConSimMR, an efficient parallel algorithm for constructing a similarity matrix using MapReduce. We first partition the set of items into disjoint groups such that items rated by similar users tend to fall into the same group. We next compute the similarity of every pair of items belonging to the same group. Finally, we calculate the similarity of every pair of items belonging to different groups. In this step, by using the rating list of each user rather than that of each item, we can compute the similarities in parallel, which improves performance. We conducted experiments comparing ConSimMR with previous algorithms on real-life data sets and confirmed both its efficiency and its scalability.
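
The idea of deriving item-pair similarities from per-user rating lists can be illustrated with a minimal single-machine sketch. This is not the ConSimMR algorithm itself: the abstract does not name a similarity measure or describe the grouping step, so the sketch assumes cosine similarity, skips the partitioning, and uses a hypothetical input layout (`user_ratings`) and function name (`item_similarities`). It only shows why user rating lists parallelize well: each user's list contributes partial sums independently, as a mapper would, and the sums are combined per item pair, as a reducer would.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(user_ratings):
    """Cosine similarity for every pair of items co-rated by at least one user.

    user_ratings: dict mapping user -> {item: rating} (hypothetical layout).
    Assumes item ids are mutually comparable and ratings are nonzero.
    """
    dot = defaultdict(float)      # (item_a, item_b) -> sum of rating products
    sq_norm = defaultdict(float)  # item -> sum of squared ratings

    # "Map" side: every user's rating list is processed independently,
    # so this loop could be spread across workers.
    for ratings in user_ratings.values():
        for item, r in ratings.items():
            sq_norm[item] += r * r
        # Emit one partial product per item pair rated by this user.
        for (a, ra), (b, rb) in combinations(sorted(ratings.items()), 2):
            dot[(a, b)] += ra * rb

    # "Reduce" side: normalize the accumulated dot product of each pair.
    return {
        (a, b): s / (sqrt(sq_norm[a]) * sqrt(sq_norm[b]))
        for (a, b), s in dot.items()
    }


if __name__ == "__main__":
    toy = {
        "u1": {"i1": 1.0, "i2": 1.0},
        "u2": {"i1": 1.0, "i2": 1.0, "i3": 2.0},
    }
    for pair, sim in sorted(item_similarities(toy).items()):
        print(pair, round(sim, 4))
```

Pairs that no user has co-rated never appear in the output, which keeps the work proportional to the rating lists rather than to the square of the number of items.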


Metadata
Title
An efficient parallel similarity matrix construction on MapReduce for collaborative filtering
Authors
Seunghee Kim
Hongyeon Kim
Jun-Ki Min
Publication date
6 February 2018
Publisher
Springer US
Published in
The Journal of Supercomputing / Issue 1/2019
Print ISSN: 0920-8542
Electronic ISSN: 1573-0484
DOI
https://doi.org/10.1007/s11227-018-2271-3
