Published in: Artificial Life and Robotics 2/2018

29 December 2017 | Original Article

A real-time recommendation engine using lambda architecture

Author: Thanisa Numnonda


Abstract

In data science, recommendation is one of the most popular techniques and has been deployed in many industries. However, one of the most challenging problems today is how to recommend items from massive streaming data. This paper therefore builds a real-time recommendation engine using the Lambda architecture. The Apache Hadoop and Apache Spark frameworks were used to process the MovieLens datasets from GroupLens Research, comprising 100 K and 20 M ratings. Using the alternating least squares (ALS) and k-means algorithms, the top-K recommended movies and the top-K trending movies for each user were produced as results. Additionally, the mean squared error (MSE) and the within-cluster sum of squared errors (WCSS) were computed to evaluate the performance of the ALS and k-means algorithms, respectively. The results were judged acceptable because the MSE and WCSS values are low relative to the size of the data. However, they can still be improved by tuning some parameters.
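The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. The paper computed them with Apache Spark; the plain-Python version below is only an assumption-laden illustration, and the function names `mean_squared_error` and `wcss`, as well as the toy ratings and points, are hypothetical and not taken from the paper.

```python
from typing import Sequence


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average squared difference between observed and predicted ratings."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def wcss(points: Sequence[Sequence[float]],
         centroids: Sequence[Sequence[float]]) -> float:
    """Sum, over all points, of the squared distance to the nearest centroid."""
    total = 0.0
    for point in points:
        total += min(
            sum((x - c) ** 2 for x, c in zip(point, centroid))
            for centroid in centroids
        )
    return total


# Toy data (hypothetical): three ratings and three 2-D points.
mse_value = mean_squared_error([4.0, 3.0, 5.0], [3.5, 3.0, 4.0])
wcss_value = wcss([(0.0, 0.0), (0.0, 2.0), (10.0, 10.0)],
                  [(0.0, 1.0), (10.0, 10.0)])
print(mse_value, wcss_value)
```

Because WCSS is a sum rather than an average, it grows with the number of points, which is one reason a WCSS value has to be read relative to the size of the data.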


Metadata
Title
A real-time recommendation engine using lambda architecture
Author
Thanisa Numnonda
Publication date
29 December 2017
Publisher
Springer Japan
Published in
Artificial Life and Robotics / Issue 2/2018
Print ISSN: 1433-5298
Electronic ISSN: 1614-7456
DOI
https://doi.org/10.1007/s10015-017-0424-8
