2017 | OriginalPaper | Chapter

Multi-objective Big Data Optimization with jMetal and Spark

Authors: Cristóbal Barba-González, José García-Nieto, Antonio J. Nebro, José F. Aldana-Montes

Published in: Evolutionary Multi-Criterion Optimization

Publisher: Springer International Publishing

Abstract

Big Data Optimization is the term used to refer to optimization problems which have to manage very large amounts of data. In this paper, we focus on the parallelization of metaheuristics with the Apache Spark cluster computing system for solving multi-objective Big Data Optimization problems. Our purpose is to study the influence of accessing data stored in the Hadoop Distributed File System (HDFS) in each evaluation step of a metaheuristic and to provide a software tool to solve these kinds of problems. This tool combines the jMetal multi-objective optimization framework with Apache Spark. We have carried out experiments to measure the performance of the proposed parallel infrastructure in an environment based on virtual machines in a local cluster comprising up to 100 cores. We obtained interesting results for computational effort and propose guidelines to face multi-objective Big Data Optimization problems.
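The idea of evaluating a population in parallel, with every evaluation touching the data, can be illustrated with a minimal sketch. It is not the chapter's implementation: a local thread pool stands in for the Spark cluster, an in-memory list stands in for data stored in HDFS, and all names (`DATA`, `load_data`, `evaluate`, `evaluate_population`) and the toy bi-objective function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a data set that would be stored in HDFS.
DATA: List[float] = [0.5, 1.5, 2.5, 3.5]


def load_data() -> List[float]:
    """Simulate the data access performed in each evaluation (assumption)."""
    return list(DATA)


def evaluate(solution: Sequence[float]) -> Tuple[float, float]:
    """Toy bi-objective function: squared distance to the data, and L1 norm."""
    data = load_data()  # the data is accessed once per evaluation
    f1 = sum((x - d) ** 2 for x, d in zip(solution, data))
    f2 = sum(abs(x) for x in solution)
    return (f1, f2)


def evaluate_population(
    population: Sequence[Sequence[float]], workers: int = 4
) -> List[Tuple[float, float]]:
    """Evaluate all solutions concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(evaluate, population))


population = [
    [0.0, 0.0, 0.0, 0.0],
    [0.5, 1.5, 2.5, 3.5],
    [1.0, 1.0, 1.0, 1.0],
]
objectives = evaluate_population(population)
```

In this sketch only the evaluation step is parallel; the rest of the metaheuristic (selection, variation, replacement) would stay sequential. The cost of `load_data` corresponds to the per-evaluation data access whose influence the abstract describes as the object of study.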


Metadata
Title
Multi-objective Big Data Optimization with jMetal and Spark
Authors
Cristóbal Barba-González
José García-Nieto
Antonio J. Nebro
José F. Aldana-Montes
Copyright Year
2017
DOI
https://doi.org/10.1007/978-3-319-54157-0_2
