Published in: Neural Computing and Applications 1/2016

01.01.2016 | Extreme Learning Machine and Applications

An efficient query processing optimization based on ELM in the cloud

Authors: Linlin Ding, Junchang Xin, Guoren Wang



Abstract

MapReduce has emerged as the de facto programming model for parallel processing of large-scale datasets on clusters of commodity machines. MapReduce and its variants have been widely studied in both industry and academia. ComMapReduce extends MapReduce with lightweight communication mechanisms and improves the efficiency of query processing applications. However, we find that the performance of query processing applications varies considerably across the different communication strategies of the ComMapReduce framework, so it is necessary to identify the optimal communication strategy for each query processing application. Extreme learning machine (ELM) offers good classification performance with extremely fast training speed. Therefore, in this paper, we first propose an efficient query processing optimization approach based on ELM in the ComMapReduce framework, named ELM_CMR. Then, we design two implementations of ELM_CMR to further optimize the performance of query processing applications. Finally, extensive experiments are conducted to verify the effectiveness and efficiency of the proposed ELM_CMR.
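
As a rough illustration of why ELM trains quickly, the sketch below implements a basic single-hidden-layer ELM classifier with NumPy: the hidden weights are drawn at random and left untrained, and only the output weights are solved in closed form with a pseudo-inverse. It is not the ELM_CMR implementation from the paper. The class name `ELMClassifier`, the two input features, and the labels `strategy_A` and `strategy_B` are hypothetical placeholders chosen for this example.

```python
import numpy as np


class ELMClassifier:
    """Basic ELM: random hidden layer, least-squares output weights."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the randomly projected inputs.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Hidden-layer parameters are drawn once and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights from the Moore-Penrose pseudo-inverse of H.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


# Hypothetical training set: two normalized job features per query,
# labeled with the (made-up) strategy that performed best for it.
X_train = np.array([
    [-1.0, -1.0], [-0.8, -0.6], [-0.6, -0.9],
    [1.0, 1.0], [0.7, 0.9], [0.9, 0.6],
])
y_train = np.array(["strategy_A"] * 3 + ["strategy_B"] * 3)

model = ELMClassifier(n_hidden=20, seed=0).fit(X_train, y_train)
print(model.predict(X_train))
```

Because training reduces to a single pseudo-inverse, refitting on new samples is cheap, which is the fast-training property the abstract refers to.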

Metadata
Title
An efficient query processing optimization based on ELM in the cloud
Authors
Linlin Ding
Junchang Xin
Guoren Wang
Publication date
01.01.2016
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 1/2016
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-013-1543-3
