Skip to main content

2015 | OriginalPaper | Buchkapitel

A Large-Scale Distributed Sorting Algorithm Based on Cloud Computing

verfasst von : Na Pang, Dali Zhu, Zheming Fan, Wenjing Rong, Weimiao Feng

Erschienen in: Applications and Techniques in Information Security

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Large amounts of data processing bring parallel computing into sharp focus. In order to solve the sorting problem of large-scale data in the era of internet, a large-scale distributed sorting algorithm based on cloud computing is proposed. The algorithm uses the ideas of quick-sort and merge-sort to sort and integrate the data on each cloud, which making best of clouds’ computing and storage resources. Taking advantage of the idea of parallel computing, we can reduce the computing time and improve the efficiency of sorting. By evaluating algorithm’s time complexity and organizing a simulation test, the effectiveness was verified.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Itani, W., Kayssi, A., Chehab, A.: Privacy as a service: privacy-aware data storage and processing in cloud computing architectures. In: Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing. DASC 2009, pp. 711–716. IEEE (2009) Itani, W., Kayssi, A., Chehab, A.: Privacy as a service: privacy-aware data storage and processing in cloud computing architectures. In: Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing. DASC 2009, pp. 711–716. IEEE (2009)
2.
Zurück zum Zitat Zhang, S., Zhang, S., Chen, X., Huo, X.: The comparison between cloud computing and grid computing. In: ICCASM 2010 - 2010 International Conference on Computer Application and System Modeling, vol. 11, pp. 1172–1175 (2010) Zhang, S., Zhang, S., Chen, X., Huo, X.: The comparison between cloud computing and grid computing. In: ICCASM 2010 - 2010 International Conference on Computer Application and System Modeling, vol. 11, pp. 1172–1175 (2010)
3.
Zurück zum Zitat Xia, T., Li, Z., Yu, N.: Research on cloud computing based on deep analysis to typical platforms. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) Cloud Computing. LNCS, vol. 5931, pp. 601–608. Springer, Heidelberg (2009)CrossRef Xia, T., Li, Z., Yu, N.: Research on cloud computing based on deep analysis to typical platforms. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) Cloud Computing. LNCS, vol. 5931, pp. 601–608. Springer, Heidelberg (2009)CrossRef
4.
Zurück zum Zitat Davidson, A., Tarjan, D., Garland, M., et al.: Efficient parallel merge sort for fixed and variable length keys. In: Innovative Parallel Computing (InPar), pp. 1–9. IEEE (2012) Davidson, A., Tarjan, D., Garland, M., et al.: Efficient parallel merge sort for fixed and variable length keys. In: Innovative Parallel Computing (InPar), pp. 1–9. IEEE (2012)
5.
Zurück zum Zitat Jeon, M., Kim, D.: Parallelizing merge sort onto distributed memory parallel computers. In: Zima, H.P., Joe, K., Sato, M., Seo, Y., Shimasaki, M. (eds.) ISHPC 2002. LNCS, vol. 2327, pp. 25–34. Springer, Heidelberg (2002)CrossRef Jeon, M., Kim, D.: Parallelizing merge sort onto distributed memory parallel computers. In: Zima, H.P., Joe, K., Sato, M., Seo, Y., Shimasaki, M. (eds.) ISHPC 2002. LNCS, vol. 2327, pp. 25–34. Springer, Heidelberg (2002)CrossRef
6.
Zurück zum Zitat Minsoo, J., Dongseung, K.: Parallel merge sort with load balancing. Int. J. Parallel Prog. 31, 21–33 (2003)MATHCrossRef Minsoo, J., Dongseung, K.: Parallel merge sort with load balancing. Int. J. Parallel Prog. 31, 21–33 (2003)MATHCrossRef
7.
Zurück zum Zitat Nanjesh, B.R., Tejonidhi, M.R., Rajesh, T.H., et al.: Parallel merge sort based performance evaluation and comparison of MPI and PVM. In: 2013 IEEE Conference on Information and Communication Technologies (ICT), pp. 530–534. IEEE (2013) Nanjesh, B.R., Tejonidhi, M.R., Rajesh, T.H., et al.: Parallel merge sort based performance evaluation and comparison of MPI and PVM. In: 2013 IEEE Conference on Information and Communication Technologies (ICT), pp. 530–534. IEEE (2013)
8.
Zurück zum Zitat Max, O.H., Christof, T.: Spatial sorting algorithms for parallel computing in networks. In: Proceedings - 2011 5th IEEE Conference on Self-adaptive and Self-organizing Systems Workshops, pp. 73–78 (2011) Max, O.H., Christof, T.: Spatial sorting algorithms for parallel computing in networks. In: Proceedings - 2011 5th IEEE Conference on Self-adaptive and Self-organizing Systems Workshops, pp. 73–78 (2011)
9.
Zurück zum Zitat Manouchehr Zadahmad, J., Parisa Yousefzadeh, F.: Heuristic and pattern based merge sort. Procedia Comput. Sci. 3, 322–324 (2011)CrossRef Manouchehr Zadahmad, J., Parisa Yousefzadeh, F.: Heuristic and pattern based merge sort. Procedia Comput. Sci. 3, 322–324 (2011)CrossRef
10.
Zurück zum Zitat Chang, R.C.H., Wei, M.F., Chen, H.L., et al.: Implementation of a high-throughput modified merge sort in MIMO detection systems. IEEE Trans. Circ. Syst. I: Regular Papers 61(9), 2730–2737 (2014) Chang, R.C.H., Wei, M.F., Chen, H.L., et al.: Implementation of a high-throughput modified merge sort in MIMO detection systems. IEEE Trans. Circ. Syst. I: Regular Papers 61(9), 2730–2737 (2014)
11.
Zurück zum Zitat Cérin, C., Koskas, M., Fkaier, H., Jemni, M.: Sequential in-core sorting performance for a SQL data service and for parallel sorting on heterogeneous clusters. Future Gener. Comput. Syst. 22, 776–783 (2006)CrossRef Cérin, C., Koskas, M., Fkaier, H., Jemni, M.: Sequential in-core sorting performance for a SQL data service and for parallel sorting on heterogeneous clusters. Future Gener. Comput. Syst. 22, 776–783 (2006)CrossRef
12.
Zurück zum Zitat Lin, H., Li, C., Wang, Q., Zhao, Y., Pan, N., Zhuang, X., Shao, L.: Automated tuning in parallel sorting on multi-core architectures. In: D’Ambra, P., Guarracino, M., Talia, D. (eds.) Euro-Par 2010, Part I. LNCS, vol. 6271, pp. 14–25. Springer, Heidelberg (2010)CrossRef Lin, H., Li, C., Wang, Q., Zhao, Y., Pan, N., Zhuang, X., Shao, L.: Automated tuning in parallel sorting on multi-core architectures. In: D’Ambra, P., Guarracino, M., Talia, D. (eds.) Euro-Par 2010, Part I. LNCS, vol. 6271, pp. 14–25. Springer, Heidelberg (2010)CrossRef
13.
Zurück zum Zitat Wei, J., Yu, H., Chen, J.H., et al.: Parallel clustering for visualizing large scientific line data. In: 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV), pp. 47–55. IEEE (2011) Wei, J., Yu, H., Chen, J.H., et al.: Parallel clustering for visualizing large scientific line data. In: 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV), pp. 47–55. IEEE (2011)
14.
Zurück zum Zitat Zhong, C., Qu, Z., Yang, F., et al.: Parallel multisets sorting using aperiodic multi-round distribution strategy on heterogeneous multi-core clusters. In: 2010 Third International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), pp. 247–254. IEEE (2010) Zhong, C., Qu, Z., Yang, F., et al.: Parallel multisets sorting using aperiodic multi-round distribution strategy on heterogeneous multi-core clusters. In: 2010 Third International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), pp. 247–254. IEEE (2010)
15.
Zurück zum Zitat Shirahata, K., Sato, H., Matsuoka, S.: Out-of-core GPU memory management for MapReduce-based large-scale graph processing. In: 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 221–229. IEEE (2014) Shirahata, K., Sato, H., Matsuoka, S.: Out-of-core GPU memory management for MapReduce-based large-scale graph processing. In: 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 221–229. IEEE (2014)
16.
Zurück zum Zitat Ye, C.J., Huang, M.X.: Multi-objective optimal power flow considering transient stability based on parallel NSGA-II. IEEE Trans. Power Syst. 30(2), 857–866 (2015)MathSciNetCrossRef Ye, C.J., Huang, M.X.: Multi-objective optimal power flow considering transient stability based on parallel NSGA-II. IEEE Trans. Power Syst. 30(2), 857–866 (2015)MathSciNetCrossRef
17.
Zurück zum Zitat Xiao, Z., Xiao, Y.: Security and privacy in cloud computing. IEEE Commun. Surv. Tutorials 15(2), 843–859 (2013)CrossRef Xiao, Z., Xiao, Y.: Security and privacy in cloud computing. IEEE Commun. Surv. Tutorials 15(2), 843–859 (2013)CrossRef
18.
Zurück zum Zitat Olman, V., Mao, F., Wu, H., et al.: Parallel clustering algorithm for large data sets with applications in bioinformatics. IEEE/ACM Trans. Comput. Biol. Bioinform. (TCBB) 6(2), 344–352 (2009)CrossRef Olman, V., Mao, F., Wu, H., et al.: Parallel clustering algorithm for large data sets with applications in bioinformatics. IEEE/ACM Trans. Comput. Biol. Bioinform. (TCBB) 6(2), 344–352 (2009)CrossRef
19.
Zurück zum Zitat Scaling up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press (2011) Scaling up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press (2011)
20.
Zurück zum Zitat Mehlhorn, K.: Data Structures and Algorithms 1: Sorting and Searching. Springer, Heidelberg (2013) Mehlhorn, K.: Data Structures and Algorithms 1: Sorting and Searching. Springer, Heidelberg (2013)
21.
Zurück zum Zitat Brin, S., Page, L.: Reprint of: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. 56(18), 3825–3833 (2012)CrossRef Brin, S., Page, L.: Reprint of: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. 56(18), 3825–3833 (2012)CrossRef
22.
Zurück zum Zitat Rao, S., Ramakrishnan, R., Silberstein, A., et al.: Sailfish: a framework for large scale data processing. In: Proceedings of the Third ACM Symposium on Cloud Computing, p. 4. ACM (2012) Rao, S., Ramakrishnan, R., Silberstein, A., et al.: Sailfish: a framework for large scale data processing. In: Proceedings of the Third ACM Symposium on Cloud Computing, p. 4. ACM (2012)
23.
Zurück zum Zitat Di Martino, A., Yan, C.G., Li, Q., et al.: The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol. Psychiatry 19(6), 659–667 (2014)CrossRef Di Martino, A., Yan, C.G., Li, Q., et al.: The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol. Psychiatry 19(6), 659–667 (2014)CrossRef
24.
Zurück zum Zitat Rasmussen, A., Porter, G., Conley, M., et al.: TritonSort: a balanced large-scale sorting system. In: NSDI (2011) Rasmussen, A., Porter, G., Conley, M., et al.: TritonSort: a balanced large-scale sorting system. In: NSDI (2011)
25.
Zurück zum Zitat Sakr, S., Liu, A., Batista, D.M., et al.: A survey of large scale data management approaches in cloud environments. IEEE Commun. Surv. Tutorials 13(3), 311–336 (2011)CrossRef Sakr, S., Liu, A., Batista, D.M., et al.: A survey of large scale data management approaches in cloud environments. IEEE Commun. Surv. Tutorials 13(3), 311–336 (2011)CrossRef
26.
Zurück zum Zitat Cuzzocrea, A., Song, I.Y., Davis, K.C.: Analytics over large-scale multidimensional data: the big data revolution. In: Proceedings of the ACM 14th International Workshop on Data Warehousing and OLAP, pp. 101–104. ACM (2011) Cuzzocrea, A., Song, I.Y., Davis, K.C.: Analytics over large-scale multidimensional data: the big data revolution. In: Proceedings of the ACM 14th International Workshop on Data Warehousing and OLAP, pp. 101–104. ACM (2011)
27.
Zurück zum Zitat Vora, M.N.: Hadoop-HBase for large-scale data. In: 2011 International Conference on Computer Science and Network Technology (ICCSNT), vol. 1, pp. 601–605. IEEE (2011) Vora, M.N.: Hadoop-HBase for large-scale data. In: 2011 International Conference on Computer Science and Network Technology (ICCSNT), vol. 1, pp. 601–605. IEEE (2011)
Metadaten
Titel
A Large-Scale Distributed Sorting Algorithm Based on Cloud Computing
verfasst von
Na Pang
Dali Zhu
Zheming Fan
Wenjing Rong
Weimiao Feng
Copyright-Jahr
2015
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-48683-2_20

Premium Partner