Skip to main content
Top

2015 | OriginalPaper | Chapter

A Large-Scale Distributed Sorting Algorithm Based on Cloud Computing

Authors : Na Pang, Dali Zhu, Zheming Fan, Wenjing Rong, Weimiao Feng

Published in: Applications and Techniques in Information Security

Publisher: Springer Berlin Heidelberg

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Large amounts of data processing bring parallel computing into sharp focus. In order to solve the sorting problem of large-scale data in the era of internet, a large-scale distributed sorting algorithm based on cloud computing is proposed. The algorithm uses the ideas of quick-sort and merge-sort to sort and integrate the data on each cloud, which making best of clouds’ computing and storage resources. Taking advantage of the idea of parallel computing, we can reduce the computing time and improve the efficiency of sorting. By evaluating algorithm’s time complexity and organizing a simulation test, the effectiveness was verified.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Itani, W., Kayssi, A., Chehab, A.: Privacy as a service: privacy-aware data storage and processing in cloud computing architectures. In: Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing. DASC 2009, pp. 711–716. IEEE (2009) Itani, W., Kayssi, A., Chehab, A.: Privacy as a service: privacy-aware data storage and processing in cloud computing architectures. In: Eighth IEEE International Conference on Dependable, Autonomic and Secure Computing. DASC 2009, pp. 711–716. IEEE (2009)
2.
go back to reference Zhang, S., Zhang, S., Chen, X., Huo, X.: The comparison between cloud computing and grid computing. In: ICCASM 2010 - 2010 International Conference on Computer Application and System Modeling, vol. 11, pp. 1172–1175 (2010) Zhang, S., Zhang, S., Chen, X., Huo, X.: The comparison between cloud computing and grid computing. In: ICCASM 2010 - 2010 International Conference on Computer Application and System Modeling, vol. 11, pp. 1172–1175 (2010)
3.
go back to reference Xia, T., Li, Z., Yu, N.: Research on cloud computing based on deep analysis to typical platforms. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) Cloud Computing. LNCS, vol. 5931, pp. 601–608. Springer, Heidelberg (2009)CrossRef Xia, T., Li, Z., Yu, N.: Research on cloud computing based on deep analysis to typical platforms. In: Jaatun, M.G., Zhao, G., Rong, C. (eds.) Cloud Computing. LNCS, vol. 5931, pp. 601–608. Springer, Heidelberg (2009)CrossRef
4.
go back to reference Davidson, A., Tarjan, D., Garland, M., et al.: Efficient parallel merge sort for fixed and variable length keys. In: Innovative Parallel Computing (InPar), pp. 1–9. IEEE (2012) Davidson, A., Tarjan, D., Garland, M., et al.: Efficient parallel merge sort for fixed and variable length keys. In: Innovative Parallel Computing (InPar), pp. 1–9. IEEE (2012)
5.
go back to reference Jeon, M., Kim, D.: Parallelizing merge sort onto distributed memory parallel computers. In: Zima, H.P., Joe, K., Sato, M., Seo, Y., Shimasaki, M. (eds.) ISHPC 2002. LNCS, vol. 2327, pp. 25–34. Springer, Heidelberg (2002)CrossRef Jeon, M., Kim, D.: Parallelizing merge sort onto distributed memory parallel computers. In: Zima, H.P., Joe, K., Sato, M., Seo, Y., Shimasaki, M. (eds.) ISHPC 2002. LNCS, vol. 2327, pp. 25–34. Springer, Heidelberg (2002)CrossRef
6.
go back to reference Minsoo, J., Dongseung, K.: Parallel merge sort with load balancing. Int. J. Parallel Prog. 31, 21–33 (2003)MATHCrossRef Minsoo, J., Dongseung, K.: Parallel merge sort with load balancing. Int. J. Parallel Prog. 31, 21–33 (2003)MATHCrossRef
7.
go back to reference Nanjesh, B.R., Tejonidhi, M.R., Rajesh, T.H., et al.: Parallel merge sort based performance evaluation and comparison of MPI and PVM. In: 2013 IEEE Conference on Information and Communication Technologies (ICT), pp. 530–534. IEEE (2013) Nanjesh, B.R., Tejonidhi, M.R., Rajesh, T.H., et al.: Parallel merge sort based performance evaluation and comparison of MPI and PVM. In: 2013 IEEE Conference on Information and Communication Technologies (ICT), pp. 530–534. IEEE (2013)
8.
go back to reference Max, O.H., Christof, T.: Spatial sorting algorithms for parallel computing in networks. In: Proceedings - 2011 5th IEEE Conference on Self-adaptive and Self-organizing Systems Workshops, pp. 73–78 (2011) Max, O.H., Christof, T.: Spatial sorting algorithms for parallel computing in networks. In: Proceedings - 2011 5th IEEE Conference on Self-adaptive and Self-organizing Systems Workshops, pp. 73–78 (2011)
9.
go back to reference Manouchehr Zadahmad, J., Parisa Yousefzadeh, F.: Heuristic and pattern based merge sort. Procedia Comput. Sci. 3, 322–324 (2011)CrossRef Manouchehr Zadahmad, J., Parisa Yousefzadeh, F.: Heuristic and pattern based merge sort. Procedia Comput. Sci. 3, 322–324 (2011)CrossRef
10.
go back to reference Chang, R.C.H., Wei, M.F., Chen, H.L., et al.: Implementation of a high-throughput modified merge sort in MIMO detection systems. IEEE Trans. Circ. Syst. I: Regular Papers 61(9), 2730–2737 (2014) Chang, R.C.H., Wei, M.F., Chen, H.L., et al.: Implementation of a high-throughput modified merge sort in MIMO detection systems. IEEE Trans. Circ. Syst. I: Regular Papers 61(9), 2730–2737 (2014)
11.
go back to reference Cérin, C., Koskas, M., Fkaier, H., Jemni, M.: Sequential in-core sorting performance for a SQL data service and for parallel sorting on heterogeneous clusters. Future Gener. Comput. Syst. 22, 776–783 (2006)CrossRef Cérin, C., Koskas, M., Fkaier, H., Jemni, M.: Sequential in-core sorting performance for a SQL data service and for parallel sorting on heterogeneous clusters. Future Gener. Comput. Syst. 22, 776–783 (2006)CrossRef
12.
go back to reference Lin, H., Li, C., Wang, Q., Zhao, Y., Pan, N., Zhuang, X., Shao, L.: Automated tuning in parallel sorting on multi-core architectures. In: D’Ambra, P., Guarracino, M., Talia, D. (eds.) Euro-Par 2010, Part I. LNCS, vol. 6271, pp. 14–25. Springer, Heidelberg (2010)CrossRef Lin, H., Li, C., Wang, Q., Zhao, Y., Pan, N., Zhuang, X., Shao, L.: Automated tuning in parallel sorting on multi-core architectures. In: D’Ambra, P., Guarracino, M., Talia, D. (eds.) Euro-Par 2010, Part I. LNCS, vol. 6271, pp. 14–25. Springer, Heidelberg (2010)CrossRef
13.
go back to reference Wei, J., Yu, H., Chen, J.H., et al.: Parallel clustering for visualizing large scientific line data. In: 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV), pp. 47–55. IEEE (2011) Wei, J., Yu, H., Chen, J.H., et al.: Parallel clustering for visualizing large scientific line data. In: 2011 IEEE Symposium on Large Data Analysis and Visualization (LDAV), pp. 47–55. IEEE (2011)
14.
go back to reference Zhong, C., Qu, Z., Yang, F., et al.: Parallel multisets sorting using aperiodic multi-round distribution strategy on heterogeneous multi-core clusters. In: 2010 Third International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), pp. 247–254. IEEE (2010) Zhong, C., Qu, Z., Yang, F., et al.: Parallel multisets sorting using aperiodic multi-round distribution strategy on heterogeneous multi-core clusters. In: 2010 Third International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), pp. 247–254. IEEE (2010)
15.
go back to reference Shirahata, K., Sato, H., Matsuoka, S.: Out-of-core GPU memory management for MapReduce-based large-scale graph processing. In: 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 221–229. IEEE (2014) Shirahata, K., Sato, H., Matsuoka, S.: Out-of-core GPU memory management for MapReduce-based large-scale graph processing. In: 2014 IEEE International Conference on Cluster Computing (CLUSTER), pp. 221–229. IEEE (2014)
16.
go back to reference Ye, C.J., Huang, M.X.: Multi-objective optimal power flow considering transient stability based on parallel NSGA-II. IEEE Trans. Power Syst. 30(2), 857–866 (2015)MathSciNetCrossRef Ye, C.J., Huang, M.X.: Multi-objective optimal power flow considering transient stability based on parallel NSGA-II. IEEE Trans. Power Syst. 30(2), 857–866 (2015)MathSciNetCrossRef
17.
go back to reference Xiao, Z., Xiao, Y.: Security and privacy in cloud computing. IEEE Commun. Surv. Tutorials 15(2), 843–859 (2013)CrossRef Xiao, Z., Xiao, Y.: Security and privacy in cloud computing. IEEE Commun. Surv. Tutorials 15(2), 843–859 (2013)CrossRef
18.
go back to reference Olman, V., Mao, F., Wu, H., et al.: Parallel clustering algorithm for large data sets with applications in bioinformatics. IEEE/ACM Trans. Comput. Biol. Bioinform. (TCBB) 6(2), 344–352 (2009)CrossRef Olman, V., Mao, F., Wu, H., et al.: Parallel clustering algorithm for large data sets with applications in bioinformatics. IEEE/ACM Trans. Comput. Biol. Bioinform. (TCBB) 6(2), 344–352 (2009)CrossRef
19.
go back to reference Scaling up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press (2011) Scaling up Machine Learning: Parallel and Distributed Approaches. Cambridge University Press (2011)
20.
go back to reference Mehlhorn, K.: Data Structures and Algorithms 1: Sorting and Searching. Springer, Heidelberg (2013) Mehlhorn, K.: Data Structures and Algorithms 1: Sorting and Searching. Springer, Heidelberg (2013)
21.
go back to reference Brin, S., Page, L.: Reprint of: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. 56(18), 3825–3833 (2012)CrossRef Brin, S., Page, L.: Reprint of: The anatomy of a large-scale hypertextual web search engine. Comput. Netw. 56(18), 3825–3833 (2012)CrossRef
22.
go back to reference Rao, S., Ramakrishnan, R., Silberstein, A., et al.: Sailfish: a framework for large scale data processing. In: Proceedings of the Third ACM Symposium on Cloud Computing, p. 4. ACM (2012) Rao, S., Ramakrishnan, R., Silberstein, A., et al.: Sailfish: a framework for large scale data processing. In: Proceedings of the Third ACM Symposium on Cloud Computing, p. 4. ACM (2012)
23.
go back to reference Di Martino, A., Yan, C.G., Li, Q., et al.: The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol. Psychiatry 19(6), 659–667 (2014)CrossRef Di Martino, A., Yan, C.G., Li, Q., et al.: The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol. Psychiatry 19(6), 659–667 (2014)CrossRef
24.
go back to reference Rasmussen, A., Porter, G., Conley, M., et al.: TritonSort: a balanced large-scale sorting system. In: NSDI (2011) Rasmussen, A., Porter, G., Conley, M., et al.: TritonSort: a balanced large-scale sorting system. In: NSDI (2011)
25.
go back to reference Sakr, S., Liu, A., Batista, D.M., et al.: A survey of large scale data management approaches in cloud environments. IEEE Commun. Surv. Tutorials 13(3), 311–336 (2011)CrossRef Sakr, S., Liu, A., Batista, D.M., et al.: A survey of large scale data management approaches in cloud environments. IEEE Commun. Surv. Tutorials 13(3), 311–336 (2011)CrossRef
26.
go back to reference Cuzzocrea, A., Song, I.Y., Davis, K.C.: Analytics over large-scale multidimensional data: the big data revolution. In: Proceedings of the ACM 14th International Workshop on Data Warehousing and OLAP, pp. 101–104. ACM (2011) Cuzzocrea, A., Song, I.Y., Davis, K.C.: Analytics over large-scale multidimensional data: the big data revolution. In: Proceedings of the ACM 14th International Workshop on Data Warehousing and OLAP, pp. 101–104. ACM (2011)
27.
go back to reference Vora, M.N.: Hadoop-HBase for large-scale data. In: 2011 International Conference on Computer Science and Network Technology (ICCSNT), vol. 1, pp. 601–605. IEEE (2011) Vora, M.N.: Hadoop-HBase for large-scale data. In: 2011 International Conference on Computer Science and Network Technology (ICCSNT), vol. 1, pp. 601–605. IEEE (2011)
Metadata
Title
A Large-Scale Distributed Sorting Algorithm Based on Cloud Computing
Authors
Na Pang
Dali Zhu
Zheming Fan
Wenjing Rong
Weimiao Feng
Copyright Year
2015
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-48683-2_20

Premium Partner