Published in: International Journal of Parallel Programming 3/2014

1 June 2014

Parallel Training of An Improved Neural Network for Text Categorization

Authors: Cheng Hua Li, Laurence T. Yang, Man Lin

Abstract

This paper studies parallel training of an improved neural network for text categorization. With the explosive growth in the amount of digital information available on the Internet, the text categorization problem has become increasingly important, especially now that millions of mobile devices are connecting to the Internet. The improved back-propagation neural network (IBPNN) is an efficient approach to classification problems that overcomes the limitations of the traditional BPNN. In this paper, we use parallel computing to speed up the training process of the IBPNN. The parallel IBPNN algorithm for text categorization is implemented on a Sun Cluster with 34 nodes (processors). The communication time and speedup of the parallel IBPNN are studied for various numbers of nodes. Experiments are conducted on various data sets, and the results show that the parallel IBPNN, combined with the singular value decomposition (SVD) technique, achieves fast computational speed and high text categorization correctness.
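The abstract reports speedup for different numbers of nodes. As a rough illustration of how such a figure is usually computed, the sketch below uses the standard definitions of speedup, S(p) = T(1) / T(p), and parallel efficiency, E(p) = S(p) / p. The function names and all timing values are hypothetical and are not taken from the paper, which may define or measure these quantities differently.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Standard speedup S(p) = T(1) / T(p) for wall-clock times in seconds."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("times must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, nodes: int) -> float:
    """Parallel efficiency E(p) = S(p) / p; 1.0 means ideal linear scaling."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return speedup(t_serial, t_parallel) / nodes


# Hypothetical training times (seconds) keyed by node count.
# These numbers are made up for illustration only, not results from the paper.
timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}

for nodes, t in timings.items():
    s = speedup(timings[1], t)
    e = efficiency(timings[1], t, nodes)
    print(f"nodes={nodes:2d}  time={t:6.1f}s  speedup={s:5.2f}  efficiency={e:4.2f}")
```

Efficiency that falls as nodes are added typically reflects growing communication time relative to computation, which is why the two quantities are often reported together.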

Metadata
Title
Parallel Training of An Improved Neural Network for Text Categorization
Authors
Cheng Hua Li
Laurence T. Yang
Man Lin
Publication date
1 June 2014
Publisher
Springer US
Published in
International Journal of Parallel Programming / Issue 3/2014
Print ISSN: 0885-7458
Electronic ISSN: 1573-7640
DOI
https://doi.org/10.1007/s10766-013-0245-x
