
2016 | Original Paper | Book Chapter

A Hybrid Method for Manufacturing Text Mining Based on Document Clustering and Topic Modeling Techniques

Written by: Peyman Yazdizadeh Shotorbani, Farhad Ameri, Boonserm Kulvatunyou, Nenad Ivezic

Published in: Advances in Production Management Systems. Initiatives for a Sustainable World

Publisher: Springer International Publishing


Abstract

As the volume of online manufacturing information grows steadily, the need for dedicated computational tools for information organization and mining becomes more pronounced. This paper proposes a novel approach for facilitating the search and organization of textual documents, as well as the extraction of thematic patterns, in manufacturing corpora using document clustering and topic modeling techniques. The proposed method adopts the K-means and Latent Dirichlet Allocation (LDA) algorithms for document clustering and topic modeling, respectively. Experimental validation shows that topic modeling, in conjunction with document clustering, facilitates automated annotation and classification of manufacturing webpages as well as extraction of useful patterns, thus improving the intelligence of supplier discovery and knowledge acquisition tools.
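
To make the combination of the two techniques concrete, the following is a minimal sketch of one plausible arrangement: pages are first grouped with K-means on TF-IDF vectors, and a small LDA model is then fitted inside each cluster. The abstract does not state how the two steps are chained, so this ordering is an assumption, as are the sample pages, the stop-word list, and every parameter value (`k`, `n_topics`, `alpha`, `beta`, iteration counts). Both algorithms are written in plain Python for illustration and are not the authors' implementation.

```python
import math
import random
from collections import Counter

STOPWORDS = {"and", "of", "the", "for"}  # tiny illustrative list (assumption)

def tokenize(text):
    """Lowercase, split on whitespace, drop stop words."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def tfidf(docs):
    """Return L2-normalized TF-IDF vectors for tokenized documents."""
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = [tf[w] * (math.log(len(docs) / df[w]) + 1.0) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def lda_gibbs(docs, n_topics, iters=200, alpha=0.1, beta=0.01, top_n=3, seed=0):
    """Collapsed Gibbs sampling for LDA; returns the top words of each topic."""
    rng = random.Random(seed)
    vocab_size = len({w for d in docs for w in d})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    for d, doc in enumerate(docs):  # random initial topic for every token
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, z in zip(doc, assign[d]):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]  # remove the token's current assignment
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Conditional p(z = t | rest), up to a constant factor.
                weights = [
                    (doc_topic[d][t] + alpha) * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return [[w for w, n in tw.most_common(top_n) if n > 0] for tw in topic_word]

# Hypothetical snippets standing in for manufacturing webpages.
pages = [
    "CNC milling turning precision machining aluminum parts",
    "precision CNC machining of steel and aluminum parts",
    "injection molding of plastic parts and mold design",
    "plastic injection molding tooling and mold making",
    "sheet metal stamping welding and laser cutting",
    "laser cutting and welding of sheet metal parts",
]
tokens = [tokenize(p) for p in pages]
labels = kmeans(tfidf(tokens), k=3)

cluster_topics = {}
for c in sorted(set(labels)):
    members = [t for t, lab in zip(tokens, labels) if lab == c]
    cluster_topics[c] = lda_gibbs(members, n_topics=2)
    print(f"cluster {c}: {len(members)} pages, topics: {cluster_topics[c]}")
```

In this sketch the top words of each per-cluster topic could serve as candidate annotations for the pages in that cluster. On a real corpus, a library implementation of both algorithms and a principled choice of the number of clusters would likely be preferable.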


Metadata
Title
A Hybrid Method for Manufacturing Text Mining Based on Document Clustering and Topic Modeling Techniques
Written by
Peyman Yazdizadeh Shotorbani
Farhad Ameri
Boonserm Kulvatunyou
Nenad Ivezic
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-51133-7_91
