2019 | OriginalPaper | Book chapter

Mining of Relevant and Informative Posts from Text Forums

Written by: Kseniya Buraya, Vladislav Grozin, Vladislav Trofimov, Pavel Vinogradov, Natalia Gusarova

Published in: Electronic Governance and Open Society: Challenges in Eurasia

Publisher: Springer International Publishing

Abstract

In the modern world, the ability to obtain information quickly and conveniently is a competitive advantage for every person. Web forums occupy a significant place among information sources: they are a good place to gain professionally significant knowledge on different topics. However, it is not always easy to identify the parts of a forum that contain useful information matching a user's needs. In this paper we consider the problem of automatic forum text summarization and describe methods that can help solve it. We study the difference between relevance-oriented and usefulness-oriented query types. We describe our dataset, which contains over 4000 labeled posts from web forums covering various subject domains. Experts labeled the posts by rating them on a scale from 0 to 5 for the selected query types. The results of our study can provide a basis for creating information retrieval applications that reduce the time users spend searching and improve the quality of search results.

Metadata
Title
Mining of Relevant and Informative Posts from Text Forums
Written by
Kseniya Buraya
Vladislav Grozin
Vladislav Trofimov
Pavel Vinogradov
Natalia Gusarova
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-13283-5_12