2019 | OriginalPaper | Chapter

Event-Oriented Keyphrase Extraction Based on Bi-clustering Model

Authors: Lin Zhao, Liangjun Zang, Longtao Huang, Jizhong Han, Songlin Hu

Published in: Computational Science – ICCS 2019

Publisher: Springer International Publishing

Abstract

Keyphrase extraction underlies many natural language processing and information retrieval tasks and helps people efficiently find the information they are interested in within vast streams of online documents. Most previous methods are general-purpose and extract keyphrases that represent a document's main topics. However, such keyphrases can hardly distinguish individual events in massive streams of long text documents that share similar topics and contain highly redundant information. In this paper, we address the task of keyphrase extraction for event-oriented retrieval. We propose a novel bi-clustering model that clusters documents and keyphrases simultaneously, which makes the extracted keyphrases more specific and more closely related to the event. We conduct a series of experiments on a real-world dataset. The experimental results show that our approach outperforms other unsupervised approaches.
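
The abstract does not spell out the model, so the following is only a toy illustration of the general idea of grouping documents and phrases at the same time, not the authors' method. It treats documents and candidate phrases as the two sides of a bipartite graph and takes each connected component as one bi-cluster, after dropping phrases that occur in too many documents to separate events. The function name `bicluster`, the parameter `max_doc_share`, and the sample documents are all hypothetical.

```python
from collections import defaultdict


def bicluster(doc_phrases, max_doc_share=0.5):
    """Toy bi-clustering: connected components of the document-phrase graph.

    doc_phrases maps a document id to its set of candidate phrases.
    Phrases found in more than max_doc_share of all documents are ignored
    as too general. Returns a list of (sorted documents, sorted phrases).
    """
    n_docs = len(doc_phrases)

    # Inverted index: phrase -> documents that contain it.
    phrase_docs = defaultdict(set)
    for doc, phrases in doc_phrases.items():
        for phrase in phrases:
            phrase_docs[phrase].add(doc)

    # Keep only phrases that are specific enough to link related documents.
    specific = {p: ds for p, ds in phrase_docs.items()
                if len(ds) / n_docs <= max_doc_share}

    seen = set()
    clusters = []
    for start in sorted(doc_phrases):
        if start in seen:
            continue
        docs, phrases = set(), set()
        stack = [start]
        seen.add(start)
        # Depth-first walk: document -> specific phrase -> other documents.
        while stack:
            doc = stack.pop()
            docs.add(doc)
            for phrase in doc_phrases[doc]:
                if phrase not in specific:
                    continue
                phrases.add(phrase)
                for other in specific[phrase]:
                    if other not in seen:
                        seen.add(other)
                        stack.append(other)
        clusters.append((sorted(docs), sorted(phrases)))
    return clusters


# Hypothetical sample data: two events that share one general phrase.
docs = {
    "d1": {"earthquake", "rescue team", "breaking news"},
    "d2": {"earthquake", "aftershock", "breaking news"},
    "d3": {"election", "ballot count", "breaking news"},
    "d4": {"election", "candidate", "breaking news"},
}
clusters = bicluster(docs)
for cluster_docs, cluster_phrases in clusters:
    print(cluster_docs, cluster_phrases)
```

In this sample, the phrase shared by every document is filtered out, so the two events fall into separate groups, each with its own documents and phrases. Raising `max_doc_share` to 1.0 keeps the shared phrase and merges everything into a single group, which mirrors the abstract's point that general topical phrases do not distinguish events.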

Metadata
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-22750-0_16
