30-10-2023

Graph-Based Interactive Matching for Pairs of News Articles

Authors: Kunhao Pan, Guowei Zhang, Meng Liao, Jin Xu

Published in: Cognitive Computation | Issue 2/2024

Abstract

Long-text document matching is widely used in applications such as topic detection and tracking and related-article recommendation. However, existing methods still have shortcomings in extracting and using long-text features, especially for news articles. In this paper, we propose a novel long-text pair matching framework that converts texts into graphs and uses those graphs comprehensively for interactive matching. We conduct extensive experiments on four datasets: CNSE, CNSS, TNSE, and TNSS. The results show that the proposed EEG model significantly outperforms a wide range of state-of-the-art baselines.
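
The abstract does not say how the graphs are built or how the interaction is computed, so the following Python sketch is illustrative only and should not be read as the authors' EEG model. It assumes, purely for demonstration, that each sentence becomes a node holding a bag of words, that edges link lexically similar sentences, that one round of neighbor aggregation stands in for graph encoding, and that the matching score averages each node's best cross-graph cosine similarity. All function names and the `threshold` parameter are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from math import sqrt


def sentence_bags(text):
    """Split text into sentences; return one bag-of-words Counter per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [Counter(re.findall(r"\w+", s.lower())) for s in sentences]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_graph(text, threshold=0.1):
    """Assumed construction: sentence nodes, edges between similar sentences."""
    nodes = sentence_bags(text)
    edges = {
        (i, j)
        for i, j in combinations(range(len(nodes)), 2)
        if cosine(nodes[i], nodes[j]) > threshold
    }
    return nodes, edges


def propagate(nodes, edges):
    """One round of neighbor aggregation: add each neighbor's counts to a node."""
    out = [Counter(n) for n in nodes]
    for i, j in edges:
        out[i].update(nodes[j])
        out[j].update(nodes[i])
    return out


def match_score(graph_a, graph_b):
    """Average of every node's best cross-graph similarity, in both directions."""
    nodes_a = propagate(*graph_a)
    nodes_b = propagate(*graph_b)
    if not nodes_a or not nodes_b:
        return 0.0
    best_a = [max(cosine(a, b) for b in nodes_b) for a in nodes_a]
    best_b = [max(cosine(b, a) for a in nodes_a) for b in nodes_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


article_1 = "The council approved the budget. The budget funds new schools."
article_2 = "Rain is expected tomorrow. Winds will be strong."
graph_1 = build_graph(article_1)
graph_2 = build_graph(article_2)
print(match_score(graph_1, graph_1), match_score(graph_1, graph_2))
```

In this toy setup, a pair of articles with no shared vocabulary scores zero and an article compared with itself scores approximately one. A learned model would replace the bag-of-words nodes and the cosine interaction with trained representations.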

Footnotes
1
The two datasets will be released if the paper is accepted.
 
Metadata
Title
Graph-Based Interactive Matching for Pairs of News Articles
Authors
Kunhao Pan
Guowei Zhang
Meng Liao
Jin Xu
Publication date
30-10-2023
Publisher
Springer US
Published in
Cognitive Computation / Issue 2/2024
Print ISSN: 1866-9956
Electronic ISSN: 1866-9964
DOI
https://doi.org/10.1007/s12559-023-10208-6
