Published in: Neural Computing and Applications 6/2021

20.06.2020 | Original Article

Learning semantic and relationship joint embedding for author name disambiguation

Authors: Bo Xiong, Peng Bao, Yilin Wu

Abstract

Author name disambiguation is an important research topic in the academic information retrieval community. Existing methods rely either on feature engineering over rich attribute information or on relationship information to measure document similarity, but seldom consider the complementarity and correlation between the two. Feature engineering on attributes, especially rich text, can capture global semantic concepts, while relationship information can encode local structural proximity in multiple academic networks. To bridge the gap between semantic and relationship information in author name disambiguation, this paper presents a joint representation learning approach that encodes both kinds of information into a common low-dimensional space. Specifically, the proposed method consists of four modules: (1) a semantic embedding module; (2) a relationship embedding module; (3) a semantic and relationship joint embedding module; and (4) a clustering module. Experimental results demonstrate that the proposed joint representation learning approach consistently outperforms state-of-the-art methods on three benchmarks.
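The four-module structure described in the abstract can be illustrated with a toy pipeline. This is only a minimal sketch under stated assumptions, not the authors' model: the abstract does not specify the embedding methods, so bag-of-words title vectors stand in for the semantic module, co-author indicator vectors stand in for the relationship module, a weighted concatenation stands in for the learned joint embedding, and a simple threshold-based single-linkage grouping stands in for the clustering module. The function names, the `alpha` weight, the `threshold` value, and the sample papers are all hypothetical.

```python
import math
from collections import Counter


def l2_normalize(vec):
    """Scale a vector to unit length; leave all-zero vectors unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def semantic_vector(title, vocab):
    """Stand-in semantic embedding: normalized word counts of the title."""
    counts = Counter(title.lower().split())
    return l2_normalize([float(counts[w]) for w in vocab])


def relationship_vector(coauthors, people):
    """Stand-in relationship embedding: normalized co-author indicators."""
    return l2_normalize([1.0 if p in coauthors else 0.0 for p in people])


def joint_embedding(sem, rel, alpha=0.5):
    """Stand-in joint embedding: weighted concatenation of both views.

    With unit-length inputs, the dot product of two joint vectors equals
    alpha * cos(sem) + (1 - alpha) * cos(rel).
    """
    ws, wr = math.sqrt(alpha), math.sqrt(1.0 - alpha)
    return [ws * s for s in sem] + [wr * r for r in rel]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def disambiguate(papers, alpha=0.5, threshold=0.3):
    """Group papers sharing an ambiguous author name; returns one label per paper."""
    vocab = sorted({w for title, _ in papers for w in title.lower().split()})
    people = sorted({p for _, coauthors in papers for p in coauthors})
    embs = [
        joint_embedding(semantic_vector(t, vocab), relationship_vector(c, people), alpha)
        for t, c in papers
    ]

    # Single-linkage grouping via union-find: link any pair above the threshold.
    parent = list(range(len(embs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(embs)):
        for j in range(i + 1, len(embs)):
            if cosine(embs[i], embs[j]) >= threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(embs))]


# Hypothetical papers that all list the same ambiguous author name.
papers = [
    ("graph embedding for networks", {"A. Li", "B. Chen"}),
    ("network embedding with graph walks", {"A. Li"}),
    ("protein folding simulation", {"C. Diaz"}),
    ("simulation of protein dynamics", {"C. Diaz", "D. Roy"}),
]
labels = disambiguate(papers)
print(labels)
```

In this toy data, the first two papers overlap in both title words and co-authors, as do the last two, so each pair ends up under one label. Either signal alone would be weaker than the combination, which is the complementarity the abstract points to.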


Metadata
Title
Learning semantic and relationship joint embedding for author name disambiguation
Authors
Bo Xiong
Peng Bao
Yilin Wu
Publication date
20.06.2020
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 6/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-020-05088-y
