
2016 | OriginalPaper | Chapter

Person Name Disambiguation for Building University Knowledge Base

Authors: Piotr Andruszkiewicz, Szymon Szepietowski

Published in: Intelligent Information and Database Systems

Publisher: Springer Berlin Heidelberg

Abstract

In this paper we propose a new algorithm for disambiguating the names of authors of scientific publications. The algorithm is effective, flexible, and tailored to a scientific knowledge base. Besides the common properties of a publication, namely its title, venue, and the names of its author and co-authors, it also exploits references. One reason is that we decided to enrich the University Knowledge Base with connections between publications, rather than storing each reference only as a textual description (i.e. author’s name, title, etc.). Our algorithm takes an unsupervised approach, which does not require creating a training set, a time- and resource-consuming task. However, we also want to leverage additional information, available from crowdsourcing or from authorised users, that confirms authorship and citation relations between papers. With this information, the default parameters of the unsupervised algorithm can be optimised for a given case by means of a genetic algorithm in order to increase accuracy. The proposed method can be applied to three tasks: assigning a publication to a specific researcher, indicating that a new author is not yet known to the database, and clustering a set of publications into groups that each contain the papers of one researcher. Validation results confirm the high accuracy of the new algorithm and its usefulness in populating a scientific knowledge base.
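
The abstract does not give the algorithm's details, so the following is only an illustrative sketch of how the three tasks could fit together, not the authors' method. It assumes a weighted sum of simple overlap measures over the properties the abstract names (title, venue, co-authors, references) and a single decision threshold. The weights, the threshold, the Jaccard measure, and all names (`Publication`, `similarity`, `assign`, `cluster`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Publication:
    title: str
    venue: str
    coauthors: frozenset = field(default_factory=frozenset)
    references: frozenset = field(default_factory=frozenset)


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tokens(text):
    """Lower-cased whitespace tokens of a string, as a set."""
    return frozenset(text.lower().split())


# Hypothetical default weights. The abstract states only that default
# parameters exist and can be tuned by a genetic algorithm when confirmed
# authorship and citation data is available; these values are made up.
DEFAULT_WEIGHTS = {"title": 0.2, "venue": 0.2, "coauthors": 0.3, "references": 0.3}


def similarity(p, q, weights=DEFAULT_WEIGHTS):
    """Weighted similarity of two publications (assumed form, not from the paper)."""
    return (
        weights["title"] * jaccard(tokens(p.title), tokens(q.title))
        + weights["venue"] * (1.0 if p.venue.lower() == q.venue.lower() else 0.0)
        + weights["coauthors"] * jaccard(p.coauthors, q.coauthors)
        + weights["references"] * jaccard(p.references, q.references)
    )


def assign(pub, known, threshold=0.3, weights=DEFAULT_WEIGHTS):
    """Tasks 1 and 2: return the best-matching researcher id, or None
    when no known researcher is similar enough (author treated as new)."""
    best_id, best_score = None, 0.0
    for researcher_id, pubs in known.items():
        score = max((similarity(pub, q, weights) for q in pubs), default=0.0)
        if score > best_score:
            best_id, best_score = researcher_id, score
    return best_id if best_score >= threshold else None


def cluster(pubs, threshold=0.3, weights=DEFAULT_WEIGHTS):
    """Task 3: greedy single-link grouping of publications by similarity."""
    clusters = []
    for pub in pubs:
        for group in clusters:
            if any(similarity(pub, q, weights) >= threshold for q in group):
                group.append(pub)
                break
        else:
            clusters.append([pub])
    return clusters
```

In this sketch, `assign` covers the first two tasks at once: a returned identifier means the publication is attributed to a known researcher, and `None` signals an author not yet in the database. A tuning step like the one the abstract mentions would search over `weights` and `threshold` using user-confirmed authorship as the fitness signal; that step is omitted here.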


Metadata

Title: Person Name Disambiguation for Building University Knowledge Base
Authors: Piotr Andruszkiewicz, Szymon Szepietowski
Copyright Year: 2016
Publisher: Springer Berlin Heidelberg
DOI: https://doi.org/10.1007/978-3-662-49381-6_26
