2016 | OriginalPaper | Book chapter

Ontology Forecasting in Scientific Literature: Semantic Concepts Prediction Based on Innovation-Adoption Priors

Authors: Amparo Elizabeth Cano-Basave, Francesco Osborne, Angelo Antonio Salatino

Published in: Knowledge Engineering and Knowledge Management

Publisher: Springer International Publishing

Abstract

The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, is instead a new challenge. In this paper, we contribute to this novel endeavour by focusing on the task of forecasting semantic concepts in the research domain. Ontologies representing scientific disciplines contain only research topics that are already popular enough to be selected by human experts or automatic algorithms. They are thus unfit to support tasks that require describing and exploring the forefront of research, such as trend detection and horizon scanning. We address this issue by introducing the Semantic Innovation Forecast (SIF) model, which predicts new concepts of an ontology at time \(t+1\), using only data available at time \(t\). Our approach relies on lexical innovation and adoption information extracted from historical data. We evaluated the SIF model on a very large dataset of over one million scientific papers in the Computer Science domain: the outcomes show that the proposed approach offers a competitive boost in mean average precision-at-ten compared to the baselines when forecasting over 5 years.
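
The evaluation metric named in the abstract, mean average precision-at-ten, can be illustrated with a minimal sketch. The exact formulation used in the paper is not given on this page, so the code below assumes one common variant in which the average precision of each ranked list is normalised by min(|relevant|, k). The function names and the sample concept labels are hypothetical and are not taken from the SIF implementation.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Average precision over the top-k items of one ranked list.

    Assumption: the score is normalised by min(len(relevant), k), and
    `ranked` contains no duplicate items.
    """
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    score = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    return score / min(len(relevant), k)


def mean_average_precision_at_k(
    runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 10
) -> float:
    """Mean of the per-list average precision over several forecasts."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Hypothetical example: concepts forecast for t+1 versus concepts that
# actually entered the ontology at t+1.
forecast = ["linked data", "deep learning", "cloud computing", "grid computing"]
observed = {"linked data", "cloud computing"}
ap = average_precision_at_k(forecast, observed, k=10)
```

Here the two observed concepts appear at ranks 1 and 3, so the precision values 1/1 and 2/3 are averaged over the two relevant items. Other normalisations (for example, dividing by the number of hits) are also in use, so scores are only comparable under the same convention.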

Footnotes
3
Notice that we follow a one-step memory approach; further historical data could be used in future research.

4
The data generated in the evaluation are available on request at http://technologies.kmi.open.ac.uk/rexplore/ekaw2016/OF/.

Metadata
Title
Ontology Forecasting in Scientific Literature: Semantic Concepts Prediction Based on Innovation-Adoption Priors
Authors
Amparo Elizabeth Cano-Basave
Francesco Osborne
Angelo Antonio Salatino
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-49004-5_4