Published in: International Journal on Digital Libraries 4/2019

14 March 2018

Capisco: low-cost concept-based access to digital libraries

Authors: Annika Hinze, David Bainbridge, Sally Jo Cunningham, Craig Taube-Schock, Rangi Matamua, J. Stephen Downie, Edie Rasmussen


Abstract

In this article, we present the conceptual design and report on the implementation of Capisco, a low-cost approach to concept-based access to digital libraries. Capisco avoids the need for complete semantic document markup using ontologies by leveraging an automatically generated Concept-in-Context (CiC) network. The network is seeded by a priori analysis of Wikipedia texts and identification of semantic metadata. Our Capisco system disambiguates terms in the documents by their semantics and context and identifies the relevant CiC concepts. Complementing this, search queries are disambiguated interactively, to fully utilize the domain knowledge of the scholar. For established digital library systems, completely replacing, or even significantly changing, the document retrieval mechanism (document analysis, indexing strategy, query processing, and query interface) would require major technological effort and would most likely be disruptive. In addition to presenting Capisco, we describe ways to harness the results of our semantic analysis and disambiguation while retaining the existing keyword-based search and lexicographic index. We engineer this so that the output of semantic analysis (performed off-line) is suitable for direct import into existing digital library metadata and index structures, and can thus be incorporated without architecture modifications.
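The integration idea in the abstract (off-line semantic analysis whose output is imported into an unchanged keyword index) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `CONCEPTS` table, the overlap-based `disambiguate` function, and the `concept:` identifiers are hypothetical and do not reproduce Capisco's CiC network or its disambiguation method.

```python
from collections import defaultdict

# Hypothetical toy concept table: term -> {concept_id: context words}.
# Illustrative only; not taken from Capisco's Concept-in-Context network.
CONCEPTS = {
    "jaguar": {
        "concept:jaguar_animal": {"cat", "jungle", "prey", "species"},
        "concept:jaguar_car": {"engine", "driver", "road", "model"},
    },
}

PUNCTUATION = ".,;:!?()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def disambiguate(term, tokens):
    """Pick the candidate concept whose context words overlap most with the tokens."""
    token_set = set(tokens)
    best, best_score = None, 0
    for concept_id, context in CONCEPTS.get(term, {}).items():
        score = len(context & token_set)
        if score > best_score:
            best, best_score = concept_id, score
    return best  # None when no candidate has supporting context


def concept_metadata(text):
    """Off-line step: return the sorted concept IDs found in one document."""
    tokens = tokenize(text)
    found = {disambiguate(t, tokens) for t in set(tokens) if t in CONCEPTS}
    found.discard(None)
    return sorted(found)


def build_index(docs):
    """Ordinary inverted index; concept IDs are indexed like any other token."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text) + concept_metadata(text):
            index[token].add(doc_id)
    return index


def search(index, query_token):
    """Plain keyword lookup; the retrieval mechanism itself is unchanged."""
    return sorted(index.get(query_token, set()))


docs = {
    "d1": "The jaguar is a large cat that stalks prey in the jungle.",
    "d2": "The new Jaguar model has a powerful engine.",
}
index = build_index(docs)
```

In this sketch a keyword query for `jaguar` matches both documents, while a query for one of the concept identifiers matches only the document whose context supports that sense. The design point is that the concept identifiers are stored as ordinary index tokens, so the keyword search code needs no changes.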


Footnotes
6
Technical non-experts are users who are domain experts but are not familiar with the technical details of semantic concepts [30].
 
10
These documents and other test collections have been provided by the HathiTrust.
 
12
For simplicity, we abstract from the precise locations in which the terms appear on each page.
 
13
Such as the advanced search for HathiTrust items at catalog.hathitrust.org/Search/Advanced.
 
14
The references link to the publications in which the corpora were first introduced.
 
References
1.
Cunningham, S.J., Hinze, A.M., Bainbridge, D., Taube-Schock, C., Ryan, T.: Building heritage document collections for Pacific Island nations using semantic-enriched search. In: Proceedings of the Samoa Conference III. Sāmoa: National University of Sāmoa (2014)
2.
Duineveld, A.J., Stoter, R., Weiden, M.R., Kenepa, B., Benjamins, V.R.: Wondertools? A Comparative Study of Ontological Engineering Tools
3.
Airio, E., Järvelin, K., Saatsi, P., Kekäläinen, J., Suomela, S.: Ciri-an ontology-based query interface for text retrieval. In: Web Intelligence: Proceedings of the 11th Finnish Artificial Intelligence Conference, Citeseer (2004)
4.
Apperley, M., Cunningham, S.J., Keegan, T.T., Witten, I.H.: Niupepa: a historical newspaper collection. Commun. ACM 44(5), 86–87 (2001)
5.
Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval: The Concepts and Technology Behind Search, 2nd edn. Addison-Wesley, Reading (2011)
6.
Bainbridge, D., Don, K.J., Buchanan, G.R., Witten, I.H., Jones, S., Jones, M., Barr, M.I.: Dynamic digital library construction and configuration. In: Heery, R., Lyon, L. (eds.) Proceedings of the Research and Advanced Technology for Digital Libraries: 8th European Conference, ECDL 2004, Bath, UK, September 12–17, 2004, pp. 1–13. Springer, Berlin (2004)
7.
Berrios, D.C.: Methods for Semi-automated Index Generation for High Precision Information Retrieval. PhD thesis, Stanford University (2001)
8.
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3(Jan), 993–1022 (2003)
9.
Bunescu, R.C., Pasca, M.: Using encyclopedic knowledge for named entity disambiguation. In: 11th Conference of the European Chapter of the Association for Computational Linguistics, ACL, pp. 9–16 (2006)
10.
Campbell, I.: The Ostensive Model of Developing Information-Needs. PhD thesis, University of Glasgow (2000)
12.
Churchill, W.: Niue: a reconnaissance. Bull. Am. Geogr. Soc. 40(3), 150–156 (1908)
13.
Cimiano, P., Schultz, A., Sizov, S., Sorg, P., Staab, S.: Explicit versus latent concept models for cross-language information retrieval. IJCAI 9, 1513–1518 (2009)
14.
Cucerzan, S.: Large-scale named entity disambiguation based on Wikipedia data. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Association for Computational Linguistics, Prague, Czech Republic, pp. 708–716 (2007). http://www.aclweb.org/anthology/D07-1074
15.
Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41(6), 391–407 (1990)
16.
Downie, J.S., Cole, T., Senseney, M., Jett, J., Page, K., Hinze, A., Muñoz, T., Audenaert, N.: Workset Creation for Scholarly Analysis: Recommendations and Prototyping Project Reports. University of Illinois at Urbana-Champaign, Tech. rep. (2015)
17.
Dugan, J.M., Berrios, D.C., Liu, X., Kim, D.K., Kaizer, H., Fagan, L.M.: Automation and integration of components for generalized semantic markup of electronic medical texts. In: Proceedings of the AMIA Symposium, American Medical Informatics Association, pp. 736–740 (1999)
18.
Efthimiadis, E.N.: Interactive query expansion: a user-based evaluation in a relevance feedback environment. J. Am. Soc. Inf. Sci. 51(11), 989–1003 (2000)
19.
El-Beltagy, S.R., Rafea, A.: KP-Miner: a keyphrase extraction system for English and Arabic documents. Inf. Syst. 34(1), 132–144 (2009)
21.
Flouris, G., Manakanatas, D., Kondylakis, H., Plexousakis, D., Antoniou, G.: Ontology change: classification and survey. Knowl. Eng. Rev. 23(02), 117–152 (2008)
23.
Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pp. 1606–1611. Morgan Kaufmann (2007)
24.
Ganea, O.E., Ganea, M., Lucchi, A., Eickhoff, C., Hofmann, T.: Probabilistic bag-of-hyperlinks model for entity linking. In: Proceedings of the 25th International Conference on World Wide Web, International World Wide Web Conferences Steering Committee, pp. 927–938 (2016)
25.
Griffiths, T.L., Steyvers, M., Blei, D.M., Tenenbaum, J.B.: Integrating topics and syntax. In: Saul, L.K., Weiss, Y., Bottou, L. (eds.) Proceedings of the 17th International Conference on Neural Information Processing Systems (NIPS’04), pp. 537–544. MIT Press, Cambridge, MA, USA (2004)
27.
Guha, R., McCool, R., Miller, E.: Semantic search. In: Proceedings of the 12th International Conference on World Wide Web. ACM, pp. 700–709 (2003)
28.
Guppy, H.B.: Coral Islands and Savage Myths. Victoria Institute and Philosophical Society of Great Britain, London (1889)
29.
Harris, P., Matamua, R., Smith, T., Kerr, H., Waaka, T.: A review of Māori astronomy in Aotearoa-New Zealand. J. Astron. Hist. Herit. 16(3), 325–336 (2013)
30.
Hinze, A., Heese, R., Luczak-Rösch, M., Paschke, A.: Semantic enrichment by non-experts: usability of manual annotation tools. In: The Semantic Web – ISWC 2012, pp. 165–181. Springer, Berlin (2012)
31.
Hinze, A., Heese, R., Schlegel, A., Luczak-Rösch, M.: User-defined semantic enrichment of full-text documents: experiences and lessons learned. In: Theory and Practice of Digital Libraries, pp. 209–214. Springer, Berlin (2012)
32.
Hinze, A., Taube-Schock, C., Bainbridge, D., Cunningham, S.J., Downie, J.S.: Introducing Capisco: A semantically-enhanced search and discovery system for large-scale text corpora. ACM SIGWEB Newsl. Autumn 2015, 4:1–4:14 (2015). https://doi.org/10.1145/2833219.2833223
33.
Hinze, A., Taube-Schock, C., Bainbridge, D., Matamua, R., Downie, J.S.: Improving access to large-scale digital libraries through semantic-enhanced search and disambiguation. In: Proceedings of the ACM/IEEE-CS on Joint Conference on Digital Libraries, pp. 147–156. ACM (2015)
34.
Hinze, A., Bainbridge, D., Cunningham, S.J., Downie, J.S.: Low-cost semantic enhancement to digital library metadata and indexing: simple yet effective strategies. In: Proceedings of the 16th ACM/IEEE-CS on Joint Conference on Digital Libraries, pp. 93–102. ACM (2016). https://doi.org/10.1145/2910896.2910910
35.
36.
Hoffart, J., Yosef, M.A., Bordino, I., Fürstenau, H., Pinkal, M., Spaniol, M., Taneva, B., Thater, S., Weikum, G.: Robust disambiguation of named entities in text. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp. 782–792 (2011)
37.
Hovy, E., Navigli, R., Ponzetto, S.P.: Collaboratively built semi-structured content and artificial intelligence: the story so far. Artif. Intell. 194, 2–27 (2013)
38.
Huang, A., Milne, D., Frank, E., Witten, I.H.: Clustering documents using a Wikipedia-based concept representation. In: Proceedings of 13th Pacific-Asia Conference, pp. 628–636. Springer, Berlin (2009)
39.
Jean-Louis, L., Zouaq, A., Gagnon, M., Ensan, F.: An assessment of online semantic annotators for the keyword extraction task. In: PRICAI 2014: Trends in Artificial Intelligence, pp. 548–560. Springer, Berlin (2014)
40.
Johnes, A.J.: Johnes on the causes which have produced dissent from the established church in the principality of Wales. Henry Hooper, London (1870)
41.
Jon, K.J., Bainbridge, D., Witten, I.H.: The Design of Greenstone 3: An Agent Based Dynamic Digital Library. Tech. rep., Department of Computer Science, University of Waikato (2002)
42.
Karger, D.: Unference: UI (Not AI) as Key to the Semantic Web. Panel on Interaction Design Grand Challenges and the Semantic Web, at the 3rd International Semantic Web User Interaction Workshop (2006)
44.
Kim, D.K., Fagan, L.M., Jones, K.T., Berrios, D.C., Yu, V.L.: MYCIN II: design and implementation of a therapy reference with complex content-based indexing. In: Proceedings of the AMIA Symposium, pp. 175–179. American Medical Informatics Association (1998)
45.
Köhncke, B., Balke, W.T.: Context-sensitive ranking using cross-domain knowledge for chemical digital libraries. In: International Conference on Theory and Practice of Digital Libraries, pp. 285–296. Springer, Berlin (2013)
46.
Köhncke, B., Siehndel, P., Balke, W.T.: Bridging the gap–using external knowledge bases for context-aware document retrieval. In: International Conference on Asian Digital Libraries, pp. 11–20. Springer, Berlin (2013)
47.
Kohomban, U.S., Lee, W.S.: Learning semantic classes for word sense disambiguation. In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Association for Computational Linguistics, pp. 34–41 (2005)
48.
Kulkarni, S., Singh, A., Ramakrishnan, G., Chakrabarti, S.: Collective annotation of Wikipedia entities in web text. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 457–466. ACM (2009)
49.
Lei, Y., Uren, V., Motta, E.: Semsearch: a search engine for the semantic web. In: International Conference on Knowledge Engineering and Knowledge Management, pp. 238–245. Springer, Berlin (2006)
50.
Leonard, P.: Mining large datasets for the humanities. In: World Library and Information Congress. International Federation of Library Associations (2014)
51.
Lin, Y., Michel, J.B., Aiden, E.L., Orwant, J., Brockman, W., Petrov, S.: Syntactic annotations for the Google books ngram corpus. In: Proceedings of the ACL 2012 System Demonstrations, pp. 169–174. ACL (2012)
52.
Lytras, M., Sicilia, M., Davies, J., Kashyap, V., Stojanovic, N.: On the conceptualisation of the query refinement task. Library Manag. 26(4/5), 281–294 (2005)
53.
Mäkelä, E.: Survey of semantic search research. In: Proceedings of the Seminar on Knowledge Management on the Semantic Web. Department of Computer Science, University of Helsinki, Helsinki (2005)
54.
Mangold, C.: A survey and classification of semantic search approaches. Int. J. Metadata Semant. Ontol. 2(1), 23–34 (2007)
55.
Medelyan, O., Frank, E., Witten, I.H.: Human-competitive tagging using automatic keyphrase extraction. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1318–1327. ACL (2009)
56.
Mihalcea, R., Csomai, A.: Wikify! Linking documents to encyclopedic knowledge. In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, pp. 233–242. ACM (2007)
57.
Milne, D., Witten, I.H.: Learning to link with Wikipedia. In: Proceedings of the ACM Conference on Information and Knowledge Management, pp. 509–518. ACM (2008)
58.
59.
Milne, D., Medelyan, O., Witten, I.H.: Mining domain-specific thesauri from Wikipedia: a case study. In: Proceedings IEEE/WIC/ACM International Conference on Web Intelligence, pp. 442–448. IEEE (2006)
60.
Milne, D.N., Witten, I.H., Nichols, D.M.: A knowledge-based search engine powered by Wikipedia. In: Proceedings of the ACM Conference on Information and Knowledge Management, pp. 445–454. ACM (2007)
61.
Moldovan, D.I., Mihalcea, R.: Using WordNet and lexical operators to improve internet searches. IEEE Internet Comput. 4(1), 34–43 (2000)
62.
Müller, C., Gurevych, I.: Using Wikipedia and Wiktionary in domain-specific information retrieval. In: Cross-Language Evaluation Forum Conference on Evaluating Systems for Multilingual and Multimodal Information Access, pp. 219–226. Springer, Berlin (2009)
63.
Nakayama, K., Hara, T., Nishio, S.: A thesaurus construction method from large scale web dictionaries. In: 21st International Conference on Advanced Information Networking and Applications, 2007 (AINA’07), pp. 932–939. IEEE (2007)
64.
Navigli, R.: Word sense disambiguation: a survey. ACM Comput. Surv. (CSUR) 41(2), 10:1–10:69 (2009)
65.
O’Brien, R.B. (ed.): Home Rule, Speeches by John Redmond. T. F. Unwin, London (1910)
66.
Peat, H.J., Willett, P.: The limitations of term co-occurrence data for query expansion in document retrieval systems. J. Am. Soc. Inf. Sci. 42, 378–383 (1991)
68.
Potthast, M., Stein, B., Anderka, M.: A Wikipedia-based multilingual retrieval model. In: European Conference on Information Retrieval, pp. 522–530. Springer, Berlin (2008)
69.
Ratinov, L., Roth, D., Downey, D., Anderson, M.: Local and global algorithms for disambiguation to Wikipedia. In: Proceedings of the Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 1375–1384. ACL (2011)
70.
Rito, J.S.T., Healy, S.M. (eds.): Proceedings of the Traditional Knowledge Conference 2008: Traditional Knowledge and Gateways to Balanced Relationships. New Zealand’s Māori Centre of Research Excellence (2008)
72.
Zurück zum Zitat Scheau, C., Rebedea, T., Chiru, C., Trausan-Matu, S.: Improving the relevance of search engine results by using semantic information from Wikipedia. In: 9th RoEduNet IEEE International Conference, pp. 151–156. IEEE (2010) Scheau, C., Rebedea, T., Chiru, C., Trausan-Matu, S.: Improving the relevance of search engine results by using semantic information from Wikipedia. In: 9th RoEduNet IEEE International Conference, pp. 151–156. IEEE (2010)
73.
Zurück zum Zitat Shapira, B., Ofek, N., Makarenkov, V.: Exploiting Wikipedia for information retrieval tasks. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR’15, pp. 1137–1140. ACM (2015). https://doi.org/10.1145/2766462.2767879 Shapira, B., Ofek, N., Makarenkov, V.: Exploiting Wikipedia for information retrieval tasks. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR’15, pp. 1137–1140. ACM (2015). https://​doi.​org/​10.​1145/​2766462.​2767879
74.
Zurück zum Zitat Silverstein, C., Henzinger, M., Marais, H., Moricz, M.: Analysis of a very large altavista query log. ACM SIGIR Forum 33, 6–12 (1998)CrossRef Silverstein, C., Henzinger, M., Marais, H., Moricz, M.: Analysis of a very large altavista query log. ACM SIGIR Forum 33, 6–12 (1998)CrossRef
75.
Zurück zum Zitat Sinkkilä, R., Suominen, O., Hyvönen, E.: Automatic semantic subject indexing of web documents in highly inflected languages. In: The Semantic Web: Research and Applications, pp. 215–229. Springer, Berlin (2011) Sinkkilä, R., Suominen, O., Hyvönen, E.: Automatic semantic subject indexing of web documents in highly inflected languages. In: The Semantic Web: Research and Applications, pp. 215–229. Springer, Berlin (2011)
76.
Zurück zum Zitat Soderland, S., Aronow, D., Fisher, D., Aseltine, J., Lehnert, W.: Machine Learning of Text Analysis Rules for Clinical Records. Tech. rep., Dept. of Computer Science, University of Massachusetts (1995) Soderland, S., Aronow, D., Fisher, D., Aseltine, J., Lehnert, W.: Machine Learning of Text Analysis Rules for Clinical Records. Tech. rep., Dept. of Computer Science, University of Massachusetts (1995)
77.
Sorg, P., Cimiano, P.: Exploiting Wikipedia for cross-lingual and multilingual information retrieval. Data Knowl. Eng. 74, 26–45 (2012)
78.
Sowa, J.F.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley Longman, Reading (1984)
79.
Steyvers, M., Griffiths, T.: Probabilistic topic models. Handb. Latent Semant. Anal. 427(7), 424–440 (2007)
80.
Stojanovic, N.: Information-need driven query refinement. Web Intell. Agent Syst. 3(3), 155–169 (2005)
81.
Stojanovic, N., Studer, R., Stojanovic, L.: An approach for step-by-step query refinement in the ontology-based information retrieval. In: International Conference on Web Intelligence, WI’04, pp. 36–43. IEEE (2004). https://doi.org/10.1109/WI.2004.21
82.
Sykes, W.R.: Contributions to the Flora of Niue. Department of Scientific and Industrial Research, Christchurch (1970)
83.
Tregear, E.: The Maori Race. AD Willis, Wanganui (1904)
84.
Voorhees, E.M.: Query expansion using lexical-semantic relations. In: Proceedings of the 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 61–69. Springer, Berlin (1994)
85.
Wallach, H.M.: Topic modeling: beyond bag-of-words. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 977–984. ACM (2006)
86.
Wei, W., Barnaghi, P.M., Bargiela, A.: Search with meanings: an overview of semantic search systems. Int. J. Commun. SIWN 3, 76–82 (2008)
87.
Witten, I., Milne, D.: An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. In: Proceedings of the AAAI Workshop on Wikipedia and Artificial Intelligence: An Evolving Synergy, pp. 25–30. AAAI Press, Chicago (2008)
88.
Witten, I.H., Boddie, S.J., Bainbridge, D., McNab, R.J.: Greenstone: a comprehensive open-source digital library software system. In: Proceedings of the Fifth ACM Conference on Digital Libraries, pp. 113–121. ACM, New York (2000)
89.
Witten, I.H., Bainbridge, D., Nichols, D.M.: How to Build a Digital Library, 2nd edn. Morgan Kaufmann, San Francisco (2009)
90.
Yeh, E., Ramage, D., Manning, C.D., Agirre, E., Soroa, A.: Wikiwalk: random walks on Wikipedia for semantic relatedness. In: Proceedings of the 2009 Workshop on Graph-Based Methods for Natural Language Processing, pp. 41–49. Association for Computational Linguistics (2009)
91.
Yesilada, Y., Bechhofer, S., Horan, B.: Cohse: Dynamic Linking of Web Resources. Tech. rep., Sun Microsystems Inc. (2007)
Metadata
Title
Capisco: low-cost concept-based access to digital libraries
Authors
Annika Hinze
David Bainbridge
Sally Jo Cunningham
Craig Taube-Schock
Rangi Matamua
J. Stephen Downie
Edie Rasmussen
Publication date
14 March 2018
Publisher
Springer Berlin Heidelberg
Published in
International Journal on Digital Libraries, Issue 4/2019
Print ISSN: 1432-5012
Electronic ISSN: 1432-1300
DOI
https://doi.org/10.1007/s00799-018-0232-3
