2016 | OriginalPaper | Chapter

Searching the Web by Meaning: A Case Study of Lithuanian News Websites

Authors: Tomas Vileiniškis, Algirdas Šukys, Rita Butkienė

Published in: Knowledge Discovery, Knowledge Engineering and Knowledge Management

Publisher: Springer International Publishing

Abstract

The daily growth of unstructured textual information created on the Web raises significant challenges when it comes to serving user information needs. At the same time, evolving Semantic Web technology has steered a wide body of research towards meaning-based text processing and information retrieval methods that go beyond classical keyword-driven approaches. However, most of the work in the field targets English as the primary language of interest. Hence, in this paper we present a first attempt to process unstructured Lithuanian text at the level of ontological semantics. We introduce an ontology-based semantic search framework capable of answering structured natural-language questions in Lithuanian, discuss its language-dependent design decisions, and draw some observations from the results of a recent case study carried out over a domain-specific Lithuanian web news corpus.

Metadata
Title
Searching the Web by Meaning: A Case Study of Lithuanian News Websites
Authors
Tomas Vileiniškis
Algirdas Šukys
Rita Butkienė
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-52758-1_4