Published in: Automatic Documentation and Mathematical Linguistics 2/2021

01.03.2021 | INFORMATION SEARCH

On One Approach to the Extraction of Entity and Relationships Names in the Task of Building a Semantic Search Image

Authors: O. L. Golitsina, A. S. Gavrilkina

Abstract

Methods and tools for identifying names of entities and relationships based on lexical and syntactic templates are considered in the framework of the task of semantic indexing of document texts. The content of the text is considered as a set of elementary facts represented by triplets, including the names of entities and relationships. Relations are divided into immanent, situational, and structural–linguistic. A taxonomy of relations is used to typify situational relations, whose classes include linguistic constructions. Immanent relations are formed on the basis of a network of concepts (thesaurus). A taxonomy of properties and units of measurement is used to identify the properties of entities. The proposed approach allows one to use the names of entities and the names of relationships, as well as elementary facts and complete semantic constructions made up of them, as a search query.
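As a rough illustration of the triplet idea described in the abstract, the following minimal sketch stores elementary facts as (entity, relation, entity) triples, fills them from two toy lexical templates, and lets a query fix any subset of the three positions. It is not the authors' implementation: the `Fact` type, the `TEMPLATES` patterns, and the `extract_facts` and `search` helpers are all hypothetical names introduced here, and real templates would rely on morphological and syntactic analysis rather than plain regular expressions.

```python
import re
from typing import List, NamedTuple, Optional


class Fact(NamedTuple):
    """An elementary fact: two entity names linked by a relation name."""
    subject: str
    relation: str
    obj: str


# Hypothetical lexical templates (illustration only): each relation name is
# paired with a regex whose named groups capture the two entity names.
TEMPLATES = {
    "consists of": re.compile(r"(?P<s>[\w ]+?) consists of (?P<o>[\w ]+)"),
    "is used for": re.compile(r"(?P<s>[\w ]+?) is used for (?P<o>[\w ]+)"),
}


def normalize(name: str) -> str:
    """Lowercase an entity name and drop a leading English article."""
    return re.sub(r"^(the|an|a)\s+", "", name.strip().lower())


def extract_facts(sentence: str) -> List[Fact]:
    """Return every fact whose template matches the whole sentence."""
    text = sentence.strip().rstrip(".")
    facts = []
    for relation, pattern in TEMPLATES.items():
        match = pattern.fullmatch(text)
        if match:
            facts.append(
                Fact(normalize(match["s"]), relation, normalize(match["o"]))
            )
    return facts


def search(
    index: List[Fact],
    subject: Optional[str] = None,
    relation: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Fact]:
    """Find facts matching the given positions; None acts as a wildcard."""
    return [
        fact
        for fact in index
        if (subject is None or fact.subject == subject)
        and (relation is None or fact.relation == relation)
        and (obj is None or fact.obj == obj)
    ]
```

With such an index, a query can name only an entity, only a relation, or a complete fact, which mirrors the query options listed at the end of the abstract.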
Footnotes
1
All examples in the paper are given in translation from Russian to English.
 
2
In this case, keywords are understood to be the words and phrases most essential for this purpose, which have a nominative (naming) function. Naming words do not denote an object but single it out by indication. Proper nouns also belong to the category of naming words. In addition to naming words, the corresponding numerical characteristics, chronological data, temperature ranges, pressures, etc. can also be used as keywords.
 
3
In the case where several property names from different fields of application correspond to one unit of measurement, the ambiguity can be resolved at the stage of conceptual analysis in one of two ways: by searching the text (within the processed sentence) for the taxonomy components associated with the identified properties, or by analyzing the terminology of the sentence and of the text as a whole using the thesaurus in order to clarify the field of application. For example, in the sentence we are considering, the term pressure is present, which coincides with the name of the property Pressure in the taxonomy of properties and units.
 
4
For more detail see [20].
 
5
Thus, all relations of the Action type are additionally assigned two modality values, but it is possible to define two values of the first modal property for one relation.
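The disambiguation step described in footnote 3 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `UNIT_TO_PROPERTIES` mapping is a hypothetical fragment and not the taxonomy of properties and units used in the paper, and `resolve_property` is a name introduced here. It covers only the first strategy (looking for a candidate property name inside the processed sentence); the thesaurus-based analysis of the whole text is not modeled.

```python
from typing import Dict, List, Optional

# Hypothetical fragment of a taxonomy of properties and units: one unit may
# correspond to several property names from different fields of application.
UNIT_TO_PROPERTIES: Dict[str, List[str]] = {
    "Pa": ["Pressure", "Mechanical stress"],
    "K": ["Temperature"],
}


def resolve_property(unit: str, sentence: str) -> Optional[str]:
    """Pick the property a unit refers to, using the sentence as context.

    Returns None when the unit is unknown or the sentence does not single
    out exactly one candidate property.
    """
    candidates = UNIT_TO_PROPERTIES.get(unit, [])
    if len(candidates) == 1:
        return candidates[0]
    lowered = sentence.lower()
    # Keep only the candidates whose name occurs in the sentence itself.
    matches = [name for name in candidates if name.lower() in lowered]
    return matches[0] if len(matches) == 1 else None
```

Returning `None` instead of guessing leaves room for a later, wider pass over the whole text when the sentence alone is not enough.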
 
References
1.
Shelmanov, A.O., et al., Open information extraction from texts. Part I. Statement of the problem and review of methods, Iskusstv. Intell. Prinyatie Reshenii, 2018, no. 2, pp. 47–61.
2.
Banko, M., et al., Open information extraction from the web, Proceedings of the 20th International Joint Conference on Artificial Intelligence, San Francisco: Morgan Kaufmann Publ. Inc., 2007, pp. 2670–2676.
3.
Wu, F. and Weld, D.S., Open information extraction using Wikipedia, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala: Assoc. Comput. Linguist., 2010, pp. 118–127.
4.
Fader, A., Soderland, S., and Etzioni, O., Identifying relations for open information extraction, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh: Assoc. Comput. Linguist., 2011, pp. 1535–1545.
5.
Schmitz, M., et al., Open language learning for information extraction, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Jeju Island: Assoc. Comput. Linguist., 2012, pp. 523–534.
6.
Pal, H., et al., Demonyms and compound relational nouns in nominal open IE, Proceedings of the 5th Workshop on Automated Knowledge Base Construction, San Diego: Assoc. Comput. Linguist., 2016, pp. 35–39.
7.
Saha, S., et al., Bootstrapping for numerical open IE, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, vol. 2: Short Papers, Vancouver: Assoc. Comput. Linguist., 2017, pp. 317–323.
8.
Mausam, M., Open information extraction systems and downstream applications, Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, Palo Alto: AAAI Press, 2016, pp. 4074–4077.
9.
Saha, S., et al., Open information extraction from conjunctive sentences, Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe: Assoc. Comput. Linguist., 2018, pp. 2288–2299.
10.
Glauber, R. and Claro, D.B., A systematic mapping study on open information extraction, Expert Syst. Appl., 2018, vol. 112, pp. 372–387.
11.
Ermakov, A.E., Pleshko, V.V., and Mityunin, V.A., RCO Pattern Extractor: Component for highlighting special objects in the text, Sb. trudov XII Mezhdunarodnoi nauchnoi konferentsii “Informatizatsiya i informatsionnaya bezopasnost’ pravookhranitel’nykh organov” (Proc. XII International Scientific Conference “Informatization and Information Security of Law Enforcement Agencies”), Moscow, 2003, pp. 312–317.
12.
Zhigalov, V.A., et al., Alex system as a tool for multi-purpose automated text processing, in Komp’yuternaya lingvistika i intellektual’nye tekhnologii (Computer Linguistics and Intellectual Technologies), Moscow: Nauka, 2002, pp. 192–208.
13.
Bol’shakova, E.I., Efremova, N.E., and Sharikov, G.F., Tools for the development of systems for extracting information from Russian-language texts, Nov. Inf. Tekhnol. Avtom. Sist., 2015, no. 18, pp. 533–543.
14.
Skatov, D.S., Liverko, S.V., and Okat’ev, V.V., The language for describing rules in the system of lexical analysis of NL-texts Dictascope Tokenizer, Komp’yut. Lingvist. Intell. Tekhnol., 2010, vol. 9, no. 16, pp. 442–449.
15.
Tomita Parser. Developer’s Guide. https://yandex.ru/dev/tomita/doc/dg/concept/about.html. Accessed December 28, 2020.
16.
Mikhailov, A.M., Chernyi, A.I., and Gilyarevskii, R.S., Osnovy informatiki (Fundamentals of Informatics), Moscow: Nauka, 1968.
17.
Golitsyna, O.L., Maksimov, N.V., Okropishina, O.V., and Strogonov, V.I., The ontological approach to the identification of information in tasks of document retrieval, Autom. Doc. Math. Linguist., 2012, vol. 46, no. 3, pp. 125–132.
18.
Belonogov, G.G., Bystrov, I.I., Novoselov, A.P., Kozachuk, M.V., Khoroshilov, A.A., and Khoroshilov, A.A., Automatic conceptual text analysis, Autom. Doc. Math. Linguist., 2002, vol. 36, no. 5, pp. 57–65.
19.
Maksimov, N., et al., Ontology of properties and its methods of use: Properties and unit extraction from texts, Procedia Comput. Sci., 2020, vol. 169, pp. 70–75.
20.
Maksimov, N.V., Gavrilkina, A.S., Andronova, V.V., and Tazieva, I.A., Systematization and identification of semantic relations in ontologies for scientific and technical subject areas, Autom. Doc. Math. Linguist., 2018, vol. 52, no. 6, pp. 306–317.
Metadata
Title
On One Approach to the Extraction of Entity and Relationships Names in the Task of Building a Semantic Search Image
Authors
O. L. Golitsina
A. S. Gavrilkina
Publication date
01.03.2021
Publisher
Pleiades Publishing
Published in
Automatic Documentation and Mathematical Linguistics / Issue 2/2021
Print ISSN: 0005-1055
Electronic ISSN: 1934-8371
DOI
https://doi.org/10.3103/S0005105521020023