Automatic Documentation and Mathematical Linguistics

March 1, 2021 | INFORMATION SEARCH

On One Approach to the Extraction of Entity and Relationships Names in the Task of Building a Semantic Search Image

Authors: O. L. Golitsina, A. S. Gavrilkina

Published in: Automatic Documentation and Mathematical Linguistics | Issue 2/2021

Abstract

Methods and tools for identifying names of entities and relationships based on lexical and syntactic templates are considered in the framework of the task of semantic indexing of document texts. The content of the text is considered as a set of elementary facts represented by triplets, including the names of entities and relationships. Relations are divided into immanent, situational, and structural–linguistic. A taxonomy of relations is used to typify situational relations, whose classes include linguistic constructions. Immanent relations are formed on the basis of a network of concepts (thesaurus). A taxonomy of properties and units of measurement is used to identify the properties of entities. The proposed approach allows one to use the names of entities and the names of relationships, as well as elementary facts and complete semantic constructions made up of them, as a search query.
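The representation described in the abstract can be illustrated with a minimal sketch. It assumes that an elementary fact is stored as a triplet of two entity names and a relationship name, tagged with one of the three relation kinds named above. The class `Fact`, the function `search`, and the sample facts are hypothetical and are not taken from the paper; they only show how entity names, relationship names, or a complete fact could each serve as a query.

```python
from dataclasses import dataclass
from typing import List, Optional

# The three relation kinds named in the abstract.
RELATION_KINDS = ("immanent", "situational", "structural-linguistic")


@dataclass(frozen=True)
class Fact:
    """One elementary fact: (entity name, relationship name, entity name)."""
    subject: str
    relation: str
    obj: str
    kind: str

    def __post_init__(self) -> None:
        if self.kind not in RELATION_KINDS:
            raise ValueError(f"unknown relation kind: {self.kind}")


def search(facts: List[Fact],
           subject: Optional[str] = None,
           relation: Optional[str] = None,
           obj: Optional[str] = None) -> List[Fact]:
    """Return facts matching every field that is given; None acts as a wildcard."""
    def matches(value: str, wanted: Optional[str]) -> bool:
        return wanted is None or value.lower() == wanted.lower()

    return [f for f in facts
            if matches(f.subject, subject)
            and matches(f.relation, relation)
            and matches(f.obj, obj)]


# Hypothetical sample index (illustrative only, not from the paper).
index = [
    Fact("pump", "increases", "pressure", "situational"),
    Fact("pump", "is a kind of", "device", "immanent"),
    Fact("valve", "reduces", "pressure", "situational"),
]

by_entity = search(index, subject="pump")           # query by entity name
by_relation = search(index, relation="reduces")     # query by relationship name
by_fact = search(index, "pump", "increases", "pressure")  # query by complete fact
```

Treating an omitted field as a wildcard is a design choice of this sketch: it lets a single function cover queries by entity name, by relationship name, and by a whole fact.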

All examples in the paper are given in translation from Russian to English.

In this case, keywords are understood to be the words and phrases that are most essential for this purpose and that have a nominative (naming) function. Nominal words do not denote an object, but highlight it by directions. Proper nouns also belong to the category of naming words. In addition to naming words, the corresponding numerical characteristics, chronological data, temperature ranges, pressures, etc. can also be used as keywords.

When a unit of measurement corresponds to several property names from different fields of application, the ambiguity can be resolved at the stage of conceptual analysis in one of two ways. Either the text is searched (within the processed sentence) for the components of the taxonomy associated with the identified properties, or the terminology of the sentence and of the text as a whole is analyzed using the thesaurus in order to clarify the field of application. For example, the sentence under consideration contains the term pressure, which coincides with the name of the property Pressure in the taxonomy of properties and units.
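The first of these strategies can be sketched as follows. This is a simplified illustration rather than the authors' procedure: the dictionary `UNIT_TAXONOMY`, its associated terms, and the function `resolve_property` are assumptions made for the example, and plain word overlap within one sentence stands in for the search over taxonomy components.

```python
import re
from typing import Optional

# Hypothetical taxonomy fragment: unit -> candidate properties -> associated terms.
# The pascal is a unit of both pressure and mechanical stress.
UNIT_TAXONOMY = {
    "Pa": {
        "Pressure": {"pressure", "vacuum", "compression"},
        "Mechanical stress": {"stress", "strain", "tensile"},
    },
}


def resolve_property(unit: str, sentence: str) -> Optional[str]:
    """Pick the property whose associated terms occur in the sentence.

    Returns None when the unit is unknown, no term is found, or several
    properties tie, i.e. when the ambiguity cannot be resolved locally.
    """
    candidates = UNIT_TAXONOMY.get(unit, {})
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    # Score each candidate property by the number of its terms in the sentence.
    scores = {name: len(terms & words) for name, terms in candidates.items()}
    best = max(scores.values(), default=0)
    winners = [name for name, score in scores.items() if score == best and score > 0]
    return winners[0] if len(winners) == 1 else None


resolved = resolve_property("Pa", "The pressure in the vessel reached 500 Pa.")
unresolved = resolve_property("Pa", "The measured value was 500 Pa.")
```

A `None` result marks the cases in which the second strategy, analysis of the whole text with the thesaurus, would be needed.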

For more detail see [20].

Thus, all relations of the Action type are additionally assigned two modality values, but it is possible to define two values of the first modal property for one relation.

1. Shelmanov, A.O., et al., Open information extraction from texts. Part I. Statement of the problem and review of methods, Iskusstv. Intell. Prinyatie Reshenii, 2018, no. 2, pp. 47–61.

2. Banko, M., et al., Open information extraction from the web, Proceedings of the 20th International Joint Conference on Artificial Intelligence, San Francisco: Morgan Kaufmann Publ. Inc., 2007, pp. 2670–2676.

3. Wu, F. and Weld, D.S., Open information extraction using Wikipedia, Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Uppsala: Assoc. Comput. Linguist., 2010, pp. 118–127.

4. Fader, A., Soderland, S., and Etzioni, O., Identifying relations for open information extraction, Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh: Assoc. Comput. Linguist., 2011, pp. 1535–1545.

5. Schmitz, M., et al., Open language learning for information extraction, Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Jeju Island: Assoc. Comput. Linguist., 2012, pp. 523–534.

6. Pal, H., et al., Demonyms and compound relational nouns in nominal open IE, Proceedings of the 5th Workshop on Automated Knowledge Base Construction, San Diego: Assoc. Comput. Linguist., 2016, pp. 35–39.

7. Saha, S., et al., Bootstrapping for numerical open IE, Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, vol. 2: Short Papers, Vancouver: Assoc. Comput. Linguist., 2017, pp. 317–323.

8. Mausam, M., Open information extraction systems and downstream applications, Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, Palo Alto: AAAI Press, 2016, pp. 4074–4077.

9. Saha, S., et al., Open information extraction from conjunctive sentences, Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe: Assoc. Comput. Linguist., 2018, pp. 2288–2299.

10. Glauber, R. and Claro, D.B., A systematic mapping study on open information extraction, Expert Syst. Appl., 2018, vol. 112, pp. 372–387.

11. Ermakov, A.E., Pleshko, V.V., and Mityunin, V.A., RCO Pattern Extractor: Component for highlighting special objects in the text, Sb. trudov XII Mezhdunarodnoi nauchnoi konferentsii “Informatizatsiya i informatsionnaya bezopasnost’ pravookhranitel’nykh organov” (Proc. XII International Scientific Conference “Informatization and Information Security of Law Enforcement Agencies”), Moscow, 2003, pp. 312–317.

12. Zhigalov, V.A., et al., Alex system as a tool for multi-purpose automated text processing, in Komp’yuternaya lingvistika i intellektual’nye tekhnologii (Computer Linguistics and Intellectual Technologies), Moscow: Nauka, 2002, pp. 192–208.

13. Bol’shakova, E.I., Efremova, N.E., and Sharikov, G.F., Tools for the development of systems for extracting information from Russian-language texts, Nov. Inf. Tekhnol. Avtom. Sist., 2015, no. 18, pp. 533–543.

14. Skatov, D.S., Liverko, S.V., and Okat’ev, V.V., The language for describing rules in the system of lexical analysis of NL-texts Dictascope Tokenizer, Komp’yut. Lingvist. Intell. Tekhnol., 2010, vol. 9, no. 16, pp. 442–449.

15. Tomita Parser. Developer’s Guide. https://yandex.ru/dev/tomita/doc/dg/concept/about.html. Accessed December 28, 2020.

16. Mikhailov, A.M., Chernyi, A.I., and Gilyarevskii, R.S., Osnovy informatiki (Fundamentals of Informatics), Moscow: Nauka, 1968.

17. Golitsyna, O.L., Maksimov, N.V., Okropishina, O.V., and Strogonov, V.I., The ontological approach to the identification of information in tasks of document retrieval, Autom. Doc. Math. Linguist., 2012, vol. 46, no. 3, pp. 125–132.

18. Belonogov, G.G., Bystrov, I.I., Novoselov, A.P., Kozachuk, M.V., Khoroshilov, A.A., and Khoroshilov, A.A., Automatic conceptual text analysis, Autom. Doc. Math. Linguist., 2002, vol. 36, no. 5, pp. 57–65.

19. Maksimov, N., et al., Ontology of properties and its methods of use: Properties and unit extraction from texts, Procedia Comput. Sci., 2020, vol. 169, pp. 70–75.

20. Maksimov, N.V., Gavrilkina, A.S., Andronova, V.V., and Tazieva, I.A., Systematization and identification of semantic relations in ontologies for scientific and technical subject areas, Autom. Doc. Math. Linguist., 2018, vol. 52, no. 6, pp. 306–317.

Title: On One Approach to the Extraction of Entity and Relationships Names in the Task of Building a Semantic Search Image
Authors: O. L. Golitsina, A. S. Gavrilkina
Publication date: March 1, 2021
Publisher: Pleiades Publishing
Published in: Automatic Documentation and Mathematical Linguistics / Issue 2/2021
Print ISSN: 0005-1055
Electronic ISSN: 1934-8371
DOI: https://doi.org/10.3103/S0005105521020023
