Published in: Automatic Documentation and Mathematical Linguistics 3/2022

June 1, 2022 | INFORMATION SEARCH

From Semantic to Cognitive Information Search: The Fundamental Principles and Models of Deep Semantic Search

Authors: N. V. Maksimov, O. L. Golitsyna


Abstract

This paper considers the features of human-machine documentary search aimed at the information support of cognitive processes. The concepts of meaning and semantic information search are analyzed. The concept of deep semantic search is introduced: an interactive process whose search mechanisms operate on a knowledge graph and resemble the operations of consciousness and cognition. The concept of cognitive information search is also introduced. It is treated as the construction of a path of cognition, that is, the interactive, iterative formation of a target fact from a chaotic set of found facts, in which each step depends significantly on the previous result. The result of such a search is (1) a selection of document fragments that meet the real information need (rather than copies of documents that meet the expressed need, as in traditional documentary retrieval systems) and (2) an interactively generated semantic graph, a conceptual image of the solution to the user’s problem. Mathematical models of deep semantic search on knowledge graphs are given.
Footnotes
1
Unlike the construction of a logical conclusion, which yields new knowledge (and, accordingly, an increase in information), the purpose of information search is to select information (data, facts, etc.) that forms a set of alternative, possibly hypothetical solutions, i.e., to expand the space of choice (and, accordingly, to increase entropy locally).
 
2
Context in [8] is defined as “the environment of the event, which provides the opportunity (resources) for its adequate interpretation.”
 
3
Examples of semantic fields: the field of time, the field of chemistry, the field of names, the field of verbs of motion, the field of prepositions, the field of suffixes, etc., each of which is characterized by a certain differential feature and function.
 
4
In general, a field can be said to be given on the space containing the points M if there is a rule that assigns a certain value K to each point M.
 
5
Here it is necessary to distinguish between search and find. Search is an organized, purposeful process that involves first forming a direct or indirect image of the desired (potentially useful) object, which is then compared with the image (features) of an object stored in a certain environment. Find is a non-directed process (not organized specifically for this purpose), in which a potentially useful object enters the subject's field of view and can be selected by them.
 
6
Information search is secondary because the objects and processes that become, among other things, the needs and objects of a subject's activity exist and are transformed in the energy-material sphere (it is impossible to live without food, but quite possible to live without information about the properties of that food).
 
7
We can say that the cognitive state of the subject, formed by goals, possible methods, etc., is the meaning: it arises with thoughts (Russian: smysl—s myslyami) about the goal, ways to achieve it, etc., and it reflects their essence.
 
8
The corresponding review of approaches and solutions is quite fully presented in [13].
 
9
This paper does not consider the so-called question-answer systems, which have the ability to extract all possible information from an array as consequences, deduced through logic. Such systems are based on the thesis [14]: “The question, through its subject, sets the area of alternatives, and then prefaces the existing list with instructions, in accordance with which it is supposed to construct a direct answer.” However, this path is not trivial, as the choice of instruction is determined not only by the essence of the subject of the question, but also by the intention of the user, and the content of the answer essentially depends on the subject area and the nature of the problem solved by the user. This predetermines that the area of effective application of such systems is extremely limited and can in no way be extended to scientific research.
 
10
Such broad capabilities, implemented differently in different systems, mean that a request expressed in different ways (for example, by rearranging words) and sent to different information resources produces different results. This is easy to check, for example, by searching Yandex and Google for the topic “viburnum for lowering cholesterol.” Moreover, since users cannot know the details of the request processing technology (the factors that affect the selection and formation of the output, in particular the mechanics of selection and ranking), they cannot objectively assess the informational completeness of the result or determine how to develop the request further.
 
11
The adjective semantic here emphasizes that meaning becomes an element of processing, since concepts are used in interrelation.
 
12
Cognitiveness is considered here as a metaphor, as an image of a super-task and a prototype that coordinates information objects and synchronizes processes in the overall system cognition–information search, but not as an intelligent system for deriving new knowledge based on the documents found. The term cognitive information search is introduced as a designation of the IRS class, which includes, in addition to deep search, the mechanisms of semantic and contextual search.
 
13
At the very moment the user receives a retrieved document (i.e., perceives and understands its meaning), its content is already being used to synthesize a possible solution to a pragmatic problem; that is, the user switches to cognitive functions, interrupting the information activity.
 
14
This is fundamentally different from the ideology of the classical search, built on a query-response scheme, which assumes that the output is formed according to a semantically complete query expression.
 
15
Technological functions can also be intelligent: adaptive resource selection, flexible semantic matching criterion, adaptive interfaces, etc.
 
16
Including language as a means of communication and knowledge.
 
17
It should be noted that superposition punched cards, invented long ago (Taylor H. Selective device: patent 1,165,465. – US, 1915), combined with the long-forgotten KWIC (key word in context) permuted indexes, can be considered a prototype of such deep search tools. However, such a combination of paper objects is clearly technologically inconvenient and therefore ineffective.
 
18
According to the terminology of R.S. Taylor [9].
 
19
Maksimov N.V., Golitsyna O.L., Monankov K.V., and Gavrilkina A.S., Documentary information-analytical system xIRBIS (version 6.0): computer program, RF Certificate of state registration no. 2020661683, 2020.
 
20
The term seek here refers to non-semantic search engines, i.e., those that match essentially by full or partial coincidence.
 
21
In the search image, a fact is a triplet—a pair of entities connected by a typed relation.
 
22
Visualization in our case is considered as a search carried out by ranking an array. The search effort is reduced by narrowing the space of perception and by ordering and formatting graph elements in accordance with the semantics of the document and the pragmatics of the task.
 
23
Examples and illustrations of the operation of deep semantic search mechanisms will be given in the article “Onto-graph mechanisms of deep semantic search.”
 
24
A search unit (result) in traditional documentary information search systems is a document returned by the system in response to a previously formulated and preferably semantically complete query. It is important to note that in this case exhaustive completeness and accuracy cannot be achieved in principle, both because of the system of uncertainties inherent in information systems and because the search must rest on a conceptual and lexical community that ensures mutual understanding between the user and the system.
 
25
A model is understood to be a description that allows one to reproduce the object in one or another specified form and volume.
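
Footnote 21 describes a fact in the search image as a triplet: a pair of entities connected by a typed relation. The abstract describes search as an iterative process in which each step depends on the previous result. The Python sketch below only illustrates these two ideas on a toy set of facts. It is not the authors' mathematical model. The names `Fact`, `expand`, `build_path` and `TOY_FACTS` are hypothetical, the data are invented placeholders, and the user's interactive choice is replaced by the simplifying assumption that every fact found is accepted.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A fact as a triplet: two entities joined by a typed relation."""
    subject: str
    relation: str
    obj: str


def expand(facts, frontier, seen):
    """Return facts not yet seen that touch a frontier entity,
    together with the entities they newly reach."""
    found = [
        f for f in facts
        if f not in seen and (f.subject in frontier or f.obj in frontier)
    ]
    reached = {e for f in found for e in (f.subject, f.obj)}
    return found, reached - frontier


def build_path(facts, start, max_steps=10):
    """Iteratively grow a path of facts from a start entity.
    Each step depends on the frontier left by the previous step.
    Assumption: every found fact is accepted (no user filtering)."""
    frontier = {start}
    path = []
    for _ in range(max_steps):
        found, new_entities = expand(facts, frontier, path)
        if not found:
            break  # nothing new can be reached
        path.extend(found)
        frontier |= new_entities
    return path


# Hypothetical placeholder facts; entity and relation names carry no meaning.
TOY_FACTS = [
    Fact("a", "part_of", "b"),
    Fact("b", "causes", "c"),
    Fact("d", "is_a", "e"),
]

path = build_path(TOY_FACTS, "a")
for fact in path:
    print(fact.subject, fact.relation, fact.obj)
```

Starting from entity `a`, the loop reaches the facts connected to it step by step and never touches the unrelated fact about `d` and `e`. In an interactive setting, the acceptance step inside `build_path` is where a user's selection would take place.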
 
References
1. Mikhailov, A.M., Chernyi, A.I., and Gilyarevskii, R.S., Osnovy informatiki (Foundations of Informatics), Moscow: Nauka, 1968.
2. Todd, P.M., Hills, T.T., and Robbins, T.W., Search, goals, and the brain, Cognitive Search: Evolution, Algorithms, and the Brain, Massachusetts: MIT Press, 2012, pp. 125–156.
4. Gualtieri, M., The Forrester wave: Cognitive search and knowledge discovery solution. https://www.forrester.com/blogs/17-06-12-cognitive_search_is_the_ai_version_of_enterprise_search. Cited April 2, 2022.
6. Kravchenko, A.V., Methodological foundations of cognitive analysis of meaning, Kognitivnyi analiz slova (Cognitive Analysis of Word), Irkutsk: Irkutskaya Gos. Ekon. Akad., 2000.
7. Filosofskii entsiklopedicheskii slovar’. Sovetskaya entsiklopediya (Philosophical Encyclopedic Dictionary: Soviet Encyclopedia), Il’ichev, L.F., Fedoseev, P.N., Kovalev, S.M., and Panov, V.G., Moscow: Sovetskaya Entsiklopediya, 1983.
8. Goodwin, C. and Duranti, A., Rethinking context: An introduction, Rethinking Context: Language as an Interactive Phenomenon, Duranti, A. and Goodwin, Ch., Eds., Studies in the Social and Cultural Foundations of Language, vol. 11, Cambridge: Cambridge Univ. Press, 1992, pp. 1–41.
9. Taylor, R.S., Question-negotiation and information seeking in libraries, College Res. Libr., 1968, vol. 29, no. 3, pp. 178–194.
10. Salton, G., Dynamic Information and Library Processing, Englewood Cliffs, N.J.: Prentice-Hall, 1975.
14. Belnap, N.D., Jr., and Steel, T.B., Jr., The Logic of Questions and Answers, New Haven, Conn.: Yale Univ. Press, 1976.
16. Maksimov, N.V., Golitsyna, O.L., Monankov, K.V., and Gavrilkina, A.S., Methods of visual graphoanalytical representation and search of scientific and technical texts, Nauchn. Visualizatsiya, 2021, vol. 13, no. 1, pp. 138–161.
17. Cognitive search. https://ru.wikipedia.org/wiki/%D0%9A%D0%BE%D0%B3%D0%BD%D0%B8%D1%82%D0%B8%D0%B2%D0%BD%D1%8B%D0%B9_%D0%BF%D0%BE%D0%B8%D1%81%D0%BA. Cited April 2, 2022.
19. Van Dijk, T.A., et al., On macrostructures, mental models, and other inventions: a brief personal history of the Kintsch-van Dijk theory, Discourse Comprehension: Essays in Honor of Walter Kintsch, Lawrence Erlbaum Associates, 1995, pp. 383–410.
20. Van Dijk, T.A. and Kintsch, W., Strategies of Discourse Comprehension, New York: Academic Press, 1983.
21. Zhang, X., Yang, A., Li, S., and Wang, Y., Machine reading comprehension: a literature review, 2019. arXiv:1907.01686 [cs.CL]
22. Chen, S., Wang, Y., Liu, J., and Wang, Y., Bidirectional machine reading comprehension for aspect sentiment triplet extraction, Proc. AAAI Conf. Artif. Intell., 2021, vol. 35, no. 14, pp. 12666–12674.
23. Zheng, Y., Mao, J., Liu, Y., Ye, Z., Zhang, M., and Ma, S., Human behavior inspired machine reading comprehension, SIGIR’19: Proc. 42nd Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, Paris, 2019, New York: Association for Computing Machinery, 2019, pp. 425–434. https://doi.org/10.1145/3331184.3331231
26. Shah, P., A model of the cognitive and perceptual processes in graphical display comprehension, AAAI Technical Report FS-97-03, 1997, pp. 94–101.
27. Peirce, C.S., Existential graphs. http://www.jfsowa.com/peirce/ms514.htm. Cited October 2, 2021.
30. Chernavskii, D.S., Sinergetika i informatsiya (Synergetics and Information), Moskva: Editorial URSS, 2004.
Metadata
Title
From Semantic to Cognitive Information Search: The Fundamental Principles and Models of Deep Semantic Search
Authors
N. V. Maksimov
O. L. Golitsyna
Publication date
June 1, 2022
Publisher
Pleiades Publishing
Published in
Automatic Documentation and Mathematical Linguistics / Issue 3/2022
Print ISSN: 0005-1055
Electronic ISSN: 1934-8371
DOI
https://doi.org/10.3103/S0005105522030074
