Global machine learning for spatial ontology population

doi:10.1016/j.websem.2014.06.001

Journal of Web Semantics

Volume 30, January 2015, Pages 3-21

https://doi.org/10.1016/j.websem.2014.06.001

Abstract

Understanding spatial language is important in many applications such as geographical information systems, human computer interaction or text-to-scene conversion. Due to the challenges of designing spatial ontologies, the extraction of spatial information from natural language still has to be placed in a well-defined framework. In this work, we propose an ontology which bridges between cognitive–linguistic spatial concepts in natural language and multiple qualitative spatial representation and reasoning models. To make a mapping between natural language and the spatial ontology, we propose a novel global machine learning framework for ontology population. In this framework we consider relational features and background knowledge which originate from both ontological relationships between the concepts and the structure of the spatial language. The advantage of the proposed global learning model is the scalability of the inference, and the flexibility for automatically describing text with arbitrary semantic labels that form a structured ontological representation of its content. The machine learning framework is evaluated with SemEval-2012 and SemEval-2013 data from the spatial role labeling task.

Introduction

An essential function of natural language is to talk about the location and translocation of objects in space. Understanding spatial language is important in many applications such as geographical information systems (GIS), human-computer interaction, text-to-scene conversion, and the representation and extraction of spatial information from web resources such as travelers' blogs or websites about tourism. Due to the complexity of spatial primitives and notions, and the challenges of designing ontologies for formal spatial representation, the extraction of spatial semantics from natural language still has to be placed in a well-defined framework.

We have two main contributions toward solving this problem. The first contribution is that we propose a spatial ontology based on two layers of semantics. This ontology is based on a previously proposed spatial annotation scheme by the authors [1]. Its first layer is based on commonly accepted cognitive spatial notions and the second is based on multiple well-known qualitative spatial reasoning models. An automatic mapping to such an ontology bridges between natural language and qualitative spatial representation and reasoning models, which makes automatic spatial reasoning based on spatial information in linguistic expressions feasible. This ontology can be integrated in larger ontologies, for example, to represent spatial meaning in unstructured data in the context of the Semantic Web.

The second contribution of this work is that we propose a novel global supervised machine learning model for spatial ontology population. For this supervised learning framework, we build rich annotated corpora and an evaluation scheme. We point to the linguistic features and structural characteristics of spatial language that aid the use of machine learning. We view ontology population as a means for creating meaning representations from text. In this model the segments of the input text are described by semantic abstractions or concepts and their relationships defined by the ontology, which form the output space of the learning problem [2]. In the proposed global learning framework, the ontology components including spatial roles and their relations, and multiple formal semantic types are learned while taking into account the ontological constraints and the structural characteristics of the spatial language.

Learning a model that considers the global correlations between the output components usually becomes computationally complex. To deal with the complexity in the training and prediction phases, we use an efficient inference approach based upon combinatorial optimization techniques for both phases. This approach can deal with a large number of variables and constraints, and makes building a structured machine learning model for ontology population feasible.

We decompose the learning problem into simpler problems that are jointly optimized. We propose a technique, which we call communicative inference, based on the ideas of alternating optimization for solving smaller subproblems of the main objective function [3]. Each subproblem is solved using linear programming (LP) solvers, and the subproblems communicate with each other by passing their local solutions. We show that the suggested framework is beneficial compared to local learning as well as compared to pipelining the independently learned models for the concepts in the ontology. The proposed inference approach makes the global learning scalable.
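The alternating scheme described above can be illustrated with a minimal sketch. It is not the paper's inference procedure: the objective, the variable names (`role_scores`, `relation_scores`, `coupling`) and the toy numbers are assumptions made for illustration. With one block of binary variables held fixed, the objective is linear in the other block, so a box-constrained LP has its optimum at a vertex; the closed-form `solve_block` below stands in for a call to an LP solver.

```python
from typing import List, Tuple


def solve_block(coefficients: List[float]) -> List[int]:
    """Maximize sum(c[k] * z[k]) subject to 0 <= z[k] <= 1.

    A linear objective over a box is maximized at a vertex, so this
    closed form stands in for an LP solver call (an assumption of the sketch).
    """
    return [1 if c > 0 else 0 for c in coefficients]


def objective(role_scores, relation_scores, coupling, a, b) -> float:
    """Toy global objective: unary terms for both blocks plus pairwise coupling."""
    total = sum(u * ai for u, ai in zip(role_scores, a))
    total += sum(v * bj for v, bj in zip(relation_scores, b))
    total += sum(coupling[i][j] * a[i] * b[j]
                 for i in range(len(a)) for j in range(len(b)))
    return total


def alternating_inference(role_scores, relation_scores, coupling,
                          max_rounds: int = 20) -> Tuple[List[int], List[int]]:
    """Alternate between the two subproblems, passing each local solution on."""
    n, m = len(role_scores), len(relation_scores)
    a = solve_block(role_scores)       # local solution, ignoring the other block
    b = solve_block(relation_scores)   # local solution, ignoring the other block
    for _ in range(max_rounds):
        # Subproblem 1: b is fixed, so the objective is linear in a.
        new_a = solve_block([role_scores[i] + sum(coupling[i][j] * b[j] for j in range(m))
                             for i in range(n)])
        # Subproblem 2: the fresh a is passed on and held fixed.
        new_b = solve_block([relation_scores[j] + sum(coupling[i][j] * new_a[i] for i in range(n))
                             for j in range(m)])
        if new_a == a and new_b == b:
            break  # neither block changed: a fixed point of the alternation
        a, b = new_a, new_b
    return a, b


# Hypothetical toy scores, not taken from the paper.
role_scores = [1.0, -0.5]
relation_scores = [-0.2, 0.3]
coupling = [[0.5, 0.0], [1.0, -1.0]]
a, b = alternating_inference(role_scores, relation_scores, coupling)
```

Each block update maximizes the objective with the other block fixed, so the objective never decreases between rounds. The fixed point reached is in general only a local optimum, which matches the approximate character of this kind of decomposition.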

The application of the global machine learning model for ontology population is not limited to the extraction of spatial semantics; it could be used to populate any ontology. Moreover, because the ontology is decomposed into solvable parts, the approach scales to approximate global learning for large ontologies of the Semantic Web. We argue therefore that this work is an important step towards automatically describing text with semantic labels that form a structured ontological representation of the content.

Our extensive experimental study using the spatial ontology indicates the advantage of global learning, which considers ontological constraints and structural characteristics of the spatial language, over learning local models for the various parts of the ontology independently. The experiments are performed using the corpora provided by the SemEval-2012 and SemEval-2013 shared tasks on spatial role labeling.

In Section 2, we provide the problem definition and the spatial ontology population task in its two layers of semantics. In Section 3, we discuss the features and constraints that are useful for learning the spatial ontology population. A background to structured learning is provided in Section 4. The proposed structured learning model for spatial ontology population is described in Section 5. The proposed inference approach is explained in Section 6. Section 7 specifies the details of the components of the spatial ontology population model. The various designed local and global models are clarified in Section 8. Section 9 presents the experimental results. An overview of the related research is provided in Section 10. We draw conclusions, set our work in a broader context, and point to the future extensions in Section 11.

Section snippets

General problem definition

We define a framework for mapping natural language to spatial ontologies. Although pragmatic, our proposed framework is based on theoretical cognitive and linguistic foundations, as well as on cognitively adequate formal spatial models. The task is formulated as an ontology population task to be performed via supervised machine learning models. We aim at learning to assign the segments in the sentence to the concepts in the ontology. The considered concepts form a lightweight ontology which is

Constraints and features for the machine learning models

As in other computational linguistic tasks, the lexical, syntactic and semantic features of language can help with the extraction of spatial semantics. There is also linguistic and commonsense background knowledge on the spatial language to be exploited when designing an intelligent model for automatic spatial semantic extraction. In this section we aim to specify all types of information that can be useful for the machine learning models that we design. We divide these characteristics in two

Structured learning setting

In learning models for structured output prediction, given a set of $N$ input–output pairs of training examples $E = \{(x^{i}, y^{i}) \in X \times Y : i = 1, \dots, N\}$, we learn an objective function $g(x, y; W)$, which is a linear discriminant function defined over the combined feature representation of the inputs and outputs, denoted by $f(x, y)$ [23]: $g(x, y; W) = \langle W, f(x, y) \rangle$. $W$ denotes a weight vector and $\langle \cdot, \cdot \rangle$ denotes a dot product between two vectors. A popular discriminative training approach is to minimize the following convex upper
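As a rough illustration of the linear discriminant function $g(x, y; W)$ defined above, the following sketch scores a candidate output with a dot product over a joint feature map and predicts by taking the highest-scoring candidate. The feature templates, the label names and the exhaustive search are assumptions chosen for a tiny example; they are not the paper's feature set or inference method.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

Features = Dict[str, float]


def joint_features(x: Sequence[str], y: Sequence[str]) -> Features:
    """Hypothetical f(x, y): counts of (word, label) pairs and adjacent label pairs."""
    feats: Features = {}
    for word, label in zip(x, y):
        key = f"emit:{word}:{label}"
        feats[key] = feats.get(key, 0.0) + 1.0
    for prev, nxt in zip(y, y[1:]):
        key = f"trans:{prev}:{nxt}"
        feats[key] = feats.get(key, 0.0) + 1.0
    return feats


def score(w: Features, x: Sequence[str], y: Sequence[str]) -> float:
    """g(x, y; W) = <W, f(x, y)> as a sparse dot product."""
    return sum(w.get(k, 0.0) * v for k, v in joint_features(x, y).items())


def predict(w: Features, x: Sequence[str], labels: Sequence[str]) -> Tuple[str, ...]:
    """Return the highest-scoring label sequence by exhaustive enumeration.

    Enumeration is exponential in len(x); it is used here only because
    the example is tiny.
    """
    return max(product(labels, repeat=len(x)), key=lambda y: score(w, x, y))


# Illustrative label names and hand-set weights (assumptions of this sketch).
labels = ("trajector", "indicator", "landmark", "none")
w = {
    "emit:book:trajector": 2.0,
    "emit:on:indicator": 2.0,
    "emit:table:landmark": 2.0,
    "trans:trajector:indicator": 0.5,
}
x = ("book", "on", "table")
best = predict(w, x, labels)
```

Because the score is linear in $W$, any training method only has to adjust the weights of features that distinguish the desired output from competing outputs.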

Link-And-Label model

We aim to provide a simple and useful abstraction for designing global structured learning models for ontology population from text that is easily integrated in the above non-probabilistic structured output prediction models. We specify the learning components including input, output, joint feature function, global constraints, loss and inference in a framework which we name Link-And-Label (LAL) framework. The Link-And-Label name is inspired by the conceptualization process that a human does

Communicative inference

Solving the LAL objective function, given in Eq. (2), during training of the model can become highly inefficient for most relational data domains. This is because the objective function and the constraints are in fact expressed in a first order representation (i.e. templates and types), and the corresponding ontologies or output label structures often produce a large number of output labels and constraints when instantiated for each training example. To solve this problem we propose an

Model specification

In this section, we formulate the problem of mapping natural language to spatial ontologies. We represent the supervised structured learning model designed for solving this problem using the Link-And-Label model described in Section 5 and specify: (a) the input components and types; (b) the output single labels, linked labels and global constraints over the output structure; (c) the joint feature templates, candidate generation for the templates and the main objective function.

Local–global training and prediction models

In this section we collect the required pieces from the last sections and discuss the model variations belonging to the spectrum of local and global training and prediction models that we design.

The global loss augmented objective function of our problem is built by adding the components of Eqs. (19), (20). We train the parameters $W$ of the function $g$ in the framework of discriminative inference-based structured prediction models such as structured SVM, structured perceptron and average
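Of the training methods named above, the averaged structured perceptron is the simplest to sketch. The version below is a generic textbook form under stated assumptions: the feature templates, label names and toy data are hypothetical, prediction is done by brute-force enumeration, and the loss augmentation and ontological constraints discussed in the text are left out.

```python
from collections import defaultdict
from itertools import product


def features(x, y):
    """Hypothetical joint feature map: word/label and label/label counts."""
    feats = defaultdict(float)
    for word, label in zip(x, y):
        feats[("emit", word, label)] += 1.0
    for prev, nxt in zip(y, y[1:]):
        feats[("trans", prev, nxt)] += 1.0
    return feats


def argmax(w, x, labels):
    """Brute-force inference over all label sequences (tiny inputs only)."""
    return max(product(labels, repeat=len(x)),
               key=lambda y: sum(w.get(k, 0.0) * v for k, v in features(x, y).items()))


def train_averaged_perceptron(data, labels, epochs=5):
    """Return the average of the weight vectors seen after every training step."""
    w = defaultdict(float)
    total = defaultdict(float)
    steps = 0
    for _ in range(epochs):
        for x, gold in data:
            pred = argmax(w, x, labels)
            if pred != tuple(gold):
                # Move weights towards the gold structure and away from the prediction.
                for k, v in features(x, gold).items():
                    w[k] += v
                for k, v in features(x, pred).items():
                    w[k] -= v
            steps += 1
            for k, v in w.items():
                total[k] += v
    return {k: v / steps for k, v in total.items()}


# Toy data with illustrative role names (assumptions of this sketch).
labels = ("trajector", "indicator", "landmark", "none")
data = [
    (("book", "on", "table"), ("trajector", "indicator", "landmark")),
    (("cat", "under", "chair"), ("trajector", "indicator", "landmark")),
]
weights = train_averaged_perceptron(data, labels, epochs=3)
```

Averaging the weights over all steps, rather than keeping only the final vector, is the usual way to reduce the sensitivity of the perceptron to the order of the training examples.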

Experiments

For the extraction of the linguistic features we use the LTH tool, which produces features in the CoNLL-08 format. The applied machine learning techniques are the structured SVM using the SVM-struct Matlab wrapper [34] (coded as SSVM) and our implementation of the structured perceptron (coded as SPerc) and the averaged structured perceptron (coded as AvGSPerc). For local learning settings a

Related work

The ontology we use in this work is based on the spatial annotation scheme that we have proposed in a previous work [1]. We discuss the two layers of semantics and the adequacy of mapping to qualitative spatial representation and reasoning models in [12], [8], [9]. We have previously developed machine learning models, but they were restricted to the annotation of text with the concepts of the SpRL layer [10], [38]. The SpRL layer has been worked out by the participants of a semantic

Conclusions

We have proposed a framework for representing the spatial semantics in natural language in terms of multiple calculi models. Moreover, a novel structured machine learning framework for mapping natural language to ontologies is provided. We propose a framework that we call Link-And-Label which is able to deal with relational data both in the input and in the output and is able to consider ontological relationships and background knowledge about the task during training and prediction. Using the

Acknowledgments

The research was funded by the KU Leuven grant DBOF/08/043, the EU FP7-296703 project MUSE (Machine Understanding for interactive Story tElling) and by the KU Leuven Postdoctoral grant PDMK/13/115.

References (53)

L.A. Carlson et al., The space in spatial language, J. Mem. Lang. (2004)
B. Kuipers, The spatial semantic hierarchy, Artif. Intell. (2000)
Y. Cao et al., A structural support vector method for extracting contexts and answers of questions from online forums, Inf. Process. Manage. (2011)
P. Kordjamshidi, M. van Otterlo, M.F. Moens, Spatial role labeling: task definition and annotation scheme, in: N....
G. Petasis et al., Knowledge-Driven Multimedia Information Extraction and Ontology Evolution (2011)
J.C. Bezdek et al., Some notes on alternating optimization
J. Hois et al., Natural language meets spatial calculi
J. Renz et al., Qualitative spatial reasoning using constraint calculi
J.A. Bateman, Language and space: a two-level semantic approach based on principles of ontological engineering, Int. J. Speech Technol. (2010)
P. Kordjamshidi, M. van Otterlo, M.F. Moens, From language towards formal spatial calculi, in: R. J. Ross, J. Hois, J....
P. Kordjamshidi, J. Hois, M. van Otterlo, M.F. Moens, Machine learning for interpretation of spatial natural language...
P. Kordjamshidi et al., Spatial role labeling: towards extraction of spatial relations from natural language, ACM Trans. Speech Lang. Process. (2011)
A. Galton, Spatial and temporal knowledge representation, J. Earth Sci Inform. (2009)
P. Kordjamshidi et al., Learning to interpret spatial natural language in terms of qualitative spatial relations
M. MacMahon, B. Stankiewicz, B. Kuipers, Walk the talk: connecting language, knowledge, and action in route...
W. Wong et al., Ontology learning from text: a look back and into the future, ACM Comput. Surv. (2012)
J. Zlatev, Holistic spatial semantics of Thai, Cognitive Linguistics and Non-Indo-European Languages (2003)...
J. Zlatev, Spatial semantics
P. Kordjamshidi, Structured machine learning for mapping natural language to spatial ontologies (2013)
I. Mani et al., Interpreting Motion: Grounded Representations for Spatial Language, Explorations in language and space (2012)
D.A. Randell, Z. Cui, A.G. Cohn, A spatial logic based on regions and connection, in: Proceedings of the 3rd...
A. Klippel et al., The endpoint hypothesis: a topological-cognitive assessment of geographic scale movement patterns
M.W. Chang et al., Structured learning with constrained conditional models, Mach. Learn. (2012)
I. Tsochantaridis et al., Large margin methods for structured and interdependent output variables, J. Mach. Learn. Res. (2006)
M. Collins, Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms
M. Collins, Parameter estimation for statistical parsing models: theory and practice of distribution-free methods

