
2009 | Book

Metadata and Semantic Research

Third International Conference, MTSR 2009, Milan, Italy, October 1-2, 2009. Proceedings

Edited by: Fabio Sartori, Miguel Ángel Sicilia, Nikos Manouselis

Publisher: Springer Berlin Heidelberg

Book series: Communications in Computer and Information Science


Table of Contents

Frontmatter

Theoretical Research: Results and Proposals

VMAP: A Dublin Core Application Profile for Musical Resources

This paper details a Dublin Core Application Profile defined for cataloguing musical resources described within the European eContentPlus project Variazioni. The metadata model is based on FRBR and has been formalised with DC-Text and implemented in an available web portal where users and music institutions can catalogue their musical assets in a collaborative way.

Carlos A. Iglesias, Mercedes Garijo, Daniel Molina, Paloma de Juan
Usage-Oriented Topic Maps Building Approach

In this paper, we present a collaborative and incremental approach to constructing multilingual Topic Maps, based on enrichment and merging techniques. In recent years, several Topic Map building approaches with different characteristics have been proposed. Generally, they are dedicated to particular data types such as text, semi-structured data, or relational data. We also note that most of these approaches take monolingual documents as input to build the Topic Map. The problem is that the large majority of resources available today are written in various languages, and these resources could be relevant even to non-native speakers. Thus, our work is directed towards a collaborative and incremental method for Topic Map construction from textual documents available in different languages. To enrich the Topic Map, we take a domain thesaurus as input and also propose to explore the Topic Map usage, that is, the available potential questions related to the source documents.

Nebrasse Ellouze, Nadira Lammari, Elisabeth Métais, Mohamed Ben Ahmed
ManagemOnt: A Semantic Approach to Software Engineering Management Process

Software engineering processes today tend to have gaps between their assets because experience in the domain is not managed, which causes organizations to fail in process improvement activities and software engineering practices in terms of time and cost. The data maintained in current software engineering process models, such as project and resource plans, documents, and metrics, is syntactic and cannot be interpreted. This lack of interpretation results in redundant data for an asset of the software engineering process. It has long been known that each asset in the software engineering domain generates an output that is an input for another asset in the domain in a logically related manner. This view of software engineering process assets leads to knowledge-based software engineering process modeling via inference and reuse of domain experience. It is proposed to model semantic software engineering processes and their assets by means of ontologies, in order to achieve inference and reuse of domain knowledge in a way that differs from the syntactic approach. To trigger semantic software engineering processes, the project planning activity is prototyped from the software engineering management process, since this activity covers almost all of the mentioned process data because of its position in software engineering processes and practices.

Baris Ulu, Banu Diri
Clarifying the Semantics of Relationships between Learning Objects

In this paper we discuss the ambiguities and deficiencies of the Learning Object Metadata (LOM) standard in specifying relationships between learning objects (LOs), especially those relationships that relate LO instances. We also study the impact of relationships on the internal organizational structure of LOs. As our main contribution, we develop a taxonomy of possible relationships between LOs, created by refining the LOM standard relationships with other meaningful relationships from a common-sense ontology.

M. Elena Rodríguez, Jordi Conesa, Miguel Ángel Sicilia
A Framework for Automatizing and Optimizing the Selection of Indexing Algorithms

Inside an information system, the indexing process facilitates the retrieval of specific contents. However, this process is known to be time- and resource-consuming. At the same time, the diversity of multimedia indexing algorithms is growing steeply, which makes it harder to select the best ones for particular user needs. In this article, we propose a generic framework that determines the most suitable indexing algorithms according to user queries, hence optimizing the indexing process. In this framework, the multimedia features are used to define multimedia metadata, user queries, and indexing algorithm descriptions. The main idea is that, apart from retrieving contents, user queries can also be used to identify a relevant set of algorithms that detect the requested features. The application of our proposed framework is illustrated through the case of an RDF-based information system. In this case, our approach could be further optimized by a broader integration of Semantic Web technologies.

Mihaela Brut, Sébastien Laborie, Ana-Maria Manzat, Florence Sèdes
Empirical Analysis of Errors on Human-Generated Learning Objects Metadata

Learning object metadata is considered crucial for the proper management of learning objects stored in public repositories. Search operations, in particular, rely on the quality of these metadata as an essential precondition for finding results adequate to users' requirements and needs. However, learning object metadata are not always reliable, as many factors have a negative influence on metadata quality (for instance, human annotators lacking the minimum skills, involuntary mistakes, or lack of information). This paper analyses human-generated learning object metadata records described according to the IEEE LOM standard, identifies the most significant errors committed, and points out which parts of the standard should be improved for the sake of quality.

Cristian Cechinel, Salvador Sánchez-Alonso, Miguel Ángel Sicilia
Analysis of Educational Metadata Supporting Complex Learning Processes

Educational metadata provide learning objects and designs with the information that is relevant to a learning situation. A learning design specifies how a learning process involves a set of people in specific groups and roles engaging in learning activities with appropriate resources and services. These elements are usually described by using structured primitives of an Educational Modeling Language. Metadata records must explicitly provide a representation of the flow of learning activities and how learning resources and services are utilized. We have analyzed a number of common workflow patterns in order to extend current Educational Modeling Languages’ primitives used in complex learning flows. The information model of the Learning Process Execution and Composition Language is used as the basis to extend structured metadata required by such learning process descriptions.

Jorge Torres, Juan Manuel Dodero
A Fine-Grained Metric System for the Completeness of Metadata

Metadata quality is an issue that can be approached from different aspects. Among the most essential properties characterizing a quality metadata record is its sufficiency to describe a resource, which is expressed as the completeness of the record. The paper presents a fine-grained metric system for measuring metadata completeness that is capable of following the hierarchy of metadata as it is set by the metadata schema and admeasuring the effect of multiple values of multi-valued fields. Moreover, it introduces the aspect of the representation level of semantically equivalent information that should be taken into account when measuring completeness. The proposed metric system, based on the definition of completeness of a field, treats several deficiencies of the traditional coarse metrics and offers the ability of targeted measures of completeness throughout the metadata hierarchy.

Thomas Margaritopoulos, Merkourios Margaritopoulos, Ioannis Mavridis, Athanasios Manitsaris
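
The idea in the abstract above, a completeness score that follows the schema hierarchy and accounts for multi-valued fields, can be illustrated with a minimal sketch. This is not the authors' metric: the weighting scheme, the `expected` value counts, the saturation rule for multi-valued fields, and all names (`completeness`, `schema`, `record`) are assumptions made for illustration only.

```python
def completeness(schema, record):
    """Return a completeness score in [0, 1] for `record` against `schema`.

    Hypothetical schema format (an assumption, not from the paper):
      composite node: {"children": {name: node, ...}}
      leaf node:      {"expected": n}   # n values count as fully complete
    Each child node additionally carries a relative "weight".
    """
    if "children" in schema:
        children = schema["children"]
        total_weight = sum(child["weight"] for child in children.values())
        if total_weight == 0:
            return 0.0
        fields = record if isinstance(record, dict) else {}
        # Weighted average of the children's completeness, applied recursively
        score = sum(
            child["weight"] * completeness(child, fields.get(name))
            for name, child in children.items()
        )
        return score / total_weight

    # Leaf: normalise to a list of non-empty values
    if record is None:
        values = []
    elif isinstance(record, list):
        values = record
    else:
        values = [record]
    values = [v for v in values if v not in (None, "")]
    expected = schema.get("expected", 1)
    # Additional values raise the score until the expected count is reached
    return min(len(values) / expected, 1.0)


# Illustrative schema and record (both invented for this example)
schema = {
    "children": {
        "title": {"weight": 2, "expected": 1},
        "keywords": {"weight": 1, "expected": 3},
    }
}
record = {"title": "X", "keywords": ["a"]}
score = completeness(schema, record)
```

Here the title is fully present and one of three expected keywords is given, so the weighted score is (2·1 + 1·(1/3)) / 3 = 7/9. A coarse metric that only counts non-empty fields would report the record as fully complete.
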
Unified Semantic Search of Data and Services

The increasing availability of data and eServices on the Web allows users to search for relevant information and to perform operations through eServices. Current technologies do not support users in the execution of such activities as a unique task; thus users have first to find interesting information, and then, as a separate activity, to find and use eServices. In this paper we present a framework able to query an integrated view of heterogeneous data and to search for eServices related to retrieved data. A unified view of data and semantically described eServices is the way in which it is possible to unify data and service perspectives.

Domenico Beneventano, Francesco Guerra, Andrea Maurino, Matteo Palmonari, Gabriella Pasi, Antonio Sala
Preliminary Explorations on the Statistical Profiles of Highly-Rated Learning Objects

As learning object repositories grow and accumulate resources and metadata, the concern for quality has increased, leading to several approaches for quality assessment. The availability of on-line evaluations in some repositories has opened the opportunity to examine the characteristics of learning objects that are evaluated positively, in search of features that can be used as a priori predictors of quality. This paper reports a preliminary exploration of some learning object attributes that can be automatically analyzed and might serve as quality metrics, using a sample from the MERLOT repository. The bookmarking of learning objects in personal collections was found to be a potential predictor of quality. Among the initial metrics considered, the number of images has been found to be also a predictor in most of the disciplines and the only candidate for the Art discipline. More attributes have to be studied across disciplines to come up with automated analysis tools that have a degree of reliability.

Elena García-Barriocanal, Miguel Ángel Sicilia

Applications: Case Studies and Proposals

A Semantic Web Based System for Context Metadata Management

With the increasing usage of embedded systems and sensors in our surroundings, a new type of information system, the context-aware system, is gaining importance. These user-centric systems acquire context information that describes the state of the user and the user environment, and offer adaptable and personalized services based on this information. The central part of a context-aware system is the context model used for describing user context information. The context information originates from a multitude of heterogeneous sources, such as personal calendars, sensors attached to the users or to the user’s environment, and Web-based sources such as social networking sites. The information from these sources is typically on different abstraction levels and is organized according to different data models. This work proposes a Semantic Web based context metadata management system. The first part of the work develops an ontology model for user context. The user context model integrates information from multiple heterogeneous sources, which are modeled largely by reusing existing, well-accepted ontologies. The second part of the work proposes a method to infer and reason about additional user context information based on the available context information, using rules and ontologies. We instantiate and evaluate the proposed system by performing a social networking case study called meetFriends. In this application, information is collected from various sources such as sensors attached to the users and public web sources such as YellowPages. The moods of the users are inferred from a set of rules. A meeting between two users can be set up based on the moods, locations and preferences of the users. The results indicate that Semantic Web technologies are well suited for integrating various data sources, processing user context information, and enabling adaptable and personalized services.

Svetlin Stefanov, Vincent Huang
An XML Pipeline Based System Architecture for Managing Bibliographic Metadata

In our knowledge-based society, bibliographic metadata is everywhere. Although several metadata standards for bibliographic information have been developed and established by the professional librarian community, home-grown ad-hoc solutions are still widespread in small to medium-sized institutions. This paper presents a framework for storing, indexing, and browsing bibliographic metadata that is designed to lower the barrier for metadata standard adoption by facilitating legacy data import and integration into existing infrastructure. These goals are achieved using XML pipelines as a central design paradigm. As a practical use case, we discuss the implementation of the described architecture at a research institute in our university, where it is now in productive use for managing publication lists and the local library.

Johannes Textor, Benjamin Feldner
DataStaR: Bridging XML and OWL in Science Metadata Management

DataStaR is a science data “staging repository” developed by Albert R. Mann Library at Cornell University that produces semantic metadata while enabling the publication of data sets and accompanying metadata to discipline-specific data centers or to Cornell’s institutional repository. DataStaR, which employs OWL and RDF in its metadata store, serves as a Web-based platform for production and management of metadata and aims to reduce redundant manual input by reusing named ontology individuals. A key requirement of DataStaR is the ability to produce metadata records conforming to existing XML schemas that have been adopted by scientific communities. To facilitate this, DataStaR integrates ontologies that directly reflect XML schemas, generates HTML editing forms, and “lowers” ontology axioms into XML documents compliant with existing schemas. This paper describes our approach and implementation, and discusses the challenges involved.

Brian Lowe
Structured Metadata for Representing and Managing Complex ‘Narrative’ Information

In this paper, we first recall the ubiquity and importance of so-called ‘non-fictional narrative’ information. We then show that the usual knowledge representation and ‘ontological’ techniques have difficulties in finding complete solutions for representing and using this type of information. We then supply some details about NKRL, a (complex metadata) representation language and a querying/inferencing environment especially created for an ‘intelligent’ exploitation of (non-fictional) narratives. The paper is illustrated with some examples concerning recent concrete applications of this environment/language.

Gian Piero Zarri
A Semantic Web Framework to Support Knowledge Management in Chronic Disease Healthcare

Improving the quality of healthcare for people with chronic conditions requires informed and knowledgeable healthcare providers and patients. Decision support and clinical information systems are two of the main components that support improving chronic care. In this paper, we describe an ongoing initiative that emphasizes the need for healthcare knowledge management to support both components. Ontology-based knowledge acquisition and modeling based on a knowledge engineering approach provides an effective mechanism for capturing expert opinion in the form of clinical practice guidelines. The Semantic Web framework is adopted in building a knowledge management platform that allows integration of the knowledge with patient databases and supported publications. We discuss one of the challenges, which is to apply the healthcare knowledge in existing healthcare provider environments by focusing on augmenting decision making and improving the quality of patient care services.

Marut Buranarach, Thepchai Supnithi, Noppadol Chalortham, Vasuthep Khunthong, Patcharee Varasai, Asanee Kawtrakul
Ontological Enrichment of the Genes-to-Systems Breast Cancer Database

Breast cancer research needs the development of specific and suitable tools to appropriately manage biomolecular knowledge. The presented work deals with the integrative storage of breast cancer related biological data, in order to promote a systems biology approach to this network disease. To increase data standardization and resource integration, annotations maintained in the Genes-to-Systems Breast Cancer (G2SBC) database are associated with ontological terms, which provide a hierarchical structure to organize data, enabling more effective queries, statistical analysis and semantic web searching. The exploited ontologies, which cover all levels of the molecular environment, from genes to systems, are among the best known and most widely used bioinformatics resources. In the G2SBC database, ontology terms both provide a semantic layer to improve data storage, accessibility and analysis and represent a user-friendly instrument to identify relations among biological components.

Federica Viti, Ettore Mosca, Ivan Merelli, Andrea Calabria, Roberta Alfieri, Luciano Milanesi
An Ontology Based Approach to Information Security

The semantic structuring of knowledge based on ontology approaches has been increasingly adopted by experts from diverse domains. Recently, ontologies have moved from the disciplines of philosophy and metaphysics to being used in the construction of models that describe a specific theory of a domain. The development and use of ontologies promote the creation of a unique standard to represent concepts within a specific knowledge domain. In the scope of information security systems, the use of an ontology to formalize and represent the concepts of security information challenges the mechanisms and techniques currently used. This paper presents a conceptual implementation model of an ontology defined in the security domain. The model presented contains the semantic concepts based on the information security standard ISO/IEC_JTC1, and their relationships to other concepts, defined in a subset of the information security domain.

Teresa Pereira, Henrique Santos
Reusability Evaluation of Learning Objects Stored in Open Repositories Based on Their Metadata

Reusability is considered to be the key property of learning objects residing in open repositories. In consequence, measurement instruments for learning object reusability should be developed. In this preliminary research we propose to evaluate the reusability of learning objects by a priori reusability analysis based on their metadata records. A set of reusability metrics extracted from metadata records is defined, and a quality assessment of the metadata application profiles defined in the repositories eLera and Merlot is presented.

Javier Sanz, Salvador Sánchez-Alonso, Juan Manuel Dodero
A Comparison of Methods and Techniques for Ontological Query Expansion

This paper presents ongoing research on the comparison of ontological query expansion methods. Query Expansion (QE) is a technique that aims to enhance the results of a search by adding terms to the search query; today, it is a very important research topic in the semantic web and information retrieval areas. Although many efforts have been made from the theoretical point of view to implement effective and general methods for expanding queries, based on both statistical and ontological approaches, the practical applicability of these methods is nowadays restricted to a few very specific domains. The aim of this paper is the definition of a platform for the implementation of a subset of such methods, in order to make comparisons among them and try to define how and when to use ontological QE. This work is part of JUMAS, a research project funded by the European Community where query expansion is used to support the retrieval of significant information from audio–video transcriptions in the legal domain.

Fabio Sartori
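
The basic mechanism that the abstract above describes, adding ontology-derived terms to a search query, can be sketched as follows. This is a toy illustration and not one of the methods compared in the paper: the dictionary-based ontology, the relation names (`synonym`, `narrower`), and the function `expand_query` are all assumptions.

```python
def expand_query(terms, ontology, relations=("synonym", "narrower")):
    """Return the original query terms followed by related ontology terms.

    `ontology` is a hypothetical mapping: term -> {relation: [related terms]}.
    Original terms come first; duplicates are skipped case-insensitively.
    """
    expanded = list(terms)
    seen = {term.lower() for term in terms}
    for term in terms:
        entry = ontology.get(term.lower(), {})
        for relation in relations:
            for related in entry.get(relation, []):
                if related.lower() not in seen:
                    seen.add(related.lower())
                    expanded.append(related)
    return expanded


# Invented mini-ontology, loosely themed on the legal domain
ontology = {
    "contract": {"synonym": ["agreement"], "narrower": ["lease"]},
}
expanded = expand_query(["contract", "court"], ontology)
# expanded == ["contract", "court", "agreement", "lease"]
```

Restricting `relations` (for example to synonyms only) is one simple way to trade recall against precision, which is the kind of choice a comparison platform would need to evaluate per domain.
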
Exploring Characterizations of Learning Object Repositories Using Data Mining Techniques

Learning object repositories provide a platform for the sharing of Web-based educational resources. As these repositories evolve independently, it is difficult for users to have a clear picture of the kind of contents they give access to. Metadata can be used to automatically extract a characterization of these resources by using machine learning techniques. This paper presents an exploratory study carried out on the contents of four public repositories that uses clustering and association rule mining algorithms to extract characterizations of repository contents. The results of the analysis include potential relationships between different attributes of learning objects that may be useful to gain an understanding of the kind of resources available and eventually develop search mechanisms that consider repository descriptions as a criterion in federated search.

Alejandra Segura, Christian Vidal, Victor Menendez, Alfredo Zapata, Manuel Prieto

Special Track: Metadata and Semantics for Agriculture, Food and Environment

Assuring the Quality of Agricultural Learning Repositories: Issues for the Learning Object Metadata Creation Process of the CGIAR

The Consultative Group on International Agricultural Research (CGIAR) has established a digital repository to share its teaching and learning resources along with descriptive educational information based on the IEEE Learning Object Metadata (LOM) standard. As a critical component of any digital repository, quality metadata are critical not only to enable users to find more easily the resources they require, but also for the operation and interoperability of the repository itself. Studies show that repositories have difficulties in obtaining good quality metadata from their contributors, especially when this process involves many different stakeholders as is the case with the CGIAR as an international organization. To address this issue the CGIAR began investigating the Open ECBCheck as well as the ISO/IEC 19796-1 standard to establish quality protocols for its training. The paper highlights the implications and challenges posed by strengthening the metadata creation workflow for disseminating learning objects of the CGIAR.

Thomas Zschocke, Jan Beniest
Ontology Design Parameters for Aligning Agri-Informatics with the Semantic Web

In recent years there have been many efforts in the development of bio-ontologies, where the applied life sciences can see the benefits reaped from, and hurdles observed with, such early-adopter efforts. With the plethora of resources, where should one start developing one’s own domain ontology, what resources are available for reuse to speed up its development, for which purposes can the ontology be developed? We group inputs that determine effectiveness of ontology development and use into four types of parameters: purpose, ontology reuse, ways of ontology learning, and the language and reasoning services. We illustrate this for the agriculture domain by building upon experiences gained in previous and current projects.

C. Maria Keet
Developing an Ontology for Improving Question Answering in the Agricultural Domain

Numerous resources have been developed to provide better access to scientific information in the agricultural domain. However, they are mostly concerned with providing general metadata of bibliographic references, which prevents users from accessing precise agricultural information in a transparent and simple manner. To overcome this drawback, in this paper we propose to use domain-specific resources to improve the answers obtained by an Open-Domain Question Answering (QA) system, thereby obtaining a QA system for the agricultural domain. Specifically, this has been done by (i) creating an ontology that covers concepts and relationships from journal publications of the agricultural domain, (ii) enriching this ontology with some public data sources (e.g., the Agrovoc thesaurus and the WordNet lexical database) so that it can be used precisely in an agricultural domain, and (iii) aligning this enriched ontology with articles from our case-study journal, i.e., the Cuban Journal of Agricultural Science. Finally, we have carried out a set of experiments in order to show the usefulness of our approach.

Katia Vila, Antonio Ferrández
A Service Architecture for Facilitated Metadata Annotation and Ressource Linkage Using agroXML and ReSTful Web Services

ReSTful web services are built by distributing state and functionalities of services across resources. In contrast to RPC services, where a single network object with an (often) large number of method invocations exists, in ReSTful services a large number of network objects are available, all with the same restricted set of method invocations. This allows for scalable and extensible services that are easily accessible using simple, standardized technology. As semantic web technologies like RDF rely on similar concepts (e.g., it is also possible to use URLs for identification), adding further layers to a service to annotate its content with metadata or to specify relationships between data becomes easy.

Daniel Martini, Mario Schmitz, Jürgen Frisch, Martin Kunisch
A Water Conservation Digital Library Using Ontologies

New technologies are emerging that assist in organizing and retrieving knowledge stored in a variety of forms (books, papers, models, decision support systems, databases), but they can only be evaluated through real world applications. Ontology has been used to manage the Water Conservation Digital Library holding a growing collection of various types of digital resources in the domain of urban water conservation in Florida, USA. The ontology based back-end powers a fully operational web interface, available at http://library.conservefloridawater.org. The system has already demonstrated numerous benefits of the ontology application, including: easier and more precise finding of resources, information sharing and reuse, and proved to effectively facilitate information management.

Lukasz Ziemba, Camilo Cornejo, Howard Beck
Evaluation of a Metadata Application Profile for Learning Resources on Organic Agriculture

Metadata specifications and standards serve as the basis for creating metadata application profiles that are particularly adapted to the needs of specific applications. The process of developing such application profiles is usually an iterative one, involving several stakeholders such as technical experts and domain experts. In this process, evaluation should have a pivotal role, by engaging methods and instruments that can ensure that the interests and needs of all stakeholders are reflected in the produced application profile. This paper presents how evaluation is dealt with in a particular case study of developing a metadata application profile for learning resources. It puts particular emphasis on the way the domain experts have evaluated the elements of the application profile, on dimensions related to their envisaged usefulness, comprehensibility, and ease of use during content annotation. The methodology followed, the pilot evaluation experiment with the domain experts, and the way the results have been incorporated in the application profile elaboration process are discussed.

Nikos Palavitsinis, Nikos Manouselis, Salvador Sanchez Alonso
Ontology for Seamless Integration of Agricultural Data and Models

This paper presents a set of ontologies developed in order to facilitate the integration of a variety of combinatorial, simulation and optimization models related to agriculture. The developed ontologies have been exploited in the software lifecycle, by using them to specify data communication across the models and with a relational database. The Seamless ontologies provide definitions for crops and crop products, agricultural feasibility filters, agricultural management, economic valuation of crop products, and agricultural and environmental policy, which are in principle the main types of data exchanged by the models. Issues related to translating data structures between model programming languages have been successfully tackled by employing annotations in the ontology.

Ioannis N. Athanasiadis, Andrea-Emilio Rizzoli, Sander Janssen, Erling Andersen, Ferdinando Villa
Assessment of Food and Nutrition Related Descriptors in Agricultural and Biomedical Thesauri

Food- and human nutrition-related subject headings or descriptors of the following thesauri-databases are assessed: NAL Thesaurus/Agricola, Agrovoc/Agris, CAB Thesaurus, FSTA Thesaurus, MeSH/Medline. Food concepts can be represented by thousands of different terms, but the subject scope of a particular term is sometimes vague. There are important differences among thesauri regarding the same or similar concepts. A term that represents a narrower or broader concept in one thesaurus can, in another, stand for a related concept or be non-existent. Sometimes there is no clear implication of differences between scientific (Latin) and common (English) names. Too many related terms can confuse end-users. Thesauri were initially employed mostly by information professionals but can now be used directly by users who may be unaware of the differences. Thesauri are assuming new roles in the classification of information as metadata. Further development towards ontologies must pay constant attention to taxonomic problems of knowledge representation.

Tomaz Bartol
Networked Ontologies from the Fisheries Domain

In this paper we report on ongoing work concerning the creation of a network of ontologies based on metadata for time series relative to the domain of fisheries, and hint at the possibility of exploiting the network for web service applications. The results obtained so far show that the reengineering of classification systems stored as relational databases is possible, although some technical problems are still to be addressed.

Caterina Caracciolo, Juan Heguiabehere, Margherita Sini, Johannes Keizer
Improving Information Exchange in the Chicken Processing Sector Using Standardised Data Lists

Research has shown that to improve electronic communication between companies, universal standardised data lists are necessary. In food supply chains in particular there is an increased need to exchange data in the wake of food safety incidents. Food supply chain companies already record numerous measurements, properties and parameters. These records are necessary for legal reasons, labelling, traceability, profiling desirable characteristics, showing compliance and for meeting customer requirements. Universal standards for name and content of each of these data elements would improve information exchange between buyers, sellers, authorities, consumers and other interested parties. A case study, carried out for the chicken sector, attempted to identify the most relevant parameters including which of these were already communicated to external bodies.

Kathryn Anne-Marie Donnelly, Joop van der Roest, Stefán Torfi Höskuldsson, Petter Olsen, Kine Mari Karlsen
Navigation as a New Form of Search for Agricultural Learning Resources in Semantic Repositories

Education is essential when it comes to raising public awareness of the environmental and economic benefits of organic agriculture and agroecology (OA & AE). Organic.Edunet, an EU-funded project, aims at providing a freely available portal where learning contents on OA & AE can be published and accessed through specialized technologies. This paper describes a novel mechanism for providing semantic capabilities (such as semantic navigational queries) to an arbitrary set of agricultural learning resources, in the context of the Organic.Edunet initiative.

Ramiro Cano, Alberto Abián, Elena Mena
Backmatter
Metadata
Title
Metadata and Semantic Research
Edited by
Fabio Sartori
Miguel Ángel Sicilia
Nikos Manouselis
Copyright Year
2009
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-04590-5
Print ISBN
978-3-642-04589-9
DOI
https://doi.org/10.1007/978-3-642-04590-5