About this book

These transactions publish research in computer-based methods of computational collective intelligence (CCI) and their applications in a wide range of fields such as the semantic Web, social networks, and multi-agent systems. TCCI strives to cover new methodological, theoretical and practical aspects of CCI understood as the form of intelligence that emerges from the collaboration and competition of many individuals (artificial and/or natural). The application of multiple computational intelligence technologies, such as fuzzy systems, evolutionary computation, neural systems, consensus theory, etc., aims to support human and other collective intelligence and to create new forms of CCI in natural and/or artificial systems. This twenty-first issue contains 7 carefully selected and revised contributions.

Table of Contents


Keyword-Based Search Over Databases: A Roadmap for a Reference Architecture Paired with an Evaluation Framework

Structured data sources promise to be the next driver of a significant socio-economic impact for both people and companies. Nevertheless, accessing them through formal languages, such as SQL or SPARQL, can become cumbersome and frustrating for end-users. To overcome this issue, keyword search in databases is becoming the technology of choice, even if it suffers from efficiency and effectiveness problems that prevent it from being adopted at Web scale.
In this paper, we motivate the need for a reference architecture for keyword search in databases to favor the development of scalable and effective components, also borrowing methods from neighboring fields, such as information retrieval and natural language processing. Moreover, we point out the need for a companion evaluation framework, able to assess the efficiency and the effectiveness of such new systems in the light of real and compelling use cases.
Sonia Bergamaschi, Nicola Ferro, Francesco Guerra, Gianmaria Silvello

Entity-Based Keyword Search in Web Documents

In document search, documents are typically seen as a flat list of keywords. To deal with the syntactic interoperability, i.e., the use of different keywords to refer to the same real world entity, entity linkage has been used to replace keywords in the text with a unique identifier of the entity to which they are referring. Yet, the flat list of entities fails to capture the actual relationships that exist among the entities, information that is significant for a more effective document search. In this work we propose to go one step beyond entity linkage in text, and model the documents as a set of structures that describe relationships among the entities mentioned in the text. We show that this kind of representation significantly improves the effectiveness of document search. We describe the details of the implementation of the above idea and we present an extensive set of experimental results that support this claim.
Enrico Sartori, Yannis Velegrakis, Francesco Guerra

Evaluation of Keyword Search in Affective Multimedia Databases

Multimedia documents such as pictures, videos, sounds and text provoke emotional responses of different intensity and polarity. These stimuli are stored in affective multimedia databases together with descriptions of their semantics based on keywords from unsupervised glossaries, expected emotion elicitation potential and other important contextual information. Affective multimedia databases are important in many different areas of research, such as affective computing, human-computer interaction and cognitive sciences, where it is necessary to deliberately modulate emotional states of individuals. However, restrictions in the employed semantic data models impair retrieval performance, thus severely limiting the databases’ overall usability. An experimental evaluation of multi-keyword search in affective multimedia databases, using lift charts as binomial classifiers optimized for retrieval precision or sensitivity, is presented. Suggestions for improving expressiveness and formality of data models are elaborated, as well as the introduction of dedicated ontologies which could lead to better data interoperability.
Marko Horvat, Marin Vuković, Željka Car

Data Driven Discovery of Attribute Dictionaries

Online product search engines such as Google and Yahoo shopping rely on having extensive and complete product information to return accurate and timely search results. Given the expanding scope of products and updates to existing products, automated techniques are needed to ensure the underlying product dictionaries remain current and complete. Product search engines receive offers from merchants describing product-specific attributes and characteristics. These offers normally contain structured attribute-value pairs and unstructured (textual) descriptions of product characteristics and features. For example, a laptop offer may contain attribute-value pairs such as “model-X42” and “RAM-8 GB”, and a text description of the software, accessories, battery features, warranty, etc. Updating the product dictionaries using the textual descriptions is a more challenging task than using the attribute-value pairs since the relevant attribute values must first be extracted. This task is difficult because the text descriptions often do not follow a predefined format, and the data in the descriptions vary across different merchants and products. However, this information needs to be captured to ensure a comprehensive and complete product listing. In this paper, we present techniques that extract attribute values from textual product descriptions. We introduce an end-to-end framework that takes an input string record and parses the tokens in the record to identify candidate attribute values. We then map these values to attributes. We take an information theoretic approach to identify groups of tokens that represent an attribute value. We demonstrate the accuracy and relevance of our approach using a variety of real data sets.
Fei Chiang, Periklis Andritsos, Renée J. Miller

Subject-Related Message Filtering in Social Media Through Context-Enriched Language Models

Efficiently retrieving and understanding messages from social media is challenging, considering that shorter messages are strongly dependent on context. Assuming that their audience is aware of background and real world events, users can shorten their messages without compromising communication. However, traditional data mining algorithms do not account for contextual information. We argue that exploiting context can lead to advancements in the analysis of social media messages. Recall rate increases if context is taken into account, leading to context-aware methods for filtering messages without resorting only to keywords. A novel approach for subject classification of social media messages, using computational linguistics techniques, is proposed, employing both textual and extra-textual (or contextual) information. Experimental analysis over sports-related messages indicates an improvement of over 50 % in retrieval rate over text-based approaches due to the use of contextual information.
Alexandre Davis, Adriano Veloso

Improving Open Information Extraction for Semantic Web Tasks

Open Information Extraction (OIE) aims to automatically identify all the possible assertions within a sentence. Results of this task are usually a set of triples (subject, predicate, object). In this paper, we first present what OIE is and how it can be improved when we work in a given domain of knowledge. Using a corpus made up of sentences in building engineering construction, we obtain an improvement of more than 18 %. Next, we show how OIE can be used as the basis of a high-level semantic web task. Here we have applied OIE to the formalisation of natural language definitions. We test this formalisation task on a corpus of sentences defining concepts found in the pizza ontology. At this stage, 70.27 % of the sentences in our 37-sentence corpus are fully rewritten in OWL DL.
Cheikh Kacfah Emani, Catarina Ferreira Da Silva, Bruno Fiès, Parisa Ghodous

Searching Web 2.0 Data Through Entity-Based Aggregation

Entity-based searching has been introduced as a way of allowing users and applications to retrieve information about a specific real world object such as a person, an event, or a location. Recent advances in crawling, information extraction, and data exchange technologies have brought a new era in data management, typically referred to through the term Web 2.0. Entity searching over Web 2.0 data facilitates the retrieval of relevant information from the plethora of data available in semantic and social web applications.
Effective entity searching over a variety of sources requires the integration of the different pieces of information that refer to the same real world entity. Entity-based aggregation of Web 2.0 data is an effective mechanism in this direction. Adopting the suggestions of the Linked Data movement, aggregators are able to efficiently match and merge the data that refer to the same real world object.
Ekaterini Ioannou, Yannis Velegrakis

