About this book

This, the 40th issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains five revised selected regular papers. Topics covered include personalized social query expansion approaches, continuous query on social media streams, elastic processing systems, and semantic interoperability for smart grids and NoSQL environments.



Personalized Social Query Expansion Using Social Annotations

Query expansion is a query pre-processing technique that adds to a given query terms that are likely to occur in relevant documents, in order to improve information retrieval accuracy. A key problem is how to identify the terms to add to a query. Using social tagging systems as a data source, we propose an approach that selects terms based on (i) the semantic similarity between the tags composing a query, (ii) the social proximity between the query and the user, for a personalized expansion, and (iii) a strategy for expanding user queries on the fly. We demonstrate the effectiveness of our approach through an intensive evaluation on three large public datasets crawled from delicious, Flickr, and CiteULike. We show that the expanded queries built by our method provide more accurate results than the initial queries, increasing the mean average precision (MAP) by 10 to 16% on the three datasets. We also compare our method to three state-of-the-art baselines and show that it significantly improves the MAP, with a boost of between 5 and 18%.
Mohamed Reda Bouadjenek, Hakim Hacid, Mokrane Bouzeghoub
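
The abstract above reports its gains in terms of MAP. As a reminder of what that metric measures, here is a minimal sketch of the standard mean average precision computation. The function names and the toy document identifiers are illustrative assumptions and are not taken from the paper; the sketch also assumes each ranked list contains no duplicate documents.

```python
from typing import Hashable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average precision of one ranked result list (assumed duplicate-free)."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    runs: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Mean of the per-query average precision over (ranking, relevant set) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy data (hypothetical): two queries with their rankings and relevant sets.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
map_score = mean_average_precision(runs)
```

A query expansion method improves MAP when, averaged over queries, it moves relevant documents closer to the top of the ranking.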

A Data Services Composition Approach for Continuous Query on Social Media Streams

We are witnessing a rapid increase in the number of social media streams due to the development of Web 2.0, IoT, and cloud computing technologies. The data sources involved include both traditional relational databases and streaming data from messaging infrastructure. We would like to use multiple social media streams to answer complex queries, enabling information sharing and intelligence gathering for better collaboration. For this purpose, we adopt data services as the basic abstraction for retrieving data from both traditional relational databases and data streams. We propose a flexible continuous data service model in which a continuous query is a service operation. A service operation instance is modeled as a view defined on data streams. In the view, the data part and the time synchronization part are separated from each other. Based on the continuous data service model, we propose a continuous data service composition algorithm for answering queries across data streams and relational data. The main idea is to find a contained rewriting of the user query over the views that satisfies the containment relationship for both the data part and the time synchronization part. We also present a use case and experimental studies indicating that the approach is effective and efficient.
Guiling Wang, Xiaojiang Zuo, Marc Hesenius, Yao Xu, Yanbo Han, Volker Gruhn

DABS-Storm: A Data-Aware Approach for Elastic Stream Processing

In the last decade, stream processing has become a very active research domain, motivated by the growing number of stream-based applications. These applications make use of continuous queries, which are processed by a stream processing engine (SPE) to generate timely results from ephemeral input data. Variations in input data streams, in terms of both volume and distribution of values, have a large impact on computational resource requirements. Dynamic and Automatic Balanced Scaling for Storm (DABS-Storm) is an original solution that dynamically adapts continuous query processing to the evolution of input stream properties while keeping the system stable. DABS-Storm handles fluctuations in both data volume and the distribution of values within data streams, adjusting resource usage to best meet processing needs. To achieve this goal, the holistic DABS-Storm approach combines a proactive auto-parallelization algorithm with a latency-aware load balancing strategy.
Roland Kotto Kombi, Nicolas Lumineau, Philippe Lamarre, Nicolo Rivetti, Yann Busnel

SSG: An Ontology-Based Information Model for Smart Grids

Nowadays, an electricity blackout can have a domino effect on the overall power system, with severe economic, ecological, and operational consequences for the countries affected. This emphasizes the need for an upgraded vision of today's and tomorrow's power systems, which have to be smart to meet society's expectations. Smart grids have been emerging as an appropriate solution for such needs. This work addresses two main related challenges encountered in the management of such power systems: (1) the semantic interoperability needed between their heterogeneous components in order to ensure seamless communication and integration, and (2) a means to consider their various objectives from economic, ecological, and operational perspectives, among others. In this paper, we propose a three-layered smart grid management framework that aims to resolve these two issues. The backbone of the framework is SSG, a generic ontology-based model, detailed here. It models the smart grid components, their features, and their properties, allowing the smart grid objectives to be achieved. Several evaluations have been conducted to validate the proposed framework and to emphasize the importance and utility of SSG in the energy domain. The results obtained are satisfactory and open several promising perspectives.
Khouloud Salameh, Richard Chbeir, Haritza Camblong

Bridging the Semantic Web and NoSQL Worlds: Generic SPARQL Query Translation and Application to MongoDB

RDF-based data integration is often hampered by the lack of methods to translate data locked in heterogeneous silos into RDF representations. In this paper, we tackle the challenge of bridging the gap between the Semantic Web and NoSQL worlds by fostering the development of SPARQL interfaces to heterogeneous databases. To avoid defining yet another SPARQL translation method for each and every database, we propose a two-phase method. First, a SPARQL query is translated into a pivot abstract query. This phase achieves as much of the translation process as possible regardless of the database. We show how optimizations at this abstract level can save subsequent work at the level of a target database query language. Second, the abstract query is translated into the query language of a target database, taking into account the specific database capabilities and constraints. We demonstrate the effectiveness of our method with the MongoDB NoSQL document store, such that arbitrary MongoDB documents can be aligned with existing domain ontologies and accessed with SPARQL. Finally, we draw on a real-world use case to report experimental results on the effectiveness and performance of our approach.
Franck Michel, Catherine Faron-Zucker, Johan Montagnat
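
To make the two-phase idea in the abstract above more concrete, here is a deliberately small toy sketch. It is not the authors' translation method: the classes `TriplePattern` and `AbstractCondition`, the functions `to_abstract` and `to_mongo_filter`, and the predicate-to-field mapping are all hypothetical names introduced for illustration. Only `$exists` and `$and` are real MongoDB query operators. Phase one turns triple patterns into database-independent conditions; phase two renders those conditions as a MongoDB filter document.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class TriplePattern:
    """A simplified SPARQL triple pattern; variables start with '?'."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class AbstractCondition:
    """Database-independent condition on a data element (hypothetical pivot form)."""
    reference: str          # where the value lives, e.g. a field path
    value: Optional[str]    # None means "the element must exist"


def to_abstract(patterns: List[TriplePattern],
                mapping: Dict[str, str]) -> List[AbstractCondition]:
    """Phase 1: translate triple patterns using an assumed predicate-to-reference mapping."""
    conditions = []
    for p in patterns:
        reference = mapping[p.predicate]
        # A variable object only requires the element to exist;
        # a constant object requires equality with that constant.
        value = None if p.obj.startswith("?") else p.obj
        conditions.append(AbstractCondition(reference, value))
    return conditions


def to_mongo_filter(conditions: List[AbstractCondition]) -> Dict[str, Any]:
    """Phase 2: render abstract conditions as a MongoDB filter document."""
    clauses: List[Dict[str, Any]] = []
    for c in conditions:
        if c.value is None:
            clauses.append({c.reference: {"$exists": True}})
        else:
            clauses.append({c.reference: c.value})
    if len(clauses) == 1:
        return clauses[0]
    return {"$and": clauses}


# Hypothetical mapping from ontology predicates to document fields.
mapping = {"foaf:name": "name", "foaf:mbox": "email"}
patterns = [
    TriplePattern("?p", "foaf:name", "Alice"),
    TriplePattern("?p", "foaf:mbox", "?m"),
]
mongo_filter = to_mongo_filter(to_abstract(patterns, mapping))
```

Keeping the intermediate conditions free of any MongoDB syntax is what would let a different second phase target another database without touching the first.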

