
2015 | Book

Transactions on Large-Scale Data- and Knowledge-Centered Systems XXII

Edited by: Abdelkader Hameurlain, Josef Küng, Roland Wagner

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

The LNCS journal Transactions on Large-Scale Data- and Knowledge-Centered Systems focuses on data management, knowledge discovery, and knowledge processing, which are core and hot topics in computer science. Since the 1990s, the Internet has become the main driving force behind application development in all domains. An increase in the demand for resource sharing across different sites connected through networks has led to an evolution of data- and knowledge-management systems from centralized systems to decentralized systems enabling large-scale distributed applications providing high scalability. Current decentralized systems still focus on data and knowledge as their main resource. Feasibility of these systems relies basically on P2P (peer-to-peer) techniques and the support of agent systems with scaling and decentralized control. Synergy between grids, P2P systems, and agent technologies is the key to data- and knowledge-centered systems in large-scale environments.

This, the 22nd issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains six revised selected regular papers. Topics covered include algorithms for large-scale private analysis, modelling of entities from social and digital worlds and their relations, querying virtual security views of XML data, recommendation approaches using diversity-based clustering scores, hypothesis discovery, and data aggregation techniques in sensor network environments.

Table of contents

Frontmatter
BPMiner: Algorithms for Large-Scale Private Analysis
Abstract
An abundance of data generated from a multitude of sources, and the intelligence derived by analyzing it, have become an important asset across many walks of life. At the same time, this raises serious concerns about privacy. Differential privacy has become a popular way to reason about how much information about individual entries of a dataset is divulged when a perturbed result is released for a query on that dataset. However, current differentially private algorithms are computationally inefficient and do not explicitly exploit the abundance of data, thus wearing out the privacy budget irrespective of the volume of data. In this paper, we propose BPMiner, a solution that is both private and accurate, while simultaneously addressing the computation and budget challenges of very big datasets. The main idea is a non-trivial combination of differential privacy, sample-and-aggregation, and a classical statistical methodology called sequential estimation. Rigorous proofs of the privacy and asymptotic accuracy of our solution are provided. Furthermore, experimental results over multiple datasets demonstrate that BPMiner outperforms current private algorithms in terms of computational and budget efficiency, while achieving comparable accuracy. Overall, BPMiner is a practical solution based on strong theoretical foundations for privacy-preserving analysis on big datasets.
Quach Vinh Thanh, Anwitaman Datta
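The abstract above names differential privacy and sample-and-aggregation as two of the ingredients. The following is a minimal sketch of that general idea for a bounded mean, not BPMiner itself: the function names, the clipping bounds, and the choice of averaging as the aggregator are assumptions made for illustration, and the sequential-estimation step is not shown.

```python
import random


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, num_blocks, rng):
    """Hypothetical sample-and-aggregate estimate of a bounded mean.

    Assumes upper > lower, epsilon > 0 and len(values) >= num_blocks.
    """
    # Clip so that every block mean lies in [lower, upper].
    clipped = [min(max(v, lower), upper) for v in values]
    rng.shuffle(clipped)
    # Partition into disjoint blocks; one record affects exactly one block.
    blocks = [clipped[i::num_blocks] for i in range(num_blocks)]
    block_means = [sum(b) / len(b) for b in blocks]
    aggregate = sum(block_means) / num_blocks
    # Changing one record moves one block mean by at most (upper - lower),
    # so the average of block means moves by at most this amount.
    sensitivity = (upper - lower) / num_blocks
    return aggregate + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    data = [i % 10 for i in range(1000)]
    print(private_mean(data, 0, 9, epsilon=0.5, num_blocks=10, rng=rng))
```

In this sketch, more blocks lower the noise scale but leave fewer records per block, which is the kind of trade-off a method designed for very large datasets can exploit.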
System Modeling and Trust Evaluation of Distributed Systems
Abstract
Nowadays, digital systems are connected through complex architectures. These systems involve persons as well as physical and digital resources, so a system can be seen as consisting of elements from two worlds, the social world and the digital world, and their relations. Users perform activities like chatting, buying, sharing data, etc. Evaluating and choosing appropriate systems involves aspects like functionality, performance, QoS, ease of use, or price. Recently, trust has emerged as another key factor in such an evaluation. In this context, we raise two issues: (i) how to formalize the entities that compose a system and their relations for a particular activity, and (ii) how to evaluate trust in a system for this activity. This work proposes answers to both questions. On the one hand, we propose SocioPath, a metamodel based on first-order logic that allows a system to be modeled in terms of entities of the social and digital worlds and their relations. On the other hand, we propose two approaches to evaluate trust in systems, namely SocioTrust and SubjectiveTrust. The former is based on probability theory to evaluate users’ trust in systems for a given activity. The latter is based on subjective logic to take into account uncertainty in trust values.
Nagham Alhadad, Patricia Serrano-Alvarado, Yann Busnel, Philippe Lamarre
Efficient Querying of XML Data Through Arbitrary Security Views
Abstract
We study the problem of querying virtual security views of XML data, which has received great attention in recent years. A major concern here is that user XPath queries posed on recursive views cannot be rewritten to be evaluated on the underlying XML data. Existing rewriting solutions are based on the non-standard language “Regular XPath”, which makes rewriting possible under recursion. However, query rewriting under Regular XPath can be of exponential size. We show that query rewriting is always possible for arbitrary security views (recursive or not) by using only the expressive power of standard XPath. We propose a more expressive language to specify XML access control policies as well as an efficient algorithm to enforce such policies. Finally, we present our system, called SVMAX, which implements our solutions, and we show that it scales well through an extensive experimental study based on real-life DTDs.
Houari Mahfoud, Abdessamad Imine
Increasing Coverage in Distributed Search and Recommendation with Profile Diversity
Abstract
With the advent of Web 2.0, users are producing ever larger amounts of diverse data, which are stored in a large variety of systems. Since the users’ data spaces are scattered among those independent systems, data sharing becomes a challenging problem. Distributed search and recommendation provides a general solution for data sharing and, among its various alternatives, gossip-based approaches are particularly interesting as they provide scalability, dynamicity, autonomy and decentralized control. Generally, in these approaches each participant maintains a cluster of “relevant” users, which are later employed in query processing. However, as we show in the paper, only considering relevance in the construction of the cluster introduces a significant amount of redundancy among users, which in turn leads to reduced recall. Indeed, when a query is submitted, due to the high similarity among the users in a cluster, the probability of retrieving the same set of relevant items increases, thus limiting the number of distinct results that can be obtained. In this paper, we propose a gossip-based search and recommendation approach that is based on diversity-based clustering scores. We present the resultant new gossip-based clustering algorithms and validate them through experimental evaluation over four real datasets, based on MovieLens-small, MovieLens, LastFM and Delicious. Compared with state-of-the-art solutions, we show that taking diversity-based clustering scores into account yields major gains in recall while reducing the number of users involved during query processing.
Maximilien Servajean, Esther Pacitti, Miguel Liroz-Gistau, Sihem Amer-Yahia, Amr El Abbadi
Hypothesis Discovery Exploiting Closed Chains of Relations
Abstract
The ever-growing literature in biomedicine makes it virtually impossible for individuals to grasp all the information relevant to their interests. Since even experts’ knowledge is limited, important associations among key biomedical concepts may remain unnoticed in the flood of information. Discovering those hidden associations is called hypothesis discovery or literature-based discovery. This paper proposes an approach to this problem that takes advantage of a closed, triangular chain of relations extracted from the existing literature. We consider such chains of relations as implicit rules to generate explanatory hypotheses. The hypotheses generated from the implicit rules are then compared with newer knowledge to assess their validity and, if validated, they serve as positive examples for learning a regression model to rank hypotheses. As a proof of concept, the proposed framework is empirically evaluated on real-world knowledge extracted from the biomedical literature. The results demonstrate that the framework is able to produce legitimate hypotheses and that the proposed ranking approach is more effective than previous work.
Kazuhiro Seki
An Analysis of Variance-Based Methods for Data Aggregation in Periodic Sensor Networks
Abstract
Given the vast area to be covered and the random deployment of the sensors, wireless sensor networks (WSNs) require scalable architecture and management strategies. In addition, sensors are usually powered by small batteries which are not always practical to recharge or replace. Hence, designing an efficient architecture and data management strategy for the sensor network is important to extend its lifetime. In this paper, we propose an energy-efficient two-level data aggregation technique based on a clustering architecture in which data is sent periodically from nodes to their appropriate Cluster-Heads (CHs). The first level of data aggregation is applied at the node itself to eliminate redundancy from the collected raw data, while at the second level the CH searches for nodes that generate redundant data sets, based on a variance study with three different ANOVA tests. Our proposed approach is validated via experiments on real sensor data and comparison with other existing data aggregation techniques.
Hassan Harb, Abdallah Makhoul, David Laiymani, Oussama Bazzi, Ali Jaber
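The abstract above describes a Cluster-Head using variance-based (ANOVA) tests to find nodes whose data sets are redundant. As a rough illustration of the underlying statistic only, the sketch below computes a classical one-way ANOVA F value for the readings of several nodes. The function names and the caller-supplied threshold are assumptions for illustration, and the three specific tests used in the paper are not reproduced here.

```python
import math
from statistics import fmean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of reading lists.

    Assumes at least two groups and more readings in total than groups.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [fmean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation between the group means, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the readings around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    if ms_within == 0:
        return math.inf if ms_between > 0 else 0.0
    return ms_between / ms_within


def looks_redundant(groups, f_threshold):
    """Hypothetical rule: a small F suggests the nodes report similar data."""
    return one_way_anova_f(groups) <= f_threshold


if __name__ == "__main__":
    node_readings = [[20.1, 20.3, 20.2], [20.2, 20.1, 20.3], [25.0, 25.2, 25.1]]
    print(one_way_anova_f(node_readings))
```

In practice the threshold would come from an F distribution table for the chosen significance level and degrees of freedom; here it is left to the caller.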
Backmatter
Metadata
Title
Transactions on Large-Scale Data- and Knowledge-Centered Systems XXII
Edited by
Abdelkader Hameurlain
Josef Küng
Roland Wagner
Copyright year
2015
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-48567-5
Print ISBN
978-3-662-48566-8
DOI
https://doi.org/10.1007/978-3-662-48567-5
