Transactions on Large-Scale Data- and Knowledge-Centered Systems IX

herausgegeben von: Abdelkader Hameurlain, Josef Küng, Roland Wagner

Verlag: Springer Berlin Heidelberg

Buchreihe : Lecture Notes in Computer Science

Enthalten in: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

Einloggen, um Zugang zu erhalten

Über dieses Buch

The LNCS journal Transactions on Large-Scale Data- and Knowledge-Centered Systems focuses on data management, knowledge discovery, and knowledge processing, which are core and hot topics in computer science. Since the 1990s, the Internet has become the main driving force behind application development in all domains. An increase in the demand for resource sharing across different sites connected through networks has led to an evolution of data- and knowledge-management systems from centralized systems to decentralized systems enabling large-scale distributed applications providing high scalability. Current decentralized systems still focus on data and knowledge as their main resource. Feasibility of these systems relies basically on P2P (peer-to-peer) techniques and the support of agent systems with scaling and decentralized control. Synergy between grids, P2P systems, and agent technologies is the key to data- and knowledge-centered systems in large-scale environments. This, the ninth issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains five revised selected regular papers focusing on the following topics: top-k query processing in P2P systems, self-stabilizing consensus average algorithms in distributed sensor networks, recoverable encryption schemes, xml data in a multi-system environment, and pairwise similarity for cluster ensemble problems.

Inhaltsverzeichnis

Frontmatter

As-Soon-As-Possible Top-k Query Processing in P2P Systems

Abstract

Top-k query processing techniques provide two main advantages for unstructured peer-to-peer (P2P) systems. First they avoid overwhelming users with too many results. Second they reduce significantly network resources consumption. However, existing approaches suffer from long waiting times. This is because top-k results are returned only when all queried peers have finished processing the query. As a result, query response time is dominated by the slowest queried peer. In this paper, we address this users’ waiting time problem. For this, we revisit top-k query processing in P2P systems by introducing two novel notions in addition to response time: the stabilization time and the cumulative quality gap. Using these notions, we formally define the as-soon-as-possible (ASAP) top-k processing problem. Then, we propose a family of algorithms called ASAP to deal with this problem. We validate our solution through implementation and extensive experimentation. The results show that ASAP significantly outperforms baseline algorithms by returning final top-k result to users in much better times.

William Kokou Dédzoé, Philippe Lamarre, Reza Akbarinia, Patrick Valduriez

Self-stabilizing Consensus Average Algorithm in Distributed Sensor Networks

Abstract

One important issue in sensor networks that has received renewed interest recently is average consensus, i.e., computing the average of n sensor measurements, where nodes iteratively exchange data with their neighbors and update their own data accordingly until reaching convergence to the right parameters estimate. In this paper, we introduce an efficient self-stabilizing algorithm to achieve/ensure the convergence of node states to the average of the initial measurements of the network. We prove that the convergence of the fusion process is finite and express an upper bound of the actual number of moves/iterations required by the algorithm. This means that our algorithm is guaranteed to reach a stable situation where no load will be sent from one sensor node to another. We also prove that the load difference between any two sensor nodes in the network is within \(\frac{\varepsilon}{D}\times\left\lfloor\frac{D+1}{2}\right\rfloor<\varepsilon,\) where ε is the prescribed global equilibrium threshold (this threshold is given by the system) and D is the diameter of the network.

Jacques M. Bahi, Mohammed Haddad, Mourad Hakem, Hamamache Kheddouci

Recoverable Encryption through a Noised Secret over a Large Cloud

Abstract

The safety of keys is the Achilles’ heel of cryptography. A key backup at an escrow service lowers the risk of loosing the key, but increases the danger of key disclosure. We propose Recoverable Encryption (RE) schemes that alleviate the dilemma. RE encrypts a backup of the key in a manner that restricts practical recovery by an escrow service to one using a large cloud. For example, a cloud with ten thousand nodes could recover a key in at most 10 minutes with an average recovery time of five minutes. A recovery attempt at the escrow agency, using a small cluster, would require seventy days with an average of thirty five days. Large clouds have become available even to private persons, but their pay-for-use structure makes their use for illegal purposes too dangerous. We show the feaibility of two RE schemes and give conditions for their deployment.

Sushil Jajodia, Witold Litwin, Thomas Schwarz SJ

Conservative Type Extensions for XML Data

Abstract

We introduce a method for building a minimal XML type (belonging to standard class of regular tree grammars) as an extension of other given types. Not only do we propose an easy-to-handle XML type evolution method, but we prove that this method computes the smallest extension of a given tree grammar, respecting pre-established constraints. We also adapt our technique to an interactive context, where an advised user is guided to build a new XML type from existing ones. A basic prototype of our tool is implemented.

Jacques Chabin, Mirian Halfeld Ferrari, Martin A. Musicante, Pierre Réty

Pairwise Similarity for Cluster Ensemble Problem: Link-Based and Approximate Approaches

Abstract

Cluster ensemble methods have emerged as powerful techniques, aggregating several input data clusterings to generate a single output clustering, with improved robustness and stability. In particular, link-based similarity techniques have recently been introduced with superior performance to the conventional co-association method. Their potential and applicability are, however limited due to the underlying time complexity. In light of such shortcoming, this paper presents two approximate approaches that mitigate the problem of time complexity: the approximate algorithm approach (Approximate SimRank Based Similarity matrix) and the approximate data approach (Prototype-based cluster ensemble model). The first approach involves decreasing the computational requirement of the existing link-based technique; the second reduces the size of the problem by finding a smaller, representative, approximate dataset, derived by a density-biased sampling technique. The advantages of both approximate approaches are empirically demonstrated over 22 datasets (both artificial and real data) and statistical comparisons of performance (with 95% confidence level) with three well-known validity criteria. Results obtained from these experiments suggest that approximate techniques can efficiently help scaling up the application of link-based similarity methods to wider range of data sizes.

Natthakan Iam-On, Tossapon Boongoen

Backmatter

Titel: Transactions on Large-Scale Data- and Knowledge-Centered Systems IX
herausgegeben von: Abdelkader Hameurlain
Josef Küng
Roland Wagner
Verlag: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-40069-8
Print ISBN: 978-3-642-40068-1
DOI: https://doi.org/10.1007/978-3-642-40069-8