Skip to main content

2013 | Buch

Transactions on Large-Scale Data- and Knowledge-Centered Systems IX

herausgegeben von: Abdelkader Hameurlain, Josef Küng, Roland Wagner

Verlag: Springer Berlin Heidelberg

Buchreihe : Lecture Notes in Computer Science

insite
SUCHEN

Über dieses Buch

The LNCS journal Transactions on Large-Scale Data- and Knowledge-Centered Systems focuses on data management, knowledge discovery, and knowledge processing, which are core and hot topics in computer science. Since the 1990s, the Internet has become the main driving force behind application development in all domains. An increase in the demand for resource sharing across different sites connected through networks has led to an evolution of data- and knowledge-management systems from centralized systems to decentralized systems enabling large-scale distributed applications providing high scalability. Current decentralized systems still focus on data and knowledge as their main resource. Feasibility of these systems relies basically on P2P (peer-to-peer) techniques and the support of agent systems with scaling and decentralized control. Synergy between grids, P2P systems, and agent technologies is the key to data- and knowledge-centered systems in large-scale environments. This, the ninth issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains five revised selected regular papers focusing on the following topics: top-k query processing in P2P systems, self-stabilizing consensus average algorithms in distributed sensor networks, recoverable encryption schemes, xml data in a multi-system environment, and pairwise similarity for cluster ensemble problems.

Inhaltsverzeichnis

Frontmatter
As-Soon-As-Possible Top-k Query Processing in P2P Systems
Abstract
Top-k query processing techniques provide two main advantages for unstructured peer-to-peer (P2P) systems. First they avoid overwhelming users with too many results. Second they reduce significantly network resources consumption. However, existing approaches suffer from long waiting times. This is because top-k results are returned only when all queried peers have finished processing the query. As a result, query response time is dominated by the slowest queried peer. In this paper, we address this users’ waiting time problem. For this, we revisit top-k query processing in P2P systems by introducing two novel notions in addition to response time: the stabilization time and the cumulative quality gap. Using these notions, we formally define the as-soon-as-possible (ASAP) top-k processing problem. Then, we propose a family of algorithms called ASAP to deal with this problem. We validate our solution through implementation and extensive experimentation. The results show that ASAP significantly outperforms baseline algorithms by returning final top-k result to users in much better times.
William Kokou Dédzoé, Philippe Lamarre, Reza Akbarinia, Patrick Valduriez
Self-stabilizing Consensus Average Algorithm in Distributed Sensor Networks
Abstract
One important issue in sensor networks that has received renewed interest recently is average consensus, i.e., computing the average of n sensor measurements, where nodes iteratively exchange data with their neighbors and update their own data accordingly until reaching convergence to the right parameters estimate. In this paper, we introduce an efficient self-stabilizing algorithm to achieve/ensure the convergence of node states to the average of the initial measurements of the network. We prove that the convergence of the fusion process is finite and express an upper bound of the actual number of moves/iterations required by the algorithm. This means that our algorithm is guaranteed to reach a stable situation where no load will be sent from one sensor node to another. We also prove that the load difference between any two sensor nodes in the network is within \(\frac{\varepsilon}{D}\times\left\lfloor\frac{D+1}{2}\right\rfloor<\varepsilon,\) where ε is the prescribed global equilibrium threshold (this threshold is given by the system) and D is the diameter of the network.
Jacques M. Bahi, Mohammed Haddad, Mourad Hakem, Hamamache Kheddouci
Recoverable Encryption through a Noised Secret over a Large Cloud
Abstract
The safety of keys is the Achilles’ heel of cryptography. A key backup at an escrow service lowers the risk of loosing the key, but increases the danger of key disclosure. We propose Recoverable Encryption (RE) schemes that alleviate the dilemma. RE encrypts a backup of the key in a manner that restricts practical recovery by an escrow service to one using a large cloud. For example, a cloud with ten thousand nodes could recover a key in at most 10 minutes with an average recovery time of five minutes. A recovery attempt at the escrow agency, using a small cluster, would require seventy days with an average of thirty five days. Large clouds have become available even to private persons, but their pay-for-use structure makes their use for illegal purposes too dangerous. We show the feaibility of two RE schemes and give conditions for their deployment.
Sushil Jajodia, Witold Litwin, Thomas Schwarz SJ
Conservative Type Extensions for XML Data
Abstract
We introduce a method for building a minimal XML type (belonging to standard class of regular tree grammars) as an extension of other given types. Not only do we propose an easy-to-handle XML type evolution method, but we prove that this method computes the smallest extension of a given tree grammar, respecting pre-established constraints. We also adapt our technique to an interactive context, where an advised user is guided to build a new XML type from existing ones. A basic prototype of our tool is implemented.
Jacques Chabin, Mirian Halfeld Ferrari, Martin A. Musicante, Pierre Réty
Pairwise Similarity for Cluster Ensemble Problem: Link-Based and Approximate Approaches
Abstract
Cluster ensemble methods have emerged as powerful techniques, aggregating several input data clusterings to generate a single output clustering, with improved robustness and stability. In particular, link-based similarity techniques have recently been introduced with superior performance to the conventional co-association method. Their potential and applicability are, however limited due to the underlying time complexity. In light of such shortcoming, this paper presents two approximate approaches that mitigate the problem of time complexity: the approximate algorithm approach (Approximate SimRank Based Similarity matrix) and the approximate data approach (Prototype-based cluster ensemble model). The first approach involves decreasing the computational requirement of the existing link-based technique; the second reduces the size of the problem by finding a smaller, representative, approximate dataset, derived by a density-biased sampling technique. The advantages of both approximate approaches are empirically demonstrated over 22 datasets (both artificial and real data) and statistical comparisons of performance (with 95% confidence level) with three well-known validity criteria. Results obtained from these experiments suggest that approximate techniques can efficiently help scaling up the application of link-based similarity methods to wider range of data sizes.
Natthakan Iam-On, Tossapon Boongoen
Backmatter
Metadaten
Titel
Transactions on Large-Scale Data- and Knowledge-Centered Systems IX
herausgegeben von
Abdelkader Hameurlain
Josef Küng
Roland Wagner
Copyright-Jahr
2013
Verlag
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-40069-8
Print ISBN
978-3-642-40068-1
DOI
https://doi.org/10.1007/978-3-642-40069-8