scroll identifier for mobile
main-content

## Über dieses Buch

The LNCS journal Transactions on Large-Scale Data- and Knowledge-Centered Systems focuses on data management, knowledge discovery, and knowledge processing, which are core and hot topics in computer science. Since the 1990s, the Internet has become the main driving force behind application development in all domains. An increase in the demand for resource sharing across different sites connected through networks has led to an evolution of data- and knowledge-management systems from centralized systems to decentralized systems enabling large-scale distributed applications providing high scalability. Current decentralized systems still focus on data and knowledge as their main resource. Feasibility of these systems relies basically on P2P (peer-to-peer) techniques and the support of agent systems with scaling and decentralized control. Synergy between grids, P2P systems, and agent technologies is the key to data- and knowledge-centered systems in large-scale environments. This, the ninth issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, contains five revised selected regular papers focusing on the following topics: top-k query processing in P2P systems, self-stabilizing consensus average algorithms in distributed sensor networks, recoverable encryption schemes, xml data in a multi-system environment, and pairwise similarity for cluster ensemble problems.

## Inhaltsverzeichnis

### As-Soon-As-Possible Top-k Query Processing in P2P Systems

Top-k query processing techniques provide two main advantages for unstructured peer-to-peer (P2P) systems. First they avoid overwhelming users with too many results. Second they reduce significantly network resources consumption. However, existing approaches suffer from long waiting times. This is because top-k results are returned only when all queried peers have finished processing the query. As a result, query response time is dominated by the slowest queried peer. In this paper, we address this users’ waiting time problem. For this, we revisit top-k query processing in P2P systems by introducing two novel notions in addition to response time: the stabilization time and the cumulative quality gap. Using these notions, we formally define the as-soon-as-possible (ASAP) top-k processing problem. Then, we propose a family of algorithms called ASAP to deal with this problem. We validate our solution through implementation and extensive experimentation. The results show that ASAP significantly outperforms baseline algorithms by returning final top-k result to users in much better times.
William Kokou Dédzoé, Philippe Lamarre, Reza Akbarinia, Patrick Valduriez

### Self-stabilizing Consensus Average Algorithm in Distributed Sensor Networks

One important issue in sensor networks that has received renewed interest recently is average consensus, i.e., computing the average of n sensor measurements, where nodes iteratively exchange data with their neighbors and update their own data accordingly until reaching convergence to the right parameters estimate. In this paper, we introduce an efficient self-stabilizing algorithm to achieve/ensure the convergence of node states to the average of the initial measurements of the network. We prove that the convergence of the fusion process is finite and express an upper bound of the actual number of moves/iterations required by the algorithm. This means that our algorithm is guaranteed to reach a stable situation where no load will be sent from one sensor node to another. We also prove that the load difference between any two sensor nodes in the network is within $$\frac{\varepsilon}{D}\times\left\lfloor\frac{D+1}{2}\right\rfloor<\varepsilon,$$ where ε is the prescribed global equilibrium threshold (this threshold is given by the system) and D is the diameter of the network.

### Recoverable Encryption through a Noised Secret over a Large Cloud

The safety of keys is the Achilles’ heel of cryptography. A key backup at an escrow service lowers the risk of loosing the key, but increases the danger of key disclosure. We propose Recoverable Encryption (RE) schemes that alleviate the dilemma. RE encrypts a backup of the key in a manner that restricts practical recovery by an escrow service to one using a large cloud. For example, a cloud with ten thousand nodes could recover a key in at most 10 minutes with an average recovery time of five minutes. A recovery attempt at the escrow agency, using a small cluster, would require seventy days with an average of thirty five days. Large clouds have become available even to private persons, but their pay-for-use structure makes their use for illegal purposes too dangerous. We show the feaibility of two RE schemes and give conditions for their deployment.
Sushil Jajodia, Witold Litwin, Thomas Schwarz SJ

### Conservative Type Extensions for XML Data

We introduce a method for building a minimal XML type (belonging to standard class of regular tree grammars) as an extension of other given types. Not only do we propose an easy-to-handle XML type evolution method, but we prove that this method computes the smallest extension of a given tree grammar, respecting pre-established constraints. We also adapt our technique to an interactive context, where an advised user is guided to build a new XML type from existing ones. A basic prototype of our tool is implemented.
Jacques Chabin, Mirian Halfeld Ferrari, Martin A. Musicante, Pierre Réty

### Pairwise Similarity for Cluster Ensemble Problem: Link-Based and Approximate Approaches

Cluster ensemble methods have emerged as powerful techniques, aggregating several input data clusterings to generate a single output clustering, with improved robustness and stability. In particular, link-based similarity techniques have recently been introduced with superior performance to the conventional co-association method. Their potential and applicability are, however limited due to the underlying time complexity. In light of such shortcoming, this paper presents two approximate approaches that mitigate the problem of time complexity: the approximate algorithm approach (Approximate SimRank Based Similarity matrix) and the approximate data approach (Prototype-based cluster ensemble model). The first approach involves decreasing the computational requirement of the existing link-based technique; the second reduces the size of the problem by finding a smaller, representative, approximate dataset, derived by a density-biased sampling technique. The advantages of both approximate approaches are empirically demonstrated over 22 datasets (both artificial and real data) and statistical comparisons of performance (with 95% confidence level) with three well-known validity criteria. Results obtained from these experiments suggest that approximate techniques can efficiently help scaling up the application of link-based similarity methods to wider range of data sizes.
Natthakan Iam-On, Tossapon Boongoen

### Backmatter

Weitere Informationen

## BranchenIndex Online

Die B2B-Firmensuche für Industrie und Wirtschaft: Kostenfrei in Firmenprofilen nach Lieferanten, Herstellern, Dienstleistern und Händlern recherchieren.

## Whitepaper

- ANZEIGE -

### Best Practices für die Mitarbeiter-Partizipation in der Produktentwicklung

Unternehmen haben das Innovationspotenzial der eigenen Mitarbeiter auch außerhalb der F&E-Abteilung erkannt. Viele Initiativen zur Partizipation scheitern in der Praxis jedoch häufig. Lesen Sie hier  - basierend auf einer qualitativ-explorativen Expertenstudie - mehr über die wesentlichen Problemfelder der mitarbeiterzentrierten Produktentwicklung und profitieren Sie von konkreten Handlungsempfehlungen aus der Praxis.