
About this Book

The LNCS journal Transactions on Large-Scale Data- and Knowledge-Centered Systems focuses on data management, knowledge discovery, and knowledge processing, which are core topics in computer science. Since the 1990s, the Internet has become the main driving force behind application development in all domains. Growing demand for resource sharing across sites connected through networks has driven the evolution of data- and knowledge-management systems from centralized to decentralized architectures that enable large-scale distributed applications with high scalability. Current decentralized systems still focus on data and knowledge as their main resource. The feasibility of these systems relies essentially on P2P (peer-to-peer) techniques and on agent systems that support scaling and decentralized control. Synergy between grids, P2P systems, and agent technologies is the key to data- and knowledge-centered systems in large-scale environments.

This, the 42nd issue of Transactions on Large-Scale Data- and Knowledge-Centered Systems, consists of five revised selected regular papers, presenting the following topics: Privacy-Preserving Top-k Query Processing in Distributed Systems; Trust Factors and Insider Threats in Permissioned Distributed Ledgers: An Analytical Study and Evaluation of Popular DLT Frameworks; Polystore and Tensor Data Model for Logical Data Independence and Impedance Mismatch in Big Data Analytics; A General Framework for Multiple Choice Question Answering Based on Mutual Information and Reinforced Co-occurrence; Rejig: A Scalable Online Algorithm for Cache Server Configuration Changes.

Table of Contents

Frontmatter

Privacy-Preserving Top-k Query Processing in Distributed Systems

Abstract
We consider a distributed system that stores user sensitive data across multiple nodes. In this context, we address the problem of privacy-preserving top-k query processing. We propose a novel system, called SD-TOPK, which is able to evaluate top-k queries over encrypted distributed data without needing to decrypt the data in the nodes where they are stored. We implemented and evaluated our system over synthetic and real databases. The results show excellent performance for SD-TOPK compared to baseline approaches.
Sakina Mahboubi, Reza Akbarinia, Patrick Valduriez
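The SD-TOPK protocol itself is not reproduced here; as a rough illustration of the problem setting only, the toy sketch below assumes a hypothetical order-preserving encryption (OPE) so that nodes can rank ciphertexts without ever decrypting them. The linear map standing in for OPE offers no real security, and the node/coordinator split is an assumption, not the paper's algorithm.

```python
# Toy illustration of top-k over "encrypted" distributed data.
# Assumption: an order-preserving encryption scheme, mocked here by a
# monotone linear map (insecure, for illustration only).
import heapq

A, B = 1000, 7  # toy OPE key

def ope_encrypt(v: int) -> int:
    return A * v + B

def ope_decrypt(c: int) -> int:
    return (c - B) // A

def node_local_topk(encrypted_scores, k):
    # Each node ranks ciphertexts directly: OPE preserves order,
    # so no decryption happens at the storage nodes.
    return heapq.nlargest(k, encrypted_scores)

def coordinator_topk(nodes, k):
    # Merge each node's local top-k candidates; still ciphertexts.
    candidates = []
    for enc_scores in nodes:
        candidates.extend(node_local_topk(enc_scores, k))
    return heapq.nlargest(k, candidates)

# Usage: three nodes hold encrypted scores; only the querier decrypts.
nodes = [[ope_encrypt(v) for v in vs]
         for vs in ([10, 42, 7], [99, 3], [55, 60, 2])]
top3 = [ope_decrypt(c) for c in coordinator_topk(nodes, 3)]
print(top3)  # [99, 60, 55]
```

Sending each node's local top-k (rather than all ciphertexts) is what keeps the candidate set small; the real system additionally avoids revealing orderings that an honest-but-curious node could exploit.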

Trust Factors and Insider Threats in Permissioned Distributed Ledgers

An Analytical Study and Evaluation of Popular DLT Frameworks
Abstract
Permissioned distributed ledgers have recently captured the attention of organizations looking to improve efficiency, transparency and auditability in value network operations. Often the technology is regarded as trustless or trust-free, resulting in a false sense of security. In this work, we review the various trust factors present in distributed ledger systems. We analyze the different groups of trust actors and their trust relationships to the software layers and inherent components of distributed ledgers. Based on these analyses, we investigate how insiders may conduct attacks based on trust in distributed ledger components. To verify practical feasibility of these attack vectors, we conduct a technical study with four popular permissioned distributed ledger frameworks: Hyperledger Fabric, Hyperledger Sawtooth, Ethereum and R3 Corda. Finally, we highlight options for mitigation of these threats.
Benedikt Putz, Günther Pernul

Polystore and Tensor Data Model for Logical Data Independence and Impedance Mismatch in Big Data Analytics

Abstract
This paper presents a Tensor based Data Model (TDM) for polystore systems meant to address two major, closely related issues in big data analytics architectures, namely logical data independence and data impedance mismatch. The TDM is an expressive model that subsumes traditional data models: it links the different data models of various data stores and facilitates data transformations by using operators with clearly defined semantics. Our contribution is twofold. Firstly, we add the notion of a schema for the tensor mathematical object using typed associative arrays. Secondly, we define a set of operators to manipulate data through the TDM. To validate our approach, we first show how the TDM fits into a given polystore architecture. We then describe use cases of real analyses using the TDM and its operators in the context of the 2017 French Presidential Election.
Éric Leclercq, Annabelle Gillet, Thierry Grison, Marinette Savonnet
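The central idea of giving a tensor a schema via typed associative arrays can be sketched in a few lines: each dimension maps typed keys (user ids, hashtags, and so on) to integer positions, so heterogeneous stores can be linked through one tensor object. The class names and the `project` operator below are illustrative assumptions, not the paper's actual API.

```python
# Hypothetical sketch: a sparse tensor whose dimensions are typed
# associative arrays (key -> position), plus one TDM-style operator.
from collections import defaultdict

class TypedDim:
    def __init__(self, name, key_type):
        self.name, self.key_type = name, key_type
        self.index = {}  # associative array: key -> position

    def __getitem__(self, key):
        assert isinstance(key, self.key_type), f"{self.name}: bad key type"
        return self.index.setdefault(key, len(self.index))

class Tensor:
    def __init__(self, *dims):
        self.dims = dims
        self.data = defaultdict(float)  # sparse storage

    def set(self, keys, value):
        # Typed keys are resolved to positions by each dimension.
        self.data[tuple(d[k] for d, k in zip(self.dims, keys))] = value

    def project(self, axis):
        # Sum out one dimension (a simple aggregation operator).
        out = Tensor(*(d for i, d in enumerate(self.dims) if i != axis))
        for pos, v in self.data.items():
            out.data[pos[:axis] + pos[axis + 1:]] += v
        return out

# Usage: user x hashtag mention counts, then aggregate over users.
users = TypedDim("user", str)
tags = TypedDim("hashtag", str)
t = Tensor(users, tags)
t.set(("alice", "#vote"), 3.0)
t.set(("bob", "#vote"), 2.0)
per_tag = t.project(0)
print(per_tag.data[(tags["#vote"],)])  # 5.0
```

The typed index is what supplies logical data independence in miniature: analyses address data by domain keys while the physical layout (here, a sparse dict) can change underneath.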

A General Framework for Multiple Choice Question Answering Based on Mutual Information and Reinforced Co-occurrence

Abstract
As a result of the continuously growing volume of available information, browsing and querying textual information in search of specific facts is currently a tedious task, exacerbated by the fact that data presentation very often does not meet the needs of users. To satisfy these ever-increasing needs, we have designed an adaptive and intelligent solution for automatically answering multiple-choice questions based on the concept of mutual information. An empirical evaluation over a number of general-purpose benchmark datasets indicates that this solution is promising.
Jorge Martinez-Gil, Bernhard Freudenthaler, A Min Tjoa
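One way to ground the mutual-information idea is to score each answer choice by its average pointwise mutual information (PMI) with the question's content words, estimated from document co-occurrence counts. The toy corpus and scoring details below are illustrative assumptions, not the paper's exact framework.

```python
# Hedged sketch: rank multiple-choice answers by average PMI between
# each choice and the question's content words, using document-level
# co-occurrence counts from a tiny illustrative corpus.
import math
from itertools import combinations
from collections import Counter

corpus = [
    "paris is the capital of france",
    "the capital of france is paris",
    "berlin is the capital of germany",
    "rome is in italy",
]

word_count = Counter()
pair_count = Counter()
for doc in corpus:
    words = set(doc.split())
    word_count.update(words)
    pair_count.update(frozenset(p) for p in combinations(sorted(words), 2))
n_docs = len(corpus)

def pmi(w1, w2):
    # log P(w1, w2) / (P(w1) P(w2)), with zero for unseen pairs.
    joint = pair_count[frozenset((w1, w2))]
    if joint == 0 or w1 == w2:
        return 0.0
    return math.log((joint * n_docs) / (word_count[w1] * word_count[w2]))

def score(question_words, choice):
    return sum(pmi(q, choice) for q in question_words) / len(question_words)

question = ["capital", "france"]
choices = ["paris", "berlin", "rome"]
best = max(choices, key=lambda c: score(question, c))
print(best)  # paris
```

"paris" co-occurs with both "capital" and "france", so its average PMI dominates; a reinforced co-occurrence scheme would additionally reward choices supported by several independent evidence passages.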

Rejig: A Scalable Online Algorithm for Cache Server Configuration Changes

Abstract
A cache server configuration describes an assignment of data fragments to cache manager instances (CMIs). A load balancer may change this assignment by migrating fragments from one CMI to another. Similarly, an auto-scaling component may change the assignment by either inserting or removing CMIs in response to load fluctuations. These changes may generate stale cache entries. Rejig is a scalable online algorithm that manages configuration changes while providing read-after-write consistency. It is novel for several reasons. First, it allows a subset of its clients and CMIs to use different configurations. Second, its client components propagate configuration changes to one another on demand and by using CMIs. This enables Rejig to scale and to support diverse application classes, including trusted mobile clients accessing the caching layer. When clients have intermittent network connectivity, Rejig detects whether their cached configurations may result in stale data and updates them to the latest configuration with no performance impact on either the CMIs or other clients. Rejig's overhead is four extra bytes of memory per cache entry and four extra bytes of network bandwidth per request from a client to a CMI.
Shahram Ghandeharizadeh, Marwan Almaymoni, Haoyu Huang
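The per-entry overhead described above suggests the core mechanism: every cache entry and every client carries a small (4-byte) configuration id, and a CMI compares ids on each request to detect stale entries and lagging clients. The sketch below is an assumption about how such a check might look; the class names, message shapes, and refresh policy are illustrative, not the published protocol.

```python
# Illustrative sketch of config-id staleness checks in the spirit of
# Rejig: entries are tagged with the 4-byte id of the configuration
# that wrote them, and lagging clients receive the latest configuration
# piggybacked on their response.
import struct

class CMI:
    def __init__(self, config_id, config):
        self.config_id, self.config = config_id, config
        self.store = {}  # key -> (packed entry config id, value)

    def put(self, key, value):
        # Tag each entry with the configuration that wrote it (4 bytes).
        self.store[key] = (struct.pack(">I", self.config_id), value)

    def get(self, key, client_config_id):
        tagged = self.store.get(key)
        if tagged is None:
            return ("miss", None, None)
        entry_cid = struct.unpack(">I", tagged[0])[0]
        if entry_cid != self.config_id:
            # Entry written under an older configuration: treat as stale.
            del self.store[key]
            return ("stale_entry", None, None)
        if client_config_id < self.config_id:
            # Client lags behind: piggyback the latest configuration.
            return ("hit", tagged[1], (self.config_id, self.config))
        return ("hit", tagged[1], None)

# Usage: a client holding configuration 6 queries a CMI on configuration 7.
cmi = CMI(config_id=7, config={"fragments": {"f1": "cmi-0"}})
cmi.put("user:42", "profile-bytes")
status, value, new_cfg = cmi.get("user:42", client_config_id=6)
print(status, value, new_cfg[0])  # hit profile-bytes 7
```

Comparing two 4-byte integers per request is why the consistency check adds essentially no latency, and piggybacking the configuration on an ordinary response is what lets intermittently connected clients catch up without extra round trips.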

Backmatter
