DOI: 10.1145/3221269
SSDBM '18: Proceedings of the 30th International Conference on Scientific and Statistical Database Management
ACM 2018 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
SSDBM '18: 30th International Conference on Scientific and Statistical Database Management, Bozen-Bolzano, Italy, July 9-11, 2018
ISBN:
978-1-4503-6505-5
Published:
09 July 2018

Abstract

The International Conference on Scientific and Statistical Database Management (SSDBM) brings together scientific domain experts, database researchers, practitioners, and developers for the presentation and exchange of current research results on concepts, tools, and techniques for scientific and statistical database applications. SSDBM 2018 continues the tradition of past SSDBM conferences in providing a stimulating environment to encourage discussion, fellowship and exchange of ideas in all aspects of research related to scientific and statistical data management. The conference is hosted by the Free University of Bozen-Bolzano, Italy, from July 9-11, 2018.

research-article
Metadata-driven error detection

Scientific data often originates from multiple sources and human agents. The integration of data from different sources must also resolve data quality problems that might occur because of inconsistency or different quality assurance levels of the ...

research-article
Towards meaningful distance-preserving encryption

Mining complex data is an essential and at the same time challenging task. Therefore, organizations pass on their encrypted data to service providers carrying out such analyses. Thus, encryption must preserve the mining results. Many mining algorithms ...

research-article
SBG-sketch: a self-balanced sketch for labeled-graph stream summarization

Applications in various domains rely on processing graph streams, e.g., communication logs of a cloud-troubleshooting system, road-network traffic updates, and interactions on a social network. A labeled-graph stream refers to a sequence of streamed ...

research-article
Multidimensional range queries on modern hardware

Range queries over multidimensional data are an important part of database workloads in many applications. Their execution may be accelerated by using multidimensional index structures (MDIS), such as kd-trees or R-trees. As for most index structures, ...

research-article
Massively-parallel break detection for satellite data

The field of remote sensing is nowadays faced with huge amounts of data. While this offers a variety of exciting research opportunities, it also yields significant challenges regarding both computation time and space requirements. In practice, the sheer ...

research-article
Declarative cartography under fine-grained access control

Visualization of spatial data is of increasing importance in science and society, but opens up justified concerns about data privacy and security. A classic methodology for cartography through generalization is data selection; however, data selection ...

research-article
COMPASS: compact array storage with value index

Efficient array storage is the backbone of scientific data processing. With an explosion of data, rapidly answering queries on array data is becoming increasingly important. Although most of the array storages today support subsetting of an array based ...

research-article
TIPP: parallel Delaunay triangulation for large-scale datasets

Because of the importance of Delaunay Triangulation in science and engineering, researchers have devoted extensive attention to parallelizing this fundamental algorithm. However, generating unstructured meshes for extremely large point sets remains a ...

research-article
Learning interesting attributes for automated data categorization

This work proposes and evaluates a novel approach to determining interesting attributes, in order to categorize entities accordingly. Once identified, such categories are of immense value to allow constraining (filtering) a user's current view to ...

research-article
Numerically stable parallel computation of (co-)variance
Article No.: 10, pp 1–12, https://doi.org/10.1145/3221269.3223036

With the advent of big data, we see an increasing interest in computing correlations in huge data sets with both many instances and many variables. Essential descriptive statistics such as the variance, standard deviation, covariance, and correlation ...
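As a rough illustration of what a numerically stable, parallelizable variance computation can look like, the sketch below uses the standard Welford single-pass update per chunk and the pairwise merge formula of Chan, Golub, and LeVeque to combine chunks. It is not taken from the paper, whose abstract is truncated here; the names `Moments`, `from_chunk`, `merge`, and `sample_variance` are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable


@dataclass(frozen=True)
class Moments:
    """Partial aggregate for one chunk (hypothetical helper type)."""
    n: int = 0         # number of values seen
    mean: float = 0.0  # running mean
    m2: float = 0.0    # sum of squared deviations from the mean


def from_chunk(values: Iterable[float]) -> Moments:
    """Welford-style single pass over one chunk of data."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return Moments(n, mean, m2)


def merge(a: Moments, b: Moments) -> Moments:
    """Combine two partial aggregates without revisiting the data."""
    if a.n == 0:
        return b
    if b.n == 0:
        return a
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Moments(n, mean, m2)


def sample_variance(m: Moments) -> float:
    """Unbiased sample variance; requires at least two values."""
    return m.m2 / (m.n - 1)


# Values with a large common offset, where the naive
# E[x^2] - E[x]^2 formula tends to lose precision.
data = [1e9 + x for x in (4.0, 7.0, 13.0, 16.0)]
chunks = [data[:2], data[2:]]  # stand-ins for per-worker partitions
total = reduce(merge, (from_chunk(c) for c in chunks), Moments())
print(sample_variance(total))
```

Because `merge` is associative, the per-chunk aggregates can be computed independently and combined in any grouping, which is what makes this formulation suitable for parallel execution.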

research-article
A unified framework of density-based clustering for semi-supervised classification
Article No.: 11, pp 1–12, https://doi.org/10.1145/3221269.3223037

Semi-supervised classification is drawing increasing attention in the era of big data, as the gap between the abundance of cheap, automatically collected unlabeled data and the scarcity of labeled data that are laborious and expensive to obtain is ...

research-article
Finding shortest keyword covering routes in road networks
Article No.: 12, pp 1–12, https://doi.org/10.1145/3221269.3223038

Millions of users rely on navigation applications to compute an optimal route for their trips. The basic functionality of these applications is to find the minimum cost route between a source and target node in the transportation network. In this paper, ...

research-article
ERMrest: a web service for collaborative data management
Article No.: 13, pp 1–12, https://doi.org/10.1145/3221269.3222333

The foundation of data oriented scientific collaboration is the ability for participants to find, access and reuse data created during the course of an investigation, what has been referred to as the FAIR principles. In this paper, we describe ERMrest, ...

research-article
Publishing spatial histograms under differential privacy
Article No.: 14, pp 1–12, https://doi.org/10.1145/3221269.3223039

Studying trajectories of individuals has received growing interest. The aggregated movement behaviour of people provides important insights about their habits, interests, and lifestyles. Understanding and utilizing trajectory data is a crucial part of ...

research-article
Public Access
GeoSparkViz: a scalable geospatial data visualization framework in the Apache Spark ecosystem
Article No.: 15, pp 1–12, https://doi.org/10.1145/3221269.3223040

Data Visualization allows users to summarize, analyze and reason about data. A map visualization tool first loads the designated geospatial data, processes the data and then applies the map visualization effect. Guaranteeing detailed and accurate ...

research-article
Efficient anti-community detection in complex networks
Article No.: 16, pp 1–12, https://doi.org/10.1145/3221269.3221289

Modeling the relations between the components of complex systems as networks of vertices and edges is a commonly used method in many scientific disciplines that serves to obtain a deeper understanding of the systems themselves. In particular, the ...

research-article
Selecting representative and diverse spatio-textual posts over sliding windows
Article No.: 17, pp 1–12, https://doi.org/10.1145/3221269.3221290

Thousands of posts are generated constantly by millions of users in social media, with an increasing portion of this content being geotagged. Keeping track of the whole stream of this spatio-textual content can easily become overwhelming for the user. ...

research-article
NoSingles: a space-efficient algorithm for influence maximization
Article No.: 18, pp 1–12, https://doi.org/10.1145/3221269.3221291

Algorithmic problems of computing influence estimation and influence maximization have been actively researched for decades. We developed a novel algorithm, NoSingles, based on the Reverse Influence Sampling method proposed by Borgs et al. in 2013. ...

research-article
Order-independent constraint-based causal structure learning for Gaussian distribution models using GPUs
Article No.: 19, pp 1–10, https://doi.org/10.1145/3221269.3221292

Learning the causal structures in high-dimensional datasets allows deriving advanced insights from observational data, thus creating the potential for new applications. One crucial limitation of state-of-the-art methods for learning causal relationships, ...

research-article
Feature-based comparison and generation of time series
Article No.: 20, pp 1–12, https://doi.org/10.1145/3221269.3221293

For more than three decades, researchers have been developing generation methods for the weather, energy, and economic domains. These methods provide generated datasets for reasons like system evaluation and data availability. However, despite the ...

research-article
Public Access
Point pattern search in big data
Article No.: 21, pp 1–12, https://doi.org/10.1145/3221269.3221294

Consider a set of points P in space with at least some of the pairwise distances specified. Given this set P, consider the following three kinds of queries against a database D of points: (i) pure constellation query: find all sets S in D of size |P| ...

research-article
Distributed caching for processing raw arrays
Article No.: 22, pp 1–12, https://doi.org/10.1145/3221269.3221295

As applications continue to generate multi-dimensional data at exponentially increasing rates, fast analytics to extract meaningful results is becoming extremely important. The database community has developed array databases that alleviate this problem ...

research-article
GPU-based parallel indexing for concurrent spatial query processing
Article No.: 23, pp 1–12, https://doi.org/10.1145/3221269.3221296

In most spatial database applications, the input data is very large. Previous work has shown the importance of using spatial indexing and parallel computing to speed up such tasks. In recent years, GPUs have become a mainstream platform for massively ...

short-paper
Optimizer time estimation for SQL queries

Predicting the amount of time a SQL query takes to execute can help in prioritizing, optimizing and scheduling the query execution. This also helps in optimal utilization of hardware resources. The total execution time of a query can be split into the ...

short-paper
Scheduling data-intensive scientific workflows with reduced communication

Data-intensive scientific workflows, typically modelled by directed acyclic graphs, consist of inter-dependent tasks that exchange significant amounts of data and are executed on parallel/distributed clusters. However, the energy or monetary costs ...

short-paper
PARADISO: an interactive approach of parameter selection for the mean shift algorithm

Many algorithms have been developed for detecting clusters of various kinds over the past decades. However, only a few attempts have been made to provide an interactive setting for the clustering algorithms. In this paper, we present PARADISO, an ...

short-paper
Towards an efficient and effective framework for the evolution of scientific databases

Database systems are well suited to scientific data management and analysis workloads; however, a database must evolve to keep pace with changing requirements and adjust to changes in the domain conceptualization as applications mature. Evolving a ...

short-paper
Public Access
Maximizing area-range sum for spatial shapes (MAxRS3)

We investigate a novel variant of the well-known MaxRS (Maximizing Range Sum) problem - namely, the MAxRS3 (Maximizing Area-Range Sum for Spatial Shapes). The MaxRS problem amounts to detecting a location where a fixed-size rectangle R should be placed, ...

demonstration
PathGraph: querying and exploring big data graphs

With the widespread diffusion of social networks and the dawn of data-intensive scientific applications, graphs became one of the foundations for modern data management applications. A key role in graph querying and analysis is played by Regular Path ...

demonstration
Crossing an OCEAN of queries: analyzing SQL query logs with OCEANLog

SQL queries encapsulate the knowledge of their authors about the usage of the queried data sources. This knowledge also contains aspects that cannot be inferred by analyzing the contents of the queried data sources alone. Due to the complexity of ...

Contributors
  • Université Libre de Bruxelles
  • Free University of Bozen-Bolzano
  • University of Zurich

Acceptance Rates

SSDBM '18 Paper Acceptance Rate: 30 of 75 submissions, 40%
Overall Acceptance Rate: 56 of 146 submissions, 38%

Year       Submitted  Accepted  Rate
SSDBM '18  75         30        40%
SSDBM '14  71         26        37%
Overall    146        56        38%
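
The figures above are consistent with the Overall row being the sum of the two listed years, with each rate rounded to the nearest whole percent. The short check below reproduces that arithmetic. It assumes the overall totals cover only SSDBM '18 and SSDBM '14, which the page does not state explicitly; the variable and function names are illustrative.

```python
# (submitted, accepted) per year, copied from the table above.
years = {"SSDBM '18": (75, 30), "SSDBM '14": (71, 26)}


def rate(submitted: int, accepted: int) -> int:
    """Acceptance rate as a whole-number percentage."""
    return round(100 * accepted / submitted)


# Assumption: "Overall" aggregates exactly the years listed above.
overall_submitted = sum(s for s, _ in years.values())
overall_accepted = sum(a for _, a in years.values())

rates = {name: rate(s, a) for name, (s, a) in years.items()}
overall_rate = rate(overall_submitted, overall_accepted)

for name, pct in rates.items():
    print(f"{name}: {pct}%")
print(f"Overall: {overall_accepted} of {overall_submitted}, {overall_rate}%")
```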