The VLDB Journal OnlineFirst articles

28.03.2024 | Regular Paper

Hilogx: noise-aware log-based anomaly detection with human feedback

Log-based anomaly detection is essential for maintaining system reliability. Although existing log-based anomaly detection approaches perform well in certain experimental systems, they are ineffective in real-world industrial systems with noisy …

Written by:
Tong Jia, Ying Li, Yong Yang, Gang Huang

27.03.2024 | Regular Paper

MM-DIRECT: Main memory database instant recovery with tuple consistent checkpoint

Main memory databases (MMDBs) technology handles the primary database in Random Access Memory (RAM) to provide high throughput and low latency. However, volatile memory makes MMDBs much more sensitive to system failures. The contents of the …

Written by:
Arlino Magalhaes, Angelo Brayner, Jose Maria Monteiro

15.03.2024 | Regular Paper

How good are machine learning clouds? Benchmarking two snapshots over 5 years

We conduct an empirical study of machine learning functionalities provided by major cloud service providers, which we call machine learning clouds. Machine learning clouds hold the promise of hiding all the sophistication of running large-scale …

Written by:
Jiawei Jiang, Yi Wei, Yu Liu, Wentao Wu, Chuang Hu, Zhigao Zheng, Ziyi Zhang, Yingxia Shao, Ce Zhang

28.02.2024 | Regular Paper

Refiner: a reliable and efficient incentive-driven federated learning system powered by blockchain

Federated learning (FL) enables learning a model from data distributed across numerous workers while preserving data privacy. However, the classical FL technique is designed for Web2 applications where participants are trusted to produce correct …

Written by:
Hong Lin, Ke Chen, Dawei Jiang, Lidan Shou, Gang Chen

22.02.2024 | Editorial

Special issue: modern hardware

Written by:
Norman May, Spyros Blanas, Danica Porobic

20.02.2024 | Regular Paper

Ingress: an automated incremental graph processing system

The graph data keep growing over time in real life. The ever-growing amount of dynamic graph data demands efficient techniques of incremental graph computation. However, incremental graph algorithms are challenging to develop. Existing approaches …

Written by:
Shufeng Gong, Chao Tian, Qiang Yin, Zhengdong Wang, Song Yu, Yanfeng Zhang, Wenyuan Yu, Liang Geng, Chong Fu, Ge Yu, Jingren Zhou

16.02.2024 | Special Issue Paper

Speech-to-SQL: toward speech-driven SQL query generation from natural language question

Speech-based inputs have been gaining significant momentum with the popularity of smartphones and tablets in our daily lives, since voice is the most popular and efficient way for human–computer interaction. This paper works toward designing more …

Written by:
Yuanfeng Song, Raymond Chi-Wing Wong, Xuefang Zhao

15.02.2024 | Regular Paper

A new distributional treatment for time series anomaly detection

Time series is traditionally treated with two main approaches, i.e., the time domain approach and the frequency domain approach. These approaches must rely on a sliding window so that time-shift versions of a sequence can be measured to be …

Written by:
Kai Ming Ting, Zongyou Liu, Lei Gong, Hang Zhang, Ye Zhu

Open Access 13.02.2024 | Special Issue Paper

Assisted design of data science pipelines

When designing data science (DS) pipelines, end-users can get overwhelmed by the large and growing set of available data preprocessing and modeling techniques. Intelligent discovery assistants (IDAs) and automated machine learning (AutoML) …

Written by:
Sergey Redyuk, Zoi Kaoudi, Sebastian Schelter, Volker Markl

Open Access 13.02.2024 | Special Issue Paper

A learning-based framework for spatial join processing: estimation, optimization and tuning

The importance and complexity of spatial join operation resulted in the availability of many join algorithms, some of which are tailored for big-data platforms like Hadoop and Spark. The choice among them is not trivial and depends on different …

Written by:
Tin Vu, Alberto Belussi, Sara Migliorini, Ahmed Eldawy

12.02.2024 | Regular Paper

Time series data encoding in Apache IoTDB: comparative analysis and recommendation

Not only the vast applications but also the distinct features of time series data stimulate the booming growth of time series database management systems, such as Apache IoTDB, InfluxDB, OpenTSDB and so on. Almost all these systems employ columnar …

Written by:
Tianrui Xia, Jinzhao Xiao, Yuxiang Huang, Changyu Hu, Shaoxu Song, Xiangdong Huang, Jianmin Wang

25.01.2024 | Regular Paper

Sub-trajectory clustering with deep reinforcement learning

Sub-trajectory clustering is a fundamental problem in many trajectory applications. Existing approaches usually divide the clustering procedure into two phases: segmenting trajectories into sub-trajectories and then clustering these …

Written by:
Anqi Liang, Bin Yao, Bo Wang, Yinpei Liu, Zhida Chen, Jiong Xie, Feifei Li

Open Access 25.01.2024 | Regular Paper

Identifying similar-bicliques in bipartite graphs

Bipartite graphs have been widely used to model the relationship between entities of different types, where vertices are partitioned into two disjoint sets/sides. Finding dense subgraphs in a bipartite graph is of great significance and …

Written by:
Kai Yao, Lijun Chang, Jeffrey Xu Yu

Open Access 11.01.2024 | Special Issue Paper

Towards flexibility and robustness of LSM trees

Log-structured merge trees (LSM trees) are increasingly used as part of the storage engine behind several data systems, and are frequently deployed in the cloud. As the number of applications relying on LSM-based storage backends increases, the …

Written by:
Andy Huynh, Harshal A. Chaudhari, Evimaria Terzi, Manos Athanassoulis

27.12.2023 | Regular Paper

Scalable decoupling graph neural network with feature-oriented optimization

Recent advances in data processing have stimulated the demand for learning graphs of very large scales. Graph neural networks (GNNs), being an emerging and powerful approach in solving graph learning tasks, are known to be difficult to scale up.

Written by:
Ningyi Liao, Dingheng Mo, Siqiang Luo, Xiang Li, Pengcheng Yin

Open Access 27.12.2023 | Special Issue Paper

DB-BERT: making database tuning tools “read” the manual

DB-BERT is a database tuning tool that exploits information gained via natural language analysis of manuals and other relevant text documents. It uses text to identify database system parameters to tune as well as recommended parameter values.

Written by:
Immanuel Trummer

26.12.2023 | Regular Paper

Hypergraph motifs and their extensions beyond binary

Hypergraphs naturally represent group interactions, which are omnipresent in many domains: collaborations of researchers, co-purchases of items, and joint interactions of proteins, to name a few. In this work, we propose tools for answering the …

Written by:
Geon Lee, Seokbum Yoon, Jihoon Ko, Hyunju Kim, Kijung Shin

Open Access 22.12.2023 | Special Issue Paper

HPCache: memory-efficient OLAP through proportional caching revisited

Analytical engines rely on in-memory data caching to avoid storage accesses and provide timely responses by keeping the most frequently accessed data in memory. Purely frequency- and time-based caching decisions, however, are a proxy of the …

Written by:
Hamish Nicholson, Periklis Chrysogelos, Anastasia Ailamaki

Open Access 19.12.2023 | Regular Paper

A new window clause for SQL++

Window queries are important analytical tools for ordered data and have been researched both in streaming and stored data environments. By incorporating ideas for window queries from existing streaming and stored data systems, we propose a new …

Written by:
James Fang, Dmitry Lychagin, Michael J. Carey, Vassilis J. Tsotras

16.12.2023 | Regular Paper

Label-constrained shortest path query processing on road networks

Computing the shortest path between two vertices is a fundamental problem in road networks. Most of the existing works assume that the edges in the road networks have no labels, but in many real applications, the edges have labels and label …

Written by:
Junhua Zhang, Long Yuan, Wentao Li, Lu Qin, Ying Zhang, Wenjie Zhang