
2022 | Book

Algorithms and Architectures for Parallel Processing

21st International Conference, ICA3PP 2021, Virtual Event, December 3–5, 2021, Proceedings, Part III


About this book

The three-volume set LNCS 13155, 13156, and 13157 constitutes the refereed proceedings of the 21st International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2021, which was held online during December 3–5, 2021.

The 145 full papers included in these proceedings were carefully reviewed and selected from 403 submissions. They cover the many dimensions of parallel algorithms and architectures, including fundamental theoretical approaches, practical experimental projects, and commercial components and systems.

The papers were organized in topical sections as follows:

Part I, LNCS 13155: Deep learning models and applications; software systems and efficient algorithms; edge computing and edge intelligence; service dependability and security algorithms; data science;

Part II, LNCS 13156: Software systems and efficient algorithms; parallel and distributed algorithms and applications; data science; edge computing and edge intelligence; blockchain systems; deep learning models and applications; IoT;

Part III, LNCS 13157: Blockchain systems; data science; distributed and network-based computing; edge computing and edge intelligence; service dependability and security algorithms; software systems and efficient algorithms.

Table of Contents

Frontmatter

Blockchain Systems

Frontmatter
StateSnap: A State-Aware P2P Storage Network for Blockchain NFT Content Data

Non-Fungible Tokens (NFTs) have gained worldwide attention as a solution for describing complex digital assets, owing to their superior data representation. However, the scale of NFT content data is so large that a P2P storage network has to be used to store it. Due to the lack of global resource-allocation management, retrieving content in existing P2P storage networks relies heavily on distributed collaboration and cannot support low-latency off-line retrieval. Besides, if an unexpected event such as a fork occurs on the blockchain or the storage network, the on-chain and off-chain data states may become inconsistent. Therefore, we propose the concept of State for P2P storage networks to represent the resource allocation of data and to assist in the off-line retrieval of resources. We propose a state-aware model based on a permissioned blockchain and a P2P storage network, whose core is a data structure called StateSnap. The model supports verifying the consistency of on-chain and off-chain data, as well as State rollback and switching. Through experiments, we show that our model reduces data retrieval time by about 78% compared to traditional IPFS and performs well in terms of scalability, robustness, and State-switch efficiency.

Siqi Feng, Wenquan Li, Lanju Kong, Lijin Liu, Fuqi Jin, Xinping Min
Towards Requester-Provider Bilateral Utility Maximization and Collision Resistance in Blockchain-Based Microgrid Energy Trading

The microgrid is a promising system for coordinating distributed energy in the future Energy Internet, where the energy market is crucial for facilitating multi-directional trading. Nevertheless, traditional solutions for energy trading usually rely on a centralized framework, which suffers from high operation cost and low transparency. To address this problem, blockchain strategies have been widely deployed in microgrid energy trading. However, existing blockchain-based systems have not formally considered: (i) collusion attacks launched by a pair of dishonest consumer and microgrid; and (ii) the demand for maximizing bilateral utility. Therefore, in this work, we propose a blockchain-based microgrid energy trading system that allows a consumer to dynamically change its service-quality quotes for a microgrid. In particular, the bilateral utility of requester and provider is nearly maximized by employing a Bayesian Nash equilibrium strategy to model the interaction between consumers and microgrids. Moreover, collusion attacks by a dishonest consumer-microgrid pair during the bidding process are prevented. To achieve this, we employ a request-based comparable encryption technique to compare two microgrids' revenues in a privacy-preserving way, in which each microgrid submits an encrypted expected revenue. In addition, a general security analysis of our system is formally discussed. To illustrate its effectiveness, we build the experiment platform on a blockchain-based framework, where the structure of nodes and the transaction process are well formalized.

Hailun Wang, Kai Zhang, Lifei Wei, Lei Zhang
Evaluating the Parallel Execution Schemes of Smart Contract Transactions in Different Blockchains: An Empirical Study

In order to increase throughput, more and more blockchains have begun to provide the ability to execute smart contract transactions in parallel. However, there is currently no research on evaluating the parallel execution schemes of different blockchains, which makes it difficult for developers to find a blockchain technology suitable for their application scenarios. In this paper, we first summarize existing parallel execution schemes of smart contract transactions. Then, based on their characteristics, we propose a comprehensive evaluation framework for parallel execution schemes of smart contract transactions and implement a benchmark tool, which can compare different parallel execution schemes and discover their limitations. Finally, we use the evaluation framework and the tool to evaluate several prominent blockchains in China and obtain some useful findings.

Jianfeng Shi, Chengzhi Li, Heng Wu, Heran Gao, Songchang Jin, Tao Huang, Wenbo Zhang
Misbehavior Detection in VANET Based on Federated Learning and Blockchain

As an irreversible trend, connected vehicles are becoming increasingly popular. They depend on the generation and sharing of data between vehicles to improve the safety and efficiency of the transportation system. However, due to the open nature of the vehicular network, dishonest and misbehaving vehicles may exist in it. Misbehavior detection using machine learning has been studied in recent years. Existing misbehavior detection approaches require network equipment with powerful computing capabilities to constantly train and update sophisticated network models, which reduces the efficiency of the misbehavior detection system due to limited resources and untimely model updates. In this paper, we propose a new federated learning scheme based on blockchain, which can reduce resource utilization while ensuring data security and privacy. Further, we design a blockchain-based reward mechanism for participants by automatically executing smart contracts. Common data falsification attacks are studied in this paper, and the experimental results show that our proposed scheme is feasible and effective.

Pin Lv, Linyan Xie, Jia Xu, Taoshen Li
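The aggregation step underlying most federated learning schemes, including blockchain-coordinated ones like the above, is a FedAvg-style weighted average of client updates. The sketch below shows only that generic step, not the paper's blockchain or reward logic:

```python
# Minimal FedAvg-style aggregation sketch: average client model updates,
# weighted by each client's local sample count. This is the generic FL
# aggregation primitive, not the paper's specific scheme.

def fed_avg(updates, sizes):
    """updates: one parameter vector (list of floats) per client;
    sizes: number of local training samples per client."""
    total = sum(sizes)
    dim = len(updates[0])
    return [
        sum(u[i] * n for u, n in zip(updates, sizes)) / total
        for i in range(dim)
    ]
```

With equal sample counts this reduces to a plain average; clients with more data pull the global model proportionally toward their update.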

Data Science

Frontmatter
ABE-AC4DDS: An Access Control Scheme Based on Attribute-Based Encryption for Data Distribution Service

In response to the security threats faced by distributed real-time applications based on DDS, a fine-grained data access control scheme is proposed, which is based on attribute-based encryption and suited to the topic-based publish/subscribe communication model. The scheme takes the topic as the unit of data access control and integrates the access control process with the DDS communication process: in the discovery phase of DDS, a digital signature is used to verify the publication permission for a topic, and in the publish/subscribe phase, CP-ABE is used to verify the subscription permission for a topic. The scheme ensures not only the privacy of users but also the confidentiality and authenticity of data. Theoretical analysis shows that the scheme can resist security threats such as unauthorized publication and unauthorized subscription. Moreover, performance tests of the prototype system show that it matches the loose coupling and one-to-many characteristics of the publish/subscribe communication model and has good scalability in multi-subscriber scenarios when key parameters are adjusted.

Peng Gao, Zhuowei Shen
An Interactive Visual System for Data Analytics of Social Media

With the development of the Internet, Big Data analysis of social media has become a hot research topic in recent years. However, a challenging problem in social media analysis is how to intelligently detect event information from massive media data and how to help users quickly understand event content. Therefore, we propose a method to extract important time periods of events from media data to analyze the media event information, and develop an interactive visual system based on the “5W” principle. The system provides an interactive analysis platform for exploring media Big Data (e.g., Twitter data). The system uses a dynamic topic model to extract topics, a Naive Bayesian classifier to distinguish emotions, and explores events based on the evolution of time, topics, and emotions. In terms of visualization, the system allows users to add some annotations based on segmentation of complex information and highlight important events by adding different story cards to the timeline so that users can get a quick overview of events. Users can also interactively modify the story cards during the exploration process. Finally, several case studies show that our system can effectively reduce the time required to understand social media data and allow users to quickly explore the full picture of an event through an interactive approach.

Yi Zhang, Zhaohui Li, Dewei Xi
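The emotion-classification step the system describes is a Naive Bayes classifier. A toy multinomial Naive Bayes over whitespace-tokenized text, with Laplace smoothing, can be sketched as follows (the dynamic topic model and the actual Twitter pipeline are out of scope; the training corpus is illustrative):

```python
import math
from collections import Counter

# Toy multinomial Naive Bayes sentiment classifier with add-one smoothing.
# Sketches the emotion-classification step only; vocabulary and corpus
# are illustrative.

class NaiveBayesSentiment:
    def fit(self, docs, labels):
        self.classes = set(labels)
        n = len(labels)
        self.prior = {c: math.log(labels.count(c) / n) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, c in zip(docs, labels):
            self.counts[c].update(doc.split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        V = len(self.vocab)
        def log_posterior(c):
            return self.prior[c] + sum(
                math.log((self.counts[c][w] + 1) / (self.totals[c] + V))
                for w in doc.split() if w in self.vocab)
        return max(self.classes, key=log_posterior)
```

Out-of-vocabulary words are simply skipped, a common simplification for short social media texts.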
Auto-Recon: An Automated Network Reconnaissance System Based on Knowledge Graph

Effective network security management is usually based on comprehensive, accurate, and real-time knowledge of the enterprise network. With the rapid growth of enterprise network information, its exposure on the Internet is also expanding rapidly, which raises many security issues. Existing approaches to network security management can hardly cover dynamic and hidden assets. Moreover, black-box network reconnaissance is highly dependent on expert experience and time-consuming, which cannot meet enterprises' need to perform testing periodically. Therefore, target-oriented automated reconnaissance of network information becomes an urgent problem to be solved. We propose a knowledge-graph-based network reconnaissance model, NRG, built from a method-level perspective, which describes the relationships between different pieces of network information and the ways to discover them. Based on NRG, we have designed and implemented Auto-Recon, an automated network reconnaissance system with a distributed architecture. The purpose of Auto-Recon is to automatically find the exposed surfaces of targets on the Internet, using the primary domain as the initial information. The system reduces the strong dependence on network reconnaissance knowledge and experience. We conducted experiments, and the results show that Auto-Recon outperforms existing tools in terms of efficiency, effectiveness, and automation.

Xiang Zhang, Qingli Guo, Yuan Liu, Xiaorui Gong
INGCF: An Improved Recommendation Algorithm Based on NGCF

Strengthening the representation and learning of user and item vectors is key to recommendation systems. Neural Graph Collaborative Filtering (NGCF) suffers from insufficient feature extraction of user and item vectors. In addition, it uses a linear combination of the final embedding vectors of user and item to compute the inner product, which makes it difficult to accurately obtain the user-item prediction score. In this article, we propose a graph neural network collaborative filtering algorithm based on NGCF, named INGCF. The new algorithm uses four IndRNN layers and a feature extraction layer built on a self-attention mechanism to enhance the feature extraction capabilities of NGCF. INGCF also optimizes the user-item score prediction method to capture the complex structure of user-item interaction data and improve the accuracy of score calculations. We compare several metrics of INGCF, NGCF, PinSage, and GCN on the same data sets and perform ablation experiments. The experimental results show that the INGCF algorithm achieves a better recommendation effect.

Weifeng Sun, Kangkang Chang, Lijun Zhang, Kelong Meng

Distributed and Network-Based Computing

Frontmatter
AutoFlow: Hotspot-Aware, Dynamic Load Balancing for Distributed Stream Processing

Stream applications are widely deployed on the cloud. While modern distributed streaming systems like Flink and Spark Streaming can schedule and execute them efficiently, streaming dataflows often change dynamically, which may cause computation imbalance and backpressure. We introduce AutoFlow, an automatic, hotspot-aware, dynamic load-balancing system for streaming dataflows. It incorporates a centralized scheduler that dynamically monitors the load balance of the entire dataflow and implements state migrations accordingly. The scheduler achieves these two tasks using a simple asynchronous distributed control-message mechanism and a hotspot-diminishing algorithm. The timing mechanism supports implicit barriers and highly efficient state migration without global barriers or pauses to operators. It also supports time-window-based load-balance measurements and feeds them to the hotspot-diminishing algorithm without user interference. We implemented AutoFlow on top of Ray, an actor-based distributed execution framework. Our evaluation on various streaming benchmark datasets shows that AutoFlow achieves good load balance and incurs low latency overhead under highly data-skewed workloads.

Pengqi Lu, Yue Yue, Liang Yuan, Yunquan Zhang
BMTP: Combining Backward Matching with Tree-Based Pruning for Large-Scale Content-Based Pub/Sub Systems

Content-based publish/subscribe systems are widely deployed in many fields to realize selective data distribution. Event matching is a potential performance bottleneck in large-scale systems that maintain tens of millions of subscriptions, such as stock-data update services. To deal with the large-scale subscription problem, in this paper, we propose a new matching algorithm called BMTP, which achieves high and robust matching performance. BMTP combines a backward matching method with a tree-based pruning method. The idea behind BMTP is that after most unmatching subscriptions are filtered out, the efficiency of the backward matching method can be greatly improved. To evaluate the performance of BMTP, we conducted extensive experiments on a real-world stock market dataset. The experiment results indicate that the matching performance of BMTP is up to 96.6% better than that of its counterparts.

Zhengyu Liao, Shiyou Qian, Zhonglong Zheng, Jian Cao, Guangtao Xue, Minglu Li
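The event-matching problem BMTP addresses can be illustrated with a minimal counting-style matcher: each subscription is a conjunction of interval predicates, and an event matches when every predicate holds. BMTP's backward matching and tree-based pruning are far more elaborate than this sketch, which exists only to make the problem concrete:

```python
# Minimal counting-style event matcher for content-based pub/sub.
# Illustrates the matching problem only; BMTP's actual backward
# matching and pruning structures are not reproduced.

def match(event, subscriptions):
    """subscriptions: {sub_id: {attr: (low, high), ...}};
    event: {attr: value, ...}.
    Returns ids of subscriptions whose predicates all hold."""
    matched = []
    for sid, preds in subscriptions.items():
        hits = sum(
            1 for attr, (lo, hi) in preds.items()
            if attr in event and lo <= event[attr] <= hi)
        if hits == len(preds):  # every predicate satisfied
            matched.append(sid)
    return matched
```

The brute-force loop is O(subscriptions × predicates) per event, which is exactly why pruning structures matter at the tens-of-millions scale the paper targets.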
AF-TCP: Traffic Congestion Prediction at Arbitrary Road Segment and Flexible Future Time

Traffic congestion prediction is a fundamental yet challenging problem in Intelligent Transportation Systems (ITS). Due to the large scale of the road network and the high nonlinearity and complexity of traffic data, few methods are well suited for citywide traffic congestion prediction. In this paper, we propose a novel deep model named AF-TCP for traffic congestion prediction at an arbitrary road segment and a flexible future time. To achieve prediction at a flexible future time, we construct all time representations in a unified vector space and further improve the model's ability to perceive different horizons. On the other hand, to realize congestion prediction for an arbitrary road segment within the city, we utilize road attributes and the local neighbor structure to build the road segment representation, and design a deep model to fuse it with the corresponding historical traffic data. Extensive experiments on a real-world dataset demonstrate that our model exhibits stable performance at different prediction horizons and outperforms the baselines.

Xuefeng Xie, Jie Zhao, Chao Chen, Lin Wang
Collaborative QoS Prediction via Context-Aware Factorization Machine

With the prevalence of Web/Cloud/IoT services on the Internet, selecting high-quality services is of paramount importance for building reliable distributed applications. However, accurate quality-of-service (QoS) values are usually hard to obtain, as they are typically personalized and highly dependent on the contexts of users and services, such as locations, bandwidths, and other network conditions. Therefore, personalized and context-aware QoS prediction methods are desirable. By exploiting the QoS records generated by a set of users on a set of services, this paper proposes a collaborative QoS prediction method based on context-aware factorization machines, named CAFM, which integrates the context information of services and users into classic factorization machines. Comprehensive experiments conducted on a real-world dataset show that the proposed method significantly outperforms existing methods in prediction accuracy.

Wenyu Tang, Mingdong Tang, Wei Liang
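The classic second-order factorization machine CAFM builds on predicts y(x) = w0 + Σ wᵢxᵢ + Σᵢ<ⱼ ⟨vᵢ,vⱼ⟩xᵢxⱼ, and the pairwise term can be computed in O(kn) via the standard reformulation. A sketch with illustrative (not learned) weights:

```python
# Second-order factorization machine prediction,
#   y(x) = w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j,
# using the O(k*n) reformulation of the pairwise term.
# Weights here are illustrative, not learned from QoS data.

def fm_predict(x, w0, w, V):
    """x: feature vector; w0: bias; w: linear weights; V: n x k factor matrix."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s2 = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s2)   # = sum_{i<j} v_if v_jf x_i x_j
    return linear + pairwise
```

Context-aware variants like CAFM simply append encoded context attributes (location, network condition) to the feature vector x.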
TDCT: Target-Driven Concolic Testing Using Extended Units by Calculating Function Relevance

Concolic unit testing can perform comprehensive analysis on a small function of a program. However, due to the following disadvantages, it cannot be widely and effectively applied to testing whole programs. One is that it produces many false positives for lack of context-dependent information. The other is that it is difficult to automatically generate whole-program inputs from unit inputs. Researchers have proposed different ways to solve these problems, but they cause inaccuracy or performance problems to some extent. In this paper, we present a method called Target-Driven Concolic Testing (TDCT) to meet these challenges, which combines concolic unit testing and concolic testing. TDCT is a fine-grained method based on the interprocedural control flow graph (ICFG) to construct extended units, obtaining as comprehensive and accurate a context of the target function as possible. We present a custom target-driven search strategy in concolic execution to automatically generate whole-program inputs from unit inputs. It not only reduces the system performance overhead by discarding the search of irrelevant paths, but also further validates the authenticity of the potential bugs. We implement a prototype system of TDCT and apply it to 4 real-world C programs. The experiments show that TDCT can find 83.87% of the target bugs with high precision, at a true-to-false alarm ratio of 1:5.2. This indicates that TDCT can effectively and accurately detect bugs and automatically generate whole-program inputs from unit inputs.

Meng Fan, Wenzhi Wang, Aimin Yu, Dan Meng
Multi-task Allocation Based on Edge Interaction Assistance in Mobile Crowdsensing

This paper studies the problem of multi-task allocation in hotspots. Due to the frequent task requests in hotspots, traditional mobile crowdsensing (MCS) cannot meet their requirements, so we adopt a distributed MCS architecture to complete this process. This paper proposes an MCS multi-task allocation framework based on edge interaction assistance, which uses edge nodes to distribute the computing load of the MCS platform. In the case of short time windows and heavy task loads, the goal is to minimize delay and cost while ensuring data quality, so the multi-task assignment problem is expressed as a multi-objective optimization problem. Through an analysis of the process, we propose a two-phase allocation scheme (TPAS) to determine task allocation. The first phase is task selection based on cosine similarity; the number of tasks can be reduced by considering the similarity of tasks. The second phase is task allocation based on an improved NSGA-II algorithm. The simulation results show that, under the premise of satisfying task data quality, TPAS achieves lower task cost and delay and significantly improves the task completion rate.

Wenjuan Li, Guangsheng Feng, Yun Huang, Yuzheng Liu
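The cosine similarity used in TPAS's first phase is the standard measure cos(a, b) = a·b / (‖a‖‖b‖); tasks whose feature vectors score near 1 can be merged. A minimal sketch, with illustrative task feature vectors:

```python
import math

# Cosine similarity between two task feature vectors, the measure
# TPAS's first phase uses to group similar tasks. Feature encodings
# are illustrative.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)
```

Orthogonal vectors score 0 (unrelated tasks); parallel vectors score 1 (candidates for merging).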
A Variable-Way Address Translation Cache for the Exascale Supercomputer

Most parallel programs running on exascale supercomputers use virtual addresses instead of physical addresses. Therefore, the virtual-to-physical address translation mechanism is necessary and critical to bridge hardware designs, driver programs, and software applications. We propose a novel virtual-to-physical address translation architecture. It includes a variable-way address translation Cache (VwATC), a stall-hidden refresh scheme, an address validity checker, and many reliability designs. The VwATC can be configured to four different numbers of tag ways and data memory capacities, and the unused memory banks are then clock-gated to reduce power dissipation. The VwATC adopts large-capacity eDRAM (embedded Dynamic Random Access Memory) to meet the high hit-ratio requirements of mapping virtual addresses to physical addresses. It can also run in a dual mode, switching the Cache to buffer mode to reduce access latency. Many experiments have been conducted on the real chip that implements the scalable address translation Cache. The results show that the VwATC has a high hit ratio when running well-known benchmarks and saves dynamic power under different way configurations.

Tiejun Li, Jianmin Zhang, Siqing Fu, Kefan Ma
PPCTS: Performance Prediction-Based Co-located Task Scheduling in Clouds

With increasing market competition among commercial cloud computing infrastructures, major cloud service providers are building co-located data centers to deploy online services and offline jobs in the same cluster to improve resource utilization. However, research on the characteristics of co-location is still immature. Therefore, stable solutions for collaborative resource scheduling are still essentially absent, which restricts the further optimization of co-location technology. In this paper, we present a performance prediction-based co-located task scheduling (PPCTS) model to perform fine-grained resource allocation under Quality-of-Service (QoS) constraints. PPCTS improves the overall cluster CPU resource utilization to 56.211%, which is competitive with current related work. Besides, this paper fully implements a co-located system prototype. Compared with traditional simulator-based scheduling research, the prototype proposed in this paper is easier to migrate to a real production environment.

Tianyi Yuan, Dongyang Ou, Jiwei Wang, Congfeng Jiang, Christophe Cérin, Longchuan Yan

Edge Computing and Edge Intelligence

Frontmatter
Risk-Aware Optimization of Distribution-Based Resilient Task Assignment in Edge Computing

Computing resources of mobile devices are growing, and unoccupied resources can be shared to support edge computing services in edge clouds. Unlike stable servers, a significant challenge is that mobile devices may exit or join an edge cloud at any time due to changes of position. This dynamic nature of mobile devices may result in aborted task executions. In this paper, a risk-aware task assignment scheme called RATA is proposed. RATA minimizes the overhead caused by potential aborted task executions by assigning tasks preferentially to the edge nodes that are unlikely to exit during task execution. We first quantify the abortion risk of each task-node pair as an expected extra overhead time, and formulate a risk-aware task assignment problem that strives to minimize the average completion time of all tasks, as well as the expected extra overhead time of each task. We then design a novel task assignment scheme to solve this problem with a genetic algorithm. Finally, we implement a prototype system to evaluate the performance. The experimental results show that our scheme outperforms state-of-the-art task assignment schemes in terms of average completion time and deadline miss rate in most cases.

Hang Jin, Jiayue Liang, Fang Liu
Budget-Aware Scheduling for Hyperparameter Optimization Process in Cloud Environment

Hyperparameter optimization, a necessary step for the majority of machine learning models, is crucial to achieving optimal model performance. Unfortunately, the process of hyperparameter optimization is usually computation-intensive and time-consuming due to the large search space. With the popularity and maturity of cloud computing, many researchers leverage public cloud services (e.g., Amazon AWS) to train machine learning models. Time and monetary cost, two contradictory targets, are what cloud machine learning users are most concerned about. In this paper, we propose HyperWorkflow, a workflow engine service for hyperparameter optimization that coordinates between the hyperparameter optimization job and cloud service instances. HyperWorkflow orchestrates the hyperparameter optimization process in a parallel and cost-effective manner on heterogeneous cloud resources, and schedules hyperparameter trials using a bin-packing approach to make the best use of cloud resources and speed up tuning under a budget constraint. The evaluations show that HyperWorkflow can speed up hyperparameter optimization execution across a range of different budgets.

Yan Yao, Jiguo Yu, Jian Cao, Zengguang Liu
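Packing trials onto instances is a bin-packing problem, and first-fit decreasing is the classic heuristic for it: sort demands in descending order and place each into the first instance with room. Whether HyperWorkflow uses exactly this variant is an assumption of this sketch:

```python
# First-fit-decreasing bin packing: pack trial resource demands onto
# the fewest identical-capacity instances. A classic heuristic sketch;
# HyperWorkflow's exact packing policy is not specified here.

def first_fit_decreasing(trial_costs, capacity):
    """Returns (instance_count, [(cost, instance_index), ...])."""
    free = []        # remaining capacity per opened instance
    assignment = []
    for cost in sorted(trial_costs, reverse=True):
        for i, room in enumerate(free):
            if cost <= room:
                free[i] -= cost
                assignment.append((cost, i))
                break
        else:            # no existing instance fits; open a new one
            free.append(capacity - cost)
            assignment.append((cost, len(free) - 1))
    return len(free), assignment
```

Sorting first keeps large trials from fragmenting capacity; FFD is guaranteed to use at most roughly 11/9 of the optimal number of bins.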
Workload Prediction and VM Clustering Based Server Energy Optimization in Enterprise Cloud Data Center

Server energy consumption is an important issue in data center energy management. Optimizing server energy is also necessary to reduce the energy consumption of data center cooling and power supply and to lower the operating cost of the whole data center. High server energy consumption is mainly caused by the excessive allocation of IT resources according to the highest application workload. This paper studies an optimization algorithm for server energy consumption in an enterprise cloud environment. By introducing the deep learning model LSTM to predict application workload, the proposed algorithm can dynamically determine the start-up and shut-down times of virtual machines (VMs) and physical machines (PMs) according to the workload, so as to match IT resources to application workload needs. The K-means clustering algorithm is used to find VMs with similar start-up and shut-down times and place them in the same PM group. By properly extending the running time and increasing the number of VMs, the algorithm can compensate for inaccurate prediction and workload fluctuation and guarantee application QoS. The simulation results show that the proposed method can reduce server energy consumption by 45–53% with a QoS guarantee when the prediction relative error is 20%, providing a good balance between energy saving and application QoS.

Longchuan Yan, Wantao Liu, Biyu Zhou, Congfeng Jiang, Ruixuan Li, Songlin Hu
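The VM grouping step above is plain K-means over schedule features. A toy Lloyd-iteration K-means over (start-hour, stop-hour) pairs shows the idea; the LSTM workload prediction that feeds it is omitted, and the schedule data is illustrative:

```python
import random

# Toy K-means over (start-hour, stop-hour) pairs, sketching how VMs
# with similar on/off schedules could be grouped onto the same PM
# group. Pure Lloyd iterations; data and k are illustrative.

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # random initial centers
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                     # assign to nearest center
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centers[i])))
            clusters[j].append(p)
        centers = [                          # recompute centroids
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters
```

On a small, well-separated schedule set ("daytime" VMs vs. "night" VMs), the iterations converge to the obvious two groups regardless of initialization.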
Soft Actor-Critic-Based DAG Tasks Offloading in Multi-access Edge Computing with Inter-user Cooperation

Multi-access edge computing (MEC) enables mobile applications, which consist of multiple dependent subtasks, to be executed in full parallelism between edge servers and mobile devices. The idea is to offload some of the subtasks to edge servers while respecting the subtask dependencies, modeled as directed acyclic graphs (DAGs). In some cases involving multiple users running the same application, users can cooperate to improve the application's performance (e.g., object detection accuracy in connected and autonomous vehicles) by sharing intermediate results of the subtasks, which is referred to as cooperation gain. However, inter-user cooperation also introduces additional dependencies between the subtasks of different users, which inevitably increases the execution delay of DAG task offloading. Therefore, how to jointly optimize execution delay and cooperation gain remains an open question. In this paper, we present an approach for DAG task offloading in MEC with inter-user cooperation that optimizes both the execution delay of the application and the cooperation gain. First, we introduce the concept of application utility, which incorporates both delay and cooperation gain. We formulate the DAG task offloading problem with inter-user cooperation as a Markov decision process (MDP) to maximize the total application utility. Next, we propose a quantized soft actor-critic (QSAC) algorithm to solve the formulated problem. Specifically, QSAC generates multiple potential solutions according to an order-preserving quantization method, increasing exploration efficiency to avoid falling into local optima while maintaining overall stability. Finally, simulation results validate that the proposed algorithm significantly improves the total application utility compared with other benchmark algorithms.

Pengbo Liu, Shuxin Ge, Xiaobo Zhou, Chaokun Zhang, Keqiu Li

Service Dependability and Security Algorithms

Frontmatter
Sensor Data Normalization Among Heterogeneous Smartphones for Implicit Authentication

Nowadays, smartphones have become an important part of people's lives. Traditional explicit authentication mechanisms can hardly protect the privacy of users, which makes implicit authentication mechanisms a hot topic. Considering the non-uniform sensor data collected from heterogeneous smartphones, e.g., different series or versions, a sensor data normalization solution for heterogeneous smartphones is proposed in this paper. First, the sensors used to collect data are selected by whether they can describe users' behavioral characteristics well. On this basis, we propose a sensor data normalization solution covering three aspects: proportion-based, statistics-based, and attitude-based normalization. In particular, we establish an adaptive complementary filter with a dynamic parameter, making it possible to adjust the proportions of the accelerometer and gyroscope. Finally, based on 2,000 samples collected from 40 volunteers on three different smartphones, we perform experiments to evaluate the proposed solution. Compared with non-normalized data, our solution increases authentication Accuracy and AUC by 15% and 18%, respectively, and reduces FAR and FRR by 10% and 18%, respectively.

Zuodong Jin, Muyan Yao, Dan Tao
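A complementary filter fuses the integrated gyroscope rate (accurate short-term, drifts long-term) with the accelerometer-derived angle (noisy short-term, stable long-term) via a blending weight α. The paper's adaptive variant varies α dynamically; the sketch below uses a fixed illustrative weight:

```python
# One step of a complementary filter fusing gyroscope and accelerometer
# angle estimates. The paper adapts alpha dynamically; here alpha is a
# fixed illustrative weight.

def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """angle: previous fused estimate (deg); gyro_rate: deg/s;
    accel_angle: angle from accelerometer (deg); dt: step (s)."""
    return alpha * (angle + gyro_rate * dt) + (1 - alpha) * accel_angle
```

With α near 1 the filter trusts the gyro over short intervals while the accelerometer term slowly corrects drift; an adaptive α would raise or lower that trust depending on motion conditions.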
Privacy-Preserving and Reliable Federated Learning

In the Internet of Things (IoT), it is often impossible to share datasets owned by different participants (usually IoT devices) for machine learning model training due to privacy concerns. Federated learning (FL) is a promising technique to address this challenge. However, existing FL schemes face the problem of how to avoid low-quality or malicious updates. To solve this problem, we propose a privacy-preserving and reliable federated learning scheme (PPRFLS) to select reliable participants and evaluate the quality of the participants' updates. Analysis shows that the proposed scheme achieves data privacy and model reliability.

Yi Lu, Lei Zhang, Lulu Wang, Yuanyuan Gao
Security Authentication of Smart Grid Based on RFF

Due to the openness of the smart grid, traditional physical isolation technology can no longer meet the communication security requirements of smart grid systems, and the security authentication technologies commonly used today require strong computing power, which leads to inefficiency. To address these problems, we propose a smart grid security authentication method based on radio frequency fingerprints (RFFs). On the one hand, we propose a lightweight deep learning framework for extracting RFFs from raw received signals, which reduces the complexity of the network model while achieving high accuracy. On the other hand, we propose an improved triplet loss to perform open-set recognition. The improved triplet loss makes features of the same class more compact, so as to effectively distinguish unknown signals. We use radio frequency (RF) signals collected in realistic environments to conduct experiments, and the results prove the effectiveness and robustness of our method.

Yang Lei, Caidan Zhao, Yilin Wang, Yicheng Zheng, Lei Zhang
SEPoW: Secure and Efficient Proof of Work Sidechains

Since the advent of sidechains in 2014, they have been acknowledged as the key enabler of blockchain interoperability and upgradability. However, sidechains suffer from significant challenges such as centralization, inefficiency and insecurity, meaning that they are rarely used in practice. In this paper, we present SEPoW, a secure and efficient sidechain construction that is suitable for proof of work (PoW) sidechain systems. The drawbacks of centralized exchange of cross-chain assets in the participating blockchains are overcome by our decentralized SEPoW. To reduce the size of a cross-chain proof, we introduce merged mining into SEPoW such that the proof consists of two Merkle tree paths regardless of the size of the current blockchain. We prove that the proposed SEPoW achieves the desirable security properties that a secure sidechain construction should have. As an exemplary concrete instantiation, we propose SEPoW for a PoW blockchain system consistent with Bitcoin. We evaluate the size of the SEPoW proof and compare it with state-of-the-art PoW sidechain protocols. Results demonstrate that SEPoW achieves a proof size of 416 bytes, which is roughly 123×, 510× and 62000× smaller than the zkRelay proof, PoW sidechains proof and BTCRelay proof, respectively.

Taotao Li, Mingsheng Wang, Zhihong Deng, Dongdong Liu
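The cross-chain proof above consists of Merkle tree paths, whose verification is standard. A minimal sketch of building and checking one such path (Bitcoin-style duplication of the last node on odd levels; all names are illustrative, not SEPoW's actual proof format):

```python
import hashlib

def h(b):
    return hashlib.sha256(b).digest()

def merkle_root(leaves):
    """Fold hashed leaves pairwise up to a single root."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])       # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_path(leaves, index):
    """Collect the sibling hash at each level from leaf `index` to the root."""
    level = [h(x) for x in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sib = index ^ 1
        path.append((level[sib], sib < index))  # (sibling hash, sibling-on-left?)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify_path(leaf, path, root):
    """Recompute the root from a leaf and its sibling path."""
    node = h(leaf)
    for sib, sib_is_left in path:
        node = h(sib + node) if sib_is_left else h(node + sib)
    return node == root

txs = [b"tx0", b"tx1", b"tx2", b"tx3"]
root = merkle_root(txs)
proof = merkle_path(txs, 2)
```

The proof size grows only logarithmically with the number of leaves, which is why a path-based proof stays small regardless of chain length.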
A Spectral Clustering Algorithm Based on Differential Privacy Preservation

Spectral clustering is a widely used clustering algorithm, thanks to its simple implementation, small computational cost, and good adaptability to arbitrarily shaped data sets. However, the spectral clustering algorithm lacks a data protection mechanism, and the processed data often contain a large amount of sensitive user information, so there is a risk of privacy leakage. To address this risk, a spectral clustering algorithm based on differential privacy protection is proposed in this paper, which uses the Laplace mechanism to add noise to the input data, perturbing the original data, and then performs spectral clustering, thereby achieving privacy protection. Experiments show that the algorithm is both stable and usable: it can correctly complete the clustering task with a small loss of accuracy, prevent reconstruction attacks, greatly reduce the risk of sensitive information leakage, and effectively protect the model and the original data.

Yuyang Cui, Huaming Wu, Yongting Zhang, Yonggang Gao, Xiang Wu
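The Laplace mechanism used above is standard: each input entry is perturbed with noise drawn from Laplace(0, sensitivity/ε) before clustering. A minimal sketch (function names and the per-entry sensitivity assumption are illustrative, not the paper's exact calibration):

```python
import math
import random

def laplace_noise(scale, rng):
    """Inverse-CDF sample from the Laplace(0, scale) distribution."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def perturb_matrix(data, epsilon, sensitivity=1.0, seed=0):
    """Add Laplace(sensitivity/epsilon) noise to every entry of the input
    matrix; smaller epsilon means stronger privacy and larger perturbation."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [[x + laplace_noise(scale, rng) for x in row] for row in data]

data = [[1.0, 2.0], [3.0, 4.0]]
noisy = perturb_matrix(data, epsilon=1.0)   # this noisy matrix is then clustered
```

Because the noise is added to the input rather than the output, any downstream computation, spectral clustering included, inherits the differential privacy guarantee by post-processing.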
A Compact Secret Image Sharing Scheme Based on Flexible Secret Matrix Sharing Scheme

In social networks, Secret Image Sharing (SIS) provides an effective way to protect secret images. However, most existing SIS schemes only support limited access policies, which are not flexible enough for many scenarios. In this work, we propose an SIS scheme to address this shortcoming. The core contributions of our scheme can be summarized as follows. Firstly, we propose a Secret Matrix Sharing Scheme (SMSS), which extends the traditional Linear Secret Sharing Scheme (LSSS). Different from LSSS, SMSS shares a secret matrix instead of a single secret value. Secondly, based on our SMSS, we propose an SIS scheme named LM-SIS, which supports monotonous access policies. Compared with other SIS schemes, our scheme has advantages in flexibility and efficiency. Furthermore, the LM-SIS scheme is compact, which is reflected in its shadow size ratio of approximately 1/k, where k denotes the root threshold of the access tree. Finally, our scheme provides lossless or approximately lossless recovery, and experimental results show that the PSNR of the recovered images is always greater than 30 dB and the SSIM usually exceeds 0.99. By sacrificing a little storage cost, our LM-SIS scheme can achieve lossless recovery.

Lingfu Wang, Jing Wang, Mingwu Zhang, Weijia Huang
Robust Multi-model Personalized Federated Learning via Model Distillation

Federated Learning (FL) is a popular privacy-preserving machine learning paradigm that enables the creation of a robust centralized model without sacrificing the privacy of clients’ data. FL has a wide range of applications, but it does not integrate the idea of an independent model design for each client, which is imperative in areas like healthcare, the banking sector, or AI as a Service (AIaaS), due to the paramount importance of data and the heterogeneous nature of tasks. In this work, we propose a Robust Multi-Model FL (RMMFL) framework, an extension of FedMD, which under the same set of assumptions significantly improves the results of each individual model. RMMFL makes two changes to the FedMD training process: first, a high-entropy aggregation method is introduced to soften the output predictions; second, a weighted ensemble technique weighs the predictions of each client model in line with their performance. Extensive experiments are performed on heterogeneous models using CIFAR/MNIST as benchmark datasets; the simulation results show that RMMFL improves accuracy by over 5% compared to the baseline method.

Adil Muhammad, Kai Lin, Jian Gao, Bincai Chen
V-EPTD: A Verifiable and Efficient Scheme for Privacy-Preserving Truth Discovery

Privacy-preserving truth discovery has been researched from many perspectives in the past few years. However, the complex iterative computation and the multi-user setting make it challenging to design a verifiable algorithm for it. In this paper, we propose a novel scheme named V-EPTD that not only protects private information but also verifies the computation in truth discovery. The proposed technique adopts a threshold Paillier cryptosystem to solve the multi-user problem, so that all parties encrypt their data with the same public key while being unable to decrypt the ciphertext unless enough parties cooperate. V-EPTD also transforms the complex iterative computation into polynomials and uses a linear homomorphic hash and commitments to complete verification. The experimentation and analysis show that V-EPTD performs well for users, verifiers, and the server, in both communication overhead and computation overhead.

Chang Xu, Hongzhou Rao, Liehuang Zhu, Chuan Zhang, Kashif Sharif
FedSP: Federated Speaker Verification with Personal Privacy Preservation

Automatic speaker verification (ASV) has been widely applied in a variety of industrial scenarios. In ASV, the universal background model (UBM) needs to be trained with a large variety of speaker data so that the UBM can learn the speaker-independent distribution of speech features for all speakers. However, the sensitive information contained in raw speech data is important and private to the speaker. Under recent European Union privacy regulations, uploading private raw speech data to a cloud server is forbidden. Thus, a new ASV model is needed to alleviate data scarcity and protect data privacy simultaneously in industry. In this work, we propose a novel framework named Federated Speaker Verification with Personal Privacy Preservation, or FedSP, which enables multiple clients to jointly train a high-quality speaker verification model and provides strict privacy preservation for speakers. For data scarcity, FedSP is based on the federated learning (FL) framework, which keeps raw speech data on each device while jointly training the UBM to learn the speech features well. For privacy preservation, FedSP is stricter than the traditional FL framework, selecting and hiding sensitive information in raw speech data before jointly training the UBM. Experimental results on two pairs of speech datasets demonstrate that FedSP achieves superior performance in terms of data utility and privacy preservation.

Yangqian Wang, Yuanfeng Song, Di Jiang, Ye Ding, Xuan Wang, Yang Liu, Qing Liao
Security Performance Analysis for Cellular Mobile Communication System with Randomly-Located Eavesdroppers

Due to the broadcast nature of wireless communication, it is inherently vulnerable to eavesdropping, and its security performance has been widely studied. Our research focuses on the security performance analysis of 5G mobile cellular networks. Each network cell is modeled as a regular hexagonal area. For cellular users, we consider their mobility and adopt a Random Waypoint mobility model for mobile nodes, with multiple eavesdroppers randomly located in the network. To analyze the security performance between a Base Station (BS) and a mobile user, the nodal distance distribution is required. We propose an approach to obtain the distribution of the distance between an arbitrary reference node, inside or outside the regular hexagon, and a mobile node inside the regular hexagon. The distance distribution between the mobile user and the BS located in the center of the regular hexagon is then derived using this approach. Further, we analyze the Signal-to-Noise Ratio of the mobile user and eavesdroppers and the Secrecy Outage Probability in a downlink scenario. The validity of the results is verified by extensive Monte Carlo simulations.

Ziyan Zhu, Fei Tong, Cheng Chen
Short and Distort Manipulations in the Cryptocurrency Market: Case Study, Patterns and Detection

Recently, with the development of blockchain, cryptocurrencies such as Bitcoin are gaining popularity. At the same time, however, the lack of laws and supervision has led to an endless stream of frauds and market manipulations, which seriously impact the interests of investors and the development of the economy. This paper is the first to study Short and Distort, a manipulation that is the opposite of Pump and Dump, in the cryptocurrency market. Short and Distort is explained through a case study. Further, we mine three important patterns with ordinary least squares regression: manipulated cryptocurrencies show price reversal, abnormal trading volume, and widening bid-ask spreads. Moreover, a model is constructed to detect Short and Distort. Thorough experiments show that our model is superior to four other methods.

Xun Sun, Xi Xiao, Wentao Xiao, Bin Zhang, Guangwu Hu, Tian Wang
Privacy-Preserving Swarm Learning Based on Homomorphic Encryption

Swarm learning is a decentralized machine learning method that combines edge computing with a blockchain-based peer-to-peer network, maintaining consistency without a central coordinator for data integration between different medical institutions. Swarm learning addresses problems of federated learning, namely that model parameters are still handled by a central custodian and that the star topology reduces fault tolerance. Although swarm learning allows fair, transparent, and highly regulated shared data analysis while meeting privacy laws, inference attacks targeting the model information shared between nodes can still lead to privacy disclosure. Therefore, we propose a privacy-preserving swarm learning scheme based on homomorphic encryption: we use the threshold Paillier cryptosystem to encrypt the shared local model information without affecting the accuracy of the model. In addition, we design a partial decryption algorithm to prevent privacy disclosure caused by aggregated model information. Experiments show that our scheme guarantees strong privacy and has negligible influence on the accuracy of the final model.

Lijie Chen, Shaojing Fu, Liu Lin, Yuchuan Luo, Wentao Zhao
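The additive homomorphism of the Paillier cryptosystem is what allows encrypted local model values to be aggregated without decryption. Below is a toy sketch of textbook (non-threshold) Paillier with deliberately tiny parameters, purely for illustration; the paper's scheme uses the threshold variant with partial decryption, which is not shown here:

```python
import math

# Toy keypair with small primes (real deployments use >= 2048-bit moduli).
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1                      # standard choice g = n + 1
lam = math.lcm(p - 1, q - 1)   # Carmichael lambda(n); math.lcm needs Python 3.9+

def L(u):
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)  # modular inverse via 3-arg pow (3.8+)

def encrypt(m, r):
    """E(m) = g^m * r^n mod n^2, with r coprime to n."""
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphism: E(m1) * E(m2) mod n^2 decrypts to m1 + m2, so a
# server can sum encrypted model updates without ever seeing plaintexts.
c1 = encrypt(42, r=17)
c2 = encrypt(58, r=23)
aggregate = decrypt((c1 * c2) % n2)
```

In the threshold variant, `lam` is secret-shared so that no single party (including the server) can run `decrypt` alone, which is the property the scheme relies on.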

Software Systems and Efficient Algorithms

Frontmatter
A Modeling and Verification Method of Modbus TCP/IP Protocol

With the informatization of industrial control systems, industrial communication protocols face greater data pressure and, at the same time, more security threats. In this paper, we use a method that transforms an STM (State Transition Matrix) model into an UPPAAL (a tool for verifying real-time systems) model. To understand the model clearly and avoid mistakes in the early stage of modeling, we first establish an STM model for the Modbus TCP/IP protocol and then transform it into an UPPAAL model. Five types of properties are verified with the UPPAAL tool, including unreachable properties found during STM modeling. These five types of properties establish the credibility of the Modbus TCP/IP protocol. The experimental results show that this method provides a clear understanding of the model in the early stage, and that after converting the STM model into an UPPAAL model, more constraints can be found by referring back to the STM model. Unlike existing methods, ours studies the credibility of the protocol itself; therefore, it can find the root of a problem and solve it.

Jie Wang, Zhichao Chen, Gang Hou, Haoyu Gao, Pengfei Li, Ao Gao, Xintao Wu
Completely Independent Spanning Trees in the Line Graphs of Torus Networks

Due to their applications in reliable information transmission, parallel transmission and safe distribution of information, and parallel diagnosis algorithms for faulty servers, completely independent spanning trees (CISTs) play important roles in interconnection networks. So far, researchers have obtained many results on CISTs in specific interconnection networks, but results for their line graphs are limited. Some data center networks are constructed from the line graphs of interconnection networks, such as SWCube, BCDC, AQLCube, etc. Therefore, it is also necessary to study the construction of CISTs in line graphs. The torus is one of the most popular interconnection networks. The line graph of a torus network is 6-regular, and whether it contains 3-CISTs is an open question. In this article, we establish the relationship between completely independent spanning trees in the line graph and the edge division of the original graph. By dividing the edges of the torus network, we can construct three completely independent spanning trees in its line graph in some cases. Simulation experiments are also implemented to verify the validity.

Qingrong Bian, Baolei Cheng, Jianxi Fan, Zhiyong Pan
Adjusting OBSS/PD Based on Fuzzy Logic to Improve Throughput of IEEE 802.11ax Network

Densely-deployed Wireless Local Area Networks (WLANs) exhibit underutilization of radio frequency, low network performance, etc. To enhance spectrum efficiency and improve network performance, the recently published IEEE 802.11ax standard introduces new technologies, including Orthogonal Frequency Division Multiple Access (OFDMA) and Spatial Reuse (SR). In SR, Overlapping Basic Service Set (OBSS) Packet Detection (OBSS/PD) is applied to increase parallel transmissions. The standard, however, leaves unsolved the problem of how to set a suitable OBSS/PD for the nodes in an OBSS. This paper addresses this problem and proposes the fuzzy-logic-based OBSS/PD adjustment (FLOPA) scheme. The proposed FLOPA scheme integrates SR with OFDMA and lets the Access Point (AP) maintain the OBSS/PDs for multiple nodes. FLOPA embeds a fuzzy control system in which the received signal strength indicator (RSSI), network throughput, and currently applied OBSS/PD are the input variables, and the output variable is used to adjust the OBSS/PD. Simulation results show that the proposed scheme can effectively improve network throughput. Compared with the Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA) scheme in the 802.11ax standard, network throughput can be increased by more than 100%.

Chu Xin, Yi-hua Zhu
Two-Stage Evolutionary Algorithm Using Clustering for Multimodal Multi-objective Optimization with Imbalance Convergence and Diversity

This paper proposes a two-stage multimodal multi-objective evolutionary algorithm using clustering, termed TS_MMOEAC. In the first-stage evolution of TS_MMOEAC, convergence-penalized density (CPD) based k-means clustering is used to divide the population into multiple subpopulations. Subsequently, a local archive mechanism is adopted to maintain diverse local Pareto optimal solutions in the decision space. In the second-stage evolution, an identical k-means clustering based on the distance among individuals in the decision space is applied to form multiple subpopulations. In this case, the convergence of local Pareto optimal solutions is accelerated. Meanwhile, equivalent Pareto optimal solutions with an imbalance between convergence and diversity are located by a similar local archive method with a larger clearing radius. Experimental results validate the superior performance of TS_MMOEAC, which is capable of finding equivalent Pareto optimal solutions with an imbalance between convergence and diversity.

Guoqing Li, Wanliang Wang, Yule Wang
SLA: A Cache Algorithm for SSD-SMR Storage System with Minimum RMWs

To satisfy the low-cost and massive data storage requirements imposed by data growth, Shingled Magnetic Recording (SMR) disks are extensively employed in high-density storage. The primary drawback of SMR disks is the write amplification caused by their sequential write constraint, which is becoming more prominent in non-cold archival workloads. Although a hybrid storage system comprised of Solid State Disks (SSDs) and SMR disks may alleviate this issue, current SSD cache replacement algorithms are still limited to the management method of Least Recently Used (LRU). The LRU queue, however, is ineffective in reducing the triggering of Read-Modify-Write (RMW) operations, a critical factor for the performance of SMR disks. In this paper, we propose a new SMR Locality-Aware (SLA) algorithm based on a band-based management method. SLA adopts the Dual Locality Compare (DLC) strategy to solve the hit rate reduction caused by traditional band-based management, as well as the Relatively Clean Band First (RCBF) strategy to further minimize the number of RMW operations. Experiments indicate that, compared to the MSOT method, the SLA algorithm maintains a hit rate similar to that of LRU while reducing the number of RMWs by 77.2% and the SMR disk write time by 95.1%.

Xuda Zheng, Chi Zhang, Keqiang Duan, Weiguo Wu, Jie Yan
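The LRU baseline that band-based schemes such as SLA improve on can be sketched in a few lines (block identifiers and capacity are illustrative):

```python
from collections import OrderedDict

class LRUCache:
    """Classic LRU replacement: on a hit the block moves to the MRU end;
    on a miss with a full cache the least recently used block is evicted.
    In an SSD-SMR system, evicting a dirty block is what may trigger an
    RMW on the SMR disk, which is why eviction order matters so much."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def access(self, block):
        """Returns True on a hit, False on a miss."""
        if block in self.store:
            self.store.move_to_end(block)   # refresh recency
            return True
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict least recently used
        self.store[block] = True
        return False

cache = LRUCache(capacity=2)
trace = [1, 2, 1, 3, 2]                     # block 2 is evicted before its reuse
hits = sum(cache.access(b) for b in trace)
```

Because LRU orders blocks only by recency, it is blind to which SMR band a block belongs to, which is exactly the locality information SLA's band-based management exploits.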
Trajectory Similarity Search with Multi-level Semantics

With the widespread popularity of intelligent mobile devices, massive trajectory data have been captured by mobile devices. Although trajectory similarity search has been studied for a long time, most existing work merely considers spatial and temporal features or single-level semantic features, and is thus insufficient to support complex scenarios. Firstly, we define the multi-level semantic trajectory to support flexible queries for more scenarios. Secondly, we present a new “spatial + multi-level semantic” trajectory similarity query and propose a framework to efficiently find the k most similar trajectories in a trajectory database. Finally, to speed up query processing, we build a multi-layer inverted index for trajectories, design four lightweight pruning rules, and propose an adaptive updating method. Thorough experimental results show that our approach works efficiently in extensive and flexible scenarios.

Jianbing Zheng, Shuai Wang, Cheqing Jin, Ming Gao, Aoying Zhou, Liang Ni
Nonnegative Matrix Factorization Framework for Disease-Related CircRNA Prediction

Nowadays, more and more studies have shown that circular RNA (circRNA) is related to the occurrence and development of human diseases. It is necessary to use computational approaches to predict potential and unknown associations between circRNAs and diseases, which saves time and money in developing drugs to treat disease. In this paper, we propose a new computational model called the robust nonnegative matrix factorization framework (RNMF), which learns a robust, association-discriminative representation by using the similarity structure of the network space and the $$\ell_{2,1}$$-norm loss function. Specifically, we first use an integrated circRNA similarity network and an integrated disease similarity network to update the known association matrix from the perspective of matrix multiplication. Then, the problems of noise and outliers in the updated association matrix are addressed by the $$\ell_{2,1}$$-norm loss function. Finally, the computational model is expressed as an optimization problem whose objective function is solved by an iterative algorithm. Experimental results show that our model achieves better performance than state-of-the-art methods in several aspects.

Cheng Yang, Li Peng, Wei Liu, Xiangzheng Fu, Ni Li
A Fast Authentication and Key Agreement Protocol Based on Time-Sensitive Token for Mobile Edge Computing

Compared with cloud computing, Mobile Edge Computing (MEC) can sink some services and functions from cloud servers to edge nodes to reduce network latency and provide real-time services. However, MEC not only inherits the security issues of cloud computing but also faces new security risks. To ensure the security and privacy of messages transmitted over the channel, a secure Authentication and Key Agreement (AKA) protocol is essential. However, existing AKA protocols are either not lightweight enough or require a cloud server or trusted third party to participate in the authentication process. Therefore, this paper designs a fast AKA protocol based on a time-sensitive token for MEC. With this protocol, a terminal node can achieve fast authentication through the applied token, and the authentication process only requires the related edge node to participate. Simulation results based on ProVerif and an informal security analysis show that our protocol can resist various common attacks. Comparison with related protocols shows that our protocol incurs very little computational and communication cost to authenticate a terminal node with a token.

Zisang Xu, Wei Liang, Jin Wang, Jianbo Xu, Li-Dan Kuang
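A time-sensitive token can be sketched as a keyed MAC over the device identity and the current time window, so the token expires automatically. Everything below (field layout, window length, function names) is an illustrative assumption, not the paper's actual ProVerif-verified protocol:

```python
import hashlib
import hmac

WINDOW = 30  # token validity window in seconds (illustrative)

def issue_token(key: bytes, device_id: str, now: float) -> str:
    """Bind the device identity to the current time window with an HMAC."""
    epoch = int(now // WINDOW)
    mac = hmac.new(key, f"{device_id}|{epoch}".encode(), hashlib.sha256)
    return f"{device_id}|{epoch}|{mac.hexdigest()}"

def verify_token(key: bytes, token: str, now: float) -> bool:
    """Accept only tokens minted for the current window with a valid MAC."""
    device_id, epoch_s, digest = token.split("|")
    if int(epoch_s) != int(now // WINDOW):      # expired or future token
        return False
    mac = hmac.new(key, f"{device_id}|{epoch_s}".encode(), hashlib.sha256)
    return hmac.compare_digest(mac.hexdigest(), digest)

key = b"shared-secret"
tok = issue_token(key, "sensor-7", now=1000.0)
```

Verification needs only the shared key and the local clock, so the edge node can authenticate the terminal without contacting a cloud server or third party, which matches the design goal described above.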
Ferproof: A Constant Cost Range Proof Suitable for Floating-Point Numbers

The decentralization of the blockchain has led to the leakage of private data at the transaction layer, causing information security issues. A zero-knowledge range proof can confidentially verify that a transaction amount belongs to a legal positive-integer range without revealing the transaction, effectively addressing the privacy leakage of the blockchain transaction layer. Currently known blockchain range proof schemes are unable to process floating-point numbers, which limits the application fields of range proofs; moreover, existing solutions can be further optimized in terms of proof speed, verification speed, and computational cost. We propose Ferproof, a constant-cost range proof with floating-point number adaptation. Ferproof improves the zero-knowledge protocol based on Bulletproofs to optimize the proof structure; a Lagrangian inner product vector generation method makes the witness generation time constant; and the commitment is constructed according to the floating-point range relationship to implement floating-point range proofs. Ferproof relies only on the discrete logarithm assumption, and no third-party credibility is required. The communication cost and time complexity of Ferproof are constant. According to the experimental results, compared with the most advanced known range proof scheme, Ferproof’s proof speed is increased by 40.0% and its verification speed by 29.8%.

Yicong Li, Kuanjiu Zhou, Lin Xu, Meiying Wang, Nan Liu
NEPG: Partitioning Large-Scale Power-Law Graphs

We propose Neighbor Expansion on Power-law Graphs (NEPG), a distributed graph partitioning method for power-law graphs that offers both good scalability and high partitioning quality. NEPG is based on a heuristic method, Neighbor Expansion, which constructs the partitions by greedily expanding from randomly selected vertices. NEPG improves partitioning quality by selecting the vertices according to the properties of the power-law graph. We provide a theoretical proof that NEPG can reach a higher upper bound on partitioning quality. The empirical evaluation demonstrates that, compared with state-of-the-art distributed graph partitioning algorithms, NEPG significantly improves partitioning quality while reducing graph construction time. The performance evaluation demonstrates that the time efficiency of the proposed method outperforms existing algorithms.

Jiaqi Si, Xinbiao Gan, Hao Bai, Dezun Dong, Zhengbin Pang
Accelerating DCNNs via Cooperative Weight/Activation Compression

Deep convolutional neural networks (DCNNs) have achieved great success in various applications. Nevertheless, training and deploying such DCNN models requires a huge amount of computation and storage resources. Weight pruning has emerged as an effective compression technique for DCNNs to reduce the consumption of hardware resources. However, existing weight pruning schemes neglect to simultaneously consider accuracy, compression ratio, and hardware efficiency in the design space, which inevitably leads to performance degradation. To overcome these limitations, in this work, we propose a cooperative weight/activation compression approach. Initially, we observe spatially insensitive consistency, namely that insensitive weight values with no impact on accuracy are distributed at the same spatial position in a large number of channels. Based on this key observation, we propose a cluster-based weight pattern pruning technique, which gathers channels with spatially insensitive consistency into a cluster and prunes them into a weight pattern with a uniform shape. In addition, there are many sparse rows in the activation matrix due to the activation function, the Rectified Linear Unit (ReLU). We study these sparse rows and find that rows with a higher degree of sparsity are insensitive to accuracy. Hence, we further propose a sparsity-row-based activation removal technique, which directly eliminates insensitive rows of the activation. The two proposed techniques allow the hardware to execute at excellent parallel granularity and achieve a better compression ratio with negligible accuracy loss. Experiment results on several popular DCNN models show that our scheme reduces the computation by 63% on average.

Yuhao Zhang, Xikun Jiang, Xinyu Wang, Yudong Pan, Pusen Dong, Bin Sun, Zhaoyan Shen, Zhiping Jia
PFA: Performance and Fairness-Aware LLC Partitioning Method

In cluster servers, the number of concurrently running programs keeps increasing. Consolidating multiple programs benefits server utilization, but it also degrades the programs’ performance, and severe performance degradation can result in significant losses. Therefore, it is essential to partition shared resources to support consolidation. Our experiments showed that even when shared resources such as CPU cores and memory are partitioned, program performance still drops significantly compared to running alone; we found the primary reason to be contention for the last-level cache (LLC). In this paper, we propose an LLC partitioning method to improve the performance of consolidated programs. We classify a program’s LLC usage type by analyzing its LLC behavior, then allocate a reasonable number of LLC ways according to that type. Meanwhile, we monitor the program’s performance in real time and allocate LLC ways dynamically. Experiments show that, compared with the default LLC allocation method, our method reduces the performance loss by an average of 6.73% and improves fairness by 0.03. Compared with the CPA method, our method reduces the performance loss by an average of 4.86%.

Donghua Li, Lin Wang, Tianyuan Huang, Xiaomin Zhu, Shichao Geng
SGP: A Parallel Computing Framework for Supporting Distributed Structural Graph Clustering

Structural graph clustering is an important problem in the domain of graph data management. Given a large graph G, structural graph clustering assigns vertices to clusters such that vertices in the same cluster are densely connected to each other and vertices in different clusters are loosely connected. Due to its importance, many algorithms have been proposed for this problem. However, none focuses on the distributed graph environment. In this paper, we propose a parallel computing framework named SGP (short for Statistics-based Graph Partition) to support large-graph clustering in a distributed environment. We first use historical clustering information to partition the graph into a group of clusters. Based on the partition result, we can properly assign vertices to different nodes according to the connection relationships among vertices. When a clustering request is submitted, we use the properties induced by the partition for efficient clustering. Finally, we conduct extensive performance studies on large real and synthetic graphs, which demonstrate that our new approach efficiently supports large-graph clustering in a distributed environment.

Xiufeng Xia, Peng Fang, Yunzhe An, Rui Zhu, Chuanyu Zong
LIDUSA – A Learned Index Structure for Dynamical Uneven Spatial Data

Because existing learned indexes are difficult to adjust dynamically as data change, a learned index structure for dynamical uneven spatial data (LIDUSA) is presented in this paper. To handle the problem of poor KNN query performance on sparse regions, LIDUSA dynamically adjusts the data layout by merging and splitting the corresponding grid cells, and relearns the mapping function of the region so that data points in adjacent sparse grid cells are also stored in neighboring disk pages. It combines the advantage of tree-shaped indexes, which can be adjusted dynamically, with that of learned indexes. In this paper, extensive experiments are conducted on a real-world dataset and synthetic datasets. The experiment results show that LIDUSA is twice as fast as existing indexes in the KNN query scenario, which greatly extends the applicable scope of learned indexes.

Zejian Zhang, Yan Wang, Shunzhi Zhu
Why is Your Trojan NOT Responding? A Quantitative Analysis of Failures in Backdoor Attacks of Neural Networks

Backdoors offer a new attack vector to degrade or even subvert deep learning systems and thus have been extensively studied in the past few years. In reality, however, backdoor attacks are not as robust as expected and often fail due to many factors, such as data transformations on backdoor triggers and defensive measures of the target model; backdoor algorithms vary in their resilience to these factors. To evaluate the robustness of backdoor attacks, we conduct a quantitative analysis of backdoor failures and further provide an interpretable way to unveil why these transformations can counteract backdoors. First, we build a uniform evaluation framework in which five backdoor algorithms and three types of transformations are implemented. We randomly select a number of samples from each test dataset and poison them with triggers. These distorted variants of samples are passed to the trojan models after various data transformations. We measure the differences in predicted results between input samples as the influence of the transformations on backdoor attacks. Moreover, we present a simple approach to interpret the resulting degradation. The results and conclusions of this study shed light on the difficulties of backdoor attacks in the real world and can facilitate future research on robust backdoor attacks.

Xingbo Hu, Yibing Lan, Ruimin Gao, Guozhu Meng, Kai Chen
OptCL: A Middleware to Optimise Performance for High Performance Domain-Specific Languages on Heterogeneous Platforms

Programming on heterogeneous hardware architectures using OpenCL requires thorough knowledge of the hardware. Many High-Performance Domain-Specific Languages (HPDSLs) are aimed at simplifying the programming efforts by abstracting away hardware details, allowing users to program in a sequential style. However, most HPDSLs still require the users to manually map compute workloads to the best suitable hardware to achieve optimal performance. This again calls for knowledge of the underlying hardware and trial-and-error attempts. Further, very often they only consider an offloading mode where compute-intensive tasks are offloaded to accelerators. During this offloading period, CPUs remain idle, leaving parts of the available computational power untapped. In this work, we propose a tool named OptCL for existing HPDSLs to enable a heterogeneous co-execution mode when capable where CPUs and accelerators can process data simultaneously. Through a static analysis of data dependencies among compute-intensive code regions and performance predictions, the tool selects the best execution schemes out of purely CPU/accelerator execution or co-execution. We show that by enabling co-execution on dedicated and integrated CPU-GPU systems, up to 13× and 21× speed-ups can be achieved.

Jiajian Xiao, Philipp Andelfinger, Wentong Cai, David Eckhoff, Alois Knoll
Backmatter
Metadata
Title
Algorithms and Architectures for Parallel Processing
Editors
Yongxuan Lai
Prof. Tian Wang
Min Jiang
Guangquan Xu
Wei Liang
Aniello Castiglione
Copyright Year
2022
Electronic ISBN
978-3-030-95391-1
Print ISBN
978-3-030-95390-4
DOI
https://doi.org/10.1007/978-3-030-95391-1
