main-content

## Über dieses Buch

This book constitutes the thoroughly refereed workshop proceedings of the 9th International Workshop on Structured Object-Oriented Formal Language and Method, SOFL+MSVL 2019, held in Shenzhen, China, in November 2019. The 23 revised full papers included in the volume were carefully reviewed and selected from 43 submissions. They are organized in the following topical sections: testing and debugging, formal verification, problem solving, software analysis and evolution, and software analysis and testing.

## Inhaltsverzeichnis

### Analysis and Remodeling of the DirtyCOW Vulnerability by Debugging and Abstraction

Abstract
It is hard to understand clearly the principle of DirtyCOW vulnerability of Linux operating system, even for many experienced kernel developers. An approach is presented to rebuild the design model of the related Linux system calls, which gives an insight into the vulnerability. The remodeling, i.e. model-rebuilding, is done by first constructing a control flow diagram based on the debugging and analysis of the OS kernel, and then turning the control flow diagram to an abstract program based on abstraction to the observed concrete states. The approach provides an effective way for the comprehension of complex legacy software.
Yanjun Wen, Ji Wang

### A Formal Technique for Concurrent Generation of Software’s Functional and Security Requirements in SOFL Specifications

Abstract
Formal methods offer a precise and concise way of expressing system requirements which can be tested and validated with formal proofs techniques. However, their application in the development of security aware industrial systems have been associated with high costs due to the difficulty in integrating the functional and security requirements which are generated in an ad hoc manner. Reducing this cost has been a subject of interest in software requirements engineering research. In this paper we propose a formal technique for concurrent generation of functional and security requirements that can help provide a systematic way of accounting for security requirements while specifying the functional requirements of a software. With this technique, the functional behaviors of the system are precisely defined using the Structured Object-Oriented Formal Language (SOFL). The security rules are systematically explored with the results incorporated into the functional specification as constraints. The resultant specification then defines the system functionality that implies the conformance to the security rules. Such a specification can be used as a firm foundation for implementation and testing of the implementation. We discuss the principle of integrating security rules with functional specifications and present a case study to demonstrate the feasibility of our technique.
Busalire Emeka, Shaoying Liu

### Distortion and Faults in Machine Learning Software

Abstract
Machine learning software, deep neural networks (DNN) software in particular, discerns valuable information from a large dataset, a set of data, so as to synthesize approximate input-output relations. The outcomes of such DNN programs are dependent on the quality of both learning programs and datasets. However, the quality assurance of DNN software is difficult. The trained machine learning models, defining the functional behavior of the approximate relations, are unknown prior to its development, and the validation is conducted indirectly in terms of the prediction performance. This paper introduces a hypothesis that faults in DNN programs manifest themselves as distortions in trained machine learning models. Relative distortion degrees measured with appropriate observer functions may indicate that the programs have some hidden faults. The proposal is demonstrated with the cases of the MNIST dataset.
Shin Nakajima

### A Divide & Conquer Approach to Testing Concurrent Java Programs with JPF and Maude

Abstract
The paper proposes a new testing technique for concurrent programs. The technique is basically a specification-based testing one. For a formal specification S and a concurrent program P, state sequences are generated from P and checked to be accepted by S. We suppose that S is specified in Maude and P is implemented in Java. Java Pathfinder (JPF) and Maude are then used to generate state sequences from P and to check if such state sequences are accepted by S, respectively. Even without checking any property violations with JPF, JPF often encounters the notorious state space explosion while only generating state sequences. Thus, we propose a technique to generate state sequences from P and check if such state sequences are accepted by S in a stratified way. Some experiments demonstrate that the proposed technique mitigates the state space explosion instances from which otherwise only one JPF instance cannot suffice.
Canh Minh Do, Kazuhiro Ogata

### An Approach to Modeling and Verifying Multi-level Interrupt Systems with TMSVL

Abstract
In embedded systems and operating systems, the interrupt mechanism is an important way to ensure real-time response to kinds of asynchronous events. While the interrupt mechanism changes the execution traces of the main program, it makes modeling and verification of systems with interrupt difficulty. Therefore, we propose an approach to modeling and verifying multi-level interrupt systems. Firstly, the model of multi-level interrupt systems based on Time Projection Temporal Logic (TPTL) is proposed. On this basis, the model can be used to extend the TMSVL language and the TMSVL interpreter so that multi-level interrupt systems can be modeled, simulated and verified automatically. Finally, a case study is given to show the correctness and practicability of the proposed approach.
Jin Cui, Xu Lu, Buwen Liang

### Towards Formal Verification of Neural Networks: A Temporal Logic Based Framework

Abstract
Due to extensive applications of deep learning and neural networks, their security has attracted more and more attentions from academic and industrial circles. Under the guidance of the theory of formal verification, this paper summarizes three basic problems which indicate the common features of different neural networks, and proposes three typical properties covering the correctness of a model, the correctness of a sample and the robustness of a model for neural network systems. The method is driven by these properties, the model is constructed using the MSVL language and the properties are characterized by the logic PPTL. On this basis, the modeling and verification process is done in the MC compiler.
Xiaobing Wang, Kun Yang, Yanmei Wang, Liang Zhao, Xinfeng Shu

### UMC4M: A Verification Tool via Program Execution

Abstract
Most of the software model checkers available for code level verification can only verify safety properties since desired properties are specified by assertions. However, other temporal properties such as liveness cannot be verified with these tools. To tackle this problem, we develop a verification tool called UMC4M to verify full regular temporal properties of programs. UMC4M takes a modeling, simulation and verification language (MSVL) program M and a desired property specified by a propositional projection temporal logic (PPTL) formula P as its input. $$\lnot P$$ is then translated to an MSVL program M. Thus, the property can be verified by checking whether there is an acceptable execution of “$$M~and~M'$$”, which can be solved with MSVL compiler MC. Further, UMC4M is used to verify the dining cryptographers protocol.
Meng Wang, Junfeng Tian, Hong Zhang

### Parallel Runtime Verification Approach for Alternate Execution of Multiple Threads

Abstract
Since resources are shared by threads created in a multi-threaded program, these threads are not completely independent of each other. The execution of these threads usually needs to satisfy a certain order restriction. In this paper, we employ a multi-core machine based parallel runtime verification approach to efficiently monitor the alternate execution of multiple threads. First, the problem is described in Modeling, Simulation and Verification Language (MSVL). Second, the desired periodically repeated property is specified by a Propositional Projection Temporal Logic (PPTL) formula. Third, the state sequence generated by the execution of the MSVL program is divided into several segments which are verified in parallel. Finally, verification results for different segments are merged. Experimental results show that the alternate execution of multiple threads implemented through invoking Windows Application Programming Interface (API) functions SuspendThread and ResumeThread will lead to these threads out of sequence.
Bin Yu, Jinhui Liu, Ming Lei, Yong Yu, Hao Chen

### A Planning Approach Based on APTL

Abstract
This paper proposes a path planning approach for multi-agent systems (MASs) with Alternating Projection Temporal Logic (APTL). Our approach aims to calculate a path efficiently which satisfies specific properties in a MAS. The MAS is formalized in an interpreted system, then symbolically represented, and properties of paths are expressed by APTL formulas. We implement a robotic soccer game in order to illustrate the availability of this approach, actions and strategies of each robot are described in detail in interpreted system. The toolkit MCMAS_APTL2 has been developed, where we just enter an APTL formula of specific properties and the model of the robotic soccer game into the toolkit, then a path satisfies the formula is calculated and provided.
Haiyang Wang, Yao Liu

### Solving Constraint Optimization Problems Based on Mathematica and Abstraction

Abstract
Solving constraint optimization problems is widely applicable in computer science. The computer algebra system Mathematica provides MaxValue, MinValue and other functions to solve constraint optimization problems. However many cases of constraint optimization problems cannot be solved by these functions directly. In this paper, based on these Mathematica functions and abstraction, a practical approach is presented for computing the upper bounds and the lower bounds of the optimization values of constraint optimization problems. The optimization values of many constraint optimization problems, which cannot be solved by Mathematica functions directly, can be computed automatically by using the approach presented in this paper. The experimental results demonstrate the practicality of this approach.
Guoteng Pan, Mengjun Li, Guodong Ou

### A Forward Chaining Heuristic Search with Spatio-Temporal Control Knowledge

Abstract
Planning systems use a variety of heuristic search or control knowledge in order to enhance the performance. Control knowledge is often described in a specific formalism, e.g. temporal logic, automata, or HTN (Hierarchical Task Network) etc. Heuristic search exploits heuristic functions to evaluate potential feasible moves. Control knowledge constraints the search space by pruning the states which violate the knowledge. In this paper, we propose a general heuristic algorithm that combines control knowledge specified by a spatio-temporal logic named PPTL$$^{\text{ SL }}$$. Both heuristic search and control knowledge are handled in a forward chaining manner. Our approach involves the evaluation of PPTL$$^{\text{ SL }}$$ formulas using a variant of the traditional progression technique during heuristic search. Consequently, we are enabled to take advantage of the two methods together to further reduce the search space.
Xu Lu, Jin Cui, Yansong Dong, Wensheng Wang, Runzhe Ma, Yifeng Li, Qing Feng

### Formal Development and Verification of Reusable Component in PAR Platform

Abstract
Formal method is an important approach to develop high trust software systems. Coq is an interactive proof assistant for the development of mathematical theories and formally certified software. Set, Bag, List, Tree, Graph are important reusable components in PAR platform. This paper tries to formally develop ‘Set’ components which have linear structure and verify the correctness of this component mechanically with Coq. The formal development of this component involves formalization of specification, the recurrence relation of problem-solving sequence and loop invariant. Specification language Radl of PAR platform was used to describe the specification, recurrence relation and loop invariants; Software modelling language Apla was used to describe the abstract model of those components. The Dijkstra’s Weakest Precondition method is used to verify abstract model by the interactive proof tool Coq. Finally, the abstract model denoted by Apla was transformed to concrete model written by executable language; such as C++, Java, VB and C#, etc., based on the program generating systems in PAR platform.
Qimin Hu, Jinyun Xue, Zhen You, Zhuo Cheng, Zhengkang Zuo

### A New Mutant Generation Algorithm Based on Basic Path Coverage for Mutant Reduction

Abstract
Mutation testing is a fault-based testing technique that can be used to measure the adequacy of a test set, but its application usually incurs a high cost due to the necessity of generating and executing a great number of mutants. How to reduce the cost still remains a challenge for research. In this paper, we present a new mutant generation algorithm based on a basic path coverage that can help reduce mutants. The algorithm is characterized by implementing a basic path segments identification criterion for determining appropriate program points at which faults are inserted and a mutant generation priority criterion for selecting proper mutant operators to make a fault for insertion. We discuss the algorithm by analysing how the two criteria are realized based on analysing the control flow graph of the program and applying effective mutation operators on the appropriate statements in the relevant path segments. We also present an automated mutation testing tool that supports the proposed approach, and a small experiment to evaluate our tool by comparing it with a traditional mutation testing method on six programs. The result of the experiment suggests that the proposed method can significantly reduce the number of mutants and improve the efficiency of mutation testing.
Xu Qin, Shaoying Liu, Zhang Tao

### Formal Specification and Model Checking of a Ride-sharing System in Maude

Abstract
We report on a case study in which we have formally specified a ride-sharing system in Maude, a rewriting logic-based specification/programming language and model checked that the system enjoys desired liveness as well as safety properties with the Maude LTL model checker. In our approach to formal specification of the system, a map, a collection of cars and a collection of persons are treated as parameters. Thus, it suffices to write one formal systems specification from which the specification instance is generated for each fixed map, each fixed collection of cars and each fixed collection of persons. We often need fairness assumptions to model check liveness properties, which makes model checking performance slower or even infeasible. The case study also demonstrates that such a situation can be alleviated by a divide & conquer approach to liveness model checking under fairness.
Eiichi Muramoto, Kazuhiro Ogata, Yoichi Shinoda

### Model Checking Python Programs with MSVL

Abstract
To verify the correctness of Python programs, a novel approach for model checking Python programs with MSVL (Modeling, Simulation and Verification Language) is advocated. To this end, the rules for decoding the object-oriented semantics of Python with the process-oriented semantics of MSVL are defined, and the technique for automatically rewriting a Python program into its equivalent MSVL program is formalized, which in turn can be verified with the model checking tool MSV. In addition, an example is given to illustrate how the approach works. The approach fully utilizes the powerful expressiveness of MSVL to verify Python programs in a direct way, and helps to improve the quality of the software system.
Xinfeng Shu, Fengyun Gao, Weiran Gao, Lili Zhang, Xiaobing Wang, Liang Zhao

### Prediction of Function Removal Propagation in Linux Evolution

Abstract
Software studies on the function level, which is inherently different than coarse-grained investigations, contributes to the deep understanding of the laws of software internal evolution. This paper focuses on the life cycle distribution and the propagation behaviour of removed functions in Linux kernels. After an in-depth analysis of 300 Linux kernels, from Version 2.6.12-rc2 to 3.7-rc6, we found that most removed functions have relatively low life cycles, which indicates that many functions are likely to be removed between two consecutive versions. Our experimental results also show that function removal propagation is closely related to the file containing the functions, function call dependency, and historical information. This motivated us to propose a few heuristics to predict function removal propagation, which are based on the file position, call graph, and historical information in git. Furthermore, we analyzed the impact of removed functions on software structure and found that Linux kernel has a strong resistance to function removal. The life cycle feature and the prediction heuristics presented in this paper can be utilized to facilitate the maintenance of large-scale complex software systems.
Lei Wang, Guoxiong Chen, Liang Li

### Regression Models for Performance Ranking of Configurable Systems: A Comparative Study

Abstract
Finding the best configurations for a highly configurable system is challenging. Existing studies learned regression models to predict the performance of potential configurations. Such learning suffers from the low accuracy and the high effort of examining the actual performance for data labeling. A recent approach uses an iterative strategy to sample a small number of configurations from the training pool to reduce the number of sampled ones. In this paper, we conducted a comparative study on the rank-based approach of configurable systems with four regression methods. These methods are compared on 21 evaluation scenarios of 16 real-world configurable systems. We designed three research questions to check the impacts of different methods on the rank-based approach. We find out that the decision tree method of Classification And Regression Tree (CART) and the ensemble learning method of Gradient Boosted Regression Trees (GBRT) can achieve better ranks among four regression methods under evaluation; the sampling strategy in the rank-based approach is useful to save the cost of sampling configurations; the measurement, i.e., rank difference correlates with the relative error in several evaluation scenarios.
Yuntianyi Chen, Yongfeng Gu, Lulu He, Jifeng Xuan

### Combining Model Learning and Model Checking to Analyze Java Libraries

Abstract
In the current technological era, the correct functionality and quality of software systems are of prime concern and require practical approaches to improve their reliability. Besides classical testing, formal verification techniques can be employed to enhance the reliability of software systems. In this connection, we propose an approach that combines the strengths of two effective techniques, i.e., Model learning and Model checking for the formal analysis of Java libraries. To evaluate the performance of the proposed approach, we consider the implementation of “Java.util” package as a case study. First, we apply model learning to infer behavior models of Java package and then use model checking to verify that the obtained models satisfy basic properties (safety, functional, and liveness) specified in LTL/CTL languages. Our results of the formal analysis reveal that the learned models of Java package satisfy all the selected properties specified in temporal logic. Moreover, the proposed approach is generic and very promising to analyze any software library/package considered as a black-box.
Shahbaz Ali, Hailong Sun, Yongwang Zhao

### Data Provenance Based System for Classification and Linear Regression in Distributed Machine Learning

Abstract
Nowadays, data provenance is widely used to increase the accuracy of machine learning models. However, facing the difficulties in information heredity, these models produce data association. Most of the studies in the field of data provenance are focused on specific domains. And there are only a few studies on a machine learning (ML) framework with distinct emphasis on the accurate partition of coherent and physical activities with implementation of ML pipelines for provenance. This paper presents a novel approach to usage of data provenance which is also called data provenance based system for classification and linear regression in distributed machine learning (DPMLR). To develop the comprehensive approach for data analysis and visualization based on a collective set of functions for various algorithms and provide the ability to run large scale graph analysis, we apply StellarGraph as our primary ML structure. The preliminary results on the complex data stream structure showed that the overall overhead is no more than 20%. It opens up opportunities for designing an integrated system which performs dynamic scheduling and network bounded synchronization based on the ML algorithm.
Muhammad Jahanzeb Khan, Ruoyu Wang, Daniel Sun, Guoqiang Li

### Metamorphic Testing in Fault Localization of Model Transformations

Abstract
Model transformations are cornerstone elements of Model Driven Engineering (MDE), and their quality directly affects the successful application of MDE in practice. However, due to the characteristics of model transformation programs, the debugging of model transformations faces the oracle problem. In this paper, we propose an approach of debugging model transformations by using the technique of metamorphic testing (MT). Our approach leverages MT to alleviate the oracle problem, and integrates MT with spectrum-based fault localization technique to locating faulty rules of model transformations. We conduct experiments to evaluate our approach by using open-source model transformation programs, and compare the effectiveness of our approach with that of a fault localization technique using constraint-based oracles. Both of the experimental analysis and the comparison study show that our approach is of promising effectiveness, suggesting that MT can be a good support for debugging model transformations.
Keke Du, Mingyue Jiang, Zuohua Ding, Hongyun Huang, Ting Shu

### A Fault Localization Method Based on Dynamic Failed Execution Blocks

Abstract
Software fault localization is the most tedious and time-consuming activity in program debugging. Spectrum-based software fault localization (SFL) is a typical light-weight automated diagnosis technique for boosting fault localization. In the previous work, the researchers has found that although there is no optimal formula, the accuracy of a specific SFL-based method may be affected by the used program spectra at different granularity levels, e.g., statements, methods, basic blocks. In this paper, therefore, we further explore the correlation between the spectra granularity and the fault localization accuracy, and then the dynamic failed execution block (DFEB), a novel granularity for program spectra, is introduced to improve the diagnostic accuracy of a specific SFL technique. Consequently, a new method for fault localization by using DFEBs is proposed. Finally, to validate the effectiveness of our proposed method, an experimental study on 8 classic SFL techniques and 11 well-known benchmark programs including Siemens suite, space, gzip, grep, and sed, is conducted in detail. Experimental results show that the new approach is more effective in most cases, in comparison with these subject SFL methods.
Baoyi Pan, Ting Shu, Jinsong Xia, Zuohua Ding, Mingyue Jiang

### Adaptive Random Testing by Bisection and Comprehensive Distance

Abstract
Adaptive random testing (ART) has been proved to be effective in improving the failure detection ability of random testing. As a lightweight ART algorithm, the ART by bisection (ART-B) can realize test case generation in a linear order of time complexity, but its ability for finding failures is not so strong. In this study, the dynamic bisection is used to generate candidates randomly from the empty regions as much as possible. For each candidate, two types of distances are taken into account. Then, a comprehensive distance metric is defined to determine the next test case from the candidate set. For the nearest neighbor query in the proposed ART by Bisection and Comprehensive Distance (ART-BCD), the distance-ware forgetting is adopted to ensure its computational cost is still in the linear order. To validate the effectiveness of ART-BCD algorithm, both simulation experiments and empirical studies are performed for comparative analysis. The experimental results show that ART-BCD is better than or comparable to the ART-B and other typical algorithms, such as the fixed-size-candidate-set ART (FSCS-ART), in most cases, especially for the block failure pattern.
Chengying Mao, Mengting Quan, Zhilei Chen, Tsong Yueh Chen

### CMM: A Combination-Based Mutation Method for SQL Injection

Abstract
With the rapid development of Web applications, SQL injection (SQLi) has been a serious security threat for years. Many systems use superimposed rules to prevent SQLi like backlists filtering rules and filter functions. However, these methods can not completely eliminate SQLi vulnerabilities. Many researchers and security experts hope to find a way to find SQLi vulnerabilities efficiently. Among them, mutation-based fuzzing plays an important role in Web security testing, especially for SQLi. Although this approach expands the space for test cases and improves vulnerability coverage to some extent, there are still some problems such as mutation operators cannot be fully covered, test cases space explosions, etc. In this paper, we present a new technique Combinatorial Mutation Method (CMM) to generate test set for SQLi. The test set applies t-way and variable strength Combinatorial Testing. It makes the mutation progress more aggressive and automated and generates test cases with better F-measure Metric and Efficiency Metric. We apply our approach to three open source benchmarks and compare it with sqlmap, FuzzDB and ART4SQLi. The experiment results demonstrate that the approach is effective in finding SQLi vulnerabilities with multiple filtering rules.
Jing Zhao, Tianran Dong, Yang Cheng, Yanbin Wang

### Backmatter

Weitere Informationen