
About this book

This book constitutes revised and selected papers from the scientific satellite events held in conjunction with the 18th International Conference on Service-Oriented Computing, ICSOC 2020. The conference was held virtually during December 14-17, 2020.

A total of 125 submissions were received for the satellite events. The volume includes

9 papers from the PhD Symposium Track,
4 papers from the Demonstration Track, and
45 papers from the following workshops:

International Workshop on Artificial Intelligence for IT Operations (AIOps),
International Workshop on Cyber Forensics and Threat Investigations Challenges in Emerging Infrastructures (CFTIC 2020),
2nd Workshop on Smart Data Integration and Processing (STRAPS 2020),
International Workshop on AI-enabled Process Automation (AI-PA 2020), and
International Workshop on Artificial Intelligence in the IoT Security Services (AI-IOTS 2020).

Table of contents

Frontmatter

PhD Symposium

Frontmatter

Staking Assets Management on Blockchains: Vision and Roadmap

This paper introduces and explores the vision whereby stakeholders and the process of staking, that is, the idea of guaranteeing the quality of a process by risking valuable assets on its correct execution, may run both on and off a blockchain in the context of cloud-enabled services and processes. The emerging trend behind blockchain-oriented computing and the reliance on stakeholders therein make distilling and evaluating this vision a priority for delivering the high-quality, sustainable services of the future. We identify key defining concepts of stakeholders and the staking process, using three very different staking scenarios as a base. Subsequently, we analyze the key challenges that these stakeholders face and propose the development of a framework that can help overcome these challenges. Finally, we give a roadmap to steer systematic research stemming from the proposed vision, leveraging design science along with short-cyclic experimentation.

Stefan Driessen

Hybrid Context-Aware Method for Quality Assessment of Data Streams

Data quality is one of the most important issues in big data analytics: if it is not addressed appropriately, the knowledge extracted through analytics has low reliability. The challenges of data quality management are even greater with streaming data. Most of the methods introduced in the literature for processing streaming data do not use contextual information to address data quality issues; however, the performance of these methods can be improved by considering contextual information, especially information obtained from external resources. Based on this point of view, our main objective in this thesis is to propose a hybrid multivariate context-aware approach for data quality assessment in streaming environments, such as smart city applications.

Mostafa Mirzaie

Container-Based Network Architecture for Mobility, Energy and Security Management as a Service in IoT Environments

The Internet of Things (IoT) is being used in every field of life. The growing number of IoT devices forms heterogeneous networks, which introduces serious challenges: the presence of mobile nodes makes the network unstable, the protocols used in these networks do not offer the required security level, and network growth results in higher energy consumption. At the same time, container-based solutions are receiving great attention because they are more lightweight than Virtual Machines (VMs); they are reusable, flexible, and offer dynamic allocation of resources. In this article, we propose a container-based architecture that offers Mobility, Energy and Security Management as a Service (MESMaaS). Our main objective is to implement MESMaaS at the core of the network to address network issues and to achieve improved network performance in terms of network lifetime, network stability, re-transmission of data, signaling cost, packet loss, data and network security, and other communication issues.

Zahid Iqbal

Towards a Rule-Based Recommendation Approach for Business Process Modeling

Business process modeling can be time-consuming and error-prone, especially for inexperienced users. For this reason, graphical editors for business process modeling should support users by providing suggestions on how to complete the business process model currently under development. We address this problem with a rule-based activity recommendation approach, which suggests suitable activities to extend the business process model that is currently being edited at a user-defined position. Contrary to alternative approaches, rules provide an additional explanation for the recommendation, which can be useful in cases where a user might be torn between two alternatives. We plan to investigate how rule learning can be efficiently designed for the given problem setting and how a rule-based approach performs compared to alternative methods. In this paper, we describe the basic idea, a first implementation, and first results.

Diana Sola

Towards a Privacy Conserved and Linked Open Data Based Device Recommendation in IoT

Interconnecting Internet of Things (IoT) devices creates a network of services capable of working together to accomplish certain goals in different domains. The heterogeneous nature of IoT environments makes it critical to find devices that extend existing architectures and help in reaching the desired goal, especially when data privacy has to be taken into consideration. In this paper, we present a Linked Open Data (LOD) based approach to semantically annotate and recommend IoT devices while adding a layer of data security and privacy through implementing the SOLID (SOcial LInked Data) framework.

Fouad Komeiha, Nasredine Cheniki, Yacine Sam, Ali Jaber, Nizar Messai, Thomas Devogele

Learning Performance Models Automatically

To ensure the quality of frequent releases in a DevOps context, performance models enable system performance simulation and prediction. However, building performance models for microservice- or serverless-based applications in DevOps is costly and error-prone. Thus, we propose to employ model discovery to learn performance models automatically. To generate basic models that represent the application, we first introduce performance-related TOSCA models as architectural models. Then we transform the TOSCA models into layered queueing network models. A main challenge of performance model generation is model parametrization. We propose to learn parametric dependencies from monitoring data and system analysis to capture the relationship between input data and resource demand. With frequent releases of new features, we consider detecting parametric dependencies incrementally to keep the performance models updated in each iteration.

Runan Wang
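
To make the parametric-dependency idea concrete, here is a minimal illustrative sketch (not the paper's TOSCA/LQN pipeline): fit resource demand as a function of an input parameter from hypothetical monitoring samples, so that the learned coefficients can parametrize a performance model.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical monitoring samples: request payload size (KB) vs.
# observed CPU demand (ms) for one service.
payload_kb = np.array([[10], [20], [40], [80], [160]])
cpu_ms = np.array([5.1, 9.8, 20.3, 39.9, 80.2])

# Learn the parametric dependency demand(x) ~= a * x + b.
model = LinearRegression().fit(payload_kb, cpu_ms)
print(f"demand(x) ~= {model.coef_[0]:.3f} * x + {model.intercept_:.3f}")

# The learned dependency can then supply the resource-demand parameter
# of a performance model, e.g., for a 100 KB request:
print("predicted demand (ms):", model.predict([[100]])[0])
```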

Blockchain-Based Business Processes: A Solidity-to-CPN Formal Verification Approach

With its span of applications widening by the day, Blockchain technology has been gaining more interest in different domains. It has intrigued many investors, but also numerous malicious users who have put different Blockchain platforms under attack. It is therefore an inescapable necessity to guarantee the correctness of smart contracts, as they are the core of Blockchain applications. Existing verification approaches, however, focus on targeting particular vulnerabilities and seldom support the verification of domain-specific properties. In this paper, we propose a translation of Solidity smart contracts into CPNs (Coloured Petri nets) and investigate the capability of CPN Tools to verify CTL (Computation Tree Logic) properties.

Ikram Garfatta, Kaïs Klai, Mahamed Graïet, Walid Gaaloul

Formal Quality of Service Analysis in the Service Selection Problem

The Service Selection problem has drawn a lot of attention from the Service-Oriented community in the past few decades. Rapidly evolving cloud computing technologies foster the vision of a Service-Oriented Computing paradigm in which multiple providers offer specific functionalities as services that compete against each other to be automatically selected by service consumers. We present a research program that focuses on Quality of Service aware Service Selection. We discuss our vision and research methodology in the context of the state of the art and review the main contributions of our approach.

Agustín Eloy Martinez-Suñé

Software Demonstrations

Frontmatter

A Crowdsourcing-Based Knowledge Graph Construction Platform

Nowadays, knowledge graphs are the backbones of many information systems that require access to structured knowledge. While there are many openly available knowledge graphs, self-constructed knowledge graphs for specific domains are still needed, and the construction process usually consumes a lot of manpower. In this paper, we present a novel platform that takes advantage of crowdsourcing to construct and manage knowledge graphs. The platform aims to provide automatic knowledge graph construction as a service and reduce the tenants' effort to construct knowledge graphs.

Xingkun Liu, Zhiying Tu, Zhongjie Wang, Xiaofei Xu, Yin Chen

Data Interaction for IoT-Aware Wearable Process Management

Process execution and monitoring based on Internet of Things (IoT) data can enable a more comprehensive view on processes. In our previous research, we developed an approach that implements an IoT-aware Business Process Management System (BPMS), comprising an integrated architecture for connecting IoT data to a BPMS. Furthermore, a wearable process user interface allows process participants to be notified in real time at any location when new tasks occur. In many situations, operators must be able to directly influence the data of IoT objects, e.g., to control industrial machinery or to manipulate certain device parameters from arbitrary places. However, BPM-controlled interaction with and manipulation of IoT data has been neglected so far. In this demo paper, we extend our approach towards a framework for IoT data interaction by means of wearable process management. BPM technology provides a transparent and controlled basis for data manipulation within the IoT.

Stefan Schönig, Richard Jasinski, Andreas Ermer

SiDD: The Situation-Aware Distributed Deployment System

Most of today's deployment automation technologies enable the deployment of distributed applications in distributed environments, whereby the deployment execution is centrally coordinated, either by a central orchestrator or by a master in a distributed master-workers architecture. However, it is becoming increasingly important to support use cases in which several independent partners are involved. As a result, decentralized deployment automation approaches are required, since organizations typically neither provide outside access to their internal infrastructure nor leave control over application deployments to others. Moreover, the choice of partners can depend heavily on the situation at deployment time, e.g., the costs or availability of resources. Thus, it is decided at deployment time which partner will provide a certain part of the application, depending on the situation. To tackle these challenges, we demonstrate the situation-aware distributed deployment (SiDD) system as an extension of the OpenTOSCA ecosystem.

Kálmán Képes, Frank Leymann, Benjamin Weder, Karoline Wild

AuraEN: Autonomous Resource Allocation for Cloud-Hosted Data Processing Pipelines

Ensuring cost-effective end-to-end QoS in an IoT data processing pipeline (DPP) is a non-trivial task. A key factor that affects the overall performance is the amount of computing resources allocated to each service in the pipeline. In this demo paper, we present AuraEN, an Autonomous resource allocation ENgine that can proactively scale the resources of each individual service in the pipeline in response to predicted workload variations so as to ensure end-to-end QoS while optimizing the associated costs. We briefly describe the AuraEN system architecture and its implementation and demonstrate how it can be used to manage the resources of a DPP hosted on the Amazon EC2 cloud.

Sunil Singh Samant, Mohan Baruwal Chhetri, Quoc Bao Vo, Ryszard Kowalczyk, Surya Nepal

Artificial Intelligence for IT Operations (AIOPS 2020)

Frontmatter

Performance Diagnosis in Cloud Microservices Using Deep Learning

Microservice architectures are increasingly adopted to design large-scale applications. However, the highly distributed nature and complex dependencies of microservices complicate automatic performance diagnosis and make it challenging to guarantee service level agreements (SLAs). In particular, identifying the culprits of a microservice performance issue is extremely difficult, as the set of potential root causes is large and issues can manifest themselves in complex ways. This paper presents an application-agnostic system to locate the culprits of microservice performance degradation with fine granularity, including not only the anomalous service from which the performance issue originates but also the culprit metrics that correlate with the service abnormality. Our method first finds potential culprit services by constructing a service dependency graph and then applies an autoencoder to identify abnormal service metrics based on a ranked list of reconstruction errors. Our experimental evaluation, based on the injection of performance anomalies into a microservice benchmark deployed in the cloud, shows that our system achieves good diagnosis results, with 92% precision in locating the culprit service and 85.5% precision in locating the culprit metrics.

Li Wu, Jasmin Bogatinovski, Sasho Nedelkoski, Johan Tordsson, Odej Kao
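
A minimal sketch of the metric-ranking stage described above, assuming synthetic data and a small MLP as the autoencoder (the paper's exact architecture is not specified here): train on metrics from normal operation, then rank the metrics of an anomalous window by per-metric reconstruction error.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
metrics = ["cpu", "memory", "latency", "io_wait"]

# Synthetic samples from normal operation (rows: windows, cols: metrics).
X_train = rng.normal(0, 1, size=(500, len(metrics)))
scaler = StandardScaler().fit(X_train)

# Autoencoder: reconstruct the input through a narrow hidden layer.
ae = MLPRegressor(hidden_layer_sizes=(2,), max_iter=2000, random_state=0)
ae.fit(scaler.transform(X_train), scaler.transform(X_train))

# Anomalous window: latency deviates strongly from normal behavior.
x_anom = rng.normal(0, 1, size=(1, len(metrics)))
x_anom[0, 2] += 6.0

# Rank metrics by per-metric reconstruction error; the culprit metric
# should surface at the top of the list.
x_scaled = scaler.transform(x_anom)
err = (ae.predict(x_scaled) - x_scaled) ** 2
for name, e in sorted(zip(metrics, err[0]), key=lambda t: -t[1]):
    print(f"{name}: reconstruction error {e:.2f}")
```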

Anomaly Detection at Scale: The Case for Deep Distributional Time Series Models

This paper introduces a new methodology for detecting anomalies in time series data, with a primary application to monitoring the health of (micro-) services and cloud resources. The main novelty in our approach is that instead of modeling time series consisting of real values or vectors of real values, we model time series of probability distributions. This extension allows the technique to be applied to the common scenario where the data is generated by requests coming in to a service and then aggregated at a fixed temporal frequency. We show the superior accuracy of our method on synthetic and public real-world data.

Fadhel Ayed, Lorenzo Stella, Tim Januschowski, Jan Gasthaus

A Systematic Mapping Study in AIOps

IT systems of today are becoming larger and more complex, rendering their human supervision more difficult. Artificial Intelligence for IT Operations (AIOps) has been proposed to tackle modern IT administration challenges with AI and Big Data. However, past AIOps contributions are scattered, unorganized, and missing a common terminology convention, which renders their discovery and comparison impractical. In this work, we conduct an in-depth mapping study to collect and organize the numerous scattered contributions to AIOps in a unique reference index. We create an AIOps taxonomy to build a foundation for future contributions and to allow an efficient comparison of AIOps papers treating similar problems. We investigate temporal trends and classify AIOps contributions based on the choice of algorithms, data sources, and target components. Our results show a recent and growing interest in AIOps, specifically in contributions treating failure-related tasks (62%), such as anomaly detection and root cause analysis.

Paolo Notaro, Jorge Cardoso, Michael Gerndt

An Influence-Based Approach for Root Cause Alarm Discovery in Telecom Networks

Alarm root cause analysis is a significant component of day-to-day telecommunication network maintenance, and it is critical for efficient and accurate fault localization and failure recovery. In practice, accurate and self-adjustable alarm root cause analysis is a great challenge due to network complexity and the vast volume of alarms. A popular approach for failure root cause identification is to construct a graph with approximate edges, commonly based on either event co-occurrences or conditional independence tests. However, considerable expert knowledge is typically required for edge pruning. We propose a novel data-driven framework for root cause alarm localization that combines causal inference and network embedding techniques. In this framework, we design a hybrid causal graph learning method (HPCI), which combines the Hawkes Process with Conditional Independence tests, and propose a novel Causal Propagation-Based Embedding algorithm (CPBE) to infer edge weights. We subsequently discover root cause alarms in a real-time data stream by applying an influence maximization algorithm on the weighted graph. We evaluate our method on artificial data and real-world telecom data, showing a significant improvement over the best baselines.

Keli Zhang, Marcus Kalander, Min Zhou, Xi Zhang, Junjian Ye

Localization of Operational Faults in Cloud Applications by Mining Causal Dependencies in Logs Using Golden Signals

Cloud-based microservice architecture has become a powerful mechanism for helping organizations scale operations by accelerating the pace of change at minimal cost. With cloud-based applications being accessed from diverse geographies, there is a need for round-the-clock monitoring of faults to prevent or limit the impact of outages. Pinpointing the source(s) of faults in cloud applications is a challenging problem due to complex interdependencies between applications, middleware, and hardware infrastructure, all of which may be subject to frequent and dynamic updates. In this paper, we propose a lightweight fault localization technique that reduces human effort and the dependency on domain knowledge for localizing observable operational faults. We model multivariate error-rate time series using minimal runtime logs to infer causal relationships among the golden-signal errors (error rates) and microservice errors, and thereby discover a ranked list of possibly faulty components. Our experimental results show that our system can localize operational faults with high accuracy (F1 = 88.4%), underscoring the effectiveness of using golden-signal error rates in fault localization.

Pooja Aggarwal, Ajay Gupta, Prateeti Mohapatra, Seema Nagar, Atri Mandal, Qing Wang, Amit Paradkar
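
As an illustration of inferring causal dependencies between error-rate time series (not the paper's exact algorithm), one can apply a Granger-causality test to a hypothetical pair of series in which a microservice's error rate leads a golden-signal error rate:

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(1)
n = 300

# Hypothetical error rates: the service error leads the golden signal
# by two time steps, plus noise.
service_err = rng.poisson(3, n).astype(float)
golden_err = np.roll(service_err, 2) + rng.normal(0, 0.5, n)

# Column 0 is the effect, column 1 the candidate cause; a small p-value
# suggests the service's errors help predict the golden-signal errors.
data = np.column_stack([golden_err, service_err])
res = grangercausalitytests(data, maxlag=3, verbose=False)
for lag, (tests, _) in res.items():
    print(f"lag {lag}: p-value {tests['ssr_ftest'][1]:.4f}")
```

Repeating such tests across all (service, golden signal) pairs yields a candidate causal graph whose edges can then be ranked to localize the fault.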

Using Language Models to Pre-train Features for Optimizing Information Technology Operations Management Tasks

Information Technology (IT) Operations management is a vexing problem for most companies that rely on IT systems for mission-critical business applications. While IT operators increasingly leverage analytical tools powered by artificial intelligence (AI), the volume, variety, and complexity of data generated in the IT Operations domain pose significant challenges in managing the applications. In this work, we present an approach that leverages language models to pre-train features for optimizing IT Operations management tasks such as anomaly prediction from logs. Specifically, using log-based anomaly prediction as the task, we show that machine learning models built using language models (embeddings) trained with IT Operations domain data as features outperform AI models built using language models trained on general-purpose data. Furthermore, we present empirical results outlining the influence of factors such as the type of language model, the type of input data, and the diversity of input data on the prediction accuracy of our log anomaly prediction model when language models trained on IT Operations domain data are used as features. We also present the run-time inference performance of log anomaly prediction models built using language models as features in an IT Operations production environment.

Xiaotong Liu, Yingbei Tong, Anbang Xu, Rama Akkiraju

Towards Runtime Verification via Event Stream Processing in Cloud Computing Infrastructures

Software bugs in cloud management systems often cause erratic behavior, hindering the detection and recovery of failures. As a consequence, failures are not detected and notified in a timely manner and can silently propagate through the system. To face these issues, we propose a lightweight approach to runtime verification for monitoring and failure detection of cloud computing systems. We performed a preliminary evaluation of the proposed approach on the OpenStack cloud management platform, an "off-the-shelf" distributed system, showing that the approach can be applied with high failure detection coverage.

Domenico Cotroneo, Luigi De Simone, Pietro Liguori, Roberto Natella, Angela Scibelli

Decentralized Federated Learning Preserves Model and Data Privacy

The increasing complexity of IT systems requires solutions that support operations in case of failure. Therefore, Artificial Intelligence for System Operations (AIOps) is a field of research that is receiving increasing attention, both in academia and industry. One of the major issues in this area is the lack of access to adequately labeled data, largely due to legal protection regulations or industrial confidentiality. Methods to mitigate this stem from the area of federated learning, whereby no direct access to training data is required. Original approaches utilize a central instance to perform model synchronization by periodically aggregating all model parameters. However, there are many scenarios where trained models cannot be published, since they either constitute confidential knowledge or training data could be reconstructed from them. Furthermore, the central instance needs to be trusted and is a single point of failure. As a solution, we propose a fully decentralized approach that allows knowledge to be shared between trained models. Neither original training data nor model parameters need to be transmitted. The concept relies on teacher and student roles that are assigned to the models, whereby students are trained on the output of their teachers via synthetically generated input data. We conduct a case study on log anomaly detection. The results show that an untrained student model, trained on the teacher's output, reaches F1-scores comparable to the teacher's. In addition, we demonstrate that our method allows the synchronization of several models trained on different, distinct training data subsets.

Thorsten Wittkopp, Alexander Acker
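
A toy sketch of the teacher-student transfer under these constraints, with stand-in classifiers and data rather than the paper's log anomaly detection models: the student sees only the teacher's outputs on synthetically generated inputs, never the private training data or the teacher's parameters.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Private data stays with the teacher's owner.
X, y = make_classification(n_samples=1500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
teacher = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                        random_state=0).fit(X_train, y_train)

# Knowledge transfer: synthetic inputs are labeled by the teacher, and
# the student is trained on these (input, output) pairs only.
X_syn = rng.normal(0, 2, size=(3000, 10))
y_syn = teacher.predict(X_syn)
student = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                        random_state=1).fit(X_syn, y_syn)

# Neither raw training data nor model parameters were exchanged.
print("teacher accuracy:", round(teacher.score(X_test, y_test), 3))
print("student accuracy:", round(student.score(X_test, y_test), 3))
```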

Online Memory Leak Detection in the Cloud-Based Infrastructures

A memory leak in an application deployed on the cloud can affect the availability and reliability of the application. Therefore, identifying and ultimately resolving it quickly is highly important. However, in a production environment running on the cloud, memory leak detection is a challenge without knowledge of the application or its internal object allocation details. This paper addresses the challenge of online detection of memory leaks in cloud-based infrastructure, without any internal application knowledge, by introducing Precog, a novel machine-learning-based algorithm. The algorithm solely uses one metric, the memory utilization of the system on which the application is deployed, to detect a memory leak. The accuracy of the developed algorithm was tested on manually labeled memory utilization data from 60 virtual machines, provided by our industry partner Huawei Munich Research Center, and it was found that the proposed algorithm achieves an accuracy score of 85% with less than half a second of prediction time per virtual machine.

Anshul Jindal, Paul Staab, Jorge Cardoso, Michael Gerndt, Vladimir Podolskiy
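
Precog itself is not reproduced here; as an illustrative stand-in, a detector in the same spirit can flag a VM whose memory utilization shows a sustained upward trend, using only that single metric:

```python
import numpy as np

def leak_suspected(mem_pct, slope_threshold=0.05):
    """mem_pct: memory utilization samples (%) at a fixed interval.
    Returns (flag, slope): flag is True if the fitted linear trend
    exceeds the threshold, i.e., memory keeps growing over the window."""
    t = np.arange(len(mem_pct))
    slope = np.polyfit(t, mem_pct, 1)[0]  # % per sample
    return slope > slope_threshold, slope

# Hypothetical series: a steady climb plus noise, as a leak would show,
# versus a flat, healthy utilization profile.
rng = np.random.default_rng(0)
leaking = 40 + 0.2 * np.arange(120) + rng.normal(0, 1.0, 120)
healthy = 55 + rng.normal(0, 1.0, 120)

print(leak_suspected(leaking))  # (True, ~0.2)
print(leak_suspected(healthy))  # (False, ~0.0)
```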

Multi-source Anomaly Detection in Distributed IT Systems

The multi-source data generated by distributed systems provide a holistic description of the system. Harnessing the joint distribution of the different modalities with a learning model can be beneficial for critical maintenance applications in distributed systems. One such important task is anomaly detection, where we are interested in detecting deviations of the current system behaviour from the theoretically expected one. In this work, we utilize a joint representation of distributed traces and system log data for the task of anomaly detection in distributed systems. We demonstrate that the joint utilization of traces and logs produces better results compared to single-modality anomaly detection methods. Furthermore, we formalize a learning task, next template prediction (NTP), that is used as a generalization of anomaly detection for both logs and distributed traces. Finally, we demonstrate that this formalization allows for the learning of template embeddings for both traces and logs. The joint embeddings can be reused in other applications as a good initialization for spans and logs.

Jasmin Bogatinovski, Sasho Nedelkoski

TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services

The deployment, operation, and maintenance of large IT systems become increasingly complex and put human experts under extreme stress when problems occur. Therefore, machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance, summarized under the term AIOps. One specific direction aims at the recognition of re-occurring anomaly types to enable remediation automation. However, recognizing re-occurring anomaly types is challenging due to IT-system-specific properties, especially frequent changes (e.g., software updates, reconfiguration, or hardware modernization). Current methods mainly assume a static dimensionality of the provided data. We propose a method that is invariant to dimensionality changes of the given data. Resource metric data such as CPU utilization, allocated memory, and others are modelled as multivariate time series. The extraction of temporal and spatial features, together with the subsequent anomaly classification, is realized by TELESTO, our novel graph convolutional neural network (GCNN) architecture. The experimental evaluation is conducted in a real-world cloud testbed deployment hosting two applications. Classification results for injected anomalies on a Cassandra database node show that TELESTO outperforms the alternative GCNNs and achieves an overall classification accuracy of 85.1%. Classification results for the other nodes show accuracy values between 85% and 60%.

Dominik Scheinert, Alexander Acker

Discovering Alarm Correlation Rules for Network Fault Management

Fault management is critical to telecommunication networks. It consists of detecting, diagnosing, isolating, and fixing network problems, a task that is time-consuming. A promising approach to improve fault management is to find patterns revealing the relationships between network alarms, so that only the most important alarms are shown to network operators. However, a limitation of current algorithms of this type is that they ignore the network topology, which is important for understanding how alarms propagate through a network. This paper addresses this issue by modeling a real-life telecommunication network as a dynamic attributed graph and then extracting correlation patterns between network alarms, called Alarm Correlation Rules. Experiments on a large telecommunication network show that interesting patterns are found that can greatly compress the number of alarms presented to network operators, which can reduce network maintenance costs.

Philippe Fournier-Viger, Ganghuan He, Min Zhou, Mourad Nouioua, Jiahong Liu

Resource Sharing in Public Cloud System with Evolutionary Multi-agent Artificial Swarm Intelligence

Artificial Intelligence for IT Operations (AIOps) is an emerging research area for public cloud systems. The research topics of AIOps have been expanding from robust and reliable systems to cloud resource allocation in general. In this paper, we propose a resource sharing scheme between cloud users that minimizes resource utilization while guaranteeing the users' Quality of Experience (QoE). We utilize the recently emerged concept of Artificial Swarm Intelligence (ASI) for resource sharing between users, using AI-based agents to mimic human user behaviours. With the variation of real-time resource utilization, the swarm of agents shares spare resources according to their needs and their Personality Traits (PT). We first propose and implement an Evolutionary Multi-robots Personality (EMP) model, which considers the constraints from the environment (the resource usage states of the agents) and the evolution of two agents' PT at each sharing step. We then implement a Single Evolution Multi-robots Personality (SEMP) model, which only evolves the agents' PT and neglects the resource usage states. For benchmarking, we also implement a Nash Bargaining Solution Sharing (NBSS) model, which uses game theory but involves neither PT nor risks of usage states. The objective of our proposed models is to provide all agents with sufficient resources while reducing the total amount of excessive resources. The results show that our EMP model performs best, with the fewest iteration steps to convergence and the best resource savings.

Beiran Chen, Yi Zhang, George Iosifidis

SLMAD: Statistical Learning-Based Metric Anomaly Detection

Technology companies have become increasingly data-driven, collecting and monitoring a growing list of metrics, such as response time, throughput, page views, and user engagement. With hundreds of metrics in a production environment, an automated approach is needed to detect anomalies and alert on potential incidents in real time. In this paper, we develop a time series anomaly detection framework called Statistical Learning-Based Metric Anomaly Detection (SLMAD) that allows for the detection of anomalies in key performance indicators (KPIs) in streaming time series data. We demonstrate the integrated workflow and algorithms of our anomaly detection framework, which is designed to be accurate, efficient, unsupervised, online, robust, and generalisable. Our approach consists of a three-stage pipeline including analysis of time series, dynamic grouping, and model training and evaluation. The experimental results show that SLMAD can accurately detect anomalies on a number of benchmark data sets and Huawei production data while maintaining efficient use of resources.

Arsalan Shahid, Gary White, Jaroslaw Diuwe, Alexandros Agapitos, Owen O’Brien
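
A minimal sketch of the statistical flavor of such detection, assuming a single KPI stream and a robust median/MAD threshold (the full SLMAD pipeline with dynamic grouping and model selection is not shown):

```python
import numpy as np

def mad_anomalies(x, k=3.5):
    """Flag points whose robust z-score (median/MAD based) exceeds k."""
    med = np.median(x)
    mad = np.median(np.abs(x - med)) or 1e-9  # guard against zero MAD
    score = 0.6745 * (x - med) / mad          # ~z-score under normality
    return np.where(np.abs(score) > k)[0]

rng = np.random.default_rng(0)
kpi = rng.normal(200, 5, 1000)   # e.g., response time in ms
kpi[[100, 500]] = [400, 20]      # injected incidents

print("anomalous indices:", mad_anomalies(kpi))
```

Median and MAD are used instead of mean and standard deviation so that the baseline itself is not distorted by the very anomalies being detected.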

Software Reliability Engineering for Resilient Cloud Operations

In the last decade, cloud environments have become among the most sophisticated software systems. Due to the inevitable occurrence of failures, software reliability engineering is a top priority for cloud developers and maintainers. In this essay, we introduce several frameworks that provide resilient cloud operations across different development phases, ranging from fault prevention before deployment to fault removal at run-time.

Michael R. Lyu, Yuxin Su

AI-Enabled Process Automation (AI-PA 2020)

Frontmatter

The Future of Robotic Process Automation (RPA)

While there has been considerable industry interest in the deployment and uptake of Robotic Process Automation (RPA) technology, very little has been done by way of generating technological foresight into the manner in which RPA systems might evolve in the short- to medium-term. This paper seeks to fill that gap.

Aditya Ghose, Geeta Mahala, Simon Pulawski, Hoa Dam

Adaptive Summaries: A Personalized Concept-Based Summarization Approach by Learning from Users’ Feedback

Exploring a tremendous amount of data efficiently to make a decision, similar to answering a complicated question, is challenging in many real-world application scenarios. In this context, automatic summarization has substantial importance, as it provides the foundation for big data analytics. Traditional summarization approaches optimize the system to produce a short, static summary that fits all users and do not consider the subjectivity of summarization, i.e., what is deemed valuable by different users, making these approaches impractical in real-world use cases. This paper proposes an interactive concept-based summarization model, called Adaptive Summaries, that helps users make their desired summary instead of producing a single inflexible summary. The system learns gradually from the information users provide while interacting with it in an iterative feedback loop. Users can either accept or reject a concept for inclusion in the summary, along with the importance of that concept from their perspective and the confidence level of their feedback. The proposed approach can guarantee interactive speed to keep the user engaged in the process. Furthermore, it eliminates the need for reference summaries, which is a challenging issue for summarization tasks. Evaluations show that Adaptive Summaries helps users make high-quality summaries based on their preferences by maximizing the user-desired content in the generated summaries.

Samira Ghodratnama, Mehrdad Zakershahrak, Fariborz Sobhanmanesh

TAP: A Two-Level Trust and Personality-Aware Recommender System

Recommender systems (RSs) have been adopted by a variety of web services to provide a list of items which a user may interact with in the near future. Collaborative filtering (CF) is one of the most widely used mechanisms in RSs; it focuses on the preferences of similar users (neighbours). It is therefore a critical challenge for CF models to discover a set of appropriate neighbours for a particular user. Most current approaches exploit users' rating information to find similar users by comparing their rating patterns. However, this simple, over-tested idea may fail under the data sparsity problem. A recommender system, as an intelligent system, needs to help users with their decision-making process and provide them with personalized suggestions. In the real world, people tend to share interests with those who have the same personality type, and among all users with a similar personality, people may only take advice and recommendations from the trustworthy ones. Therefore, in this paper we propose a two-level model, TAP, which analyzes users' behaviours to first detect their personality types and then incorporates trust information to provide more customized recommendations. We mathematically model our approach based on matrix factorization to consider personality and trust information simultaneously. Experimental results on a real-world dataset demonstrate the effectiveness of our model.

Shahpar Yakhchi, Seyed Mohssen Ghafari, Mehmet Orgun
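
A toy sketch of the modeling idea, assuming hypothetical ratings and trust links: plain SGD matrix factorization with an extra social-regularization step that pulls a user's latent factors toward those of trusted users with a similar personality. This illustrates the concept, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 5, 6, 3
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)
trust = {0: [1, 2], 3: [4]}  # user -> trusted same-personality users

P = rng.normal(0, 0.1, (n_users, k))   # user factors
Q = rng.normal(0, 0.1, (n_items, k))   # item factors
lr, reg, beta = 0.01, 0.02, 0.05       # beta weights the trust term

for _ in range(200):
    for u in range(n_users):
        for i in range(n_items):
            # Standard MF update on the rating prediction error.
            e = R[u, i] - P[u] @ Q[i]
            P[u] += lr * (e * Q[i] - reg * P[u])
            Q[i] += lr * (e * P[u] - reg * Q[i])
        # Social regularization: move u toward the mean factors of the
        # users u trusts, blending trust into the learned preferences.
        if u in trust:
            P[u] += lr * beta * (P[trust[u]].mean(axis=0) - P[u])

print("predicted ratings:\n", np.round(P @ Q.T, 2))
```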

Am I Rare? An Intelligent Summarization Approach for Identifying Hidden Anomalies

Monitoring network traffic data to detect hidden patterns of anomalies is a challenging and time-consuming task that requires high computing resources. To this end, an appropriate summarization technique, which can serve as a substitute for the original data, is of great importance. However, summarization carries the risk of removing anomalies from the data. It is therefore vital to create a summary that reflects the same patterns as the original data. In this paper, we propose an INtelligent Summarization approach for IDENTifying hidden anomalies, called INSIDENT. The proposed approach preserves the original data distribution in the summarized data. Our approach is a clustering-based algorithm that dynamically maps the original feature space to a new feature space by locally weighting features in each cluster. In the new feature space, similar samples are closer and, consequently, outliers are more detectable. Besides, selecting representatives based on cluster size keeps the summarized data distributed like the original data. INSIDENT can be used both as a preprocessing step before anomaly detection algorithms and as an anomaly detection algorithm itself. Experimental results on benchmark datasets show that a summary of the data can substitute for the original data in the anomaly detection task.

Samira Ghodratnama, Mehrdad Zakershahrak, Fariborz Sobhanmanesh

On How Cognitive Computing Will Plan Your Next Systematic Review

Systematic literature reviews (SLRs) are at the heart of evidence-based research, setting the foundation for future research and practice. However, producing good-quality, timely contributions is a challenging and highly cognitive endeavor, which has lately motivated the exploration of automation and support in the SLR process. In this paper we address an often overlooked phase in this process, the planning of literature reviews, and explore through the lens of cognitive process augmentation how to overcome its most salient challenges. In doing so, we report insights from 24 SLR authors on planning practices and their challenges, as well as feedback on support strategies inspired by recent advances in cognitive computing. We frame our findings within the cognitive augmentation framework and report on a prototype implementation and evaluation focused on further establishing technical feasibility.

Maisie Badami, Marcos Baez, Shayan Zamanirad, Wei Kang

Security Professional Skills Representation in Bug Bounty Programs and Processes

The ever-increasing number of security vulnerabilities discovered and reported in recent years is significantly raising the concerns of organizations and businesses regarding the potential risks of data breaches and attacks that may affect their assets (e.g., the cases of Yahoo and Equifax). Consequently, organizations, particularly those suffering from these attacks, rely on the work of security professionals. Unfortunately, due to the wide range of cyber-attacks, identifying suitably skilled security professionals is a challenging task. One reason is the "skill gap" problem, a mismatch between the security professionals' skills and the skills required for the job (vulnerability discovery in our case). In this work, we focus on platforms and processes for crowdsourced security vulnerability discovery (bug bounty programs) and present a framework for the representation of security professionals' skills. More specifically, we propose an embedding-based clustering approach that exploits multiple rich sources of information available across the web (e.g., job postings, vulnerability discovery reports) to translate security professionals' skills into a set of relevant skills using clustering in a semantic vector space. The effectiveness of this approach is demonstrated through experiments, and the results show that our approach works better than baseline solutions in selecting appropriate security professionals.

Sara Mumtaz, Carlos Rodriguez, Shayan Zamanirad

Stage-Based Process Performance Analysis

Process performance mining utilizes the event data generated and stored during the execution of business processes. Successful application of process performance mining requires reliable performance statistics based on an understandable representation of the process. However, techniques developed for the automated analysis of event data typically focus on only one of these requirements, i.e., they either focus on increasing the interpretability of the analysis or on computing and visualizing performance metrics. As such, obtaining performance statistics at a higher level of abstraction remains an open challenge. Hence, using the notion of process stages, i.e., high-level process steps, we propose an approach that supports human analysts in analyzing performance at the process-stage level. An extensive set of experiments shows that our approach, without much effort from users, supports such analysis with reliable results.

Chiao-Yun Li, Sebastiaan J. van Zelst, Wil M. P. van der Aalst

AudioLens: Audio-Aware Video Recommendation for Mitigating New Item Problem

From the early years, research on recommender systems has largely focused on developing advanced recommender algorithms. These sophisticated algorithms are capable of exploiting a wide range of data associated with video items and building quality recommendations for users. It is true that the excellence of recommender systems can be greatly boosted by the performance of their recommender algorithms. However, even the most advanced algorithms may fail to recommend video items for which the system has no representative data (e.g., tags and ratings). This situation is called the New Item problem and is part of a major challenge called Cold Start. The problem occurs when a new item is added to the catalog of the system and no data is available for it. This can be a serious issue in video-sharing applications, where hundreds of hours of video are uploaded every minute and a considerable number of these videos may have no or very limited associated data. In this paper, we address this problem by proposing recommendation based on novel features that require no human annotation, as they can be extracted completely automatically. This enables these features to be used in cold start situations where any other source of data could be missing. Our proposed features describe audio aspects of video items (e.g., energy, tempo, danceability, and speechiness) which can capture a different (yet important) picture of user preferences. While recommendation based on such preferences could be important, very limited attention has been paid to this type of approach. We have collected a large dataset of unique audio features (from Spotify) extracted from more than 9000 movies. We have conducted a set of experiments using this dataset and evaluated our proposed recommendation technique in terms of different metrics, i.e., Precision@K, Recall@K, RMSE, and Coverage. The results show the superior performance of recommendations based on audio features, used individually or combined, in the cold start evaluation scenario.

Mohammad Hossein Rimaz, Reza Hosseini, Mehdi Elahi, Farshad Bakhshandegan Moghaddam

Scalable Online Conformance Checking Using Incremental Prefix-Alignment Computation

Conformance checking techniques aim to compare observed process behavior with normative/modeled process models. The majority of existing approaches focus on completed process executions, i.e., offline conformance checking. Recently, novel approaches have been designed to monitor ongoing processes, i.e., online conformance checking. Such techniques detect deviations of an ongoing process execution from a normative process model at the moment they occur, so that countermeasures can be taken immediately to prevent a process deviation from causing further, undesired consequences. Most online approaches only detect approximations of deviations, which causes the problem of falsely detected deviations, i.e., detected deviations that are not actual deviations. We have, therefore, recently introduced a novel approach to compute exact conformance checking results in an online environment. In this paper, we focus on the practical application and present a scalable, distributed implementation of the proposed online conformance checking approach. Moreover, we present two extensions to said approach that reduce its computational effort and improve its practical applicability. We evaluate our implementation using data sets capturing the execution of real processes.

Daniel Schuster, Gero Joss Kolhof

Bringing Cognitive Augmentation to Web Browsing Accessibility

In this paper we explore the opportunities brought by cognitive augmentation to provide a more natural and accessible web browsing experience. We explore these opportunities through conversational web browsing, an emerging interaction paradigm for the Web that enables blind and visually impaired users (BVIP), as well as regular users, to access the contents and features of websites through conversational agents. Informed by the literature, our previous work, and prototyping exercises, we derive a conceptual framework for supporting BVIP conversational web browsing needs; we then focus on the challenges of automatically providing this support, describing our early work and a prototype that leverages heuristics considering structural and content features only.

Alessandro Pina, Marcos Baez, Florian Daniel

Towards Knowledge-Driven Automatic Service Composition for Wildfire Prediction

Wildfire prediction from Earth Observation (EO) data has gained much attention in the past years, through the development of connected sensors and weather satellites. Nowadays, it is possible to extract knowledge from collected EO data and to learn from this knowledge without human intervention to trigger wildfire alerts. However, exploiting knowledge extracted from multiple EO data sources at run-time and predicting wildfire raise multiple challenges. One major challenge is to provide dynamic construction of service composition plans, according to the data obtained from sensors. In this paper, we present a knowledge-driven Machine Learning approach that relies on historical data related to wildfire observations to guide the collection of EO data and to automatically and dynamically compose services for triggering wildfire alerts.

Hela Taktak, Khouloud Boukadi, Chirine Ghedira Guégan, Michael Mrissa, Faïez Gargouri

Eyewitness Prediction During Crisis via Linguistic Features

Social media is one of the first places people share information about serious topics, such as a crisis event. Stakeholders, including the agencies of crisis response, seek to understand this valuable information in order to reach affected people. This paper addresses the problem of locating eyewitnesses during times of crisis. We included published tweets of 26 crises of various types, including earthquakes, floods, train crashes, and others. This paper investigated the impact of linguistic features extracted from tweets on different learning algorithms and included two languages, English and Italian. Better results than the state of the art were achieved; in the cross-event scenario, we achieved F1-scores of 0.88 for English and 0.86 for Italian; in the split-across scenario, we achieved F1-scores of 0.69 for English and 0.89 for Italian.

Suliman Aladhadh

Smart Data Integration and Processing on Service Based Environments (STRAPS 2020)

Frontmatter

On the Definition of Data Regulation Risk

The rapid development of Information and Communication Technologies (ICT) has led firms to embrace data processing. Scholars and professionals have developed a range of assessment and management methodologies to better answer the needs for trust and privacy in ICT. With the ambition of establishing trust by reinforcing the protection of individuals' rights and privacy, economic interests, and national security, policy makers attempt to regulate data processing through the enactment of laws and regulations. Non-compliance with these norms may harm companies, which in turn need to incorporate it into their risk assessments. We propose to define this new class of risk, "Data Regulation Risk" (DRR), as "a risk originating from the possibility of a penalty from a regulatory agency following evidence of non-compliance with regulated data processing and/or ICT governance and processes and/or information technologies and services". Our definition clarifies the meaning of the defined terms in a given context and adds a specific scope to facilitate and optimize decision-making.

Guillaume Delorme, Guilaine Talens, Eric Disson, Guillaume Collard, Elise Gaget

Classifying Micro-text Document Datasets: Application to Query Expansion of Crisis-Related Tweets

Twitter is an active communication channel for spreading information during crises (e.g., earthquakes). To exploit this information, civilians need to explore the tweets produced along a crisis period, for instance, to get information about crisis-related events (e.g., landslide, building collapse) and their associated relief actions (e.g., gathering of food supplies, search for victims). However, such Twitter usage demands significant effort, and answers must be accurate to support the coordination of actions in response to crisis events (e.g., avoiding a massive concentration of efforts in only one place). This requirement calls for efficient information classification so that people can perform agile and useful relief actions. This paper introduces an approach based on classification and query expansion techniques in the context of micro-text (i.e., tweet) search. In our approach, a user's query is rewritten using a classified vocabulary derived from the top-k results, to better reflect her search intent. For classification purposes, we study and compare different models to find the one that can best provide answers to a user query. Our experimental results show that the use of Multi-Task Deep Neural Network (MT-DNN) models further improves micro-text classification. The experimental results also demonstrate that our query expansion method is effective and reduces noise in the expanded query terms when looking for crisis tweets in Twitter datasets.

Mehrdad Farokhnejad, Raj Ratn Pranesh, Javier A. Espinosa-Oviedo
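
A minimal sketch of the expansion step under stated assumptions (hypothetical tweets and classifier labels, TF-IDF in place of the paper's full model): keep the top-k results that the classifier assigns to the intended crisis category and add their highest-weight terms to the query.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

query = "building collapse"
top_k_tweets = [
    "rescue teams search victims after building collapse downtown",
    "another building collapsed, streets blocked by debris",
    "free pizza at the mall today",  # off-topic result
]
# Hypothetical labels from a classifier such as an MT-DNN model.
predicted_labels = ["infrastructure_damage", "infrastructure_damage",
                    "other"]

# Keep only tweets classified into the intended crisis category, so
# off-topic results do not contribute noisy expansion terms.
relevant = [t for t, l in zip(top_k_tweets, predicted_labels)
            if l == "infrastructure_damage"]

# Rank candidate terms by TF-IDF weight over the relevant tweets.
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(relevant)
weights = np.asarray(X.sum(axis=0)).ravel()
terms = np.array(vec.get_feature_names_out())

expansion = [t for t in terms[np.argsort(weights)[::-1]]
             if t not in query.split()][:3]
print("expanded query:", query, "+", expansion)
```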

Data Centered and Usage-Based Security Service

Protecting Information Systems (IS) traditionally relies on security risk analysis methods. Designed for well-perimetrised environments, these methods rely on a systematic identification of threats and vulnerabilities to identify efficient control-centered protection countermeasures. Unfortunately, this does not fit the security challenges raised by the open and agile organizations enabled by the Social, Mobile, big data Analytics, Cloud and Internet of Things (SMACIT) environment. Due to their inherently collaborative and distributed organization, such multi-tenancy systems require the integration of contextual vulnerabilities, depending on the a priori unknown ways of using, storing, and exchanging data in an open cloud environment. Moreover, as data can be associated with multiple copies, different protection requirements can be set for each of these copies, which may lead the initial data owner to lose control over data protection. This involves (1) turning the traditional control-centered security vision into dynamic data-centered protection and even (2) considering that the way data is used can be a potential threat that may corrupt data protection efficiency. To meet these challenges, we propose a Data-centric Usage-based Protection service (DUP). This service is based on an information system meta-model, used to formally identify data assets and store the processes using copies of these assets. To define usage-centered protection, we extend the Usage-Based Access Control model, which is mostly focused on managing CRUD operations, to more complex operations fitting the SMACIT context. These usage rules are used to generate smart contracts, storing usage consents and managing usage control for cloud services.

Jingya Yuan, Frédérique Biennier, Nabila Benharkat

XYZ Monitor: IoT Monitoring of Infrastructures Using Microservices

One of the main features of the Internet of Things (IoT) is the ability to collect data from everywhere, convert this data into knowledge, and then use this knowledge to warn about undesirable situations. Monitoring needs to be done automatically to be practical, and it should be related to the ontological structure of the information being processed to be useful. However, current solutions do not properly handle information from a wide range of IoT devices, nor are they able to react when a certain value threshold is exceeded. This is the main purpose of XYZ Monitor, the system we propose here: to monitor IoT devices so that it can automatically react and notify when a given alarm is detected. We deal with alarms defined by means of business rules and allow setting ontological requirements on the information handled.

Marc Vila, Maria-Ribera Sancho, Ernest Teniente

Multi-cloud Solution Design for Migrating a Portfolio of Applications to the Cloud

Migrating applications to the cloud is rapidly increasing in many organizations, as it enables them to take advantage of the cloud, such as lower costs and accessibility of data. Moreover, such organizations typically try to avoid sticking to a single cloud provider and rather prefer to be able to spread their applications across different providers. However, there are many challenges in achieving this. First, many of the applications that need to be moved to the cloud might be legacy applications without good documentation, so it is not trivial to even assess whether moving them to the cloud is feasible. Moreover, such legacy applications might need a significant architecture overhaul to be moved to the cloud. Large clients may have a significant percentage of applications in this category. So, one has to evaluate cloud feasibility and understand whether an application needs to be re-architected based on what service providers are able to offer. Second, clients usually define multiple features, encryption/security levels, and other service level requirements they expect of the providers to which they will migrate each of their applications. Thus, choosing the right providers for different applications is another challenging task. In this work-in-progress paper, we present a novel methodology for preparing such a cloud migration solution, in which we perform text mining on application data to evaluate cloud-migration feasibility and then recommend the optimal solution using a mathematical optimization model. We illustrate our approach with an example use case.

Shubhi Asthana, Aly Megahed, Ilyas Iyoob

Higher Order Statistical Analysis in Multiresolution Domain - Application to Breast Cancer Histopathology

Objective: To analyze textures in breast histopathology images for cancer diagnosis. Background: Breast cancer has the second highest mortality rate in women. Detection of cancer in its early stages allows more treatment options and thus reduces the mortality rate. In cancer diagnosis using histopathology images, histologists examine biopsy samples based on cell morphology, tissue distribution, and randomness in growth or placement. These methods are time-consuming, sometimes lead to incorrect diagnoses, and are highly subjective. Newer techniques use computers, archived data, and standard algorithms to provide fast and accurate results. Material & Methods: In this work we propose a multiresolution statistical model in the wavelet domain. The primary idea is to study the complex random fields of histopathology images, which contain long-range and nonlinear spatial interactions, in the wavelet domain. The model emphasizes the contribution of the Gray Level Run Length Matrix (GLRLM) and related higher order statistical features in wavelet subbands. The image samples are taken from the BreakHis database, a standard database generated in collaboration with the P&D Laboratory of Pathological Anatomy and Cytopathology, Parana, Brazil. The study has been designed for breast cancer histopathology images of ductal carcinoma. The GLRLM feature dataset is then classified by an SVM classifier with a linear kernel, and the classification accuracies of single-resolution and multiresolution analysis are compared. Results: The results show that GLRLM-based features provide exceptional distinguishing features for multiresolution analysis of histopathology images. In contrast to recent deep learning methods, this study proposes the use of higher order statistics to obtain stronger image features that carry inherent discriminative properties. This higher order statistical model is suitable for cancer detection. Conclusion: This work proposes automated diagnosis. Tumor spatial heterogeneity is the main concern in analyzing, diagnosing, and grading cancer. The model focuses on long-range spatial dependencies in heterogeneous spatial processes and offers solutions for accurate classification in two-class problems. The work describes an innovative way of using GLRLM-based textural features to extract underlying information in breast cancer images.

Durgamahanthi Vaishali, P. Vishnu Priya, Nithyasri Govind, K. Venkat Ratna Prabha

Ontology Evolution Using Recoverable SQL Logs

Logs of SQL queries are useful for building the system design, upgrading the system, and checking which SQL queries are run by certain applications. These SQL queries provide useful information and knowledge about system operations. Existing works use SQL query logs to find patterns when the underlying data and database schema are not available. For this purpose, a knowledge base in the form of an ontology is created, which is then mined for knowledge extraction. In this paper, we propose an approach to create and evolve an ontology from logs of SQL queries. When these SQL queries are transformed into the ontology, they lose their original form, i.e., the original SQL queries are no longer available. Therefore, we further propose a strategy to recover these SQL queries in their original form. Experiments on real-world datasets demonstrate the effectiveness of the proposed approach.

Awais Yousaf, Asad Masood Khattak, Kifayat Ullah Khan
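
To make the idea concrete, here is a minimal, hypothetical sketch of the round trip the abstract describes: simple SELECT statements are decomposed into subject-predicate-object triples (a toy stand-in for a full ontology), and the triples retain enough structure to regenerate each original query. The log contents, predicate names, and the regex-based parser are illustrative assumptions, not the paper's method.

```python
# Toy sketch: SQL log -> triples -> recovered SQL. Predicate names and
# the parser are illustrative assumptions.
import re

LOG = [
    "SELECT name, age FROM employees WHERE age > 30",
    "SELECT id FROM orders",
]

def to_triples(query_id, sql):
    m = re.match(r"SELECT\s+(.+?)\s+FROM\s+(\w+)(?:\s+WHERE\s+(.+))?$",
                 sql, re.IGNORECASE)
    cols, table, cond = m.group(1), m.group(2), m.group(3)
    triples = [(query_id, "hasTable", table)]
    triples += [(query_id, "selectsColumn", c.strip()) for c in cols.split(",")]
    if cond:
        triples.append((query_id, "hasCondition", cond))
    return triples

def recover(query_id, triples):
    """Rebuild the original SELECT from the triples (the 'recoverable' part)."""
    table = next(o for s, p, o in triples if s == query_id and p == "hasTable")
    cols = [o for s, p, o in triples if s == query_id and p == "selectsColumn"]
    cond = [o for s, p, o in triples if s == query_id and p == "hasCondition"]
    sql = f"SELECT {', '.join(cols)} FROM {table}"
    return sql + (f" WHERE {cond[0]}" if cond else "")

kb = [t for i, q in enumerate(LOG) for t in to_triples(f"q{i}", q)]
print(recover("q0", kb))  # SELECT name, age FROM employees WHERE age > 30
```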

Artificial Intelligence in the IoT Security Services (AI-IOTS 2020)

Frontmatter

A Novel Automated System for Hospital Acquired Infection Monitoring and Prevention

According to the World Health Organization (WHO), 1.7 million people suffer from hospital-acquired infections each year in the United States alone, accounting for 99,000 deaths. The most prominent reason for the spread of these infections is poor hand hygiene compliance in hospitals. This paper proposes an automated system that can monitor and enforce proper hand hygiene compliance in hospitals as stipulated by the WHO. The proposed system is a multi-module system based on a microcontroller and multiple sensors; it tracks hand hygiene compliance throughout a hospital, sends real-time compliance alerts to staff for immediate corrective action, and provides automated compliance report generation for hospital staff. The system comprises four modules: one is worn by staff, providing the staff member's unique ID to the other modules and receiving real-time hand hygiene compliance alerts, while the other three detect staff and use dedicated algorithms to perform detailed hand hygiene compliance checks at patient beds, sinks, and alcohol dispensers. Custom software was developed to control all modules and upload compliance data to a central server. The system makes hospital hand hygiene compliance monitoring and tracking fully automated, real-time, and scalable; once deployed, it has the potential to significantly reduce the rate of infections and save many lives. With minor changes to the algorithms, the system can also find applications in other areas such as restaurants, shops, and households.

Samyak Shrimali
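
The module logic itself is hardware-specific, but the server-side compliance rule such a system implies can be sketched in a few lines: a bed-side detection is compliant only if the same staff ID produced a sink or dispenser event within a recent window. The module names, event format, and the two-minute window below are assumptions for illustration, not values from the paper.

```python
# Hypothetical compliance rule: a bed event is compliant only if the staff
# member washed/sanitized within WINDOW_S seconds. Values are assumptions.
from collections import defaultdict

WINDOW_S = 120                             # assumed compliance window
last_hygiene = defaultdict(lambda: -1e9)   # staff_id -> last wash timestamp

def on_event(module, staff_id, ts):
    """Process one event from a sink, dispenser, or patient-bed module."""
    if module in ("sink", "dispenser"):
        last_hygiene[staff_id] = ts
        return None
    if module == "bed":
        compliant = ts - last_hygiene[staff_id] <= WINDOW_S
        return {"staff": staff_id, "ts": ts,
                "alert": None if compliant else "wash hands before contact"}

print(on_event("sink", "S17", 100.0))
print(on_event("bed", "S17", 150.0))   # compliant, no alert
print(on_event("bed", "S42", 150.0))   # no prior wash: triggers an alert
```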

A Novel Approach for Detecting IoT Botnet Using Balanced Network Traffic Attributes

The evolution of internet technology has brought intelligence to tiny objects, the so-called IoT devices. At the same time, it has increased the intrusion of malware, such as Mirai and BASHLITE, into these devices. Researchers have proposed many frameworks to address this issue, but few of them have been tested on real-time traffic from IoT devices. In this work, a class imbalance problem is identified in the BoT-IoT dataset and overcome with a random oversampling technique. The resulting dataset is then classified into normal and attack traffic using three machine learning classifiers, Support Vector Machine, Naive Bayes, and Decision Tree (J48), and a deep learning technique, the deep neural network. The performance of the security model is evaluated using quality metrics such as precision, recall, F-measure, response time, and ROC, in order to identify the classifier best suited to detecting malware in IoT devices.

M. Shobana, Sugumaran Poonkuzhali
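
A minimal sketch of the evaluation pipeline described above, assuming features and labels have already been extracted from the BoT-IoT records: random oversampling to balance the classes, then several classifiers compared on standard metrics. Note that scikit-learn's DecisionTreeClassifier is a CART-style stand-in for J48/C4.5, and the data below is synthetic.

```python
# Sketch: oversample the minority class, then compare classifiers.
# X and y stand in for features/labels extracted from BoT-IoT.
import numpy as np
from imblearn.over_sampling import RandomOverSampler
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)                 # synthetic, ~95% "normal"
X = rng.normal(size=(1000, 10))
y = (rng.random(1000) > 0.95).astype(int)

X_bal, y_bal = RandomOverSampler(random_state=0).fit_resample(X, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_bal, y_bal, random_state=0)

for name, clf in [("SVM", SVC()), ("NaiveBayes", GaussianNB()),
                  ("DecisionTree", DecisionTreeClassifier())]:
    clf.fit(X_tr, y_tr)
    print(name)
    print(classification_report(y_te, clf.predict(X_te)))
```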

KMeans Kernel-Learning Based AI-IoT Framework for Plant Leaf Disease Detection

The development of IoT-based solutions in agriculture is transforming the sector into Smart Agriculture. Plant Leaf Disease Detection (PLDD) using ICT is one of the most active and challenging research areas because of its potential impact on food security. Some current solutions based on AI/machine learning techniques (e.g., KNN, CNN) are very efficient. However, deploying them in the African context is challenging, since computation resources, connectivity to data centers, and electrical power supply cannot be guaranteed. In this paper we propose an AI-IoT framework based on KMeans kernel learning that builds Artificial Intelligence services in the core network and deploys them to an edge AI-IoT network. The AI-Service Segment selects leaf images that have representative characteristics of diseased leaves (Kernel-Images) and uses the KMeans machine learning algorithm to build clusters of Kernel-Images so that diseased regions are contained in a cluster. We call the resulting models KMeans Kernel Models. The main outcome of our proposal is a low-computation, economical edge AI-IoT framework that is as efficient as more sophisticated methods. Our evaluation shows that the system is efficient, achieving 96% accuracy with a small number of training images. The proposed framework reduces the need for large training datasets (in comparison to KNN/SVM and CNN), and the learned models are embeddable in IoT devices near the plants.

Youssouph Gueye, Maïssa Mbaye
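
The clustering step at the heart of the framework can be illustrated briefly: cluster the pixel colours of a leaf image with KMeans so that lesions tend to fall into their own cluster. This is a sketch under assumptions; the cluster count, RGB input, and the "least green cluster" heuristic are illustrative, not taken from the paper.

```python
# Sketch: KMeans colour clustering to isolate diseased leaf regions.
import numpy as np
from sklearn.cluster import KMeans

def segment_leaf(rgb_image, n_clusters=3):
    """rgb_image: H x W x 3 array. Returns an H x W map of cluster labels
    and the cluster centres in RGB space."""
    h, w, _ = rgb_image.shape
    pixels = rgb_image.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(pixels)
    return km.labels_.reshape(h, w), km.cluster_centers_

# Hypothetical usage: lesions are often the least green cluster.
# labels, centers = segment_leaf(img)
# diseased = labels == np.argmin(centers[:, 1])   # lowest green channel
```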

System for Monitoring and Control of Vehicle’s Carbon Emissions Using Embedded Hardwares and Cloud Applications

Today, electronic devices such as sensors and actuators, forming the Internet of Things (IoT), are a natural part of people's day-to-day lives. Billions of devices sense and act upon the physical world and exchange information. All major industries, such as transportation, manufacturing, healthcare, and agriculture, adopt IoT solutions. IoT affects not only people and industry in a positive way, but also nature and the environment: it is recognized as a key lever in the effort to save the climate, with major potential for reducing carbon emissions and air pollution. Given these promising application areas, this paper proposes a solution for monitoring and controlling carbon emissions from vehicles. It consists of a hardware device that ingests data related to a vehicle's carbon emissions, and cloud-based services for data storage, analysis, and presentation. It controls carbon emissions via notifications and restrictions on the vehicle's power.

Tsvetan Tsokov, Hristo Kostadinov
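
A hypothetical sketch of the device-side loop the abstract implies: sample an emission reading, upload it to a cloud endpoint, and request a power restriction when a threshold is exceeded. The endpoint URL, vehicle ID, threshold, and the read_co2_ppm() sensor call are all illustrative assumptions, not details from the paper.

```python
# Hypothetical device loop: sample, upload, restrict on threshold breach.
import time
import requests

API = "https://cloud.example.com/api/v1/emissions"   # hypothetical endpoint
THRESHOLD_PPM = 1500                                 # assumed emission limit

def read_co2_ppm():
    """Placeholder for the real exhaust-gas sensor driver."""
    return 1200.0

while True:
    ppm = read_co2_ppm()
    requests.post(API, json={"vehicle": "VH001", "ppm": ppm,
                             "ts": time.time()})
    if ppm > THRESHOLD_PPM:
        # Ask the cloud service to apply a power restriction.
        requests.post(API + "/restrict", json={"vehicle": "VH001",
                                               "action": "limit_power"})
    time.sleep(10)
```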

Cyber Forensics and Threat Investigations Challenges in Emerging Infrastructures (CFTIC 2020)

Frontmatter

An Information Retrieval-Based Approach to Activity Recognition in Smart Homes

One of the principal challenges in developing robust Machine Learning (ML) classification algorithms for Human Activity Recognition (HAR) from real-time smart home sensor data is how to account for variations in 1) the activity sequence length, 2) the contribution each sensor makes to an activity, and 3) the amount of activity class imbalance. Such changes generate observations that do not conform to expected patterns, potentially reducing the efficacy of classification models. Moreover, the architectures of prior solutions have been quite complex, resulting in long training times before these approaches achieve acceptable classification accuracy. In this paper we address these three issues by 1) proposing a data structure representing the duration and frequency information of each sensor for an activity, 2) transforming this data structure into an Information Retrieval (IR)-based representation, and 3) comparing and contrasting the utility of this IR-based representation using four different supervised classifiers. Our proposed framework, in combination with a state-of-the-art ensemble learner, results in more accurate and scalable ML classification models that are better suited toward off-line HAR in a smart home setting.

Brendon J. Woodford, Ahmad Ghandour
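
One way to picture the IR-based representation, as a sketch with assumed details: treat each activity window as a "document" whose "terms" are the sensors that fired, apply a TF-IDF transform, and train an ensemble classifier. Gradient boosting here stands in for whichever ensemble learner the paper uses, and the sensor IDs and labels are made up.

```python
# Sketch: sensor-event windows as TF-IDF 'documents' + ensemble classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical activity windows: the sequence of sensors that fired.
windows = [
    "M01 M01 M02 D01",        # e.g. kitchen motion + cupboard door
    "M07 M07 M07 M08",        # e.g. bedroom motion
    "M01 M02 M02 D01 D01",
    "M07 M08 M08",
]
labels = ["cook", "sleep", "cook", "sleep"]

vec = TfidfVectorizer(token_pattern=r"\S+")   # sensor IDs as tokens
X = vec.fit_transform(windows)
clf = GradientBoostingClassifier().fit(X.toarray(), labels)

# Should print ['cook']: the window shares its sensors with cooking windows.
print(clf.predict(vec.transform(["M02 M01 D01"]).toarray()))
```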

Botnet Sizes: When Maths Meet Myths

This paper proposes a method, and empirical evidence, to investigate the common claim that proxy services used by web-scraping bots have millions of residential IPs at their disposal. In a real-world setup, we had access to the logs of close to 20 heavily targeted websites and carried out an experiment over a two-month period. Based on the gathered evidence, we propose mathematical models indicating that the number of IPs is likely two to three orders of magnitude smaller than claimed. This finding suggests that an IP-reputation-based blocking strategy could be effective, contrary to what operators of these websites believe today.

Elisa Chiapponi, Marc Dacier, Massimiliano Todisco, Onur Catakoglu, Olivier Thonnard
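
The paper's models are not reproduced here, but the flavour of such an estimate can be shown with a classical capture-recapture (Lincoln-Petersen) calculation on made-up counts: if many of the IPs seen in one period reappear in the next, the underlying pool must be small.

```python
# Illustrative only: Lincoln-Petersen estimate of a hidden IP pool from
# two observation periods. Counts below are invented, not the paper's data.
def lincoln_petersen(n1, n2, overlap):
    """n1, n2: distinct IPs seen in each period; overlap: IPs seen in both."""
    return n1 * n2 / overlap

# Hypothetical counts: 40k IPs in month one, 45k in month two, 30k in both.
estimate = lincoln_petersen(40_000, 45_000, 30_000)
print(f"estimated pool size: {estimate:,.0f}")  # 60,000 -- far from 'millions'
```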

Cyber Security Education and Future Provision

Cybersecurity education is crucial for providing a future workforce with the awareness, skills, and knowledge that enable them to adapt and diversify within the field. Countering cybercrime requires distinct aspects of security and control measures on the systems and devices concerned. As the need for cybersecurity specialists has increased in recent years, and during the exceptional circumstances of the Covid-19 pandemic, the provision of education in the secondary, post-16, and higher education sectors needs to keep pace. Utilising strategies and innovations to meet industry and educational expectations is key to future provision. Governments and organisations throughout the world are driving different strategies and innovations, because educational provision is not balanced enough to cope with the increasing demand for a cybersecurity workforce. This study informs recommendations for the effective provision of future cybersecurity education.

Gaynor Davies, Mamoun Qasem, Ahmed M. Elmisery

Efficient Threat Hunting Methodology for Analyzing Malicious Binaries in Windows Platform

The rising cyber threat puts organizations and ordinary users at risk of data breaches. In many cases, early detection can hinder the occurrence of these incidents or even prevent a full compromise of all internal systems. Existing security controls such as firewalls and intrusion prevention systems constantly block the numerous intrusion attempts that happen on a daily basis, but new situations may arise in which these controls are not sufficient to provide full protection. There is a need for a threat hunting methodology that can assist investigators and members of the incident response team in analysing malicious binaries quickly and efficiently. The methodology proposed in this research distinguishes malicious binaries from benign binaries in a quick and efficient way, combining static and dynamic hunting techniques. With these techniques, the methodology is capable not only of identifying a range of signature-based anomalies but also of pinpointing behavioural anomalies that arise in the operating system when malicious binaries are triggered. Static hunting can mark extracted artifacts as malicious based on a set of pre-defined patterns of malicious software, while dynamic hunting assists investigators in finding behavioural anomalies. This work focuses on applying the proposed threat hunting methodology to samples of malicious binaries found in common malware repositories, and presents the results.

Ahmed M. Elmisery, Mirela Sertovic, Mamoun Qasem
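
As a toy illustration of the static-hunting step (the patterns and blocklist below are placeholders, not the paper's rule set), a binary can be flagged when its hash is on a blocklist or when it embeds suspicious strings:

```python
# Toy static hunt: hash blocklist + suspicious-string scan. The hash and
# patterns are placeholders for a real rule set.
import hashlib
import re

KNOWN_BAD_HASHES = {
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
}
SUSPICIOUS = [rb"cmd\.exe /c", rb"http://[\w.]+/gate\.php", rb"VirtualAllocEx"]

def static_hunt(path):
    data = open(path, "rb").read()
    digest = hashlib.sha256(data).hexdigest()
    hits = [p.decode() for p in SUSPICIOUS if re.search(p, data)]
    verdict = "malicious" if digest in KNOWN_BAD_HASHES or hits else "benign"
    return {"sha256": digest, "string_hits": hits, "verdict": verdict}

# print(static_hunt("sample.bin"))
```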

Peer-to-Peer Application Threat Investigation

Understanding the layers of an application leads to a better threat investigation outcome and helps with developing proper controls at an optimized cost. Taking the blockchain as our peer-to-peer application, we analyze peer-to-peer networks and how they provide the underlay for blockchain. We examine the layers of the peer-to-peer application, starting from the network layer, communication flows, and communication ports, analyzed through packet captures collected from both the client side and network taps, and moving up to the blockchain layer and client-side processes, where we look at imported functions and memory and CPU utilization. We aim to provide a structured approach to threat investigation for peer-to-peer applications.

Mohamed Mahdy
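
A hypothetical first pass at the network layer described above, assuming the scapy library and a client-side capture file: read the pcap and summarise the peer endpoints and destination ports where P2P overlay traffic typically shows up.

```python
# Sketch: summarise peers and destination ports from a packet capture.
from collections import Counter
from scapy.all import rdpcap, IP, TCP, UDP

def summarise(pcap_path):
    peers, ports = Counter(), Counter()
    for pkt in rdpcap(pcap_path):
        if IP in pkt:
            peers[pkt[IP].dst] += 1
        for layer in (TCP, UDP):
            if layer in pkt:
                ports[pkt[layer].dport] += 1
    return peers.most_common(10), ports.most_common(10)

# Hypothetical usage with a client-side capture:
# top_peers, top_ports = summarise("client_capture.pcap")
```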

Backmatter
