
2021 | Book

Machine Learning for Networking

Third International Conference, MLN 2020, Paris, France, November 24–26, 2020, Revised Selected Papers


About this book

This book constitutes the thoroughly refereed proceedings of the Third International Conference on Machine Learning for Networking, MLN 2020, held in Paris, France, in November 2020. The 24 revised full papers included in the volume were carefully reviewed and selected from the submissions received. They present and discuss new trends in deep and reinforcement learning, pattern recognition and classification for networks, machine learning for network slicing optimization, 5G systems, user behavior prediction, multimedia, IoT, security and protection, optimization and new innovative machine learning methods, performance analysis of machine learning algorithms, experimental evaluations of machine learning, data mining in heterogeneous networks, distributed and decentralized machine learning algorithms, intelligent cloud-support communications, resource allocation, energy-aware communications, software-defined networks, cooperative networks, positioning and navigation systems, wireless communications, wireless sensor networks, and underwater sensor networks.

Table of Contents

Frontmatter
Better Anomaly Detection for Access Attacks Using Deep Bidirectional LSTMs
Abstract
Recent evaluations show that the current anomaly-based network intrusion detection methods fail to detect remote access attacks reliably [10]. Here, we present a deep bidirectional LSTM approach that is designed specifically to detect such attacks as contextual network anomalies. The model efficiently learns short-term sequential patterns in network flows as conditional event probabilities to identify contextual anomalies. To verify our improvements on current detection rates, we re-implemented and evaluated three state-of-the-art methods in the field. We compared results on an assembly of datasets that provides both representative network access attacks as well as real normal traffic over a long timespan, which we contend is closer to a potential deployment environment than current NIDS benchmark datasets. We show that by building a deep model, we are able to reduce the false positive rate to 0.16% while detecting effectively, which is significantly lower than the operational range of other methods. Furthermore, we reduce overall misclassification by more than 100% from the next best method.
Henry Clausen, Gudmund Grov, Marc Sabate, David Aspinall
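The paper's exact architecture is not reproduced here, but the core idea it describes, modelling conditional event probabilities with a bidirectional LSTM and flagging low-probability events as contextual anomalies, can be sketched in a few lines of Keras. The vocabulary size, context window, anomaly threshold and random training data below are all hypothetical placeholders.

```python
import numpy as np
import tensorflow as tf

VOCAB, SEQ_LEN = 1000, 20  # hypothetical flow-event vocabulary and context window

# Bidirectional LSTM that predicts the probability of the event following a context.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 32),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(VOCAB, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Toy training data: context windows and the event that follows each of them.
X = np.random.randint(0, VOCAB, size=(256, SEQ_LEN))
y = np.random.randint(0, VOCAB, size=(256,))
model.fit(X, y, epochs=1, verbose=0)

# Events whose conditional probability falls below a threshold are reported
# as contextual anomalies.
probs = model.predict(X[:8], verbose=0)
anomalies = probs[np.arange(8), y[:8]] < 1e-3
```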
Using Machine Learning to Quantify the Robustness of Network Controllability
Abstract
This paper presents machine learning based approximations for the minimum number of driver nodes needed for structural controllability of networks under link-based random and targeted attacks. We compare our approximations with existing analytical approximations and show that our machine learning based approximations significantly outperform the existing closed-form analytical approximations in case of both synthetic and real-world networks. Apart from targeted attacks based upon the removal of so-called critical links, we also propose analytical approximations for out-in degree-based attacks.
Ashish Dhiman, Peng Sun, Robert Kooij
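As a hedged illustration of the general setup (not the authors' model or feature set), a standard regressor can learn the mapping from simple network descriptors and the fraction of removed links to a normalised number of driver nodes; the features and targets below are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Hypothetical features: mean degree, degree variance, fraction of links removed.
X = rng.uniform(size=(500, 3))
# Synthetic stand-in for the (normalised) minimum number of driver nodes.
y = 0.1 + 0.8 * X[:, 2] * (1 - 0.3 * X[:, 0]) + 0.05 * rng.normal(size=500)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[:400], y[:400])
print("test R^2:", model.score(X[400:], y[400:]))
```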
Configuration Faults Detection in IP Virtual Private Networks Based on Machine Learning
Abstract
Network incidents are largely due to configuration errors, particularly within network service providers who manage large complex networks. Such providers offer virtual private networks to their customers to interconnect their remote sites and provide Internet access. The growing demand for virtual private networks leads service providers to search for novel scalable approaches to locate incidents arising from configuration faults. In this paper, we propose a machine learning approach that aims to locate customer connectivity issues caused by configuration errors in a BGP/MPLS IP virtual private network architecture. We feed the learning model with valid and faulty configuration data and train it using three algorithms: decision tree, random forest and multi-layer perceptron. Since failures can occur on several routers, we treat the learning problem as a supervised multi-label classification problem, where each customer router is represented by a unique label. We carry out our experiments on three network sizes containing different types of configuration errors. Results show that the multi-layer perceptron has better accuracy in detecting faults than the other algorithms, making it a potential candidate for validating offline network configurations before online deployment.
El-Heithem Mohammedi, Emmanuel Lavinal, Guillaume Fleury
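A minimal sketch of the multi-label formulation described above, assuming hypothetical encoded configurations and a synthetic fault pattern; each output label corresponds to one customer router.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
N_FEATURES, N_ROUTERS = 40, 5  # hypothetical encoded-config size and label count

# Each sample encodes one (possibly faulty) configuration; each label says
# whether the corresponding customer router is affected (multi-label target).
X = rng.integers(0, 2, size=(600, N_FEATURES))
Y = X[:, :N_ROUTERS] ^ X[:, N_ROUTERS:2 * N_ROUTERS]  # synthetic fault pattern

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X[:500], Y[:500])
print("subset accuracy:", clf.score(X[500:], Y[500:]))
```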
Improving Android Malware Detection Through Dimensionality Reduction Techniques
Abstract
Mobile malware undoubtedly poses a major threat to the continuously increasing number of mobile users worldwide. While researchers have been trying vigorously to find optimal detection solutions, mobile malware is becoming more sophisticated and its writers are getting more and more skilled in hiding malicious code. In this paper, we examine the usefulness of two well-known dimensionality reduction transformations, namely Principal Component Analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE), in malware detection. Starting from a large set of prominent base classifiers, we study how they can be combined to build an accurate ensemble. We propose a simple ensemble that aggregates base models of the same feature type, as well as a complex ensemble that can use multiple and possibly heterogeneous base models. The experimental results on contemporary AndroZoo benchmark corpora verify the suitability of ensembles for this task and clearly demonstrate the effectiveness of our method.
Vasileios Kouliaridis, Nektaria Potha, Georgios Kambourakis
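The sketch below shows one plausible way to combine dimensionality reduction with a heterogeneous ensemble in scikit-learn; the synthetic features are a stand-in for the app-level features used in the paper, and the component choices are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for app feature vectors (permissions, API calls, ...).
X, y = make_classification(n_samples=1000, n_features=200, n_informative=30,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Reduce dimensionality first, then aggregate heterogeneous base classifiers.
ensemble = make_pipeline(
    PCA(n_components=20),
    VotingClassifier([("lr", LogisticRegression(max_iter=1000)),
                      ("rf", RandomForestClassifier(random_state=0)),
                      ("svm", SVC(probability=True))], voting="soft"),
)
print("accuracy:", ensemble.fit(X_tr, y_tr).score(X_te, y_te))
```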
A Regret Minimization Approach to Frameless Irregular Repetition Slotted Aloha: IRSA-RM
Abstract
Wireless communications play an important part in Internet of Things (IoT) systems. Recently, there has been a trend towards long-range communication systems for IoT, including cellular networks. For many use cases, such as massive machine-type communications (mMTC), performance can be gained by moving away from the classical model of connection establishment and adopting random access methods. Combined with physical layer techniques such as Successive Interference Cancellation (SIC) or Non-Orthogonal Multiple Access (NOMA), the performance of random access can be dramatically improved, giving rise to novel random access protocol designs. This article studies one of these modern random access protocols: Irregular Repetition Slotted Aloha (IRSA). Because optimizing its parameters is not an easily solved problem, in this article we use a reinforcement learning approach for that purpose. We adopt one specific variant of reinforcement learning, Regret Minimization, to learn the protocol parameters. We explain why it is selected, how to apply it to our problem with centralized learning, and finally we provide both simulation results and insights into the learning process. The obtained results show the excellent performance of IRSA when it is optimized with Regret Minimization.
Iman Hmedoush, Cédric Adjih, Paul Mühlethaler
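Regret Minimization itself can be illustrated with a small regret-matching loop. The action set (candidate repetition degrees) and the reward function below are toy stand-ins for the IRSA degree-distribution parameters and the decoder/simulator feedback used in the paper, and full counterfactual feedback is assumed for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
degrees = np.array([1, 2, 3, 4])        # candidate repetition degrees (toy action set)
regret = np.zeros(len(degrees))

def reward(d):
    # Hypothetical stand-in for the throughput feedback a simulator/decoder
    # would return for repetition degree d.
    return np.exp(-((d - 3) ** 2) / 2.0) + 0.05 * rng.normal()

for t in range(2000):
    pos = np.maximum(regret, 0.0)
    policy = pos / pos.sum() if pos.sum() > 0 else np.full(len(degrees), 1 / len(degrees))
    a = rng.choice(len(degrees), p=policy)
    payoffs = np.array([reward(d) for d in degrees])
    regret += payoffs - payoffs[a]      # regret-matching update

print("learned policy over degrees:", dict(zip(degrees, policy.round(3))))
```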
Mobility Based Genetic Algorithm for Heterogeneous Wireless Networks
Abstract
In heterogeneous wireless networks, collaboration between mobile and static elements allows the mobile elements to optimally relay (in terms of latency, reliability and energy savings) the data collected with precision by the static elements. In this article, we focus on this collaboration. We base our study on genetic algorithms to select the best next destination in the event area according to different criteria, such as the amount of data collected and the delay incurred by the passage of the UAVs (Unmanned Aerial Vehicles). Choosing the best destination ensures better latency and a high level of received data. The events and characteristics of the event area are sensed by the static elements (this ensures optimal precision, since each static element is placed at the desired location to be studied), while the mobile elements are in charge of collecting the data sensed by the static elements and of routing them to the collecting station. The results confirm the effectiveness of our collaborative approach compared to a solution based on random mobility of the mobile elements.
Kamel Barka, Lyamine Guezouli, Samir Gourdache, Sara Ameghchouche
Geographical Information Based Clustering Algorithm for Internet of Vehicles
Abstract
Nowadays, the Internet of Vehicles (IoV) is considered the most important promoter domain in Intelligent Transportation Systems (ITS). Vehicles in IoV are characterized by high node mobility, a high number of nodes and high data storage. However, IoV suffers from many challenges in achieving robust communication between vehicles, such as frequent link disconnection, delay, and network overhead. In traditional Vehicular Ad hoc NETworks (VANETs), these problems have often been solved by using clustering algorithms. Clustering in IoV can overcome and minimize the communication problems that vehicles face by reducing the network overhead and ensuring a certain Quality of Service (QoS) to make network connectivity more stable. In this work, we propose a new Geographical Information based Clustering Algorithm "GICA" intended for IoV environments. The proposal aims to maintain the cluster structure while respecting the quality of service requirements as the network evolves. We evaluated our proposed approach using the NS3 simulator and the realistic mobility model SUMO.
Rim Gasmi, Makhlouf Aliouat, Hamida Seba
Active Probing for Improved Machine-Learned Recognition of Network Traffic
Abstract
Information about the network protocols used by the background traffic can be important to the foreground traffic. Whether that knowledge is exploited via optimization through protocol selection (OPS) or through other forms of parameter tuning, a machine-learned classifier is one tool for identifying background traffic protocols. Unfortunately, global knowledge can be difficult to obtain in a dynamic distributed system like a shared, wide-area network (WAN).
Previous techniques for protocol identification have focused on passive or end-point signals for classification. For example, end-to-end round trip time (RTT) can, especially when gathered as a time series, reveal a lot about what is happening on the network. Other related signals, such as bandwidth and the number of retransmissions, can also be used for protocol classification. However, as noted, these signals are typically gathered by passive means, which may limit their usefulness.
We introduce and provide a proof-of-concept of active probing, which is the systematic and deliberate perturbation of traffic on a network for the purpose of gathering information. The time-series data generated by active probing improves our machine-learned classifiers because different network protocols react differently to the probing. Whereas passive probing might limit the time-series observations to a period of steady state (e.g., a saturated network), active probing forces the system out of that steady state. We show that active probing improves on prior work (with passive probing of RTT) by 7% to 16% in additional accuracy (depending on the window size), reaching 90% averages in precision, recall, and F1-scores.
Hamidreza Anvari, Paul Lu
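A minimal sketch of the classification step, assuming a toy RTT trace whose reaction to an injected probing burst differs for each of two hypothetical background protocols; simple window statistics feed a standard classifier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def rtt_window(protocol, n=50):
    # Toy RTT time series: the two hypothetical background protocols react
    # differently to a probing burst injected halfway through the window.
    base = 40 + 5 * rng.normal(size=n)
    bump = 25.0 if protocol == 0 else 8.0               # reaction to the probe
    base[n // 2:] += bump * np.exp(-np.arange(n - n // 2) / 10.0)
    return base

labels = rng.integers(0, 2, size=400)
series = np.array([rtt_window(p) for p in labels])

# Simple window features: mean/std before and after the probe.
half = series.shape[1] // 2
X = np.column_stack([series[:, :half].mean(1), series[:, :half].std(1),
                     series[:, half:].mean(1), series[:, half:].std(1)])
scores = cross_val_score(RandomForestClassifier(random_state=0), X, labels, cv=5)
print("accuracy:", scores.mean())
```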
A Dynamic Time Warping and Deep Neural Network Ensemble for Online Signature Verification
Abstract
Dynamic Time Warping (DTW) is a tried and tested online signature verification technique that still finds relevance in modern studies. However, DTW operates in a writer-dependent manner and its algorithm outputs unbounded distance values. The introduction of bounded outputs offers the prospect of cross-pollination with other regression models which provide normalized outputs. Writer-dependent methods are heavily influenced by the richness of the available reference signature sets. Although writer-independent methods also use reference signatures, they have the ability to learn general characteristics of genuine and forged signatures. This ability particularly gives them an edge at detecting skilled forgeries. Noting that DTW, on the other hand, is strong at random signature verification, this study proposes a model which combines DTW and Deep Neural Networks (DNNs). When trained on a class-balanced training set from the BiosecurID dataset, using a best vs 1 reference signature selection scheme, the proposed hybrid model outperforms previous methods, achieving Equal Error Rates of 5.17 and 2.64 for skilled and random signature cases, respectively.
Mandlenkosi Victor Gwetu
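The DTW half of the ensemble is easy to sketch; the raw distance is then bounded so it can be fused with a DNN output, which is the motivation given in the abstract (the fusion network itself is omitted, and the signals and bounding function below are illustrative).

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy pen-pressure traces: a reference signature and a questioned one.
reference = np.sin(np.linspace(0, 4 * np.pi, 120))
questioned = np.sin(np.linspace(0, 4 * np.pi, 100)) + 0.05 * np.random.randn(100)

raw = dtw_distance(reference, questioned)
score = 1.0 / (1.0 + raw)   # bound the unbounded DTW distance into (0, 1]
print(f"DTW distance={raw:.2f}, bounded score={score:.3f}")
```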
Performance Evaluation of Some Machine Learning Algorithms for Security Intrusion Detection
Abstract
The growth of the Internet and the opening up of systems have led to an increasing number of attacks on computer networks. Security vulnerabilities are increasing, in the design of communication protocols as well as in their implementation. At the same time, the knowledge, tools and scripts needed to launch attacks are becoming readily available and easier to use. Hence, the need for an intrusion detection system (IDS) is ever more apparent. This technology consists in searching a packet flow for a series of words or parameters characterizing an attack. Intrusion Detection Systems have become an essential and critical component of an IT security architecture, and an IDS should be designed as part of a global security policy. The objective of an IDS is to detect any violation of the rules defined by the local security policy and thus to report attacks. Intrusion detection is multi-faceted and difficult to pin down, and most of the work done in this area remains difficult to compare. That is why the aim of our article is to analyze and compare intrusion detection techniques using several machine learning algorithms. Our research indicates which algorithm offers better overall performance than the others in the IDS field.
Ouafae Elaeraj, Cherkaoui Leghris, Éric Renault
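A sketch of the kind of comparison the paper performs, using a synthetic stand-in for labelled flow records and off-the-shelf scikit-learn classifiers; the model list and scoring metric are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for labelled flow records (normal vs. attack).
X, y = make_classification(n_samples=2000, n_features=30, weights=[0.8, 0.2],
                           random_state=0)

models = {"NB": GaussianNB(), "kNN": KNeighborsClassifier(),
          "DT": DecisionTreeClassifier(random_state=0),
          "RF": RandomForestClassifier(random_state=0),
          "LR": LogisticRegression(max_iter=1000), "SVM": SVC()}
for name, model in models.items():
    f1 = cross_val_score(model, X, y, cv=5, scoring="f1").mean()
    print(f"{name}: F1 = {f1:.3f}")
```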
Three Quantum Machine Learning Approaches for Mobile User Indoor-Outdoor Detection
Abstract
There is a growing trend in using machine learning techniques for detecting environmental context in communication networks. Machine learning is one of the promising candidate areas where quantum computing can show a quantum advantage over its classical algorithmic counterparts on near-term Noisy Intermediate-Scale Quantum (NISQ) devices. The goal of this paper is to give a practical overview of (supervised) quantum machine learning techniques to be used for indoor-outdoor detection. Due to the small number of qubits in current quantum hardware, real application is not yet feasible. Our work is intended to be a starting point for further explorations of quantum machine learning techniques for indoor-outdoor detection.
Frank Phillipson, Robert S. Wezeman, Irina Chiscop
Learning Resource Allocation Algorithms for Cellular Networks
Abstract
Resource allocation algorithms in wireless networks can require solving complex optimization problems at every decision epoch. For large scale networks, when decisions need to be taken on time scales of milliseconds, using standard convex optimization solvers for computing the optimum can be a time-consuming affair that may impair real-time decision making. In this paper, we propose to use Deep Feedforward Neural Networks (DFNN) for learning the relation between inputs and the outputs of two such resource allocation algorithms that were proposed in [18, 19]. On numerical examples with realistic mobility patterns, we show that the learning algorithm yields an approximate yet satisfactory solution with much less computation time.
Thi Thuy Nga Nguyen, Olivier Brun, Balakrishna J. Prabhu
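A minimal sketch of the imitation setup: a small feedforward network is trained to reproduce the output of a (here hypothetical, proportional-fair-like) allocation routine so that a fast forward pass can replace the optimizer at decision time. The algorithms actually learned in the paper are those of [18, 19], which are not reproduced here.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
N_USERS = 8

def slow_optimizer(gains):
    # Hypothetical stand-in for the expensive allocation routine: normalised
    # per-user weights computed from channel gains.
    return gains / gains.sum(axis=1, keepdims=True)

gains = rng.exponential(size=(5000, N_USERS)).astype("float32")
allocs = slow_optimizer(gains)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_USERS, activation="softmax"),  # fractions summing to 1
])
model.compile(optimizer="adam", loss="mse")
model.fit(gains, allocs, epochs=3, batch_size=64, verbose=0)

# At run time, a single forward pass replaces the optimizer call.
print(model.predict(gains[:1], verbose=0))
```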
Enhanced Pub/Sub Communications for Massive IoT Traffic with SARSA Reinforcement Learning
Abstract
Sensors are being extensively deployed and are expected to expand at significant rates in the coming years. They typically generate a large volume of data on the internet of things (IoT) application areas like smart cities, intelligent traffic systems, smart grid, and e-health. Cloud, edge and fog computing are potential and competitive strategies for collecting, processing, and distributing IoT data. However, cloud, edge, and fog-based solutions need to tackle the distribution of a high volume of IoT data efficiently through constrained and limited resource network infrastructures. This paper addresses the issue of conveying a massive volume of IoT data through a network with limited communications resources (bandwidth) using a cognitive communications resource allocation based on Reinforcement Learning (RL) with SARSA algorithm. The proposed network infrastructure (PSIoTRL) uses a Publish/Subscribe architecture to access massive and highly distributed IoT data. It is demonstrated that the PSIoTRL bandwidth allocation for buffer flushing based on SARSA enhances the IoT aggregator buffer occupation and network link utilization. The PSIoTRL dynamically adapts the IoT aggregator traffic flushing according to the Pub/Sub topic’s priority and network constraint requirements.
Carlos E. Arruda, Pedro F. Moraes, Nazim Agoulmine, Joberto S. B. Martins
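The SARSA update at the heart of the proposed controller is standard and can be sketched with a Q-table; the buffer/bandwidth environment below is a toy stand-in for the Pub/Sub aggregator model of the paper, with hypothetical state and action discretisations.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 10, 3       # e.g. discretised buffer occupancy / bandwidth shares
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.1, 0.9, 0.1

def step(s, a):
    # Hypothetical environment: reward is high when the chosen bandwidth share
    # keeps the aggregator buffer (the state) low.
    s_next = max(0, min(N_STATES - 1, s + rng.integers(-1, 2) - (a - 1)))
    return s_next, -s_next

def eps_greedy(s):
    return rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(Q[s]))

s, a = 5, eps_greedy(5)
for t in range(20000):
    s2, r = step(s, a)
    a2 = eps_greedy(s2)                                   # on-policy action selection
    Q[s, a] += alpha * (r + gamma * Q[s2, a2] - Q[s, a])  # SARSA update
    s, a = s2, a2

print("greedy action per buffer level:", Q.argmax(axis=1))
```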
Deep Learning-Aided Spatial Multiplexing with Index Modulation
Abstract
In this paper, deep learning (DL)-aided data detection of spatial multiplexing (SMX) multiple-input multiple-output (MIMO) transmission with index modulation (IM), called Deep-SMX-IM, is proposed. Deep-SMX-IM is constructed by combining a zero-forcing (ZF) detector and a DL technique. The proposed method exploits the significant advantages of DL techniques to learn the transmission characteristics of the frequency and spatial domains. Furthermore, thanks to the subblock-based detection provided by IM, Deep-SMX-IM is a straightforward method which ultimately yields reduced complexity. It is shown that Deep-SMX-IM achieves significant error performance gains compared to the ZF detector without increasing computational complexity for different system configurations.
Merve Turhan, Ersin Öztürk, Hakan Ali Çırpan
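The zero-forcing stage is simply a pseudo-inverse of the channel matrix, as sketched below for a toy QPSK system; the subsequent subblock-based neural refinement and index-modulation bit mapping of Deep-SMX-IM are omitted, and the dimensions and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tx, n_rx = 4, 4

# Rayleigh-like channel, QPSK symbols, additive noise.
H = (rng.normal(size=(n_rx, n_tx)) + 1j * rng.normal(size=(n_rx, n_tx))) / np.sqrt(2)
x = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=n_tx)
noise = 0.05 * (rng.normal(size=n_rx) + 1j * rng.normal(size=n_rx))
y = H @ x + noise

# Zero-forcing detection: pseudo-inverse of the channel matrix.
x_zf = np.linalg.pinv(H) @ y
x_hat = np.sign(x_zf.real) + 1j * np.sign(x_zf.imag)
print("symbol errors:", int(np.sum(x_hat != x)))
```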
A Self-gated Activation Function SINSIG Based on the Sine Trigonometric for Neural Network Models
Abstract
Deep learning models are based on a succession of multiple layers of artificial neural networks, each of which applies a mathematical transformation and feeds the next layer. This process is driven by the non-linearity of the activation function that determines the output of each neural network layer, in order to facilitate the learning process during training. To improve the performance of these functions, it is essential to understand their non-linear behavior, in particular concerning their negative parts. In this context, the newer activation functions introduced after the ReLU function exploit negative values to further optimize the gradient descent. In this paper, we propose a new activation function which is based on a trigonometric function and further mitigates the gradient problem, with less computation time than the Mish function. Experiments performed over multiple challenge datasets show that the proposed activation function gives higher test accuracy than both the ReLU and Mish functions in many deep network models.
Khalid Douge, Aissam Berrahou, Youssef Talibi Alaoui, Mohammed Talibi Alaoui
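The abstract does not give the closed form of SINSIG, so the sketch below only illustrates the self-gated family it belongs to (Swish, Mish) plus an illustrative sine-based gate that is explicitly not claimed to be the paper's function.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + e^x).
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):        # x * sigmoid(x): the canonical self-gated form
    return x * sigmoid(x)

def mish(x):         # x * tanh(softplus(x)), the baseline cited in the abstract
    return x * np.tanh(softplus(x))

def sine_gated(x):   # illustrative sine-based gate -- NOT necessarily the paper's
    return x * sigmoid(np.sin(x) + x)   # SINSIG, whose formula is not given here

x = np.linspace(-5, 5, 11)
print(np.round(mish(x), 3))
print(np.round(sine_gated(x), 3))
```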
Spectral Analysis for Automatic Speech Recognition and Enhancement
Abstract
Accurate recognition of noisy speech signals is still an obstacle to wider application of speech recognition technology. The robustness of a speech recognition system is heavily influenced by its ability to handle the presence of background noise. In this work, a Short Time Fourier Transform (STFT) filtering technique for the enhancement and recognition of the speech signal is presented. Conventionally, STFT filtering has been applied in speech analysis. However, in this study the combination of a modified STFT with adaptive window width based on the chirp rate, termed ASTFT, in conjunction with spectrogram features is proposed for optimal speech recognition and enhancement. The LibriSpeech ASR Corpus is the benchmark dataset for this experiment. The spectrum of the enhanced speech signal is estimated using several spectrogram features to obtain a unit peak amplitude. A priori Signal-to-Noise Ratio (SNR) estimation is performed on the modified STFT speech signal, and it achieves an SNR of 31.86 dB, which is considered to be an effectively clean speech signal.
Jane Oruh, Serestina Viriri
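A minimal STFT-based enhancement sketch with SciPy; the paper's adaptive window width (ASTFT) and spectrogram-feature pipeline are not reproduced, the gating threshold is arbitrary, and the signal is a synthetic tone rather than speech.

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16000
t = np.arange(0, 1.0, 1 / fs)
clean = np.sin(2 * np.pi * 440 * t)          # toy stand-in for a speech signal
noisy = clean + 0.3 * np.random.randn(t.size)

# Fixed-window STFT; the paper adapts the window width to the chirp rate.
f, tt, Z = stft(noisy, fs=fs, nperseg=512)
Z[np.abs(Z) < 0.02] = 0.0                    # crude spectral gating
_, enhanced = istft(Z, fs=fs, nperseg=512)

m = min(enhanced.size, clean.size)
snr = lambda s, n: 10 * np.log10(np.sum(s ** 2) / np.sum(n ** 2))
print(f"SNR before: {snr(clean, noisy - clean):.1f} dB, "
      f"after: {snr(clean[:m], enhanced[:m] - clean[:m]):.1f} dB")
```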
Road Sign Identification with Convolutional Neural Network Using TensorFlow
Abstract
With the use and continuous development of deep learning methods, the recognition of images and scenes captured from the real environment has also undergone a major transformation in the techniques and parameters used. In most of these methods, we notice that recognition is based on feature extraction. This paper proposes a classification technique based on convolutional features in the context of Traffic Sign Detection and Recognition (TSDR), which uses an enriched dataset of traffic signs. This solution offers an additional level of assistance to the driver, allowing better safety of passengers, road users, and cars. An experimental evaluation on publicly available scene image datasets with convolutional features yields an accuracy of 94.7% for our classification model.
Mohammed Kherarba, Mounir Tahar Abbes, Selma Boumerdassi, Mohammed Meddah, Abdelhak Benhamada, Mohammed Senouci
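A small TensorFlow/Keras convolutional classifier of the kind described; the input size, class count and random images below are placeholders for the enriched traffic-sign dataset used in the paper.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 43   # hypothetical number of sign classes

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in for cropped sign images; replace with the real dataset.
X = np.random.rand(64, 32, 32, 3).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=64)
model.fit(X, y, epochs=1, verbose=0)
```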
A Semi-automated Approach for Identification of Trends in Android Ransomware Literature
Abstract
Android ransomware is seen in the highlights of cyber security world reports. Ransomware is considered to be the most popular as well as the most threatening mobile malware. These are special malware used to extort money in return for access and data without the user's consent. The exponential growth in mobile transactions, from 9.47 crore in 2013–14 to 72 crore in 2016–17, could be a potential motivation for the numerous ransomware attacks in the recent past. Attackers are consistently working on producing advanced methods to deceive the victim and generate revenue. Therefore, the study of Android stealth malware, its detection and analysis has gained substantial interest among researchers, thereby producing a sufficiently large body of literature in a very short period. Manual reviews do provide insight, but they are prone to bias, time consuming, and challenged by the number of articles that need investigation. This study uses Latent Semantic Analysis (LSA), an information modelling technique, to deduce core research areas, research trends and widely investigated areas within the corpus. This work takes a large corpus of 487 research articles (published during 2009–2019) as input and, as its primary goal, produces three core research areas and thirty emerging research trends in the field of stealth malware. LSA, a semi-automated approach, offers a significant innovation over traditional methods of literature review and has shown great performance in many other research fields such as medicine, supply chain management and OpenStreetMap. The secondary aim of this study is to investigate popular latent topics by mapping core research trends to core research areas. This study also provides prospective future directions for researchers in the field.
Tanya Gera, Jaiteg Singh, Deepak Thakur, Parvez Faruki
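LSA as used here is essentially TF-IDF followed by a truncated SVD; the toy corpus below stands in for the 487 article abstracts, and the number of latent components is illustrative.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy abstracts; the study applies the same kind of pipeline to 487 articles.
docs = [
    "android ransomware detection using permissions and api calls",
    "static analysis of android malware samples",
    "dynamic analysis sandbox for ransomware behaviour",
    "survey of mobile malware detection techniques",
    "machine learning features for ransomware classification",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)

lsa = TruncatedSVD(n_components=3, random_state=0)   # 3 latent "research areas"
lsa.fit(X)

terms = tfidf.get_feature_names_out()
for k, comp in enumerate(lsa.components_):
    top = terms[comp.argsort()[::-1][:4]]
    print(f"topic {k}: {', '.join(top)}")
```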
Towards Machine Learning in Distributed Array DBMS: Networking Considerations
Abstract
Computer networks are the veins of modern distributed systems. Array DBMS (Data Base Management Systems) operate on big data which is naturally modeled as arrays, e.g. Earth remote sensing data and numerical simulation. Big data forces array DBMS to be distributed and to make heavy use of computer networks. The R&D area of array DBMS is relatively young, and machine learning is just paving its way into array DBMS; hence, existing work in this area is rather sparse and only now emerging. This paper considers distributed, large matrix multiplication (LMM) executed directly inside an array DBMS. LMM is the core operation for many machine learning techniques on big data, yet LMM directly inside array DBMS is not well studied or optimized. We present novel LMM approaches for array DBMS and analyze the intricacies of LMM in array DBMS, including execution plan construction and network utilization. We carry out a performance evaluation in the Microsoft Azure Cloud on a network cluster of virtual machines, report insights derived from the experiments, and present our vision for future machine learning R&D directions based on LMM directly inside array DBMS.
Ramon Antonio Rodriges Zalipynis
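The networking intricacy discussed above can be illustrated with chunked (block) matrix multiplication, where only partial accumulators need to be combined across nodes; the sketch below is a single-machine analogue with an arbitrary chunk size, not the paper's execution plans.

```python
import numpy as np

def chunked_matmul(A, B, chunk=256):
    """Block LMM: multiply chunk pairs and accumulate partial products, the way
    a distributed array DBMS would combine per-node partial results."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i in range(0, n, chunk):
        for j in range(0, m, chunk):
            for p in range(0, k, chunk):
                # In a cluster, this partial product is computed where the chunks
                # live and only the (i, j) accumulator moves over the network.
                C[i:i+chunk, j:j+chunk] += A[i:i+chunk, p:p+chunk] @ B[p:p+chunk, j:j+chunk]
    return C

A = np.random.rand(500, 300)
B = np.random.rand(300, 400)
assert np.allclose(chunked_matmul(A, B), A @ B)
```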
Deep Learning Environment Perception and Self-tracking for Autonomous and Connected Vehicles
Abstract
An Autonomous and Connected Vehicle (CAV) is an intelligent vehicle that is capable of moving and making its own decisions without the assistance of a human driver while ensuring communication with its environment. CAVs will not only change the way we travel; their deployment will also have an impact on the evolution of society in terms of safety, environment and urban planning. In the automotive industry, researchers and developers are actively pushing approaches based on artificial intelligence, in particular deep learning, to enhance autonomous driving. However, before an autonomous vehicle finds its way onto the road, it must first overcome a set of challenges regarding functional safety and driving efficiency.
This paper proposes an autonomous driving approach based on deep learning and computer vision that guarantees the basic driving functions, the communication between the vehicle and its environment, obstacle detection and traffic sign identification. The obtained results show the effectiveness of the environment perception, the lane tracking and the appropriate decision making.
Ihab Benamer, Arslane Yahiouche, Afifa Ghenai
Remote Sensing Scene Classification Based on Effective Feature Learning by Deep Residual Networks
Abstract
Remote sensing image scene interpretation has many applications in land use and land cover analysis, thanks to many satellite technology innovations that periodically generate high-quality images for analysis and interpretation through computer vision techniques. In recent literature, deep learning techniques have been demonstrated to be effective in image feature learning, thus aiding several computer vision applications for land use and land cover. However, most deep learning techniques suffer from problems such as vanishing gradients and network over-fitting, among other challenges, which different works in the literature have attempted to address from varying perspectives. The goal of machine learning in remote sensing is to learn image feature patterns extracted by computer vision techniques for scene classification tasks. Many applications that utilize remote sensing data are on the surge; these include aerial surveillance and security and smart farming, among others. These applications require satellite image information to be processed effectively and reliably for appropriate responses. This research proposes a deep residual feature learning network that is effective in image feature learning and can be utilized in a networked environment for appropriate decision-making processes. The proposed strategy utilizes short-cut connections and mapping functions for deep feature learning. The proposed technique is evaluated on two publicly available remote sensing datasets and attains superior classification accuracy results of 96.30% and 92.56% on the Ucmerced and Whu-siri datasets, respectively, improving the state of the art significantly.
Ronald Tombe, Serestina Viriri
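A minimal residual (short-cut) block in Keras, illustrating the mapping-plus-identity structure the abstract relies on for deep feature learning; the input size, depth and the 21-class head (e.g. for UCMerced) are illustrative, not the paper's exact network.

```python
import tensorflow as tf

def residual_block(x, filters):
    """Identity shortcut: the block learns a residual mapping F(x) and adds it
    back to x, which eases gradient flow in deep feature learning."""
    y = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = tf.keras.layers.Conv2D(filters, 3, padding="same")(y)
    out = tf.keras.layers.Add()([x, y])          # short-cut connection
    return tf.keras.layers.Activation("relu")(out)

inputs = tf.keras.Input(shape=(64, 64, 32))
x = residual_block(inputs, 32)
x = residual_block(x, 32)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(21, activation="softmax")(x)  # e.g. 21 UCMerced classes
model = tf.keras.Model(inputs, outputs)
model.summary()
```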
Identifying Device Types for Anomaly Detection in IoT
Abstract
With the advances in Internet of Things (IoT) technologies, more and more smart sensors and devices are connected to the Internet. Since the original idea of smart devices is better connection with each other, only very limited security mechanisms have been designed in. Due to the diverse behaviors of various types of devices, it would be costly to manually design a separate security mechanism for each. To protect these devices from potential threats, it would be helpful if we could learn the characteristics of diverse device types based on the network packets they generate. In this paper, we propose a machine learning approach to device type identification through network traffic analysis for anomaly detection in IoT. First, characteristics of different types of IoT devices are extracted from the generated network packets and learned using unsupervised and supervised learning methods. Second, we apply feature selection methods to the model learned by the device type identification module to improve the performance of classification. In our experiments, the performance of device type identification on real data from a smart factory using supervised learning is better than with unsupervised learning. The best performance is achieved by XGBoost, with an accuracy of 97.6% and a micro-averaged F1 score of 97.6%. This shows the potential of the proposed approach for automatically identifying devices for anomaly detection in smart factories. Further investigation is needed to verify the proposed approach with more types of devices.
Chin-Wei Tien, Tse-Yung Huang, Ping Chun Chen, Jenq-Haur Wang
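A sketch of the supervised identification step with gradient-boosted trees (requires the xgboost package); the per-device traffic features and device types below are hypothetical, and the feature-selection step of the paper is omitted.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier   # pip install xgboost

rng = np.random.default_rng(0)
N_TYPES = 4   # hypothetical device types in the factory

# Hypothetical per-device traffic features (e.g. mean packet size, packet rate,
# share of TCP traffic, number of distinct destination ports).
labels = rng.integers(0, N_TYPES, size=2000)
centers = rng.uniform(0, 3, size=(N_TYPES, 4))
X = centers[labels] + 0.5 * rng.normal(size=(2000, 4))

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=0)
clf = XGBClassifier(n_estimators=200, max_depth=4)
clf.fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```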
A Novel Heuristic Optimization Algorithm for Solving the Delay-Constrained Least-Cost Problem
Abstract
Quality-of-service (QoS) multicast routing is an NP-hard multi-objective optimization problem. It is one of the main issues of transmission in communication networks, consisting of concurrently sending the same information from a source to a subset of all possible destinations in a computer network. Thus, it has become an important communication technology. A current approach for efficiently supporting a multicast session in a network is to establish a multicast tree that covers the source and all terminal nodes. This problem can be reduced to the minimum Steiner tree problem (MST), which aims to find a tree that covers a set of nodes with minimum total cost and has been proven to be NP-complete. In this paper, we propose a novel algorithm based on the greedy randomized adaptive search procedure (GRASP) for the Delay-Constrained Least-Cost problem. Organized into a construction phase and an improvement phase, the proposed algorithm differs in the construction phase through the use of a new method called the EB heuristic, which was first applied to improve the KMB heuristic for solving the Steiner tree problem. The obtained solutions are then improved using the tabu search method in the improvement phase. Computational experiments on medium-sized problems (50–100 nodes) from the literature show that the proposed metaheuristic gives competitive results in terms of cost and delay compared to results reported in the literature.
Amina Boudjelida, Ali Lemouari
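A generic GRASP skeleton, shown on a toy routing instance rather than the paper's DCLC/Steiner construction: a restricted-candidate-list construction phase followed by a local-search improvement phase (2-opt here, where the paper uses tabu search). The instance, alpha value and iteration count are arbitrary.

```python
import random

def cost(tour, dist):
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))

def grasp(dist, iters=50, alpha=0.3, seed=0):
    """Greedy randomized construction + local-search improvement, keep the best."""
    rng = random.Random(seed)
    n = len(dist)
    best, best_cost = None, float("inf")
    for _ in range(iters):
        # Construction phase: restricted candidate list (RCL) of near-greedy moves.
        tour = [0]
        while len(tour) < n:
            cand = sorted((d for d in range(n) if d not in tour),
                          key=lambda d: dist[tour[-1]][d])
            rcl = cand[:max(1, int(alpha * len(cand)))]
            tour.append(rng.choice(rcl))
        # Improvement phase: 2-opt local search (the paper uses tabu search here).
        improved = True
        while improved:
            improved = False
            for i in range(1, n - 1):
                for j in range(i + 1, n):
                    new = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                    if cost(new, dist) < cost(tour, dist):
                        tour, improved = new, True
        if cost(tour, dist) < best_cost:
            best, best_cost = tour, cost(tour, dist)
    return best, best_cost

random.seed(1)
pts = [(random.random(), random.random()) for _ in range(12)]
dist = [[((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5 for bx, by in pts] for ax, ay in pts]
print(grasp(dist))
```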
Terms Extraction from Clustered Web Search Results
Abstract
With the significant increase in web content, users cannot easily find the exact information they need, so they are forced to read many search results sequentially until they reach what they want. In this paper, we propose an approach to extract terms that represent sets of search results. These terms help users access what they are looking for quickly and easily. Experimental results show that our proposed approach efficiently extracts terms that represent these sets.
Chouaib Bourahla, Ramdane Maamri, Zaidi Sahnoun, Nardjes Bouchemal
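One common way to realize this idea (a generic sketch, not necessarily the authors' extraction method): cluster the TF-IDF vectors of the search results and report the highest-weighted terms of each cluster centroid. The toy result list and cluster count below are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy search results; real input would be the snippets returned by a web search.
results = [
    "python machine learning tutorial with scikit-learn",
    "deep learning course notes and exercises",
    "best hiking trails in the french alps",
    "alpine hiking gear checklist for beginners",
    "introduction to neural networks in python",
    "mountain weather forecast for hikers",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(results)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

terms = vec.get_feature_names_out()
for c in range(2):
    # Rank terms by their weight in the cluster centroid.
    centroid = km.cluster_centers_[c]
    top = terms[centroid.argsort()[::-1][:3]]
    print(f"cluster {c}: {', '.join(top)}")
```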
Backmatter
Metadata
Title
Machine Learning for Networking
edited by
Éric Renault
Selma Boumerdassi
Paul Mühlethaler
Copyright Year
2021
Electronic ISBN
978-3-030-70866-5
Print ISBN
978-3-030-70865-8
DOI
https://doi.org/10.1007/978-3-030-70866-5