nach oben

2013 | Buch

Kapitel lesen Erstes Kapitel lesen

Information Sciences and Systems 2013

Proceedings of the 28th International Symposium on Computer and Information Sciences

herausgegeben von: Erol Gelenbe, Ricardo Lent

Verlag: Springer International Publishing

Buchreihe : Lecture Notes in Electrical Engineering

Enthalten in: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

Einloggen, um Zugang zu erhalten

Über dieses Buch

Based on a rigorous selection from 58 proposals coming from across the world, this volume will include some of the most recent ideas and technical results in computer systems, computer science, and computer-communication networks. The book will offer the reader with a timely access to innovative research from many different areas of the world where advances in computing and communications are created.

Inhaltsverzeichnis

Frontmatter

Smart Algorithms

Frontmatter

Adaptive Curve Tailoring

This paper deals with the problem of finding an average of several curves subject to qualitative constraints and restrictions on the curves. The unknown average curve is the solution of a weighted least squares problem involving the deviation between the given curves and the unknown curve. The qualitative constraints are that some curves are preferred compared to other curves. The qualitative information is converted into constraints on the weights in the least squares problem defining the average curve. The model defining the curves is parameterized and restrictions on the curves are defined in terms of restrictions on the parameters. We give an example where the curves determined from three data sets are required to be monotone and convex. We also show that one curve being preferred restricts the set of possible curves that can be an average curve.

Trond Steihaug, Wenli Wang

Regularizing Soft Decision Trees

Recently, we have proposed a new decision tree family called soft decision trees where a node chooses both its left and right children with different probabilities as given by a gating function, different from a hard decision node which chooses one of the two. In this paper, we extend the original algorithm by introducing local dimension reduction via

$$L_1$$

and

$$L_2$$

regularization for feature selection and smoother fitting. We compare our novel approach with the standard decision tree algorithms over 27 classification data sets. We see that both regularized versions have similar generalization ability with less complexity in terms of number of nodes, where

$$L_2$$

seems to work slightly better than

$$L_1$$

Olcay Taner Yıldız, Ethem Alpaydın

A Simple Yet Fast Algorithm for the Closest-Pair Problem Using Sorted Projections on Multi-Dimensions

We present a simple greedy algorithm,

QuickCP

, for finding the closest-pair of points on a plane. It is based on the observation that if two points are close to each other, then it is very likely that their sorted projections to x-axis and/or to y-axis will reflect that closeness. Therefore we order our search starting from the pairs with closest x-projections (and closest y-projections) to find the closest pair quickly and make a fast exit. Empirical data up to

$$2^{26}$$

(over 60 million points) indicates that this approach quickly detects the closest pair and it is, on average 70 % faster than the classical divide-and-conquer algorithm registering, to the best of our knowledge, the fastest recorded solution times in the literature for the planar closest-pair problem. A second contribution of our work is that

QuickCP

can be used as part of the divide-and-conquer algorithms to handle small sub-problems. Measurements on data up to

$$2^{26}$$

points show that this co-operation speeds up the basic divide-and-conquer algorithm on average 50 %.

Mehmet E. Dalkılıç, Serkan Ergun

DARWIN: A Genetic Algorithm Language

This article describes the DARWIN Project, which is a Genetic Algorithm programming language and its C Cross-Compiler. The primary aim of this project is to facilitate experimentation of Genetic Algorithm solution representations, operators and parameters by requiring just a minimal set of definitions and automatically generating most of the program code. The syntax of the DARWIN language and an implementational overview of the the cross-compiler will be presented. It is assumed that the reader is familiar with Genetic Algorithms, Programming Languages and Compilers.

Arslan Arslan, Göktürk Üçoluk

Distributed Selfish Algorithms for the Max-Cut Game

The Max-Cut problem consists in splitting in two parts the set of vertices of a given graph so as to maximize the sum of weights of the edges crossing the partition. We here address the problem of computing locally maximum cuts in general undirected graphs in a distributed manner. To achieve this, we interpret these cuts as the pure Nash equilibria of a

-player non-zero sum game, each vertex being an agent trying to maximize her selfish interest. A distributed algorithm can then be viewed as the choice of a policy for every agent, describing how to adapt her strategy to other agents’ decisions during a repeated play. In our setting, the only information available to any vertex is the number of its incident edges that cross, or do not cross the cut. In the general, weighted case, computing such an equilibrium can be shown to be PLS-complete, as it is often the case for potential games. We here focus on the (polynomial) unweighted case, but with the additional restriction that algorithms have to be distributed as described above. First, we describe a simple distributed algorithm for general graphs, and prove that it reaches a locally maximum cut in expected time

$$4\Delta |E|$$

, where

$$E$$

is the set of edges and

$$\Delta $$

its maximal degree. We then turn to the case of the complete graph, where we prove that a slight variation of this algorithm reaches a locally maximum cut in expected time

$$O(\log \log n)$$

. We conclude by giving experimental results for general graphs.

D. Auger, J. Cohen, P. Coucheney, L. Rodier

Analysis, Modelling and Optimisation

Frontmatter

Distributed Binary Consensus in Dynamic Networks

Motivated by the distributed binary interval consensus and the results on its convergence time we propose a distributed binary consensus algorithm which targets the shortfall of consensus algorithms when it comes to dynamic networks. We show that using our proposed algorithm nodes can join and leave the network at any time and the consensus result would always stay correct i.e. the consensus would always be based on the majority of the nodes which are currently present in the network. We then analyse our algorithm for the case of complete graphs and prove that the extra time it takes for nodes to implement our algorithm (to cope with the dynamic setting) does not depend on the size of the network and only depends on the voting margin. Our results are especially of interest in wireless sensor networks where nodes leave or join the network.

Arta Babaee, Moez Draief

Computing Bounds of the MTTF for a Set of Markov Chains

We present an algorithm to find some upper and lower bounds of the Mean Time To Failure (MTTF) for a set of absorbing Discrete Time Markov Chains (DTMC). We first present a link between the MTTF of an absorbing chain and the steady-state distribution of an ergodic DTMC derived from the absorbing one. The proposed algorithm is based on the polyhedral theory developed by Courtois and Semal and on a new iterative algorithm which gives bounds of the steady-state distribution of the associated ergodic DTMC at each iteration.

F. Aït-Salaht, J. M. Fourneau, N. Pekergin

Analysing and Predicting Patient Arrival Times

We fit a Hidden Markov Model (HMM) to patient arrivals data, represented as a discrete data trace. The processing of the data trace makes use of a simple binning technique, followed by clustering, before it is input into the Baum-Welch algorithm, which estimates the parameters of the underlying Markov chain’s state-transition matrix. Upon convergence, the HMM predicts its own synthetic traces of patient arrivals, behaving as a fluid input model. The Viterbi algorithm then decodes the hidden states of the HMM, further explaining the varying rate of patient arrivals at different times of the hospital schedule. The HMM is validated by comparing means, standard deviations and autocorrelation functions of raw and synthetic traces. Finally, we explore an efficient optimal parameter initialization for the HMM, including choosing the number of hidden states. We summarize our findings, comparing results with other work in the field, and proposals for future work.

Tiberiu Chis, Peter G. Harrison

Optimal Behaviour of Smart Wireless Users

We consider a smart or cognitive user (CU) that operates as a secondary user of a cognitive channel. Before transmission, the CU samples the channel until it estimates that it can be accessed successfully. When the CU transmits a packet, it may nevertheless be unsuccessful because its estimate was wrong. The CU then knows about its failure and stops the ongoing transmission sometime before the transmission ends. Then the CU restarts sensing the channel. In this paper we analyse the total delay experienced by a CU. We derive the first and second moments of the effective transmission time of a packet sent by such a smart user, in view of the fact that the sensing and errors made in transmitting a packet when the channel is actually unavailable, will introduce additional delay. This is similar to a “vacation time” in queues, though it differs from conventional vacation time models because such “sampling vacations” can be beneficial to the future transmission.

Boris Oklander, Erol Gelenbe

Hyper-Heuristics for Performance Optimization of Simultaneous Multithreaded Processors

In Simultaneous Multi-Threaded (SMT) processor datapaths, there are many datapath resources that are shared by multiple threads. Currently, there are a few heuristics that distribute these resources among threads for better performance. A selection hyper-heuristic is a search method which mixes a fixed set of heuristics to exploit their strengths while solving a given problem. In this study, we propose learning selection hyper-heuristics for predicting, choosing and running the best performing heuristic. Our initial test results show that hyper-heuristics may improve the performance of the studied workloads by around 2%, on the average. The peak performance improvement is observed to be 41% over the best performing heuristic, and more than 15% over all heuristics that are studied. Our best hyper-heuristic performs better than the state-of-the art heuristics on almost 60% of the simulated workloads.

I. A. Güney, Gürhan Küçük, Ender Özcan

A Model of Speculative Parallel Scheduling in Networks of Unreliable Sensors

As systems scale up, their mean-time-to-failure reduces drastically. We consider parallel servers subject to permanent failures but such that only one needs to survive in order to execute a given task. This kind of failure-model is appropriate in at least two types of systems: systems in which repair cannot take place (e.g. spacecraft) and systems that have strict deadlines (e.g. navigation systems). We use multiple replicas to perform the same task in order to improve the reliability of systems. The server in the system is subject to failure while it is on and the time to failure is memoryless, i.e. exponentially distributed. We derive expressions for the Laplace transform of the sojourn time distribution of a tagged task, jointly with the probability that the tagged task completes service, for a network of one or more parallel servers with exponential service times and times to failure.

Zhan Qiu, Peter G. Harrison

Energy-Aware Admission Control for Wired Networks

Overprovisioning and redundancy has contributed towards better survivability and performance in networks, but has led to inefficient use of energy. Proposals for energy aware networks of the near future aim to reduce the energy consumption by switching off or putting to sleep individual network devices. Here we propose a mechanism that is taking this concept once step further through the use of admission control. Admission control has been traditionally used in wired networks to control traffic congestion and guarantee quality of service. We propose a two-fold approach. First, an admission control mechanism delays the users that are projected to be the most energy demanding, and whose acceptance would require the turning on of devices. At the same time, an auto-hibernation mechanism regulates the rate at which machines are turned off due to inactivity. Collectively, the two mechanisms contribute towards energy saving by monitoring both at the level of entry in the network and at the level of active operation.

Christina Morfopoulou, Georgia Sakellari, Erol Gelenbe

Computational Linguistics

Frontmatter

Named Entity Recognition in Turkish with Bayesian Learning and Hybrid Approaches

Named entity recognition is one of the significant textual information extraction tasks. In this paper, we present two approaches for named entity recognition on Turkish texts. The first is a Bayesian learning approach which is trained on a considerably limited training set. The second approach comprises two hybrid systems based on joint utilization of this Bayesian learning approach and a previously proposed rule-based named entity recognizer. All of the proposed three approaches achieve promising performance rates. This paper is significant as it reports the first use of the Bayesian approach for the task of named entity recognition on Turkish texts for which especially practical approaches are still insufficient.

Sermet Reha Yavuz, Dilek Küçük, Adnan Yazıcı

Transfer Learning Using Twitter Data for Improving Sentiment Classification of Turkish Political News

In this paper, we aim to determine the overall sentiment classification of Turkish political columns. That is, our goal is to determine whether the whole document has positive or negative opinion regardless of its subject. In order to enhance the performance of the classification, transfer learning is applied from unlabeled Twitter data to labeled political columns. A variation of self-taught learning has been proposed, and implemented for the classification. Different machine learning techniques, including support vector machine, maximum entropy classification, and Naive-Bayes has been used for the supervised learning phase. In our experiments we have obtained up to 26 % increase in the accuracy of the classification with the inclusion of the Twitter data into the sentiment classification of Turkish political columns using transfer learning.

Mesut Kaya, Guven Fidan, I Hakkı Toroslu

A Fully Semantic Approach to Large Scale Text Categorization

Text categorization is usually performed by supervised algorithms on the large amount of hand-labelled documents which are labor-intensive and often not available. To avoid this drawback, this paper proposes a text categorization approach that is designed to fully exploiting semantic resources. It employs the ontological knowledge not only as lexical support for disambiguating terms and deriving their sense inventory, but also to classify documents in topic categories. Specifically, our work relates to apply two corpus-based thesauri (i.e. WordNet and WordNet Domains) for selecting the correct sense of words in a document while utilizing domain names for classification purposes. Experiments presented show how our approach performs well in classifying a large corpus of documents. A key part of the paper is the discussion of important aspects related to the use of surrounding words and different methods for word sense disambiguation.

Nicoletta Dessì, Stefania Dessì, Barbara Pes

Emotion Analysis on Turkish Texts

Automatically analyzing the user’s emotion from his/her texts has been gaining interest as a research field. Emotion classification of English texts is studied by several researchers and promising results have been achieved. In this work, an emotion classification study on Turkish texts is presented. To the best of our knowledge, this is the first study conducted on emotion classification for Turkish texts. Due to the nature of Turkish language, several pruning tasks are applied and new features are constructed in order to improve the emotion classification accuracy. We compared the performance of several classification algorithms for emotion analysis and reported the results.

Z. Boynukalin, P. Karagoz

A Comparative Study to Determine the Effective Window Size of Turkish Word Sense Disambiguation Systems

In this paper, the effect of different windowing schemes on word sense disambiguation accuracy is presented. Turkish Lexical Sample Dataset has been used in the experiments. We took the samples of ambiguous verbs and nouns of the dataset and used bag-of-word properties as context information. The experi-ments have been repeated for different window sizes based on several machine learning algorithms. We follow 2/3 splitting strategy (2/3 for training, 1/3 for test-ing) and determine the most frequently used words in the training part. After re-moving stop words, we repeated the experiments by using most frequent 100, 75, 50 and 25 content words of the training data. Our findings show that the usage of most frequent 75 words as features improves the accuracy in results for Turkish verbs. Similar results have been obtained for Turkish nouns when we use the most frequent 100 words of the training set. Considering this information, selected al-gorithms have been tested on varying window sizes {30, 15, 10 and 5}. Our find-ings show that Naïve Bayes and Functional Tree methods yielded better accuracy results. And the window size

$$\pm $$

5 gives the best average results both for noun and the verb groups. It is observed that the best results of the two groups are 65.8 and 56 % points above the most frequent sense baseline of the verb and noun groups respectively.

Bahar İlgen, Eşref Adalı, A. Cüneyd Tantuğ

Computer Vision

Frontmatter

Eyes Detection Combined Feature Extraction and Mouth Information

In this paper, we propose an eye localization approach combining facial feature extraction and mouth information in color image. This approach includes feature extraction, face mask label, mouth location and eye region detection. In order to quickly obtain facial information, the coarse facial region is first localized by V-J filter, and then the compact facial region by our proposed filter can be decide and non-face region will be filtered out. Next, we label a face mask to modify skin region, extract the facial features and locate mouth information. Finally, the position of two eyes is detected based on the prior information. We present the experimental results on the various cases, including profile face, frontal face, and whether glasses or not, to demonstrate the effectiveness of our method.

Hui-Yu Huang, Yan-Ching Lin

Depth from Moving Apertures

Two new focus measure operators for Shape From Focus to estimate the three-dimensional shape of a surface are proposed in this paper. The images are formed by a camera using moving apertures. The well-focused image pixels are identified by frequency analysis or matching with the all-focused image of the scene. Frequency analysis involves the usage of classical focus measure operators and summing up for each aperture to find the focus quality to be maximized. The lesser method uses the match-points and error ratios of any matching algorithm used within an all-focused image region and all same-focus-different-aperture images to find the total displacement. The inverse of total displacement can be used as a focus quality measure. The experiments on real images show that the introduced ideas work effectively and efficiently.

Mahmut Salih Sayar , Yusuf Sinan Akgül

Score Level Fusion for Face-Iris Multimodal Biometric System

A high performance face-iris multimodal biometric system based on score level fusion techniques is presented in this paper. The aim of multimodal biometric systems is to improve the recognition performance by fusing information from more than one physical and/or behavioral characteristics of a person. This paper focuses on combining the strengths of face and iris modalities by employing the optimization method particle swarm optimization (PSO) to select facial features and the well known matching score level fusion technique, Weighted-Sum Rule, in order to obtain better recognition accuracy. Prior to performing fusion of face and iris modalities, standard feature extraction methods on face and iris images are employed separately. The unimodal and multimodal systems are experimented on different subsets of FERET, BANCA, and UBIRIS databases. Evaluation of the overall results based on recognition performance and ROC analysis demonstrates that the proposed multimodal biometric system achieves improved results compared to unimodal and other multimodal systems.

Maryam Eskandari, Önsen Toygar

Feature Selection for Enhanced 3D Facial Expression Recognition Based on Varying Feature Point Distances

Face is the most dynamic part of the human body which comprises information about the feelings of people with facial expressions. In this paper, we propose a novel feature selection procedure applied to 3-Dimensional (3D) geometrical facial feature points selected from MPEG-4 Facial Definition Parameters (FDPs) in order to achieve robust classification performance. Distances between 3D feature point pairs are used to describe a facial expression. Support Vector Machine (SVM) is employed as the classifier. The system is tested on 3D facial expression database BU-3DFE and shows significant improvements with the proposed feature selection algorithm.

Kamil Yurtkan, Hamit Soyel, Hasan Demirel

Data and Web Engineering

Frontmatter

DAPNA: An Architectural Framework for Data Processing Networks

A data processing network is as a set of (software) components connected through communication channels to apply a series of operations on data. Realization and maintenance of large-scale data processing networks necessitate an architectural approach that supports analysis, verification, implementation and reuse. However, existing tools and architectural styles fall short to support all these features. In this paper, we introduce an architectural style and framework for documenting and realizing data processing networks. Our framework employs reusable and composable data filters. These filters are annotated with their deployment information. The overall architecture is specified with an XML-based architecture description language. The specification is processed by a toolset for analysis and code generation. The framework has been utilized for defining and realizing an environmental monitoring application.

Hasan Sözer, Sander Nouta, Andreas Wombacher, Paolo Perona

Crescent: A Byzantine Fault Tolerant Delivery Framework for Durable Composite Web Services

Composite web service delivery is a very complex process that involves many complex tasks such as capacity management, components discovery, provisioning, monitoring, composition and coordination, customers’ SLAs management, moreover it requires cancellation and billing management. Indeed, managing all these tasks manually is a very cumbersome operation, nevertheless it is time consuming and prone to errors. To overcome such problems, this paper proposes Crescent; a Byzantine fault tolerant service delivery framework for durable composite web services. Crescent ensures the full automation of the composite web service delivery process. Furthermore, Crescent combines between quorum-based and practical Byzantine fault tolerance protocols as well as components adaptive parallel provisioning approaches to ensure reliable Byzantine fault tolerant service delivery. Crescent enables customers to have differentiated levels of service by allowing the composite web service to support different types of workflows. Experimental results showed that Crescent increases the reliability and throughput of composite web service delivery when compared with existing approaches.

Islam Elgedawy

Morphological Document Recovery in HSI Space

Old documents frequently appear with digitalization errors, uneven background, bleed-through effect etc... Motivated by the challenge to improve printed and handwritten text, we developed a new approach based on morphological color operators using

HSI

color space. Our approach is composed of a morphological background estimation for foreground/background separation and text segmentation, a background smoothing and a color text recovery. Experimental results carried onto ancient documents have proven that

ISH

lexicographic order is the most effective to estimate the background and recover ancient texts in uneven and foxed background images.

Ederson Marcos Sgarbi, Wellington Aparecido Della Mura, Nikolas Moya, Jacques Facon

Ontological Approach to Data Warehouse Source Integration

In the early stages of data warehouse design, the integration of several source databases must be addressed. Data-oriented and hybrid methodologies need to consider a global schema coming from the integration of source databases, in order to start the conceptual design. Since each database relies on its own conceptual schema, in the integration process a reconciliation phase is necessary, in order to solve syntactical and/or semantic inconsistencies among concepts. In this paper, we present an ontology-based approach to perform the integration of different conceptual schemas automatically.

Francesco Di Tria, Ezio Lefons, Filippo Tangorra

Adaptive Oversampling for Imbalanced Data Classification

Data imbalance is known to significantly hinder the generalization performance of supervised learning algorithms. A common strategy to overcome this challenge is synthetic oversampling, where synthetic minority class examples are generated to balance the distribution between the examples of the majority and minority classes. We present a novel adaptive oversampling algorithm,

Virtual

, that combines the benefits of oversampling and active learning. Unlike traditional resampling methods which require preprocessing of the data,

Virtual

generates synthetic examples for the minority class during the training process, therefore it removes the need for an extra preprocessing stage. In the context of learning with Support Vector Machines, we demonstrate that

Virtual

outperforms competitive oversampling techniques both in terms of generalization performance and computational complexity.

Şeyda Ertekin

Wireless Sensor Networks

Frontmatter

Energy-Aware Distributed Hash Table-Based Bootstrapping Protocol for Randomly Deployed Heterogeneous Wireless Sensor Networks

Distributed Hash Table (DHT)-based protocols having the structure of a ring, are promotive solutions to randomly deployed location unaware Wireless Sensor Networks (WSN). These protocols are able to ensure routing and data management efficiently and independently from any location information. However, their bootstrapping phase is challenging since each node should be well positioned into a global virtual ring at the same time. Simultaneous nodes joining leads to multiple inconsistencies. Moreover, existing DHT-based bootstrapping protocols do not take into account nodes heterogeneity whereas most of WSN are nowadays heterogeneous. In this paper, we propose an energy-aware bootstrapping protocol that not only organizes directly the nodes into a global consistent virtual ring but also allows it to build an energy efficient backbone for routing and data management. Simulation results show that our bootstrapping protocol improves drastically the DHT-based routing protocol performance and extends the network lifetime.

Ghofrane Fersi, Wassef Louati, Maher Ben Jemaa

Sensor-Activity Relevance in Human Activity Recognition with Wearable Motion Sensors and Mutual Information Criterion

Selecting a suitable sensor configuration is an important aspect of recognizing human activities with wearable motion sensors. This problem encompasses selecting the number and type of the sensors, configuring them on the human body, and identifying the most informative sensor axes. In earlier work, researchers have used customized sensor configurations and compared their activity recognition rates with those of others. However, the results of these comparisons are dependent on the feature sets and the classifiers employed. In this study, we propose a novel approach that utilizes the time-domain distributions of the raw sensor measurements. We determine the most informative sensor types (among accelerometers, gyroscopes, and magnetometers), sensor locations (among torso, arms, and legs), and measurement axes (among three perpendicular coordinate axes at each sensor) based on the mutual information criterion.

Oğuzcan Dobrucalı, Billur Barshan

Routing Emergency Evacuees with Cognitive Packet Networks

Providing optimal and safe routes to evacuees in emergency situations requires fast and adaptive algorithms. The common approaches are often too slow to converge, too complex, or only focus on one aspect of the problem, e.g. finding the shortest path. This paper presents an adaptation of the Cognitive Packet Network (CPN) concept to emergency evacuation problems. Using Neural Networks, CPN is able to rapidly explore a network and allocate overhead in proportion to the perceived likelihood of finding an optimal path there. CPN is also flexible, as it can operate with any user-defined cost function, such as congestion, path length, safety, or even compound metrics. We compare CPN with optimal algorithms such as Dijkstra’s Shortest Path using a discrete-event emergency evacuation simulator. Our experiments show that CPN reaches the performance of optimal path-finding algorithms. The resulting side-effect of such smart or optimal algorithms is in the greater congestion that is encountered along the safer paths; therefore we indicate how the quality of service objective used by CPN can also be used to avoid congestion for further improvements in evacuee exit times.

Huibo Bi, Antoine Desmet, Erol Gelenbe

Detection and Evaluation of Physical Therapy Exercises by Dynamic Time Warping Using Wearable Motion Sensor Units

We develop an autonomous system that detects and evaluates physical therapy exercises using wearable motion sensors. We propose an algorithm that detects all the occurrences of one or more template signals (representing exercise movements) in a long signal acquired during a physical therapy session. In matching the signals, the algorithm allows some distortion in time, based on dynamic time warping (DTW). The algorithm classifies the executions in one of the exercises and evaluates them as correct/incorrect, giving the error type if there is any. It also provides a quantitative measure of similarity between each matched execution and its template. To evaluate the performance of the algorithm in physical therapy, a dataset consisting of one template execution and ten test executions of each of the three execution types of eight exercises performed by five subjects is recorded, having a total of 120 and 1,200 exercise executions in the training and test sets, respectively, as well as many idle time intervals in the test signals. The proposed algorithm detects 1,125 executions in the whole test set. 8.58 % of the 1,200 executions are missed and 4.91 % of the idle time intervals are incorrectly detected as executions. The accuracy is 93.46 % only for exercise classification and 88.65 % for simultaneous exercise and execution type classification. The proposed system may be used for both estimating the intensity of the physical therapy session and evaluating the executions to provide feedback to the patient and the specialist.

Aras Yurtman, Billur Barshan

Network Security, Data Integrity and Privacy

Frontmatter

Commutative Matrix-based Diffie-Hellman-Like Key-Exchange Protocol

A matrix-based Diffie-Hellman-like key exchange protocol and utilizing it secure key-exchange protocol similar to HMQV are proposed. The proposed key exchange protocol uses matrix multiplication operation only; it does not rely on the complexity of the discrete logarithm problem contrary to the prototype and its known variants. Two-way arrival at the common key, similar to that employed in the Diffie-Hellman protocol, is provided by specially constructed commutative matrices. The trap-door property ensuring the proposed protocol security is based on exploiting of a non-invertible public matrix in the key generating process.

Alexander G. Chefranov, Ahmed Y. Mahmoud

Anonymity in Multi-Instance Micro-Data Publication

In this paper we study the problem of anonymity in multi-instance (MI) micro-data publication. The classical k-anonymity approach is shown to be insufficient and/or inappropriate for MI databases. Thus, it is extended to MI databases, resulting in a more general setting of MI k-anonymity. We show that MI k-anonymity problem is NP-Hard and the attack model for MI databases is different from that of single-instance databases. We make an observation that the introduced MI k-anonymity is not a strong privacy guarantee when anonymity sets are highly unbalanced with respect to instance counts. To this end a new anonymity principle, called p-certainty, which is unique to MI case is introduced. A clustering algorithms solving the p-certainty anonymity principle is developed and experimentally evaluated.

Osman Abul

Homomorphic Minimum Bandwidth Repairing Codes

To store data reliably, a number of coding schemes including Exact-Minimum Bandwidth Regenerating codes (exact-MBR) and Homomorphic Self Repairing Codes (HSRC) exist. Exact-MBR offers minimum bandwidth usage whereas HSRC has low computational overhead in node repair. We propose a new hybrid scheme, Homomorphic Minimum Bandwidth Repairing Codes, derived from the above coding schemes. Our coding scheme provides two options for node repair operation. The first option offers to repair a node using minimum bandwidth and higher computational complexity while the second one repairs a node using fewer nodes, lower computational complexity and higher bandwidth. In addition, our scheme introduces a basic integrity checking mechanism.

Elif Haytaoglu, Mehmet Emin Dalkilic

Recreating a Large-Scale BGP Incident in a Realistic Environment

The Internet has become a critical asset for both the economy and the society. Realistic experimentation environments are needed to study and improve the resilience and the stability of the Internet. In this paper, we propose a methodology that allows to: (1) model an Internet-like topology, (2) recreate the model with realistic parameters and conditions, (3) reproduce large-scale incidents, and (4) test various what-if scenarios. As a prood of concept, a valid abstraction of the Europe Internet backbone is created where Network Service Providers (NSP) are connected to each other in various Internet Exchange Points (IXP). This topology is emulated on a Emulab-based testbed. A well-known BGP-route hijacking incident is replayed and studied under hypothetical scenarios of network operators reactions and collaboration. The results of the experiments are then analysed showing the potential value of the proposed methodology.

Enis Karaarslan, Andres Garcia Perez, Christos Siaterlis

Uneven Key Pre-Distribution Scheme for Multi-Phase Wireless Sensor Networks

In multi-phase Wireless Sensor Networks (WSNs), sensor nodes are redeployed periodically to replace nodes whose batteries are depleted. In order to keep the network resilient against node capture attacks across different deployment epochs, called

generations

, it is necessary to refresh the key pools from which cryptographic keys are distributed. In this paper, we propose Uneven Key Pre-distribution (UKP) scheme that uses multiple different key pools at each generation. Our UKP scheme provides self healing that improves the resiliency of the network at a higher level as compared to an existing scheme in the literature. Moreover, our scheme provides perfect local and global connectivity. We conduct our simulations in mobile environment to see how our scheme performs under more realistic scenarios.

Onur Catakoglu, Albert Levi

NEMESYS: Enhanced Network Security for Seamless Service Provisioning in the Smart Mobile Ecosystem

As a consequence of the growing popularity of smart mobile devices, mobile malware is clearly on the rise, with attackers targeting valuable user information and exploiting vulnerabilities of the mobile ecosystems. With the emergence of large-scale mobile botnets, smartphones can also be used to launch attacks on mobile networks. The NEMESYS project will develop novel security technologies for seamless service provisioning in the smart mobile ecosystem, and improve mobile network security through better understanding of the threat landscape. NEMESYS will gather and analyze information about the nature of cyber-attacks targeting mobile users and the mobile network so that appropriate counter-measures can be taken. We will develop a data collection infrastructure that incorporates virtualized mobile honeypots and a honeyclient, to gather, detect and provide early warning of mobile attacks and better understand the modus operandi of cyber-criminals that target mobile devices. By correlating the extracted information with the known patterns of attacks from wireline networks, we will reveal and identify trends in the way that cyber-criminals launch attacks against mobile devices.

Erol Gelenbe, Gökçe Görbil, Dimitrios Tzovaras, Steffen Liebergeld, David Garcia, Madalina Baltatu, George Lyberopoulos

Towards Visualizing Mobile Network Data

This paper presents the research directions that the visualization in the NEMESYS project will follow, so as to visualize mobile network data and detect possible anomalies. These directions are based on the previous approaches on network security visualization and attack attribution techniques, while possible extensions are also discussed based on the presented approaches.

Stavros Papadopoulos, Dimitrios Tzovaras

Infrastructure for Detecting Android Malware

Malware for smartphones have sky-rocketed these last years, particularly for Android platforms. To tackle this threat, services such as Google Bouncer have intended to counter-attack. However, it has been of short duration since the malware have circumvented the service by changing their behaviors. Therefore, we propose a malware taxonomy, a survey of attack vectors to better understand the Android malware, a survey of the modus-operandi of attackers for infecting the smartphones, and the design of components that are responsible for analyzing and detecting Android malware of the NEMESYS infrastructure. This infrastructure aims at understanding and detecting attacks both at the network and smartphone level.

Laurent Delosières, David García

NEMESYS: First Year Project Experience in Telecom Italia Information Technology

In 2011, when mobile malware quadrupled in volume, it became clear that malware for mobile platforms would experience a rapid increase. During 2012 it continued to grow steadily, closely following the spreading of smartphones all over the world. For mobile network operators it is also clear that this growing phenomenon must be at least constantly monitored, if not prevented altogether. At the present moment there are no effective tools to achieve this purpose. NEMESYS comes to fill in this vacuum. This paper presents the vision, position and activities of Telecom Italia Information Technology as part of the NEMESYS project. The emphasis is given to the perception and practical experience of mobile threats and mobile security in Telecom Italia Mobile Network, and on the advances in the art that our company is hoping to achieve from its active participation to the project.

Madalina Baltatu, Rosalia D’Alessandro, Roberta D’Amico

Android Security, Pitfalls and Lessons Learned

Over the last two years Android became the most popular mobile operating system. But Android is also targeted by an over-proportional share of malware. In this paper we systematize the knowledge about the Android security mechanisms and formulate how the pitfalls can be avoided when building a mobile operating system.

Steffen Liebergeld, Matthias Lange

Mobile Network Threat Analysis and MNO Positioning

The dramatic increase of smart mobile devices and applications, the advent of Android OS, the increased number of wireless radios (incl. NFC) the support and the low awareness about security and privacy risks on the one hand, and the flatter, IP-based network architecture, the introduction of new radio technologies (femtocells, WiFi, LTE) and applications (M2M, NFC) on the other, have changed the mobile threats landscape and will change the way security will be dealt in the coming years. Mobile Network Operators (MNOs) have started to investigate the possibility to introduce additional measures to secure their networks, providing thus a defense before security threats materialize.

George Lyberopoulos, Helen Theodoropoulou, Konstantinos Filis

Mobile Network Anomaly Detection and Mitigation: The NEMESYS Approach

Mobile malware and mobile network attacks are becoming a significant threat that accompanies the increasing popularity of smart phones and tablets. Thus in this paper we present our research vision that aims to develop a network-based security solution combining analytical modelling, simulation and learning, together with billing and control-plane data, to detect anomalies and attacks, and eliminate or mitigate their effects, as part of the EU FP7 NEMESYS project. These ideas are supplemented with a careful review of the state-of-the-art regarding anomaly detection techniques that mobile network operators may use to protect their infrastructure and secure users against malware.

Omer H. Abdelrahman, Erol Gelenbe, Gökçe Görbil , Boris Oklander

Backmatter

Titel: Information Sciences and Systems 2013
herausgegeben von: Erol Gelenbe
Ricardo Lent
Verlag: Springer International Publishing
Electronic ISBN: 978-3-319-01604-7
Print ISBN: 978-3-319-01603-0
DOI: https://doi.org/10.1007/978-3-319-01604-7