Skip to main content

2015 | Buch

Distributed Computing and Internet Technology

11th International Conference, ICDCIT 2015, Bhubaneswar, India, February 5-8, 2015. Proceedings

herausgegeben von: Raja Natarajan, Gautam Barua, Manas Ranjan Patra

Verlag: Springer International Publishing

Buchreihe : Lecture Notes in Computer Science

insite
SUCHEN

Über dieses Buch

This book constitutes the refereed proceedings of the 11th International Conference on Distributed Computing and Internet Technology, ICDCIT 2015, held in Bhubaneswar, India, in February 2015. The 12 revised full papers presented together with 30 short papers and 9 invited talks in this volume were carefully reviewed and selected from 221 submissions. The papers cover topics such as distributed computing and algorithms; internet technologies and Web services; secure computing and communication; cloud computing; information retrieval and recommender systems and societal applications.

Inhaltsverzeichnis

Frontmatter

Invited Papers

Models of Circular Causality

Causality is often interpreted as establishing dependencies between events. The standard view is that an event

b

causally depends on an event

a

if, whenever

b

occurs, then

a

has already occurred. If the occurrences of

a

and

b

mutually depend on each other, i.e.

a

depends on

b

and vice versa, then (under the standard notion of causality) neither of them can ever occur. This does not faithfully capture systems where, for instance, an agent promises to do event

a

provided that

b

will be

eventually

done, and vice versa. In this case, the circularity between the causal dependencies should allow both

a

and

b

to occur, in any order. In this paper we review three models for circular causality, one based on logic (declarative), one based on event structures (semantical), and one based on Petri nets (operational). We will cast them in a coherent picture pointing out their relationships.

Massimo Bartoletti, Tiziana Cimoli, G. Michele Pinna, Roberto Zunino
Checking Cloud Contracts in Microsoft Azure

Cloud Contracts

capture architectural requirements in data-centers. They can be expressed as logical constraints over configurations. Contract violation is indicative of miss-configuration that may only be noticed when networks are attacked or correctly configured devices go off-line. In the context of Microsoft Azure’s data-center we develop contracts for (1) network access restrictions, (2) forwarding tables, and (3) BGP policies. They are checked using the SecGuru tool that continuously monitors configurations in Azure. SecGuru is based on the Satisfiability Modulo Theories solver Z3, and uses logical formulas over bit-vectors to model network configurations. SecGuru is an instance of applying technologies, so far developed for program analysis, towards networks. We claim that

Network Verification

is an important and exciting new opportunity for formal methods and modern theorem proving technologies. Networking is currently undergoing a revolution thanks to the advent of highly programmable commodity devices for network control, the build out of large scale cloud data-centers and a paradigm shift from network infrastructure as embedded systems into software controlled and defined networking. Tools, programming languages, foundations, and methodologies from software engineering disciplines have a grand opportunity to fuel this transformation.

Nikolaj Bjørner, Karthick Jayaraman
Privacy and Security Challenges in Internet of Things

Internet of Things (IoT) envisions as a global network, connecting any objects around us, ranging from home appliances, wearable things to military applications. With IoT infrastructure, physical objects such as wearable objects, television, refrigerator, smart phones, supply-chain items and any objects across the globe would get connected using the Internet. Sensing, radio waves, mobile technology, embedded systems and Internet technology are promising actors which play significant roles in IoT infrastructure. Security and privacy issues in IoT scenarios would be much more challenging than what is been used in the conventional wireless scenarios. In particular, the constrained environments require lightweight primitives, secure design and effective integration into other environments in order to see IoT in its desired shape. In this paper, we discuss security and privacy challenges in IoT scenarios and applications with special emphasis on resource-constrained environments’ security objectives and privacy requirement. We provide different perspectives of IoT, discuss about important driving forces of IoT, and propose a generic construction of secure protocol suitable for constrained environments with respect to IoT scenarios and applications.

Manik Lal Das
Geo-indistinguishability: A Principled Approach to Location Privacy

In this paper we report on our ongoing project aimed at protecting the privacy of the user when dealing with location-based services. The starting point of our approach is the principle of geo-indistinguishability, a formal notion of privacy that protects the user’s exact location, while allowing approximate information – typically needed to obtain a certain desired service – to be released. We then present two mechanisms for achieving geo-indistinguishability, one generic to sanitize locations in any setting with reasonable utility, the other custom-built for a limited set of locations but providing optimal utility. Finally we extend our mechanisms to the case of location traces, where the user releases his location repeatedly along the day and we provide a method to limit the degradation of the privacy guarantees due to the correlation between the points. All the mechanisms were tested on real datasets and compared both among themselves and with respect to the state of the art in the field.

Konstantinos Chatzikokolakis, Catuscia Palamidessi, Marco Stronati
Fusing Sensors for Occupancy Sensing in Smart Buildings

Understanding occupant-building interactions helps in personalized energy and comfort management. However, occupant identification using affordable infrastructure, remains unresolved. Our analysis of existing solutions revealed that for a building to have real-time view of occupancy state and use it intelligently, there needs to be a smart fusion of affordable, not-necessarily-smart, yet accurate enough sensors. Such a sensor fusion should aim for minimalistic user intervention while providing accurate building occupancy data. We describe an occupant detection system that accurately monitors the occupants’ count and identities in a shared office space, which can be scaled up for a building. Incorporating aspects from data analytics and sensor fusion with intuition, we have built a

Smart-Door

using inexpensive sensors to tackle this problem. It is a scalable, plug-and-play software architecture for flexibly realizing smart-doors using different sensors to monitor buildings with varied occupancy profiles. Further, we show various smart-energy applications of this occupancy information: detecting anomalous device behaviour and load forecasting of plug-level loads.

Nabeel Nasir, Kartik Palani, Amandeep Chugh, Vivek Chil Prakash, Uddhav Arote, Anand P. Krishnan, Krithi Ramamritham
Discrete Control-Based Design of Adaptive and Autonomic Computing Systems

This invited paper makes an overview of our works addressing discrete control-based design of adaptive and reconfigurable computing systems, also called autonomic computing. They are characterized by their ability to switch between different execution modes w.r.t. application and functionality, mapping and deployment, or execution architecture. The control of such reconfigurations or adaptations is a new application domain for control theory, called feedback computing. We approach the problem with a programming language supported approach, based on synchronous languages and discrete control synthesis. We concretely use this approach in FPGA-based reconfigurable architectures, and in the coordination of administration loops.

Xin An, Gwenaël Delaval, Jean-Philippe Diguet, Abdoulaye Gamatié, Soguy Gueye, Hervé Marchand, Noël de Palma, Eric Rutten
Designing for Scalability and Trustworthiness in mHealth Systems

Mobile Healthcare (mHealth) systems use mobile smartphones and portable sensor kits to provide improved and affordable healthcare solutions to underserved communities or to individuals with reduced mobility who need regular monitoring. The architectural constraints of such systems provide a variety of computing challenges: the distributed nature of the system; mobility of the persons and devices involved; asynchrony in communication; security, integrity and authenticity of the data collected; and a plethora of administrative domains and the legacy of installed electronic health/medical systems.

The volume of data collected can be very large; together with the data, there is a large amount of metadata as well. We argue that certain metadata are essential for interpreting the data and assessing their quality. There is great variety in the kinds of medical data and metadata, the methods by which they are collected and administrative constraints on where they may be stored, which suggest the need for flexible distributed data repositories. There also are concerns about the veracity of the data, as well as interesting questions about who owns the data and who may access them.

We argue that traditional notions of relational databases, and security techniques such as access control and encryption of communications are inadequate. Instead, end-to-end systematic (from sensor to cloud) information flow techniques need to be applied for integrity and secrecy. These need to be adapted to work with the volume and diversity of data collected, and in a federated collection of administrative domains where data from different domains are subject to different information flow policies.

Sanjiva Prasad
Host Trait Prediction of Metagenomic Data for Topology-Based Visualization

Microbiome and metagenomic research continues to grow as well as the size and complexity of the collected data. Additionally, it is understood that the microbiome can have a complex relationship with the environment or host it inhabits, such as in gastrointestinal disease. The goal of this study is to accurately predict a host’s trait using only metagenomic data, by training a statistical model on available metagenome sequencing data. We compare a traditional Support Vector Regression approach to a new non-parametric method developed here, called PKEM, which uses dimensionality reduction combined with Kernel Density Estimation. The results are visualized using methods from Topological Data Analysis. Such representations assist in understanding how the data organizes and can lead to new insights. We apply this visualization-of-prediction technique to cat, dog and human microbiome obtained from fecal samples. In the first two the host trait is irritable bowel syndrome while in the last the host trait is Kwashiorkor, a form of severe malnutrition.

Laxmi Parida, Niina Haiminen, David Haws, Jan Suchodolski
Gappy Total Recaller: Efficient Algorithms and Data Structures for Accurate Transcriptomics

Understanding complex mammalian biology depends crucially on our ability to define a precise map of all the transcripts encoded in a genome, and to measure their relative abundances. A promising assay depends on RNASeq approaches, which builds on next generation sequencing pipelines capable of interrogating cDNAs extracted from a cell. The underlying pipeline starts with base-calling, collect the sequence reads and interpret the raw-read in terms of transcripts that are grouped with respect to different splice-variant isoforms of a messenger RNA. We address a very basic problem involved in all of these pipelines, namely accurate Bayesian base-calling, which could combine the analog intensity data with suitable underlying priors on base-composition in the transcripts. In the context of sequencing genomic DNA, a powerful approach for base-calling has been developed in the TotalReCaller pipeline. For these purposes, it uses a suitable reference whole-genome sequence in a compressed self-indexed format to derive its priors. However, TotalReCaller faces many new challenges in the transcriptomic domain, especially since we still lack a fully annotated library of all possible transcripts, and hence a sufficiently good prior. There are many possible solutions, similar to the ones developed for TotalReCaller, in applications addressing de novo sequencing and assembly, where partial contigs or string-graphs could be used to boot-strap the Bayesian priors on base-composition. A similar approach would be applicable here too, partial assembly of transcripts can be used to characterize the splicing junctions or organize them in incompatibility graphs and then provided as priors for TotalReCaller. The key algorithmic techniques for this purpose have been addressed in a forthcoming paper on Stringomics. Here, we address a related but fundamental problem, by assuming that we only have a reference genome, with certain intervals marked as candidate regions for ORF (Open Reading Frames), but not necessarily complete annotations regarding the 5’ or 3’ termini of a gene or its exon-intron structure. The algorithms we describe find the most accurate base-calls of a cDNA with the best possible segmentation, all mapped to the genome appropriately.

B. Mishra

Distributed Computing and Algorithms

Finding RkNN Set in Directed Graphs

The

reverse k-nearest neighbors

of a query data point

q

characterizes the influence set of

q

, and comprises of data points which consider

q

among their

k

-nearest neighbours. This query has gained considerable attention due to its importance in various applications involving decision support systems, profile-based marketing, location based services, etc. Although this query is reasonably well-studied for scenarios where data points belong to Euclidean spaces, there has not been much work done for non-Euclidean data points, and specifically, for large data sets with arbitrary distance measures. In this work, a framework has been proposed for performing RkNN query over data sets that can be represented as directed graphs. We present a graph pruning technique to compute the RkNN of a query point which significantly reduces the search space. We report results of extensive experiments over some real-world data sets from a social network, a product co-purchasing network of Amazon, the web graph, and study the performance of our proposed heuristic in various settings on these data sets. These experiments demonstrate the effectiveness of our proposed technique.

Pankaj Sahu, Prachi Agrawal, Vikram Goyal, Debajyoti Bera
Gathering Asynchronous Swarm Robots under Nonuniform Limited Visibility

This paper proposes a distributed algorithm for gathering a group of autonomous, homogeneous, oblivious, asynchronous mobile robots having limited visibility (sensing) ranges. To the best of our knowledge, all reported results have assumed that the visibility ranges are uniform for all the robots. In contrast, we consider that the visibility ranges of the robots are not uniform. Moreover, the robots have no knowledge about the visibility ranges of other robots. However, a lower bound on the visibility range of all the robots is known to all the robots.

Avik Chatterjee, Sruti Gan Chaudhuri, Krishnendu Mukhopadhyaya
A Routing Calculus with Flooding Updates

We propose a process calculus which explicitly models routing in a distributed computer network. We define a model which consists of a network of routers where the topology of routers is fixed. The calculus has three syntactic categories namely processes, nodes and systems. Processes reside in nodes which are connected to a specific routers which forms a system. Upon creation of new nodes, the routing tables are updated using flooding method. We show that the proposed routing calculi is reduction equivalent to its specification asynchronous distributed pi-calculus (ADpi). We believe that such modeling helps in prototyping the distributed routing algorithms.

Manish Gaur, Simon J. Gay, Ian Mackie
k-Distinct Strong Minimum Energy Topology Problem in Wireless Sensor Networks

Given a set of sensors, the strong minimum energy topology (SMET) problem is to assign transmit power to each sensor such that the resulting topology containing only bidirectional links is strongly connected and the total energy of all the nodes is minimized. The

SMET

problem is known to be

NP

-hard. Currently available sensors in the market support a finite set of transmission ranges. So we consider the

k

-

Distinct-SMET

problem, where only

k

transmission power levels are used. We prove that the

k

-

Distinct-SMET

problem is

NP

-complete for

k

 ≥ 3. However, on the positive side, we show that the 2-

Distinct-SMET

problem can be solved in polynomial time. The energy cost of transmitting a bit is higher than the cost of computation, and hence it may be advantageous to organize the sensors into clusters and form a hierarchical structure. This motivated the study of

k

-Distinct-

r

Strong Minimum Energy Hierarchical Topology (

k

-Distinct-

r

SMEHT) problem:

Given a sensor network consisting of

n

sensors, and integers

k

and

r

, assign transmit powers to all sensors out of the

k

distinct power levels such that (

i

) the graph induced using only the bi-directional links is connected, (

ii

) at most

r

sensors are connected to two or more sensors by a bidirectional link and (

iii

) the sum of the transmit powers of all the sensors is minimum. We Propose a

$\frac{r+1}{2}$

- approximation algorithm for the

k

-Distinct-

r

SMEHT problem for any fixed

r

and arbitrary

k

.

Bhawani Sankar Panda, D. Pushparaj Shetty, Arti Pandey
Path Planning Algorithm for Mobile Anchor in Connected Sensor Networks

Path planning is an important issue for localization with mobile anchor in wireless sensor networks as movement of the mobile anchor consumes more energy compared to static anchor. Most of the works available in the literature either looks into the aspect of reducing path length of the mobile anchor or tries to increase localization accuracy. In this paper we propose a cost-effective movement strategy i.e., path planning for a mobile anchor which reduces path length and at the same time localization can be done using localization scheme [3], which yields good accuracy. Simulation results show improvement over existing work [4] in terms of both path length and localization accuracy.

Kaushik Mondal, Arindam Karmakar, Partha Sarathi Mandal
SMCDCT: A Framework for Automated MC/DC Test Case Generation Using Distributed Concolic Testing

In this paper we propose a framework to compute MC/DC percentage for distributed test case generation. MC/DC stands for Modified Condition/Decison Coverage [1]. This approach uses several client nodes to generate the non-redundant test cases in a distributed and scalable manner. To achieve an increase in MC/DC, we transform the input C program,

P

, into its transformed version,

P

, using Ex-NCT. A coverage analyzer accepts P along with the generated test cases as input from SCORE framework and outputs the MC/DC percentage. The experimental studies show that SMCDCT approach achieves 6.5 % (approx.) of average increase in MC/DC. This increase in MC/DC percentage is achieved in an average computation time of 7.1622715 seconds.

Sangharatna Godboley, Subhrakanta Panda, Durga Prasad Mohapatra
On-the-Fly Symmetry Reduction of Explicitly Represented Probabilistic Models

Quantitative analysis of concurrent systems becomes intract-able due to searches over the enormous state space. Since these systems often contain many identical processes consisting of symmetrical and interchangeable components, this problem can be tackled using symmetry reduction. In this paper, we present an on-the-fly symmetry reduction technique that is applicable to explicitly represented models in the probabilistic setting. We have performed the experiments by integrating our technique into PRISM probabilistic model checker. Experimental results are very encouraging with considerable reductions in both the time taken for property evaluation and the associated memory usage.

Reema Patel, Kevin Patel, Dhiren Patel

Internet Technologies and Web Services

Time and Cost Aware Checkpointing of Choreographed Web Services

Complex business processes can be realized by composing two or more web services into a composite web service. Due to the widespread reachability of Internet, more and more web services are becoming available to the consumers. Quality aware consumers look for resilience in services provisioned on Internet. This paper proposes message logging based checkpointing and recovery for web services to make them resilient to faults. It presents an algorithm that checkpoints services participating in a choreography in such a way that the execution time and cost of service constraints are always met. It identifies checkpoint locations by considering the costs involved in checkpointing, message logging and replaying for service recovery. The cost estimation is carried out using service interaction patterns and QoS values of the services involved. Performance of the proposed checkpointing strategy is corroborated with the results obtained from experiments.

Vani Vathsala Atluri, Hrushikesha Mohanty
A Preprocessing of Service Registry: Based on I/O Parameter Similarity

For registry based web services, finding a service for a given input and output, can be improved by registry preprocessing, that groups services of similar input and output parameters, to a cluster. This paper proposes a Wordnet based similarity detection technique for service I/O parameter clustering and also demonstrates the uses of the technique with experimental result.

Lakshmi H.N., Hrushikesha Mohanty
Financial Risk Management in e-commerce Using Executable Business Process Modeling Notation

In e-commerce systems like online auction houses or online stores, there are financial transactions involving buyers and sellers. At large payment processing firms, there is significant risk of fraud (upto 0.9 %). This fraud can be prevented before the actual transaction phase through risk scoring models. In the post transactions phase, measures like withholding or reserving funds of the seller, or asking for additional supporting material from the seller to release the funds can be done. There are numerous variations based on different geographies or different seller classes or different holding mechanisms of these measures.

It was found that the software being developed for a payment system was combining both infrastructural software (database, queue, logs) as well the actual risk business process. Subsequently, prototypes were created for different risk measures in the post transaction phase using executable BPMN2 (using Activiti engine). For example, a certain amount of money may be withheld from the seller for a configurable time period which can be edited in the graphical BPMN2.

In this paper, we discuss the numerous types of transactions, the numerous measures for financial risk and how BPMN2, can be used to model the same and at the same time form a performant executable component reducing development time. It is also possible that such models can be standardized and exchanged in the industry.

Ramkumar Iyer, Sanjeevi Moorthy
DILT: A Hybrid Model for Dynamic Composition and Execution of Heterogeneous Web Services

Business applications in domains such as e-Governance, require collaboration among both Governmental and non Governmental departments, which raises the need for composing SOAP-based as well as RESTful services. Existing works address this objective using static composition alone. However, it would be beneficial if users can specify the requirements during run-time, based on the outcome of the previous services executed. In general, business applications follow a predefined order and consequently the composition process can follow a template of business activities. Existing works on dynamic web service composition either separate the composition and execution phases distinctly or perform them in an interleaved fashion. The former approach cannot adapt to changes in run-time whereas the latter can select services based on the outcome of previous service executions. However, the interleaved approach does not support business activities that have a specific ordering among them. Hence, a novel hybrid model - Dynamic InterLeaved Template (DILT) - that enables interleaved composition and execution of web services based on predefined workflow templates has been proposed in this paper. This hybrid model lends itself naturally for composing both SOAP-based and RESTful services.

Kanchana Rajaram, Chitra Babu, Akshaya Ganesan
Application of Soft Computing Technique for Web Service Selection

One of the main challenges in service oriented architecture is the optimal selection and ranking of web services. The process of selecting relevant services from a service repository in a heterogeneous environment is a difficult task. Use of different search engines help in selection process by efficiently searching the service repository (like UDDI), peer-to-peer networks, service portals etc. Fixing up appropriate services is necessary because composition of these services leads to the development of a particular application. In this paper, soft computing technique such as ANN and Fuzzy logic are employed for optimal selection of web service with the help of the requisite attributes related to quality of service. A comparative study of performance of both the techniques based on error parameter has been made in order to help in critical assessment.

Debendra Kumar Naik, Smita Kumari, Santanu Kumar Rath
Weighted Load Balanced Adaptive Gateway Discovery in Integrated Internet MANET

An Integrated Internet MANET (IIM) is a heterogeneous network which is an interconnection of the wired Internet and the wireless MANET. Two of the issues which arise in Integrated Internet-MANET are gateway load balancing and efficient gateway discovery. These two issues have been addressed separately in the literature. In this paper, a mechanism is presented which incorporates gateway load balancing and adaptive gateway discovery together. The proposed mechanism uses the Maximal Source Coverage algorithm for dynamically adjusting the proactive range of the IIM while using the WLB-AODV routing protocol for gateway load balanced routing of packets in the MANET. Simulation results using ns-2 network simulator show that the proposed protocol gives better performance in terms of packet delivery ratio and end to end delay than the existing approach.

Rafi U. Zaman, Khaleel Ur Rahman Khan, A. Venugopal Reddy
Slice Based Testing of CGI Based Web Applications

We propose a slice based testing technique to generate test paths for web applications. Our web application is based on

Common Gateway Interface

(CGI) and we have used

PERL

programming language. Our technique uses slicing criterion for all variables defined and used in the program. Then, it computes the slices for each of these criteria and generates test paths. Finally, we generate test cases using these test paths.

Madhusmita Sahu, Durga Prasad Mohapatra
A Fuzzy Computationally Intelligent System for Resource Allocation in WiMAX

WiMAX is an upcoming technology gaining grounds day by day that has inherent support for real and non real time applications. Distribution of resources in such networks has always been a challenging phenomenon. This problem can be solved by designing intelligent and adaptive systems. This paper proposes an application of fuzzy logic by virtue of which an intelligent system for distribution of resources has been defined. The system works adaptively to allocate bandwidth to traffic classes according to incoming traffic in their queues. The results demonstrate significance of the proposed method.

Akashdeep Sharma
Improving Activity Prediction and Activity Scheduling in Smart Home Networks for Enhanced QoS

This paper proposes an algorithm, to enhance the prediction accuracy of inhabitant activities in smart home networks. This work is an enhancement to SPEED [1], which was earlier drawn upon [2,3]. It works with the nested episodes of activity sequences along with the innermost episodes to generate user activity contexts. For a given sequence, our approach on an average predicts 86 percent accurately, which is much better than SPEED’s 59 percent accuracy.

Koteswara Rao Vemu

Secure Computing and Communication

Repetition Pattern Attack on Multi-word-containing SecureString 2.0 Objects

Cloud computing appeals to private individuals and particularly enterprises at a progressive rate, but a noticeable percentage of them refuse it due to mistrust or missing commitment to security. The cryptosystem SecureString 2.0 was designed to outweigh these deficits through the support of blind computations on character strings. Repetition pattern attacks count among the most hazardous enemies of SecureString 2.0 objects because reoccurring ciphergrams within them may reveal linguistic identifying features of the correspondent plaintext. This paper analyzes and compares the success probability of repetition pattern attacks on the following three sorts of SecureString 2.0 objects: single-word-containing ones, multi-word-containing ones with a known number of words plus unknown delimiter positions, and multi-word-containing ones with an unknown number of words plus unknown boundary locations. The latter type is expected to provide the highest privacy.

Günter Fahrnberger
Computationally Secure Cheating Identifiable Multi-Secret Sharing for General Access Structure

Secret sharing scheme is a key component of distributed cryptosystems. In its basic form, secret sharing schemes can tolerate honest but curious adversary. But, in modern open system environment, adversary can behave maliciously i.e., the adversary can do anything according to his available computational resources. To get rid of such adversary, cheating identifiable (multi) secret sharing scheme plays an important role. Informally, cheating identifiable (multi) secret sharing scheme can identify the cheating participants, who are under the control of malicious adversary, and recover the correct secret whenever possible. However, to achieve unconditional security against such adversary, share size should be at least equal to the size of the secret. As a result, the need for computational notion of security of such schemes, which can accommodate smaller share size, has been felt over the years, specially in case of multi-secret sharing schemes. In this paper, we propose a notion of security for computationally secure cheating identifiable multi-secret sharing scheme for general access structure along with a construction which is secure under this new notion.

Partha Sarathi Roy, Angsuman Das, Avishek Adhikari
S-Gossip: Security Enhanced Gossip Protocol for Unstructured P2P Networks

Peer to Peer (P2P) is one of the most popular technology which paved a way to new structures in many applications including content searching, file sharing etc. On the other hand, inclusion of a few malicious peers, the entire network could be disrupted without proper security measures. This paper presents a very simple but an effective security mechanism for gossip based P2P networks. In the proposed protocol, each gossiping node observes its peers closely to ensure that no malicious nodes will actively participate in the gossip protocol. To achieve this, each peer builds trust information about other nodes in the system and exchanges with its neighbours. Using the trust information, each node is able to identify and blacklist malicious nodes in its view. Thus, each node gossips only with nodes it deems as non-malicious. The efficiency of the proposed protocol is far ahead of existing security protocols such as TooLate. Our simulation results show the effectiveness of the proposed work.

Sumit Kumar Tetarave, Somanath Tripathy, Sathya Peri
k-degree Closeness Anonymity: A Centrality Measure Based Approach for Network Anonymization

Social network data are generally published in the form of social graphs which are being used for extensive scientific research. We have noticed that even a k-degree anonymization of social graph can’t ensure protection against identity disclosure. In this paper, we have discussed how closeness centrality measure can be used to identify a social entity in the presence of kdegree anonymization. We have proposed a new model called k-degree closeness anonymization by adopting a mixed strategy of k-degree anonymity, degree centrality and closeness centrality. The model has two phases, namely, construction and validation. The construction phase transforms a graph with given sequence to a graph with anonymous sequence in such a manner that the closeness centrality measure is distributed among the nodes in a smooth way. The nodes with the same degree centrality are assigned with a closer set of closeness centrality values, making re-identification difficult. Validation phase validates our model by generating

1

-neighborhood graphs. Algorithms have been developed both for the construction and validation phases.

Debasis Mohapatra, Manas Ranjan Patra
A Chinese Remainder Theorem Based Key Management Algorithm for Hierarchical Wireless Sensor Network

Wireless Sensor Networks (WSN) are network of sensors having low computation, storage and battery power. Hierarchical WSN are heterogeneous network of sensors having different capabilities which form a hierarchy to achieve energy efficiency. Key management algorithms are center of the security protocols in WSN. It involves key pre distribution, shared key discovery, key revocation, and refreshing. Due to resource constraints in WSN achieving a perfect key management scheme has been quite challenging. In this paper a new key management scheme for Hierarchical WSN based on Chinese Remainder Theorem has been proposed. An experimental setup is created to evaluate this scheme. The results indicate that it establishes the key with minimum computation, communication, storage cost at each node, also it is scalable and resilient to different attacks.

Pranave Kumar Bhaskar, Alwyn R. Pais
An Evolutionary Computation Based Classification Model for Network Intrusion Detection

Current techniques used for network intrusion detection have limited capabilities in coping with the dynamic and increasingly complex nature of security threats. In this paper, we propose a classification model for detecting intrusions based on Genetic Programming, Artificial Immune Recognition Systems (AIRS1, AIRS2), and Clonal Selection Algorithm (CLONALG). Further, six Rank based, viz., Information Gain, Gain ratio, Symmetrical Uncertainty, Chi squared Attribute Evaluator, Relief-F, and one-R; and five search based feature selection methods, viz., PSO Search, Genetic Search, Best First Search, Greedy Stepwise, and Rank Search have been employed to select the most relevant attributes before classification. The performance of the model has been evaluated in terms of accuracy, precision, detection rate, F-value, false alarm rate, and fitness value.

Ashalata Panigrahi, Manas Ranjan Patra
Intrusion Detection in a Tailor-Made Gaussian Distribution Wireless Sensor Networks

Node deployment strategy plays a crucial role in determining the intrusion detection capability of a wireless sensor network (WSN). In this paper, we investigate the intrusion detection problem in a tailor-made Gaussian distributed network considering multiple-sensing detection scenario. Exhaustive simulation is performed primarily to validate the correctness of modeling and analysis. Effects of different network parameters on the detection probability are also examined. Finally, the effectiveness of the proposed approach is confirmed by comparing with its counterpart Gaussian deployment strategy under the considered scenarios.

Amrita Ghosal, Subir Halder
SecureString 3.0
A Cryptosystem for Blind Computing on Encrypted Character Strings

If (electronic) computations need to be undertaken, then mostly performance or commercial aspects lead to the selection for an optimal executing computer. In case the beneficiary of the computation upshots does not wholly confide in any computers except his ones, trust becomes his major decision criterion instead. Blind computing can do the trick to (re)win the beneficiary’s confidence. This paper ameliorates the well-known cryptosystem SecureString 2.0 concerning privacy through pangram ciphertexts and referring integrity through secret sharing. The emerging cryptosystem of this amelioration is called SecureString 3.0 that facilities blind computing on encrypted character strings through an untrusted host.

Günter Fahrnberger, Kathrin Heneis
A Secure Image Hashing Technique for Forgery Detection

Nowadays most of the multimedia contents are in digital form. With the increased use of powerful computer and image processing software, along with wide availability of digital cameras have given rise to huge numbers of doctored images. Several forgery detection algorithms are available. However, these techniques do not address the issue from cryptographic point of view. As a result, even if an image or video is identified as doctored, most of the time it is not possible to track the actual offender. Here, we present a perceptual hash function which can be used for both detection of forged images as well as tracking of forgers.

Tanmoy Kanti Das, Piyush Kanti Bhunre

Cloud Computing

Design and Development of Secure Cloud Architecture for Sensor Services

This paper is aimed at the design and development of secure cloud architecture for the Wireless Sensor Networks (WSN), in which the sensor data are represented as services and are accessed by the client applications in a secure manner with a simple authentication solution for sensors (SASS). Currently, sensor system needs an intelligent middleware to integrate with the cloud. Service Oriented Architecture (SOA), which makes use of the web services and XML technologies, will provide a solution to meet the current demands. Web services provide a mechanism for open and flexible interaction between heterogeneous systems with loosely coupled service endpoints. Further a technique based on formal specification is utilized which reduces the data volume of xml documents at a level that can be handled by the resource constrained environment of the wireless sensors. The proposed architecture will provide a scalable infrastructure for integrating heterogeneous sensor networks using a small set of powerful abstractions.

R. S. Ponmagal, N. Dinesh, Uma Rajaram
Cloud Federation Formation Using Coalitional Game Theory

Cloud federation has been emerged as a new paradigm in which group of Cloud Service Providers (SP) cooperate to share resources with peers, to gain economic advantage. In this paper, we study the cooperative behavior of a group of cloud SPs. We present broker based cloud federation architecture and model the formation of cloud federation using

coalition game theory

. The objective is to find most suitable federation for Cloud SPs that maximize the satisfaction level of each individual Cloud SP on the basis of Quality of Service(QoS) attributes like availability and price.

Benay Kumar Ray, Sunirmal Khatua, Sarbani Roy
An Efficient Resource Allocation Algorithm for IaaS Cloud

Infrastructure as a Service (IaaS) cloud provides access to computing resources by forming a virtualized environment. The resources are offered by means of leases. However, it is not possible to satisfy all the leases due to finite capacity of resources (or nodes). Mapping between all the leases and the available nodes referred as resource allocation problem is very challenging to IaaS cloud. In this paper, we propose a resource allocation algorithm for IaaS cloud which is based on a novel approach of alert time. First, it uses alert time to assign the leases and then employs swapping to reschedule the already accommodated leases in case a lease is not schedulable by the alert time. This makes resource allocation superior to support the deadline sensitive leases by minimizing the lease rejection in contrast to two existing algorithms by Haizea [3] and Nathani [2]. We perform extensive experiments on several synthetic data sets and the results show that the proposed algorithm outperforms both the algorithms in terms of accepted leases and rejected leases.

Sanjaya K. Panda, Prasanta K. Jana
Genetic Algorithm Framework for Bi-objective Task Scheduling in Cloud Computing Systems

Cloud computing gives an excellent opportunity for business enterprises as well as researchers to use the computing power, over Internet, without actually owning the infrastructure, there by reducing establishment and management cost. Task scheduling in cloud systems is challenging due to the conflicting objectives of end users and the cloud service providers. Running time and cost are two key factors that determine the optimal service from the cloud. In this paper, we focus on two objectives, makespan and cost, to be optimized simultaneously using genetic algorithm framework. Finding an optimal schedule, considering both of these conflicting objectives, is a search problem under NP-hard category. We have considered the scheduling of independent tasks and the proposed frame work can be used in public or hybrid cloud.

A. S. Ajeena Beegom, M. S. Rajasree
Retracted: Optimal Cloud Resource Provisioning: A Two-Criteria Formulation

Federated resource provisioning in an on-demand, instantly procurable fashion with the flexibility of a pay as you go model for pricing has led the path for cloud computing to be the computing technology of the future. However, resource provisioning technology needs to be well-supported by appropriate optimization strategies for sustainability purposes. In this paper, therefore, an efficient resource provisioning strategy has been proposed that arrives at optimal provisioning solution with minimal cost and SLA violation rate. To this end, an optimal cloud resource provisioning model has been formulated using the Stochastic Integer Programming (SIP) problem which has been solved by assuming customers’ cloud demand for resources as Poisson’s distribution to accommodate for uncertainties pertaining to user demand.

Geetika Mudali, Manas Ranjan Patra, K. Hemant Kumar Reddy, Diptendu Sinha Roy

Information Retrieval and Recommender Systems

Predicting Post Importance in Question Answer Forums Based on Topic-Wise User Expertise

Q & A forums on the web are aplenty and the content produced through such crowd-sourced efforts is generally of good quality and highly beneficial to novices and experts alike. As the community matures, however, the explosion in the number of posts/answers leads to the information overload problem. Many a times users having expertise in a particular area are not able to address quality issues raised in the area maybe due to the positioning of the question in the list displayed to the user. A good mechanism to assess the quality of questions and to display it to the users depending on their area of expertise, if devised, may lead to a higher quality answers and faster resolutions to the questions posted. In this paper we present the results of our investigations into the effectiveness of various mechanisms to represent user expertise to estimate a post score reflecting its quality/utility of the post. We follow three different approaches to building a user profile representing the user’s areas of expertise: topic models based approach, tag-based approach and semantic user profiling approaches. We present the results of experiments performed on the popular Q&A Forum Stack Overflow, exploring the value add offered by these approaches. The preliminary experiments support our hypothesis that considering additional features in terms of user expertise does offer an increase in the classification accuracy even while ignoring features computable only after the first 24 hours. However, the proposed method to individually leverage on the semantic tag relations to construct an enhanced user profile did not prove beneficial.

Deepa Anand, Febin A. Vahab
SQLiDDS: SQL Injection Detection Using Query Transformation and Document Similarity

SQL Injection Attack has been a major security threat to web applications since last 15 years. Nowadays, hackers use automated tools to discover vulnerable websites and launch mass injection attacks. Accurate run-time detection of SQL injection has been a challenge in spite of extensive research in this area. This paper presents a novel approach for real-time detection of SQL injection attacks using query transformation and document similarity measure. Acting as a database firewall, the proposed system named SQLiDDS, can protect multiple web applications using the database server. With additional inputs from human expert, SQLiDDS can also become more robust over time. Our experimental results confirm that this approach can effectively detect and prevent all types of SQL injection attacks with good accuracy yet negligible impact on system performance. The approach was tested on web applications built using PHP and MySQL, however it can be easily adopted in other platforms with minimal changes.

Debabrata Kar, Suvasini Panigrahi, Srikanth Sundararajan
Information Retrieval in Wikipedia with Conceptual Directions

The paper describes our algorithm used for retrieval of textual information from Wikipedia. The experiments show that the algorithm allows to improve typical evaluation measures of retrieval quality. The improvement of the retrieval results was achieved by two phase usage approach. In first the algorithm extends the set of content that has been indexed by the specified keywords and thus increases the Recall value. Then, using the interaction with the user by presenting him so-called

Conceptual Directions

the search results are purified, which allows to increase Precision value. The preliminary evaluation on multi-sense test phrases indicates, that the algorithm is able to increase the Precision, within result set, without Recall loss. We also describe an additional method used for extending the result set based on creating cluster prototypes and finding the most similar, not retrieved content in text repository. In our demo implementation in the form of web portal, clustering has been used to present the search results organized in thematic groups instead of ranked list.

Julian Szymański
Query Execution for RDF Data on Row and Column Store

This paper shows experimental comparison between various data storage techniques to manage RDF data. The work represents evaluation of query performance in terms of query execution time and data scalability, using row and column store for various data storage techniques. To demonstrate these ideas FOAF (Friend Of A Friend) data is used. The paper contributes experimental and analytical study for application of partitioning techniques on FOAF data which makes queries 168 times faster compared to traditional triples table. Materialized views over vertically partitioned data show an additional 8 times improvement in query performance against partitioned data for the frequently occurring queries. Vertical partitioning is executed on column store also, and as FOAF data size scales, an order of magnitude improved performance is observed over row store execution.

Trupti Padiya, Minal Bhise, Sandeep Vasani, Mohit Pandey
A Concept for Co-existence of Heterogeneous Recommender Systems Based on Blackboard Architecture

This paper introduces the concept of software coordination, collaboration, moderation and visualization in the context of Recommender System (RS). The proposed concept assist to coordinates to execute multiple RSs and collaborates their respective outputs. It also moderates the top-n recommendation list generated by RSs. The multiple RSs can be heterogeneous in terms of their attributes, such as data mining algorithms & prediction techniques used for recommendation. These RSs may be developed by independent experts. The concept employs the notion of blackboard architecture for dealing with multiple RSs. Optimal results are expected to be achieved by feed-forward and feed-backward mechanism.

Rohit Gupta, Anil Kumar Singh
Co-occurrence and Semantic Similarity Based Hybrid Approach for Improving Automatic Query Expansion in Information Retrieval

Pseudo Relevance feedback (PRF) based query expansion approaches assumes that the top ranked retrieved documents are relevant. But this assumption is not always true; it may also possible that a PRF document may contain different topics, which may or may not be relevant to the query terms even if the documents are judged relevant. In this paper our focus is to capture the limitation of PRF based query expansion and propose a hybrid method to improve the performance of PRF based query expansion by combining corpus based term co-occurrence information and semantic information of term. Firstly, the paper suggest use of corpus based term co-occurrence approach to select an optimal combination of query terms from a pool of terms obtained using PRF based query expansion. Second, we use semantic similarity approach to rank the query expansion terms obtained from top feedback documents. Third, we combine co-occurrence and semantic similarity together to rank the query expansion terms obtained from first step on the basis of semantic similarity. The experiments were performed on FIRE ad hoc and TREC-3 benchmark datasets of information retrieval. The results show significant improvement in terms of precision, recall and mean average precision (MAP). This experiments shows that the combination of both techniques in an intelligent way gives us goodness of both of them. As this is the first attempt in this direction there is a large scope of improving these techniques.

Jagendra Singh, Aditi Sharan

Societal Applications

Predicting User Visibility in Online Social Networks Using Local Connectivity Properties

Recent developments in Online Social Network (OSN) technologies and services, added with availability of wide range of applications has paved the way towards popularity of several social network platforms. These OSNs have evolved as a major communication and interaction platform for millions of users worldwide. The users interact with their social contacts by using various types of available services like messaging, sharing pictures /videos, and many more. However, a major drawback of these platforms is that these activities might reveal certain private information about the users unintentionally. Whenever a user shares any information on OSN with his friends, the information is prone to leakage to other users. The probability of leakage increases with the visibility of the user himself (i.e. the number of users who would be interested on the information of the user) as well as the visibility of his/her friends. Therefore, it is important to measure the visibility of a user in the OSN community. This paper proposes a measure for the visibility of a user, by considering the connectivity properties of the users present in the network. The characteristics of the proposed measure is studied on a real Twitter network as well as a generated Erdős-Rényi network, where we observe the relation between visibility and certain topological parameters of the network. The results show that visibility of a user is determined by his/her direct social contacts, i.e. the number of followers in case of Twitter. However, evaluating the visibility of an user is practically difficult considering the immensely large size of the OSN’s. These findings help us to generate simple mechanisms to estimate the visibility of a user using only its local connectivity properties.

Nemi Chandra Rathore, Somanath Tripathy, Joydeep Chandra
Using KNN and SVM Based One-Class Classifier for Detecting Online Radicalization on Twitter

Twitter is the largest and most popular micro-blogging website on Internet. Due to low publication barrier, anonymity and wide penetration, Twitter has become an easy target or platform for extremists to disseminate their ideologies and opinions by posting hate and extremism promoting tweets. Millions of tweets are posted on Twitter everyday and it is practically impossible for Twitter moderators or an intelligence and security analyst to manually identify such tweets, users and communities. However, automatic classification of tweets into pre-defined categories is a non-trivial problem problem due to short text of the tweet (the maximum length of a tweet can be 140 characters) and noisy content (incorrect grammar, spelling mistakes, presence of standard and non-standard abbreviations and slang). We frame the problem of hate and extremism promoting tweet detection as a one-class or unary-class categorization problem by learning a statistical model from a training set containing only the objects of one class . We propose several linguistic features such as presence of war, religious, negative emotions and offensive terms to discriminate hate and extremism promoting tweets from other tweets. We employ a single-class SVM and KNN algorithm for one-class classification task. We conduct a case-study on Jihad, perform a characterization study of the tweets and measure the precision and recall of the machine-learning based classifier. Experimental results on large and real-world dataset demonstrate that the proposed approach is effective with F-score of 0.60 and 0.83 for the KNN and SVM classifier respectively.

Swati Agarwal, Ashish Sureka
Data Mining on ICT Usage in an Academic Campus: A Case Study

Nowadays, every higher education institution needs to assess the degree of utilisation of Information and Communication Technology (ICT) facilities installed in the campus. This study is based on the responses collected from the student and research communities via survey on ICT in the University of Burdwan. Data mining methodologies – Fuzzy-Rough Feature Selection (FRFS) to reduce the dimensions of survey dataset and subsequently different classification techniques such as J48, JRip, QuickRules Fuzzy-Rough rule induction, Fuzzy Nearest Neighbour, Fuzzy Rough Nearest Neighbour and Vaguely Quantified Nearest Neighbour (VQNN) are applied on this resultant dataset to build model for potential knowledge extraction. Fuzzy-Rough Feature Selection (FRFS) and then different classification algorithms are applied using WEKA 3.7.9 for analysis of the reduced dataset.

Ajay Auddy, Sripati Mukhopadhyay
A Theory on Genesis and Spread of Corruption

Availing a service by an illegal means is usually termed as a corruption. This paper models roles of individuals in genesis of corruption and its spread in a society. Particularly, the roles of individuals, its neighbourhood associations and their social space are studied. Based on these understandings, a theory is proposed to explain genesis and spread of corruption, a society may assume for a given service. The characteristics of an individual include its need, anxiety in availing a service. It also depends on an emergent view on corruption a society projects on its social space.

Hrushikesha Mohanty
Social Network Analysis of Different Parameters Derived from Realtime Profile

The main objective of this work is to study network parameters commonly used to explain social structures. In this paper, we extract data from a real-time Facebook account using Netvizz application, analyze and evaluate network parameters on some widely recognized graph topology using GEPHI software.

Paramita Dey, Aniket Sinha, Sarbani Roy
Analysis of Emergency Evacuation of Building Using PEPA

Verification and validation is a crucial step of system design. However the verification of evacuation plan during emergency situation in highly crowded areas is often ignored. Analysis of building, urban and mega event plans, performed at the early design phase may reveal the flaws in architectural design such as presence of congestion spots, improper width of stairs or corridor, fewer number of exits etc. and can be removed as early as possible. The paper analyses emergency egress situation of multi-storey building using process algebraic approach PEPA and models- evacuees, building sections, rescuers, the external help and the situation causing evacuation.

Anil Kumar Singh, Surinder Kaur

Erratum

Retraction Note to: Optimal Cloud Resource Provisioning: A Two-Criteria Formulation

The paper starting on page 360 of this publication has been withdrawn because it contains portions from the paper “Optimization of Resource Provisioning Cost in Cloud Computing” by S. Chaisiri, B.-S. Lee, and D. Niyato, published in IEEE Transactions on Services Computing, Vol. 5, No. 2, April-June 2012. Permission was not obtained and no reference was included.

Geetika Mudali, Manas Ranjan Patra, K. Hemant Kumar Reddy, Diptendu Sinha Roy
Backmatter
Metadaten
Titel
Distributed Computing and Internet Technology
herausgegeben von
Raja Natarajan
Gautam Barua
Manas Ranjan Patra
Copyright-Jahr
2015
Verlag
Springer International Publishing
Electronic ISBN
978-3-319-14977-6
Print ISBN
978-3-319-14976-9
DOI
https://doi.org/10.1007/978-3-319-14977-6