
2009 | Book

Cloud Computing

First International Conference, CloudCom 2009, Beijing, China, December 1-4, 2009. Proceedings

Edited by: Martin Gilje Jaatun, Gansen Zhao, Chunming Rong

Publisher: Springer Berlin Heidelberg

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the First International Conference on Cloud Computing, CloudCom 2009, held in Beijing, China, December 1-4, 2009. The 42 full papers presented together with four invited papers were carefully selected from 200 submissions. Topics covered include, but are not limited to, cloud/grid architecture, load balancing, optimal deployment configuration, consistency models, virtualization technologies, middleware frameworks, Software as a Service (SaaS), Hardware as a Service (HaaS), data grids and the Semantic Web, Web services, security and risk, fault tolerance and reliability, auditing, monitoring and scheduling, utility computing, high-performance computing, and peer-to-peer computing.

Table of Contents

Frontmatter

Invited Papers

The Many Colors and Shapes of Cloud

While many enterprises and business entities are deploying and exploiting Cloud Computing, academic institutes and researchers are also busy trying to wrestle this beast and put a leash on this possibly paradigm-changing computing model. Many have argued that Cloud Computing is nothing more than a renaming of Utility Computing. Others have argued that Cloud Computing is a revolutionary change in computing architecture. It has therefore been difficult to draw a boundary around what is in Cloud Computing and what is not. I assert that it is equally difficult to find a group of people who would agree even on the definition of Cloud Computing. In actuality, perhaps all those arguments are unnecessary, as Clouds come in many shapes and colors. In this presentation, the speaker will attempt to illustrate that the shape and the color of a cloud depend very much on the business goals one intends to achieve. This is a very rich territory, both for businesses to take advantage of the benefits of Cloud Computing and for academia to integrate technology research with business research.

James T. Yeh
Biomedical Case Studies in Data Intensive Computing

Many areas of science are seeing a data deluge coming from new instruments, myriads of sensors and exponential growth in electronic records. We take two examples: one is the analysis of gene sequence data (35339 Alu sequences), the other a study of medical information (over 100,000 patient records) in Indianapolis and its relationship to Geographic Information System and Census data available for 635 Census Blocks in Indianapolis. We look at initial processing (such as Smith-Waterman dissimilarities), clustering (using robust deterministic annealing) and Multidimensional Scaling to map high-dimensional data to 3D for convenient visualization. We show how scalable pipelines can be produced that can be implemented using either cloud technologies or MPI, which are compared. This study illustrates the challenges of integrating data exploration tools with a variety of different architectural requirements and natural programming models. We present preliminary results for an end-to-end study of two complete applications.
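
The authors' pipeline code is not included in this abstract; as a rough illustration of the final step it describes (projecting a precomputed dissimilarity matrix to 3D for visualization), here is a minimal sketch using scikit-learn's MDS on a toy matrix. The matrix values and parameters are placeholders, not the paper's data.

```python
# Minimal sketch: project a precomputed dissimilarity matrix to 3D with MDS.
# The toy matrix below stands in for e.g. Smith-Waterman dissimilarities.
import numpy as np
from sklearn.manifold import MDS

# Symmetric toy dissimilarity matrix for 4 sequences (placeholder values).
D = np.array([
    [0.0, 0.4, 0.9, 0.8],
    [0.4, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.3],
    [0.8, 0.9, 0.3, 0.0],
])

# 'precomputed' tells MDS to treat D as dissimilarities rather than raw features.
mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
coords_3d = mds.fit_transform(D)   # one 3D point per sequence, ready for plotting
print(coords_3d)
```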

Geoffrey Fox, Xiaohong Qiu, Scott Beason, Jong Choi, Jaliya Ekanayake, Thilina Gunarathne, Mina Rho, Haixu Tang, Neil Devadasan, Gilbert Liu
An Industrial Cloud: Integrated Operations in Oil and Gas in the Norwegian Continental Shelf

Cloud computing may provide the long-awaited technologies and methodologies for large-scale industrial collaboration across disciplines and enterprise boundaries. The industrial cloud is introduced as a new inter-enterprise integration concept in cloud computing. Motivations and advantages are given through a practical exploration of the concept from the perspective of the ongoing effort by the Norwegian oil and gas industry to build industry-wide information integration and collaboration. ISO 15926 is recognized as a standard enabling cross-boundary data integration and processing.

Chunming Rong
Cloudbus Toolkit for Market-Oriented Cloud Computing

This keynote paper: (1) presents the 21st century vision of computing and identifies various IT paradigms promising to deliver computing as a utility; (2) defines the architecture for creating market-oriented Clouds and computing atmosphere by leveraging technologies such as virtual machines; (3) provides thoughts on market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain SLA-oriented resource allocation; (4) presents the work carried out as part of our new Cloud Computing initiative, called Cloudbus: (i) Aneka, a Platform as a Service software system containing an SDK (Software Development Kit) for the construction of Cloud applications and their deployment on private or public Clouds, in addition to supporting market-oriented resource management; (ii) internetworking of Clouds for dynamic creation of federated computing environments for scaling of elastic applications; (iii) creation of 3rd-party Cloud brokering services for building content delivery networks and e-Science applications and their deployment on the capabilities of IaaS providers such as Amazon, along with Grid mashups; (iv) CloudSim, supporting modelling and simulation of Clouds for performance studies; (v) energy-efficient resource allocation mechanisms and techniques for the creation and management of Green Clouds; and (vi) pathways for future research.

Rajkumar Buyya, Suraj Pandey, Christian Vecchiola

Full Papers

Self-healing and Hybrid Diagnosis in Cloud Computing

Cloud computing requires a robust, scalable, and high-performance infrastructure. To provide a reliable and dependable cloud computing platform, it is necessary to build a self-diagnosis and self-healing system that withstands various failures or degradations. This paper is the first to study the self-healing function, a challenging topic in today's cloud computing systems, from a consequence-oriented point of view. To fulfill the self-diagnosis and self-healing requirements of efficiency, accuracy, and learning ability, a hybrid tool is proposed that takes advantage of Multivariate Decision Diagrams and the Naïve Bayes Classifier. An example is used to demonstrate that the proposed approach is effective.
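
The paper's hybrid diagnosis tool is not reproduced in this abstract; purely as an illustration of the Naïve Bayes half of the idea, the sketch below trains a classifier on hypothetical symptom vectors (CPU saturation, I/O errors, heartbeat loss) to label a likely fault class. All feature names and data are invented for the example.

```python
# Illustrative only: Naïve Bayes classification of fault symptoms.
# Features and training data are hypothetical, not from the paper.
from sklearn.naive_bayes import BernoulliNB

# Each row: [cpu_saturated, io_errors, heartbeat_lost]
symptoms = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
faults = ["overload", "disk", "disk", "node_down", "node_down"]

model = BernoulliNB()
model.fit(symptoms, faults)

# Diagnose a newly observed symptom vector.
print(model.predict([[0, 1, 1]]))          # most probable fault class
print(model.predict_proba([[0, 1, 1]]))    # class probabilities for the healer to act on
```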

Yuanshun Dai, Yanping Xiang, Gewei Zhang
Snow Leopard Cloud: A Multi-national Education Training and Experimentation Cloud and Its Security Challenges

Military/civilian education, training and experimentation networks (ETEN) are an important application area for the cloud computing concept. However, major security challenges have to be overcome to realize an ETEN. These challenges can be categorized as security challenges typical of any cloud and multi-level security challenges specific to an ETEN environment. The cloud approach for ETEN is introduced and its security challenges are explained in this paper.

Erdal Cayirci, Chunming Rong, Wim Huiskamp, Cor Verkoelen
Trust Model to Enhance Security and Interoperability of Cloud Environment

Trust is one of the most important means to improve security and enable interoperability among current heterogeneous, independent cloud platforms. This paper first analyzes several trust models used in large distributed environments and then introduces a novel cloud trust model to solve security issues in cross-cloud environments, in which cloud customers can choose different providers' services and resources in heterogeneous domains can cooperate. The model is domain-based: it groups one cloud provider's resource nodes into the same domain and assigns a trust agent to it. It distinguishes two different roles, cloud customer and cloud server, and designs different strategies for them. In our model, trust recommendation is treated as one type of cloud service, just like computation or storage. The model achieves both identity authentication and behavior authentication. The results of emulation experiments show that the proposed model can efficiently and safely construct trust relationships in cross-cloud environments.

Wenjuan Li, Lingdi Ping
Dynamic Malicious Code Detection Based on Binary Translator

A binary translator is a software component of a computer system that converts binary code of one ISA into binary code of another ISA. Recent trends show that binary translators have been used to save CPU power consumption and CPU die size, which makes them a potentially indispensable component of future computer systems. This situation creates new opportunities for the security of these computer systems. One such opportunity is that we can perform malicious code checking dynamically in the binary translation layer. This approach has many advantages, both in terms of detection capability and checking overhead. In this paper, we propose a working dynamic malicious code checking module integrated into an existing open-source binary translator, QEMU, and explain how our module's detection capability is superior to other malicious code checking methods while acceptable performance is still maintained.

Zhe Fang, Minglu Li, Chuliang Weng, Yuan Luo
A Privacy Manager for Cloud Computing

We describe a privacy manager for cloud computing, which reduces the risk to the cloud computing user of their private data being stolen or misused, and also assists the cloud computing provider to conform to privacy law. We describe different possible architectures for privacy management in cloud computing; give an algebraic description of obfuscation, one of the features of the privacy manager; and describe how the privacy manager might be used to protect private metadata of online photos.

Siani Pearson, Yun Shen, Miranda Mowbray
Privacy in a Semantic Cloud: What’s Trust Got to Do with It?

The semantic web can benefit from cloud computing as a platform, but for semantic technologies to gain wide adoption, a solution to the privacy challenges of the cloud is necessary. In this paper we present a brief survey on recent work on privacy and trust for the semantic web, and sketch a middleware solution for privacy protection that leverages probabilistic methods for automated trust and privacy management for the semantic web.

Åsmund Ahlmann Nyre, Martin Gilje Jaatun
Data Protection-Aware Design for Cloud Services

The Cloud is a relatively new concept and so it is unsurprising that the information assurance, data protection, network security and privacy concerns have yet to be fully addressed. This paper seeks to begin the process of designing data protection controls into clouds from the outset so as to avoid the costs associated with bolting on security as an afterthought. Our approach is firstly to consider cloud maturity from an enterprise level perspective, describing a novel capability maturity model. We use this model to explore privacy controls within an enterprise cloud deployment, and explore where there may be opportunities to design in data protection controls as exploitation of the Cloud matures. We demonstrate how we might enable such controls via the use of design patterns. Finally, we consider how Service Level Agreements (SLAs) might be used to ensure that third party suppliers act in support of such controls.

Sadie Creese, Paul Hopkins, Siani Pearson, Yun Shen
Accountability as a Way Forward for Privacy Protection in the Cloud

The issue of how to provide appropriate privacy protection for cloud computing is important, and as yet unresolved. In this paper we propose an approach in which procedural and technical solutions are co-designed to demonstrate accountability as a path forward to resolving jurisdictional privacy and security risks within the cloud.

Siani Pearson, Andrew Charlesworth
Towards an Approach of Semantic Access Control for Cloud Computing

With the development of cloud computing, mutual understandability among distributed Access Control Policies (ACPs) has become an important issue in the security field of cloud computing. Semantic Web technology provides a solution to the semantic interoperability of heterogeneous applications. In this paper, we analyze existing access control methods and present a new Semantic Access Control Policy Language (SACPL) for describing ACPs in the cloud computing environment. An Access Control Oriented Ontology System (ACOOS) is designed as the semantic basis of SACPL. The ontology-based SACPL language can effectively solve the interoperability issue of distributed ACPs. This study enriches the research on applying Semantic Web technology in the security field, and provides a new way of thinking about access control in cloud computing.

Luokai Hu, Shi Ying, Xiangyang Jia, Kai Zhao
Identity-Based Authentication for Cloud Computing

Cloud computing is a recently developed technology for complex systems with massive-scale service sharing among numerous users. Therefore, authentication of both users and services is a significant issue for the trust and security of cloud computing. The SSL Authentication Protocol (SAP), once applied to cloud computing, becomes so complicated that users face a heavy load in both computation and communication. This paper, based on the identity-based hierarchical model for cloud computing (IBHMCC) and its corresponding encryption and signature schemes, presents a new identity-based authentication protocol for cloud computing and services. Simulation testing shows that the authentication protocol is more lightweight and efficient than SAP, especially on the user side. This merit, together with its scalability, makes the model well suited to massive-scale clouds.

Hongwei Li, Yuanshun Dai, Ling Tian, Haomiao Yang
Strengthen Cloud Computing Security with Federal Identity Management Using Hierarchical Identity-Based Cryptography

More and more companies are beginning to provide different kinds of cloud computing services for Internet users; at the same time, these services also bring some security problems. Currently, the majority of cloud computing systems provide a digital identity for users to access their services, which brings some inconvenience for a hybrid cloud that includes multiple private clouds and/or public clouds. Today most cloud computing systems use asymmetric and traditional public key cryptography to provide data security and mutual authentication. Identity-based cryptography has some attractive characteristics that seem to fit the requirements of cloud computing well. In this paper, by adopting federated identity management together with hierarchical identity-based cryptography (HIBC), not only key distribution but also mutual authentication can be simplified in the cloud.

Liang Yan, Chunming Rong, Gansen Zhao
Availability Analysis of a Scalable Intrusion Tolerant Architecture with Two Detection Modes

In this paper we consider a discrete-time availability model of an intrusion tolerant system with two detection modes: an automatic detection mode and a manual detection mode. The stochastic behavior of the system is formulated by a discrete-time semi-Markov process and analyzed through an embedded Markov chain (EMC) approach. We derive the optimal switching time from the automatic detection mode to the manual detection mode, which maximizes the steady-state system availability. Numerical examples are presented to illustrate the optimal switching of detection mode and its availability performance. Keywords: availability, detection mode, EMC approach, cloud computing environment.
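
The paper's exact model is not reproduced here; as a reminder of the standard quantity being maximized, steady-state availability in a semi-Markov (renewal-reward) setting is the expected fraction of a regeneration cycle spent operational. A generic form, with the switching time entering through the expected up and down times, is

$$A(t_0) \;=\; \frac{\mathbb{E}[U(t_0)]}{\mathbb{E}[U(t_0)] + \mathbb{E}[D(t_0)]}, \qquad t_0^{*} \;=\; \arg\max_{t_0 \ge 0} A(t_0),$$

where $\mathbb{E}[U(t_0)]$ and $\mathbb{E}[D(t_0)]$ denote the mean operational and non-operational times per cycle when the system switches from automatic to manual detection at time $t_0$ (our notation, not necessarily the paper's).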

Toshikazu Uemura, Tadashi Dohi, Naoto Kaio
Data Center Consolidation: A Step towards Infrastructure Clouds

Application service providers face enormous challenges and rising costs in managing and operating a growing number of heterogeneous system and computing landscapes. Limitations of traditional computing environments force IT decision-makers to reorganize computing resources within the data center, as continuous growth leads to an inefficient utilization of the underlying hardware infrastructure. This paper discusses a way for infrastructure providers to improve data center operations based on the findings of a case study on resource utilization of very large business applications and presents an outlook beyond server consolidation endeavors, transforming corporate data centers into compute clouds.

Markus Winter
Decentralized Service Allocation in a Broker Overlay Based Grid

Grid computing is based on coordinated resource sharing in a dynamic environment of multi-institutional virtual organizations. Data exchange and service allocation are challenging problems in the field of Grid computing, due to the decentralization of Grid systems. Building decentralized Grid systems with efficient resource management and software component mechanisms is necessary for achieving the required efficiency and usability of Grid systems. In this work, a decentralized Grid system model is presented in which the system is divided into virtual organizations, each controlled by a broker. An overlay network of brokers is responsible for global resource management and for managing the allocation of services. Experimental results show that the system achieves dependable performance with various loads of services and broker failures.

Abdulrahman Azab, Hein Meling
DisTec: Towards a Distributed System for Telecom Computing

The continued exponential growth in both the volume and the complexity of information, compared with the computing capacity of silicon-based devices restricted by Moore's Law, is giving rise to a new challenge for the specific requirements of analysts, researchers and intelligence providers. In response to this challenge, a new class of techniques and computing platforms, such as the Map-Reduce model, which mainly focus on scalability and parallelism, has been emerging. In this paper, to move the scientific prototype forward to practice, we elaborate a prototype of our applied distributed system, DisTec, for knowledge discovery from a social network perspective in the field of telecommunications. The major infrastructure is constructed on Hadoop, an open-source counterpart of Google's Map-Reduce. We carefully devised our system to undertake mining tasks on terabytes of call records. To illustrate its functionality, DisTec is applied to a real-world large-scale telecom dataset. The experiments range from initial raw data preprocessing to final knowledge extraction. We demonstrate that our system has good performance in such cloud-scale data computing.

Shengqi Yang, Bai Wang, Haizhou Zhao, Yuan Gao, Bin Wu
Cloud Computing Boosts Business Intelligence of Telecommunication Industry

Business Intelligence has become an attractive topic in today's data-intensive applications, especially in the telecommunication industry. Meanwhile, Cloud Computing, providing an IT supporting infrastructure with excellent scalability, large-scale storage, and high performance, has become an effective way to implement parallel data processing and data mining algorithms. BC-PDM (Big Cloud based Parallel Data Miner) is a new MapReduce-based parallel data mining platform developed by CMRI (China Mobile Research Institute) to meet the urgent requirements of business intelligence in the telecommunication industry. In this paper, the architecture, functionality and performance of BC-PDM are presented, together with an experimental evaluation and case studies of its applications. The evaluation results demonstrate both the usability and the cost-effectiveness of a Cloud Computing based Business Intelligence system in applications of the telecommunication industry.

Meng Xu, Dan Gao, Chao Deng, Zhiguo Luo, Shaoling Sun
Composable IO: A Novel Resource Sharing Platform in Personal Clouds

A fundamental goal of Cloud computing is to group resources to accomplish tasks that may require strong computing or communication capability. In this paper we design a specific resource sharing technology under which IO peripherals can be shared among Cloud members. In particular, in a personal Cloud that is built up from a number of personal devices, IO peripherals at any device can be used to support applications running on another device. We call this IO sharing composable IO because it is equivalent to composing IOs from different devices for an application. We design composable USB and achieve pro-migration USB access, namely that a migrated application running on the target host can still access the USB IO peripherals at the source host. This is complementary to traditional VM migration, under which an application can only use resources from the device where it runs. Experimental results show that through composable IO, applications in a personal Cloud can achieve a much better user experience.

Xiaoxin Wu, Wei Wang, Ben Lin, Kai Miao
SLA-Driven Adaptive Resource Management for Web Applications on a Heterogeneous Compute Cloud

Current service-level agreements (SLAs) offered by cloud providers make guarantees about quality attributes such as availability. However, although one of the most important quality attributes from the perspective of the users of a cloud-based Web application is its response time, current SLAs do not guarantee response time. Satisfying a maximum average response time guarantee for Web applications is difficult due to unpredictable traffic patterns, but in this paper we show how it can be accomplished through dynamic resource allocation in a virtual Web farm. We present the design and implementation of a working prototype built on a EUCALYPTUS-based heterogeneous compute cloud that actively monitors the response time of each virtual machine assigned to the farm and adaptively scales up the application to satisfy an SLA promising a specific average response time. We demonstrate the feasibility of the approach in an experimental evaluation with a testbed cloud and a synthetic workload. Adaptive resource management has the potential to increase the usability of Web applications while maximizing resource utilization.
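
The prototype's implementation is not given in the abstract; the sketch below is a minimal, self-contained illustration of the general control loop such a system might use: compare the observed average response time against the SLA target and decide how many VMs to add. The thresholds, function names and the simulated monitor are assumptions for illustration, not the authors' design.

```python
# Illustrative control loop for SLA-driven scaling (all numbers are placeholders).
import random
import statistics

SLA_TARGET_MS = 300          # promised average response time
SCALE_OUT_MARGIN = 0.9       # act before the target is actually breached


def sample_response_times(num_vms):
    """Stand-in for a real monitor; latency improves as VMs are added."""
    base = 500 / max(num_vms, 1)
    return [base + random.uniform(0, 100) for _ in range(50)]


def scaling_decision(avg_ms, num_vms):
    """Return how many VMs to add (0 means no action)."""
    if avg_ms > SLA_TARGET_MS * SCALE_OUT_MARGIN:
        # Naive proportional step: roughly one extra VM per 25% overload.
        overload = avg_ms / SLA_TARGET_MS
        return max(1, int(round((overload - 1) * 4)))
    return 0


num_vms = 1
for epoch in range(10):
    avg = statistics.mean(sample_response_times(num_vms))
    added = scaling_decision(avg, num_vms)
    num_vms += added
    print(f"epoch={epoch} avg={avg:.0f}ms vms={num_vms} (+{added})")
```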

Waheed Iqbal, Matthew Dailey, David Carrera
Cost of Virtual Machine Live Migration in Clouds: A Performance Evaluation

Virtualization has become commonplace in modern data centers, often referred to as "computing clouds". The capability of virtual machine live migration brings benefits such as improved performance, manageability and fault tolerance, while allowing workload movement with a short service downtime. However, the service levels of applications are likely to be negatively affected during a live migration. For this reason, a better understanding of its effects on system performance is desirable. In this paper, we evaluate the effects of live migration of virtual machines on the performance of applications running inside Xen VMs. Results show that, in most cases, migration overhead is acceptable but cannot be disregarded, especially in systems where availability and responsiveness are governed by strict Service Level Agreements. Despite that, there is high potential for live migration applicability in data centers serving modern Internet applications. Our results are based on a workload covering the domain of multi-tier Web 2.0 applications.

William Voorsluys, James Broberg, Srikumar Venugopal, Rajkumar Buyya
Cloud-Oriented Virtual Machine Management with MLN

System administrators are faced with the challenge of making their existing systems power-efficient and scalable. Although Cloud Computing is offered by many as a solution to this challenge, we argue that having multiple interfaces and cloud providers can result in more complexity than before. This paper addresses cloud computing from a user perspective. We show how complex scenarios, such as an on-demand render farm and a scaling web service, can be achieved utilizing clouds while keeping the same management interface as for local virtual machines. Further, we demonstrate that by enabling the virtual machine to hold its policy locally instead of in the underlying framework, it can move between otherwise incompatible cloud providers and sites in order to achieve its goals more efficiently.

Kyrre Begnum, Nii Apleh Lartey, Lu Xing
A Systematic Process for Developing High Quality SaaS Cloud Services

Software-as-a-Service (SaaS) is a type of cloud service which provides software functionality through the Internet. Its benefits are well received in academia and industry. To fully utilize these benefits, there should be effective methodologies to support the development of SaaS services that provide high reusability and applicability. Conventional approaches such as object-oriented methods do not effectively support SaaS-specific engineering activities such as modeling common features, variability, and designing quality services. In this paper, we present a systematic process for developing high-quality SaaS and highlight the essential role of commonality and variability (C&V) modeling in maximizing reusability. We first define criteria for designing the process model and provide a theoretical foundation for SaaS: its meta-model and C&V model. We clarify the notion of commonality and variability in SaaS, and propose a SaaS development process accompanied by engineering instructions. Using the proposed process, SaaS services with high quality can be developed effectively.

Hyun Jung La, Soo Dong Kim
Cloud Computing Service Composition and Search Based on Semantic

In this paper, we put forward SMA, a matching algorithm for cloud computing services with multiple input/output parameters, which considers the semantic similarity of parameter concepts based on WordNet. Moreover, a highly effective service composition algorithm, Fast-EP, and its improvement, FastB+-EP, are presented. QoS information is then utilized to rank the search results. Finally, we show through experiments that our approach achieves better service composition efficiency than traditional approaches.
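
The SMA algorithm itself is not reproduced in the abstract; the sketch below only illustrates the underlying building block it mentions, WordNet-based concept similarity between service parameters, using NLTK. The parameter names and the 0.5 matching threshold are arbitrary choices for the example.

```python
# Illustrative WordNet-based parameter matching (requires: nltk.download('wordnet')).
from nltk.corpus import wordnet as wn


def concept_similarity(term_a, term_b):
    """Best path similarity between any pair of synsets of the two terms."""
    best = 0.0
    for sa in wn.synsets(term_a):
        for sb in wn.synsets(term_b):
            sim = sa.path_similarity(sb)
            if sim is not None and sim > best:
                best = sim
    return best


# Hypothetical service parameters: one service's output vs another's input.
output_param, input_param = "price", "cost"
sim = concept_similarity(output_param, input_param)
print(f"similarity({output_param}, {input_param}) = {sim:.2f}")
print("match" if sim >= 0.5 else "no match")   # arbitrary threshold
```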

Cheng Zeng, Xiao Guo, Weijie Ou, Dong Han
Deploying Mobile Computation in Cloud Service

Cloud computing advocates a service-oriented computing paradigm where various kinds of resources are organized in a virtual way. How to specify and execute tasks to make use of the resources efficiently thus becomes an important problem in cloud computing. Mobile computation is often regarded as a good alternative to conventional RPC-based technology for situations where resources can be dynamically bound to computations. In this paper, we propose a middleware framework for cloud computing to deploy mobile computation, especially mobile agent technology, in cloud services. The major issues in enabling mobile agent-based services in service-oriented computing are discussed and the corresponding mechanisms in the framework are introduced.

Xuhui Li, Hao Zhang, Yongfa Zhang
A Novel Method for Mining SaaS Software Tag via Community Detection in Software Services Network

The number of online software services based on the SaaS paradigm is increasing. However, users usually find it hard to locate the exact software services they need. At present, tags are widely used to annotate specific software services and to facilitate searching for them. Currently these tags are arbitrary and ambiguous, since most of them are generated manually by service developers. This paper proposes a method for mining tags from the help documents of software services. By extracting terms from the help documents and calculating the similarity between the terms, we construct a software similarity network where nodes represent software services, edges denote the similarity relationship between software services, and the weights of the edges are the similarity degrees. A hierarchical clustering algorithm is used for community detection in this software similarity network. At the final stage, tags are mined for each of the communities and stored as an ontology.
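
The abstract does not give the clustering details; as a minimal illustration of the step it describes (hierarchical clustering over a pairwise similarity matrix to obtain service communities), here is a sketch with SciPy. The services and similarity values are made up, and the cut threshold is arbitrary.

```python
# Illustrative community detection by hierarchical clustering on similarities.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

services = ["photo_editor", "image_resizer", "invoice_tool", "tax_calculator"]

# Hypothetical pairwise similarity (1.0 on the diagonal); convert to distance.
similarity = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
])
distance = 1.0 - similarity
np.fill_diagonal(distance, 0.0)

# Average-linkage clustering on the condensed distance matrix.
tree = linkage(squareform(distance), method="average")
labels = fcluster(tree, t=0.5, criterion="distance")   # cut threshold is arbitrary

for name, label in zip(services, labels):
    print(f"{name} -> community {label}")
```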

Li Qin, Bing Li, Wei-Feng Pan, Tao Peng
Retrieving and Indexing Spatial Data in the Cloud Computing Environment

In order to overcome the drawbacks of spatial data storage on common Cloud Computing platforms, we design and present a framework for retrieving, indexing, accessing and managing spatial data in the Cloud environment. An interoperable spatial data object model is provided based on the Simple Feature coding rules from the OGC, such as Well Known Binary (WKB) and Well Known Text (WKT). The classic spatial indexing algorithms, such as Quad-Tree and R-Tree, are redesigned for the Cloud Computing environment. Finally, we develop prototype software based on Google App Engine to implement the proposed model.
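
The paper's cloud-adapted index structures are not included here; as a reference point for what is being redesigned, the sketch below is a textbook point quad-tree with insertion and rectangular range query, written for illustration only.

```python
# Minimal point quad-tree (insert + range query) for illustration.
class QuadTree:
    CAPACITY = 4  # max points per node before splitting

    def __init__(self, x0, y0, x1, y1):
        self.bounds = (x0, y0, x1, y1)
        self.points = []
        self.children = None  # four sub-quadrants once split

    def insert(self, x, y):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x < x1 and y0 <= y < y1):
            return False
        if self.children is None:
            if len(self.points) < self.CAPACITY:
                self.points.append((x, y))
                return True
            self._split()
        return any(child.insert(x, y) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(x0, y0, mx, my), QuadTree(mx, y0, x1, my),
                         QuadTree(x0, my, mx, y1), QuadTree(mx, my, x1, y1)]
        for px, py in self.points:
            any(child.insert(px, py) for child in self.children)
        self.points = []

    def query(self, qx0, qy0, qx1, qy1):
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 >= x1 or qy1 < y0 or qy0 >= y1:
            return []
        hits = [(px, py) for px, py in self.points
                if qx0 <= px <= qx1 and qy0 <= py <= qy1]
        if self.children:
            for child in self.children:
                hits.extend(child.query(qx0, qy0, qx1, qy1))
        return hits


tree = QuadTree(0, 0, 100, 100)
for pt in [(10, 10), (12, 14), (80, 80), (50, 45)]:
    tree.insert(*pt)
print(tree.query(0, 0, 20, 20))   # points in the lower-left window
```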

Yonggang Wang, Sheng Wang, Daliang Zhou
Search Engine Prototype System Based on Cloud Computing

With the development of the Internet, IT support systems need to provide more storage space and faster computing power for Internet applications such as search engines. The emergence of cloud computing can effectively solve these problems. In this paper, we present a search engine prototype system based on a cloud computing platform.

Jinyu Han, Min Hu, Hongwei Sun
Distributed Structured Database System HugeTable

The demand for analyzing and processing massive data has increased in recent years. Though several optimized versions have been developed, traditional RDBMSs still meet many difficulties when facing such huge volumes of data. A newly designed distributed structured database, HugeTable, is proposed, which has the advantages of supporting very large data volumes and fast queries. HugeTable also has good compatibility with the standard SQL query language. Its basic functions, system architecture and critical techniques are discussed in detail. Its usability and efficiency are demonstrated by experiments.

Ji Qi, Ling Qian, Zhiguo Luo
Cloud Computing: A Statistics Aspect of Users

Cloud computing delivers elastic computing services to users based on their needs. Cloud computing service providers must make sure enough resources are provided to meet users' demand, either by provisioning more than enough resources or by provisioning just-enough resources. This paper investigates the growth of the user set of a cloud computing service from a statistical point of view. The investigation leads to a simple model of the growth of the system in terms of users. This model provides a simple way to compute the scale of a system at a given time, thus allowing a cloud computing service provider to predict the system's scale based on the number of users and plan for infrastructure deployment.
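
The paper's statistical model is not given in this abstract; purely to illustrate the kind of prediction it describes, the sketch below fits a simple exponential growth curve to hypothetical user counts and projects the user base forward. The data and the assumed growth form are invented for the example.

```python
# Illustration only: fit an exponential growth curve to user counts and
# project the user base (not the paper's actual model; data is invented).
import numpy as np

weeks = np.arange(8)
users = np.array([120, 150, 190, 240, 300, 380, 470, 600])

# Linear fit on log(users): log u(t) ~ a*t + b, i.e. u(t) ~ exp(b) * exp(a*t)
a, b = np.polyfit(weeks, np.log(users), 1)


def projected_users(week):
    return float(np.exp(b) * np.exp(a * week))


print(f"weekly growth rate ~ {np.exp(a) - 1:.1%}")
print(f"projected users at week 12: {projected_users(12):.0f}")
```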

Gansen Zhao, Jiale Liu, Yong Tang, Wei Sun, Feng Zhang, Xiaoping Ye, Na Tang
An Efficient Cloud Computing-Based Architecture for Freight System Application in China Railway

Cloud computing is a new network computing paradigm for distributed application environments. It utilizes computing resources and storage resources to dynamically provide on-demand services for users. The distributed and parallel characteristics of cloud computing can benefit the railway freight system. We implement a cloud computing-based architecture for a freight system application, which uses Tashi and Hadoop for virtual resource management and MapReduce-based search technology. We propose a semantic model, set up configuration parameters by experiment, and develop a prototype system for freight search and tracking.

Baopeng Zhang, Ning Zhang, Honghui Li, Feng Liu, Kai Miao
Web Server Farm in the Cloud: Performance Evaluation and Dynamic Architecture

Web applications' traffic demand fluctuates widely and unpredictably. The common practice of provisioning a fixed capacity would either result in unsatisfied customers (underprovisioning) or waste valuable capital investment (overprovisioning). By leveraging an infrastructure cloud's on-demand, pay-per-use capabilities, we can finally match capacity with demand in real time. This paper investigates how to build a web server farm in the cloud. We first present a benchmark performance study of various cloud components, which not only shows their performance results, but also reveals their limitations. Because of these limitations, no single configuration of cloud components can excel in all traffic scenarios. We then propose a dynamic switching architecture which switches among several configurations depending on the workload and traffic pattern.

Huan Liu, Sewook Wee
SPECI, a Simulation Tool Exploring Cloud-Scale Data Centres

There is a rapid increase in the size of data centres (DCs) used to provide cloud computing services. It is commonly agreed that not all properties in the middleware that manages DCs will scale linearly with the number of components. Further, “normal failure” complicates the assessment of the performance of a DC. However, unlike in other engineering domains, there are no well established tools that allow the prediction of the performance and behaviour of future generations of DCs. SPECI, Simulation Program for Elastic Cloud Infrastructures, is a simulation tool which allows exploration of aspects of scaling as well as performance properties of future DCs.

Ilango Sriram
CloudWF: A Computational Workflow System for Clouds Based on Hadoop

This paper describes CloudWF, a scalable and lightweight computational workflow system for clouds on top of Hadoop. CloudWF can run workflow jobs composed of multiple Hadoop MapReduce or legacy programs. Its novelty lies in several aspects: a simple workflow description language that encodes workflow blocks and block-to-block dependencies separately as standalone executable components; a new workflow storage method that uses Hadoop HBase sparse tables to store workflow information internally and reconstruct workflow block dependencies implicitly for efficient workflow execution; transparent file staging with Hadoop DFS; and decentralized workflow execution management relying on the MapReduce framework for task scheduling and fault tolerance. This paper describes the design and implementation of CloudWF.
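
CloudWF's actual schema is not included in the abstract; purely to illustrate the idea of keeping workflow blocks and block-to-block dependencies in HBase sparse tables, the sketch below writes a tiny two-block workflow using the happybase client. Table names, column families and row keys are invented for the example, and the tables are assumed to already exist.

```python
# Illustration of storing workflow blocks and dependencies in HBase sparse
# tables (schema and names are hypothetical, not CloudWF's actual design).
import happybase

connection = happybase.Connection("hbase-host")  # placeholder host
blocks = connection.table("wf_blocks")
deps = connection.table("wf_deps")

# Each block is a standalone executable component stored as one sparse row.
blocks.put(b"wf1:blockA", {b"cmd:exec": b"hadoop jar prep.jar", b"state:status": b"READY"})
blocks.put(b"wf1:blockB", {b"cmd:exec": b"hadoop jar analyse.jar", b"state:status": b"WAITING"})

# Dependency row: blockB runs after blockA (one column per upstream block).
deps.put(b"wf1:blockB", {b"after:wf1:blockA": b""})

# A scheduler can later scan wf_deps to reconstruct the DAG implicitly.
for key, data in deps.scan(row_prefix=b"wf1:"):
    print(key, list(data))
```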

Chen Zhang, Hans De Sterck
A Novel Multipath Load Balancing Algorithm in Fat-Tree Data Center

The rapid development of CPU technology, storage technology and bandwidth improvements has given rise to strong research interest in cloud computing technology. As a basic infrastructure component of cloud computing, the data center becomes more and more important. Based on an analysis of transmission efficiency, a novel hierarchical flow multipath forwarding (HFMF) algorithm is proposed. HFMF can use adaptive flow-splitting schemes according to the corresponding level in the three-tier fat-tree topology. NS2 simulations show that HFMF reduces out-of-order packet arrivals and achieves better load balancing in the data center.

Laiquan Han, Jinkuan Wang, Cuirong Wang
Scheduling Active Services in Clustered JBI Environment

Active services may cause business or runtime errors in a clustered JBI environment. To cope with this problem, a scheduling mechanism is proposed. The overall scheduling framework and scheduling algorithm are given to guarantee conflict-freedom and load balance of active services. The scheduling mechanism is implemented in SOAWARE, an SOA-based application integration platform for electric enterprises, and experiments demonstrate the effectiveness of the scheduling algorithm.

Xiangyang Jia, Shi Ying, Luokai Hu, Chunlin Chen
Task Parallel Scheduling over Multi-core System

Parallel scheduling research on multi-core systems has become more and more popular due to their computing capacity. Scheduling fairness and load balance are the key performance indicators for current scheduling algorithms. The behavior of a scheduler can be modeled as follows: accept the task state graph, analyze task scheduling, and put the produced tasks into the scheduling queue. Current algorithms predict scheduling actions according to the history of task scheduling. One disadvantage is that they become less efficient when task costs differ greatly. Our contribution is to split a long task into small subtasks, form a new task state graph, and schedule the subtasks in parallel into the task queue. Experiments show that a 20% performance improvement is reached in comparison with the traditional method.

Bo Wang
Cost-Minimizing Scheduling of Workflows on a Cloud of Memory Managed Multicore Machines

Workflows are modeled as hierarchically structured directed acyclic graphs in which vertices represent computational tasks, referred to as requests, and edges represent precedent constraints among requests. Associated with each workflow is a deadline that defines the time by which all computations of a workflow should be complete. Workflows are submitted by numerous clients to a scheduler that assigns workflow requests to a cloud of memory managed multicore machines for execution. A cost function is assumed to be associated with each workflow, which maps values of relative workflow tardiness to corresponding cost function values. A novel cost-minimizing scheduling framework is introduced to schedule requests of workflows so as to minimize the sum of cost function values for all workflows. The utility of the proposed scheduler is compared to another previously known scheduling policy.
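
The abstract describes the objective only in words; a generic way to write down this kind of tardiness-cost objective (our notation, not necessarily the paper's) is

$$\min_{\text{schedule}} \; \sum_{w \in W} c_w(\tau_w), \qquad \tau_w \;=\; \max\!\left(0,\; \frac{C_w - d_w}{d_w}\right),$$

where $C_w$ is the completion time of workflow $w$, $d_w$ its deadline, $\tau_w$ its relative tardiness, and $c_w(\cdot)$ the workflow's cost function, subject to the precedence constraints of each workflow's DAG.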

Nicolas G. Grounds, John K. Antonio, Jeff Muehring
Green Cloud on the Horizon

This paper proposes a Green Cloud model for mobile Cloud computing. The proposed model leverages the current trend of IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service), and looks at a new paradigm called "Network as a Service" (NaaS). The Green Cloud model proposes various Telco revenue-generating streams and services with CaaS (Cloud as a Service) for the near future.

Mufajjul Ali
Industrial Cloud: Toward Inter-enterprise Integration

The industrial cloud is introduced as a new inter-enterprise integration concept in cloud computing. The characteristics of an industrial cloud are given through its definition and architecture and compared with other general cloud concepts. The concept is then demonstrated by a practical use case, based on Integrated Operations (IO) in the Norwegian Continental Shelf (NCS), showing how an industrial digital information integration platform gives a competitive advantage to the companies involved. Further research and development challenges are also discussed.

Tomasz Wiktor Wlodarczyk, Chunming Rong, Kari Anne Haaland Thorsen
Community Cloud Computing

Cloud Computing is rising fast, with its data centres growing at an unprecedented rate. However, this has come with concerns over privacy, efficiency at the expense of resilience, and environmental sustainability, because of the dependence on Cloud vendors such as Google, Amazon and Microsoft. Our response is an alternative model for the Cloud conceptualisation, providing a paradigm for Clouds in the community, utilising networked personal computers for liberation from the centralised vendor model. Community Cloud Computing (C3) offers an alternative architecture, created by combining the Cloud with paradigms from Grid Computing, principles from Digital Ecosystems, and sustainability from Green Computing, while remaining true to the original vision of the Internet. It is more technically challenging than Cloud Computing, having to deal with distributed computing issues, including heterogeneous nodes, varying quality of service, and additional security constraints. However, these are not insurmountable challenges, and with the need to retain control over our digital lives and the potential environmental consequences, it is a challenge we must pursue.

Alexandros Marinos, Gerard Briscoe
A Semantic Grid Oriented to E-Tourism

With the increasing complexity of tourism business models and tasks, there is a clear need for a next-generation e-Tourism infrastructure to support flexible automation, integration, computation, storage, and collaboration. Several enabling technologies such as the Semantic Web, Web services, agents and grid computing have been applied in different e-Tourism applications; however, there is no unified framework able to integrate all of them. This paper therefore presents a promising e-Tourism framework based on the emerging semantic grid, in which a number of key design issues are discussed, including architecture, ontology structure, semantic reconciliation, service and resource discovery, role-based authorization and intelligent agents. The paper finally describes an implementation of the framework.

Xiao Ming Zhang
Irregular Community Discovery for Social CRM in Cloud Computing

Social CRM is critical to utility services provided by cloud computing. These services rely on virtual customer communities forming spontaneously and evolving continuously. Clarifying the explicit boundaries of these communities is thus essential to the quality of utility services in cloud computing. Communities with overlapping features or projecting vertexes are typical irregular communities. Traditional community identification algorithms are limited in discovering irregular topological structures from CR networks. These uneven shapes usually play a prominent role in finding prominent customers, which is usually ignored in social CRM. A novel method of discovering irregular communities based on a density threshold and similarity degree is proposed. It first finds and merges primitive maximal cliques. Irregular features of overlapping and prominent sparse vertexes are further considered. An empirical case and a method comparison test indicate its efficiency and feasibility.

Jin Liu, Fei Liu, Jing Zhou, ChengWan He
A Contextual Information Acquisition Approach Based on Semantics and Mashup Technology

Pay per use is an essential feature of cloud computing. Users can make use of some parts of a large-scale service to satisfy their requirements, merely at the cost of a small payment. A good understanding of the users' requirements is a prerequisite for choosing the needed service precisely. Context implies users' potential requirements, which can complement the requirements delivered explicitly. However, traditional context-aware computing research usually demands specific kinds of sensors to acquire contextual information, which sets too high a threshold for an application to become context-aware. This paper proposes an approach which combines contextual information obtained directly and indirectly from cloud services. Semantic relationships between different kinds of contexts lay the foundation for searching the cloud services, and mashup technology is adopted to compose the heterogeneous services. Abundant contextual information may lend strong support to a comprehensive understanding of users' context and a better abstraction of contextual requirements.

Yangfan He, Lu Li, Keqing He, Xiuhong Chen
Evaluating MapReduce on Virtual Machines: The Hadoop Case

MapReduce is emerging as an important programming model for large-scale parallel applications. Meanwhile, Hadoop, an open source implementation of MapReduce, enjoys wide popularity for developing data-intensive applications in the cloud. Since, in the cloud, the computing unit is virtual machine (VM) based, it is useful to demonstrate the applicability of MapReduce on a virtualized data center. Although the potential for poor performance and heavy load no doubt exists, virtual machines can instead be used to fully utilize the system resources, ease the management of such systems, improve reliability, and save power. In this paper, a series of experiments are conducted to measure and analyze the performance of Hadoop on VMs. Our experiments are used as a basis for outlining several issues that will need to be considered when implementing MapReduce to fit completely in the cloud.

Shadi Ibrahim, Hai Jin, Lu Lu, Li Qi, Song Wu, Xuanhua Shi
APFA: Asynchronous Parallel Finite Automaton for Deep Packet Inspection in Cloud Computing

Security in cloud computing is becoming more and more important. Besides passive defenses such as encryption, it is necessary to implement real-time active monitoring, detection and defense in the cloud. According to published research, DPI (deep packet inspection) is the most effective technology for realizing active inspection and defense. However, most recent work on DPI aims at space reduction and cannot meet the demands of high speed and stability in the cloud. It is therefore important to improve regular DPI methods, making them more suitable for cloud computing. In this paper, an asynchronous parallel finite automaton named APFA is proposed, which introduces asynchronous parallelization and a heuristic forecast mechanism, significantly decreasing the time consumed in matching while still reducing the memory required. What is more, APFA is immune to the overlapping problem, so stability is also enhanced. The evaluation results show that APFA achieves higher stability and better performance in time and memory. In short, APFA is more suitable for cloud computing.

Yang Li, Zheng Li, Nenghai Yu, Ke Ma

Short Papers

Secure Document Service for Cloud Computing

The development of cloud computing is still in its initial stage, and the biggest obstacle is data security. How to guarantee the privacy of user data is a worthwhile study. This paper proposes a secure document service mechanism based on cloud computing. For security, in this mechanism the content and the format of documents are separated prior to handling and storage. In addition, documents can be accessed safely through an optimized authorization method. This mechanism protects documents stored in a cloud environment from leakage and provides an infrastructure for establishing reliable cloud services.

Jin-Song Xu, Ru-Cheng Huang, Wan-Ming Huang, Geng Yang
Privacy of Value-Added Context-Aware Service Cloud

In the cloud computing era, the service provider cloud and the context service cloud store all your personal context data. This is a positive aspect for a value-added context-aware service cloud, as it makes context information collection easier than was previously the case. However, this computing environment also adds a series of threats in relation to privacy protection. Whoever receives the context information is able to deduce the status of the owners, and owners are generally not happy to share this information. In this paper, we propose a privacy-preserving framework which can be utilized by a value-added context-aware service cloud. Context data and related service access privileges are determined by context-aware role-based access control (CRAC), extended from role-based access control (RAC). A privacy-preserving context service protocol (PPCS) is designed to protect user privacy from exposed context information. Additionally, the user network and information diffusion are combined to evaluate the privacy protection effect.

Xin Huang, Yin He, Yifan Hou, Lisi Li, Lan Sun, Sina Zhang, Yang Jiang, Tingting Zhang
A Simple Technique for Securing Data at Rest Stored in a Computing Cloud

“Cloud Computing” offers many potential benefits, including cost savings, the ability to deploy applications and services quickly, and the ease of scaling those applications and services once they are deployed. A key barrier to enterprise adoption is the confidentiality of data stored on Cloud Computing infrastructure. Our simple technique, implemented with Open Source software, solves this problem by using public key encryption to render stored data at rest unreadable by unauthorized personnel, including system administrators of the cloud computing service on which the data is stored. We validate our approach on a network measurement system implemented on PlanetLab. We then use it on a service where confidentiality is critical: a scanning application that validates external firewall implementations.
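
The authors' Open Source tooling is not named in the abstract; the sketch below only illustrates the general pattern of protecting data at rest with public key encryption (encrypt before upload, so cloud administrators holding only ciphertext cannot read it), using a hybrid RSA plus symmetric scheme with the Python cryptography library. This is a generic illustration, not the paper's implementation.

```python
# Generic illustration: encrypt data with a public key before storing it in
# the cloud; only the holder of the private key can decrypt it.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.fernet import Fernet

# Key pair stays with the data owner; only the public key is needed to encrypt.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Hybrid scheme: symmetric key for the bulk data, RSA to wrap that key.
data_key = Fernet.generate_key()
ciphertext = Fernet(data_key).encrypt(b"sensitive measurement record")
wrapped_key = public_key.encrypt(data_key, oaep)
# Store (ciphertext, wrapped_key) in the cloud; both are useless without the private key.

# Later, back on a trusted host:
recovered_key = private_key.decrypt(wrapped_key, oaep)
print(Fernet(recovered_key).decrypt(ciphertext))
```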

Jeff Sedayao, Steven Su, Xiaohao Ma, Minghao Jiang, Kai Miao
Access Control of Cloud Service Based on UCON

Cloud computing is an emerging computing paradigm, and cloud services are becoming increasingly relevant. Many research communities have recently embarked on work in this area, and research challenges arise in every aspect. This paper mainly discusses cloud service security. Cloud services are based on Web Services, and so face all the kinds of security problems that Web Services face. The development of cloud services is closely related to their security, so research on cloud service security is a very important theme. This paper first introduces cloud computing and cloud services, then presents a cloud service access control model based on UCON and negotiation technologies, and also designs the negotiation module.

Chen Danwei, Huang Xiuli, Ren Xunyi
Replica Replacement Strategy Evaluation Based on Grid Locality

A high degree of data reuse is needed in a grid environment to increase the performance of grid computing. We present a measure of grid locality to identify the degree of data reuse. The measure of grid locality unifies composite factors such as hit ratio, storage buffer size, and network transmission rate. The measure is applied to the evaluation of replica replacement strategies utilized in grid environments. Experiments show that the grid locality measure can effectively evaluate the influence of a replica replacement strategy.

Lihua Ai, Siwei Luo
Performance Evaluation of Cloud Service Considering Fault Recovery

In cloud computing, cloud service performance is an important issue. To improve cloud service reliability, fault recovery may be used. However, the use of fault recovery can impact the performance of a cloud service. In this paper, we conduct a preliminary study on this issue. Cloud service performance is quantified by the service response time, whose probability density function as well as its mean is derived.
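
The paper's derivation is not reproduced in this abstract; as a simple framing of the quantity involved, if each request fails with probability $p$ and a failed request incurs one recovery of mean duration $\mathbb{E}[R]$ before being re-served (an illustrative single-retry assumption, not necessarily the paper's model), the mean response time is roughly

$$\mathbb{E}[T] \;\approx\; \mathbb{E}[S] + p\,\big(\mathbb{E}[R] + \mathbb{E}[S]\big),$$

where $\mathbb{E}[S]$ is the mean fault-free service time; the full distribution of $T$ follows by conditioning on whether recovery occurred.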

Bo Yang, Feng Tan, Yuan-Shun Dai, Suchang Guo
BlueSky Cloud Framework: An E-Learning Framework Embracing Cloud Computing

Currently, E-Learning has grown into a widely accepted way of learning. With the huge growth of users, services, educational content and resources, E-Learning systems face the challenges of optimizing resource allocation, dealing with dynamic concurrency demands, handling rapid storage growth requirements and controlling cost. In this paper, an E-Learning framework based on cloud computing is presented, namely the BlueSky cloud framework. In particular, the architecture and core components of the BlueSky cloud framework are introduced. In the BlueSky cloud framework, physical machines are virtualized and allocated on demand for E-Learning systems. Moreover, the BlueSky cloud framework combines traditional middleware functions (such as load balancing and data caching) to serve E-Learning systems as a general architecture. It delivers reliable, scalable and cost-efficient services to E-Learning systems, and E-Learning organizations can establish systems through these services in a simple way. The BlueSky cloud framework addresses the challenges faced by E-Learning, and improves the performance, availability and scalability of E-Learning systems.

Bo Dong, Qinghua Zheng, Mu Qiao, Jian Shu, Jie Yang
Cloud Infrastructure & Applications – CloudIA

The idea behind Cloud Computing is to deliver Infrastructure-as-a-Service and Software-as-a-Service over the Internet on an easy pay-per-use business model. To harness the potential of Cloud Computing for e-Learning and research purposes, and for small- and medium-sized enterprises, Hochschule Furtwangen University has established a new project, called Cloud Infrastructure & Applications (CloudIA). The CloudIA project is a market-oriented cloud infrastructure that leverages different virtualization technologies by supporting Service-Level Agreements for various service offerings. This paper describes the CloudIA project in detail and reports our early experiences in building a private cloud using an existing infrastructure.

Anthony Sulistio, Christoph Reich, Frank Doelitzscher
One Program Model for Cloud Computing

Cloud computing is dynamically and virtually scalable, with neither central computing nor central storage provided. All resources are virtualized and provided as a service over the Internet. Therefore, differently from traditional programs, cloud programs must be expressed in a new style. Based on a presented architecture of cloud computing, the characteristics of programs in the "cloud" (control, variables and operations) are discussed respectively. Accordingly, a program model for cloud computing is presented for future formalization.

Guofu Zhou, Guoliang He
Enterprise Cloud Architecture for Chinese Ministry of Railway

Enterprises like the PRC Ministry of Railways (MOR) face various challenges, ranging from a highly distributed computing environment to low legacy system utilization; Cloud Computing is increasingly regarded as a workable solution to address this. This article describes a full-scale cloud solution with Intel Tashi as the virtual machine infrastructure layer, Hadoop HDFS as the computing platform, and a self-developed SaaS interface, gluing the virtual machines and HDFS together with the Xen hypervisor. As a result, on-demand computing task application and deployment have been tackled for MOR's real working scenarios, as described at the end of the article.

Xumei Shan, Hefeng Liu
Research on Cloud Computing Based on Deep Analysis to Typical Platforms

Cloud Computing, as the long-term dream of turning computation into a public utility, has the potential to change the IT industry greatly: making software more attractive as a service and changing the way hardware is designed and purchased. Along with the rapid development of Cloud Computing, many organizations have developed different Cloud Computing platforms, expressing their different understandings of the Cloud. Based on these facts, this paper analyzes these understandings, introduces and tests several typical kinds of Cloud Computing platforms, and contrasts them. The purpose of the study is to give deep insight into the trend of Cloud Computing technology and to provide a reference for choosing Cloud Computing platforms according to different needs.

Tianze Xia, Zheng Li, Nenghai Yu
Automatic Construction of SP Problem-Solving Resource Space

The automation and adaptability of software systems to a dynamic environment and requirement variation is a critical ability in cloud computing. This paper organizes vast stacks of problem-solving resources for software processes into a structured resource space according to their topic words. The Resource Space model is developed by continuously adapting to its surroundings, expanding the example group and refining model information. Resource topics are extracted from document resources with the TDDF algorithm. Topic networks are established using topic connection strength. These topic networks are then transformed into maximum spanning trees that are divided into different classification parts by a pruning operation. This work may promote the automation of RS-based software services and the development of novel software development approaches in the cloud computing environment.

Jin Liu, Fei Liu, Xue Chen, Junfeng Wang
An Idea of Special Cloud Computing in Forest Pests’ Control

Forest pests, fires and deforestation are known as the three forest disasters, and the severity of forest pest infestation has increased in recent decades. It is therefore becoming more important to have a strategic approach toward forest pest control.

In the face of increasingly demanding forest pest control work, the existing forest management systems are no longer adequate, and a new management model is urgently needed. After examining a variety of techniques and models, we settled on the concept and application of Cloud Computing.

In this paper, we put forward the idea of a Special Cloud Computing service for forest pest control: a highly specialized cloud computing service applied to forest pest control, provided jointly by cloud computing providers and forest pest management.

Shaocan Jiang, Luming Fang, Xiaoying Huang
IBM Cloud Computing Powering a Smarter Planet

With the increasing need for intelligent systems supporting the world's businesses, Cloud Computing has emerged as a dominant trend to provide the dynamic infrastructure that makes such intelligence possible. This article introduces how to build a smarter planet with cloud computing technology. First, it explains why we need the cloud and traces the evolution of cloud technology. Second, it analyzes the value of cloud computing and how to apply cloud technology. Finally, it predicts the future of the cloud in the smarter planet.

Jinzy Zhu, Xing Fang, Zhe Guo, Meng Hua Niu, Fan Cao, Shuang Yue, Qin Yu Liu
Cloud Computing: An Overview

In order to support the maximum number of users and elastic services with minimum resources, Internet service providers invented cloud computing. Within a few years, cloud computing has become one of the hottest technologies. From the publication of Google's core papers starting in 2003, to the commercialization of Amazon EC2 in 2006, to the launch of AT&T Synaptic Hosting, cloud computing has evolved from internal IT systems to public services, from a cost-saving tool to a revenue generator, and from ISPs to telecoms. This paper introduces the concept, history, pros and cons of cloud computing, as well as its value chain and standardization efforts.

Ling Qian, Zhiguo Luo, Yujian Du, Leitao Guo
Integrating Cloud-Computing-Specific Model into Aircraft Design

Cloud Computing is becoming increasingly relevant, as it will enable the companies spreading this technology to open the door to Web 3.0. The new categories of services it introduces will slowly replace many types of computational resources currently in use. In this perspective, grid computing, the basic element for the large-scale supply of cloud services, will play a fundamental role in defining how those services are provided. This paper integrates a cloud-computing-specific model into aircraft design. The work has achieved good results in sharing licenses for large-scale, expensive software such as CFD (Computational Fluid Dynamics) tools, UG and CATIA.

Tian Zhimin, Lin Qi, Yang Guangwen
Towards a Theory of Universally Composable Cloud Computing

This paper studies universally composable Cloud computing and makes a two-fold contribution:

First, various notions of Clouds are introduced and formalized, and the security of public Cloud computing is formalized within the standard secure-computation framework.

Second, a universally composable theorem for Cloud computing in the presence of a monotone adversary structure is presented and analyzed within Canetti's universal composability framework.

Our contribution may help bridge the security gap between the industrial view of Clouds and that of theoretical researchers.

Huafei Zhu
A Service-Oriented Qos-Assured and Multi-Agent Cloud Computing Architecture

The essence of Cloud Computing is to provide services over the network. As far as users are concerned, resources in the "Cloud" can be extended indefinitely, acquired at any time, used on demand, and paid for per use. Combining SOA and Multi-Agent technology, this paper proposes a new Service-Oriented, QoS-Assured cloud computing architecture comprising a physical device and virtual resource layer, a cloud service provision layer, and a cloud service management and multi-agent layer, to support QoS-assured provision and request of cloud services. Based on the proposed architecture, the realization process of a cloud service is also briefly described.

Bu-Qing Cao, Bing Li, Qi-Ming Xia
Price-Oriented Trading Optimization for Grid Resource

The resources in the Grid are heterogeneous and geographically distributed, and their availability, usage and cost policies vary with the particular user, time, priorities and goals, so quality of service (QoS) in the grid cannot be guaranteed. This article proposes computational economy as an effective metaphor for resource management and application scheduling, in the form of a QoS-based grid banking model. The model is divided into an application layer, a virtual organization (VO) layer, and a physical resources and facilities layer. At each layer, the consumer agent, service agent and resource provider agent respectively optimize the multi-dimensional QoS resources, within the framework of the grid banking model and under the hierarchical constraints of their respective conditions, so as to maximize their objective functions. The optimization algorithm is a price-oriented iteration repeated at all levels.
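A price-oriented iteration of this general kind can be pictured as a simple tâtonnement loop: a provider posts a price, consumer agents respond with demand under their budgets, and the price is adjusted until demand roughly matches capacity. The sketch below is purely illustrative and is not the paper's algorithm; the capacity, budgets and adjustment rule are invented, and the paper's QoS dimensions, agents and banking rules are not modeled.

```python
# Illustrative price-iteration sketch (not the paper's algorithm): a provider
# adjusts the unit price of a resource until aggregate demand from consumer
# agents roughly matches the available capacity.

def demand(budget, price):
    # A consumer agent spends its whole budget at the quoted price.
    return budget / price

capacity = 100.0                      # units of resource offered (hypothetical)
budgets = [120.0, 80.0, 50.0]         # hypothetical consumer budgets
price = 1.0

for step in range(200):
    total_demand = sum(demand(b, price) for b in budgets)
    excess = total_demand - capacity
    if abs(excess) < 1e-3:
        break
    # Raise the price when demand exceeds capacity, lower it otherwise.
    price *= 1.0 + 0.1 * excess / capacity

print(f"equilibrium price ~ {price:.3f}, demand ~ {total_demand:.1f}")
```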

Hao Li, Guo Tang, Wei Guo, Changyan Sun, Shaowen Yao
A Requirements Recommendation Method Based on Service Description

The development of service-oriented architecture (SOA) has brought new opportunities to requirements engineering. How to use existing services to guide requestors in expressing their requirements accurately and completely has become a new research focus. In this paper, a requirements recommendation method based on service description is proposed. It finds web services corresponding to a user's initial requirements, establishes association relationships between the user's requirements and service functions, and in turn recommends the associated services' functions to the requestor, assisting them in expressing requirements accurately and completely. The effectiveness of the method is evaluated and demonstrated through a case study of a travel planning system.
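One common way to realize the matching step such a method needs is to compare the requirement text against service descriptions with TF-IDF cosine similarity and recommend the functions of the best matches. The sketch below is only that generic matching step under assumptions: the service names, descriptions and query are invented, and the paper's actual association mechanism is not reproduced.

```python
# A minimal sketch, not the paper's method: match an initial requirement
# statement against service descriptions with TF-IDF cosine similarity and
# recommend the functions of the best-matching services.

import math
from collections import Counter

services = {  # hypothetical travel-planning services and their descriptions
    "FlightBooker": "search and book flights between cities on given dates",
    "HotelFinder": "find and reserve hotel rooms near a destination",
    "CarRental": "rent a car at the airport for the travel period",
}

def tfidf_vectors(docs):
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    df = Counter(t for toks in tokenized.values() for t in set(toks))
    n = len(docs)
    vecs = {}
    for name, toks in tokenized.items():
        tf = Counter(toks)
        vecs[name] = {t: tf[t] * math.log((1 + n) / (1 + df[t])) for t in tf}
    return vecs

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

requirement = "I need to book flights and reserve a hotel for my trip"
vecs = tfidf_vectors({**services, "__query__": requirement})
query = vecs.pop("__query__")
ranked = sorted(vecs.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
for name, _ in ranked[:2]:
    print("recommend functions of", name)
```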

Da Ning, Rong Peng
Extending YML to Be a Middleware for Scientific Cloud Computing

Grid computing has gained great success in harnessing computing resources, but the gridification of scientific computing has progressed more slowly than anticipated. This paper analyzes in detail the reasons why gridification is hard. Cloud computing, as a new paradigm, shows its advantages through attractive features such as low cost, pay-per-use, ease of use and non-trivial QoS. Based on an analysis of the existing cloud paradigm, a cloud platform architecture based on YML for scientific computing is presented. Emulations confirm that we are on the right track in extending YML into a middleware for cloud computing. Finally, ongoing improvements to YML and open problems are also presented.

Ling Shang, Serge G. Petiton, Nahid Emad, Xiaolin Yang, Zhijian Wang
Power-Aware Management in Cloud Data Centers

Power efficiency is a major concern in operating cloud data centers. It affects operational costs and return on investment, and it has a profound impact on the environment. Current data center operating environments, such as management consoles and cloud control software, tend to optimize for performance and service level agreements and ignore power implications when evaluating workload scheduling choices. We believe that power should be elevated to a first-order consideration in data-center management and that operators should be provided with the insights and controls necessary to achieve that.

In this paper we describe several foundational techniques for group-level power management that yield significant power savings in large data centers with run-time load allocation capability, such as clouds and virtualized data centers. We cover VM migration to save power and server pooling, or platooning, to balance power savings against startup times so as not to impair performance, and we discuss the power characteristics of servers that define both the limits of and the opportunities for power savings.
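Group-level power management of this kind typically boils down to consolidating load onto fewer servers and parking the rest in a low-power pool. The sketch below is a simplified first-fit-decreasing consolidation pass under invented CPU-demand numbers; it is not the authors' policy and ignores migration cost, startup latency and performance headroom.

```python
# Simplified consolidation sketch (not the paper's policy): pack VM CPU demands
# onto as few hosts as possible with first-fit decreasing, and mark the unused
# hosts as candidates for a powered-down "platoon".

HOST_CAPACITY = 1.0          # normalized CPU capacity per host
vm_demands = [0.45, 0.30, 0.25, 0.20, 0.15, 0.10]   # hypothetical VM loads
num_hosts = 4

hosts = [0.0] * num_hosts    # current load per host
placement = {}

for vm_id, demand in sorted(enumerate(vm_demands), key=lambda x: -x[1]):
    for h in range(num_hosts):
        if hosts[h] + demand <= HOST_CAPACITY:
            hosts[h] += demand
            placement[vm_id] = h
            break

active = [h for h, load in enumerate(hosts) if load > 0]
idle = [h for h, load in enumerate(hosts) if load == 0]
print("placement:", placement)
print("active hosts:", active, "-> keep powered on")
print("idle hosts:", idle, "-> move to low-power pool (power off or sleep)")
```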

Milan Milenkovic, Enrique Castro-Leon, James R. Blakley
Parallel K-Means Clustering Based on MapReduce

Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation and pattern classification. The growing volume of information that accompanies the progress of technology makes clustering of very large-scale data a challenging task. To deal with this problem, many researchers have tried to design efficient parallel clustering algorithms. In this paper, we propose a parallel k-means clustering algorithm based on MapReduce, which is a simple yet powerful parallel programming technique. The experimental results demonstrate that the proposed algorithm scales well and efficiently processes large datasets on commodity hardware.
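The usual MapReduce formulation of k-means, which is what this abstract describes in outline, assigns each point to its nearest centroid in the map phase and recomputes centroids in the reduce phase, iterating until convergence. The sketch below imitates one such iteration with plain Python functions rather than a real Hadoop job; the data, k and iteration count are arbitrary, and this is not the authors' code.

```python
# One k-means iteration in MapReduce style (a sketch, not the authors' code):
# map: (point) -> (nearest centroid id, point); reduce: average the points.

import random

def nearest(point, centroids):
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))

def map_phase(points, centroids):
    for point in points:
        yield nearest(point, centroids), point

def reduce_phase(pairs, k, dim):
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for cid, point in pairs:
        counts[cid] += 1
        for d in range(dim):
            sums[cid][d] += point[d]
    return [[s / counts[i] for s in sums[i]] if counts[i] else None
            for i in range(k)]

random.seed(0)
points = [(random.random(), random.random()) for _ in range(1000)]
centroids = [list(p) for p in points[:3]]            # naive initialization
for _ in range(10):                                   # fixed number of iterations
    new = reduce_phase(map_phase(points, centroids), k=3, dim=2)
    centroids = [c if c is not None else centroids[i] for i, c in enumerate(new)]
print(centroids)
```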

Weizhong Zhao, Huifang Ma, Qing He
Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce

Handling huge amounts of data scalably has been a matter of concern for a long time, and the same is true for semantic web data; current semantic web frameworks lack this ability. In this paper, we describe a framework built on Hadoop to store and retrieve large numbers of RDF triples. We describe our schema for storing RDF data in the Hadoop Distributed File System and present our algorithms for answering SPARQL queries, which use Hadoop's MapReduce framework to actually answer the queries. Our results reveal that we can store huge amounts of semantic web data in Hadoop clusters built mostly from cheap commodity-class hardware and still answer queries quickly. We conclude that ours is a scalable framework, able to handle large amounts of RDF data efficiently.
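The general approach described here (triples laid out in HDFS, joins over shared variables computed with MapReduce) can be imitated in miniature: bucket triples by predicate and answer a two-pattern SPARQL query by grouping on the join variable. The bucket layout, query and data below are illustrative assumptions, not the paper's actual schema or algorithm.

```python
# Miniature imitation of the approach (not the paper's schema): triples are
# bucketed by predicate, and a two-pattern query
#   SELECT ?p WHERE { ?p rdf:type :Student . ?p :takesCourse ?c }
# is answered by a map over each bucket and a reduce that joins on ?p.

from collections import defaultdict

# In the real system these buckets would be files in HDFS, one per predicate.
buckets = {
    "rdf:type": [("alice", "rdf:type", ":Student"), ("bob", "rdf:type", ":Professor")],
    ":takesCourse": [("alice", ":takesCourse", ":Databases"),
                     ("bob", ":takesCourse", ":Algorithms")],
}

def map_pattern(triples, predicate, obj=None, tag=""):
    """Emit (join key, tag) for triples matching one query pattern."""
    for s, p, o in triples:
        if p == predicate and (obj is None or o == obj):
            yield s, tag

def reduce_join(mapped, required_tags):
    groups = defaultdict(set)
    for key, tag in mapped:
        groups[key].add(tag)
    return [key for key, tags in groups.items() if required_tags <= tags]

mapped = list(map_pattern(buckets["rdf:type"], "rdf:type", ":Student", tag="t1"))
mapped += list(map_pattern(buckets[":takesCourse"], ":takesCourse", tag="t2"))
print(reduce_join(mapped, {"t1", "t2"}))   # -> ['alice']
```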

Mohammad Farhan Husain, Pankil Doshi, Latifur Khan, Bhavani Thuraisingham
Distributed Scheduling Extension on Hadoop

Distributed computing splits a large-scale job into multiple tasks and processes them on clusters. Cluster resource allocation is the key factor limiting the efficiency of a distributed computing platform. Hadoop is currently the most popular open-source distributed platform. However, the existing scheduling strategies in Hadoop are rather simple and cannot meet needs such as sharing the cluster among multiple users, guaranteeing capacity for each job, and providing good performance for interactive jobs. This paper examines the existing scheduling strategies, analyses their shortcomings, and adds three new features to Hadoop: temporarily raising a job's weight, letting higher-priority jobs preempt cluster resources, and sharing computing resources among multiple users. Experiments show that these features help provide better performance for interactive jobs as well as a fairer share of computing time among users.
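The three features mentioned (temporary weight boosts, priority-based preemption, multi-user sharing) resemble the ideas behind fair-share scheduling. The sketch below shows only a generic selection step under assumed job attributes: pick the job with the largest deficit between its weighted fair share and its running tasks, with a temporary weight boost for young jobs and priority as a tie-breaker. It is not the authors' patch to Hadoop.

```python
# Selection sketch (assumed attributes, not the authors' patch): pick the next
# job by deficit between its fair share and its running tasks, with a temporary
# weight boost for recently submitted jobs and a priority tie-breaker.

import time
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    user: str
    priority: int = 0                 # higher wins ties when slots are scarce
    running_tasks: int = 0
    submitted: float = field(default_factory=time.time)

def weight(job, boost_window=60.0, boost_factor=3.0):
    """Temporarily raise the weight of recently submitted jobs."""
    age = time.time() - job.submitted
    return boost_factor if age < boost_window else 1.0

def pick_next(jobs, total_slots):
    total_weight = sum(weight(j) for j in jobs) or 1.0
    def deficit(job):
        fair_share = total_slots * weight(job) / total_weight
        return fair_share - job.running_tasks
    # Most starved job first; priority breaks ties.
    return max(jobs, key=lambda j: (deficit(j), j.priority))

jobs = [Job("nightly-etl", "ops", running_tasks=40, submitted=time.time() - 3600),
        Job("adhoc-query", "analyst", priority=2)]
print(pick_next(jobs, total_slots=50).name)   # -> adhoc-query
```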

Zeng Dadan, Wang Xieqin, Jiang Ningkang
A Data Distribution Aware Task Scheduling Strategy for MapReduce System

MapReduce is a parallel programming system for processing massive data. It automatically parallelizes MapReduce jobs into multiple tasks and schedules them on a cluster built from commodity PCs. This paper describes a data-distribution-aware MapReduce task scheduling strategy. When a worker node requests tasks, the scheduler computes the node's priority according to factors such as the number of requests it has made and the number of tasks it can execute locally; it also calculates each task's priority according to factors such as the number of replicas available to the task and how long the task has waited. Based on these node and task priorities, the strategy fully considers data distribution in the system and thus schedules Map tasks, with high probability, to nodes that hold the data, reducing network overhead and improving system efficiency.
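A simplified version of the locality idea: when a node asks for work, prefer tasks whose input split has a replica on that node, and among those prefer the task that has waited longest. The scoring formula, task list and locality bonus below are placeholders, not the priority functions defined in the paper.

```python
# Simplified locality-aware assignment (placeholder scoring, not the paper's):
# when a node requests a task, prefer tasks with a local replica, breaking ties
# by how long the task has been waiting.

import time

tasks = [  # hypothetical pending map tasks and the nodes holding their input replicas
    {"id": "map-01", "replicas": {"nodeA", "nodeB"}, "submitted": time.time() - 30},
    {"id": "map-02", "replicas": {"nodeC"},          "submitted": time.time() - 90},
    {"id": "map-03", "replicas": {"nodeA"},          "submitted": time.time() - 10},
]

def task_score(task, node, locality_bonus=1000.0):
    waited = time.time() - task["submitted"]
    local = locality_bonus if node in task["replicas"] else 0.0
    return local + waited          # local tasks first, then the longest-waiting

def assign(node, pending):
    if not pending:
        return None
    best = max(pending, key=lambda t: task_score(t, node))
    pending.remove(best)
    return best["id"]

print(assign("nodeA", tasks))   # -> map-01 (local, and waited longer than map-03)
print(assign("nodeC", tasks))   # -> map-02 (its only replica lives on nodeC)
```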

Leitao Guo, Hongwei Sun, Zhiguo Luo
Cloud Computing Based Internet Data Center

Cloud computing offers a great opportunity for the Internet Data Center (IDC) to renovate its infrastructure, systems and services, and the cloud computing based IDC is a promising direction for the next generation of IDCs. This paper explores the applications of cloud computing in the IDC with the goal of building a public information factory, proposes a framework for a cloud computing based IDC, and probes into how to build cloud services on the cloud platform in the IDC.

Jianping Zheng, Yue Sun, Wenhui Zhou
Backmatter
Metadata
Title: Cloud Computing
Edited by: Martin Gilje Jaatun, Gansen Zhao, Chunming Rong
Copyright year: 2009
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-10665-1
Print ISBN: 978-3-642-10664-4
DOI: https://doi.org/10.1007/978-3-642-10665-1