2017 | Book

Challenges and Opportunity with Big Data

19th Monterey Workshop 2016, Beijing, China, October 8 – 11, 2016, Revised Selected Papers

About this book

This book presents the thoroughly refereed and revised post-workshop proceedings of the 19th Monterey Workshop, held in Beijing, China, in October 2016. The workshop explored the challenges associated with the Development, Operation and Management of Large-Scale complex IT Systems.
The 18 revised full papers presented were significantly extended and improved by the insights gained from the productive and lively discussions at the workshop, and the feedback from the post-workshop peer reviews.
2016 marks the 23rd anniversary of the Monterey Workshop series, which started in 1993. For nearly a quarter of a century, the Monterey Workshops have established themselves as an important international forum to foster, among academia, industry, and government agencies, discussion and exchange of ideas, research results and experience in developing software-intensive systems, and have significantly advanced the field. The community of workshop participants has grown to become an influential source of ideas and innovations, and its impact on the knowledge economy has been felt worldwide.

Table of Contents

Frontmatter

Theoretical Underpinnings for Big Data

Frontmatter
A Hybrid M&S Methodology for Knowledge Discovery
Abstract
M&S (Modeling and Simulation) has been widely used as a decision support tool: the structure and dynamics of real-world systems are modeled on a computer, and the models are simulated to answer various what-if questions. As simulation models become more complex in their dynamics and structures, more engineers are finding it difficult to simulate the models under various real-world scenarios and to discover knowledge from the massive amount of simulation results within a practical time bound. In this paper, we propose a hybrid methodology in which the M&S process is combined with a DM (Data Mining) process. Our methodology includes a step that feeds simulation outputs into a DM process, which generates a prediction model by analyzing pertinent patterns in the simulation outputs. The prediction model can be used in place of simulations when the M&S-based decision-making process needs to be expedited. We have applied the proposed methodology to analyze SAM (surface-to-air missile) and confirmed its applicability.
Jae Kwon Kim, Jong Sik Lee, Kang Sun Lee
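The idea of replacing simulation runs with a prediction model mined from simulation outputs can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy time-stepped simulation, the function names, and the choice of ordinary least squares as the "data mining" step are hypothetical and are not taken from the paper.

```python
def simulate_intercept_time(distance_m, closing_speed_mps=300.0, dt=0.01):
    """Toy time-stepped simulation (hypothetical stand-in for a costly model run)."""
    t, remaining = 0.0, distance_m
    while remaining > 0.0:
        remaining -= closing_speed_mps * dt  # advance the scenario by one time step
        t += dt
    return t


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for the data mining step)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# 1) Run the simulation over a set of scenarios and collect the outputs.
scenarios = [1000.0 * k for k in range(1, 11)]
results = [simulate_intercept_time(d) for d in scenarios]

# 2) Mine the outputs into a prediction model.
a, b = fit_line(scenarios, results)


def predict_intercept_time(distance_m):
    """Surrogate that answers a what-if question without running the simulation."""
    return a * distance_m + b
```

Calling predict_intercept_time(5500.0) answers a new scenario from the fitted model alone. A real study would use a richer simulation and a validated prediction model; the sketch only shows the flow from simulation outputs to a surrogate.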
A Model-Driven Visualization System Based on DVDL
Abstract
Though model-driven engineering (MDE) methodology has made significant improvements in terms of efficiency and effectiveness in many areas of software development, the same cannot be said of the development of data visualization systems. With this challenge in mind, this paper introduces DVDL, a modular and hierarchical visualization description language that takes advantage of the model-based design of MDE to describe visualization development at an abstract level. This paper also presents DVIZ, a visualization system based on DVDL. With the growing popularity of and demand for data visualization technology, a number of visualization tools have emerged in recent years, though few would be considered as adaptable and scalable as DVIZ. Its key features include the ability for users to select data sources, configure properties of visual elements, and publish and share results. The system also supports real-time result generation and interaction among multiple visuals. Lastly, since DVIZ is web-based, it supports distribution of results across various social media.
Yi Du, Lei Ren, Yuanchun Zhou, Jianhui Li
A Practical Energy Modeling Method for Industrial Robots in Manufacturing
Abstract
Industrial robots (IRs) are widely used in modern manufacturing systems, and their energy consumption is receiving increasing attention in order to meet environmental protection requirements. It is therefore necessary to investigate approaches to optimizing the energy consumption of IRs, and an energy consumption model is the basis for such approaches. Energy consumption modeling for IRs is usually based on dynamic parameter identification, which requires physical parameters such as angle, velocity, acceleration, and torque. However, since the parts of IRs are not easy to disassemble and sensor modules cannot easily be installed inside IRs, it is difficult to obtain all of these physical parameters by sensing, in particular the torque data. In this context, a practical energy modeling method for IRs based on measuring total power is proposed. This method avoids directly measuring the relevant parameters inside IRs, and the parameter identification process is carried out step by step through several excitation experiments. The experimental results show that the proposed method can be used to predict the energy consumption of robot movements in manufacturing processes, and that it can also efficiently support the analysis of the energy consumption characteristics of IRs.
Wenjun Xu, Huan Liu, Jiayi Liu, Zude Zhou, Duc Truong Pham
An Optimization Method for User Interface Components Based on Big Data
Abstract
The efficiency and usability of a user interface largely depend on the design and optimization of its UI components. This paper proposes an optimization method for UI components based on big data collected from users. First, a user interface components optimization model (UCOM) is proposed, which is described from four aspects: user model, task model, interaction model, and component presentation model. Then, based on UCOM, a big data-driven optimization method for user interface components (BOM) is presented. This method defines a complete optimization solution, uses crowdsourcing to publish the solution, gathers and analyzes users’ big data, uses AHP to develop a weight formula, and finally provides integrated optimization suggestions.
Fei Lyu, Lei Ren, Yi Du
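As a rough illustration of how AHP (the analytic hierarchy process) turns pairwise judgments into weights, the sketch below uses the row geometric-mean approximation. The criteria names and the comparison matrix are hypothetical, and the paper's own weight formula may differ.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical criteria for judging a UI component.
criteria = ["efficiency", "usability", "aesthetics"]

# pairwise[i][j] states how much more important criterion i is than criterion j
# (reciprocal matrix; the values are made up for illustration).
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]

weights = ahp_weights(pairwise)


def weighted_score(scores):
    """Combine per-criterion scores (same order as `criteria`) into one value."""
    return sum(w * s for w, s in zip(weights, scores))
```

The weights sum to one, so weighted_score can rank alternative component designs on a common scale once each design has been scored per criterion.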
Clustering-Based Data Aggregation and Routing for Real-Time WirelessHART Communication
Abstract
Clustering-based routing strategies have been widely studied in the context of wireless sensor networks for energy-efficient data communication. However, applying clustering-based routing to hard real-time wireless networks is still an open challenge. In this work, we study the integration of clustering-based routing with WirelessHART for energy-efficient real-time wireless data communication. We redesign the clustering strategy to incorporate node transmission frequency so that the end-to-end delay of each data delivery is captured early in the clustering phase. Moreover, data aggregation within a cluster is handled in the superframe design phase to ensure that the stringent timing requirements are met. Experimental results show that the proposed data communication framework prolongs the lifetime of the WirelessHART network by effectively reducing the number of data packets transmitted, while the end-to-end delay of each data delivery is always guaranteed.
Feng Li, Chunhui Wang, Lei Ju, Zhiping Jia

Big Data Management

Frontmatter
Constrained Semantic Grammar Enabled Question Answering System
Abstract
Restricted-domain question answering (QA) systems are a hotspot in the natural language processing area. Correct understanding of the users’ intentions is the key to such QA systems, whereas open-domain QA can always make use of data redundancy. In this paper, a robust but highly constrained semantic grammar and a corresponding matching algorithm are proposed. On the basis of the domain ontology constructed by domain experts, grammar experts create the core semantic grammar, which describes patterns of expression about properties of concepts or relationships between concepts in the domain of interest. To verify its validity, the proposed method is applied to the mobile service consulting area. Experimental results show that the method is both practical and maintainable.
Dongsheng Wang, Shi Wang, Weiming Wang, Jianhui Fu, Yun Dai
Information Composition Analysis and Adaptation Access of CNC Lathes in Cloud Manufacturing Environment
Abstract
To address the complicated information composition of CNC lathes in a cloud manufacturing environment, together with their high autonomy, dispersion, dynamic variability, and adaptive functions, a framework for Adaptation Access of CNC lathes is proposed that supports modular, dynamic virtual access for various types of information. Key technologies are studied, including a service modeling approach for CNC lathes based on the Web Service Modeling Ontology (WSMO) and Adaptation Access of CNC lathes based on the Open Service Gateway Initiative (OSGI). Finally, an experimental case is used to verify the research results.
Lei Qiu, Chao Yin, Xiao-bin Li
Interactive Animation Editing Based on Sketch Interaction
Abstract
User-centric interactive systems show potential in giving users easy access to natural interaction that matches their intentions with a low cognitive load. A novel approach centered on the user’s experience is presented, based on knowledge reuse, a user interaction model, and sketch-based interaction. A proposed sketch platform for animation editing provides the key to instantiating typical applications within the reuse methods. The sketch-based interface explores a point in the tradeoff between expressiveness and naturalness to provide users with a natural interactive environment. It helps to fill the gaps of the traditional WIMP (Window, Icon, Menu, Pointing Device) pattern in animation drawing by letting users sketch, explore, and modify their ideas interactively with immediate and continuous visual feedback. It also validates existing efforts and provides impetus for future work in natural interaction research.
Yan Huang, Ti Zhou, Yanfeng Li, Yan Zhang, Cuixia Ma
Manufacturing Service Reconfiguration Optimization Using Hybrid Bees Algorithm in Cloud Manufacturing
Abstract
During the execution of a cloud manufacturing (CMfg) system, a manufacturing service may become faulty and cause the whole production process to violate its predefined constraints. It is therefore necessary to adjust the service aggregation process promptly in response to runtime failures, and service reconfiguration is important for enhancing the reliability of service-oriented manufacturing applications. Runtime service process reconfiguration based on QoS and energy consumption has been studied previously. In this paper, by contrast, an effective reconfiguration strategy is proposed that identifies reconfiguration regions rather than reconfiguring the whole service process. Moreover, a hybrid bees algorithm (HBA) combining the discrete bees algorithm (DBA) with discrete particle swarm optimization (DPSO) is developed to search for replaceable services during service reconfiguration. The experimental results show that most manufacturing service aggregation processes can be repaired by replacing only a small number of services, and that HBA finds the set of replaceable manufacturing services more efficiently than the existing algorithms.
Wenjun Xu, Xin Zhong, Yuanyuan Zhao, Zude Zhou, Lin Zhang, Duc Truong Pham
MyTrace: A Mobile Phone-Based Tourist Spatial-Temporal Behavior Record and Analysis System
Abstract
Motivated by the need for personalized travel position logging and interest recommendation, an open, research-oriented system to collect and analyze tourist spatial-temporal behavior has been developed. In this paper, we introduce the architecture and internal structure of the system, which serves not only as a communication platform for tourists but also as a data collection medium for related researchers and administrators. The system includes three key components: a mobile phone application, a data receiver, and a data management and analysis platform. An application user can record travel traces with interesting activity points on a map; these points consist of pictures, videos, the user’s feelings, comments, companions, etc., and can be shared in the user’s social network. Researchers and administrators can use the uploaded position logs and activity points to analyze the characteristics of spatial-temporal behavior and to infer insights that are useful in tourist behavior research and tourist attraction planning. The main functions of each component and the key techniques inside the system are described briefly. The system has been in open testing since April 2016 and was promoted in two tourist destinations in July 2016. As a result, a dataset including 188,944 GPS locations, 285 activity points and 251 questionnaire responses from 659 registered users has been constructed. The initial experimental results show that the system is effective and worth promoting. We hope that more users, not only tourists and researchers, will join this research system.
Lei Dou, Haitao Qu, Xiaoqiang Bi, Yu Zhang, Chongsheng Yu, Jian Qin, Xiaoting Huang, Xin Li

Big Data Simulation

Frontmatter
Multi-source Information Intelligent Collection and Monitoring of CNC Machine Tools Based on Multi-agent
Abstract
Currently, CNC machine tools in manufacturing systems are no longer effective in multi-source data acquisition, real-time interaction, and remote monitoring because of their poor compatibility, cross-platform performance, and remote service capability. In this paper, an information model and an index system for the multilayer operation situation are built, taking into account the multi-source characteristics of CNC machine tool information in the intelligent manufacturing environment, and a new multi-agent-based approach to intelligent multi-source data acquisition and monitoring is proposed. In particular, the paper focuses on the key technologies of intelligent multi-source data acquisition and visualized dynamic monitoring based on OPC and multi-agent technology. Finally, the approach is illustrated on a real CNC machine tool, the VMC1060L, and the results show that it is effective and performs better in practical application.
Yun Yang, Chao Yin, Xiao-bin Li, Liang Li
Ontology Management and Ontology Reuse in Web Environment
Abstract
As a knowledge representation method, ontology describes knowledge and information semantically in various fields, and it has been widely used in the Web environment. Ontology management and ontology reuse can solve the problems of knowledge confusion and inefficient knowledge base construction that arise when applying ontology. This paper builds an ontology management framework and presents the system data storage model, an ontology maintenance method, and a role-based collaborative ontology definition method. On this basis, an ontology reuse method based on Semantic Web Rule Language (SWRL) rules is proposed.
Yapeng Cui, Lihong Qiao, Yifan Qie
Research on the Shortest Path of Two Places in Urban Based on Improved Ant Colony Algorithm
Abstract
Based on a GIS electronic map and a traffic control information database, a shortest path algorithm based on GIS technology is proposed: the geographic information of monitoring points A and B is extracted, and the algorithm is used to solve the shortest path between A and B. An improved ant colony algorithm is used to calculate the shortest distance from the start node to the target node. To address the slow convergence of the ant colony algorithm and its tendency toward premature convergence, effective improvement measures are put forward, and the algorithm is simulated on a simplified road network. The satisfactory simulation results verify the effectiveness of the algorithm.
Yanjuan Hu, Luquan Ren, Hongwei Zhao, Yao Wang
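For readers unfamiliar with ant colony optimization, the following minimal sketch shows the basic (unimproved) Ant System applied to a shortest path query between two points A and B. It does not include the paper's improvements, and the road network, parameter values, and function name are hypothetical.

```python
import random


def ant_colony_shortest_path(graph, start, goal, n_ants=20, n_iters=50,
                             alpha=1.0, beta=2.0, evaporation=0.3, seed=0):
    """Basic Ant System search for a short start-to-goal path in a weighted graph."""
    rng = random.Random(seed)
    pheromone = {(u, v): 1.0 for u in graph for v in graph[u]}
    best_path, best_len = None, float("inf")
    for _ in range(n_iters):
        completed = []
        for _ in range(n_ants):
            path, visited, node, length = [start], {start}, start, 0.0
            while node != goal:
                choices = [v for v in graph[node] if v not in visited]
                if not choices:
                    break  # dead end: this ant is discarded
                # Prefer edges with more pheromone and shorter length.
                weights = [pheromone[(node, v)] ** alpha
                           * (1.0 / graph[node][v]) ** beta for v in choices]
                nxt = rng.choices(choices, weights=weights)[0]
                length += graph[node][nxt]
                path.append(nxt)
                visited.add(nxt)
                node = nxt
            if node == goal:
                completed.append((path, length))
                if length < best_len:
                    best_path, best_len = path, length
        # Evaporate old pheromone, then reinforce edges on completed paths.
        for edge in pheromone:
            pheromone[edge] *= 1.0 - evaporation
        for path, length in completed:
            for u, v in zip(path, path[1:]):
                pheromone[(u, v)] += 1.0 / length
    return best_path, best_len


# Hypothetical simplified road network: node -> {neighbor: distance}.
road_network = {
    "A": {"C": 2.0, "D": 5.0},
    "C": {"A": 2.0, "D": 1.0, "B": 6.0},
    "D": {"A": 5.0, "C": 1.0, "B": 2.0},
    "B": {"C": 6.0, "D": 2.0},
}

best_path, best_len = ant_colony_shortest_path(road_network, "A", "B")
```

On a network this small the search should settle on the route through C and D. On realistic road networks the basic algorithm is where the slow convergence and premature convergence mentioned in the abstract become visible.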
RUL Prediction of Bearings Based on Mixture of Gaussians Bayesian Belief Network and Support Vector Data Description
Abstract
This paper presents a method to predict the Remaining Useful Life (RUL) of bearings based on the Mixture of Gaussians Bayesian Belief Network (MoG-BBN) and Support Vector Data Description (SVDD). In this method, the feature vectors used to train the corresponding MoG-BBN and SVDD models are extracted from raw sensor data by wavelet packet decomposition (WPD). A genetic algorithm is employed to determine the initial values of the variables in the MoG-BBN training algorithm so that the stability of MoG-BBN can be enhanced. The two models are combined to achieve good generalization ability. We demonstrate the effectiveness of the proposed method using actual bearing datasets from the NASA prognostic data repository.
Qianhui Wu, Yu Feng, Biqing Huang

Industrial Track of Big Data

Frontmatter
Social Recommendation Terms: Probabilistic Explanation Optimization
Abstract
The Probabilistic Matrix Factorization (PMF) model has been widely studied for recommender systems; it outperforms previous models and has a solid probabilistic explanation. To further improve its accuracy by using social information, researchers have attempted to combine the PMF model with social network graphs by adding social terms. However, existing works on social terms do not provide theoretical explanations that make the models well understood, and this lack of explanation limits further improvement of prediction accuracy. Hence, in this paper we provide our explanation and propose a unified covariance framework to solve this problem. Our explanation, covering regularization terms, factorization terms and an ensemble of them, reveals how most social terms work from a probabilistic view. Our framework shows that those terms can be optimized in a direct way that is compatible with PMF. We find that accuracy improvements for existing works on regularization terms rely more on personalized properties, and that social information for factorization terms is helpful but not always necessary.
Jie Liu, Lin Zhang, Victor S. Sheng, Yuanjun Laili
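As background for the social terms discussed above, the sketch below trains plain PMF without any social term: with Gaussian priors on the latent factors, the maximum a posteriori estimate minimizes squared rating error plus L2 regularization. The rating triples, hyperparameters, and function name are hypothetical, and the paper's covariance framework is not reproduced here.

```python
import random


def train_pmf(ratings, n_users, n_items, rank=2, lam=0.05, lr=0.02,
              epochs=200, seed=0):
    """Fit user factors U and item factors V by stochastic gradient descent."""
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]

    def dot(i, j):
        return sum(a * b for a, b in zip(U[i], V[j]))

    def loss():
        # Squared error on observed ratings plus L2 penalty on all factors.
        err = sum((r - dot(i, j)) ** 2 for i, j, r in ratings)
        reg = sum(x * x for row in U for x in row)
        reg += sum(x * x for row in V for x in row)
        return 0.5 * err + 0.5 * lam * reg

    history = [loss()]
    for _ in range(epochs):
        for i, j, r in ratings:
            e = r - dot(i, j)
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                U[i][k] += lr * (e * v - lam * u)
                V[j][k] += lr * (e * u - lam * v)
        history.append(loss())
    return U, V, history


# Hypothetical (user, item, rating) observations.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

U, V, history = train_pmf(ratings, n_users=3, n_items=3)
```

A social term would add a further penalty or factorization component tied to the social graph on top of this objective; how such terms should be interpreted is the question the paper addresses.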
Towards a Holistic Method for Business Process Analytics
Abstract
In this paper, we propose a holistic approach aimed at combining business process modelling and data-driven business process improvement. The first step is to develop a “precise” model of the processes of the organization using the UML. Precise means that all business entities involved in the process are determined, as well as all the tasks composing the process executions, and that all relevant data about them are modelled. Then, a model of the data space of the process is derived, also taking into account the quantitative aspects of each process (e.g. how many instances of the process will run each day, and how many instances of a given business entity will exist or have ever been created). In this way, it is possible to conceive and design various analyses and improvements of the process based on its data, since all the aspects related to each business process have been explicitly modelled in a sufficiently formal way. We introduce our approach using a small case study: the Buying process, whose participants are manufacturers, dealers, shippers, and payment systems.
Gianna Reggio, Maurizio Leotta, Filippo Ricca, Egidio Astesiano
Traffic Flow Prediction with Improved SOPIO-SVR Algorithm
Abstract
In urban public transport, traffic flow prediction is a classical, complicated non-linear optimization problem that is very important for the public transport system. With the rapid development of big data, bus smart card data, generated by millions of passengers traveling by bus over several days, plays an increasingly important role in daily life. The issue we address is whether data mining algorithms and intelligent optimization algorithms can be applied to forecast traffic flow from bus big data. In this paper, a novel algorithm called mixed support vector regression with sub-space orthogonal pigeon-inspired optimization (SOPIO-MSVR) is used to predict the traffic flow and optimize the algorithm's progress. Results show that the SOPIO-MSVR algorithm outperforms the other algorithms compared and is a competitive algorithm. The research can make a significant contribution to the improvement of transportation.
Xuejun Cheng, Lei Ren, Jin Cui, Zhiqiang Zhang
Workshop Multi-source Information IntelliSense Method Based on IPv6 Intelligent Terminal
Abstract
To address the current problems that workshop information is multi-source, heterogeneous, isolated, and inefficient to interact with, this paper proposes a Workshop Multi-source Information IntelliSense method based on an IPv6 Intelligent Terminal. With the IPv6 Intelligent Terminal at the center and IPv6 as the unified communication protocol of the workshop, the method integrates Wireless Sensor Networks (WSN) downward to realize multi-source information IntelliSense, and interacts upward with the PC in a real-time, efficient manner. The key technologies related to the method are studied, including the implementation technology of the plant-level IPv6 Intelligent Terminal and XML-based intelligent analysis and adaptation of workshop multi-source information. Finally, the effectiveness and practicality of the method are verified in a manufacturing plant.
Chao Yin, Zhengbing Pan, Xiaobin Li, Liang Li
Backmatter
Metadata
Title: Challenges and Opportunity with Big Data
Editors: Lin Zhang, Lei Ren, Fabrice Kordon
Copyright Year: 2017
Electronic ISBN: 978-3-319-61994-1
Print ISBN: 978-3-319-61993-4
DOI: https://doi.org/10.1007/978-3-319-61994-1