
About this book

This book constitutes the proceedings of the 13th International Conference on Green, Pervasive, and Cloud Computing, GPC 2018, held in Hangzhou, China, in May 2018.
The 35 full papers included in this volume were carefully reviewed and selected from 101 initial submissions. They are organized in the following topical sections: network, security, and privacy-preserving; pervasive sensing and analysis; cloud computing, mobile computing, and crowd sensing; social and urban computing; parallel and distributed systems, optimization; pervasive applications; and data mining and knowledge mining.



Network, Security and Privacy-Preserving


A Complex Attacks Recognition Method in Wireless Intrusion Detection System

In recent years, the challenges facing wireless network security have grown more severe with the rapid development of the Internet. However, owing to defects in wireless communication protocols and their differences from wired networks, existing intrusion prevention systems seldom address them. This paper proposes a method for identifying complex multistep attacks in wireless intrusion detection systems, which includes submodules for alarm simplification, a VTG generator, a LAG generator, an attack signature database, an attack path resolver, and complex attack evaluation. By introducing a logic attack graph and a virtual topological graph, the attack path is extracted. The experimental results show that this identification method is applicable to real wireless intrusion detection scenarios and is useful for predicting attackers’ ultimate attack intentions.

Guanlin Chen, Ying Wu, Kunlong Zhou, Yong Zhang

Malware Detection Using Logic Signature of Basic Block Sequence

Malware detection is an important method for maintaining security and privacy in cyberspace. Signature-based detection, currently the most mainstream method, is confronted with many obfuscation methods that can hide the true signature of malware. In our research, we propose a logic signature-based malware detection method to overcome the susceptibility of data signature-based methods to such disturbance. First, we derive the logic of each basic block using symbolic execution and Static Single Assignment, and then use a set of expression trees to represent the basic block logic; the tree set is filtered to pick out the significant items. Based on the basic block logic tree sets, we use an n-gram method to select features for discriminating between malicious and benign software. Every feature of a program is a sequence of basic block logic, and feature matching is based on edit distance calculation. We design and implement a detector and evaluate its effectiveness by comparing it with a data signature-based detector. The experimental results indicate that the proposed malware detector using the logic signature of basic block sequences performs better than data signature-based detectors.

Dawei Shi, Qiang Xu
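
The feature matching described in the abstract above (n-gram features over basic-block sequences, compared by edit distance) can be illustrated with a minimal sketch. This is not the authors' detector: the block identifiers, the `matches` helper, and the `max_distance` threshold are hypothetical, and each basic block's logic is assumed to have already been reduced to a hashable token.

```python
from typing import Hashable, List, Sequence, Tuple


def edit_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x with y
        prev = curr
    return prev[-1]


def ngrams(blocks: Sequence[Hashable], n: int) -> List[Tuple[Hashable, ...]]:
    """All contiguous length-n windows over a basic-block sequence."""
    return [tuple(blocks[i:i + n]) for i in range(len(blocks) - n + 1)]


def matches(feature: Sequence[Hashable],
            program_blocks: Sequence[Hashable],
            max_distance: int) -> bool:
    """Hypothetical matcher: does any n-gram of the program lie within
    max_distance edits of the feature? (Threshold is an assumption.)"""
    n = len(feature)
    return any(edit_distance(feature, gram) <= max_distance
               for gram in ngrams(program_blocks, n))
```

Matching by edit distance rather than exact equality lets a feature still fire when an obfuscator inserts or alters a single block in the sequence.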

Verifiable and Privacy-Preserving Association Rule Mining in Hybrid Cloud Environment

Nowadays it is becoming a trend for data owners to outsource data storage together with their mining tasks to cloud service providers, which however brings about security concerns on the loss of data integrity and confidentiality. Existing solutions seldom protect data privacy whilst ensuring result integrity. To address these issues, this paper proposes a series of privacy-preserving building blocks. Then an efficient verifiable association rule mining protocol is designed under hybrid cloud environment, in which the public unreliable cloud and semi-honest cloud collaborate to mine frequent patterns over the encrypted database. Our scheme not only protects the privacy of datasets from frequency analysis attacks, but also verifies the correctness of mining results. Theoretical analysis demonstrates that the scheme is semantically secure under the threat model. Experimental evaluations show that our approach can effectively detect faulty servers while achieving good privacy protection.

Hong Rong, Huimei Wang, Jian Liu, Fengyi Tang, Ming Xian

Resource and Attribute Based Access Control Model for System with Huge Amounts of Resources

In information systems with a large number of different resources whose attributes change frequently, the security, reliability and dynamics of access permissions must be guaranteed. These changes raise security concerns related to authorization and access control, but existing access control models struggle to meet practical requirements. In this paper, a resource and attribute based access control model named RA-BAC is proposed. The model is based on attribute-based access control (ABAC), links access control policy with resources, and redefines the access control rules. We compare RA-BAC and ABAC both theoretically and through simulation experiments to show the advantages of the RA-BAC model, and we give a detailed analysis with examples to show its practicability. RA-BAC solves the problems of policy conflict and policy library expansion that arise in the ABAC model when a system has too many resources and resource attributes change frequently. Using the RA-BAC model in a system can make permission queries efficient and reduce the system administrator's workload in managing the policy library.

Gang Liu, Lu Fang, Quan Wang, Xiaoqian Qi, Juan Cui, Jiayu Liu

Urban Data Acquisition Routing Approach for Vehicular Sensor Networks

Vehicular sensor networks have emerged as a new wireless sensor network paradigm that is envisioned to revolutionize the human driving experience and traffic control systems. Like conventional sensor networks, they can sense events and process sensed data. Unlike them, vehicular sensors are mounted on vehicles such as taxis and buses. Thus, data acquisition is a hot issue that needs more attention, as the routing protocols developed for conventional wireless sensor networks become infeasible. In this paper, we propose a robust urban data acquisition routing approach, named the Multi-hop Urban data Requesting and Dissemination (MURD) scheme. It consists of a base station, several static roadside replication nodes and many moving vehicles, which work together to realize real-time data communication. The simulation results show that the proposed MURD is a flexible data acquisition routing approach which outperforms conventional approaches in terms of packet delivery ratio and data communication delay.

Leilei Meng, Ziyu Dong, Ziyu Wang, Zhen Cheng, Xin Su

Pervasive Sensing and Analysis


Searching the Internet of Things Using Coding Enabled Index Technology

With the Internet of Things (IoT) becoming a major component of our daily life, IoT search engines, which can crawl heterogeneous data sources and search in highly dynamic contexts, attract increasing attention from users, industry, and the research community. While considerable effort has been devoted to designing IoT search engines for finding a particular mobile object device, or a list of object devices that fit the query terms description, relatively little attention has been paid to enabling so-called spatial-temporal-keyword query description. This paper identifies an important efficiency problem in existing IoT search engines that simply apply a keyword or spatial-temporal matching to identify object devices that satisfy the query requirement, but that do not simultaneously consider the spatial-temporal-keyword aspect. To shed light on this line of research, we present a novel SMSTK search engine, the core of which is a coding enabled index called STK-tree that seamlessly integrates spatial-temporal-keyword proximity. Further, we propose efficient algorithms for processing range queries. Extensive experiments suggest that SMSTK search engine enables efficient query processing in spatial-temporal-keyword-based object device search.

Jine Tang, Zhangbing Zhou

Fuel Consumption Estimation of Potential Driving Paths by Leveraging Online Route APIs

Greenhouse gas and pollutant emissions generated by an increasing number of vehicles have become a significant problem in modern cities. Estimating fuel usage of potential driving paths can help drivers choose fuel-efficient paths to save more energy and protect the environment. In this paper, we build a fuel consumption model (FCM) for drivers based on their historical GPS trajectory and OBD-II data. FCM on a path only needs three parameters (i.e., the path distance, traveling time on the path and path curvature), which can be easily obtained from online route APIs. Based on experiment results, we can conclude that the proposed model can achieve high accuracy, with a mean fuel consumption error of less than 10% for paths longer than 15 km. In addition, the traveling time on paths provided by online route APIs is accurate and can be input into FCM to estimate the fuel usage of paths.

Yan Ding, Chao Chen, Xuefeng Xie, Zhikai Yang

Large-Scale Semantic Data Management For Urban Computing Applications

Due to the current lack of effective perception, management, and coordination for urban computing applications, a large amount of semantic data has not yet been fully exploited and utilized, decreasing the effectiveness of urban services. To address the problem, we propose a semantic data management framework, RDFStore, for large-scale urban data management and query. RDFStore uses hash codes as the basic encoding pattern for semantic data storage. Based on the strong connectedness of data cliques with different semantics, we construct indexes through the maximum clique over the whole semantic dataset, so that the large-scale semantic data of urban computing is organized and managed. On the basis of the clique index, we adopt CLARANS clustering to enhance the accessibility of vertices. The experiment compares RDFStore to the mainstream platforms, and the results show that the proposed framework does enhance the effectiveness of semantic data management for urban computing applications.

Shengli Song, Xiang Zhang, Bin Guo

Parking Availability Prediction with Long Short Term Memory Model

Traffic congestion, which is often created by cars searching for on-street parking spaces, causes heavy energy consumption, carbon dioxide emissions and air pollution in cities. Drivers are likely to move slowly and waste time on the road looking for an available on-street parking space if parking slot availability information is not revealed in advance. Therefore, it is necessary for city councils to provide a car parking availability prediction service that can inform drivers of vacant parking slots before they start their journey. In this paper, we propose a novel framework based on a recurrent network and use the long short-term memory (LSTM) model to predict parking availability multiple steps ahead. The core idea of this framework is that both the occupancy rate of on-street parking in a specific region and the car leaving probability are exploited as prediction performance metrics. A large real parking dataset is used to evaluate the proposed approach with extensive comparative experiments. Experimental results show that the proposed model outperforms the state-of-the-art model.

Wei Shao, Yu Zhang, Bin Guo, Kai Qin, Jeffrey Chan, Flora D. Salim

Time-Based Quality-Aware Incentive Mechanism for Mobile Crowd Sensing

Recent years have witnessed the advance of mobile crowd sensing (MCS) systems. How to meet task time requirements and obtain high-quality data at little expense has become a critical problem. We focus on exploring incentive mechanisms for a practical scenario in which the tasks are time window dependent. An important indicator, “quality of user’s data (QOD)”, is also considered. First, we design a prediction model based on user history data (p-QOD) to predict the user’s next QOD. Second, we design a dynamic programming algorithm based on time windows and p-QOD to ensure that all of the task time windows are covered while minimizing the platform’s cost. Finally, we determine the payment for each user through a Vickrey–Clarke–Groves auction (VCG) considering the user’s true data quality (t-QOD), which is based on their submission time. Through both rigorous theoretical analysis and extensive simulations, we demonstrate that the proposed mechanisms achieve high computation efficiency, fairness, and individual rationality.

Han Yan, Ming Zhao
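
For readers unfamiliar with VCG payments, the following minimal sketch shows the rule in its simplest procurement form: the platform needs `k` interchangeable users, each user bids a cost, the `k` cheapest win, and each winner is paid the (k+1)-th lowest bid. It is a textbook special case, not the mechanism of the paper above; time windows, p-QOD and t-QOD are deliberately left out, and the function name and bid values are hypothetical.

```python
from typing import Dict


def vcg_procurement(bids: Dict[str, float], k: int) -> Dict[str, float]:
    """Select the k lowest bidders and return their VCG payments.

    With k identical slots, each winner's VCG payment equals the
    (k+1)-th lowest bid: the extra cost the platform would incur
    if that winner were absent.
    """
    if not 0 < k < len(bids):
        raise ValueError("need at least k + 1 bidders and k >= 1")
    ranked = sorted(bids.items(), key=lambda item: item[1])
    clearing_price = ranked[k][1]  # first losing bid
    return {user: clearing_price for user, _ in ranked[:k]}
```

Because a winner's payment depends only on other users' bids, no user can raise their payment by misreporting their own cost, which is the truthfulness property that motivates VCG-style payments.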

Cloud Computing, Mobile Computing and Crowd Sensing


Container-Based Customization Approach for Mobile Environments on Clouds

Recently, the mobile cloud, which utilizes the elastic resources of clouds to provide services for mobile applications, has become more and more popular. When building a mobile cloud platform (MCP), one of the most important things is to provide an execution environment for mobile applications, e.g., the Android mobile operating system (OS). Many efforts have been made to build Android environments on clouds, such as Android virtual machines (VMs) and Android containers. However, the need for customizable Android execution environments for MCP has been ignored for many years, since the existing OS customization solutions are designed only for hardware-specific platforms or driver-specific applications and take little account of frequently changing scenarios on clouds. Moreover, they lack a unified method of customization, as well as an effective upgrade and maintenance mechanism. As a result, they are not suitable for varied and large-scale scenarios on clouds. Therefore, in this paper, we propose a unified and effective approach for customizing Android environments on clouds. The approach provides a container-based solution to custom-tailor Android OS components, as well as a way to run Android applications for different scenarios. Under the guidance of this approach, we develop an automatic customization toolkit named AndroidKit for generating specific Android OS components. Through this toolkit, we are able to boot new Android VM instances called AndroidXs. These AndroidXs are composed of OS images generated by AndroidKit, which can be easily customized and combined for varied demands on clouds.

Jiahuan Hu, Song Wu, Hai Jin, Hanhua Chen

A Dynamic Resource Pricing Scheme for a Crowd-Funding Cloud Environment

With the rapid development of cloud computing and the exponential growth of cloud users, federated clouds are becoming increasingly prevalent based on the idea of resource cooperation. In this paper, we consider a new resource cooperation model called “Crowd-funding”, which is aimed at integrating and uniformly managing geographically distributed resource-limited resource owners to achieve a more effective use of resources. The resource owners are rational and maximize their own interest when contributing resources, so a reasonable pricing scheme can incentivize more resource owners to join the Crowd-funding system and increase their service level. Therefore, we propose a dynamic pricing scheme based on a repeated game between the “Crowd-funding” system and the resource owners. The simulation results show that our resource pricing scheme can achieve more effective and longer-lasting incentivizing effects for resource owners.

Nan Zhang, Xiaolong Yang, Min Zhang, Yan Sun

Multi-choice Virtual Machine Allocation with Time Windows in Cloud Computing

Virtual machine allocation is a core problem in cloud computing. Most cloud computing platforms allow users to submit only one requirement, which does not satisfy the diversity of user demands and also reduces the income of the platform. We propose a novel model, called multi-choice virtual machine allocation (MCVMA) with time windows, where users can enter and leave the system at any time and submit multiple requirements. We design an optimal algorithm based on dynamic programming and a heuristic algorithm based on resource scarcity and density for the MCVMA problem with time windows. We experimentally analyze both algorithms in terms of social welfare, execution time, resource utilization and users served.

Jixian Zhang, Ning Xie, Xuejie Zhang, Weidong Li

Fine-Gained Location Recommendation Based on User Textual Reviews in LBSNs

As user-generated reviews from Location Based Social Networks (LBSNs) are becoming increasingly pervasive, exploiting sentiment analysis based on users’ textual reviews for location recommendation has become a popular approach due to its explainability and high prediction accuracy. However, the inherent limitations of existing methods make it difficult to discover which aspects a user cares most about when visiting a location. In this study, we propose a fine-grained location recommendation model that jointly exploits users’ textual reviews and ratings from LBSNs, and considers not only the direct rating that a user would give a location but also the compatibility between the features a user is interested in and a location’s high-quality features. Specifically, the proposed recommendation model consists of three steps: (1) extracting feature-sentiment pairs from users’ textual reviews; (2) learning to rank features using an Elo-based scheme; (3) making fine-grained location recommendations. Experiment results demonstrate that our proposed model can improve recommendation performance compared with several state-of-the-art methods.

Yuanyi Chen, Zengwei Zheng, Lin Sun, Dan Chen, Minyi Guo
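
Step (2) of the abstract above ranks features with an Elo-based scheme. The abstract does not say how pairwise comparisons between features are formed, so the sketch below only shows the standard Elo rating update that such a scheme would build on; the K-factor of 32 and the 400-point scale are the conventional chess defaults, assumed here for illustration.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> Tuple[float, float]:
    """Return updated ratings after one comparison.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie.
    """
    e_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - e_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b
```

For example, two features that both start at 1500 move to 1516 and 1484 after the first one wins a comparison; the total rating is conserved by every update.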

Social and Urban Computing


Estimating Origin-Destination Flows Using Radio Frequency Identification Data

Origin-destination (OD) demand is a critical information source used in strategic traffic planning and management. Radio Frequency Identification (RFID) is an advanced technique for collecting traffic data. In this paper, daily origin-destination trips are inferred from RFID data, with the locations of RFID readers considered as the origins and destinations. However, the sparseness of RFID data introduces uncertainty about the destination of a trip. To handle this problem, an approach is proposed to estimate the OD matrix. First, the driving time of trip legs in all trajectories is calculated from the driving time of taxis, which can be distinguished in the RFID data. Then the stay, i.e., the last pass-by RFID reader of a trip, is inferred based on the calculated driving time. Finally, we extract daily origin-destination trips for all vehicles. A case study using real-world data collected in Chongqing, China, demonstrates the effectiveness of the proposed approach.

Chaoxiong Chen, Linjiang Zheng, Chen Cui, Weining Liu

A Multi-task Decomposition and Reorganization Scheme for Collective Computing Using Extended Task-Tree

Task management, including task decomposition, distribution, execution and results integration, has always been a key issue in collective computing, but there is little research on task decomposition. In order to improve multi-task execution efficiency and promote the full utilization of collective resources, a task decomposition model based on an extended task-tree is proposed in this paper. In addition, a series of pruning and reorganization algorithms are proposed, and their performance is analyzed and evaluated. Experiments verify that the proposed algorithms outperform traditional methods and demonstrate the practicality and efficiency of the strategy.

Zhenhua Zhang, Yunlong Zhao, Yang Li, Kun Zhu, Ran Wang

CompetitiveBike: Competitive Prediction of Bike-Sharing Apps Using Heterogeneous Crowdsourced Data

In recent years, bike-sharing systems have been deployed in many cities, providing an economical lifestyle. With the prevalence of bike-sharing systems, many companies have joined the market, leading to increasingly fierce competition. To be competitive, bike-sharing companies and app developers need to make strategic decisions for mobile app development. Therefore, it is important to predict and compare the popularity of different bike-sharing apps. However, existing works mostly focus on predicting the popularity of a single app; the popularity contest among different apps has not yet been well explored. In this paper, we aim to forecast the popularity contest between Mobike and Ofo, the two most popular bike-sharing apps in China. We develop CompetitiveBike, a system to predict the popularity contest among bike-sharing apps. Moreover, we conduct experiments on real-world datasets collected from 11 app stores and Sina Weibo, and the experiments demonstrate the effectiveness of our approach.

Yi Ouyang, Bin Guo, Xinjiang Lu, Qi Han, Tong Guo, Zhiwen Yu

Dual World Network Model Based Social Information Competitive Dissemination

The study of the competitive dissemination of social information is of great significance to product marketing, political competition, and public opinion. Based on the existing small-world network model, this paper establishes a dual world network model that incorporates geographical factors to describe information dissemination in society from two aspects: human relations and geographical relations. In addition, in order to describe the competitive relationship among a variety of opinions, Opinion Acceptance Rules (OAR) are designed and simulated in the MATLAB environment. The simulations demonstrate several communication phenomena, such as information explosion, information balance, and information islands.

Ze-lin Zang, Jia-hui Li, Ling-yun Xu, Xu-sheng Kang

Parallel and Distributed Systems, Optimization


WarmCache: A Comprehensive Distributed Storage System Combining Replication, Erasure Codes and Buffer Cache

A tiered storage system uses replication to provide both high reliability and availability, storing three replicas on different nodes in the cluster. Erasure codes (EC) such as Reed-Solomon (RS) are increasingly utilized to further reduce the storage overhead, at the cost of lower I/O performance and availability. Existing solutions implement heterogeneous storage systems using triple replication, erasure coding, or a combination of both, but this leaves a large performance gap between data layers. To address this problem, in this paper, we introduce WarmCache, a new data layer for warm data that stores one copy using erasure coding and the other copy in an in-memory data layer. The copy in the erasure coding data layer ensures data reliability, while the copy in the in-memory data layer provides fast I/O performance.

Brian A. Ignacio, Chentao Wu, Jie Li

imBBO: An Improved Biogeography-Based Optimization Algorithm

Biogeography-Based Optimization (BBO) is a new evolutionary algorithm for global optimization based on the science of biogeography. However, its direct-copying-based migration and random mutation operators bias it toward local exploitation. To enhance the performance of BBO, we propose an improved BBO algorithm called imBBO. A hybrid migration operation is designed to further improve population diversity and enhance the algorithm's exploration ability. Empirical results demonstrate that imBBO achieves better optimization performance than the original BBO and three BBO variants on 23 out of 30 CEC’2017 benchmarks. Moreover, imBBO converges faster.

Kai Shi, Huiqun Yu, Guisheng Fan, Xingguang Yang, Zheng Song

An Efficient Consensus Protocol for Real-Time Permissioned Blockchains Under Non-Byzantine Conditions

Blockchains are increasingly used in collaboration between businesses as a trusted distributed ledger. Coping with massive data transactions raises the requirement of real-time safety of blockchains. The celebrated Raft protocol has limitations as a consensus protocol for permissioned blockchains, where strong consistency is needed between clients and servers. In this work, we propose a new consensus protocol called Dynasty, which ensures real-time safety and liveness under all non-Byzantine conditions. Based on Dynasty, we design and implement a three-layer permissioned blockchain framework which tolerates f failures with 2f + 1 correct servers. We demonstrate the blockchain as a service in a used-vehicle trading management application and evaluate the performance of the blockchain framework in terms of throughput and latency. Experimental results show that while the latency at different system scales increases as expected, the number of committed transactions per second stabilizes within a difference of less than 8% after a warming-up period.

Gengrui Zhang, Chengzhong Xu
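
The 2f + 1 figure in the abstract above matches the usual majority-quorum arithmetic of crash-fault-tolerant protocols such as Raft. The sketch below works through that arithmetic under the assumption that 2f + 1 is the total cluster size and that progress requires a strict majority; it says nothing about how Dynasty itself works, and the function names are illustrative.

```python
def servers_needed(f: int) -> int:
    """Cluster size that stays available with up to f crashed servers."""
    return 2 * f + 1


def majority_quorum(n: int) -> int:
    """Smallest group that is a strict majority of n servers."""
    return n // 2 + 1


def tolerated_crash_faults(n: int) -> int:
    """Largest number of crashes that still leaves a majority alive."""
    return (n - 1) // 2


def quorums_always_intersect(n: int) -> bool:
    """Any two majorities of the same n servers share at least one member."""
    return 2 * majority_quorum(n) > n
```

The intersection property is what lets a newly chosen quorum learn every previously committed entry: at least one of its members took part in the earlier commit.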

EDF-Based Mixed-Criticality Systems with Weakly-Hard Timing Constraints

Safety-critical embedded systems are often subject to multiple certification requirements from different certification authorities, giving rise to the concept of Mixed-Criticality Systems. In the classical Mixed-Criticality Scheduling task model, all low-criticality tasks are dropped in high-criticality mode. This approach may not be very practical in reality, since it may cause serious degradation of Quality-of-Service (QoS) for low-criticality tasks. In this paper, we present EDF with Virtual Deadlines-Weakly Hard (EDF-VD-WH), where a number of consecutive jobs of LO-crit tasks may be skipped in high-criticality mode, in order to provide a certain level of QoS for low-criticality tasks in high-criticality mode. We present schedulability analysis of EDF-VD-WH based on Demand Bound Functions, and perform experimental evaluation of schedulability acceptance ratios compared to the original EDF-VD.

Hao Wu, Zonghua Gu, Hong Li, Nenggan Zheng
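
The schedulability analysis mentioned in the abstract above is built on Demand Bound Functions. As background, the sketch below computes the classical demand bound function for sporadic tasks under plain EDF on one processor; it does not include the virtual deadlines or job-skipping of EDF-VD-WH, and the `(wcet, deadline, period)` tuple layout, integer time units and brute-force horizon check are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Task = Tuple[int, int, int]  # (wcet C, relative deadline D, period T)


def dbf(tasks: Iterable[Task], t: int) -> int:
    """Maximum execution demand of jobs that are both released and due
    within any interval of length t (classical sporadic-task DBF)."""
    total = 0
    for wcet, deadline, period in tasks:
        if t >= deadline:
            total += ((t - deadline) // period + 1) * wcet
    return total


def edf_demand_ok(tasks: Iterable[Task], horizon: int) -> bool:
    """Check dbf(t) <= t for every integer t up to horizon.

    Brute-force check for illustration; real analyses test only
    absolute deadlines up to a derived bound.
    """
    tasks = list(tasks)
    return all(dbf(tasks, t) <= t for t in range(1, horizon + 1))
```

For the task set `[(1, 4, 5), (2, 6, 10)]`, the demand is 0 at t = 3, 1 at t = 4 and 3 at t = 6, so the demand never exceeds the interval length.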

GA-Based Mapping and Scheduling of HSDF Graphs on Multiprocessor Platforms

Synchronous Dataflow (SDF) is a widely used model of computation for signal processing and multimedia applications. We address the problem of mapping a Homogeneous Synchronous Dataflow (HSDF) graph onto a multiprocessor platform with the objective of maximizing system throughput. Since this is an NP-hard combinatorial optimization problem, it is computationally infeasible to use exhaustive search to obtain optimal solutions for large applications. In this paper, we apply Genetic Algorithms (GA) to search the design space of all possible actor-to-processor mappings and static-order schedules on each processor to find a near-optimal solution, and compare the performance and scalability of GA with an exact solution technique based on SAT solving.

Hao Wu, Nenggan Zheng, Hong Li, Zonghua Gu

Integration and Evaluation of a Contract-Based Flexible Real-Time Scheduling Framework in AUTOSAR OS

FRESCOR (Framework for Real-time Embedded Systems based on COntRacts) is a flexible real-time scheduling architecture for real-time systems. This paper describes our experience of integrating the FRESCOR framework into an AUTOSAR-compliant Real-Time Operating System. Performance evaluation shows that the performance overhead of integrating FRESCOR is acceptable on an embedded microcontroller.

Ming Zhang, Nenggan Zheng, Hong Li

Pervasive Applications


A Low-Cost Service Node Selection Method in Crowdsensing Based on Region-Characteristics

Crowdsensing is a human-centred perception model in which an entire sensing task is completed through the cooperation of multiple nodes. To improve the efficiency of accomplishing sensing missions, a proper and cost-effective set of service nodes is needed to perform tasks. In this paper, we propose a low-cost service node selection method based on region features, which builds on relationships between task requirements and geographical locations. The method uses DBSCAN to cluster service nodes and calculates the centre point of each cluster. The area is then divided into regions according to the rules of the Voronoi diagram. Local feature vectors are constructed from the historical records in each region. When a particular perception task arrives, the Analytic Hierarchy Process (AHP) is used to match the feature vector of each region to mission requirements to obtain a certain number of service nodes satisfying the characteristics. To lower the cost, a revised greedy algorithm is designed to filter the selected service nodes down to the required low-cost set. Experimental results suggest that the proposed method shows promise in improving service node selection accuracy and the timeliness of finishing tasks.

Zhenlong Peng, Jian An, Xiaolin Gui, Dong Liao, RuoWei Gui
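
The region division step in the abstract above (cluster centres, then a Voronoi partition) amounts to assigning every location to its nearest centre. The sketch below illustrates just that step, assuming the clusters have already been produced (for example by DBSCAN) and that coordinates can be treated as planar; the function names and sample points are hypothetical.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Mean position of the nodes in one cluster."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def voronoi_cell(point: Point, centres: Sequence[Point]) -> int:
    """Index of the centre whose Voronoi region contains the point,
    i.e. the nearest centre by Euclidean distance."""
    return min(range(len(centres)), key=lambda i: dist(point, centres[i]))


def partition(points: Sequence[Point],
              centres: Sequence[Point]) -> List[List[Point]]:
    """Group points by the Voronoi region they fall into."""
    regions: List[List[Point]] = [[] for _ in centres]
    for p in points:
        regions[voronoi_cell(p, centres)].append(p)
    return regions
```

Nearest-centre assignment avoids constructing the Voronoi polygons explicitly, which is sufficient when the only question is which region a historical record or a new task location belongs to.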

Electric Load Forecasting Based on Sparse Representation Model

Accurate electric load forecasting can prevent the waste of power resources and plays a crucial role in the smart grid. The time series of electric load collected by smart meters are non-linear and non-stationary, which poses a great challenge to traditional forecasting methods. In this paper, a sparse representation model (SRM) is proposed as a novel approach to tackle this challenge. The main idea of SRM is to obtain sparse representation coefficients from the training set and one part of an over-complete dictionary; the remaining part of the dictionary, multiplied by the sparse representation coefficients, can then be used to predict future load values. Experimental results demonstrate that SRM is capable of forecasting complex electric load time series effectively. It outperforms some popular machine learning methods such as Neural Network, SVM, and Random Forest.

Fangwan Huang, Xiangping Zheng, Zhiyong Yu, Guanyi Yang, Wenzhong Guo

Sensing Urban Structures and Crowd Dynamics with Mobility Big Data

To facilitate efficient and effective city management, it is important for urban authorities to understand the regular functionalities of urban areas and the irregular crowd dynamics moving around the city. However, existing methods relying on manual surveys and statistics usually cost substantial time and labor, hindering the fine-grained characterization of urban structures and the in-depth understanding of crowd dynamics. In this paper, we leverage large-scale mobility data collected from vehicle GPS devices to analyze the dynamics of crowd movement in different urban areas in a low-cost and automatic manner. We extract the regular crowd movement patterns in different areas, detect abnormal peaks in crowd movement flow, and then interpret the influences of different types of urban events. More specifically, we first divide the city into fine-grained geographic regions and cluster them according to the similarity of crowd movement characteristics. Second, we detect anomalous traffic flow for each cluster area, interpret urban events for each abnormal flow point, and correlate urban events to the interpretation results. Finally, we determine the scope of urban events and use visualization techniques to demonstrate the impact of different types of urban events. We use large-scale real-world datasets from Xiamen City for evaluation. Experimental results validate the effectiveness of our method, and several case studies in Xiamen are conducted.

Yan Liu, Longbiao Chen, Linjin Liu, Xiaoliang Fan, Sheng Wu, Cheng Wang, Jonathan Li

A Multiple Factor Bike Usage Prediction Model in Bike-Sharing System

Bike-sharing is becoming popular around the world, providing a convenient service for citizens. The system has to redistribute bikes among different stations frequently to correct imbalances in their spatial distribution. Real-time monitoring doesn’t solve this problem well, since redistributing the bikes takes too much time and affects the user experience. In this paper, we first analyze the influence of factors such as time, weather, and the location of stations. Then we cluster neighboring stations with similar usage patterns and propose a lagged variable to model the effect of weather conditions on usage. Finally, a multiple factor regression model with ARMA error (MFR-ARMA) is proposed to predict the check-out/in number of bikes in each cluster over a period of time. The evaluation dataset is from the New York Bike Sharing System. The prediction results of the model are compared with four baseline methods. The experiments show a lower RMSLE and ER for check-out/in number prediction in our model.

Zengwei Zheng, Yanzhen Zhou, Lin Sun
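
The abstract above reports RMSLE without defining it. As a minimal sketch, assuming the usual definition of root mean squared logarithmic error (the square root of the mean squared difference between log(1 + predicted) and log(1 + actual)), the metric could be computed as follows. The function name `rmsle` and the sample counts are illustrative and are not taken from the paper.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error for non-negative counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical hourly check-out counts for one station cluster.
actual_checkouts = [12, 0, 7, 30]
predicted_checkouts = [10, 1, 7, 25]
print(rmsle(predicted_checkouts, actual_checkouts))
```

Because the error is taken on a logarithmic scale, a miss of five bikes at a quiet station weighs more than the same miss at a busy one, which is one reason this metric is often chosen for count data.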

Data Mining and Knowledge Mining


Talents Recommendation with Multi-Aspect Preference Learning

Discovering talent has always been a crucial mission in recruitment and applicant selection programs. Traditionally, hunting for and identifying the best candidate for a particular job is carried out by specialists in the human resources department, which requires complex manual data collection and analysis. In this paper, we propose to seek talent for companies by leveraging a variety of data from not only online professional networks (e.g., LinkedIn), but also other popular social networks (e.g., Foursquare and Specifically, we extract three distinct types of features, namely global, user, and job preferences, to understand the patterns of talent recruitment, and then propose a Multi-Aspect Preference Learning (MAPL) model for applicant recommendation. Experimental results based on a real-world dataset validate the effectiveness and usability of our proposed method, which achieves nearly 75% accuracy at best in recommending candidates for job positions.

Fei Yi, Zhiwen Yu, Huang Xu, Bin Guo

A Temporal Learning Framework: From Experience of Artificial Cultivation to Knowledge

This paper presents a novel learning framework to generate fine-grained temporal cultivation knowledge from large volumes of climatic sensor data. Compared with control based on human experience, the machine-learned cultivation knowledge can provide precise climatic descriptions in the temporal domain during the growth of plants. In the paper, the temporal characteristics of the sensor data are analyzed with heat maps in different temporal aspects. A merging algorithm on temporal segments, which are initialized with respect to the regularity of the heat maps, is designed to create climatic labels. Then training samples consisting of temporal attributes and climatic labels are constructed for knowledge learning, and the learned knowledge is represented as a collection of tree-structured classifiers. The experiments are carried out on the cultivation of a valuable Chinese herbal medicine. A cultivation knowledge cube in month, day, and hour dimensions is illustrated. The results show that about 80% of the climatic conditions in past successful cases can be reproduced by our method to guide future artificial cultivation. The framework can also be applied to learn the knowledge of cultivation practices for other plants.

Lin Sun, Zengwei Zheng, Jianzhong Wu, JianFeng Zhu

A Recency Effect Hidden Markov Model for Repeat Consumption Behavior Prediction

With the rapid development of mobile payment technology in China, people can use smartphones with mobile payment apps (such as Alipay, WeChat Pay, and Apple Pay) to pay bills instead of paying cash. Some commercial platforms have accumulated large amounts of transaction data from users' smartphones. In repeat consumption activities, the most recent few consumptions have a greater impact on current consumption than long-ago consumption. However, a traditional HMM cannot deal with this recency effect in our repeat consumption case. This paper proposes a modified HMM method based on the recency effect to predict users' repeat consumption behavior. We introduce a factor to represent how the recency effect varies with time distance. An empirical study on real-world data sets shows encouraging results for our approach, especially for the consumer group with the most uncertain consumption behavior.

Zengwei Zheng, Yanzhen Zhou, Lin Sun

Forecast of Port Container Throughput Based on TEI@I Methodology

Forecasting container throughput accurately is crucial to the success of any port operation policy. At present, prediction of container throughput is mainly based on traditional time series analysis or a single artificial neural network technique. Recent studies show that combined forecast models produce more precise forecasts than single-model approaches. In this study, a TEI@I hybrid forecasting model is proposed, which is based on ARIMA (autoregressive integrated moving average model) and a BP neural network. Under the proposed framework, the ARIMA model is first used to predict the linear component, and then the BP neural network is used to predict the error of the ARIMA model, which is the nonlinear component. The new method is applied to forecasting the container throughput of Qingdao Port, one of the most important ports in China. The empirical results show that this prediction method has higher prediction accuracy than the single prediction methods.

Qingfei Liu, Laisheng Xiang, Xiyu Liu
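
The two-stage scheme described in the abstract above can be written compactly. The notation below ($y_t$ for observed throughput, $L_t$ and $N_t$ for the linear and nonlinear components, $e_t$ for the ARIMA residual, $f$ for the BP network, and $p$ for the number of lagged residuals it sees) is assumed here for illustration and is not taken from the paper. It follows the additive formulation commonly used for ARIMA and neural network hybrids, and the choice of lagged residuals as network inputs is likewise an assumption.

```latex
\begin{aligned}
y_t &= L_t + N_t \\
e_t &= y_t - \hat{L}_t \\
\hat{N}_t &= f(e_{t-1}, e_{t-2}, \dots, e_{t-p}) \\
\hat{y}_t &= \hat{L}_t + \hat{N}_t
\end{aligned}
```

Here $\hat{L}_t$ is the ARIMA forecast, so the network only has to model what the linear model leaves unexplained.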



Named Entity Recognition Based on BiRHN and CRF

Named entity recognition is one of the basic tasks in the field of natural language processing. By utilizing a bidirectional LSTM, Lample et al. achieved the best results in named entity recognition in 2016. In this paper, we propose a new neural network structure based on bidirectional Recurrent Highway Networks (BiRHN for short) and Conditional Random Field (CRF for short). RHN, which extends the LSTM architecture to allow step-to-step transition depths larger than one, is a good solution to the problems caused by gradients. Experiments on several datasets show that our model achieves better results (F1 values) than the model of Lample et al.

DongYang Zhao

Quadratic Permutation Polynomials-Based Sliding Window Network Coding in MANETs

Quadratic permutation polynomials provide very good coding performance, and they also support a specific form of conflict-free parallel access. Network coding (NC) is a technique in which relay nodes mix packets using mathematical operations, which can increase the network throughput and data persistence in a Mobile Ad hoc NETwork (MANET). In this paper, we propose a Quadratic Permutation Polynomials-based Sliding Window Network Coding in MANETs (QPP-SWNC). QPP-SWNC makes it possible to control the decoding complexity of each sliding window independently of the packets received and to recover the original data. The performance of QPP-SWNC is studied using NS2 and evaluated in terms of the encoding overhead, decoding delay, and throughput when a packet is transmitted. The simulation results show that the proposed QPP-SWNC can significantly improve the network throughput and encoding efficiency.

Chao Gui, Baolin Sun, Xiong Liu, Ruifan Zhang, Chengli Huang
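
For readers unfamiliar with the term used in the abstract above, a quadratic permutation polynomial over the integers modulo n is a map f(x) = (f1·x + f2·x²) mod n that visits every index 0..n-1 exactly once. The sketch below only illustrates that definition with a brute-force check; the function names and the small coefficients are chosen for illustration, and it does not reproduce how QPP-SWNC applies the polynomial inside its sliding window.

```python
def qpp_interleaver(f1, f2, n):
    """Return [f(0), ..., f(n-1)] for f(x) = (f1*x + f2*x*x) mod n."""
    return [(f1 * x + f2 * x * x) % n for x in range(n)]


def is_permutation(mapping):
    """True if the mapping hits every index 0..n-1 exactly once."""
    return sorted(mapping) == list(range(len(mapping)))


pattern = qpp_interleaver(1, 2, 8)
print(pattern)                  # [0, 3, 2, 5, 4, 7, 6, 1]
print(is_permutation(pattern))  # True
print(is_permutation(qpp_interleaver(2, 0, 4)))  # False: 0 and 2 repeat
```

Not every coefficient pair yields a permutation, as the last call shows, so coefficients have to be chosen to suit the block length n.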

Image Retrieval Using Inception Structure with Hash Layer for Intelligent Monitoring Platform

To address the low efficiency and accuracy of traditional image retrieval, an image retrieval method using an inception structure with hash layers is presented for an intelligent monitoring platform. The main idea of our work is to add hash layers to the inception structure of a deep neural network, which transform the global average pooling features into low-dimensional binary hash codes. Our method not only ensures the sparseness of the neural network but also avoids overfitting. Experimental results on the MNIST and CIFAR-10 datasets show that our method achieves higher retrieval efficiency and accuracy than earlier methods.

BaoHua Qiang, Xina Shi, Yufeng Wang, Zhi Xu, Wu Xie, Xianjun Chen, Xingchao Zhao, Xukang Zhou
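
The retrieval step implied by the abstract above, comparing short binary codes instead of full feature vectors, can be sketched as follows. This is a minimal illustration under stated assumptions: the hash layer is assumed to output activations in [0, 1] that are binarized at a threshold of 0.5, and ranking is assumed to use Hamming distance. The function names, the threshold, and the tiny database are hypothetical and not taken from the paper.

```python
def to_hash_code(activations, threshold=0.5):
    """Binarize hash-layer activations (assumed to lie in [0, 1])."""
    return tuple(1 if a >= threshold else 0 for a in activations)


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length codes differ."""
    return sum(bit_a != bit_b for bit_a, bit_b in zip(code_a, code_b))


def retrieve(query_code, database, top_k=2):
    """Return the names of the top_k stored codes closest to the query."""
    ranked = sorted(
        database.items(),
        key=lambda item: hamming_distance(query_code, item[1]),
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical 4-bit codes for three stored images.
database = {
    "img_a": (1, 0, 1, 1),
    "img_b": (0, 0, 1, 0),
    "img_c": (1, 1, 1, 1),
}
query = to_hash_code([0.9, 0.2, 0.7, 0.6])
print(query)                      # (1, 0, 1, 1)
print(retrieve(query, database))  # ['img_a', 'img_c']
```

Comparing bits is much cheaper than comparing floating-point feature vectors, which is the efficiency argument behind hashing-based retrieval.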

Retail Consumer Traffic Multiple Factors Analysis and Forecasting Model Based on Sparse Regression

The rapid development of O2O business has increased the competition among offline shops in China. Accurate prediction of a shop's customer traffic can help stores adjust their sales strategy in a timely manner and improve their competitiveness. Customer traffic forecasting is more than a time series problem. In fact, customer traffic for the next period is related to some external factors in addition to historical traffic. In this paper, the external factors affecting customer traffic are analyzed using sparse coding, and we propose a sparse regression forecasting model that incorporates these external factors. The results show that these external factors have varying degrees of impact on customer traffic, and the prediction accuracy is significantly improved after considering these factors.

Zengwei Zheng, Junjie Du, Yanzhen Zhou, Lin Sun, Meimei Huo, Jianzhong Wu

Consulting and Forecasting Model of Tourist Dispute Based on LSTM Neural Network

To fill the gap in legal consultation resources for tourist disputes, a consulting model for tourist disputes is proposed. The legal consultation model studied in this paper is based on the Long Short-Term Memory (LSTM) network. For natural language processing, the Chinese word segmentation tool jieba, popular in Python, is adopted; dialogue is realized through the sequence-to-sequence (seq2seq) translation model, and an attention model is used to address the problem of long input sequences being covered or diluted. Finally, TensorFlow, Google's second-generation artificial intelligence learning system developed on the basis of DistBelief, is adopted to train and optimize the forecasting model in this research.

Yiren Du, Jun Liu, Shuoping Wang
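
The abstract above says that an attention model keeps long input sequences from being diluted. The idea can be sketched as follows: at each decoding step, every encoder state receives a weight, and the decoder reads a weighted average rather than a single fixed summary. The sketch assumes dot-product scoring, which the abstract does not specify; the function names and the toy vectors are hypothetical, and the code uses plain Python rather than the TensorFlow implementation the paper describes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return attention weights and the resulting context vector.

    Scoring is a plain dot product (an assumption made for this sketch).
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy example: two encoder states, one aligned with the decoder state.
weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(weights)
print(context)
```

The encoder state that best matches the current decoder state receives the largest weight, so early tokens of a long question can still influence the answer.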

Research and Design of Cloud Storage Platform for Field Observation Data in Alpine Area

With the rapid increase in the volume of field observation data in alpine regions, geoscience researchers face many data storage problems, such as a lack of sufficient hardware storage devices, high maintenance costs, and an incomplete storage environment. Cloud storage technology based on open source cloud computing platforms can effectively solve these problems. Therefore, this paper designs and constructs a cloud storage platform for field observation data in alpine regions based on the Apache Hadoop cloud platform. The platform supports creating, uploading, and browsing field observation data files in cloud storage, meeting researchers' needs to store observation data, share information, and make backups. The system also enables efficient management of server resources and provides large-scale data processing capabilities.

Jiuyuan Huo

A Novel PSO Algorithm for Traveling Salesman Problem Based on Dynamic Membrane System

Membrane computing is a class of distributed parallel computing models. In this paper, we propose a novel evolutionary computation method based on a dynamic active membrane system. First, an improved particle swarm optimization based on the neighborhood searching of every particle, called NPSO, is proposed. That is, instead of learning from Pbest and Gbest during the whole evolution, the proposed NPSO learns from Pbest and NPbest (the NPbest is selected by the Neighborhood Searching Based Learning Strategy) in the early stage to preserve swarm diversity. After a predefined number of iterations, NPSO switches to the conventional global version of PSO to accelerate convergence. Second, in order to avoid premature convergence in the early stage, NPSO is partitioned into two stages: the first preserves swarm diversity and the second enhances the convergence speed toward the global optimum. The classic Traveling Salesman Problem (TSP) is one of the most significant stochastic routing problems, so we use the proposed NPSO to solve it. In fact, NPSO can also achieve a better balance between exploration and exploitation. Experimental results show that the proposed NPSO algorithm is superior or competitive.

Yanmeng Wei, Xiyu Liu
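
The two-stage learning rule in the abstract above can be sketched with the textbook PSO velocity update, in which the second attractor is NPbest before a switch iteration and Gbest afterwards. This is a continuous-space illustration only: the abstract does not say how NPbest is selected, how tours are encoded for the TSP, or which coefficients are used, so `choose_guide`, `update_particle`, and the values of `w`, `c1`, and `c2` are assumptions made for this sketch.

```python
import random


def choose_guide(iteration, switch_iteration, npbest, gbest):
    """Early stage: follow the neighborhood best. Later: the global best."""
    return npbest if iteration < switch_iteration else gbest


def update_particle(position, velocity, pbest, guide, rng,
                    w=0.7, c1=1.5, c2=1.5):
    """One standard PSO step toward the personal best and the chosen guide."""
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, pbest, guide):
        r1, r2 = rng.random(), rng.random()
        v_next = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity


rng = random.Random(0)  # seeded so repeated runs give the same numbers
npbest, gbest = [0.5, 0.5], [1.0, 1.0]
guide = choose_guide(iteration=3, switch_iteration=10,
                     npbest=npbest, gbest=gbest)
position, velocity = update_particle(
    [0.0, 0.0], [0.1, -0.1], pbest=[0.2, 0.1], guide=guide, rng=rng
)
print(guide, position, velocity)
```

Following a neighborhood best early on keeps particles from all collapsing onto one point, which is the diversity argument the abstract makes.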

Breaking Though the Limitation of Test Components Using in Authentication Test

In order to break through the limitation that test components in authentication tests cannot be encrypted, researchers have conducted many extension studies of strand space and made some achievements. First, in this paper we analyze the new definitions and improved theorems proposed by those researchers and point out their restrictions and inaccuracies by way of strand theory and examples. Second, we propose a new definition named minimum encryption term, which effectively limits the number of forms in which components appear in strand space, lessens the redundancy of authentication tests, and simplifies the analysis of nested term encryption. Based on the minimum encryption term, we provide improved authentication test theorems: the NE outgoing test, NE incoming test, and NE unsolicited test, which help to verify a symmetric protocol and discover its flaws, that is, that the protocol is an easy target of a Man-In-The-Middle attack. These improved theorems increase the accuracy of authentication tests and extend their scope of use to both symmetric and asymmetric cryptosystems.

Meng-meng Yao, Hai-ping Xia

