About this book

The 2010 International Conference on Web Information Systems and Mining (WISM 2010) was held October 23–24, 2010 in Sanya, China. WISM 2010 received 603 submissions from 20 countries and regions. After rigorous reviews, 54 high-quality papers were selected for publication in the WISM 2010 proceedings. The acceptance rate was 9%. The aim of WISM 2010 was to bring together researchers working in many different areas of Web information systems and Web mining to foster the exchange of new ideas and promote international collaboration. In addition to the large number of submitted papers and invited sessions, there were several internationally well-known keynote speakers. On behalf of the Organizing Committee, we thank Hainan Province Institute of Computer and Qiongzhou University for their sponsorship and logistics support. We also thank the members of the Organizing Committee and the Program Committee for their hard work. We are very grateful to the keynote speakers, invited session organizers, session chairs, reviewers, and student helpers. Last but not least, we thank all the authors and participants for their great contributions that made this conference possible.

October 2010
Fu Lee Wang, Gong Zhiguo, Xiangfeng Luo, Jingsheng Lei

Table of contents

Frontmatter

Applications of Web Information Systems

An Improving Multi-Objective Particle Swarm Optimization

In the past few years, a number of researchers have successfully extended particle swarm optimization to multiple objectives. However, obtaining a well-converged and well-distributed set of Pareto-optimal solutions remains an important issue. In this paper, we propose a fuzzy particle swarm optimization algorithm based on a fuzzy clustering method, a fuzzy strategy and archive update. The particles are evaluated and the non-dominated solutions are stored in different clusters in each generation, while dominated solutions are pruned. The non-dominated solutions are selected by the fuzzy strategy and added to the archive. The proposed fuzzy particle swarm optimization algorithm is observed to be competitive in terms of convergence to the Pareto-optimal front and diversity of solutions.

JiShan Fan

Study on RSA Based Identity Authentication Algorithm and Application in Wireless Controlling System of Aids to Navigation

This paper studies the secure wireless communication of control instructions in an aids-to-navigation system. An RSA-based authentication algorithm is employed, and Euclid addition chains are used to compute the modular exponentiation of RSA, transforming the shortest addition chain of a large number e into those of three smaller numbers a, b and c satisfying e = a×b + c. Experimental results show that the Euclid addition chains need much less time and space to create. With authenticated wireless communication, control instructions can be transmitted safely and accurately, and it is hard for malicious users to control the intelligent lights directly. This is of great significance for the navigation industry.

Fu-guo Dong, Hui Fan
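
The decomposition e = a×b + c mentioned in the abstract above implies that m^e mod n can be computed as ((m^a)^b · m^c) mod n. The following is a minimal Python sketch of that identity only. It uses Python's built-in `pow` rather than the Euclid addition chains studied in the paper, and the function name `modexp_split` and the toy values are assumptions chosen for illustration.

```python
def modexp_split(m: int, a: int, b: int, c: int, n: int) -> int:
    """Compute m**(a*b + c) mod n from three smaller exponents a, b, c."""
    m_a = pow(m, a, n)       # m^a mod n
    m_ab = pow(m_a, b, n)    # (m^a)^b = m^(a*b) mod n
    m_c = pow(m, c, n)       # m^c mod n
    return (m_ab * m_c) % n  # m^(a*b + c) mod n


# Hypothetical toy values: e = 65537 = 256 * 256 + 1, n = 61 * 53
e, a, b, c = 65537, 256, 256, 1
n = 3233
assert a * b + c == e
assert modexp_split(42, a, b, c, n) == pow(42, e, n)
```

The sketch shows why the split is valid. It says nothing about the time and space savings reported in the paper, which depend on how the addition chains for a, b and c are built.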

Invariant Subspaces for Some Compact Perturbations of Normal Operators

The famous computer scientist J. von Neumann initiated the research of the invariant subspace theory and its applications. This paper shows that compact perturbations of normal operators on an infinite-dimensional Hilbert space H satisfying certain conditions have nontrivial closed invariant subspaces.

Mingxue Liu

Research of Cooperative Geosteering Drilling Virtual System Based on Network

To address the shortcomings of traditional information presentation and working style in geosteering drilling, a network-based cooperative geosteering drilling virtual system is developed using virtual reality and CSCW techniques. The system provides a cooperative virtual working platform for multi-domain experts and drilling technicians in different locations, allowing them to carry out cooperative decision analysis and control while drilling on the same well at the same time. Three-dimensional visualization of the drilling process is implemented using Java 3D, which makes it convenient to control wellbore tracks in real time while drilling and can thus improve the drilling success ratio. Real-time synchronization between different users’ scenes is achieved with socket communication and multi-threading. Based on an analysis of the particularities of cooperative operation in geosteering drilling, a new client-side concurrency control mechanism based on attribute operations is introduced to improve the consistency, responsiveness and concurrency of multi-user cooperation. This paper details the architecture design and key techniques of the system, including the management of drilling virtual objects, construction of the virtual scene, synchronization between multiple users’ scenes, and multi-user concurrency control.

Xiaorong Gao, Yingzhuo Xu

Optimization of Storage Location Assignment for Fixed Rack Systems

A multi-objective mathematical model and an improved Genetic Algorithm (GA) are formulated for storage location assignment in a fixed rack system. According to the assignment rules, the optimization aim is to maximize storage/retrieval efficiency while maintaining the stability of the rack system. An improved GA with Pareto optimization and niche technology is developed. The approach handles Pareto solution sets with the traditional operators, while the niche technology distributes the solutions uniformly within the Pareto solution sets. The approach ensures optimized storage location assignment and offers a dynamic decision-making scheme for automated storage and retrieval systems (AS/RS).

Qinghong Wu, Ying Zhang, Zongmin Ma

Evaluation Query Answer over Inconsistent Database with Annotations

In this paper, we introduce an annotation-based data model for relational databases that may violate a set of functional dependencies. In the data model, every piece of data in a relation can have zero or more annotations, and annotations are propagated along with queries from the source to the output. With annotations, both input data and query answers can be divided into certain and uncertain parts down to the cell level, which avoids information loss. To query an annotated database, we propose CASQL, an extension of SPJ-UNION SQL, and algorithms for evaluating CASQL so that annotations are correctly propagated as the valid set of functional dependencies changes during query processing. Finally, we present a set of performance experiments which show that the time performance of our approach is acceptable and that its performance in preserving information is excellent.

Aihua Wu

Research and Implementation of Index Weight Calculation Model for Power Grid Investment Returns

Investment evaluation of an electric power grid assesses the proportional relation between the profit that a power grid enterprise gains from its investment in a certain period and the investment that the profit occupies or consumes. The evaluation is expressed through an index evaluation system, in which the importance of each index is reflected by its weight, so precise calculation of the weights plays a very important role. In this article, we establish a weight calculation model: the coefficient of variation method, the Delphi method and the entropy method are each used to calculate the weights, and the three results are finally combined to obtain the combination weights.

Wei Li, Guofeng Chen, Cheng Duan

Applications of Web Mining

An Improved Algorithm for Session Identification on Web Log

An improved method for session identification in web mining is put forward. First, taking website structure and content into account, a page access time threshold is obtained by collecting the access time of each page, and this threshold is used to divide the log into session sets. The session sets are then optimized further through session reconstruction, namely merging and splitting. Experiments show that the session set obtained by this method is more faithful.

Yuankang Fang, Zhiqiu Huang
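
The threshold-based first step described in the abstract above can be sketched as follows. This is a simplified illustration in Python, not the authors' algorithm: the per-page threshold table, the default threshold of 600 seconds and the function name `split_sessions` are all assumptions, and the later reconstruction step (merging and splitting sessions) is not covered.

```python
from typing import Dict, List, Tuple

Request = Tuple[float, str]  # (timestamp in seconds, page URL)


def split_sessions(requests: List[Request],
                   page_threshold: Dict[str, float],
                   default_threshold: float = 600.0) -> List[List[Request]]:
    """Split one user's time-ordered requests into sessions.

    A new session starts when the time elapsed since the previous request
    exceeds the threshold assigned to the previously visited page.
    """
    sessions: List[List[Request]] = []
    for ts, page in requests:
        if sessions:
            prev_ts, prev_page = sessions[-1][-1]
            limit = page_threshold.get(prev_page, default_threshold)
            if ts - prev_ts <= limit:
                sessions[-1].append((ts, page))  # still the same session
                continue
        sessions.append([(ts, page)])  # first request, or threshold exceeded
    return sessions
```

For example, with thresholds `{"/a": 60, "/b": 120}`, the requests `[(0, "/a"), (30, "/b"), (1000, "/a"), (1010, "/c")]` are split into two sessions of two requests each, because the 970-second gap after `/b` exceeds its threshold.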

A Chinese Web Page Automatic Classification System

In recent years, with the spread and development of the Internet, people have become ever more closely connected to the network and the number of web pages is increasing rapidly. To help people quickly locate the web pages that interest them in the flood of web information, and to improve the precision of search engines, a system based on a simple Bayesian classifier for automatic classification of Chinese web pages is proposed. Experimental results show that the system has a high page detection rate and is capable of self-learning.

Rongyou Huang, Xinjian Zhao

Towards a Quality-Oriented Real-Time Web Crawler

Real-time search emerges as a significant amount of time-sensitive information is produced online every minute. Unlike most commercial web sites, which publish content on routine schedules, online users post to web communities with high variance in both timing and quality. In this work, we address the scheduling problem for web crawlers, with the objective of optimizing the quality of the local index (i.e. minimizing the total weighted delay of postings) with a given quantity of resources. To this end, we use a posting importance evaluation mechanism and the underlying publishing pattern of the data source to build a prediction model for the generation of posting weights, which helps the web crawler decide its retrieval points for better index quality. Extensive experiments on several web communities show that our policy outperforms uniform scheduling and a policy based purely on the posting generation pattern.

Jianling Sun, Hui Gao, Xiao Yang

Research on the Value of Business Information Online Based on Fuzzy Comprehensive Evaluation

With the rapid development of networks and e-commerce, the value of online business information has attracted the attention of small and medium-sized enterprises, because scientifically evaluating this value is the basis both for the effective use of information and for managers’ decision making. Based on the current literature and a marketing framework, this paper provides an evaluation index system for the value of online business information covering the aspects of information, user and communication effect, uses the analytic hierarchy process to determine the weight of each index level, and establishes a fuzzy comprehensive evaluation model of the value of online business information. Finally, a case study shows that it is reasonable and credible to evaluate the value of online business information with the analytic hierarchy process and fuzzy comprehensive evaluation.

Xiaohong Wang, Yilei Pei, Liwei Li

A Clustering Algorithm Based on Matrix over High Dimensional Data Stream

Clustering high-dimensional data streams is a difficult and important issue. In this paper, we propose MStream, a new matrix-based clustering algorithm over high-dimensional data streams. MStream incorporates a synopsis structure called GC (Grid Cell Structure) and a grid matrix technique. The algorithm adopts a two-phase framework. In the online component, the GC is employed to monitor the one-dimensional statistical data distribution of each dimension independently. Sparse GCs that need to be deleted are identified by a predefined threshold. In the offline component, multi-dimensional clusters can be traced through the dense GCs maintained in the online component. The grid matrix technique is introduced to generate the final multi-dimensional clusters in the whole data space. Experimental results show that our algorithm has flexible scalability and higher clustering quality.

Guibin Hou, Ruixia Yao, Jiadong Ren, Changzhen Hu

Distributed Systems

A Dynamic Dispatcher-Based Scheduling Algorithm on Load Balancing for Web Server Cluster

Effective request distribution and load balancing are the key technological means of guaranteeing higher-quality service in web server cluster systems. Based on the ideas of dynamic load adaptation and priority service, this paper presents a new load-balancing scheduling algorithm for web server clusters. The theoretical model of the scheme is established with a Markov chain and a probability generating function. The mean queue length and the mean inquiry cyclic time of the common queue and the key station are analyzed mathematically. The theoretical analysis and the experiments show that, by adjusting the times of gated access, the algorithm optimizes the services of the system as the input load changes, and strengthens the flexibility and fairness of multimedia transmission in web cluster systems.

Liyong Bao, Dongfeng Zhao, Yifan Zhao

Nature of Chinese Thesaurus Automatic Construction and Its Study of Major Technologies in Digital Libraries

Modern information networks such as digital libraries contain much more data than ever before. These data are globally distributed and easily accessible to a huge, heterogeneous user population. On the other hand, the enormous amount of information requires powerful tools for users to find the relevant data. One such tool is the thesaurus. As an ontology, the thesaurus is playing an increasingly important role in knowledge management and information retrieval in digital libraries. The paper reconsiders the nature of the thesaurus from the viewpoint of ontology and proposes an approach to, and ideas for, automatic construction of a Chinese thesaurus. Some experimental results and future work are also described.

Wen Zeng, Huilin Wang

Research on the Resource Monitoring Model Under Cloud Computing Environment

Resource monitoring is an important part of resource management in a cloud computing environment, providing a better reference for resource allocation, task scheduling and load balancing. Because of the commercial goal of billing users for resource use, and the high virtualization, scalability and transparency of resources in the cloud computing environment, the existing resource monitoring methods of distributed computing and grid computing cannot completely satisfy the cloud computing environment. Therefore, according to the characteristics of cloud computing platforms, we present a novel resource monitoring model adapted to the cloud computing environment, which combines a VMM (Virtual Machine Monitor) with C/C++ code called from Java to obtain resource status information. Both theoretical analysis and experimental results show that the model can collect resource monitoring information on nodes and VMs (virtual machines), and that it not only meets the requirements of cloud computing platform features but is also effective.

Junwei Ge, Bo Zhang, Yiqiu Fang

E-Government and E-Commerce

A New Photo-Based Approach for Fast Human Body Shape Modeling

In this paper, a new photo-based approach for fast human body shape modeling is proposed. The approach takes the user’s front and side digital photos as input and quickly generates a similar body shape model with a quadrangle mesh. It includes five steps: (1) extract profiles from the user’s photos and divide the body contour into parts, (2) automatically select body parts from a pre-built body parts database, (3) conduct parametric axial deformation for each part, (4) refine the body parts based on FFD, (5) smooth the body mesh. The algorithms are elaborated in detail. Experiments show that this approach can achieve good results very quickly and cheaply.

Xiao Hu Liu, Yu Wen Wu, Yan Ting Huang

A Multi-Criteria Analysis Approach for the Evaluation and Selection of Electronic Market in Electronic Business in Small and Medium Sized Enterprises

This paper presents a multi-criteria analysis approach for effectively evaluating and selecting the most appropriate electronic market (e-market) in electronic business by extending the technique for order preference by similarity to ideal solution (TOPSIS). The subjective assessments of the decision maker in the e-market evaluation and selection process are represented by linguistic variables approximated by fuzzy numbers. The geometric centre based defuzzification method is used to transform the weighted fuzzy performance matrix into the crisp performance matrix, on which TOPSIS is applied to calculate the overall performance of individual e-markets across all the selection criteria and their associated sub-criteria. An example is presented to demonstrate the applicability of the approach to the e-market evaluation and selection problem.

Xiaoxia Duan, Hepu Deng, Brian Corbitt
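
The final ranking step named in the abstract above, TOPSIS applied to a crisp performance matrix, can be illustrated with a short Python sketch. It shows standard TOPSIS under stated assumptions, not the paper's extended method: the fuzzy assessments and the geometric-centre defuzzification are omitted, all criteria are treated as benefit criteria (larger is better), vector normalization is used, and the function name `topsis` is chosen here for illustration. It also assumes no criterion column is all zeros and that the alternatives are not all identical.

```python
import math
from typing import List


def topsis(matrix: List[List[float]], weights: List[float]) -> List[float]:
    """Return the closeness coefficient (0..1) of each alternative.

    matrix[i][j] is the crisp score of alternative i on criterion j.
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Positive and negative ideal solutions (benefit criteria assumed).
    best = [max(row[j] for row in v) for j in range(n_crit)]
    worst = [min(row[j] for row in v) for j in range(n_crit)]
    scores = []
    for row in v:
        d_best = math.dist(row, best)    # distance to the positive ideal
        d_worst = math.dist(row, worst)  # distance to the negative ideal
        scores.append(d_worst / (d_best + d_worst))
    return scores
```

The alternative with the highest closeness coefficient is ranked first. An alternative that is best on every criterion scores 1.0, and one that is worst on every criterion scores 0.0.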

Geographic Information Systems

A WebGIS and GRA Based Transportation Risk Management System for Oversea Oil Exploitation

Rapid economic growth is driving increasing demand for oil resources in China. With China’s growing dependence on oil imports, overseas oil exploitation will influence the overall layout of the country’s economic development and energy strategies. Because transportation risk evaluation involves a complex overseas geopolitical and economic environment, in this paper we evaluate the risk using a gray relational analysis (GRA) model, and develop a transportation risk management system for overseas oil exploitation that combines spatial data mining and WebGIS technology, to provide a scientific basis for the safe and efficient transportation of overseas oil.

Tijun Fan, Youyi Jiang

Prediction of Pine Wilt Disease in Jiangsu Province Based on Web Dataset and GIS

80 pine wilt disease occurrence points with geographical coordinates in 2007 and 31 environmental variables from open web datasets were gathered as the main source of information. Four modeling methods, Classification and Regression Trees (CART), Genetic Algorithm for Rule-set Prediction (GARP), maximum entropy method (Maxent), and Logistic Regression (LR), were introduced to generate potential geographic distribution maps of pine wood nematode in Jiangsu province, China. We then calculated three statistical criteria, the area under the Receiver Operating Characteristic Curve (AUC), the Pearson correlation coefficient (COR) and Kappa, to evaluate the performance of the models. The results showed that: CART outperformed the other three models; slope, precipitation, seasonal variations (bio15), mean temperature of driest quarter (bio9), north-south aspect (northness), maximum temperature of warmest month (bio5) were the six enforcing environmental factors; the future occurrence area of pine wilt disease will be 47.27% of the total pine forest, triple the present infected area of the pest.

Mingyang Li, Milan Liu, Min Liu, Yunwei Ju

An Approach for Integrating Geospatial Processing Services into Three-Dimensional GIS

Three-dimensional (3D) GIS is gaining more and more acceptance among both scientists and the general public. Though powerful in data visualization, 3D GIS is comparatively weak in geospatial analysis. In order to enhance 3D GIS’s analysis capability, we present an approach for integrating geospatial processing services into 3D GIS. The architecture of our approach contains four layers which are the Presentation layer, the Application layer, the Service layer and the Data layer. Two different workflows are designed for this architecture to deal with geospatial tasks of different complexities. We also implement this approach in a 3D GIS project named Digital Chongming Island (DCI), Shanghai, China. By successfully integrating a variety of geospatial processing services into DCI, we have demonstrated that our approach is feasible and effective.

Yingjie Hu, Jianping Wu, Haidong Zhong, Zhenhua Lv, Bailang Yu

Parallel K-Means Clustering of Remote Sensing Images Based on MapReduce

K-Means clustering is a basic method for analyzing RS (remote sensing) images that generates a direct overview of objects. Usually, such work can be done by software (e.g. ENVI, ERDAS IMAGINE) on personal computers. However, on PCs, limited hardware resources and limited tolerance for long running times present a bottleneck in processing a large number of RS images. Parallel computing and distributed systems are suitable choices. Unlike traditional approaches, in this paper we parallelize the algorithm on Hadoop, an open source system that implements the MapReduce programming model. The paper first describes the color representation of RS images: pixels need to be translated into the CIELAB color space, which is more suitable for distinguishing colors. It also gives an overview of traditional K-Means. Then the MapReduce programming model and the Hadoop platform are briefly introduced. This model requires customized ‘map/reduce’ functions, allowing users to parallelize processing in two stages. In addition, the paper details the map and reduce functions in pseudo-code and reports performance based on experiments. The results are acceptable and may also inspire other approaches to tackling similar problems in remote sensing applications.

Zhenhua Lv, Yingjie Hu, Haidong Zhong, Jianping Wu, Bo Li, Hui Zhao
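
The two-stage map/reduce formulation of K-Means mentioned in the abstract above is commonly expressed as follows: the map stage assigns each pixel to its nearest centroid, and the reduce stage averages the pixels of each cluster to produce new centroids. The Python sketch below simulates one such iteration in a single process. It is an assumption-laden illustration rather than the paper's pseudo-code: it does not use Hadoop, the function names are hypothetical, and pixels are assumed to be already converted to three-component CIELAB tuples.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]  # one pixel, e.g. CIELAB (L, a, b)


def kmeans_map(pixel: Point, centroids: Sequence[Point]) -> Tuple[int, Point]:
    """Map stage: emit (index of nearest centroid, pixel)."""
    dists = [sum((p - c) ** 2 for p, c in zip(pixel, centroid))
             for centroid in centroids]
    return dists.index(min(dists)), pixel


def kmeans_reduce(cluster_id: int, pixels: List[Point]) -> Tuple[int, Point]:
    """Reduce stage: emit (cluster id, mean of the pixels in that cluster)."""
    n = len(pixels)
    mean = (sum(p[0] for p in pixels) / n,
            sum(p[1] for p in pixels) / n,
            sum(p[2] for p in pixels) / n)
    return cluster_id, mean


def kmeans_iteration(pixels: Sequence[Point],
                     centroids: Sequence[Point]) -> List[Point]:
    """Run one map -> shuffle -> reduce pass and return updated centroids."""
    groups: Dict[int, List[Point]] = defaultdict(list)
    for pixel in pixels:                     # map, then group by key (shuffle)
        cid, value = kmeans_map(pixel, centroids)
        groups[cid].append(value)
    updated = list(centroids)                # empty clusters keep old centroid
    for cid, members in groups.items():      # reduce
        _, updated[cid] = kmeans_reduce(cid, members)
    return updated
```

In a real MapReduce job the iteration would be repeated, with the updated centroids passed to the next job, until the centroids stop moving.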

Information Security

Enhancing Distributed Web Security Based on Kerberos Authentication Service

The increasing popularity of the distributed web has promoted the development of new techniques to support various kinds of applications. However, users face insecurity due to its inherent untrustworthiness. An identity (ID) authentication mechanism is presented. Using the Kerberos protocol, the Local web and the Remote web can authenticate the client. If mutual authentication is required, the client can also authenticate the Local web and the Remote web. Moreover, the encryption function in the authentication process adopts the Rijndael encryption algorithm of AES (Advanced Encryption Standard). Security analysis shows that this authentication process prevents impersonation, is highly available, and is also transparent, scalable and resistant to attack.

Cao Lai-Cheng

A New Scheme for Protecting Master-Key of Data Centre Web Server in Online Banking

The master-key is used to encrypt the operation-key, and the operation-key is used to encrypt the transport-key; consequently, protecting the master-key is the core of security in an online banking system. A scheme to protect the master-key is presented. Using 3-out-of-4 key sharing and the Lagrange formula, the shares of the master-key are distributed to one synthesizing card and four key servers. When the data centre web server needs the master-key, the synthesizing card first authenticates the legitimacy of the shares of three key servers randomly selected from the four using zero-knowledge proof technology; if shares have been modified or destroyed, the remaining shares can still form a group so that the system keeps working. The synthesizing card then synthesizes the master-key from the shares of those three key servers. Security analysis shows that this scheme gives the whole system fault tolerance and error detection, leaks no information, and defends against collusion attacks.

Cao Lai-Cheng, Liang Lei
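
The 3-out-of-4 key sharing with the Lagrange formula mentioned in the abstract above is usually realized as Shamir-style threshold secret sharing. The Python sketch below illustrates only that arithmetic, under assumptions: the prime field, the function names `make_shares` and `recover`, and the use of a random degree-2 polynomial are illustrative choices, and the paper's synthesizing card and zero-knowledge share verification are not modelled.

```python
import secrets
from typing import List, Tuple

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an illustrative choice

Share = Tuple[int, int]  # (x, f(x))


def make_shares(secret: int, k: int = 3, n: int = 4) -> List[Share]:
    """Split `secret` into n shares so that any k of them recover it."""
    # Random polynomial of degree k-1 with f(0) = secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def f(x: int) -> int:
        y = 0
        for coef in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + coef) % PRIME
        return y

    return [(x, f(x)) for x in range(1, n + 1)]


def recover(shares: List[Share]) -> int:
    """Recover f(0) from k shares by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With four shares and a threshold of three, the master-key can still be reconstructed after any single share is lost, which matches the fault tolerance described in the abstract.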

An Automated Worm Containment Scheme

How can intelligent worms characterized by both a slow scanning rate and high vulnerability density be detected and mitigated? Here, we present a scheme to solve the problem. Unlike previous schemes, which set a limit on the instantaneous scanning rate of each host, the scheme considered in this paper counts the number of unique IP addresses contacted by all hosts of a subnet over a period and sets a threshold to determine whether the subnet is suspicious. Specifically, we consider the similarity of information required by users belonging to the same subnet. The results show that our scheme is effective against slow scanning worms and worms with high vulnerability density.

Lipeng Song, Zhen Jin
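
The counting rule described in the abstract above, unique destination addresses contacted by all hosts of a subnet within an observation period compared against a threshold, can be sketched in Python as follows. This is a simplified illustration under assumptions: the function name `suspicious_subnets`, the input format and the threshold value are hypothetical, and the paper's use of similarity between users of the same subnet is not modelled.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network
from typing import Dict, Iterable, List, Set, Tuple


def suspicious_subnets(connections: Iterable[Tuple[str, str]],
                       subnets: List[str],
                       threshold: int) -> List[str]:
    """Return subnets whose hosts together contacted more than `threshold`
    unique destination addresses.

    `connections` holds (source IP, destination IP) pairs observed during
    one observation period.
    """
    nets = [ip_network(s) for s in subnets]
    contacted: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in connections:
        src_ip = ip_address(src)
        for net in nets:
            if src_ip in net:
                contacted[str(net)].add(dst)  # a set counts each address once
                break
    return [net for net, dsts in contacted.items() if len(dsts) > threshold]
```

Because the count is aggregated per subnet rather than per host, many hosts that each scan slowly can still push the subnet over the threshold.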

RBAC-Based Access Control Integration Framework for Legacy System

Access control comprises different kinds of access control policies. This paper proposes an RBAC-based access control integration framework to implement and manage various access control policies during legacy system integration. Permissions are defined as tasks, and tasks are extracted and organized as a tree structure for each system. Then, a global task tree and an integrated policy library are generated for the integrated system to reorganize the access control policies of the different legacy systems. Additionally, rules for authorization management are given to carry out further authorization. A case study demonstrates that the proposed framework is a feasible and flexible solution for access control integration.

He Guo, Guoji Lu, Yuxin Wang, Han Li, Xin Chen

A Pseudo Random Numbers Generator Based on Chaotic Iterations: Application to Watermarking

In this paper, a new chaotic pseudo-random number generator (PRNG) is proposed. It combines the well-known ISAAC and XORshift generators with chaotic iterations. This PRNG possesses important properties of topological chaos and can successfully pass NIST and TestU01 batteries of tests. This makes our generator suitable for information security applications like cryptography. As an illustrative example, an application in the field of watermarking is presented.

Christophe Guyeux, Qianxue Wang, Jacques M. Bahi

A Secure Protocol for Point-Segment Position Problem

Privacy-preserving computational geometry is an important direction in the application of secure multi-party computation and contains many research subjects, such as the intersection problem, the point-inclusion problem, convex hull, range searching and so on. In particular, the point-inclusion problem is of great practical significance in daily life. In this paper, we devote our attention to the point-segment position problem in point-inclusion and aim to determine the relationship between a point and a segment. In our solution, we present a concise secure protocol based on two basic protocols, the secure scalar product protocol and the secure comparison protocol. Compared with previous solutions, which may disclose at least one inside point, our protocol performs better in terms of preserving privacy. It does not reveal any inside point, which is crucial on some special occasions.

Yi Sun, Hongxiang Sun, Hua Zhang, Qiaoyan Wen

Learning Automata Representation of Network Protocol by Grammar Induction

In this work, grammatical inference is applied to model a network protocol specification as an FSM from network stream data. The original RPNI algorithm merges pairs of states of the prefix tree acceptor of the positive samples in a fixed order while ensuring consistency of the resulting automaton, which can yield an over-generalized automaton. The proposal presented here modifies the RPNI algorithm by introducing heuristics about network features that label the merging states of the prefix tree acceptor, to prevent states from merging excessively. Preliminary experiments suggest that this improvement over the original RPNI algorithm is more helpful for deriving a more general network protocol automaton.

Ming-Ming Xiao, Shun-Zheng Yu

Gene-Certificate Based Model for User Authentication and Access Control

Inspired by the principles of natural human trust, a gene-certificate based model for user authentication and access control is proposed in this paper. After formally defining gene-certificate, network family, family member and gene, the algorithms for gene assignment, access control policy, gene signature and gene-certificate generation are described. Following that, the methods of network family construction and gene-certificate based authentication and access control are designed. Simulation results and theoretical analysis show that the presented model is valid and offers better safety. Thus, it provides an effective novel solution to network security.

Feixian Sun

A New Data Integrity Verification Mechanism for SaaS

Recently, the Software-as-a-Service (SaaS) model has been gaining more and more attention. In the SaaS model, both applications and databases are deployed on servers managed by untrustworthy service providers. Thus, the service providers might maliciously delete, modify or falsify tenants’ data for some reason, which poses a great challenge to adoption of the SaaS model. This paper therefore defines a data integrity concept for SaaS data storage security, which can be measured in terms of durable integrity, correct integrity and provenance integrity. Based on the meta-data driven data storage model and data chunking technology, SaaS data integrity issues are mapped to a series of integrity issues of data chunks. A data chunk traversal approach for verification based on a cyclic group is presented, so that SaaS data integrity verification can be realized through the integrity verification of data chunks. We also demonstrate the correctness of the mechanism through analysis.

Yuliang Shi, Kun Zhang, Qingzhong Li

An Exquisite Authentication Scheme with Key Agreement Preserving User Anonymity

In 2009, Liao et al. proposed an exquisite mutual authentication scheme with key agreement using smart cards to access a network system legally and securely. Liao et al.’s scheme adopted a transformed identity (TID) to avoid identity duplication. However, we find that an adversary may exploit the TID to mount an offline guessing attack. Liao et al.’s scheme is also exposed to a man-in-the-middle attack, and their claimed theorems and proofs are incorrect. We conduct a detailed analysis of the flaws in the scheme and its security proof. This paper proposes an improved scheme that overcomes these problems and preserves user anonymity, which is an issue in e-commerce applications.

Mijin Kim, Seungjoo Kim, Dongho Won

Towards a Dynamic Federation Framework Based on SAML and Automated Trust Negotiation

One disadvantage of current Federated Identity Management systems is that the establishment of the federation is based on a pre-established relying relationship between Service Provider and Identity Provider. The contribution of this paper is a proposal for the integration of Federated Identity Management with Automated Trust Negotiation to establish a Dynamic Federation, which makes the sharing of user information among potential business partners easier and more flexible, and provides better protection of users’ privacy at the same time. In this paper, the architecture, main information exchange protocol and prototype implementation of the Dynamic Federation Framework are described in detail.

Yicun Zuo, Xiling Luo, Feng Zeng

Intelligent Networked Systems

Research and Application of FlexRay High-Speed Bus on Transformer Substation Automation System

Based on research into FlexRay high-speed bus technology, this paper helps build hardware interface standards and software application layer standards for electric power systems. It also realizes a low-cost, high-speed communication network for the bottom layer of a digital transformer substation, and works with the IEC 61850 standard to form the digital transformer substation network structure. The experimental results show that this work has practical significance for low-end applications such as intelligent buildings, household appliances, and industrial and mining enterprises.

Hui Li, Hao Zhang, Daogang Peng

A Task Scheduling Algorithm Based on Load Balancing in Cloud Computing

An efficient task scheduling mechanism can meet users’ requirements and improve resource utilization, thereby enhancing the overall performance of the cloud computing environment. However, task scheduling in grid computing often deals with static task requirements, and the resource utilization rate is low. In view of the new features of cloud computing, such as flexibility and virtualization, this paper discusses a two-level task scheduling mechanism based on load balancing in cloud computing. This mechanism can not only meet users’ requirements but also achieve high resource utilization, as shown by simulation results in the CloudSim toolkit.

Yiqiu Fang, Fei Wang, Junwei Ge

The Accuracy Enhancements of Virtual Antenna for Location Based Services

Measurement report (MR) based methods are a kind of cell identifier (CI) based method whose parameters can be extracted from MRs and the cell configuration database (CCD). They are used for location based services (LBS) and are advantageous for wireless network optimization. In this paper, we propose enhancements to the virtual antenna method, which is itself an improvement of the enhanced CI-RXLEV method. We divide the location procedure into two steps: the first step is the virtual antenna method, and the second step uses the results calculated in the first step to filter cells. In this way, cells that are far from the UE but near the serving cell can be filtered more precisely. We also improve the MPPE criterion used by the virtual antenna method: a new parameter is introduced to prevent the MPPE score from approaching zero. The experimental results show that the enhanced method performs better than the original virtual antenna method, and that its position accuracy is close to that of fingerprint methods.

Fu Tao, Huang Benxiong, Mo Yijun

Management Information Systems

Research on the Production Scheduling Management System Based on SOA

Production scheduling management is an important task that directs all production activities in a refinery enterprise. Building a dynamic scheduling management information system is an important measure for improving work efficiency, standardizing production, and increasing the competitiveness of the enterprise. However, enterprise users are more concerned with how to connect new technologies with the value chains of the enterprise. After analyzing the service requirements of the refinery enterprise in depth, this paper develops the component libraries and then gives a solution for building the scheduling management system based on SOA and component-oriented technologies. In addition, the development of enterprise software, the SOA programming method, the related SOA standards and component-oriented technologies are presented.

Baoan Li

A Requirement-Driven Approach to Enterprise Application Development

Requirement changes are the root cause of the evolution of enterprise applications. How to develop enterprise applications effectively under frequently changing requirements remains a challenge for software engineering. The two main aspects are how to capture requirement changes and how to reflect them in the applications. Use cases and refactoring are excellent tools for capturing functional requirements and for changing object-oriented software gradually. This paper presents a requirement-driven approach to enterprise application development. The approach uses refined use cases to capture the requirements and to build domain models, controller logic and views. It transforms requirement changes into refactorings of the refined use cases, and thus propagates the modifications to the application. Through rapid, continuous iterations, this approach aims to address the problems of enterprise application development.

Jinan Duan, Qunxiong Zhu, Zhong Guan

Mobile Computing

ESA: An Efficient and Stable Approach to Querying Reverse k-Nearest-Neighbor of Moving Objects

In this work, we study how to improve the efficiency and stability of querying reverse k-nearest-neighbor (RkNN) for moving objects. An approach named ESA is presented in this paper. Unlike existing approaches, ESA selects k objects as pruning reference objects for each round of pruning. In this way, it greatly improves query efficiency. ESA also reduces the communication cost and enhances the stability of the server by adaptively adjusting the objects’ safe regions. Experimental results verify the performance of our proposed approach.

Dunlu Peng, Wenming Long, Ting Huang, Huan Huo

The Research of Efficient Dual-Port SRAM Data Exchange without Waiting with FIFO-Based Cache

This paper proposes a scheme for efficient dual-port SRAM data exchange without waiting, using a FIFO-based cache. It targets the timely, massive and interactive nature of data transmission in MIMO systems, and uses a FIFO as an external cache for the dual-port SRAM to achieve real-time data exchange between multiple systems or processors. The scheme resolves timing conflicts and data overwriting problems that arise when data storage is contended, and reduces the transmission delay caused by waiting for data exchange. A simulation test of the scheme is carried out with the dual-port SRAM CY7C019, which realizes effective addressing between memory and CPU by means of address mapping. An analysis of the system performance demonstrates the effectiveness and feasibility of the scheme.

Alfred Ji Qianqian, Zhao Ping, Cheng Sen, Tan Jingjing, Wei Xu, Wei Yong

Web Content Mining

Extracting Service Aspects from Web Reviews

Web users have published huge amounts of opinions about services in blogs, Web forums and other review-friendly social websites. Consumers form their judgements of service quality according to a variety of service aspects, which may be mentioned in different Web reviews. The research challenge is how to extract service aspects from service-related Web reviews in order to conduct automatic service quality evaluation. To address this problem, this paper proposes four different methods for extracting service aspects: two unsupervised and two supervised. In the first method, we use an FP-tree to find frequent aspects. The second method is graph-based. For the supervised methods, we employ state-of-the-art machine learning methods, namely CRFs (Conditional Random Fields) and MLN (Markov Logic Network). Experimental results show that the graph-based method outperforms the FP-tree method. We also find that MLN performs well compared to the other three methods.

Jinmei Hao, Suke Li, Zhong Chen

Clustering the Tagged Resources Using STAC

Similarity calculation is a key step in the process of clustering. Because most tagged resources on the Internet lack text information, traditional similarity measures cannot obtain good results. We propose the STAC measure to solve the problem of calculating the similarity between tagged resources. In the calculation of STAC, the similarity between tags is calculated using tag co-occurrence information, and the similarity between tagged resources is calculated based on tag comparison. Experiments show that the clustering results for tagged resources using STAC are significantly better than those obtained with traditional metrics such as the Euclidean distance and the Jaccard coefficient.

Feihang Gao, Kening Gao, Bin Zhang

Advertising Keywords Extraction from Web Pages

A large and growing number of web pages display contextual advertising based on keywords automatically extracted from the text of the page, and this has become a rapidly growing business in recent years. We describe a system that learns how to extract keywords from web pages for advertisement targeting. First, a text network is built for a single web page; then PageRank is applied to the network to determine the importance of each word; finally, the top-ranked words are selected as keywords of the web page. The algorithm is tested on a corpus of blog pages, and the experimental results show it to be practical and effective.

Jianyi Liu, Cong Wang, Zhengyang Liu, Wenbin Yao
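
The three steps named in the abstract above (build a text network, run PageRank, pick the top-ranked words) can be illustrated with a minimal sketch. The abstract does not say how the text network is constructed, so the sliding-window co-occurrence graph, the damping factor of 0.85, the fixed iteration count and all function names below are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def build_text_network(words, window=2):
    """Link words that occur within `window` positions of each other.

    The co-occurrence construction is an assumption; the abstract only
    says that a text network is built for the page.
    """
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + 1 + window, len(words))):
            if words[j] != word:
                graph[word].add(words[j])
                graph[words[j]].add(word)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain iterative PageRank on an undirected graph (no dangling nodes)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbour passes on its rank split evenly over its edges.
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def top_keywords(words, k=3, window=2):
    """Return the k highest-ranked words, ties broken alphabetically."""
    rank = pagerank(build_text_network(words, window))
    return sorted(rank, key=lambda w: (-rank[w], w))[:k]
```

A call such as `top_keywords(page_text.lower().split(), k=5)` would return five candidate keywords for a page; a real system would also need stop-word removal and tokenization, which are left out here.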

Automatic Topic Detection with an Incremental Clustering Algorithm

At present, most topic detection approaches are not accurate and efficient enough. In this paper, we propose a new topic detection method (TPIC) based on an incremental clustering algorithm. It employs a self-refinement process of discriminative feature identification and a term reweighting algorithm to accurately cluster the given documents that discuss the same topic. For efficiency, the “aging” nature of topics is used to precluster stories. The Bayesian Information Criterion (BIC) is used to automatically estimate the true number of topics. Experimental results on the Linguistic Data Consortium (LDC) dataset TDT4 show that the proposed method improves both efficiency and accuracy compared to other methods.

Xiaoming Zhang, Zhoujun Li

Web Information Classification

Three New Feature Weighting Methods for Text Categorization

Feature weighting is an important phase of text categorization, which computes the feature weight for each feature of documents. This paper proposes three new feature weighting methods for text categorization. In the first and second proposed methods, the traditional feature weighting method tf×idf is combined with “one-side” feature selection metrics (i.e. odds ratio, correlation coefficient) in a moderate manner, and positive and negative features are weighted separately. tf×idf+CC and tf×idf+OR are used to calculate the feature weights. In the third method, tf is combined with feature entropy, which is effective and concise. The feature entropy measures the diversity of a feature’s document frequency in different categories. The experimental results on the Reuters-21578 corpus show that the proposed methods outperform several state-of-the-art feature weighting methods, such as tf×idf, tf×CHI, and tf×OR.

Wei Xue, Xinshun Xu
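
For readers unfamiliar with the tf×idf baseline that the abstract above builds on, here is a minimal sketch. It uses one common textbook form (raw term frequency times the logarithm of the number of documents divided by document frequency); this variant and the function name `tf_idf` are assumptions. The proposed tf×idf+CC, tf×idf+OR and entropy-based weightings are not reproduced, because the abstract does not define them.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    weight = tf * log(N / df), where tf is the raw count of the term in
    the document, N the number of documents and df the number of
    documents containing the term (a common variant, assumed here).
    """
    n_docs = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights
```

With this form, a term that appears in every document receives weight zero, which is the behaviour that motivates combining tf×idf with category-aware metrics.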

Algorithms of BBS Opinion Leader Mining Based on Sentiment Analysis

Opinion leaders play a crucial role in online communities, where they can guide the direction of public opinion. Most proposed algorithms for opinion leader mining in Internet social networks are based on network structure, and usually overlook the facts that opinion leaders are field-limited and that sentiment orientation analysis of opinions is a vital factor in a user’s authority. We propose a method for finding interest groups based on topic content analysis, which combines the advantages of clustering and classification algorithms. We then use sentiment analysis to define the authority value as the weight of the link between users. On this basis, an algorithm named LeaderRank is proposed to identify the opinion leaders in a BBS, and experiments indicate that the LeaderRank algorithm can effectively improve the accuracy of leader mining.

Xiao Yu, Xu Wei, Xia Lin

Entity Relationship Extraction Based on Potential Relationship Pattern

The continuing growth of web information drives the development of entity-focused information retrieval systems. However, the problem of effectively mining the relationships between entities has not been well resolved. For the entity relationship extraction (RE) problem, this paper first establishes basic pattern trees that represent the overall relation structures, and then designs a similarity function for judging which pattern a sentence containing two entities belongs to. Once the matched pattern is known, the relationship can be discovered easily. A large number of experiments on real data show that the proposed methods run accurately and efficiently.

Chen Chen, HuiLin Liu, GuoRen Wang, LinLin Ding, LiLi Yu

Web Information Retrieval

On Arnoldi Method Accelerating PageRank Computations

PageRank is a very important ranking algorithm in web information retrieval and search engines. We present a Power method with Arnoldi acceleration for computing the PageRank vector, which takes advantage of both the Power method and the Arnoldi process. The description and implementation of the new algorithm are discussed in detail. Numerical results illustrate that our new method is efficient and faster than its existing counterparts.

Guo-Jian Yin, Jun-Feng Yin

A Framework for Automatic Query Expansion

The objective of this paper is to provide a framework and computational model for automatic query expansion using pseudo relevance feedback. We expect that our model can help to deal efficiently with many important aspects of automatic query expansion. We have performed experiments based on our model using a TREC data set. The results are encouraging, as they indicate an improvement in retrieval efficiency after applying query expansion.

Hazra Imran, Aditi Sharan
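
The general idea of pseudo relevance feedback mentioned in the abstract above can be sketched as follows: the top-ranked documents from an initial retrieval are assumed to be relevant, and frequent terms from them are appended to the query. Selecting expansion terms by raw frequency, the parameter defaults and the name `expand_query` are illustrative assumptions; the abstract does not describe the authors' own term selection model.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, top_docs=2, extra_terms=2):
    """Append the most frequent new terms from the top-ranked documents.

    `ranked_docs` is a list of tokenized documents, best match first,
    as returned by some initial retrieval step (not shown here).
    """
    feedback = Counter()
    for doc in ranked_docs[:top_docs]:
        # Only terms not already in the query are expansion candidates.
        feedback.update(t for t in doc if t not in query_terms)
    candidates = sorted(feedback, key=lambda t: (-feedback[t], t))
    return list(query_terms) + candidates[:extra_terms]
```

The expanded query would then be submitted for a second retrieval pass; practical systems usually weight the added terms lower than the original query terms.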

Web Services and E-Learning

WSCache: A Cache Based Content-Aware Approach of Web Service Discovery

Web service discovery is an important issue in the construction of service-based architectures and is also a prerequisite for service-oriented applications. Nowadays, Data-intensive Web Services (DWS) are widely used by enterprises to provide worldwide data query services. Unlike existing service discovery approaches, in this work we propose a service discovery model for DWS which combines the characteristics of DWS with the semantic similarity of services. The model, named RVBDM, discovers services based on the return values of DWS. Several algorithms, such as the Web Service Cache Update (WSCU) algorithm and the Web Service Match (WSM) algorithm, are presented to implement the model. Experiments were conducted to verify the performance of our proposed approach.

Dunlu Peng, Xinglong Jiang, Wenjie Xu, Kai Duan

A Novel Local Optimization Method for QoS-Aware Web Service Composition

QoS-aware web service selection has become a hot research topic in the domain of web service composition. In previous work, the multiple tasks recruited in a composite schema are usually considered to be of equal importance. However, it is unreasonable for every task to have exactly the same weight in certain circumstances. Hence, it is a great challenge to mine the weights of different tasks so as to reflect customers’ partial preferences. In view of this challenge, a novel local optimization method is presented in this paper, which is based on a two-level weight hierarchy, i.e., the weights of a task’s criteria and the weights of the tasks. Finally, a case study is presented to validate the feasibility of our proposal.

Xiaojie Si, Xuyun Zhang, Wanchun Dou

Preference-Aware QoS Evaluation for Cloud Web Service Composition Based on Artificial Neural Networks

Since QoS properties play an increasingly important role in web service composition in Cloud environments, they have attracted great interest in both the research community and the IT domain. Yet evaluating the comprehensive QoS values of composite services in accordance with consumers’ preferences is still a significant and challenging problem, owing to the subjectivity of consumers. Because expressing preferences through explicit weights assigned to each criterion is quite arduous, this paper proposes a global QoS-driven evaluation method based on artificial neural networks, aiming to facilitate web service composition without preference weights. In addition, a prototype composition system is developed to support the execution of the proposed approach.

Xuyun Zhang, Wanchun Dou

Software Architecture Driven Configurability of Multi-tenant SaaS Application

SaaS (Software as a Service) is an emerging software application delivery model based on the Internet. SaaS serves multiple tenants with a list of business services to be delivered. The configurability of SaaS applications has become an attractive aspect for tenants. The characteristics of SaaS configurability have resulted in a recent drive to revisit the design of software architecture and the challenges arising from SaaS applications. Existing approaches have built configurability strategies on external models that use formal methods. The proposed method is novel because it uses the software architecture as a lever to coordinate between functional architectural elements and configurability components. By employing AOP (Aspect-oriented Programming), the method treats configurability as a crosscutting concern in order to realize the configurability of SaaS applications. Finally, a case study is assessed based on the proposed method.

Hua Wang, Zhijun Zheng

XML and Semi-structured Data

Parallel Accessing Massive NetCDF Data Based on MapReduce

As a Network Common Data Format, NetCDF has been widely used in terrestrial, marine and atmospheric sciences. A new parallel storage and access method for large-scale NetCDF scientific data is implemented based on Hadoop. The retrieval method is implemented based on MapReduce. Argo data is used to demonstrate our method. The performance is compared in a distributed environment based on PCs, using different data scales and different numbers of tasks. The experimental results show that the parallel method can be used to store and access large-scale NetCDF data efficiently.

Hui Zhao, SiYun Ai, ZhenHua Lv, Bo Li

On Well-Formedness Rules for UML Use Case Diagram

A software model is a widely used technique to specify software. A UML model may contain different diagrams, and a diagram is built from different elements. Each element is subject to certain constraints, or well-formedness rules (WFR). Conformance to these WFR is important to ensure the quality of the UML diagrams produced. Even though formal definitions of UML elements are rapidly increasing, there is still a lack of formalization of WFR. Therefore, this paper defines the WFR for use case diagrams, which rank among the most used diagrams among UML practitioners. The formalization is based on set theory, using logic and quantification. Using an example use case diagram, we show how the diagram satisfies the WFR. Then, the elements involved in the well-formedness problem are detected and formally reasoned about.

Noraini Ibrahim, Rosziati Ibrahim, Mohd Zainuri Saringat, Dzahar Mansor, Tutut Herawan

Backmatter
