2013 | Book

Advanced Computational Methods for Knowledge Engineering

Editors: Ngoc Thanh Nguyen, Tien Van Do, Hoai An Le Thi

Publisher: Springer International Publishing

Book Series: Studies in Computational Intelligence

About this book

The book consists of 29 extended chapters which have been selected and invited from the submissions to the 1st International Conference on Computer Science, Applied Mathematics and Applications (ICCSAMA 2013), held on 9-10 May 2013 in Warsaw, Poland. The book is organized into five parts: Advanced Optimization Methods and Their Applications, Queueing Theory and Applications, Computational Methods for Knowledge Engineering, Knowledge Engineering with Cloud and Grid Computing, and Logic Based Methods for Decision Making and Data Mining. All chapters in the book discuss theoretical and practical issues connected with computational methods and optimization methods for knowledge engineering.

Table of Contents

Frontmatter

Advanced Optimization Methods and Their Applications

Frontmatter
Solution Methods for General Quadratic Programming Problem with Continuous and Binary Variables: Overview
Abstract
The nonconvex quadratic programming problem with continuous and/or binary variables is a typical NP-hard optimization problem, which has a wide range of applications. This article presents an overview of current solution methods for this interesting and important class of programming problems. Solution methods are discussed from the viewpoint of global optimization.
Nguyen Van Thoai
Branch and Bound Algorithm Using Cutting Angle Method for Global Minimization of Increasing Positively Homogeneous Functions
Abstract
We consider the problem of globally minimizing an abstract convex function called an increasing positively homogeneous (IPH) function over a compact convex subset of an n-dimensional Euclidean space (for short, the IPH optimization problem).
A method for solving IPH optimization problems, called the cutting angle algorithm, was proposed by Rubinov and others in 1999. The principle of the cutting angle algorithm is a generalization of the cutting plane method for convex programming problems, where the convex objective function is iteratively approximated by the maximum of a family of affine functions defined by its subgradients. In this article, we propose a method for solving IPH optimization problems which combines the cutting angle algorithm with a branch and bound scheme successfully used in global optimization. The lower bounding procedure in the present algorithm is performed by solving ordinary convex (or even linear) programs. Preliminary computational results suggest that the proposed algorithm works well for some problems with specific structures.
Nguyen Van Thoai
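For readers unfamiliar with the cutting-plane principle that the cutting angle algorithm generalizes, the sketch below shows a minimal Kelley-style cutting-plane iteration for a one-dimensional convex function: the objective is under-approximated by the maximum of affine functions built from subgradients, and each iteration minimizes that piecewise-linear model. The objective, interval and subgradient are illustrative assumptions; the chapter's algorithm for IPH functions uses different (min-type) minorants and a branch and bound scheme not shown here.

```python
# Minimal sketch of Kelley's cutting-plane method for a 1-D convex function.
# It only illustrates the principle referred to in the abstract above
# (approximating a convex objective by the maximum of affine minorants).
import numpy as np
from scipy.optimize import linprog

f = lambda x: (x - 1.0) ** 2 + abs(x)          # illustrative convex objective
g = lambda x: 2.0 * (x - 1.0) + np.sign(x)     # a subgradient of f at x
a, b = -3.0, 3.0                               # feasible interval (assumed)

points = [a, b]                                # initial cut points
for it in range(20):
    # LP: minimize t  s.t.  t >= f(xk) + g(xk) * (x - xk) for all cuts
    A_ub = [[g(xk), -1.0] for xk in points]
    b_ub = [g(xk) * xk - f(xk) for xk in points]
    res = linprog(c=[0.0, 1.0], A_ub=A_ub, b_ub=b_ub,
                  bounds=[(a, b), (None, None)], method="highs")
    x_new, lower_bound = res.x
    upper_bound = min(f(xk) for xk in points)
    if upper_bound - lower_bound < 1e-6:       # model is tight enough
        break
    points.append(x_new)                       # add a new cut at the model minimizer

print("approximate minimizer:", x_new, "value:", f(x_new))
```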
A DC Programming Framework for Portfolio Selection by Minimizing the Transaction Costs
Abstract
We consider a single-period portfolio selection problem which consists of minimizing the total transaction cost subject to different types of constraints on feasible portfolios. The transaction cost function is separable, i.e., it is the sum of the transaction cost associated with each trade, but discontinuous. This optimization problem is nonconvex and very hard to solve. We investigate in this work a DC (Difference of Convex functions) programming framework for the solution methods. First, the objective function is approximated by a DC function. Then a DC formulation for the resulting problem is proposed for which two approaches are developed: DCA (DC Algorithm) and a hybridization of Branch and Bound and DCA.
Pham Viet-Nga, Hoai An Le Thi, Pham Dinh Tao
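Several chapters in this part rely on DC programming and DCA. As a rough orientation only (not the portfolio model of this chapter), the generic DCA scheme for minimizing f = g - h with g, h convex replaces h by its linearization at the current iterate and solves the resulting convex subproblem; the toy one-dimensional decomposition below is an assumption chosen purely for readability.

```python
# Minimal sketch of the generic DCA scheme: minimize f = g - h (g, h convex)
# by linearizing h at the current point and solving the convex subproblem.
# The toy decomposition f(x) = x**4 - x**2 (g = x**4, h = x**2) is an
# illustrative assumption, not the transaction-cost model of the chapter.
import numpy as np

def dca(x0, max_iter=100, tol=1e-10):
    x = x0
    for _ in range(max_iter):
        y = 2.0 * x                             # y in the subdifferential of h(x) = x^2
        # convex subproblem: argmin_x  g(x) - y*x = x^4 - y*x
        # first-order condition 4 x^3 = y  =>  closed-form solution
        x_new = np.sign(y) * (abs(y) / 4.0) ** (1.0 / 3.0)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x

x_star = dca(x0=3.0)
print(x_star, x_star ** 4 - x_star ** 2)        # converges to a critical point +-1/sqrt(2)
```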
Efficient Algorithms for Feature Selection in Multi-class Support Vector Machine
Abstract
This paper addresses the problem of feature selection for Multi-class Support Vector Machines (MSVM). Based on the l0 and the l2-l0 regularizations, we consider two models for this problem. The l0-norm is approximated in a suitable way such that the resulting optimization problems can be expressed as DC (Difference of Convex functions) programs, for which DC programming and DC Algorithms (DCA) are investigated. Preliminary numerical experiments on real-world datasets show the efficiency and the superiority of our methods versus one of the best standard algorithms on both feature selection and classification.
Hoai An Le Thi, Manh Cuong Nguyen
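As background on the l0 regularization mentioned above, one surrogate commonly used in the DCA literature (stated here as an assumption; the chapter may use a different approximation) replaces the count of nonzero components by a concave function of |x_i|, which immediately yields a DC decomposition:

```latex
% The l0-"norm" counts nonzero components; a standard concave surrogate
% (illustrative, not necessarily the one used in the chapter) is the
% exponential approximation with parameter \theta > 0:
\|x\|_0 = \sum_{i=1}^{n} \mathbf{1}_{\{x_i \neq 0\}}
\;\approx\; \sum_{i=1}^{n} \bigl(1 - e^{-\theta |x_i|}\bigr),
\qquad
1 - e^{-\theta t} \;=\; \underbrace{\theta t}_{\text{convex}}
\;-\; \underbrace{\bigl(\theta t - 1 + e^{-\theta t}\bigr)}_{\text{convex}},
\quad t = |x_i| \ge 0 .
```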
Image Segmentation via Feature Weighted Fuzzy Clustering by a DCA Based Algorithm
Abstract
Image segmentation plays an important role in a variety of applications such as robot vision, object recognition and medical imaging. Fuzzy clustering is undoubtedly one of the most widely used methods for image segmentation. In many cases, some characteristics of an image are more significant than others. Therefore, introducing a weight for each feature which defines its relevance is a natural way to improve image segmentation.
In this paper, we develop an efficient method for image segmentation via a feature weighted fuzzy clustering model. Firstly, we formulate the feature weighted fuzzy clustering problem as a DC (Difference of Convex functions) program. DCA (DC Algorithm), an innovative approach in nonconvex programming, is then developed to solve the resulting problem. Experimental results on synthetic and real color images illustrate the effectiveness of the proposed algorithm and its superiority over the standard feature weighted fuzzy clustering algorithm in both running time and quality of solutions.
Hoai Minh Le, Bich Thuy Nguyen Thi, Minh Thuy Ta, Hoai An Le Thi
Clustering Data Streams over Sliding Windows by DCA
Abstract
Mining data streams is a challenging research area in data mining and concerns many applications. In stream models, the data is massive and evolves continuously; it can be read only once or a small number of times. Due to limited memory availability, it is impossible to load the entire data set into memory. Traditional data mining techniques are not suitable for this kind of model and these applications, and new approaches meeting these paradigms are required. In this paper, we are interested in clustering data streams over sliding windows. We investigate an efficient clustering algorithm based on DCA (Difference of Convex functions Algorithm). Comparative experiments with clustering using the standard K-means algorithm on some real data sets are presented.
Ta Minh Thuy, Le Thi Hoai An, Lydia Boudjeloud-Assala
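For orientation, the sketch below shows the kind of sliding-window baseline against which the DCA-based method is compared: only the points currently inside the window are clustered, here with the standard K-means algorithm. The window size, number of clusters and synthetic stream are assumptions; the DCA-based algorithm itself is not reproduced.

```python
# Minimal sliding-window clustering baseline using standard K-means.
# It only sketches the setting of the chapter (clustering the points that
# currently fall inside the window); the DCA-based algorithm is not shown.
from collections import deque
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
window = deque(maxlen=1000)                     # sliding window of the last 1000 points
k = 3                                           # number of clusters (assumed)

def stream():                                   # synthetic 2-D stream (assumption)
    while True:
        center = rng.choice([-5.0, 0.0, 5.0])
        yield center + rng.normal(size=2)

for t, point in zip(range(5000), stream()):
    window.append(point)
    if t % 1000 == 999:                         # re-cluster periodically
        model = KMeans(n_clusters=k, n_init=10).fit(np.array(window))
        print(f"t={t}: inertia={model.inertia_:.1f}")
```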
A Column Generation Based Label Correcting Approach for the Sensor Management in an Information Collection Process
Abstract
This paper deals with problems of sensor management in a human-driven information collection process. This applicative context results in complex sensor-to-task assignment problems, which encompass several difficulties. First of all, the tasks take the form of several information requirements, which are linked together by logical connections and priority rankings. Second, the assignment problem is subject to many constraint paradigms. Our problem is a variant of the Vehicle Routing Problem with Time Windows (VRPTW), and it also incorporates resource constraints including refuelling issues. For solving this problem, we propose a column generation approach, where a label correcting method is used to treat the sub-problem. The efficiency of our approach is evaluated by comparison with the solutions given by CPLEX on different scenarios.
Duc Manh Nguyen, Frédéric Dambreville, Abdelmalek Toumi, Jean-Christophe Cexus, Ali Khenchaf
Planning Sensors with Cost-Restricted Subprocess Calls: A Rare-Event Simulation Approach
Abstract
This paper deals with optimal sensor planning in the context of an observation mission. In order to accomplish this mission, the observer may request some intelligence teams for preliminary prior information. Since team requests are expensive and resources are limited, the entire process results in a two-level optimization, the first level being an experiment devoted to enhancing the criterion modelling. The paper proposes to solve this problem by rare-event simulation, and a mission scenario is addressed.
Frédéric Dambreville
Large Scale Image Classification with Many Classes, Multi-features and Very High-Dimensional Signatures
Abstract
The usual frameworks for image classification involve three steps: extracting features, building a codebook and encoding features, and training the classifiers with a standard classification algorithm. However, the task complexity becomes very large when working on a large dataset such as ImageNet [1], which contains more than 14M images and 21K classes. The complexity concerns both the time needed to perform each task and the memory required. In this paper, we propose an efficient framework for large scale image classification. We extend LIBLINEAR, developed by Rong-En Fan [2], in two ways: (1) we build balanced bagging classifiers with an under-sampling strategy; our algorithm avoids training on the full data, and the training process rapidly converges to the solution; (2) we parallelize the training process of all classifiers on a multi-core computer. The evaluation on the 100 largest classes of ImageNet shows that our approach is 10 times faster than the original LIBLINEAR, 157 times faster than our parallel version of LIBSVM and 690 times faster than OCAS [3]. Furthermore, a lot of information is lost in the quantization step and the obtained bag-of-words is not discriminative enough for classification. Therefore, we propose a novel approach using several local descriptors simultaneously.
Thanh-Nghi Doan, Thanh-Nghi Do, François Poulet
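A rough sketch of the two ideas highlighted in the abstract, balanced under-sampling for each binary classifier and parallel one-vs-rest training on a multicore machine, is given below with scikit-learn and joblib standing in for the authors' extended LIBLINEAR; the data, class counts and hyper-parameters are assumptions.

```python
# Sketch of balanced bagging with under-sampling plus parallel one-vs-rest
# training, in the spirit of the approach described above. scikit-learn's
# LinearSVC and joblib stand in for the authors' extended LIBLINEAR.
import numpy as np
from joblib import Parallel, delayed
from sklearn.svm import LinearSVC

def train_one_class(X, y, cls, n_bags=5):
    rng = np.random.default_rng(int(cls))
    pos = np.flatnonzero(y == cls)
    neg = np.flatnonzero(y != cls)
    models = []
    for _ in range(n_bags):                       # balanced bagging: under-sample negatives
        sampled_neg = rng.choice(neg, size=len(pos), replace=False)
        idx = np.concatenate([pos, sampled_neg])
        labels = (y[idx] == cls).astype(int)
        models.append(LinearSVC(C=1.0).fit(X[idx], labels))
    return cls, models

# Toy data (assumption): 3 classes, 20 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 20))
y = rng.integers(0, 3, size=3000)

# Train all one-vs-rest classifiers in parallel on the available cores.
classifiers = dict(Parallel(n_jobs=-1)(
    delayed(train_one_class)(X, y, cls) for cls in np.unique(y)))
```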
Partitioning Methods to Parallelize Constraint Programming Solver Using the Parallel Framework Bobpp
Abstract
This paper presents a parallelization of a Constraint Programming solver, OR-Tools, using the parallel framework Bobpp [2].
An argument in support of this approach is that the parallelization of solution-search algorithms is extensively studied in this research area worldwide.
The novelty presented here is the study of a parallelization for which the control of the OR-Tools sequential search is limited. Using OR-Tools, it is possible to record the path from the tree's root to a node so as to stop the search at a precise node. However, to start the search on a subtree, the entire path from the root of the main tree to the root of the sub-tree has to be replayed, which leads to additional costs during the search.
To mitigate this problem, different load-balancing strategies are tested to reduce the extra costs due to the redundant branches.
Tarek Menouer, Bertrand Le Cun, Pascal Vander-Swalmen

Queueing Theory and Applications

Frontmatter
Spectral Expansion Solution Methodology for QBD-M Processes and Applications in Future Internet Engineering
Abstract
Quasi Simultaneous-Multiple Births and Deaths (QBD-M) processes are used to model many of the traffic, service and related problems in modern communication systems. Their importance is increasing due to the great strides being made in telecommunication systems and networks. This paper presents an overview of the Spectral Expansion (SE) methodology for the steady-state solution of QBD-M processes and its applications in future Internet engineering.
Tien Van Do, Ram Chakka, János Sztrik
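For orientation, the spectral expansion idea is summarized below for the plain QBD case (the QBD-M processes of the chapter lead to a higher-degree matrix polynomial); the notation is generic and only illustrates the principle.

```latex
% Spectral expansion for a plain QBD process (illustrative sketch). With
% level-independent blocks A_0 (one step up), A_1 (local) and A_2 (one step
% down), the balance equations of the repeating levels
%     v_{j-1} A_0 + v_j A_1 + v_{j+1} A_2 = 0
% admit geometric solutions v_j = \psi \lambda^j, which leads to
\psi\, Q(\lambda) = 0, \qquad Q(\lambda) = A_0 + A_1\lambda + A_2\lambda^2,
\qquad
v_j = \sum_{k:\ |\lambda_k|<1} a_k\, \psi_k\, \lambda_k^{\,j},
% where the coefficients a_k follow from the boundary equations and normalisation.
```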
On the Properties of Generalised Markovian Queues with Heterogeneous Servers
Abstract
This paper provides proofs of some fundamental properties of the generalised Markovian queue HetSigma, which has been proposed to model nodes in modern telecommunication networks. These fundamental properties serve as the basis of an efficient computational algorithm to calculate the steady-state probabilities of the HetSigma queue.
Tien Van Do, Ram Chakka
Performance Analysis of Edge Nodes with Multiple Path Routing
Abstract
Multiprotocol label switching (MPLS) can flexibly establish one or several paths for traffic demands in the form of label switched paths (LSPs). In this paper we propose a scheme for the multiple-LSP operation of edge nodes in MPLS networks. The proposal comprises mechanisms of load-dependent path decision, intervention and a path selection policy to facilitate efficient LSP routing. We develop a performance model to evaluate the performance of the proposed scheme.
Hung Tuan Tran, Tien Van Do, Yeon-Mo Yang

Computational Methods for Knowledge Engineering

Frontmatter
Meta-concept and Intelligent Data Processing in the Geotechnical Fields
Abstract
Hence, using the counter-concepts loose or consolidation or high consolidation, and intuitive soil or non-intuitive soil, to express the knowledge of states of soil consolidation may lead to geotechnical paradoxes. We therefore need a new concept to represent the intelligent data that originates from the engineer's intuition and experience. It further requires a new methodology for knowledge representation and information processing.
A new approach presented in this work for this and many practical problems embraces multiple research disciplines, based on the integration of modern philosophy, logic, mathematics and engineering experience. It is referred to as denotation representation and computing, and it is applied to determine the relative density of sands using CPT data. It helps us to reach unknown reality through true-false measures and 'true-false' analysis using the meta-concept "true-false", rather than approximating the known reality through degrees of truth and 'degree of truth' analysis.
Chi Tran
Ekman Layers of Rotating Fluids with Vanishing Viscosity between Two Infinite Parallel Plates
Abstract
In this paper, we prove the convergence of weak solutions of fast rotating fluids between two infinite parallel plates towards the two-dimensional limiting system. We also establish the existence of Ekman boundary layers when Dirichlet boundary conditions are imposed on the domain.
Van-Sang Ngo
Parallelization of the Fast Multipole Method for Molecular Dynamics Simulations on Multicore Computers
Abstract
We have parallelized the fast multipole method (FMM) on multicore computers using the OpenMP programming model. The FMM is one of the fastest approximate force calculation algorithms for molecular dynamics simulations; its computational complexity is linear. Parallelization of the FMM on multicore computers using OpenMP has been reported since multicore processors became increasingly popular. However, the number of such FMM implementations is not large. The main reason is that those implementations have moderate or low parallel efficiency for high expansion orders due to the sophisticated formulae of the FMM. In addition, the parallel efficiency of those implementations for high expansion orders rapidly drops to 40% or lower as the number of threads increases to 8 or higher. Our FMM implementation on multicore computers, using a combination approach as well as a newly developed formula and computational procedure (A2P), solves the above issues. Test results of our FMM implementation on a multicore computer show that our parallel efficiency with 8 threads is at least 70% for moderate and high expansion orders p = 4, 5, 6, 7. Moreover, the parallel efficiency for moderate and high expansion orders gradually drops from 96% to 70% as the number of threads increases.
Nguyen Hai Chau
An Adaptive Neuro-Fuzzy Inference System for Seasonal Forecasting of Tropical Cyclones Making Landfall along the Vietnam Coast
Abstract
Regression is a causal forecasting method that fits curves to the entire data set to minimize forecasting errors. It should be noted that linear statistic-based regression models do not capture nonlinearity in forecasting. According to the literature, Bayesian- and Neural Network-based regression for seasonal typhoon activity forecasting is more effective than traditional regression models. In this paper, a conjunct space cluster-based adaptive neuro-fuzzy inference system (ANFIS) is applied for seasonal forecasting of tropical cyclones making landfall along the Vietnam coast. The experimental results indicate that the conjunct space cluster-based ANFIS for seasonal forecasting of tropical cyclones is an effective approach with high accuracy.
Trong Hai Duong, Duc Cuong Nguyen, Sy Dung Nguyen, Minh Hien Hoang
An Application of Computational Collective Intelligence to Governance and Policy Modelling
Abstract
The spread of social media provides a great opportunity to enhance transparency, participation and collaboration in modern democracies. Since nothing is perfect, a best-practice engineering approach can be used to continuously monitor the processes and operations that are applied to increase the participation of people and improve public service provision. This paper outlines some basic concepts of our initiative called "Knowledge Management Tools for Quality of Experience Evaluation and Policy Modeling (KNOWN)", which aims at the modeling and analysis of data collected from social media and other online sources. The purpose is to provide quantitative information and feedback regarding the quality of open government and public service provision.
Tien Van Do, Tamas Krejczinger, Michal Laclavik, Nguyen-Thinh Le, Marcin Maleszka, Giang Nguyen, Ngoc Thanh Nguyen
A Framework for Building Intelligent Tutoring Systems
Abstract
Intelligent tutoring systems provide customized instruction or feedback to learners without intervention from a human teacher. This feature makes intelligent tutoring systems attractive because they allow learning anywhere and at any time, and the cost of courses is lower than that of traditional in-class learning. In this work we propose a formal framework for building intelligent tutoring systems. The particular elements of those systems, such as the learner profile, the domain model, and methods for the determination and modification of a learning scenario and for computer adaptive tests, are presented. Additionally, we describe an application of rough classification in e-learning systems. The conducted experiments and analysis demonstrate that personalization has a significant influence on the learning process, and that the probability of passing all lessons from the learning scenario is greater if the opening learning scenario is selected using the worked-out methods than when it is chosen in a random way. The obtained results prove the correctness of our assumptions and have significant implications for the development of intelligent tutoring systems.
Adrianna Kozierkiewicz-Hetmańska, Ngoc Thanh Nguyen
A Review of AI-Supported Tutoring Approaches for Learning Programming
Abstract
In this paper, we review tutoring approaches of computer-supported systems for learning programming. From the survey we have learned three lessons. First, various AI-supported tutoring approaches have been developed and most existing systems use a feedback-based tutoring approach for supporting students. Second, the AI techniques deployed to support feedback-based tutoring approaches are able to identify the student’s intention, i.e. the solution strategy implemented in the student solution. Third, most reviewed tutoring approaches only support individual learning. In order to fill this research gap, we propose an approach to pair learning which supports two students who solve a programming problem face-to-face.
Nguyen-Thinh Le, Sven Strickroth, Sebastian Gross, Niels Pinkwart
Supporting Career Counseling with User Modeling and Job Matching
Abstract
After graduation, an engineer expects to obtain a desired position among predesigned jobs; e.g., a young IT engineer with great programming experience usually prefers to be a software developer or a tester rather than a teacher. For each selected job position, a student should make good preparation during the study period, such as the right selection of a major or a minor, choosing related elective courses and topics for the graduation thesis, taking supportive professional certificates, etc. Therefore, a career counselor is very helpful for students to select a suitable job position that they would prefer after graduation. This paper presents an ontology-based job matching approach to facilitate finding the most appropriate job position for a student based on the job's demands and the student's capabilities. Ontologies are employed to model industrial job positions and students' capabilities. Matching functions are proposed to match the job model and the student model for job recommendation. The evaluation of the job matching results shows a high correlation between the proposed system's recommendations and the recruiter's choices.
Cuong Duc Nguyen, Khoi Duy Vo, Dung Tien Nguyen
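The matching functions are not spelled out in the abstract; as a purely hypothetical illustration, a weighted coverage score between a student profile and a job profile over shared ontology concepts could look like the sketch below (all concept names, weights and the scoring rule are assumptions, not the chapter's functions).

```python
# Hypothetical job-matching score: weighted coverage of a job's required
# ontology concepts by a student's capability levels (names, weights and
# the scoring rule are illustrative assumptions, not the chapter's functions).
def match_score(student: dict, job: dict) -> float:
    """student: concept -> proficiency in [0, 1]; job: concept -> (required level, weight)."""
    total = sum(weight for _, weight in job.values())
    covered = sum(weight * min(student.get(concept, 0.0) / required, 1.0)
                  for concept, (required, weight) in job.items())
    return covered / total if total else 0.0

student = {"Java": 0.8, "Testing": 0.4, "Databases": 0.6}
jobs = {
    "software_developer": {"Java": (0.7, 3.0), "Databases": (0.5, 1.0)},
    "tester":             {"Testing": (0.8, 3.0), "Java": (0.4, 1.0)},
}
ranked = sorted(jobs, key=lambda j: match_score(student, jobs[j]), reverse=True)
print(ranked)
```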

Knowledge Engineering with Cloud and Grid Computing

Frontmatter
An Abstraction Model for High-Level Application Programming on the Cloud
Abstract
One of the biggest obstacles to the widespread take-up of cloud computing is the difficulty that users meet in developing and deploying applications into clouds. Although the PaaS cloud type offers advanced features such as a ready-to-use platform and automatic scaling, it ties users/developers to certain technologies provided by vendors. Meanwhile, due to the lack of a suitable programming environment for the IaaS cloud type, developing and deploying applications into IaaS clouds is a complex task. In this paper, we present a novel abstraction model for programming and deploying applications on the cloud. The approach also enables service migration and interoperability among different clouds.
Binh Minh Nguyen, Viet Tran, Ladislav Hluchy
Agent Based Modeling of Ecodistricts with Smart Grid
Abstract
Studies have been carried out on different aspects of the smart grid using models and simulations; however, little has been done to combine those efforts to study the smart grid and the ecodistrict in the same context. In this paper we discuss the role of smart grid technologies in an ecodistrict context and the mathematical tools, such as complex networks theory and game theory, for modeling smart grids and ecodistricts. Furthermore, from a modeling point of view, we choose and discuss the multiagent-based modeling approach due to the complex and unpredictable characteristics of the system to be modeled. With the help of the multiagent modeling guideline ASPECS, an example ecodistrict model is discussed as well. The example can be extended to become a real-world model to study different problems in smart grid or ecodistrict design.
Murat Ahat, Soufian Ben Amor, Alain Bui
Cloud-Based Data Warehousing Application Framework for Modeling Global and Regional Data Management Systems
Abstract
In this paper, a Cloud-based Data Warehousing Application (CDWA) Framework is proposed to handle multiple levels in the structure of data management systems, i.e. a global data warehouse and its data marts on the clouds. First, a Cloud-based Multidimensional Model, i.e. dimensions and their related concepts, variables or facts, and data cubes, is specified in a formal manner. Afterwards, to fulfill the global and regional requirements of data management systems, the Cloud-based Data Warehousing Application Framework and its classes of services are modeled using UML (Unified Modeling Language) to design a centralized global data warehouse and its local regional data marts. In this context, an implementation example is presented as a proof of concept for our framework.
Thanh Binh Nguyen

Logic Based Methods for Decision Making and Data Mining

Frontmatter
A Tableau Method with Optimal Complexity for Deciding the Description Logic SHIQ
Abstract
We present the first tableau method with an ExpTime (optimal) complexity for checking satisfiability of a knowledge base in the description logic \(\mathcal{SHIQ}\). The complexity is measured using binary representation for numbers. Our method is based on global state caching and integer linear feasibility checking.
Linh Anh Nguyen
An Approach to Semantic Indexing Based on Tolerance Rough Set Model
Abstract
In this article we propose a general framework incorporating semantic indexing and search of texts within scientific document repositories, where the document representation may include, besides the content, some additional document metadata, citations and semantic information. Our idea is based on the application of the Tolerance Rough Set Model, semantic information extracted from the source text and a domain ontology to approximate concepts associated with documents and to enrich the vector representation. We present an experiment performed over the freely accessible biomedical research articles from the PubMed Central (PMC) portal. The experimental results show the advantages of the proposed solution.
Sinh Hoa Nguyen, Hung Son Nguyen
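To make the tolerance-class idea concrete, the toy sketch below builds tolerance classes of terms from a co-occurrence threshold and uses them to enrich a document's term set; the corpus and threshold are assumptions, and the citations, metadata and domain ontology used in the chapter are not modelled.

```python
# Minimal sketch of term tolerance classes in the Tolerance Rough Set Model:
# two terms are "tolerant" when they co-occur in at least theta documents, and
# a document is enriched with the tolerance classes of its own terms.
from collections import defaultdict
from itertools import combinations

docs = [{"rough", "set", "tolerance"},
        {"rough", "set", "approximation"},
        {"semantic", "indexing", "ontology"},
        {"semantic", "search", "indexing"}]
theta = 2                                           # co-occurrence threshold (assumed)

cooc = defaultdict(int)
for d in docs:
    for a, b in combinations(sorted(d), 2):
        cooc[(a, b)] += 1

def tolerance_class(term):
    cls = {term}
    for (a, b), count in cooc.items():
        if count >= theta and term in (a, b):
            cls.add(b if term == a else a)
    return cls

def enrich(doc):
    enriched = set()
    for term in doc:
        enriched |= tolerance_class(term)           # upper-approximation style enrichment
    return enriched

print(enrich({"rough", "tolerance"}))
```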
An Approach for Mining Concurrently Closed Itemsets and Generators
Abstract
Closed itemsets and their generators play an important role in frequent itemset and association rule mining since they lead to a lossless representation of all frequent itemsets. Previous approaches discover either frequent closed itemsets or generators separately. Due to their properties and relationship, this paper proposes GENCLOSE, which mines them concurrently. In a level-wise search, it enumerates the generators using a necessary and sufficient condition for producing (i+1)-item generators from i-item ones. The condition is designed based on object-sets, which can be implemented efficiently using diffsets; it is very convenient and is rigorously proved. Along that process, pre-closed itemsets are gradually extended using three proposed expansion operators. We also prove that they lead to the expected closed itemsets. Experiments on many benchmark datasets confirm the efficiency of GENCLOSE.
Anh Tran, Tin Truong, Bac Le
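To make the two notions concrete, the brute-force sketch below computes closures as intersections of the transactions containing an itemset and marks an itemset as a generator when no proper subset has the same support. The tiny database and exhaustive enumeration are purely illustrative; GENCLOSE itself works level-wise with diffsets and is far more efficient.

```python
# Brute-force illustration of closed itemsets and generators (not GENCLOSE):
# the closure of X is the intersection of all transactions containing X, and
# X is a generator when no proper subset of X has the same support.
from itertools import chain, combinations

transactions = [frozenset("acd"), frozenset("bc"), frozenset("abcd"), frozenset("bd")]

def support(itemset):
    return sum(1 for t in transactions if itemset <= t)

def closure(itemset):
    covering = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*covering) if covering else frozenset()

items = frozenset(chain.from_iterable(transactions))
all_itemsets = [frozenset(c) for r in range(1, len(items) + 1)
                for c in combinations(sorted(items), r)]

closed = {closure(x) for x in all_itemsets if support(x) > 0}
generators = [x for x in all_itemsets if support(x) > 0 and
              all(support(frozenset(s)) != support(x)
                  for s in combinations(sorted(x), len(x) - 1))]

print("closed itemsets:", sorted(map(sorted, closed)))
print("generators:", sorted(map(sorted, generators)))
```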
An Efficient Algorithm for Mining Frequent Itemsets with Single Constraint
Abstract
From the user's point of view, it is often necessary to find the frequent itemsets which include a given set C0, especially when C0 changes regularly. Our recent studies showed that frequent itemset mining with frequently changing constraints should be based on the closed itemsets lattice and generators instead of the database directly. In this paper, we propose a unique representation of the frequent itemsets restricted by the constraint C0 using closed frequent itemsets and their generators. Then, we develop an efficient algorithm to quickly and non-repeatedly generate all frequent itemsets that contain C0. Extensive experiments on a broad range of synthetic and real datasets show the effectiveness of our approach.
Hai Duong, Tin Truong, Bac Le
Mining Frequent Weighted Closed Itemsets
Abstract
Mining frequent itemsets plays an important role in mining association rules. One of the methods for mining frequent itemsets is mining frequent weighted itemsets (FWIs). However, the number of FWIs is often very large when the database is large. Besides, FWIs generate a lot of rules, some of which are redundant. In this paper, a method for mining frequent weighted closed itemsets (FWCIs) in weighted-item transaction databases is proposed. Some theorems are derived first, and based on them, an algorithm for mining FWCIs is proposed. Experimental results show that the number of FWCIs is always smaller than that of FWIs and the mining time is also better.
Bay Vo, Nhu-Y Tran, Duong-Ha Ngo
Backmatter
Metadata
Title
Advanced Computational Methods for Knowledge Engineering
Editors
Ngoc Thanh Nguyen
Tien Van Do
Hoai An Le Thi
Copyright Year
2013
Publisher
Springer International Publishing
Electronic ISBN
978-3-319-00293-4
Print ISBN
978-3-319-00292-7
DOI
https://doi.org/10.1007/978-3-319-00293-4
