
2014 | Book

Intelligent Information and Database Systems

6th Asian Conference, ACIIDS 2014, Bangkok, Thailand, April 7-9, 2014, Proceedings, Part I

Editors: Ngoc Thanh Nguyen, Boonwat Attachoo, Bogdan Trawiński, Kulwadee Somboonviwat

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science


About this book

The two-volume set LNAI 8397 and LNAI 8398 constitutes the refereed proceedings of the 6th Asian Conference on Intelligent Information and Database Systems, ACIIDS 2014, held in Bangkok, Thailand, in April 2014. The 125 revised papers presented were carefully reviewed and selected from 300 submissions. The papers address the following topics: natural language and text processing, intelligent information retrieval, semantic Web, social networks and recommendation systems, intelligent database systems, decision support systems, computer vision techniques, and machine learning and data mining. The papers are organized in topical sections on multiple model approach to machine learning, MMAML 2014, computational intelligence, CI 2014, engineering knowledge and semantic systems, IWEKSS 2014, innovations in intelligent computation and applications, IICA 2014, modeling and optimization techniques in information systems, database systems and industrial systems, MOT 2014, innovation via collective intelligences and globalization in business management, ICIGBM 2014, intelligent supply chains, ISC 2014, and human motion: acquisition, processing, analysis, synthesis and visualization for massive datasets, HMMD 2014.

Table of Contents

Frontmatter

Natural Language and Text Processing

A Meta-model Guided Expression Engine

Data acquisition and handling is known to be one of the most severe technical barriers in (bio-)medical research. To counter this problem, we created a generic data acquisition and management system which can be set up for a given domain of application without the need for programming or database skills. The user's definitions of the domain data structures are stored in an abstract meta data model and allow the automatic creation of data input and management interfaces. To enable the user to define complex search queries on the data, or to derive new data from existing data, a meta-model-guided expression engine was developed. Grammatical and structural meta-data are interwoven in order to support the domain expert in expression generation.

Dominic Girardi, Josef Küng, Michael Giretzlehner
Text Clustering Using Novel Hybrid Algorithm

Feature clustering has evolved into a powerful method for clustering text documents. In this paper we propose a hybrid similarity-based clustering algorithm for feature clustering. Documents are represented by keywords, and these words are grouped into clusters based on efficient similarity computations, so that documents with related words are grouped together. The clusters are characterised by similarity equations, graph-based similarity measures, and Gaussian parameters. As words are fed into the system, clusters are generated automatically. The hybrid mechanism works with membership algorithms to identify documents that match one another and can be grouped into clusters. The method aims to find the real distribution of words in the text documents. Experimental results show that the proposed method performs substantially better than several other clustering methods. The distinguished clusters are identified by a unique group of top keywords obtained from the documents of a cluster.

Divya D. Dev, Merlin Jebaruby
Combination of Multi-view Multi-source Language Classifiers for Cross-Lingual Sentiment Classification

Cross-lingual sentiment classification aims to conduct sentiment classification in a target language using labeled sentiment data in a source language. Most existing research relies on machine translation to project information directly from one language to another. However, cross-lingual classifiers cannot learn all characteristics of the target-language data from data translated from a single language alone. In this paper, we propose a new learning model that uses labeled sentiment data from more than one language to compensate for some of the limitations of resource translation. In this model, we first create different views of the sentiment data via machine translation, then train individual classifiers in every view, and finally combine the classifiers for the final decision. We have applied this model to sentiment classification datasets in three different languages using different combination methods. The results show that the combination methods improve on the performance obtained separately by each individual classifier.

Mohammad Sadegh Hajmohammadi, Roliana Ibrahim, Ali Selamat, Alireza Yousefpour
Learning to Simplify Children Stories with Limited Data

In this paper, we examine children stories and propose a text simplification system to automatically generate simpler versions of the stories and, therefore, make them easier to understand for children, especially ones with difficulty in reading comprehension. Our system learns simplifications from limited data built from a small repository of short English stories for children and can perform important simplification operations, namely splitting, dropping, reordering, and substitution. Our experiment shows that our system outperforms other systems in a variety of automatic measures as well as human judgements with regard to simplicity, grammaticality, and semantic similarity.

Tu Thanh Vu, Giang Binh Tran, Son Bao Pham
Clustering Based Topic Events Detection on Text Stream

Detecting and tracking events in text stream data is critical to the social network community and thus attracts more and more research effort. However, existing topic detection and tracking models have two major limitations: noise words and multiple sub-events. In this paper, a novel event detection and tracking algorithm, topic event detection and tracking (TEDT), is proposed to tackle these limitations by clustering the co-occurrent features of the underlying topics in the text stream data; the evolution of events is then analyzed for event tracking. The evaluation was performed on two real datasets, with promising results demonstrating that (1) the proposed TEDT algorithm is superior to the state-of-the-art topic model with respect to event detection, and (2) the proposed TEDT algorithm can successfully track event changes.

Chunshan Li, Yunming Ye, Xiaofeng Zhang, Dianhui Chu, Shengchun Deng, Xiaofei Xu
Nonterminal Complexity of Weakly Conditional Grammars

A weakly conditional grammar is specified as a pair K = (G, G′) where G is a context-free grammar, and G′ is a regular grammar such that a production rule of G is only applicable to a sentential form if it belongs to the language generated by G′. The nonterminal complexity Var(K) of the grammar K is defined as the sum of the numbers of nonterminals of G and G′. This paper studies the nonterminal complexity of weakly conditional grammars, and it proves that every recursively enumerable language can be generated by a weakly conditional grammar with no more than ten nonterminals. Moreover, it shows that the number of nonterminals in such grammars without erasing rules leads to an infinite hierarchy of families of languages generated by weakly conditional grammars.

Sherzod Turaev, Mohd Izzuddin Mohd Tamrin, Norsaremah Salleh
Thai Grapheme-Phoneme Alignment: Many-to-Many Alignment with Discontinuous Patterns

Grapheme-phoneme aligned data is crucial to a grapheme-to-phoneme conversion system. Although manual alignment is possible, the task is tedious and time-consuming. Therefore, unsupervised alignment algorithms have been proposed to reduce this alignment cost. Several efficient algorithms rely on the assumption that patterns are continuous, but this assumption does not hold for Thai. When applying these algorithms to Thai grapheme-to-phoneme alignment, some pre-processing steps for discontinuous patterns are necessary. We propose an algorithm to align Thai graphemes and phonemes which directly incorporates the discontinuous patterns. The experiments show that the precision of the proposed alignment algorithm substantially increases over the conventional alignment with only continuous patterns, while the recall decreases from the original method. As a result, the proposed algorithm achieves an F1 similar to that of the conventional algorithm.

Dittaya Wanvarie
A New Approach for Mining Top-Rank-k Erasable Itemsets

Erasable itemset mining, first introduced in 2009, is an interesting variation of pattern mining. Managers can use erasable itemsets for planning the production plan of a factory. Besides the problem of mining erasable itemsets, the problem of mining top-rank-k erasable itemsets is an interesting and practical one. In this paper, we first propose a new structure, called dPID_List, and two theorems associated with it. Then, an improved algorithm for mining top-rank-k erasable itemsets using the dPID_List structure is developed. The effectiveness of the proposed method has been demonstrated by comparisons with the VM algorithm in terms of mining time and memory usage on three datasets.

Giang Nguyen, Tuong Le, Bay Vo, Bac Le

Intelligent Information Retrieval

New Method for Extracting Keyword for the Social Actor

In this paper we study the relationship between queries and search engines by exploring some of their properties and applying these relations to extract keywords for any social actor with a newly proposed method. The proposed approach is based on considering search engine results for singleton and doubleton queries. We develop a novel method for extracting keywords automatically from the Web with the mirror shade concept (M2M). The results show the potential of the proposed approach; in the experiments we find that the performance (recall and precision) of keyword extraction depends on both weights (singleton and tf-idf) and the distance between them.

Mahyuddin K. M. Nasution
Query Expansion Using Medical Subject Headings Terms in the Biomedical Documents

The MEDLINE database is the most resourceful repository of biomedical literature. Lay users may find it difficult to formulate a query. Query expansion techniques reformulate the user query by adding more significant and related terms to the original terms in order to retrieve more relevant results. Related terms are explored from external resources, the collection, and the query context. Each MEDLINE document is manually assigned controlled vocabulary terms called MeSH (Medical Subject Headings), and these controlled vocabularies may be beneficial for query expansion. This paper proposes pseudo-relevance feedback using the MeSH terms in documents for query expansion. Additionally, a re-weighting scheme called RABAM-PRF (Rank-Based MeSH Pseudo-Relevance Feedback) for filtering misleading terms is studied. In the experiments, we use Lucene to retrieve the OHSUMED collection as a baseline. The proposed method improves retrieval performance in MAP, P@10, and B-pref. Furthermore, the experiments showed that not all MeSH terms should be included in the query.

Ornuma Thesprasith, Chuleerat Jaruskulchai
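The RABAM-PRF re-weighting itself is not detailed in the abstract; as a generic sketch of MeSH-based pseudo-relevance feedback, one can treat the top-ranked documents as relevant and add their most frequent MeSH terms to the query. The document ids, annotations, and the frequency-based scoring below are illustrative assumptions, not the paper's method.

```python
from collections import Counter

def expand_query(query_terms, ranked_docs, doc_mesh, k=3, n_expansion=2):
    """Pseudo-relevance feedback: assume the top-k retrieved documents are
    relevant and append their most frequent MeSH terms to the query.
    `doc_mesh` maps a document id to its (hypothetical) MeSH annotations."""
    term_counts = Counter()
    for doc_id in ranked_docs[:k]:
        term_counts.update(doc_mesh.get(doc_id, []))
    # Rank candidate terms by frequency in the feedback set; skipping terms
    # already in the query is a crude stand-in for filtering misleading terms.
    candidates = [t for t, _ in term_counts.most_common() if t not in query_terms]
    return list(query_terms) + candidates[:n_expansion]

# Toy collection with hypothetical MeSH annotations.
doc_mesh = {
    "d1": ["Hypertension", "Adult", "Drug Therapy"],
    "d2": ["Hypertension", "Drug Therapy"],
    "d3": ["Neoplasms"],
}
expanded = expand_query(["Hypertension"], ["d1", "d2", "d3"], doc_mesh, k=2)
print(expanded)  # → ['Hypertension', 'Drug Therapy', 'Adult']
```

Only the top-2 documents contribute terms here, so "Neoplasms" from the lower-ranked d3 never enters the expanded query.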
Improving Health Question Classification by Word Location Weights

Healthcare consumers often access the Internet to get health information related to specific health questions, which are often about several health categories such as the cause, diagnosis, and process (e.g., treatment) of disorders. Therefore, for a given health question q, a classifier should be developed to recognize the intended category (or categories) of q so that relevant information specifically for answering q can be retrieved. In this paper, we show that a Support Vector Machine (SVM) classifier can be trained to properly classify real-world Chinese health questions (CHQs), and more importantly, by weighting the words in the CHQs based on their locations in the CHQs, the SVM classifier can be further improved significantly. The improved classifier can serve as a fundamental component to retrieve relevant health information from health information websites, as well as from collections of CHQs whose answers have been written by healthcare professionals, so that healthcare consumers can get reliable health information, which is particularly essential in health promotion and disease management.

Rey-Long Liu
Entity Recognition in Information Extraction

Detecting and resolving entities is an important step in information retrieval applications. Humans are able to recognize entities by context, but information extraction systems (IES) need to apply sophisticated algorithms to recognize an entity. The development and implementation of an entity recognition algorithm is described in this paper. The implemented system is integrated with an IES that derives triples from unstructured text. By doing so, the triples are more valuable in query answering because they refer to identified entities. By extracting information from the Wikipedia encyclopedia, a dictionary of entities and their contexts is built. The entity recognition computes a score for context similarity, which is based on cosine similarity with a tf-idf weighting scheme, and the string similarity. The implemented system shows good accuracy on Wikipedia articles, is domain independent, and recognizes entities of arbitrary types.

Novita Hanafiah, Christoph Quix
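The scoring described in the abstract (tf-idf cosine similarity over contexts, combined with string similarity) can be sketched as follows. The equal weighting of the two components, the toy entity dictionary, and the smoothed idf formula are assumptions for illustration, not the paper's exact formula.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

def tfidf(docs):
    """tf-idf vectors (dicts: term -> weight) for tokenized documents."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}  # smoothed idf
    return [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs]

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def disambiguate(mention, context, entities):
    """Pick the entity whose dictionary context is most similar to the
    mention's context, also taking name (string) similarity into account.
    `entities` maps an entity name to its tokenized dictionary context."""
    names = list(entities)
    vecs = tfidf([context] + [entities[n] for n in names])
    ctx_vec, ent_vecs = vecs[0], vecs[1:]
    def score(i):
        s_str = SequenceMatcher(None, mention.lower(), names[i].lower()).ratio()
        return 0.5 * cosine(ctx_vec, ent_vecs[i]) + 0.5 * s_str  # weights assumed
    return names[max(range(len(names)), key=score)]

best = disambiguate(
    "Paris", ["capital", "france", "eiffel", "tower"],
    {"Paris (France)": ["france", "capital", "seine", "eiffel"],
     "Paris (Texas)": ["texas", "usa", "town"]})
print(best)  # → Paris (France)
```

The context overlap dominates here: both candidates have similar string similarity to the mention, but only one shares context terms with it.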
Author Name Disambiguation by Using Deep Neural Network

Author name ambiguity is one of the problems that decrease the quality and reliability of information retrieved from digital libraries. Existing methods have tried to solve this problem by predefining a feature set based on an expert's knowledge for a specific dataset. In this paper, we propose a new approach which uses a deep neural network to learn features automatically for resolving author name ambiguity. Additionally, we propose a general system architecture for author name disambiguation on any dataset. We evaluate the proposed method on a dataset containing Vietnamese author names. The results show that this method significantly outperforms other methods that use a predefined feature set. The proposed method achieves 99.31% accuracy. The prediction error rate decreases from 1.83% to 0.69%, i.e., by 1.14 percentage points, or 62.3% relative to other methods that use a predefined feature set (Table 3).

Hung Nghiep Tran, Tin Huynh, Tien Do
Incremental Refinement of Linked Data: Ontology-Based Approach

This paper presents an approach to refining linked data using a domain ontology. Many linked data are available on the Internet, and since the volume is increasing, incrementally adding links between pieces of data is important. In this paper, we consider a Frequently Asked Questions (FAQs) database in the domain of rental apartments targeted to international students in Japan. This database is implemented using linked data and contains the relationships between part of the floor plan of the rental apartment and FAQs. To facilitate adding new questions, we propose a method that automatically derives a relationship between a new question and part of the floor plan using the domain ontology. Our experimental results with newly added questions indicate the effectiveness of our proposed approach.

Yusuke Saito, Boonsita Roengsamut, Kazuhiro Kuwabara
Using Lexical Semantic Relation and Multi-attribute Structures for User Profile Adaptation

This contribution presents a new approach to the representation of user interests and preferences in the information retrieval process. The adaptive user profile includes both interests given explicitly by the user, as a query, and preferences expressed through relevance judgements on retrieved documents, so as to express a field-independent translation between the terminology used by the user and the terminology accepted in some field of knowledge. Procedures for building, modifying, and expanding the profile with semantically related terms, and for using it, are presented. Experiments concerning the profile, as a personalization mechanism of a Web retrieval system, are presented and discussed.

Agnieszka Indyka-Piasecka, Piotr Jacewicz
Auto-Tagging Articles Using Latent Semantic Indexing and Ontology

Tagging plays a crucial role in the success of social networks and social collaboration. This paper proposes an auto-tagging methodology for articles using Latent Semantic Indexing (LSI) and an ontology. The proposed methodology consists of a pre-processing process and a tagging process. In the pre-processing process, the LSI vector is created for article classification. The tagging process then suggests ontological tags. An accuracy evaluation of auto-tagging compared with manual tagging is discussed. The experimental results show that the proposed auto-tagging methodology achieves high accuracy and recall.

Rittipol Rattanapanich, Gridaphat Sriharee
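The abstract does not detail how the LSI vectors are built; a common construction, sketched here with toy data, is a truncated SVD of the term-document matrix. The matrix, vocabulary, and choice of k below are illustrative, not taken from the paper.

```python
import numpy as np

# Tiny term-document matrix (rows = terms, cols = articles); counts are toy data.
terms = ["python", "code", "recipe", "flour", "bake"]
A = np.array([
    [2, 1, 0, 0],   # python
    [1, 2, 0, 0],   # code
    [0, 0, 2, 1],   # recipe
    [0, 0, 1, 2],   # flour
    [0, 1, 1, 2],   # bake
], dtype=float)

# LSI: a truncated SVD keeps only the k strongest latent "topics".
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
doc_topics = (np.diag(s[:k]) @ Vt[:k]).T   # each article as a k-dim latent vector

def lsi_similarity(i, j):
    """Cosine similarity between articles i and j in the latent topic space."""
    u, v = doc_topics[i], doc_topics[j]
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Articles 0 and 1 (programming terms) end up closer in latent space
# than articles 0 and 2 (cooking terms).
print(round(lsi_similarity(0, 1), 3), round(lsi_similarity(0, 2), 3))
```

Classifying an article then reduces to comparing its latent vector against those of labeled articles, which is what makes the pre-processing step useful for tag suggestion.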
Evaluating Profile Convergence in Document Retrieval Systems

In many document retrieval systems, the user is not supported until sufficient information about him or her has been collected. In other systems, randomly selected documents are recommended, but they may not be relevant. To avoid this so-called "cold-start problem", a method for determining a non-empty profile for a new user is presented in this paper. Experimental evaluations are usually performed with a few real users; since this is a time- and cost-consuming method of evaluation, we propose a methodology of experiments based on simulations of user activities. The results were statistically analyzed and show that, using the proposed method, the adaptation process builds a profile that is closer to the user's preferences than when the initial user profile is empty.

Bernadetta Maleszka, Ngoc Thanh Nguyen
Using Non-Zero Dimensions and Lengths of Vectors for the Tanimoto Similarity Search among Real Valued Vectors

The Tanimoto similarity measure finds numerous applications, e.g. in chemical informatics, bioinformatics, information retrieval, and text and web mining. Recently, two efficient methods for reducing the number of candidates for Tanimoto-similar real valued vectors have been proposed: one using lengths of vectors and the other using their non-zero dimensions. In this paper, we offer new theoretical results on the combined usage of lengths of real valued vectors and their non-zero dimensions for more efficient reduction of candidates for Tanimoto-similar vectors. In particular, we derive more restrictive bounds on the lengths of such candidate vectors.

Marzena Kryszkiewicz
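For real-valued vectors, T(u, v) = u·v / (|u|² + |v|² − u·v), and one elementary length bound follows from u·v ≤ |u||v|: if T(u, v) ≥ t, then |u|/|v| + |v|/|u| ≤ (1 + t)/t, confining |v| to an interval around |u|. The sketch below combines that elementary bound with a non-zero-dimension check; the paper's sharper combined bounds are not reproduced here, and the vectors and threshold are illustrative.

```python
import math

def tanimoto(u, v):
    """Tanimoto similarity for real-valued vectors given as dicts (dim -> value)."""
    dot = sum(x * v.get(d, 0.0) for d, x in u.items())
    su = sum(x * x for x in u.values())
    sv = sum(x * x for x in v.values())
    denom = su + sv - dot
    return dot / denom if denom else 0.0

def length_bounds(norm_u, t):
    """If T(u, v) >= t, then u.v <= |u||v| gives |u|/|v| + |v|/|u| <= (1+t)/t,
    so |v| must lie in [a*|u|, |u|/a] with a the smaller root of r^2 - s*r + 1 = 0."""
    s = (1 + t) / t
    a = (s - math.sqrt(s * s - 4)) / 2
    return a * norm_u, norm_u / a

def candidates(query, vectors, t):
    """Keep only vectors that share a non-zero dimension with the query
    and whose length lies within the Tanimoto length bounds."""
    nq = math.sqrt(sum(x * x for x in query.values()))
    lo, hi = length_bounds(nq, t)
    out = []
    for name, v in vectors.items():
        nv = math.sqrt(sum(x * x for x in v.values()))
        if lo <= nv <= hi and set(query) & set(v):
            out.append(name)
    return out

vecs = {
    "close":    {0: 1.0, 1: 2.0},
    "long":     {0: 10.0, 1: 20.0},   # pruned: length far outside the bounds
    "disjoint": {5: 1.0, 6: 2.0},     # pruned: no shared non-zero dimension
}
q = {0: 1.0, 1: 2.1}
print(candidates(q, vecs, t=0.8))  # → ['close']
```

The filter is necessary but not sufficient: surviving candidates still need an exact Tanimoto computation, but vectors failing either test can be discarded without one.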

Semantic Web, Social Networks and Recommendation Systems

Finding the Cluster of Actors in Social Network Based on the Topic of Messages

Social networks, among the most popular Internet services, have seen rapid growth in the number of users in recent years. In this paper, we present how to use a SOM network to cluster actors based on a vector representing the probability distribution over topics that each actor prefers. We use the ART model to create the vector of topics of interest. Moreover, we use the Enron email corpus as a sample dataset to evaluate the efficiency of the SOM network. By experimenting on this dataset, we demonstrate that our proposed model can extract meaningful clusters following the topics. We use the F-measure to test the precision of the SOM algorithm; in our sample tests, the F-measure shows acceptable accuracy of the SOM method. Based on this result, application developers can use SOM to group actors according to the topics they are interested in.

Hoa Tran Quang, Hung Vo Ho Tien, Hoang Nguyen Le, Thanh Ho Trung, Phuc Do
Geodint: Towards Semantic Web-Based Geographic Data Integration

The main objective of data integration is to unify data from different sources and to provide a unified view to the users. The integration of heterogeneous data has benefits both for companies and for research. However, due to this heterogeneity, finding a common schema and filtering out duplicate elements become difficult. In this paper, a system is presented that is able to integrate geographic data from different sources using Semantic Web technologies. The problems that appear during integration are also handled by the system. An ontology has been developed that stores the common attributes obtained after schema matching. To filter out inconsistent and duplicate elements, clustering and string similarity metrics have been used. The integrated data can be used, among other things, for touristic purposes; for example, it could provide data to an augmented reality browser.

Tamás Matuszka, Attila Kiss
SPARQL – Compliant Semantic Search Engine with an Intuitive User Interface

It is crucial to enable users of Linked Data to explore RDF-compliant knowledge bases in an intuitive and effective way. It is not reasonable to assume that a regular user possesses any knowledge of SPARQL or of the ontology of a given knowledge base. This paper presents the Semantic Focused Crawler (SFC) system, which features a graph-based querying interface that addresses this issue. Thanks to auto-complete recommendations in the SFC query builder interface, the user benefits from the ontology irrespective of his/her degree of knowledge about semantic technologies. When compared with several widely referenced alternative solutions in experiments performed on the 2011 QALD workshop questions, the presented system achieves high query-result accuracy with low complexity of the query formulation process.

Adam Styperek, Michal Ciesielczyk, Andrzej Szwabe
A General Model for Mutual Ranking Systems

Ranking has been applied in many domains using recommendation systems, such as search engines, e-commerce, and so on. We introduce and study N-linear mutual ranking, which can rank n classes of objects at once; the ranking scores of these classes depend on one another. For instance, PageRank by Google is a 2-linear mutual ranking, which ranks webpages and links at once. In particular, we focus on the N-star ranking model and demonstrate it on the problem of ranking conferences and journals. We have conducted experiments for the models in which citations are not considered. The experimental results are based on the DBLP dataset, which contains more than one million papers and authors and thousands of conferences and journals in computer science. Finally, N-star ranking is a strong ranking algorithm that can be applied to many real-world problems.

Vu Le Anh, Hai Vo Hoang, Kien Le Trung, Hieu Le Trung, Jason J. Jung
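The N-star model itself is not specified in the abstract; as an illustrative stand-in, a 2-class mutual ranking can be sketched HITS-style, where papers and authors score each other under power iteration. The update rule and the toy data are assumptions, not the paper's model.

```python
import math

def mutual_rank(paper_authors, iters=100):
    """Toy 2-linear mutual ranking: an author's score is the sum of the scores
    of his/her papers, and a paper's score is the sum of the scores of its
    authors; both score vectors are L2-normalized on every iteration."""
    papers = list(paper_authors)
    authors = sorted({a for aa in paper_authors.values() for a in aa})
    p = {x: 1.0 for x in papers}
    a = {x: 1.0 for x in authors}
    for _ in range(iters):
        a = {x: sum(p[pp] for pp in papers if x in paper_authors[pp]) for x in authors}
        norm = math.sqrt(sum(v * v for v in a.values()))
        a = {k: v / norm for k, v in a.items()}
        p = {pp: sum(a[x] for x in paper_authors[pp]) for pp in papers}
        norm = math.sqrt(sum(v * v for v in p.values()))
        p = {k: v / norm for k, v in p.items()}
    return p, a

# Hypothetical paper-author graph.
paper_authors = {
    "p1": ["alice", "bob"],
    "p2": ["alice"],
    "p3": ["carol"],
}
p, a = mutual_rank(paper_authors)
print(max(a, key=a.get))  # → alice (she sits in the denser component)
```

The power iteration converges to the principal eigenvector of the bipartite adjacency structure, which is the sense in which the two classes are ranked "at once" rather than independently.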
Automated Interestingness Measure Selection for Exhibition Recommender Systems

Exhibition guide systems contain various information pertaining to exhibitors, products, and events that happen during exhibitions. Such a system is more useful if it is augmented with a recommender system. Our recommender system recommends to users a list of interesting exhibitors based on associations mined from web server logs. The recommendations are ranked using various Objective Interestingness Measures (OIMs) that quantify the interestingness of an association. Due to data sparsity, some OIMs cannot provide distinct values for different rules, which hampers the ranking process. In mobile applications, the ranking of recommendations is crucial because of the limited screen real estate of mobile devices. We show that our system is able to select an OIM (from 50 OIMs) that performs better than the usual support-confidence OIM. Our system is tested using data from exhibitions held in Germany.

Kok Keong Bong, Matthias Joest, Christoph Quix, Toni Anwar
Equivalent Transformation in an Extended Space for Solving Query-Answering Problems

A query-answering problem (QA problem) is concerned with finding all ground instances of a query atomic formula that are logical consequences of a given logical formula describing the background knowledge of the problem. Based on the equivalent transformation (ET) principle, we propose a general framework for solving QA problems on first-order logic. To solve such a QA problem, the first-order formula representing its background knowledge is converted by meaning-preserving Skolemization into a set of clauses typically containing global existential quantifications of function variables. The obtained clause set is then transformed successively using ET rules until the answer set of the original problem can be readily derived. Many ET rules are demonstrated, including rules for unfolding clauses, for resolution, for dealing with function variables, and for erasing independent satisfiable atomic formulas. Application of the proposed framework is illustrated.

Kiyoshi Akama, Ekawit Nantajeewarawat
Knowledge Generalization during Hierarchical Structures Integration

Hierarchical data structures are common in modern applications. Tree integration is one of the tools that has not been fully researched in this scope. Therefore, in this paper we define a complex tree to model common hierarchical structures. The aim of complex tree integration is determined by specific integration criteria. In this paper we define and analyze a criterion measuring the generalization of knowledge: upper semantic precision. We analyze the criterion in terms of simpler syntactic criteria and describe an extended example of an information retrieval system using this criterion.

Marcin Maleszka
Design and Implementation of an Adaptive Tourist Recommendation System

Recommendation Systems and Adaptive Systems have been introduced in travel applications in order to support travellers in their decision-making processes. These systems should respond to the unexpected changes during travel. In this case, they need to sense the changes holistically before, during, and after the travel. In addition, they should also be adapted to the specifications and conditions of the traveller. For example, there is a need to consider all aspects of the traveller’s needs, such as personal, cultural, and social. Similarly, the information about accommodations, flights, cities, activities and countries should be gathered through different sources. Furthermore, these systems need to learn from travellers’ feedback to improve the quality of recommendations. However, the majority of travel applications do not satisfy the above requirements. To address these problems and issues, we propose and implement a travel process that is supported by an adaptive tourist recommendation framework, architecture, and system.

Leila Etaati, David Sundaram
Improving Efficiency of PromoRank Algorithm Using Dimensionality Reduction

Promotion plays a crucial role in online marketing and can be used in post-sale recommendation, brand development, customer support, etc. It is often desirable to find markets or sales channels where an object, e.g., a product, person, or service, can be promoted efficiently. Since the object may not be highly ranked in the global property space, the PromoRank algorithm promotes a given object by discovering promotive subspaces in which the target is top ranked. However, the computational complexity of PromoRank is exponential in the dimension of the space. This paper proposes using dimensionality reduction algorithms, such as PCA, to reduce the dimension size and, as a consequence, improve the performance of PromoRank. Evaluation results show that dimensionality reduction can reduce the execution time of PromoRank by up to 25% on large data sets while the ranking result is mostly maintained.

Metawat Kavilkrue, Pruet Boonma
A Framework to Provide Personalization in Learning Management Systems through a Recommender System Approach

Personalization in learning management systems (LMS) occurs when such systems tailor the learning experience of learners such that it fits to their profiles, which helps in increasing their performance within the course and the quality of learning. A learner’s profile can, for example, consist of his/her learning styles, goals, existing knowledge, ability and interests. Generally, traditional LMSs do not take into account the learners’ profile and present the course content in a static way to every learner. To support personalization in LMS, recommender systems can be used to recommend appropriate learning objects to learners, not only based on their individual profile but also based on what worked well for learners with a similar profile. In this paper, we propose a framework to integrate a recommender system approach into LMS. The proposed framework is designed with the goal of presenting a flexible integration model which can provide personalization by automatically suggesting learning objects to learners based on their current situation as well as successful learning experiences of learners with similar profiles in a similar situation. Such advanced personalization can help learners in many ways such as reducing the learning time without negative impact on their marks, improving learning performance as well as increasing the level of satisfaction.

Hazra Imran, Quang Hoang, Ting-Wen Chang, Kinshuk, Sabine Graf

Intelligent Database Systems

Agent-Based Modelling the Evacuation of Endangered Areas

The evacuation process from endangered areas (EA) in crisis situations is modelled by means of simple agents (gate-ways equipped by sensors). Timed Petri nets (TPN) and first-order hybrid Petri nets (FOHPN) are utilized here to model the EA structure as well as the agents and their cooperation. Rooms, other spaces to be evacuated (corridors) and safe spaces out of EA (where people are evacuated) are modelled by TPN places and FOHPN continuous places. Gate-ways are modelled by TPN subnets and by FOHPN continuous transitions. While the supervisor for the TPN gate-ways can be synthesized by means of place/transition Petri nets (P/T PN), the blocks of FOHPN discrete places and transitions are used to affect the gate-ways. Depending on the immediate throughput of the gate-ways the escape time behaviour is found in the process of simulation. This paper is a free continuation of [4] where the problem was solved solely by means of P/T PN.

František Čapkovič
DPI: Dual Private Indexes for Outsourced Databases

Designing secure and efficient indexes to support selective queries over encrypted data at server side is necessary for outsourced databases. On one hand, the indexes must be associated with plain values in order to locate tuples precisely. On the other hand, the indexes may open a door to leak sensitive information, especially when the indexes are combined with selective encryption. In this paper, we propose DPI, a dual private index for outsourced databases. According to DPI, two types of indexes, incorporated with user-specified random salts, are defined for each attribute. The generalization-based index is used to support server side query over encrypted data and protect data from link inferences, and the value-based index is adaptively used at client side to reduce extra decryption costs. We have conducted some experiments to validate our proposed method.

Yi Tang, Fang Liu, Liqing Huang
Anomaly SQL SELECT-Statement Detection Using Entropy Analysis

Database systems are often intruded upon because they store valuable information and can be accessed through Internet web applications which are sometimes not developed with security in mind. Attackers can inject crafted inputs into programs that work on database systems so that unexpected results occur. We analyze database system log files, focusing on query statements (SQL SELECT statements), using Shannon entropy to detect anomalous attempts that change the conditional entropy significantly. Our experiment shows that the proposed anomaly detection using entropy analysis is effective.

Thanunchai Threepak, Akkradach Watcharapupong
Deriving Composite Periodic Patterns from Database Audit Trails

Information about the periodic changes of intensity and structure of database workloads plays an important role in performance tuning of functional components of database systems. Discovering patterns in workload information such as audit trails, traces of user applications, sequences of dynamic performance views, etc. is a complex and time-consuming task. This work investigates a new approach to the analysis of information included in database audit trails. In particular, it describes the transformations of information included in the audit trails into a format that can be used for discovering periodic patterns in database workloads. It presents an algorithm that finds elementary periodic patterns through nested iterations over a four-dimensional space of execution plans of SQL statements and positional parameters of the patterns. Finally, it shows the composition rules for the derivation of complex periodic patterns from the elementary and other complex patterns.

Marcin Zimniak, Janusz R. Getta, Wolfgang Benn
Comparison of Stability Models in Incremental Development

There are many stability models, developed with different factors, indicators, and methods. The objective of this paper is to compare models for estimating the logical stability of classes in software designs developed incrementally, based on class diagrams and sequence diagrams. The models are developed with different methods such as multiple regression analysis (MRA), principal component analysis (PCA), and design logical ripple effect analysis (DLREA). The empirical results show that the models are acceptable for estimating stability. We then compare and discuss the results to help developers make decisions when selecting and using methods for developing stability estimation models.

Alisa Sangpuwong, Pornsiri Muenchaisri
An Approach of Finding Maximal Submeshes for Task Allocation Algorithms in Mesh Structures

This paper concerns the problem of finding efficient task allocation algorithms in mesh structures. An allocation algorithm, called Window-Based-Best-Fit with Validated-Submeshes (WBBFVS), has been created. The core of this algorithm lies in the way maximal submeshes are found, which is an important part of task allocation algorithms based on the stack approach. The new way of finding maximal submeshes avoids post-checking submeshes for redundancy thanks to the proposed initial validation. The elimination of redundancy may increase the speed of the algorithms. The three considered algorithms have been evaluated on the basis of simulation experiments performed using the designed and implemented experimentation system. The obtained results show that the WBBFVS algorithm is very promising.

Radosław J. Jarecki, Iwona Poźniak-Koszalka, Leszek Koszalka, Andrzej Kasprzak
A GA-Based Approach for Resource Consolidation of Virtual Machines in Clouds

In cloud computing, infrastructure as a service (IaaS) is a growing market that enables users to access cloud resources in a convenient, on-demand manner. IaaS lets users rent cloud resources as virtual machines (VMs) through virtualization technology. Because different VMs may demand different amounts of resources, an important problem that must be addressed effectively in the cloud is how to adapt the mapping of VMs to physical resources in order to satisfy their needs. A solution to this mapping problem is called a virtual machine placement policy (VMPP). However, a VM's resource requirements change with the workload of its application, so resource consolidation is necessary to satisfy resource demands dynamically. In this paper, we present a two-phase approach for resource consolidation to minimize resource consumption. In the first phase, we use a genetic algorithm to find a reconfiguration plan. In the second phase, we propose a mechanism to migrate VMs such that the number of active nodes and the overall migration cost are minimized. The experimental results show that our approach consolidates active nodes better than other existing approaches.

I-Hsun Chuang, Yu-Ting Tsai, Mong-Fong Horng, Yau-Hwang Kuo, Jang-Pong Hsu
Problems of SUMO-Like Ontology Usage in Domain Modelling

Ontologies are increasingly used, especially in the early stages of software development. It is widely believed that the use of ontologies has a positive impact on the quality of the final software product. The aim of the paper is an introductory analysis of possible mappings between SUMO-like ontologies and domain models expressed in UML, a commonly accepted modelling language in software development. The main contribution of the paper is the identification of basic problems within this mapping.

Bogumiła Hnatkowska, Zbigniew Huzar, Iwona Dubielewicz, Lech Tuzinkiewicz

Intelligent Information Systems

Implementation of Emotional-Aware Computer Systems Using Typical Input Devices

Emotions play an important role in human interactions. Human Emotion Recognition (HER, or Affective Computing) is an innovative method for detecting users' emotions to determine proper responses and recommendations in Human-Computer Interaction (HCI). This paper discusses an intelligent approach to recognizing human emotions using the usual input devices such as keyboard, mouse and touch screen displays. This approach is compared with other common methods such as processing facial expressions, human voice, body gestures and digital signals from Electroencephalography (EEG) machines for an emotion-aware system. The emotional intelligence system is trained in a supervised mode using Artificial Neural Network (ANN) and Support Vector Machine (SVM) techniques. The results show 93.20% accuracy, around 5% higher than existing methods. This is a significant contribution that shows new directions of future research in emotion recognition, which is useful in recommender systems.

Kaveh Bakhtiyari, Mona Taghavi, Hafizah Husain
An Item Bank Calibration Method for a Computer Adaptive Test

Computer adaptive testing is a form of educational measurement that adapts to the examinee's proficiency. Computer adaptive testing brings many benefits but requires the creation of a large, calibrated item bank. Calibrating an item bank with statistical methods is expensive and time-consuming. Therefore, in this paper we work out a simple item bank calibration method based on experts' opinions. The proposed algorithm uses Consensus Theory. The research showed that the proposed calibration procedure is efficient: as few as three experts' opinions were enough to obtain a calibrated item bank whose item parameter values, estimated by the expert-based method, were not statistically different from those estimated by a statistical calibration method. The statistical calibration method required engaging over 50 persons.

Adrianna Kozierkiewicz-Hetmańska, Rafał Poniatowski
Hybrid Approach to Web Based Systems Usability Evaluation

This paper presents a concept for a design of a new, hybrid method for web systems usability evaluation. This method will combine various elements of other, well-known usability methods, and will be enhanced with a mechanism to evaluate the use of the system, by the users, against the model of its desired use. This way it will be possible to determine the usability of a tested system, using various metrics and techniques, during a single test.

Piotr Chynał
Application of Network Analysis in Website Usability Verification

This paper describes the application of network analysis in website usability verification. In particular, we propose a new automatic usability testing method based on the analysis of network motifs. In our method, motifs are constructed according to the patterns of web pages visited by particular users, which define the relationships between pairs of users.

Piotr Chynał, Janusz Sobecki, Jerzy M. Szymański
Travel Password: A Secure and Memorable Password Scheme

There is a trade-off between password security and usability; a longer password provides higher security but can reduce usability, as it is harder to remember. To address this challenge, this paper proposes a novel password scheme, called "Travel Password", which is memorable and also secure. The proposed scheme is designed to aid human memory by using mnemonic devices, e.g., pictures and symbols, and storytelling. Mnemonic devices aid memory because humans remember pictures better than text. Storytelling, on the other hand, allows users to make connections between the parts of a password. An experiment with eighty users shows that the proposed scheme gives users better password recall: compared with traditional textual passwords, which have a recall rate of about 0.8 for strong passwords, users of the proposed scheme achieve a recall rate of 1.0. Moreover, the proposed scheme is more memorable than the traditional textual one: 90% of users can promptly remember strong passwords in the proposed scheme, compared with 58% for the textual one.

Nattawut Phetmak, Wason Liwlompaisan, Pruet Boonma
Performance Measurement of Higher Education Information System Using IT Balanced Scorecard

Extensive research was conducted at a private university in Indonesia into the performance of its higher education information system, called SIPERTI. The IT Balanced Scorecard framework, consisting of four perspectives (Corporate Contribution, User Orientation, Operational Excellence, and Future Orientation), was employed to assess the system. The study took the form of a questionnaire composed of five-point Likert scale statements, addressed to the members of the faculties and staff of the university who used SIPERTI in their everyday work. The data obtained was statistically analysed, including tests of reliability and validity. A structured interview which followed the questionnaire allowed for the formulation of recommendations on SIPERTI performance improvement.

Nunik Afriliana, Ford Lumban Gaol

Decision Support Systems

Decisional DNA Based Framework for Representing Virtual Engineering Objects

In this paper, we propose a framework to represent Virtual Engineering Objects (VEO) utilizing the Set of Experience Knowledge Structure (SOEKS) and Decisional DNA. A VEO will enable the discovery of new knowledge in a manufacturing unit and the generation of new rules that drive reasoning. The proposed VEO framework will not only be a knowledge-based representation but will also have its associated experience embedded within it. This concept will evolve and discover implicit knowledge in industrial plants, which can be beneficial for engineers and practitioners. A VEO will be a living representation of an object, capable of adding, storing, improving and sharing knowledge through experience, similar to an expert in that area.

Syed Imran Shafiq, Cesar Sanin, Edward Szczerbicki, Carlos Toro
A Data Quality Index with Respect to Case Bases within Case-Based Reasoning

Within Case-Based Reasoning (CBR), terms concerning the quality of a case base are mentioned in publications, but often without clarified criteria. When developing a CBR system from scratch, an index for case base quality supports an assessment of the actual cases. In this approach, both the theory and an application are demonstrated: an index was defined and subsequently applied within a CBR project currently under development. In addition, various approaches concerning case base quality are discussed. Big data combines high velocity, great volume and variety of incoming data; defining an index to measure case base quality helps to cope with that.

Jürgen Hönigl, Josef Küng
Agent’s Autonomy Adjustment via Situation Awareness

Interactions between autonomous agents (humans and software) are necessary to increase a system's awareness, support agents' decision-making abilities and subsequently reduce the risk of failure. The challenge, however, is to formulate a mechanism that specifies when an agent or a human should take the initiative to interact. In this paper, we propose a Situation Awareness Assessment (SAA) model of autonomy adjustment with situation awareness capabilities in a decentralized environment. The SAA model systematizes the interactions to assist agents' decision-making and improve the system's performance. The model checks whether an agent has the required awareness of a situation to be satisfactorily autonomous; otherwise, it intervenes, providing the agent with feedback about the situation. An example scenario demonstrates the SAA model's ability to facilitate improved autonomous behavior in agents.

Salama A. Mostafa, Mohd Sharifuddin Ahmad, Alicia Y. C. Tang, Azhana Ahmad, Muthukkaruppan Annamalai, Aida Mustapha
The Development of a Decision Support Model for the Problem of Berths Allocation in Containers Terminal Using a Hybrid of Genetic Algorithm and Simulated Annealing

The berth allocation problem (BAP) aims to allocate space along the waterfront to incoming ships in a container terminal so as to minimize an objective function. In this paper, we propose a multi-objective decision support model for the problem of assigning incoming ships to quays in a container terminal. The model we propose seeks an assignment that simultaneously minimizes the time spent by vessels in the port and the distances traveled by import/export containers. We propose a mathematical model that achieves these goals while respecting the imposed constraints. The model is solved using a hybrid of a genetic algorithm and simulated annealing. Computational results are presented in this article.

Zeinebou Zoubeir, Abdellatif Benabdelhafid
Sensitivity Analysis of a Priori Power Indices

Power index analysis is very important in all group decision-making processes. There is no better way to evaluate a decision maker's power to act than an a priori power index. A formal analysis of the sensitivity of such indices is presented in the paper. A constant-power partition of the weight allocation space with respect to a given quota is defined. Responses of the measures of power to changes in the allocation of weights, the quota and/or the number of voters are considered. The proposed methods are the basis for algorithms of sensitivity analysis of a priori power indices.
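The abstract treats power indices abstractly; as one concrete a priori index (our choice for illustration, not necessarily the index analyzed in the paper), the normalized Banzhaf index of a small weighted voting game can be computed by brute force:

```python
from itertools import combinations

def banzhaf(weights, quota):
    """Normalized Banzhaf power index for a weighted voting game.

    A voter i is a swing in a winning coalition S if S meets the quota
    with i but falls below it without i. Brute force over all
    coalitions, so only suitable for small games.
    """
    n = len(weights)
    swings = [0] * n
    for r in range(n + 1):
        for coal in combinations(range(n), r):
            total = sum(weights[i] for i in coal)
            if total >= quota:
                for i in coal:
                    if total - weights[i] < quota:
                        swings[i] += 1
    s = sum(swings)
    return [sw / s for sw in swings]
```

Sensitivity can then be probed numerically by perturbing the weights, the quota or the number of voters and recomputing the index.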

František Turnovec, Jacek Mercik
Artificial Neural Network Based Prediction Model of the Sliding Mode Control in Coordinating Two Robot Manipulators

The design of a decentralized control law for the coordinated transportation of an object by multiple robot manipulators employing implicit communication between them is a specific alternative in synchronization problems. A decentralized controller is presented in this work which is a combination of sliding mode control and an artificial neural network, guaranteeing robustness in the system. In this controller, implicit communication among the robot manipulators is based on the angle of the lightweight beam. A multi-layer feed-forward neural network based prediction model is presented not only to improve trajectory tracking of multiple robots but also to solve the chattering phenomenon in sliding mode control. The simulation results show the effectiveness of the proposed controller on two cooperative PUMA 560 robot manipulators.

Parvaneh Esmaili, Habibollah Haron
Comparison of Reproduction Schemes in Spatial Evolutionary Game Theoretic Model of Bystander Effect

We compare the results of different reproduction schemes used in modelling the radiation-induced bystander effect in normal cells. The model is based on the theory of spatial evolutionary games on a lattice, and pay-offs are defined by changes in a fitness measure resulting from interactions between cells representing different phenotypes. We also discuss qualitative differences between the results of simulations of spatial evolutionary games and the steady-state solutions of the replicator dynamics equations for the relevant mean field games.
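For readers unfamiliar with the mean-field counterpart mentioned above, a single Euler step of the replicator dynamics for phenotype frequencies p and pay-off matrix A can be sketched as follows (the pay-off values below are purely illustrative, not the paper's bystander-effect pay-offs):

```python
def replicator_step(p, A, dt=0.01):
    """One Euler step of replicator dynamics: dp_i/dt = p_i (f_i - f_avg)."""
    n = len(p)
    # Fitness of each phenotype against the current population mix.
    fitness = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
    avg = sum(p[i] * fitness[i] for i in range(n))  # mean population fitness
    return [p[i] + dt * p[i] * (fitness[i] - avg) for i in range(n)]
```

Iterating this map toward a fixed point gives the steady-state frequencies that spatial lattice simulations can be compared against.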

Andrzej Świerniak, Michał Krześlak
Ant Colony Optimization Algorithm for Solving the Provider - Modified Traveling Salesman Problem

The paper concerns a newly introduced and defined problem which we call the Provider. The problem comes from practice and can be treated as a modified version of the Travelling Salesman Problem. For solving the problem, an algorithm (called ACO) based on ant colony optimization ideas has been created. The properties of the algorithm were tested using the designed and implemented experimentation system. The effectiveness of the algorithm was evaluated and compared, on the basis of simulation experiments, to reference results given by an implemented Random Optimization algorithm (called RO). The reported investigations show that the ACO algorithm is very effective for solving the considered problem. Moreover, the ACO algorithm can be recommended for solving other transportation problems.

Krzysztof Baranowski, Leszek Koszałka, Iwona Poźniak-Koszałka, Andrzej Kasprzak
Controlling Quality of Water-Level Data in Thailand

Climate change has increased the number of occurrences of extreme events around the world. Warning and monitoring systems are very important for reducing the damage caused by disasters. The performance of a warning system relies heavily on the quality of data from the automated telemetry system (ATS) and the accuracy of the predicting system. Traditional quality management systems cannot discover complicated cases, such as outliers, missing patterns, and inhomogeneity. This paper proposes novel procedures to handle these complex issues in hydrological data, focusing on water level. In the proposed system, DBSCAN, a clustering algorithm, is applied to discover outliers and missing patterns. The experimental results show that the system outperforms a statistical criterion, mean ± n × SD, where n is a constant. Also, all missing patterns can be perfectly discovered by our approach. For the inhomogeneity problem, several statistical approaches are compared. The comparison results suggest that the best homogenization tool is changepoint, a method based on the F-test.
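The statistical baseline mentioned above flags a measurement as an outlier when it lies outside mean ± n × SD. A minimal sketch of that criterion (the data and the choice of n below are illustrative assumptions):

```python
from statistics import mean, stdev

def sd_outliers(values, n=3):
    """Return indices of values lying outside mean ± n * SD.

    This is the classical statistical baseline; n is a tunable constant.
    """
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return [i for i, v in enumerate(values) if abs(v - m) > n * s]
```

Density-based methods such as DBSCAN can outperform this criterion because they also capture local structure, e.g. short runs of stuck sensor values, that a single global threshold misses.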

Pattarasai Markpeng, Piraya Wongnimmarn, Nattarat Champreeda, Peerapon Vateekul, Kanoksri Sarinnapakorn
Application of Nonlinear State Estimation Methods for Sport Training Support

The typical understanding of healthcare concerns the treatment, diagnosis and monitoring of diseases. But healthcare also includes well-being, a healthy lifestyle, and maintaining good body condition. One of the most important factors in this respect is physical activity. Modern techniques of data acquisition and data processing enable the development of advanced systems for physical activity support based on measurement data. The need for reliable estimation routines stems from the fact that many widely available consumer measurement devices are not reliable and the measured signals are contaminated by noise. One of the most important variables for physical activity monitoring is the velocity of a moving object (e.g. the velocity of selected parts of the body such as the elbows). Apart from intensive use of system identification, optimization and control techniques for physical training support, we applied the Kalman filtering technique to estimate the speed of a moving part of the body.
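The abstract does not give the authors' filter design; as a minimal illustrative sketch (the state model, noise parameters and step size are all our assumptions), a constant-velocity Kalman filter can recover velocity from noisy position samples:

```python
def kalman_velocity(positions, dt=1.0, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter over 1-D position measurements.

    State x = [position, velocity]; only position is measured.
    Returns the velocity estimate after the last measurement.
    q: process noise intensity, r: measurement noise variance.
    """
    x = [positions[0], 0.0]            # state estimate
    P = [[1.0, 0.0], [0.0, 1.0]]       # state covariance
    for z in positions[1:]:
        # Predict: x <- F x, P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        x = [x[0] + dt * x[1], x[1]]
        P = [[P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q,
              P[0][1] + dt * P[1][1]],
             [P[1][0] + dt * P[1][1],
              P[1][1] + q]]
        # Update with measurement z of the position component (H = [1, 0]).
        S = P[0][0] + r                 # innovation variance
        K = [P[0][0] / S, P[1][0] / S]  # Kalman gain
        y = z - x[0]                    # innovation
        x = [x[0] + K[0] * y, x[1] + K[1] * y]
        P = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
             [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
    return x[1]
```

With real sensor data, the process and measurement noise parameters q and r would be tuned to the device at hand.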

Krzysztof Brzostowski, Jarosław Drapała, Jerzy Świątek

Computer Vision Techniques

Multiple Object Tracking Based on a Hierarchical Clustering of Features Approach

One challenge in object tracking is to develop algorithms for the automated detection and tracking of multiple objects in real-time video sequences. In this paper, we propose a new method for multiple object tracking based on hierarchical clustering of features. First, the Shi-Tomasi corner detection method is employed to extract feature points from the objects of interest, and a hierarchical clustering approach is then applied to cluster them into feature blocks. These feature blocks are used to track the objects frame by frame. Experimental results show that the proposed method is highly effective in detecting and tracking multiple objects in real-time video sequences.
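The paper's exact clustering procedure is not specified in the abstract; a toy single-linkage agglomerative clustering of 2-D feature points (the linkage, distance and threshold choices are assumptions) conveys the flavour of grouping corner features into blocks:

```python
def single_linkage(points, threshold):
    """Greedy single-linkage agglomerative clustering of 2-D points.

    Repeatedly merges the two closest clusters until the smallest
    inter-cluster distance exceeds the threshold.
    """
    clusters = [[p] for p in points]

    def dist2(a, b):
        # Squared Euclidean distance between two points.
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist2(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold ** 2:
            break
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters
```

In a tracker, each resulting cluster of corner points would become one feature block followed from frame to frame.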

Supannee Tanathong, Anan Banharnsakun
A Copy Move Forgery Detection to Overcome Sustained Attacks Using Dyadic Wavelet Transform and SIFT Methods

In the present digital world, the integrity and trustworthiness of digital images is an important issue, and copy-move forgery is the most common way to tamper with digital images. As a solution to this problem, this paper proposes a unique, blind method for detecting copy-move forgery using the dyadic wavelet transform (DyWT) in combination with the scale invariant feature transform (SIFT). First, we apply DyWT to a given test image to decompose it into four sub-bands: LL, LH, HL and HH. Since the LL band contains most of the information, we apply SIFT on the LL part only to extract key features and obtain descriptor vectors, and then find similarities between descriptor vectors to decide whether copy-move tampering has been applied to the image. We also present a comparative study of three methods: (a) DyWT, (b) DWT and SIFT, and (c) DyWT and SIFT. Since DyWT is shift invariant, whereas the discrete wavelet transform (DWT) is not, DyWT is more accurate in the analysis of data. We show that by using DyWT with SIFT we are able to extract a larger number of matched key points and thus detect copy-move forgery more efficiently.

Vijay Anand, Mohammad Farukh Hashmi, Avinash G. Keskar
Methods for Vanishing Point Estimation by Intersection of Curves from Omnidirectional Image

In this paper, the authors propose solutions for finding the vanishing point (VP) in real time based on Random Sample Consensus (RANSAC) curve fitting and density-based spatial clustering of applications with noise (DBSCAN). First, the longest line segments are extracted from the edge frame. Second, a RANSAC curve fitting method is applied to find the best-fitting curve for the set of points of each line segment. Third, the intersection points of each pair of curves are extracted. Finally, the DBSCAN method is used to estimate the VP. Preliminary results were gathered and tested on a group of consecutive frames captured at Nam-gu, Ulsan, in South Korea. These specific methods were chosen to demonstrate their effectiveness.

Danilo Cáceres Hernández, Van-Dung Hoang, Kang-Hyun Jo
Human Detection from Mobile Omnidirectional Camera Using Ego-Motion Compensated

This paper presents a human detection method using optical flows in images obtained from an omnidirectional camera mounted on a mobile robot. Human regions are detected from the mobile omnidirectional camera in several steps. First, moving objects are detected using frame differencing; ego-motion compensation is then applied in order to deal with noise caused by the moving camera. In this step, the image is divided into grid windows and an affine transform is computed for each window. The human shape is detected as a moving object from the transformation-compensated background using the local affine transformation of each window. Second, to decide whether a region contains a human, vertical histogram projection is applied with a specific threshold. The experimental results show that the proposed method achieves results comparable with similar methods, with an 87.4% detection rate and less than 10% false positive detections.

Joko Hariyono, Van-Dung Hoang, Kang-Hyun Jo
Simple and Efficient Method for Calibration of a Camera and 2D Laser Rangefinder

In the last few years, the integration of cameras and laser rangefinders has been applied in much robotics research, notably autonomous navigation vehicles and intelligent transportation systems. Systems based on multiple devices usually require the relative pose of the devices for processing; therefore, the calibration of a camera and a laser device is a very important task. This paper presents a calibration method for determining the relative position and direction of a camera with respect to a laser rangefinder. The calibration method makes use of depth discontinuities of the calibration pattern, which emphasize the laser beams, to automatically estimate the positions where the laser scans hit the calibration pattern. The laser range scans are also used for estimating the corresponding 3D image points in camera coordinates. Finally, the relative parameters between the camera and the laser device are recovered from these corresponding 3D points.

Van-Dung Hoang, Danilo Cáceres Hernández, Kang-Hyun Jo
Iris Image Quality Assessment Based on Quality Parameters

Iris biometrics for personal identification is based on capturing an eye image and obtaining features that help in identifying a human being. However, captured images may not be of good quality for a variety of reasons, e.g., occlusion or blur. It is therefore important to assess image quality before applying a feature extraction algorithm in order to avoid insufficient results. In this paper, iris quality assessment research is extended by analysing the effect of the entropy, mean intensity, area ratio, occlusion, blur, dilation and sharpness of an iris image. First, each parameter is estimated individually; the parameters are then fused to obtain a quality score. A fusion method based on principal component analysis (PCA) is proposed to determine whether an image is good or not. To test the proposed technique, the Chinese Academy of Sciences Institute of Automation (CASIA), Internal Iris Database (IID) and University of Beira Interior (UBIRIS) databases are used.

Sisanda Makinana, Tendani Malumedzha, Fulufhelo V. Nelwamondo
Contextual Labeling 3D Point Clouds with Conditional Random Fields

In this paper we present a new approach for labeling 3D point clouds. We use Conditional Random Fields (CRFs) as an objective function, with a unary energy term assessing the consistency of points with labels, and a pairwise energy term between points and their neighbors. We propose a new method to learn this function from a collection of training labels using the JointBoost classifier formalism. By using CRFs with different geometric and contextual features, we show that our method enables the combination of semantic relations and achieves higher accuracy. We validate and demonstrate the efficiency of our method on complex urban laser scans and compare it with several alternative approaches.

Anh Nguyen, Bac Le
Categorization of Sports Video Shots and Scenes in TV Sports News Based on Ball Detection

Content-based indexing of TV sports news is based on the automatic temporal segmentation, recognition, and then classification of player shots and scenes reporting sports events in different disciplines. Automatic categorization of sports in TV sports news is a basic process in video indexing. Many strategies for recognizing a sports discipline have been proposed. It may be achieved through analyses of player scenes leading to the detection of playing fields, of superimposed text such as player or team names, identification of player faces, detection of lines typical for a given playing field and a given sports discipline, recognition of player and audience emotions, and also detection of sports objects specific to a given sports category. The paper examines the usefulness of ball and ball colour detection for the categorization of sports video shots and scenes in TV sports news. This approach has been verified and its efficiency analyzed in the Automatic Video Indexer AVI.

Kazimierz Choroś
A Coral Mapping and Health Assessment System Based on Texture Analysis

Corals have long played an important role in the environment, with reefs hosting over four thousand species of marine animals all over the world. However, corals are very delicate, in that slight unfavorable changes in environmental conditions cause them harm. With extreme changes in the environment happening ever more frequently, there is a need to monitor corals efficiently for environmental efforts to keep pace. Manual monitoring of corals is expensive, tedious, and time-consuming. In this paper, an information system called the Coral Mapping and Health Assessment System (CMHAS) is proposed. The system aims to provide a digital repository of information that includes videos and images of corals, which are analyzed using image processing algorithms based on texture features to assess the health status of coral reefs. Evaluation on more than a hundred coral images shows promising results, with recognition rates as high as 82%.

Prospero C. Naval Jr., Maricor Soriano, Bianca Camille Esmero, Zorina Maika Abad
Navigation Management for Non-linear Interactive Video in Collaborative Video Annotation

This paper proposes a method that enables the use of shared interactive videos to promote collaborative environments for authoring nonlinear video. The proposed method addresses a problem in applying nested nonlinear flows in shared interactive videos. The system enables authors to collaborate using existing interactive videos and allows them full control in applying a nonlinear flow on top of each video. The security and policy issues regarding this full control are solved by displaying the nonlinear flow on an additional layer; we separate the video from the navigational elements to maintain the originality of the reused video source. The strength of the system is its collaborative approach to authoring nonlinear interactive videos. Hence, it helps authors to reduce and distribute the authoring workload by reusing existing interactive videos and enabling authors' collaboration, respectively.

Ivan Ariesthea Supandi, Kee-Sung Lee, Ahmad Nurzid Rosli, Geun-Sik Jo
Backmatter
Metadata
Title
Intelligent Information and Database Systems
Editors
Ngoc Thanh Nguyen
Boonwat Attachoo
Bogdan Trawiński
Kulwadee Somboonviwat
Copyright Year
2014
Publisher
Springer International Publishing
Electronic ISBN
978-3-319-05476-6
Print ISBN
978-3-319-05475-9
DOI
https://doi.org/10.1007/978-3-319-05476-6