
2016 | Book | 1st edition

Intelligent Information and Database Systems

8th Asian Conference, ACIIDS 2016, Da Nang, Vietnam, March 14–16, 2016, Proceedings, Part I

Editors: Ngoc Thanh Nguyen, Bogdan Trawiński, Hamido Fujita, Tzung-Pei Hong

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this book

The two-volume proceedings of the ACIIDS 2016 conference, LNAI 9621 + 9622, constitute the refereed proceedings of the 8th Asian Conference on Intelligent Information and Database Systems, held in Da Nang, Vietnam, in March 2016. The 153 full papers accepted for publication in these proceedings were carefully reviewed and selected from 392 submissions.

They were organized in topical sections named: knowledge engineering and semantic Web; social networks and recommender systems; text processing and information retrieval; database systems and software engineering; intelligent information systems; decision support and control systems; machine learning and data mining; computer vision techniques; intelligent big data exploitation; cloud and network computing; multiple model approach to machine learning; advanced data mining techniques and applications; computational intelligence in data mining for complex problems; collective intelligence for service innovation, technology opportunity, e-learning, and fuzzy intelligent systems; analysis for image, video and motion data in life sciences; real world applications in engineering and technology; ontology-based software development; intelligent and context systems; modeling and optimization techniques in information systems, database systems and industrial systems; smart pattern processing for sports; and intelligent services for smart cities.

Table of Contents

Frontmatter

Knowledge Engineering and Semantic Web

Frontmatter
A Novel Approach to Multimedia Ontology Engineering for Automated Reasoning over Audiovisual LOD Datasets

Multimedia reasoning, which is suitable for, among others, multimedia content analysis and high-level video scene interpretation, relies on the formal and comprehensive conceptualization of the represented knowledge domain. However, most multimedia ontologies are not exhaustive in terms of role definitions, and do not incorporate complex role inclusions and role interdependencies. In fact, most multimedia ontologies do not have a role box at all, and implement only a basic subset of the available logical constructors. Consequently, their application in multimedia reasoning is limited. To address the above issues, VidOnt, the very first multimedia ontology with $\mathcal{SROIQ}^{(\mathcal{D})}$ expressivity and a DL-safe ruleset has been introduced for next-generation multimedia reasoning. In contrast to the common practice, the formal grounding has been set in one of the most expressive description logics, and the ontology validated with industry-leading reasoners, namely HermiT and FaCT++. This paper also presents best practices for developing multimedia ontologies, based on my ontology engineering approach.

Leslie F. Sikos
Finding Similar Clothes Based on Semantic Description for the Purpose of Fashion Recommender System

The fashion domain has been one of the fastest growing areas of e-commerce, hence facilitating clothes search on fashion-related websites has become an important research topic. The paper deals with measuring the similarity between items of clothing and between complete outfits, based on semantic descriptions prepared by users and experts according to a previously developed fashion ontology. The proposed approach deals with different types of attributes describing clothes and allows for calculating similarity between whole outfits in a domain-aware manner. Exemplary results of experiments performed on real clothing datasets are presented.

Dariusz Frejlichowski, Piotr Czapiewski, Radosław Hofman
An Influence Analysis of the Inconsistency Degree on the Quality of Collective Knowledge for Objective Case

In collective knowledge determination, the objective case is the case in which the real knowledge state of a subject in the real world exists independently of the knowledge states given by autonomous units. By inconsistency we mean conflicts between the knowledge states within a collective. The quality of collective knowledge is measured by the distance from the collective knowledge to the real knowledge state. In this work we investigate the influence of the inconsistency degree of a collective on the quality of collective knowledge as the number of collective members increases. Based on Euclidean space, some criteria for adding members to a collective and for simulating the real knowledge state of a subject in the real world are proposed. Experimental analysis shows that adding members to decrease the inconsistency degree of a collective does not always improve the quality of collective knowledge. Instead, the quality of collective knowledge tends to improve when the added members are closer to the real knowledge state.

Van Du Nguyen, Ngoc Thanh Nguyen
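The setting described in the abstract above can be sketched in a few lines. The centroid-based collective knowledge, the average-pairwise-distance inconsistency measure and the sample data below are illustrative choices consistent with the abstract, not the paper's exact definitions:

```python
import math

def centroid(states):
    """Collective knowledge as the centroid of the members' knowledge states."""
    dim = len(states[0])
    return [sum(s[i] for s in states) / len(states) for i in range(dim)]

def distance(a, b):
    """Euclidean distance between two knowledge states."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inconsistency(states):
    """Inconsistency degree: average pairwise distance between members' states."""
    pairs = [(i, j) for i in range(len(states)) for j in range(i + 1, len(states))]
    return sum(distance(states[i], states[j]) for i, j in pairs) / len(pairs)

def quality(states, real_state):
    """Quality of collective knowledge: distance from the collective
    knowledge (centroid) to the real knowledge state; smaller is better."""
    return distance(centroid(states), real_state)

real = [0.0, 0.0]
collective = [[1.0, 1.0], [-1.0, 1.0]]
# Adding a member between the existing ones lowers inconsistency but leaves
# the quality unchanged here, while a member near the real state improves it.
tight = collective + [[0.0, 1.0]]
close = collective + [[0.0, 0.0]]
```

This mirrors the abstract's conclusion: lowering inconsistency alone need not help, whereas members close to the real state do.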
Knowledge Base Refinement with Gamified Crowdsourcing

This paper discusses a gamification design for knowledge base refinement. The maintenance of a knowledge base involves human intervention and is one major application domain of human computation. Using the concept of crowdsourcing, refinement tasks can be delegated to many casual users over a network. In addition, gamification such as games with a purpose (GWAP) is a useful idea to motivate workers in crowdsourcing by making a task into a playful game. For effective gamification, designing the game rules is critical. In this paper, we present a model for simulating the gamified knowledge base refinement process to estimate the effects of different game rule designs beforehand.

Daiki Kurita, Boonsita Roengsamut, Kazuhiro Kuwabara, Hung-Hsuan Huang
Argumentation Framework for Merging Stratified Belief Bases

This paper introduces a new approach for belief merging using argumentation techniques. The key idea is to organize each belief merging process as a game in which participating agents use argumentation to debate their own belief bases and achieve consensus, i.e. a common belief base. To this end, we introduce a framework for belief merging by argumentation in which an argumentation-based belief merging protocol is proposed and a set of intuitive and rational postulates characterizing the merging results is introduced. Several logical properties of the family of argumentation-based belief merging operators are also pointed out and discussed.

Trong Hieu Tran, Thi Hong Khanh Nguyen, Quang Thuy Ha, Ngoc Trinh Vu
An Ontology-Based Knowledge Representation of MCDA Methods

Multiple-criteria decision analysis (MCDA) methods are widely used as tools supporting decision problems. The article presents a taxonomy of the methods which takes their most essential characteristics into consideration. In the conceptualization process, this taxonomy was written by means of description logic and then implemented in the OWL language in the form of an ontology representing field knowledge in the scope of MCDA methods. The research also covers verification of the ontology with the use of competency questions.

Jarosław Wątróbski, Jarosław Jankowski
Preliminary Evaluation of Multilevel Ontology Integration on the Concept Level

In many real situations it is not possible to merge multiple knowledge bases into a single one using one-level integration. This can be caused, for example, by the high complexity of the integration process or by the geographical distance between servers hosting the knowledge bases that are expected to be integrated. Parallelizing the integration process can solve this problem, and in this paper we propose a multi-level ontology integration procedure. Analytical analysis showed that for the presented algorithm the one- and multi-level integration processes give the same results (the same final ontology). However, multi-level integration saves data processing time. The experimental research demonstrated a significant difference between the times required for the one- and multi-level integration procedures: the latter can be even 20% faster than the former, which is important especially in the emerging context of Big Data. Due to limited space we only consider integration on the concept level.

Adrianna Kozierkiewicz-Hetmańska, Marcin Pietranik
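As a toy illustration of why one- and multi-level integration can yield the same final ontology, consider a merge operator that is associative and commutative; here plain set union over concept names and the fan-in parameter are assumptions for the sketch, not the paper's algorithm:

```python
def integrate(ontologies):
    """One-level integration: merge all concept sets at once.
    Illustrative union-based merge on the concept level."""
    return set().union(*ontologies)

def integrate_multilevel(ontologies, fan_in=2):
    """Multi-level integration: merge in groups of `fan_in` (which could run
    in parallel), then merge the partial results, until one ontology remains."""
    level = list(ontologies)
    while len(level) > 1:
        level = [integrate(level[i:i + fan_in])
                 for i in range(0, len(level), fan_in)]
    return level[0]

# Five toy concept-level ontologies to be integrated.
onts = [{"car"}, {"car", "bike"}, {"tree"}, {"road", "car"}, {"bike"}]
```

Because union is associative, both procedures produce the same result; the time savings in the paper come from processing the groups of each level concurrently.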
Temporal Ontology Representation and Reasoning Using Ordinals and Sets for Historical Events

In question-and-answer (QA) systems, various queries need to be processed. In particular, those queries such as temporal information require complex query generation processes. This paper proposes a temporal representation model that can support qualitative and quantitative temporal information on historical ontology by applying the concept of ordinals and sets and introduces operators that allow a QA system to easily handle complex temporal queries. To verify the effectiveness of the proposed model and operators, historical scenarios are presented to show that they can effectively handle complex temporal queries.

Myung-Duk Hong, Kyeong-Jin Oh, Seung-Hyun Go, Geun-Sik Jo
Measuring Propagation Phenomena in Social Networks: Promising Directions and Open Issues

Massive amounts of information now spread like wildfire in social media. As the usage of social data has increased, abuse of the media to spread distorted data has increased several times over. To understand and predict the spread of information over time in online social networks, researchers attempt to quantitatively model and measure the whole process. A number of different statistics aimed at measuring the spread have been suggested. Many researchers have coupled these measures with various forgetting-factor mechanisms to improve their behavioural properties. Unfortunately, the frequent unavailability of the full data record in social media prevents straightforward validation of such quantities. Moreover, since most known measures have global effects, they are rather inconvenient to evaluate for large networks. These difficulties lead us to contribute here a methodological identification of the propagation parameters to start afresh. The approach hinges on some recent results arising from the convergence between threshold models and cascade models. For example, three key concepts – distance, centrality and robustness – are successfully balanced by the proposed scope–speed–failures relationship. We conclude by identifying several open issues and possible directions for future research.

Dariusz Król

Social Networks and Recommender Systems

Frontmatter
Visualizing Learning Activities in Social Network

The proliferation of social networks and their use in education and higher learning has become an interesting topic. This approach has become attractive as a way to accommodate diverse student cohorts in the age of massification of the education market. Recent advances in learning analytics and data visualization have further proved useful in encouraging collaborative learning. On the other hand, social network information can be too complex to visualize, often overloading users with too much possibly unwanted information. The question is what to show and what not to show when we visualize the social relationships of users. In this paper, we propose a new visualization model called learning space and develop a method for visualizing learning activities on a social network in a 3D virtual learning space. We evaluate the method using questionnaires to see whether visualization of social relations of learning improves learning in the following ways: making it more fun, making it easier, and motivating more. The results show that our method of visualizing learning activities in a social network made learning more fun and easier, and that it helps students engage with and stay motivated on the subjects.

Thi Hoang Yen Ho, Thanh Tam Nguyen, Insu Song
A Mobility Prediction Model for Location-Based Social Networks

Mobility prediction plays an important role in many fields. For example, tourist companies would like to know the characteristics of their customers' movements so that they can design appropriate advertising strategies; sociologists have carried out much research on migration to try to find general features of human mobility; police also analyze human movement behaviors to seek criminals. Thus, for location-based social networks, mobility prediction is an important task. This study proposes a mobility prediction model which can be used to predict user (human) mobility. The proposed approach builds on three characteristics: (1) regular movement in human mobility, (2) the influence of relationships on social networks, and (3) other features (in this work, we consider "hot regions" that attract more visitors). To validate the proposed approach, three datasets including over 500,000 check-ins, collected from two location-based social networks, namely Brightkite and Gowalla, are used for the experiments. Results show that the proposed model significantly improves prediction accuracy; thus, this approach is promising for mobility prediction, especially for location-based social networks.

Nguyen Thanh Hai, Huu-Hoa Nguyen, Nguyen Thai-Nghe
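A minimal sketch of combining the three signals named in the abstract (regularity, social influence, hot regions) into one score per candidate region. The linear combination, the weights and the sample data are illustrative assumptions, not the paper's model:

```python
def predict_next(candidates, history, friends_checkins, hot_regions,
                 w_regular=0.5, w_social=0.3, w_hot=0.2):
    """Pick the candidate region with the highest combined score.
    history: the user's own past check-ins (captures regular movement);
    friends_checkins: check-ins of socially related users;
    hot_regions: regions that attract many visitors overall.
    The weights are hypothetical and would be fitted on real data."""
    def score(region):
        regular = history.count(region) / len(history) if history else 0.0
        social = (friends_checkins.count(region) / len(friends_checkins)
                  if friends_checkins else 0.0)
        hot = 1.0 if region in hot_regions else 0.0
        return w_regular * regular + w_social * social + w_hot * hot
    return max(candidates, key=score)

# A user who checks in at home most of the time stays predictable even
# though the mall is a "hot region".
best = predict_next(["home", "mall"],
                    history=["home"] * 8 + ["mall"] * 2,
                    friends_checkins=["home"],
                    hot_regions={"mall"})
```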
Empirical Analysis of the Relationship Between Trust and Ratings in Recommender Systems

User-based collaborative filtering (CF) is a widely used recommendation method that suggests items to users based on ratings of other users in the system. The performance of user-based CF can be degraded due to its inherent weaknesses, such as data sparsity and cold start problems. To address these weaknesses, many researchers have proposed to incorporate trust information into user-based CF. However, as reported in many recent works on trust aware recommendation, effectively exploiting trust in recommendation is not straightforward due to insufficient understanding of the relationship between trust and ratings. This paper empirically analyses real-world ratings data and their associated trust networks. Specifically, we focus our analysis on comparative characteristics of cold users vs. non-cold users. Our results show that the characteristics of cold users and non-cold users are significantly different.

Kulwadee Somboonviwat, Hisayuki Aoyama
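The user-based CF scheme the abstract builds on can be sketched as a weighted average over neighbours' ratings; in trust-aware variants the weight function is a trust value instead of a rating similarity. The data and the uniform weight below are illustrative assumptions:

```python
def predict(target, item, ratings, weight):
    """Predict target's rating for item as a weighted average of the other
    users' ratings. `weight(target, u)` may be a rating-based similarity or,
    in trust-aware CF, a trust value from the trust network."""
    neighbours = [(u, r[item]) for u, r in ratings.items()
                  if u != target and item in r]
    num = sum(weight(target, u) * r for u, r in neighbours)
    den = sum(abs(weight(target, u)) for u, _ in neighbours)
    return num / den if den else None

ratings = {"alice": {"i1": 5.0}, "bob": {"i1": 3.0}, "carol": {}}
# carol is a cold user: she has no ratings of her own, so similarity cannot be
# computed from ratings, but a trust network could still supply the weights.
```

This is why the abstract's cold-user analysis matters: for users like carol, the weight function must come from somewhere other than co-rated items.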

Text Processing and Information Retrieval

Frontmatter
Integrated Feature Selection Methods Using Metaheuristic Algorithms for Sentiment Analysis

In text mining, the feature selection process can potentially improve classification accuracy by reducing the high-dimensional feature space to a low-dimensional one, resulting in an optimal subset of available features. In this paper, a hybrid method and two meta-heuristic algorithms are employed to find an optimal feature subset. The feature selection task is performed in two steps: first, different feature subsets (called local solutions) are obtained using hybrid filter and wrapper approaches to reduce the high-dimensional feature space; second, the local solutions are integrated using two meta-heuristic algorithms (namely, the harmony search algorithm and the genetic algorithm) in order to find an optimal feature subset. The results of a wide range of comparative experiments on three widely used datasets in sentiment analysis show that the proposed method for feature selection outperforms other baseline methods in terms of accuracy.

Alireza Yousefpour, Roliana Ibrahim, Haza Nuzly Abdul Hamed, Takeru Yokoi
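The second step above, integrating local solutions with a genetic algorithm, can be sketched with feature subsets as bitmasks. The fitness function, mutation rate and sample relevance scores are toy assumptions; the paper's harmony search variant is not shown:

```python
import random

random.seed(0)  # deterministic toy run

def fitness(mask, relevance):
    """Toy fitness: total relevance of selected features minus a size penalty.
    A real system would use classifier accuracy here."""
    return sum(r for m, r in zip(mask, relevance) if m) - 0.1 * sum(mask)

def crossover(a, b):
    """Single-point crossover of two feature bitmasks."""
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(mask, rate=0.1):
    """Flip each bit with probability `rate`."""
    return [1 - m if random.random() < rate else m for m in mask]

def genetic_feature_selection(local_solutions, relevance, generations=50):
    """Integrate local feature subsets (bitmasks) into one subset with a
    basic elitist GA: keep the best half, fill the rest with offspring."""
    pop = [list(s) for s in local_solutions]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, relevance), reverse=True)
        parents = pop[: max(2, len(pop) // 2)]
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(len(pop) - len(parents))]
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, relevance))
```

Because the best individuals survive each generation, the integrated subset is never worse than the best local solution under this fitness.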
Big Data in Contemporary Linguistic Research. In Search of Optimum Methods for Language Chronologization

The paper concerns the theoretical and practical problems of analysing the mass of linguistic data which has arisen in conjunction with developments in many fields of life. Moreover, the universe of texts is growing every day – both forwards and backwards. Forwards, because every new article, book, blog, e-mail or text message expands the set of existing texts; and backwards, because the same set is also expanded whenever another historical text is scanned. Our knowledge about past times is growing by leaps and bounds. We are therefore particularly interested in the analysis of historical texts that can be carried out in the second decade of the 21st century.

Piotr Wierzchoń
Improving Twitter Aspect-Based Sentiment Analysis Using Hybrid Approach

Twitter sentiment analysis has emerged and become an interesting topic in many fields that involve social networks. Previous research has treated the problem as a tweet-level classification task that only determines the general sentiment of a tweet. This paper proposes a hybrid approach to analyzing aspect-based sentiment in tweets. We conducted several experiments to identify explicit and implicit aspects, which is crucial for aspect-based sentiment analysis. A hybrid approach combining association rule mining, dependency parsing and SentiWordNet is applied to solve this aspect-based sentiment analysis problem. The performance is evaluated on a hate crime domain dataset and other benchmark datasets, and the findings can be used to improve the accuracy of aspect-based sentiment classification.

Nurulhuda Zainuddin, Ali Selamat, Roliana Ibrahim
Design of a Yoruba Language Speech Corpus for the Purposes of Text-to-Speech (TTS) Synthesis

This paper deals with the design of a speech corpus for a corpus-based Text-to-Speech (TTS) synthesis approach. The purposes are, first, to provide enough speech to develop a Yoruba corpus-based TTS system and, second, to provide a simple methodology for corpus design in other languages. The paper focuses on text analysis, selection of reliable sentences, selection of the reader, and sentence recording. The analysis is performed to ensure a good balance of the corpus. In total, 2,415 sentences are gathered (essentially affirmative sentences). These sentences were read by a Yoruba-language journalist who is a native speaker of the language. There is one speaker for the whole corpus.

Théophile K. Dagba, John O. R. Aoga, Codjo C. Fanou
Named Entity Recognition for Vietnamese Spoken Texts and Its Application in Smart Mobile Voice Interaction

Named entity recognition (NER) for written documents has been studied intensively during the past decades. However, NER for spoken texts is still at its early stage. There are several challenges behind this: spoken texts are usually less grammatical, all in lowercase, and may even have no punctuation marks; continuous text chunks like email addresses and hyperlinks are interpreted as discrete tokens; and numeric texts are sometimes interpreted in alphabetic forms. These characteristics are real obstacles for spoken text understanding. In this paper, we propose a lightweight machine learning model for NER in Vietnamese spoken texts that aims to overcome those problems. We incorporated into the model a variety of rich features, including sophisticated regular expressions and various look-up dictionaries, to make it robust. Unlike previous work on NER, our model does not need to rely on word boundary and part-of-speech information, which are expensive and time-consuming to prepare. We conducted a careful evaluation on a medium-sized dataset about mobile voice interaction and achieved an average $F_1$ of 92.06. This is a significant result for such a difficult task. In addition, we kept our model compact and fast in order to integrate it into a mobile virtual assistant for Vietnamese.

Phuong-Nam Tran, Van-Duc Ta, Quoc-Tuan Truong, Quang-Vu Duong, Thac-Thong Nguyen, Xuan-Hieu Phan
Explorations of Prosody in Vietnamese Language

In this paper, we attempt to analyze the intonation of Vietnamese prosody in order to produce a natural Vietnamese synthesizer. We then go further into advanced research on intonation (intensity, altitude and duration). We study the pitch change between words in a phrase, the "swallowing sound" phenomenon at the junction of two words, the simulation of word durations in Vietnamese TTS, and the echo problem in processing changes of word duration.

Tang Ho Lê, Anh-Viêt Nguyên
Identifying User Intents in Vietnamese Spoken Language Commands and Its Application in Smart Mobile Voice Interaction

This paper presents a lightweight machine learning model and a fast conjunction matching method for the problem of identifying user intents behind spoken text commands. The model and method were integrated into a mobile virtual assistant for Vietnamese (VAV) to understand what mobile users intend to carry out on their smartphones via their commands. User intent, in the scope of our work, is an action associated with a particular mobile application. Given an input spoken command, its application is identified by an accurate classifier while the action is determined by a flexible conjunction matching algorithm. Our classifier and conjunction matcher are very compact so that we can store and execute them directly on mobile devices. To evaluate the classifier and the matcher, we annotated a medium-sized data set and conducted various experiments with different settings, achieving impressive accuracy for both application and action identification.

Thi-Lan Ngo, Van-Hop Nguyen, Thi-Hai-Yen Vuong, Thac-Thong Nguyen, Thi-Thua Nguyen, Bao-Son Pham, Xuan-Hieu Phan
A Method for Determining Representative of Ontology-Based User Profile in Personalized Document Retrieval Systems

Information overload is one of the most important problems in the context of personalized document retrieval systems. In this paper we propose to use an ontology-based user profile. Ontological structures are appropriate for representing relations between concepts in a user profile. We present a method for determining the representative profile of a group of users. Two users are in the same group when their interests (profiles) are similar. If a new user is classified into a group, the system can recommend him a representative profile to avoid the "cold-start" problem. Results obtained in experimental evaluation are promising. The method presented in this paper is a crucial part of the developed personalized document retrieval system.

Bernadetta Maleszka

Database Systems and Software Engineering

Frontmatter
Data Quality Scores for Pricing on Data Marketplaces

Data and data-related services are increasingly being traded on data marketplaces. However, value attribution of data is still not well-understood, in particular when two competing offers are to be compared. This paper discusses the role data quality can play in this context and suggests a weighted quality score that allows for ‘quality for money’ comparisons of different offerings.

Florian Stahl, Gottfried Vossen
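A weighted quality score of the kind the abstract suggests can be sketched directly; the quality dimensions, weights and prices below are hypothetical examples, not values from the paper:

```python
def quality_score(dims, weights):
    """Weighted quality score: each dimension rated in [0, 1],
    weights must sum to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[d] * dims[d] for d in weights)

def quality_for_money(dims, weights, price):
    """'Quality for money': quality score per unit of price,
    for comparing competing offers on a data marketplace."""
    return quality_score(dims, weights) / price

# Hypothetical dimensions, weights and prices for two competing offers.
weights = {"completeness": 0.5, "timeliness": 0.3, "accuracy": 0.2}
offer_a = {"completeness": 0.9, "timeliness": 0.6, "accuracy": 0.8}  # price 100
offer_b = {"completeness": 0.7, "timeliness": 0.9, "accuracy": 0.6}  # price 120
```

The weights let a buyer encode which quality dimensions matter for their use case before comparing offers.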
Extraction of Structural Business Rules from C#

Business rules are very important assets of any enterprise. Very often they are coded directly in existing software systems. As business rules evolve over time, the software itself becomes the only valuable source of the rules applied. The aim of the paper is to present an approach to automatic business rule extraction from existing systems written in C#. Considerations are limited to structural business rules. The proposed approach was implemented in a tool whose usefulness was confirmed by examples. In comparison with existing reverse-engineering solutions it gives better results, characterized by high correctness and accuracy.

Bogumila Hnatkowska, Marcin Ważeliński
Higher Order Mutation Testing to Drive Development of New Test Cases: An Empirical Comparison of Three Strategies

Mutation testing, which includes first order mutation (FOM) testing and higher order mutation (HOM) testing, has emerged as a powerful and effective technique to evaluate the quality of test suites. Live mutants, which cannot be killed by the given test suite, make up a significant part of generated mutants and may drive the development of new test cases. Generating live higher order mutants (HOMs) able to drive the development of new test cases is considered in this paper. We apply multi-objective optimization algorithms, based on our proposed objectives and fitness functions, to generate higher order mutants using three strategies: HOMT1 (HOMs generated from all first order mutants), HOMT2 (HOMs generated from killed first order mutants) and HOMT3 (HOMs generated from not-easy-to-kill first order mutants). We then use the mutation score indicator to evaluate which of the three approaches is better suited to driving the development of new test cases and, as a result, improving software quality.

Quang Vu Nguyen, Lech Madeyski
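The three strategies from the abstract differ only in which pool of first-order mutants feeds the HOM construction. The sketch below uses toy mutant IDs and kill counts, and an "easy to kill" threshold of one killing test; all of these are illustrative assumptions:

```python
from itertools import combinations

# First-order mutants as single edits; kill_counts[m] = number of test cases
# that kill m (0 means the mutant is live). Data is illustrative.
foms = ["m1", "m2", "m3", "m4"]
kill_counts = {"m1": 0, "m2": 5, "m3": 1, "m4": 2}

def fom_pool(strategy, easy_threshold=1):
    """Select the FOM pool for each of the three strategies."""
    if strategy == "HOMT1":   # all first-order mutants
        return foms
    if strategy == "HOMT2":   # killed first-order mutants
        return [m for m in foms if kill_counts[m] > 0]
    if strategy == "HOMT3":   # not-easy-to-kill: killed by few tests
        return [m for m in foms if 0 < kill_counts[m] <= easy_threshold]
    raise ValueError(strategy)

def second_order_mutants(pool):
    """Build second-order mutants by pairing first-order mutants; the paper's
    multi-objective search would then pick promising combinations."""
    return [frozenset(pair) for pair in combinations(pool, 2)]
```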
On the Relationship Between the Order of Mutation Testing and the Properties of Generated Higher Order Mutants

The goal of higher order mutation testing is to improve mutation testing effectiveness in particular and test effectiveness in general. Different approaches have been proposed in the area of second order and higher order mutation testing, with mutant orders ranging from 2 to 70. Unfortunately, empirical evidence on the relationship between the order of mutation testing and the desired properties of generated mutants is scarce, apart from the conviction that the number of generated mutants can grow exponentially with the order of mutation testing. In this paper, we present a study of the relationships between the order of mutation testing and the properties of mutants in terms of the number of generated high-quality and reasonable mutants as well as generated live mutants. Our approach includes a higher order mutant classification, objective functions and fitness functions to classify and identify generated higher order mutants. We use four multi-objective optimization algorithms for constructing higher order mutants. The obtained empirical results indicate that 5 is a relevant highest order in higher order mutation testing.

Quang Vu Nguyen, Lech Madeyski

Intelligent Information Systems

Frontmatter
Responsive Web Design: Testing Usability of Mobile Web Applications

Responsive web design (RWD) allows applications to adapt dynamically to diverse screen sizes, proportions, and orientations. RWD is an approach to the problem of designing for a great number of devices, ranging from small smartphones to large desktop monitors. The goal of the paper was to test the usability of an application for managing a scientific conference, developed using responsive design paradigms. Two versions of the responsive application were implemented using different design patterns. Various usability techniques were employed, including tests with prospective users and experts' inspections, as well as automated tools. The obtained results were thoroughly analysed and recommendations on the utilization of individual design patterns in developing mobile web applications were formulated.

Jarosław Bernacki, Ida Błażejczyk, Agnieszka Indyka-Piasecka, Marek Kopel, Elżbieta Kukla, Bogdan Trawiński
Person Name Disambiguation for Building University Knowledge Base

In this paper we propose a new algorithm for person name disambiguation among authors of scientific publications. The algorithm is effective, elastic, and tailored to a scientific knowledge base. Besides the common properties of a publication, namely title, venue, and author and co-author names, it also exploits references. One of the reasons is that we decided to enrich the University Knowledge Base with connections between publications, not only references represented as text (i.e. author's name, title, etc.). Our algorithm follows an unsupervised approach, which does not require creating a training set, a time- and resource-consuming task. However, we want to leverage additional information, available from crowdsourcing or authorised users, which confirms authorship and citation relations between papers. Using this information, the default parameters of the unsupervised algorithm can be optimised for a given case by means of a genetic algorithm in order to increase accuracy. The proposed method can be applied to three tasks: assigning a publication to a specific researcher, indicating that a new author is as yet unknown to the database, and clustering a set of publications into clusters containing the papers of one researcher. Validation results confirm the high accuracy of the new algorithm and its usefulness in the process of populating a scientific knowledge base.

Piotr Andruszkiewicz, Szymon Szepietowski
Improving Behavior Prediction Accuracy by Using Machine Learning for Agent-Based Simulation

This study models an integration between agent-based simulation and machine learning in order to achieve comprehensive behavior prediction. The model is applied to the case of customer churning in a subscription-based business. Providing a good model for behavior prediction requires dynamic simulation based on social structure. In this study, we first executed an agent-based simulation to capture the dynamic structure of human behavior. Next, we conducted machine learning to classify human behavior using a classification algorithm. Finally, we verified the agent-based simulation and machine learning results by comparing the accuracy of both models. Based on the agent-based simulation results, we provide some recommendations to improve the accuracy of agent-based simulation based on the classification results from machine-learning procedures.

Shinji Hayashi, Niken Prasasti, Katsutoshi Kanamori, Hayato Ohwada
A Model for Analysis and Design of Information Systems Based on a Document Centric Approach

The recent tendency in the analysis and design of Information Systems is that the emphasis is placed on the documents that are ubiquitous around information systems and organizations. The proliferation of computer literacy has led to the general use of electronic documents. To understand the anticipated behaviour of Information Systems and the actual operation of a particular organization, the analysis of documents plays an increasingly important role. The behaviour of Information Systems can be interpreted in a framework of Enterprise Architecture and the models contained in it. Certain parts, and the entirety, of various types of documents are connected to business processes, tasks, roles and actors within an organization. Tracking the life cycle of documents and representing their complex relationships is essential both at analysis and at operation time. We propose a theoretical framework that makes use of previous modelling results and well-founded mathematical techniques.

Bálint Molnár, András Benczúr, András Béleczki
MobiCough: Real-Time Cough Detection and Monitoring Using Low-Cost Mobile Devices

In this paper we present MobiCough, a method and system for real-time cough detection and monitoring on low-cost mobile devices. MobiCough utilizes the acoustic data stream captured from a low-cost wireless microphone worn on the user's collar and connected to the mobile device via Bluetooth. MobiCough detects coughs in four steps: sound pre-processing, segmentation, feature and event extraction, and cough prediction. In addition, we propose the use of a simple yet effective, noise-robust predictive model that combines a Gaussian Mixture model with a Universal Background model (GMM-UBM) for predicting cough sounds. The proposed method is rigorously evaluated on a dataset consisting of more than 1000 cough events and a significant number of noises. The results demonstrate that coughs can be detected with precision and recall of more than 91% with individually trained models and over 81% for subject-independent training. These results are very promising for health-care applications requiring cough detection and monitoring on low-cost mobile devices.

Cuong Pham
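A minimal sketch of the GMM-UBM scoring idea described above, using synthetic feature frames (not the authors' pipeline or features): a cough model and a universal background model are fit on frames, and a segment is scored by its average log-likelihood ratio.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
cough_feats = rng.normal(loc=2.0, scale=0.5, size=(500, 4))  # synthetic "cough" frames
noise_feats = rng.normal(loc=0.0, scale=1.0, size=(500, 4))  # synthetic background frames

# Universal background model on all data, cough model on cough frames only.
ubm = GaussianMixture(n_components=4, random_state=0).fit(
    np.vstack([cough_feats, noise_feats]))
cough_gmm = GaussianMixture(n_components=4, random_state=0).fit(cough_feats)

def llr_score(frames):
    """Mean log-likelihood ratio of a segment: cough model vs. UBM."""
    return cough_gmm.score(frames) - ubm.score(frames)
```

A segment would be labelled "cough" when `llr_score` exceeds a threshold tuned on validation data.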
Database of Peptides Susceptible to Aggregation as a Tool for Studying Mechanisms of Diseases of Civilization

We introduce a database containing peptides related to diseases arising from protein aggregation. The general database AmyLoad includes all experimentally studied protein fragments that could be involved in erroneous protein folding leading to amyloid formation. The database has been extended since its first release with new instances of peptides or their fragments. Moreover, information on related diseases has been added to all entries whenever available. Currently the database includes all available peptides tested for their potential amyloid properties, obtained from diverse resources, making it the largest such dataset available in one place. This enables comparison between the properties of amyloid and non-amyloid peptides. We could also select candidates for the most pathogenic peptides, involved in several diseases related to protein aggregation. We also discuss the need for sub-databases devoted to particular structures, such as the βγ-crystallins - a protein family occurring in the eye lens. Misfolding of these proteins may lead to various forms of cataract. These freely available internet services can facilitate finding the link between a protein sequence, its propensity to aggregate and the resulting disease, as well as support research on pharmacological treatment and prevention.

Pawel P. Wozniak, Jean-Christophe Nebel, Malgorzata Kotulska
Using a Cloud Computing Telemetry Service to Assess PaaS Setups

Cloud Computing (CC) is a new paradigm in which capabilities and resources related to Information Technology (IT) are provided as services. This provision can be done via the Internet and on demand, and is accessible without detailed knowledge of the underlying technology. In this paper we assess different service platforms using cloud computing telemetry services. Using an OpenStack telemetry service, namely Ceilometer, we design experiments to assess the performance of different Platform as a Service (PaaS) setups for basic purposes (e.g. database and Web server). The assessment can be used to decide between commonly used platforms by comparing storage requirements, processing time and processor load. Given the cost of each metric on a CC platform and the performance of each PaaS setup, the IT manager can choose the most advantageous one.

Francisco Anderson Freire Pereira, Jackson Soares, Adrianne Paula Vieira Andrade, Gilson Gomes Silva, João Paulo Souza Medeiros
Towards the Tradeoff Between Online Marketing Resources Exploitation and the User Experience with the Use of Eye Tracking

Online systems are often overloaded with marketing content and, as a result, perceived intrusiveness negatively affects the user experience and the evaluation of the website. Intentional and unintentional avoidance of commercial content creates the need for compromise solutions from the perspectives of both user experience and business goals. The presented research shows a unique approach to searching for tradeoffs between editorial content and the intensity of marketing components, using eye tracking and multiple-criteria decision analysis methods.

Jarosław Jankowski, Paweł Ziemba, Jarosław Wątróbski, Przemysław Kazienko
Using Cognitive Agents for Unstructured Knowledge Management in a Business Organization’s Integrated Information System

Management of unstructured knowledge in business organizations, mainly by means of an integrated information system, is a very important process. This type of knowledge supports the decision-making process to a high degree. The aim of this paper is to present the use of a cognitive agent architecture for knowledge management in an integrated information system running in a business organization. An analysis of existing work in the considered field is presented in the first part of the paper; next, an unstructured knowledge management process using the Learning Intelligent Distribution Agent architecture is described. The last part of the paper presents the research experiment performed in order to verify the developed solution.

Marcin Hernes
A Norm Assimilation Approach for Multi-agent Systems in Heterogeneous Communities

In a heterogeneous community, which constitutes a number of social groups that adopt different social norms, norm assimilation is the main problem for a new member wishing to join a desired social group. Studies of norm assimilation seem to be lacking in concept and theory within this research domain. Consequently, this paper proposes a norm assimilation approach in which a new agent attempts to join a social group by assimilating the group's norms. Three cases are considered for an agent's decision: it can assimilate, could assimilate, or cannot assimilate. We develop the norm assimilation approach based on the agent's internal belief about its ability and its external belief about the assimilation cost of a number of social groups. From these beliefs, the agent is able to decide whether to proceed with or decline assimilation into a specific social group, or to join another group.

Moamin A. Mahmoud, Mohd Sharifuddin Ahmad, Mohd Zaliman M. Yusoff
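An illustrative sketch of the three-way decision just described; the function names, the tolerance margin and the group-selection rule are our own assumptions, not the paper's formalism.

```python
def assimilation_decision(ability, cost):
    """Classify assimilation feasibility for one social group."""
    if ability >= cost:
        return "can assimilate"
    elif ability >= 0.8 * cost:        # assumed tolerance margin
        return "could assimilate"
    return "cannot assimilate"

def best_group(ability, group_costs):
    """Pick the cheapest group the agent can (or could) join, if any."""
    feasible = {g: c for g, c in group_costs.items()
                if assimilation_decision(ability, c) != "cannot assimilate"}
    return min(feasible, key=feasible.get) if feasible else None
```

For example, an agent with ability 1.0 facing costs {"A": 0.9, "B": 1.1, "C": 2.0} can assimilate into A, could assimilate into B, and cannot assimilate into C.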
Knowledge in Asynchronous Social Group Communication

Multi-agent systems are one of many modern distributed approaches to decision making, optimization and other problem solving. Among others, multi-agent systems have often been used for prediction, but those approaches require a supervisor agent to integrate the knowledge of the other agents. In this paper we discuss the shortcomings of such an approach and propose a switch to decentralized groups of agents with asynchronous communication. We show that this approach may obtain similar results, while avoiding the pitfalls of a centralized architecture.

Marcin Maleszka

Decision Support and Control Systems

Frontmatter
Interpreted Petri Nets in DES Control Synthesis

Discrete event systems (DES) control based on interpreted Petri nets (IPN) is presented in this paper. While place/transition Petri nets (P/T PN) are usually used for modelling and control when transitions are controllable and places measurable, IPN-based models make control synthesis possible even when P/T PN models contain some uncontrollable transitions and unmeasurable places. The creation of the IPN model from such a P/T PN model is introduced and the control synthesis is performed. Illustrative examples as well as a case study on a robotized assembly cell are presented.

František Čapkovič
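For context, a minimal place/transition Petri net sketch of the formalism the paper builds on (our own illustration, not the authors' IPN tool): a transition is enabled when every input place holds enough tokens, and firing moves tokens from input to output places.

```python
import numpy as np

pre = np.array([[1, 0],     # Pre[p, t]: tokens consumed from place p by transition t
                [0, 1]])
post = np.array([[0, 1],    # Post[p, t]: tokens produced in place p by transition t
                 [1, 0]])
marking = np.array([1, 0])  # initial marking: one token in p0

def enabled(m, t):
    """Transition t is enabled when all its input places hold enough tokens."""
    return bool(np.all(m >= pre[:, t]))

def fire(m, t):
    """Fire t: consume Pre tokens, produce Post tokens."""
    assert enabled(m, t), "transition not enabled"
    return m - pre[:, t] + post[:, t]
```

Here firing t0 moves the single token from p0 to p1, after which only t1 is enabled.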
Enhanced Guided Ejection Search for the Pickup and Delivery Problem with Time Windows

This paper presents an enhanced guided ejection search (GES) to minimize the number of vehicles in the NP-hard pickup and delivery problem with time windows. The proposed improvements decrease the convergence time of the GES and boost the quality of results. An extensive experimental study on the benchmark set shows how the enhancements influence the GES's capabilities, coupled with statistical tests to verify the significance of the results. We give guidance on how to select a proper algorithm variant based on test characteristics and objectives. We report one new world's best result obtained using the enhanced GES.

Jakub Nalepa, Miroslaw Blocho
How to Generate Benchmarks for Rich Routing Problems?

In this paper, we show how to generate challenging benchmark tests for rich vehicle routing problems (VRPs) using a new heuristic algorithm (termed HeBeG—Heuristic Benchmark Generator). We consider a modified VRP with time windows in which the depot does not define a time window. Additionally, the taxicab metric is utilized to determine the distance between travel points, instead of the standard Euclidean metric. HeBeG was used to create a test set for the qualifying round of Deadline24—an international 24-hour programming marathon. Finally, we compare the best results submitted to the server during the qualifying round with the routing schedules elaborated using other algorithms, including a new heuristic proposed in this paper.

Marcin Cwiek, Jakub Nalepa, Marcin Dublanski
Formal a Priori Power Analysis of Elements of a Communication Graph

This paper presents the idea of measuring the formal impact of the elements of a communication graph structure, consisting of nodes and arcs, on the graph's entirety or its subparts. Arcs and nodes can be assigned different interpretations depending on the context. For example, in game theory nodes may represent the players, often referred to as policy makers, while arcs symbolize the relationships between them. In another context, nodes and arcs of the graph may represent elements of technical infrastructure, e.g. a computer. The graph representing the tested relationships is called the communication graph, and the influence of an element on the entire graph (or a subpart of it) is referred to as the power of the element. Taking the power of nodes and connections into account creates a so-called incidence-power matrix, which describes the communication graph more completely than the incidence matrix alone.

Jacek Mercik
Angiogenic Switch - Mixed Spatial Evolutionary Game Approach

The main goal of this paper is to study the properties of a game-theoretic model of the angiogenic switch using Mixed Spatial Evolutionary Games (MSEG). These games are played on multiple lattices corresponding to the possible phenotypes, making it possible to simulate and investigate heterogeneity at the player level in addition to the population level. Furthermore, diverse polymorphic equilibrium points, dependent on individual reproduction and model parameters, are discussed together with their simulation. The analysis demonstrates the sensitivity properties of MSEGs and the potential for further development of spatial games.

Michal Krzeslak, Damian Borys, Andrzej Swierniak
Model Kidney Function in Stabilizing of Blood Pressure

The aim of this work is to verify a model of kidney function in stabilizing blood pressure, implemented in Matlab Simulink. A user interface is also designed for educational purposes in the subject of Biocybernetics, enabling a staged presentation of the processes involved in the long-term regulation of blood pressure. Finally, a physiological and functional verification of the model was carried out.

Martin Augustynek, Jan Kubicek, Martin Cerny, Marie Bachrata
Dynamic Diversity Population Based Flower Pollination Algorithm for Multimodal Optimization

In practical multimodal optimization problems, premature convergence to a local rather than the global optimum can occur due to interference among physically constrained dimensions. In this paper, an altering strategy for a dynamic diversity flower pollination algorithm (FPA) is proposed for solving multimodal optimization problems. In the proposed method, the population is divided into several small groups. Agents in these groups frequently exchange evolved fitness information using their own best historical information, and a dynamic switching probability provides diversity in the search process. A set of benchmark functions is used to test the performance of the proposed method. The experimental results show better performance in comparison with other methods.

Jeng-Shyang Pan, Thi-Kien Dao, Trong-The Nguyen, Shu-Chuan Chu, Tien-Szu Pan
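A compact sketch of the baseline FPA that the paper modifies, on a toy sphere function; the population splitting and synchronous learning factors of the proposed variant are not reproduced here, and the step sizes are our own choices.

```python
import numpy as np

def fpa(f, dim=2, n=20, iters=300, p=0.8, seed=0):
    """Baseline flower pollination algorithm sketch (minimization)."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5, 5, (n, dim))
    fit = np.array([f(x) for x in pop])
    best = pop[fit.argmin()].copy()
    for _ in range(iters):
        for i in range(n):
            if rng.random() < p:                      # global pollination (heavy-tailed step toward best)
                step = rng.standard_cauchy(dim) * 0.01
                cand = pop[i] + step * (best - pop[i])
            else:                                     # local pollination (differential move)
                j, k = rng.choice(n, 2, replace=False)
                cand = pop[i] + rng.random() * (pop[j] - pop[k])
            fc = f(cand)
            if fc < fit[i]:                           # greedy replacement
                pop[i], fit[i] = cand, fc
                if fc < f(best):
                    best = cand.copy()
    return best, f(best)

sphere = lambda x: float(np.sum(x ** 2))
best, val = fpa(sphere)
```

The switching probability `p` is the knob the abstract's "dynamic switching probability" would vary over time.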
Hardware Implementation of Fuzzy Petri Nets with Lukasiewicz Norms for Modelling of Control Systems

In the paper an implementation of fuzzy Petri nets with Lukasiewicz norms, based on FPGA integrated circuits, is described. The proposed solution is used for the modelling of control systems. The realization of fuzzy Petri net models is based on a synthesis method that takes advantage of a fuzzy Petri net place module concept. The paper contains the description of a real-life control system, which is used to demonstrate the new features of fuzzy Petri nets with Lukasiewicz norms. For the example control system, in order to analyze implementation costs, a comparison is made between the fuzzy Petri net model with Lukasiewicz norms and the one with triangular MIN/MAX norms.

Zbigniew Hajduk, Jolanta Wojtowicz
ALMM Solver for Combinatorial and Discrete Optimization Problems – Idea of Problem Model Library

The paper presents results of further research on a software tool named the ALMM Solver. The objective of the ALMM Solver is to solve combinatorial and discrete optimization problems, including NP-hard problems. The solver utilizes a modeling paradigm named the Algebraic Logical Meta Model of Multistage Decision Processes (ALMM of MDP) and its theory. The ALMM of MDP enables a unified approach to creating discrete optimization problem models and representing knowledge about these problems. The models are stored in a Problem Model Library. A new, extended modular structure of the ALMM Solver is presented together with a basic layout of the Problem Model Library.

Ewa Dudek-Dyduch, Sławomir Korzonek
Integration of Collective Knowledge in Financial Decision Support System

Executing a process that supports financial decision making using a multiagent system entails permanent cooperation between human and agent collectives. Their knowledge is acquired from autonomous, distributed sources and they use different decision support methods; therefore, a certain level of heterogeneity characterizes the collectives' knowledge. Since the decision-making process requires one final decision, the knowledge of individual members of the collective must be integrated automatically. The aim of the paper is to develop a consensus method for integrating the knowledge of human-agent collectives in a multiagent financial decision support system built with a cognitive agent architecture. The first part shortly presents the state of the art in the considered field; next, a Multiagent Cognitive Financial Decision Support System is characterized. The last part of the paper presents the consensus algorithm for knowledge integration.

Marcin Hernes, Andrzej Bytniewski
Framework for Product Innovation Using SOEKS and Decisional DNA

Product innovation always requires a foundation of both knowledge and experience. The production and innovation process of products is very similar to the evolutionary process of humans. The genetic information of humans is stored in genes, chromosomes and DNA. Similarly, information about products can be stored in a system having virtual genes, chromosomes and decisional DNA. The present paper proposes a framework for a systematic approach to product innovation using a Smart Knowledge Management System comprising the Set of Experience Knowledge Structure (SOEKS) and Decisional DNA. Through this system, entrepreneurs and organizations will be able to perform the product innovation process technically and quickly, as the framework stores knowledge in the form of experiences from past innovative decisions. The proposed system is dynamic in nature, as it updates itself every time a decision is taken.

Mohammad Maqbool Waris, Cesar Sanin, Edward Szczerbicki
Common-Knowledge and KP-Model

This paper initiates an epistemic approach to studying the Bayesian routing problem in the framework of the network game introduced by Koutsoupias and Papadimitriou [LNCS 1563, pp. 404-413. Springer (1999)]. It highlights the role of common knowledge of the users' individual conjectures about the others' channel selections in the network game. In particular, two notions of equilibria are presented in the Bayesian extension of the network game: expected delay equilibrium and rational expectations equilibrium, in which each user minimizes its own expected delay and the social cost, respectively. We show that the equilibria have the following properties: if all users commonly know them, then the former yields a Nash equilibrium in the underlying KP-model and the latter yields a Nash equilibrium for the social cost in the network game.

Takashi Matsuhisa
Controllability of Semilinear Fractional Discrete Systems

In the present paper, local constrained controllability problems for semilinear finite-dimensional discrete systems with constant coefficients are formulated and discussed. Using mapping theorems from functional analysis and linear approximation methods, sufficient conditions for constrained controllability are derived and proved. The present paper extends the controllability conditions with unconstrained controls given in the literature to cover semilinear discrete systems with constrained controls.

Jerzy Klamka

Machine Learning and Data Mining

Frontmatter
On Fast Randomly Generation of Population of Minimal Phase and Stable Biquad Sections for Evolutionary Digital Filters Design Methods

Evolutionary algorithms possess many practical applications, one of which is digital filter design. Evolutionary techniques are very often used to design FIR (Finite Impulse Response) or IIR (Infinite Impulse Response) digital filters. IIR digital filters are very often realized in practice as a cascade of biquad sections, and guaranteeing the stability of these sections is one of the most important elements of the IIR filter design process. To obtain a stable IIR digital filter, all poles of the transfer function of every biquad section must be located inside the unit circle in the z-plane. Likewise, for a minimal-phase digital filter, all zeros of the transfer function of every biquad section must also be located inside the unit circle in the z-plane. In many evolutionary algorithms dedicated to IIR digital filter design, the initial population (or re-initialized populations) of filter coefficients is chosen randomly; therefore, some of the generated filters can be unstable (and/or not minimal phase). In this paper, we show how to randomly generate a population of stable, minimal-phase biquad sections with very high efficiency. Our approach also reduces the computational time required to evaluate the stability (and/or minimal-phase property) of a digital filter. The proposed approach is compared with standard techniques used in evolutionary digital filter design methods.

Adam Slowik
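A sketch of one standard way to achieve this (not necessarily the paper's method): instead of rejection-sampling denominator coefficients and testing pole locations, draw (a1, a2) directly from the stability triangle |a2| < 1, |a1| < 1 + a2, which guarantees both poles of 1 + a1 z^-1 + a2 z^-2 lie inside the unit circle; the same trick applies to the numerator for the minimal-phase property.

```python
import numpy as np

def random_stable_biquads(n, seed=0):
    """Sample n denominator coefficient triples [1, a1, a2] from the
    stability triangle, so every biquad section is stable by construction."""
    rng = np.random.default_rng(seed)
    a2 = rng.uniform(-1, 1, n)
    a1 = rng.uniform(-(1 + a2), 1 + a2)   # bound depends on a2: stays inside the triangle
    return np.column_stack([np.ones(n), a1, a2])

coeffs = random_stable_biquads(1000)
```

Because every sample satisfies the triangle conditions, no candidate has to be discarded, unlike uniform sampling over a bounding box followed by a pole-location test.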
Recursive Ensemble Land Cover Classification with Little Training Data and Many Classes

Land-cover classification can construct a land-use map from satellite images using machine learning. However, supervised machine learning requires a lot of training data, since remote sensing data is of high resolution and reveals many features. Therefore, this study proposes a method to generate self-training data from a small amount of training data. The method generates self-training data, regarded as correctly classified, by considering multiple acquisition times and the surrounding land cover. As a result of self-training conducted using this method, the Kappa coefficient was 0.644 for a 12-class classification problem with one training sample per class.

Yu Oya, Katsutoshi Kanamori, Hayato Ohwada
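For reference, Cohen's kappa, the agreement statistic reported above, measures accuracy corrected for chance agreement; this is our own minimal implementation (sklearn's `cohen_kappa_score` computes the same value).

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Observed agreement corrected for chance agreement."""
    n = len(y_true)
    po = sum(t == p for t, p in zip(y_true, y_pred)) / n    # observed agreement
    ct, cp = Counter(y_true), Counter(y_pred)
    pe = sum(ct[c] * cp[c] for c in ct) / n ** 2            # chance agreement
    return (po - pe) / (1 - pe)
```

Perfect agreement gives 1.0, while agreement no better than chance gives 0.0, which is why kappa is preferred over raw accuracy for imbalanced land-cover maps.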
Treap Mining – A Comparison with Traditional Algorithm

In this era of big data analysis, mining results play a very important role, so data scientists need to be accurate with the tools, methods and procedures used for rule mining. The major issues they face are incremental mining and the huge amount of time required to finish the mining task. In this context, we propose a new rule mining algorithm which mines the database using a priority-based model to find interesting relations. In this paper, a new mining algorithm using the Treap data structure is explained, along with a comparison with traditional algorithms. The proposed algorithm finishes the task in O(n) in the best case and O(n log n) in the worst case. The algorithm also considers less frequent, high-priority attributes for rule creation, thus ensuring valid mining rules. The major issues of traditional algorithms, such as invalid rules, long running time and high memory utilization, can thus be remedied by this new proposal. The algorithm was tested against various datasets and the results were evaluated and compared with a traditional algorithm, showing a clear performance improvement.

H. S. Anand, S. S. Vinodchandra
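A minimal treap (a binary search tree by key, a heap by random priority), the data structure the proposed algorithm is built on; the rule-mining logic itself is not reproduced here.

```python
import random

random.seed(42)

class Node:
    def __init__(self, key):
        self.key, self.pri = key, random.random()
        self.left = self.right = None

def rotate_right(t):
    l = t.left; t.left, l.right = l.right, t; return l

def rotate_left(t):
    r = t.right; t.right, r.left = r.left, t; return r

def insert(t, key):
    """BST insert, then rotations to restore the heap property on priorities."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
        if t.left.pri > t.pri:
            t = rotate_right(t)
    else:
        t.right = insert(t.right, key)
        if t.right.pri > t.pri:
            t = rotate_left(t)
    return t

def inorder(t):
    return inorder(t.left) + [t.key] + inorder(t.right) if t else []

root = None
for k in [5, 2, 8, 1, 9, 3]:
    root = insert(root, k)
```

The random priorities keep the tree balanced in expectation, which is what yields the expected O(n log n) behaviour mentioned above.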
SVM Based Lung Cancer Prediction Using microRNA Expression Profiling from NGS Data

microRNAs are single-stranded non-coding RNA sequences 18-24 nucleotides in length. They play an important role in the post-transcriptional regulation of gene expression. The last decade witnessed the identification of hundreds of human microRNAs from genomic data, and experimental as well as computational identification of microRNA binding sites in messenger RNAs is also in progress. Evidence of microRNAs acting as promoters or suppressors of several diseases, including cancer, is being unveiled. The advancement of Next Generation Sequencing technologies, with a dramatic reduction in cost, has opened endless applications and rapid advances in many fields of biological science. microRNA expression profiling is a measure of the relative abundance of microRNA sequences with respect to the total number of sequences in a sample, and many experiments employing this measure have proved differential expression of microRNAs in diseased states. This paper discusses an algorithm for microRNA expression profiling and its normalization, and a Support Vector Machine based learning approach to develop a cancer prediction system. The developed system classifies samples with 97.6 % accuracy.

Salim A., Amjesh R., Vinod Chandra S. S.
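A sketch of the profiling idea on synthetic data (the paper's features, normalisation details and 97.6 % figure come from real NGS profiles): raw read counts are normalised to relative abundance and fed to an SVM classifier.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# synthetic raw counts: the "tumour" class over-expresses the first 5 miRNAs
healthy = rng.poisson(50, (40, 20))
tumour = rng.poisson(50, (40, 20))
tumour[:, :5] += rng.poisson(200, (40, 5))
X = np.vstack([healthy, tumour]).astype(float)
y = np.array([0] * 40 + [1] * 40)

# relative abundance: counts per million reads in each sample
cpm = X / X.sum(axis=1, keepdims=True) * 1e6

clf = SVC(kernel="linear").fit(cpm[::2], y[::2])   # train on even rows
acc = clf.score(cpm[1::2], y[1::2])                # test on odd rows
```

Normalising to relative abundance removes per-sample sequencing-depth differences before classification.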
Forecasting the Magnitude of Dengue in Southern Vietnam

With the recent rise of sophisticated and dangerous epidemics, there is a growing need for systems that can predict disease severity with high accuracy. In this paper, we address the problem of forecasting the magnitude of dengue over a short term, i.e. one week ahead. We consider as inputs both statistics of historical cases and biological factors affecting the dengue virus, including temperature, population and mosquito density. We propose a two-phase model simulating the disease transmission process, consisting of a local outbreak phase and a province transmission phase. The locality phase estimates the number of potential cases in each province independently for the following week. Then, in the transmission phase, an artificial neural network is used to predict the mobility of the dengue virus across provinces. Our proposed method obtains higher accuracy than conventional time series models, linear regression and ARIMA. Moreover, this provides the first research results on dengue prediction in Vietnam.

Tuan Q. Dinh, Hiep V. Le, Tru H. Cao, Quang C. Luong, Hai T. Diep
Self-paced Learning for Imbalanced Data

In this paper, we propose a novel training paradigm that combines two learning strategies: cost-sensitive and self-paced learning. This approach can be applied to decision problems where highly imbalanced data is used during the training process. The main idea behind the proposed method is to start the learning process with a large number of minority examples and only the easiest majority objects, and then gradually turn to more difficult cases. We examine the quality of this training paradigm compared to other learning schemes for a neural network model, using a set of highly imbalanced benchmark datasets.

Maciej Zięba, Jakub M. Tomczak, Jerzy Świątek
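A toy sketch of the schedule just described, in our own formulation: all minority examples are always used, while majority examples enter the training set from easiest to hardest as a pace parameter grows.

```python
import numpy as np

def self_paced_subset(majority_losses, pace):
    """Indices of majority examples whose current loss is below the pace
    threshold; raising `pace` over epochs admits harder examples."""
    return np.flatnonzero(majority_losses <= pace)

losses = np.array([0.1, 0.9, 0.4, 0.2, 0.8])
```

At each epoch the model would be retrained on the minority class plus `self_paced_subset(losses, pace)`, with `pace` increased on a schedule until every majority example participates.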
A New Similarity Measure for Intuitionistic Fuzzy Sets

Although many similarity measures for intuitionistic fuzzy sets (IFSs) exist, most of them cannot satisfy the axioms of a similarity measure or provide reasonable results. In this paper, a review of existing similarity measures for IFSs and their drawbacks is carried out. Then a new similarity measure between IFSs, based on their knowledge measures, is proposed. A comprehensive analysis of the performance features of the proposed measure is conducted in a comparative example. Finally, the proposed similarity measure is applied to turbine fault diagnosis. We point out that the newly proposed similarity measure overcomes the drawbacks of existing similarity measures and gives reliable results in a real-world application.

Hoang Nguyen
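For context, a standard distance-based similarity between IFSs (each element a pair of membership mu and non-membership nu, with hesitancy 1 - mu - nu); this is the classic Hamming-distance baseline, not the knowledge-measure-based measure the paper proposes.

```python
def ifs_similarity(A, B):
    """1 minus the normalised Hamming distance between two IFSs,
    given as lists of (mu, nu) pairs over the same universe."""
    n = len(A)
    d = sum(abs(ma - mb) + abs(na - nb)
            + abs((1 - ma - na) - (1 - mb - nb))      # hesitancy term
            for (ma, na), (mb, nb) in zip(A, B)) / (2 * n)
    return 1 - d

A = [(0.7, 0.2), (0.5, 0.4)]
B = [(0.1, 0.8), (0.3, 0.6)]
```

Any valid similarity measure must return 1 for identical sets and stay within [0, 1]; it is counterexamples to such axioms that motivate the paper's new measure.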
Multiple Kernel Based Collaborative Fuzzy Clustering Algorithm

Clustering is one of the most useful tools for data analysis, data mining and pattern recognition. The FCM algorithm and its variants have been used extensively in clustering and collaborative clustering problems. In this paper, we present a novel method combining a multiple kernel technique with FCM for the collaborative clustering problem. The multiple kernel technique implicitly transforms the feature space of the input data into a higher-dimensional one via a nonlinear map, which greatly increases the possibility of linear separability of the patterns when the input data structure is non-spherical and complex. To evaluate the proposed method, we use the criteria of fuzzy silhouette, sum of squared error and classification rate to show the performance of the algorithms.

Trong Hop Dang, Long Thanh Ngo, Witold Pedrycz
Credit Risk Evaluation Using Cycle Reservoir Neural Networks with Support Vector Machines Readout

Automated credit approval helps credit-granting institutions reduce the time and effort required to analyze credit approval requests and to distinguish good customers from bad ones. Enhancing the automated credit approval process by integrating it with a good business intelligence (BI) system puts financial institutions and banks in a better position than their competitors. In this paper, a novel hybrid approach based on a neural network model called Cycle Reservoir with regular Jumps (CRJ) and Support Vector Machines (SVM) is proposed for classifying credit approval requests. In this approach, the readout of the CRJ is trained using an SVM. Experimental results confirm that, in comparison with other data mining techniques, CRJ with an SVM readout gives superior classification results.

Ali Rodan, Hossam Faris
Fuzzy-Based Feature and Instance Recovery

Severely skewed class distributions indicate the presence of under-represented data, which greatly affects the performance of learning algorithms and is still a challenge in data mining and machine learning. Much current research focuses on experimental comparison of existing re-sampling approaches. We believe new ways of constructing better algorithms are required to further balance and analyse the data set. This paper presents a Fuzzy-based Information Decomposition oversampling (FIDoS) algorithm for handling imbalanced data. Generally speaking, this is a new way of addressing imbalanced learning problems from a missing data perspective. First, we assume that there are missing instances in the minority class that result in the imbalanced dataset. Then the proposed algorithm, which takes advantage of a fuzzy membership function, is used to transfer information to the missing minority class instances. Finally, the experimental results demonstrate that the proposed algorithm is more practical and applicable compared to existing sampling techniques.

Shigang Liu, Jun Zhang, Yu Wang, Yang Xiang
An Enhanced Support Vector Machine for Faster Time Series Classification

As time series have become a prevalent type of data, various data mining tasks are performed to extract useful knowledge from them, with time series classification among the most common. Recently, the support vector machine (SVM), one of the most powerful classifiers for general classification tasks, has been applied to time series classification, as it has been shown to outperform one-nearest-neighbor classification in various domains, and the Dynamic Time Warping (DTW) distance measure has been used as a feature for the SVM classifier. However, the main drawback of this approach is the exceedingly high time complexity of DTW computation, which degrades the SVM's performance. In this paper, we propose an enhanced SVM that instead utilizes a fast lower-bound function for DTW as its feature to mitigate the problem. Experimental results on 47 UCR time series data mining datasets demonstrate that our proposed work can speed up the classification process by a large margin while maintaining high accuracy compared with the state-of-the-art approach.

Thapanan Janyalikit, Phongsakorn Sathianwiriyakhun, Haemwaan Sivaraks, Chotirat Ann Ratanamahatana
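A sketch of LB_Keogh, the classic DTW lower bound of the kind the abstract refers to (the paper's exact choice of bound is not specified here): the candidate is compared against an envelope built around the query, which costs O(n) instead of DTW's O(n^2) and never exceeds the true DTW distance.

```python
import numpy as np

def lb_keogh(query, candidate, r):
    """LB_Keogh lower bound for DTW with a Sakoe-Chiba band of radius r."""
    total = 0.0
    n = len(query)
    for i, c in enumerate(candidate):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        u, l = query[lo:hi].max(), query[lo:hi].min()  # upper/lower envelope at i
        if c > u:
            total += (c - u) ** 2
        elif c < l:
            total += (c - l) ** 2
    return np.sqrt(total)

q = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
```

Points falling inside the envelope contribute nothing, so similar series score near zero while distant series are penalised, which is what makes the bound usable as a cheap SVM feature.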
Parallel Implementations of the Ant Colony Optimization Metaheuristic

The paper discusses different approaches to parallel implementation of the Ant Colony Optimization (ACO) metaheuristic, applied to the well-known Travelling Salesman Problem (TSP). Although the ACO approach is capable of delivering good-quality solutions for the TSP, it suffers from two factors: complexity and non-determinism. Overpopulating the colony with ants makes ACO performance more predictable, but increasing the number of ants makes the need for parallel processing even more apparent. The proposed Ant Colony Community (ACC) uses a coarse-grained approach to parallelization. Two implementations, using RMI and Sockets respectively, are compared. Results of an experiment show that the ACC is capable of a substantial reduction of processing time.

Andrzej Siemiński
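To make the metaheuristic being parallelised concrete, a serial ACO sketch on a tiny TSP instance (our own illustration; the coarse-grained ACC, RMI and socket layers are not shown, and the parameters are conventional defaults).

```python
import numpy as np

def aco_tsp(dist, n_ants=10, iters=50, alpha=1.0, beta=2.0, rho=0.5, seed=0):
    """Basic ant system: probabilistic tour construction, evaporation, deposit."""
    rng = np.random.default_rng(seed)
    n = len(dist)
    tau = np.ones((n, n))                       # pheromone matrix
    eta = 1.0 / (dist + np.eye(n))              # heuristic visibility (diag padded)
    best_tour, best_len = None, np.inf
    for _ in range(iters):
        tours = []
        for _ in range(n_ants):
            tour = [rng.integers(n)]
            while len(tour) < n:
                i = tour[-1]
                w = (tau[i] ** alpha) * (eta[i] ** beta)
                w[tour] = 0.0                   # forbid already-visited cities
                tour.append(rng.choice(n, p=w / w.sum()))
            length = sum(dist[tour[k], tour[(k + 1) % n]] for k in range(n))
            tours.append((length, tour))
            if length < best_len:
                best_len, best_tour = length, tour
        tau *= 1 - rho                          # evaporation
        for length, tour in tours:              # pheromone deposit
            for k in range(n):
                tau[tour[k], tour[(k + 1) % n]] += 1.0 / length
    return best_tour, best_len

# four cities on a unit square: the optimal tour length is 4
s = 2 ** 0.5
dist = np.array([[0, 1, s, 1],
                 [1, 0, 1, s],
                 [s, 1, 0, 1],
                 [1, s, 1, 0]], dtype=float)
tour, length = aco_tsp(dist)
```

The per-ant tour construction loop is the part a coarse-grained parallelization distributes across workers.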
A Segmented Artificial Bee Colony Algorithm Based on Synchronous Learning Factors

In this paper, we propose a segmented ABC algorithm based on synchronous learning factors (SABC). To address the poor local search ability and low convergence precision of the artificial bee colony (ABC) algorithm, we use synchronously changing learning factors for the local search. Then, guided by the segmentation idea, high-quality food sources are updated greedily. This improves the efficiency of nectar-source updating and enhances the local search ability of the colony. Six standard test functions are chosen for the simulation experiments. Compared with the other three algorithms, the results show that SABC significantly improves convergence speed and the quality of the optimum found.

Yu Li, Jianxia Zhang, Dongsheng Zhou, Qiang Zhang
A Method for Query Top-K Rules from Class Association Rule Set

Methods for mining and querying Top-k frequent patterns and Top-k association rules have been developed in recent years. However, methods for querying Top-k rules from a set of class association rules have not. In this paper, we propose a method for querying Top-k class association rules based on support. From the set of mined class association rules satisfying the minimum support and minimum confidence thresholds, we use an insertion-based method to query the Top-k rules. First, we insert k rules from the rule set into the result set. After that, each remaining rule is inserted into the result set, following the insertion strategy, if its support is greater than that of the last rule in the result set. Experimental results show that the proposed method is more efficient than sorting the whole rule set.

Loan T. T. Nguyen, Hai T. Nguyen, Bay Vo, Ngoc-Thanh Nguyen
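The insertion strategy described above is essentially top-k selection by support; a min-heap sketch (our own formulation using `heapq`, not the authors' exact data structure) keeps the k best rules seen so far without sorting the whole rule set.

```python
import heapq

def top_k_by_support(rules, k):
    """rules: iterable of (support, rule) pairs; returns the k rules with
    highest support, best first, in O(n log k) time."""
    heap = []
    for sup, rule in rules:
        if len(heap) < k:
            heapq.heappush(heap, (sup, rule))
        elif sup > heap[0][0]:                  # beats the weakest kept rule
            heapq.heapreplace(heap, (sup, rule))
    return sorted(heap, reverse=True)

rules = [(0.3, "r1"), (0.9, "r2"), (0.5, "r3"), (0.7, "r4"), (0.1, "r5")]
```

Only rules beating the current weakest entry trigger any work, which is why this outperforms sorting when k is much smaller than the rule set.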
Hierarchy of Groups Evaluation Using Different F-Score Variants

The paper presents a cursory examination of clustering, focusing on the rarely explored field of hierarchies of clusters. Based on this, a short discussion of clustering quality measures is presented and the F-score measure is examined more deeply. As there have been no attempts to assess the quality of hierarchies of clusters, three variants of an F-score based index are presented: classic, hierarchical and partial order. The partial-order index is the authors' own contribution. The conducted experiments show the properties of the considered measures. In the conclusions, the strengths and weaknesses of each variant are presented.

Michał Spytkowski, Łukasz P. Olech, Halina Kwaśnicka
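Of the three variants named in the abstract above, the classic (flat) F-score is standard and can be sketched; the hierarchical and partial-order variants aggregate over tree nodes and are not reproduced here.

```python
def f_score(classes, clusters):
    """Classic flat F-score of a clustering against reference classes:
    each class is matched to its best-scoring cluster and the per-class
    F-measures are size-weighted. Both arguments are lists of sets of
    item ids (an assumed encoding).
    """
    n = sum(len(c) for c in classes)
    total = 0.0
    for cl in classes:
        best = 0.0
        for cu in clusters:
            inter = len(cl & cu)
            if inter == 0:
                continue
            p = inter / len(cu)            # precision w.r.t. the cluster
            r = inter / len(cl)            # recall w.r.t. the class
            best = max(best, 2 * p * r / (p + r))
        total += len(cl) * best
    return total / n
```

A perfect clustering scores 1.0; merging classes into one cluster lowers precision and hence the score.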
Hierarchical Evolutionary Multi-biclustering
Hierarchical Structures of Biclusters Generation

Biclustering is an important method of processing large amounts of data. In this paper, hierarchical structures of biclusters and their advantages are discussed. We propose a method called HEMBI (Hierarchical Evolutionary Multi-Biclustering) which creates such structures. HEMBI uses an Evolutionary Algorithm to split a data space into a restricted number of regions. An important feature of the method is its ability to choose the optimal number of biclusters, which is bounded only by a maximum value. The conducted experiments and their results are presented and discussed.

Anna Maria Filipiak, Halina Kwasnicka

Computer Vision Techniques

Frontmatter
Feature Selection Based on Synchronization Analysis for Multiple fMRI Data

Functional magnetic resonance imaging (fMRI) can be used to predict the states of the human brain. However, solving the learning problem across multiple subjects is difficult because of inter-subject variability. In this paper, we use the synchronization of fMRI voxels when the brain responds to a stimulus to construct features that achieve better data representation and more efficient classification. With a simple definition of synchronization, the proposed method is insensitive to reasonable choices over a broad range of thresholds. We also demonstrate a new unbiased method for comparing multiple subjects by applying the singular value decomposition (SVD) to the discrimination matrix, which enumerates the differing patterns. The method works well for identifying meaningful functional differences between subjects.

Ngoc Dung Bui, Hieu Cuong Nguyen, Sellappan Palaniappan, Siew Ann Cheong
Exploiting GPU for Large Scale Fingerprint Identification

Fingerprints are the most widely used biometric feature for identification. Although state-of-the-art algorithms are very accurate, fast processing for databases containing millions of fingerprints is highly demanded. GPU devices are widely used for parallel computing tasks because of their efficiency and low cost. In this paper, we propose to adapt the minutia cylinder-code (MCC) matching algorithm, an algorithm efficient in terms of accuracy, to the GPU. The proposed method fits the GPU architecture well, which makes it easy to implement. The results of our experiments with a GTX-680 device show that the proposed algorithm can perform 8.5 million matches per second, which is suitable for real-time identification systems with databases containing millions of fingerprints.

Hong Hai Le, Ngoc Hoa Nguyen, Tri Thanh Nguyen
Extraction of Myocardial Fibrosis Using Iterative Active Shape Method

The article deals with complex analysis of myocardial fibrosis. In clinical practice, myocardial fibrosis is commonly examined by MRI and assessed by the human eye; there is no diagnostic software alternative for evaluating myocardial fibrosis features. The proposed method partially solves this problem. Its main purpose is the automatic extraction of the fibrosis area, represented by a closed curve that reflects the shape of the analyzed object. At the beginning of the algorithm, an initial circle is placed on the fibrosis area. In iterative steps, this circle adopts the shape of the pathological lesion. Before the segmentation process, a region of interest (RoI) must be specified and the image preprocessed, chiefly by low-pass filtration, which suppresses unwanted adjacent objects. This step is important because the active shape method could otherwise spread beyond the fibrosis borders, and the resulting curve would not reflect the real shape of the myocardial fibrosis.

Jan Kubicek, Iveta Bryjova, Marek Penhaker, Michal Kodaj, Martin Augustynek
Increasing the Efficiency of GPU-Based HOG Algorithms Through Tile-Images

Object detection systems which operate on large data streams require efficient scaling with the available computation power. We analyze how the use of tile-images can increase the efficiency (i.e. execution speed) of distributed HOG-based object detectors. Furthermore, we discuss the challenges of using our algorithms in practical large-scale scenarios. We show with a structured evaluation that our approach can provide a speed-up of 30-180 % for existing architectures. Due to its generic formulation, it can be applied to a wide range of HOG-based (or similar) algorithms. In this context we also study the effects of applying our method to an existing detector and discuss a scalable strategy for distributing the computation among the nodes of a cluster system.

Darius Malysiak, Markus Markard
Gradient Depth Map Based Ground Plane Detection for Mobile Robot Applications

In navigation and guidance for mobile robots using stereo visual imagery, the main problem is to detect the ground plane in images acquired by a stereo camera system mounted on the mobile device. This paper focuses on effective ground detection based on graphical analysis of the gradient depth map, evaluated on the input depth map within a given window. The detected ground plane is further divided into blocks which are classified into ground and non-ground regions to eliminate falsely detected ground planes, followed by smoothing in a refinement process. The proposed approach has also been shown to be effective in detecting obstacles appearing in the ground plane while the mobile device is moving. In addition, the algorithm is simple, reliable and feasible, and may be efficiently implemented on embedded hardware with limited resources for real-time applications.

Dang Khanh Hoa, Pham The Cuong, Nguyen Tien Dzung
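The idea behind a gradient depth map test for ground, as used in the abstract above, is that for a forward-looking camera the depth of a flat ground surface decreases smoothly toward the bottom of the image, while obstacles show a near-zero vertical depth gradient. A minimal sketch, in which the gradient band (`lo`, `hi`) is an illustrative parameter and not the paper's calibrated value:

```python
def ground_mask(depth, lo=0.5, hi=2.0):
    """Label pixels whose vertical depth gradient falls in a plausible
    ground range. `depth` is a row-major 2D list of depth values; the
    band (lo, hi) is a hypothetical threshold pair for illustration.
    """
    rows, cols = len(depth), len(depth[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols):
            # ground: depth shrinks toward the bottom of the image
            g = depth[r][c] - depth[r + 1][c]
            mask[r][c] = lo <= g <= hi
    return mask
```

Columns with constant depth (vertical obstacles) fail the band test and are kept out of the ground mask.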
Multiscale Car Detection Using Oriented Gradient Feature and Boosting Machine

In many car detection methods, candidate regions, which vary in size and aspect ratio, are resized to a fixed size in order to extract features of the same dimensionality. However, this process loses local object information due to interpolation. This paper addresses the problem with the Scalable Histogram of Oriented Gradients (SHOG), which extracts fixed-length features for a region of any size without resizing it. In addition, instead of using high-dimensional features in the training stage, our proposal divides the feature into several low-dimensional sub-features. Each sub-feature is trained with an SVM, called a weak classifier, and a boosting strategy combines the weak classifier results into a strong classifier. Comprehensive experiments show that the accuracy of SHOG is higher than that of standard HOG by 3 % and 4 %, without and with boosting, respectively.

Wahyono, Van-Dung Hoang, Kang-Hyun Jo
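The combination step in the abstract above, weak SVM decisions merged into a strong classifier, can be sketched as a weighted vote. The classifiers and weights here are placeholders; the trained SVMs and the boosting weights themselves are not given in the abstract.

```python
def strong_classify(x, weak_classifiers, weights):
    """Weighted-vote strong classifier over weak classifier decisions,
    in the spirit of the boosting stage sketched above. Each weak
    classifier maps a sub-feature vector to +1 (car) or -1 (non-car);
    weights would come from the boosting procedure.
    """
    score = sum(w * clf(x) for clf, w in zip(weak_classifiers, weights))
    return 1 if score >= 0 else -1
```

With weights skewed toward a dissenting weak classifier, the strong decision flips accordingly.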
Probabilistic Approach to Content-Based Indexing and Categorization of Temporally Aggregated Shots in News Videos

The most frequently stored and browsed videos in Web video collections, TV show archives, documentary video archives, video-on-demand systems, personal video archives, etc. are broadcast news videos and sports news videos. Content-based indexing of news videos is based on the automatic detection of shots, i.e. of the main structural video units. Video shots can be of different categories, such as intro and final animations, chart, diagram or table shots, anchor, reporter, statement or interview shots, and finally the most informative report shots. The content analysis of a video shot is a very time-consuming process using a specific strategy adequate for a given shot category. To analyse the content of videos faster, it is desirable to reduce the video space analysed in time-consuming content-based indexing by using temporal aggregation, which groups shots of the same event or the same category into scenes. Furthermore, determining the most likely category on the basis of time relations also reduces analysis time, since the adequate method of analysis can be applied first. The paper examines the usefulness of shot time relations for determining the most likely category of a shot and for optimizing the order of the applied strategies.

Kazimierz Choroś
A Method of Data Registration for 3D Point Clouds Combining with Motion Capture Technologies

Data registration is one of the key techniques in 3D scanning. Traditional registration methods have the disadvantage of requiring many calibration markers or other accessories, which greatly reduces the convenience and usability of the scanning system; moreover, the markers cover part of the limited useful surface of the measured object. This paper proposes a new method to overcome these shortcomings. In the method, the 3D scanner and the motion capture (Mocap) device, which have completely different elements, are effectively combined into a single system. The position and posture of the measured object can be changed as desired: the Mocap system guides the spatial localization of the measured object with high flexibility and precision. Dynamic motion data and static scan data are obtained in real time by the Mocap system and the 3D scanner, respectively. Finally, the heterogeneous spatial data are converted into the same 3D space, and the partial point clouds are spliced into a whole 3D model. Experiments show that the method is valid.

Shicheng Zhang, Dongsheng Zhou, Qiang Zhang
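The final step described in the abstract above, converting heterogeneous spatial data into one common space, amounts to estimating a rigid transform between matched point sets. As a minimal sketch, the closed-form 2D case is shown (it avoids a full SVD); the real system works in 3D with Mocap-guided poses, which this code does not reproduce.

```python
import math

def align_2d(src, dst):
    """Closed-form 2D rigid registration: find the rotation angle and
    translation mapping matched points `src` onto `dst` in the
    least-squares sense (the 2D analogue of the Kabsch procedure).
    """
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n; cdy = sum(p[1] for p in dst) / n
    # accumulate cross-covariance terms of the centred point sets
    sxx = sxy = 0.0
    for (x, y), (u, v) in zip(src, dst):
        x -= csx; y -= csy; u -= cdx; v -= cdy
        sxx += x * u + y * v        # dot part
        sxy += x * v - y * u        # cross part
    theta = math.atan2(sxy, sxx)    # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)  # translation after rotating the centroid
    ty = cdy - (s * csx + c * csy)
    return theta, (tx, ty)
```

Applying the recovered transform to `src` reproduces `dst` exactly for noise-free correspondences.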
Detection and Recognition of Speed Limit Sign from Video

Properly identifying speed limit traffic signs can alert drivers to the highest speed allowed and effectively reduce the number of traffic accidents. In this paper, we put forward an efficient detection method for speed limit traffic signs based on the fast radial symmetry transform with a new Sobel operator. Once a speed limit sign is detected, its digits must be segmented. Digit segmentation is achieved by cropping the candidate traffic sign from the traffic scene, binarizing it with the Otsu thresholding algorithm, and normalizing it to a uniform size. Finally, the signs are recognized and classified with a DAG-SVMs classifier trained for this purpose. We tested 10 videos of about 28 min in total, under cloudy weather and dusk illumination conditions. The recognition rate for frames containing a speed limit sign is 90.48 %.

Lei Zhu, Chun-Sheng Yang, Jeng-Shyang Pan
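The binarization step in the abstract above uses Otsu's method, which is standard and can be sketched on a flat list of 8-bit gray values; operating on a flat list rather than an image array is a simplification of this sketch.

```python
def otsu_threshold(gray):
    """Otsu's method: return the threshold maximizing the between-class
    variance of an 8-bit gray-value histogram."""
    hist = [0] * 256
    for v in gray:
        hist[v] += 1
    total = len(gray)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sum_b = 0.0; w_b = 0; best_t = 0; best_var = -1.0
    for t in range(256):
        w_b += hist[t]                       # background pixel count
        if w_b == 0:
            continue
        w_f = total - w_b                    # foreground pixel count
        if w_f == 0:
            break
        sum_b += t * hist[t]
        m_b = sum_b / w_b                    # background mean
        m_f = (sum_all - sum_b) / w_f        # foreground mean
        var = w_b * w_f * (m_b - m_f) ** 2   # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

For a cropped sign, pixels at or below the threshold form one class (e.g. the dark digits) and the rest the other.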
FARTHEST: FormAl distRibuTed scHema to dEtect Suspicious arTefacts

Security breaches are a major concern for both governmental and corporate organisations. This is the case, among others, for airports and official buildings, where X-ray security scanners are deployed to detect elements representing a threat to human life. In this paper we propose a formal distributed schema, formally specified and analysed, to detect suspicious artefacts. Our approach consists in the integration of several image detection algorithms in order to detect a wide spectrum of weapons, such as guns, knives and bombs. We also present a case study in which performance experiments are carried out to analyse the scalability of this schema when it is deployed on current systems.

Pablo C. Cañizares, Mercedes G. Merayo, Alberto Núñez
A Fast and Robust Image Watermarking Scheme Using Improved Singular Value Decomposition

With the popularity of editing software and the Internet, digital content can easily be manipulated and distributed, and illegal reproduction of digital products has become a real problem. Watermarking has been considered an effective solution for copyright protection and authentication. However, watermarking schemes usually encounter difficulties with computational complexity, imperceptibility and robustness. In this paper, based on an improved singular value decomposition (SVD), we propose a new image watermarking scheme that reduces the computational complexity. To this end, we design an algorithm to directly compute the largest eigenvalues and eigenvectors of the segmented blocks of the analyzed image. Moreover, an adaptive embedding technique is used to improve the robustness of the proposed scheme. Experimental results show that the scheme is fast, well suited to digital image watermarking, and outperforms several widely used schemes in terms of robustness and imperceptibility.

Cao Thi Luyen, Nguyen Hieu Cuong, Pham Van At
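The abstract above hinges on computing only the largest eigenvalue and eigenvector of each block instead of a full SVD. The paper's own direct algorithm is not reproduced here; power iteration is a standard substitute for that step and illustrates why it is cheaper than a full decomposition.

```python
import random

def largest_eig(A, iters=200, seed=0):
    """Power iteration for the dominant eigenpair of a small symmetric
    matrix (e.g. a per-block Gram matrix). Returns (eigenvalue,
    eigenvector); convergence assumes a dominant eigenvalue exists."""
    rng = random.Random(seed)
    n = len(A)
    v = [rng.random() for _ in range(n)]
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]            # renormalize each step
    # Rayleigh quotient gives the eigenvalue estimate
    lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v
```

Each iteration costs one matrix-vector product, so for small blocks this is far cheaper than decomposing the whole image.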
Accelerative Object Classification Using Cascade Structure for Vision Based Security Monitoring Systems

Nowadays, object detection systems have achieved significant results and are applied in many important tasks such as security monitoring, surveillance, autonomous systems and human-machine interaction. However, one of the main challenges is the limited computational processing time. To deal with this, a method for speeding up processing is investigated in this paper. A binary cascaded-structure detection model is applied to security monitoring systems (SMS). Classification based on a cascade structure has been shown to discard negative samples extremely rapidly. The SMS is built on two main techniques. First, a feature descriptor representing image data with a modified Histograms of Oriented Gradients (HOG) method is applied. This method extracts a huge set of partial descriptors, which are then filtered to retain only highly discriminative features on the training set. Second, a cascade structure based on SVM kernels is used for rapid binary classification of objects. To take advantage of optimal SVM classification, the local descriptor within each block is fed to an SVM. The number of SVMs in each classifier depends on the precision rate decided at the training step. The experimental results demonstrate the effectiveness of this method on a variety of datasets.

Van-Dung Hoang, Kang-Hyun Jo
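The speed advantage claimed in the abstract above comes from early rejection: a window is dropped as soon as any stage scores it below threshold, so later, costlier SVMs rarely run on negatives. A minimal sketch, with placeholder stage functions standing in for the block-wise SVM scores:

```python
def cascade_classify(x, stages):
    """Cascade evaluation: each stage is a (score_fn, threshold) pair.
    A window is rejected as soon as one stage's score falls below its
    threshold; only windows surviving all stages are positive. The
    stage functions here are placeholders for trained SVM scores."""
    for score_fn, thresh in stages:
        if score_fn(x) < thresh:
            return False        # rejected early -> negative sample
    return True                 # survived every stage -> positive
```

Ordering stages from cheapest to most expensive makes the average per-window cost close to that of the first stage alone.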
Selections of Suitable UAV Imagery’s Configurations for Regions Classification

Unmanned Aerial Vehicles (UAVs) are used to conduct a variety of recognition tasks as well as specific missions such as target tracking and safe landing. The image sequences transmitted for interpretation at the ground station usually face data transmission constraints. In this paper, on one hand, we handle a surveillance mission by segmenting a UAV video's content into semantic regions. We deploy a spatio-temporal framework that considers the specific characteristics of UAV videos to segment multiple regions of interest. After post-processing the segmentation results, a support vector machine classifier is used to recognize the regions. As the temporal feature, we combine the results from previous frames through a state transition formulated as a Markov model. On the other hand, this study also assesses the influence of data reduction techniques on the proposed methods. Comparisons between the untreated configuration and control conditions under manipulations of the frame rate, spatial resolution and compression ratio demonstrate how these data reduction techniques adversely influence the algorithm's performance. The experiments also point out the optimal configuration for obtaining a trade-off between target performance and data transmission limits.

Hai Vu, Thi Lan Le, Van Giap Nguyen, Tan Hung Dinh
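The Markov state transition mentioned in the abstract above can be sketched as one smoothing step: the previous frame's class distribution is propagated through a transition matrix and reweighted by the current per-frame classifier output. The transition matrix in the example is hypothetical, not the paper's trained model.

```python
def smooth_step(prev_probs, frame_probs, trans):
    """One step of Markov temporal smoothing over region classes.
    prev_probs:  class distribution from the previous frame
    frame_probs: per-frame classifier scores for the current frame
    trans:       row-stochastic transition matrix trans[i][j] = P(j | i)
    Returns the normalized posterior for the current frame."""
    n = len(prev_probs)
    # predict: propagate the previous distribution through the chain
    pred = [sum(prev_probs[i] * trans[i][j] for i in range(n)) for j in range(n)]
    # update: reweight by the current frame's classifier output
    post = [pred[j] * frame_probs[j] for j in range(n)]
    z = sum(post) or 1.0
    return [p / z for p in post]
```

A sticky transition matrix keeps the label stable when the per-frame classifier is uncertain.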
Backmatter
Metadata
Title
Intelligent Information and Database Systems
Editors
Ngoc Thanh Nguyen
Bogdan Trawiński
Hamido Fujita
Tzung-Pei Hong
Copyright Year
2016
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-662-49381-6
Print ISBN
978-3-662-49380-9
DOI
https://doi.org/10.1007/978-3-662-49381-6
