2009 | Book

Advances in Artificial Intelligence

22nd Canadian Conference on Artificial Intelligence, Canadian AI 2009, Kelowna, Canada, May 25-27, 2009, Proceedings

Edited by: Yong Gao, Nathalie Japkowicz

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science

About this book

This book constitutes the refereed proceedings of the 22nd Conference of the Canadian Society for Computational Studies of Intelligence, Canadian AI 2009, held in Kelowna, Canada, in May 2009. The 30 revised full papers presented together with 5 revised short papers and 8 papers from the graduate student symposium were carefully reviewed and selected from 75 submissions. The papers present original high-quality research in all areas of Artificial Intelligence and apply historical AI techniques to modern problem domains as well as recent techniques to historical problem settings.

Table of Contents

Frontmatter

Invited Talks

AI in Web Advertising: Picking the Right Ad Ten Thousand Times a Second

Online advertising is the primary economic force behind many Internet services, ranging from major Web search engines to obscure blogs. A successful advertising campaign should be integral to the user experience and relevant to the user's information needs, as well as economically worthwhile to the advertiser and the publisher. This talk will cover some of the methods and challenges of computational advertising, a new scientific discipline that studies advertising on the Internet. To a first approximation, and ignoring the economic factors above, finding user-relevant ads can be reduced to conventional information retrieval. However, since both queries and ads are quite short, it is essential to augment the matching process with external knowledge. We demonstrate how to enrich query representation using Web search results, and thus use the Web as a repository of relevant query-specific knowledge. We will discuss how computational advertising benefits from research in many AI areas, such as machine learning, machine translation, and text summarization, and also survey some of the new problems it poses in natural language generation, named entity recognition, and user modeling.

Evgeniy Gabrilovich
Living with Constraints

In order to thrive, an agent must satisfy dynamic constraints deriving from four sources: its internal structure, its goals and preferences, its external environment and the coupling between its internal and external worlds. The life of any agent who does not respect those constraints will be out of balance. Based on this framing of the problem of agent design, I shall give four perspectives on the theme of living with constraints, beginning with a theory of constraint-based agent design and a corresponding experiment in robot architecture. Second, I shall touch briefly on a personal historical note, having lived with the evolving concept of the pivotal role of constraints throughout my research life. Third, I shall outline our work on the design of two assistive technology prototypes for people with physical and mental disabilities, who are living with significant additional constraints. Finally, I shall suggest that our collective failure to recognize, satisfy and live with various constraints could explain why many of the worlds we live in seem to be out of kilter. This approach hints at ways to restore the balance. Some of the work discussed is joint with Jim Little, Alex Mihailidis, Pinar Muyan-Ozçelik, Robert St-Aubin, Pooja Viswanathan, Suling Yang, and Ying Zhang.

Alan K. Mackworth
Computer (and Human) Perfection at Checkers

In 1989 the Chinook project began with the goal of winning the human World Checkers Championship. There was an imposing obstacle to success: the human champion, Marion Tinsley. Tinsley was as close to perfection at the game as was humanly possible. To be better than Tinsley meant that the computer had to be perfect. In effect, one had to solve checkers. Little did we know that our quest would take 18 years to complete. What started out as a research project quickly became a personal quest and an emotional roller coaster. In this talk, the creator of Chinook tells the story of the quest for computer perfection at the game of checkers.

Jonathan Schaeffer

Regular Papers

Decision Tree Learning Using a Bayesian Approach at Each Node

We explore the problem of learning decision trees using a Bayesian approach, called TREBBLE (TREe Building by Bayesian LEarning), in which a population of decision trees is generated by constructing trees using probability distributions at each node. Predictions are made either by using Bayesian Model Averaging to combine information from all the trees (TREBBLE-BMA) or by using the single most likely tree (TREBBLE-MAP), depending on what is appropriate for the particular application domain. We show on benchmark data sets that this method is more accurate than the traditional decision tree learning algorithm C4.5 and is as accurate as the Bayesian method SimTree, while being much simpler to understand and implement.

In many application domains, such as help-desks and medical diagnoses, a decision tree needs to be learned from a prior tree (provided by an expert) and some (usually small) amount of training data. We show how TREBBLE-MAP can be used to learn a single tree that performs better than using either the prior tree or the training data alone.

Mirela Andronescu, Mark Brodie
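
To make the averaging idea concrete, here is a minimal, hypothetical sketch of Bayesian Model Averaging over a population of decision trees, in the spirit of TREBBLE-BMA. It is not the authors' algorithm (TREBBLE builds trees from node-level probability distributions); trees here are simply grown on bootstrap replicates and weighted by an approximate likelihood.

    # Hedged sketch: BMA over bootstrap-grown trees, weighted by an
    # approximate in-sample likelihood (a simplifying assumption).
    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    rng = np.random.default_rng(0)

    trees, log_liks = [], []
    for _ in range(25):
        idx = rng.integers(0, len(X), len(X))            # bootstrap replicate
        t = DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx])
        p = t.predict_proba(X)[np.arange(len(y)), y]     # P(y_i | tree, x_i)
        log_liks.append(np.log(np.clip(p, 1e-9, 1.0)).sum())
        trees.append(t)

    ll = np.asarray(log_liks)
    w = np.exp(ll - ll.max())                            # ~ posterior weights
    w /= w.sum()

    # BMA: posterior-weighted average of per-tree class probabilities;
    # a MAP variant would instead keep the single highest-weight tree.
    bma = sum(wi * t.predict_proba(X) for wi, t in zip(w, trees))
    print("BMA training accuracy:", (bma.argmax(axis=1) == y).mean())
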
Generating Satisfiable SAT Instances Using Random Subgraph Isomorphism

We report preliminary empirical results on generating satisfiable SAT instances using a variation of the Random Subgraph Isomorphism model. The experiments show that the model exhibits an easy-hard-easy pattern of empirical hardness. For both complete and incomplete solvers, the hardness of the instances at the peak seems to increase exponentially with the instance size. The hardness of the instances generated by the model appears to be comparable with that of Quasigroup with Holes instances, which are known to be hard for satisfiability solvers. The state-of-the-art SAT solvers we tested performed differently from one another when applied to these instances.

Cǎlin Anton, Lane Olson
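
The abstract does not spell out the encoding, but the kind of construction such generators build on can be sketched as a generic direct CNF translation of subgraph isomorphism; the pattern and target graphs below are arbitrary examples, not the paper's model.

    # Generic direct CNF encoding of subgraph isomorphism: variable
    # x(i,a) is true iff pattern node i maps to target node a.
    # Illustrative only -- not the authors' generator.
    import sys
    from itertools import combinations

    def encode(p_nodes, p_edges, t_nodes, t_edges, out=sys.stdout):
        var = lambda i, a: i * t_nodes + a + 1
        t_adj = set(t_edges) | {(b, a) for (a, b) in t_edges}
        clauses = []
        for i in range(p_nodes):
            clauses.append([var(i, a) for a in range(t_nodes)])   # i maps somewhere
            for a, b in combinations(range(t_nodes), 2):          # ...and only once
                clauses.append([-var(i, a), -var(i, b)])
        for i, j in combinations(range(p_nodes), 2):              # injective mapping
            for a in range(t_nodes):
                clauses.append([-var(i, a), -var(j, a)])
        for i, j in p_edges:                                      # edges preserved
            for a in range(t_nodes):
                for b in range(t_nodes):
                    if a != b and (a, b) not in t_adj:
                        clauses.append([-var(i, a), -var(j, b)])
        out.write("p cnf %d %d\n" % (p_nodes * t_nodes, len(clauses)))
        for c in clauses:
            out.write(" ".join(map(str, c)) + " 0\n")

    # embed a triangle into a 4-cycle with one chord (satisfiable)
    encode(3, [(0, 1), (1, 2), (0, 2)], 4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
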
Enhancing the Bilingual Concordancer TransSearch with Word-Level Alignment

Despite the impressive number of recent studies devoted to improving the state of the art of Machine Translation (MT), Computer Assisted Translation (CAT) tools remain the preferred solution of human translators when publication quality is of concern. In this paper, we present our perspectives on improving the commercial bilingual concordancer TransSearch, a Web-based service whose core technology mainly relies on sentence-level alignment. We report on experiments which show that it can greatly benefit from statistical word-level alignment.

Julien Bourdaillet, Stéphane Huet, Fabrizio Gotti, Guy Lapalme, Philippe Langlais
Financial Forecasting Using Character N-Gram Analysis and Readability Scores of Annual Reports

Two novel Natural Language Processing (NLP) classification techniques are applied to the analysis of corporate annual reports in the task of financial forecasting. The hypothesis is that the textual content of annual reports contains vital information for assessing the performance of the stock over the next year. The first method is based on character n-gram profiles, which are generated for each annual report and then labeled using CNG classification. The second method draws on a more traditional approach, where readability scores are combined with performance inputs and then supplied to a support vector machine (SVM) for classification. Both methods consistently outperformed a benchmark portfolio, and their combination proved to be even more effective and efficient, as the combined models yielded the highest returns with the fewest trades.

Matthew Butler, Vlado Kešelj
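
A compact sketch of the first method's machinery: character n-gram profiles compared with the CNG (Common N-Grams) dissimilarity of Kešelj et al. The texts, profile length and n below are placeholder choices, not the paper's settings.

    # Character n-gram profiles and the CNG dissimilarity (sketch).
    from collections import Counter

    def profile(text, n=3, top=100):
        """Normalized frequencies of the `top` most common character n-grams."""
        grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
        total = sum(grams.values())
        return {g: c / total for g, c in grams.most_common(top)}

    def cng_distance(p1, p2):
        d = 0.0
        for g in set(p1) | set(p2):
            f1, f2 = p1.get(g, 0.0), p2.get(g, 0.0)
            d += ((f1 - f2) / ((f1 + f2) / 2)) ** 2
        return d

    # classify a report by its nearest class profile (toy texts)
    classes = {"up": profile("earnings rose on strong demand and growth"),
               "down": profile("losses widened on weak demand and writedowns")}
    report = profile("revenue and earnings rose on strong growth")
    print(min(classes, key=lambda c: cng_distance(report, classes[c])))
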
Statistical Parsing with Context-Free Filtering Grammar

Statistical parsers that simultaneously generate both phrase-structure and lexical dependency trees have been limited to date in two important ways: detecting non-projective dependencies has not been integrated with other parsing decisions, and/or the constraints between phrase-structure and dependency structure have been overly strict. We introduce context-free filtering grammar as a generalization of a lexicalized factored parsing model, and develop a scoring model to resolve parsing ambiguities for this new grammar formalism. We demonstrate the new model’s flexibility by implementing a statistical parser for German, a freer-word-order language exhibiting a mixture of projective and non-projective syntax, using the TüBa-D/Z treebank [1].

Michael Demko, Gerald Penn
Machine Translation of Legal Information and Its Evaluation

This paper presents the machine translation system known as TransLI (Translation of Legal Information), developed by the authors for automatic translation of Canadian Court judgments from English to French and from French to English. Normally, a certified translation of a legal judgment takes several months to complete. The authors attempted to shorten this time significantly using a unique statistical machine translation system which has attracted the attention of the federal courts in Canada for its accuracy and speed. This paper also describes the results of a human evaluation of the output of the system in the context of a pilot project in collaboration with the federal courts of Canada.

Atefeh Farzindar, Guy Lapalme
An Iterative Hybrid Filter-Wrapper Approach to Feature Selection for Document Clustering

The manipulation of large-scale document data sets often involves the processing of a wealth of features that correspond to the available terms in the document space. Employing all of these features in the learning machine of interest is time consuming and at times reduces its performance. The feature space may contain many redundant or non-discriminant features; therefore, feature selection techniques have been widely used. In this paper, we introduce a hybrid feature selection algorithm that selects features by applying both filter and wrapper methods, and iteratively selects the most competent set of features with an expectation maximization based algorithm. The proposed method employs a greedy algorithm for feature selection in each step. The method has been tested on various data sets, and the results are reported in this paper. Its performance, in terms of both accuracy and Normalized Mutual Information, is promising.

Mohammad-Amin Jashki, Majid Makki, Ebrahim Bagheri, Ali A. Ghorbani
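
One filter-wrapper step can be sketched as follows: a filter (here chi-squared) shortlists candidate terms, and a wrapper then greedily adds whichever candidate most improves cross-validated accuracy. The EM-based outer iteration of the paper is not reproduced, and the data is synthetic.

    # Hedged sketch of a single hybrid filter-wrapper selection step.
    import numpy as np
    from sklearn.feature_selection import chi2
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import MultinomialNB

    rng = np.random.default_rng(0)
    X = rng.poisson(1.0, size=(300, 50)).astype(float)   # synthetic term counts
    y = (X[:, 3] + X[:, 7] > 3).astype(int)              # two informative terms

    scores, _ = chi2(X, y)                               # filter step
    candidates = list(np.argsort(-scores)[:10])          # shortlist 10 terms

    selected, best = [], 0.0                             # greedy wrapper step
    improved = True
    while improved and candidates:
        improved = False
        for f in candidates:
            s = cross_val_score(MultinomialNB(), X[:, selected + [f]], y, cv=3).mean()
            if s > best:
                best, pick, improved = s, f, True
        if improved:
            selected.append(pick)
            candidates.remove(pick)
    print("selected terms:", selected, "cv accuracy:", round(best, 3))
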
Cost-Based Sampling of Individual Instances

In many practical domains, misclassification costs can differ greatly and may be represented by class ratios; however, most learning algorithms struggle with skewed class distributions. The difficulty is attributed to classifiers being designed to maximize accuracy. Researchers have proposed several techniques to address this problem, including under-sampling the majority class, employing a probabilistic algorithm, and adjusting the classification threshold. In this paper, we propose a general sampling approach that assigns weights to individual instances according to the cost function. This approach helps reveal the relationship between classification performance and class ratios, and allows the identification of an appropriate class distribution for which the learning method achieves a reasonable performance on the data. Our results show that combining an ensemble of Naive Bayes classifiers with threshold selection and under-sampling techniques works well for imbalanced data.

William Klement, Peter Flach, Nathalie Japkowicz, Stan Matwin
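
The paper assigns weights to individual instances according to the cost function; a closely related and well-known scheme is cost-proportionate rejection sampling (in the style of Zadrozny et al.), sketched here on synthetic data as an illustration rather than the authors' exact procedure.

    # Cost-proportionate rejection sampling: keep instance i with
    # probability costs[i] / max(costs), so the sampled class
    # distribution reflects misclassification costs.
    import numpy as np

    def cost_rejection_sample(X, y, costs, seed=0):
        rng = np.random.default_rng(seed)
        keep = rng.random(len(y)) < np.asarray(costs) / np.max(costs)
        return X[keep], y[keep]

    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 4))
    y = (rng.random(1000) < 0.2).astype(int)      # 20% positives
    costs = np.where(y == 1, 5.0, 1.0)            # false negatives cost 5x
    Xs, ys = cost_rejection_sample(X, y, costs)
    print("positive rate before %.2f, after %.2f" % (y.mean(), ys.mean()))
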
Context Dependent Movie Recommendations Using a Hierarchical Bayesian Model

We use a hierarchical Bayesian approach to model user preferences in different contexts or settings. Unlike many previous recommenders, our approach is content-based. We assume that for each context, a user has a different set of preference weights which are linked by a common, “generic context” set of weights. The approach uses Expectation Maximization (EM) to estimate both the generic context weights and the context specific weights. This improves upon many current recommender systems that do not incorporate context into the recommendations they provide. In this paper, we show that by considering contextual information, we can improve our recommendations, demonstrating that it is useful to consider context in giving ratings. Because the approach does not rely on connecting users via collaborative filtering, users are able to interpret contexts in different ways and invent their own contexts.

Daniel Pomerantz, Gregory Dudek
Automatic Frame Extraction from Sentences

We present a method for automatic extraction of frames from a dependency graph. Our method uses machine learning applied to a dependency tree to identify frames and assign frame elements. The system is evaluated by cross-validation on FrameNet sentences, and also on the test data from the SemEval 2007 task 19. Our system is intended for use in natural language processing applications such as summarization, entailment, and novelty detection.

Martin Scaiano, Diana Inkpen
Control of Constraint Weights for a 2D Autonomous Camera

This paper addresses the problem of deducing and adjusting constraint weights at run time to guide the movement of the camera in an informed and controlled way in response to the requirements of the shot. This enables the control of weights at the frame level. We analyze the mathematical representation of the cost structure of the search domain so that the constraint solver can search the domain efficiently. Here we consider a simple tracking shot of a single target without occlusion or other environment elements. In this paper we consider only the distance, orientation, frame coherence distance and frame coherence rotation constraints in 2D. The cost structure for 2D suggests the use of a binary search to find the solution camera position.

Md. Shafiul Alam, Scott D. Goodwin
Training Global Linear Models for Chinese Word Segmentation

This paper examines how one can obtain state-of-the-art Chinese word segmentation using global linear models. We provide experimental comparisons that give a detailed road map for obtaining state-of-the-art accuracy on various datasets. In particular, we compare the use of reranking with full beam search; we compare various methods for learning weights for full-sentence features, such as language model features; and we compare an Averaged Perceptron global linear model with the Exponentiated Gradient max-margin algorithm.

Dong Song, Anoop Sarkar
A Concurrent Dynamic Logic of Knowledge, Belief and Certainty for Multi-agent Systems

This paper extends the logic of knowledge, belief and certainty from single-agent to multi-agent systems, combining a logic of knowledge, belief and certainty for multi-agent systems with actions that have concurrent and dynamic properties. Based on this, we present a concurrent dynamic logic of knowledge, belief and certainty for MAS, called CDKBC logic. Furthermore, a CDKBC model is given for interpreting this logic. We construct a CDKBC proof system for the logic, show that the proof system is sound and complete, and prove that the validity problem for the system is EXPTIME-complete.

Lijun Wu, Jinshu Su, Xiangyu Luo, Zhihua Yang, Qingliang Chen
Enumerating Unlabeled and Root Labeled Trees for Causal Model Acquisition

To specify a Bayes net (BN), a conditional probability table (CPT), often of an effect conditioned on its n causes, needs to be assessed for each node. It generally has complexity exponential in n. The non-impeding noisy-AND (NIN-AND) tree is a recently developed causal model that reduces the complexity to linear, while modeling both reinforcing and undermining interactions among causes. Acquisition of an NIN-AND tree model involves elicitation of a linear number of probability parameters and a tree structure. Instead of asking the human expert to describe the structure from scratch, in this work we develop a two-step menu selection technique that aids structure acquisition.

Yang Xiang, Zoe Jingyu Zhu, Yu Li
Compiling the Lexicographic Inference Using Boolean Cardinality Constraints

This paper sheds light on the lexicographic inference from stratified belief bases, which is known to have desirable properties from theoretical, practical and psychological points of view. However, this inference is computationally expensive: it amounts to a $\Delta_2^p$-complete problem. In order to tackle this hardness, we propose in this work a new compilation of the lexicographic inference using so-called Boolean cardinality constraints. This compilation enables polynomial-time lexicographic inference and offers the possibility to update the priority relation between the strata without any re-compilation. Moreover, it can be efficiently extended to deal with the lexicographical closure inference, which takes an important place in default reasoning. Furthermore, unlike existing compilation approaches to the lexicographic inference, ours can be efficiently parametrized by any target compilation language. In particular, it enables taking advantage of the well-known prime implicates language, which has been quite influential in artificial intelligence and computer science in general.

Safa Yahi, Salem Benferhat
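
For intuition, the meaning of a Boolean cardinality constraint "at most k of x1,...,xn are true" can be written down as CNF naively, one clause per (k+1)-subset. Practical compilations use polynomial-size encodings such as sequential counters; the exponential form below only illustrates what is being enforced, and is not the paper's encoding.

    # Naive CNF for "at most k of the given variables are true":
    # every (k+1)-subset must contain at least one false literal.
    from itertools import combinations

    def at_most_k(variables, k):
        return [[-v for v in subset] for subset in combinations(variables, k + 1)]

    # at most 2 of x1..x4
    for clause in at_most_k([1, 2, 3, 4], 2):
        print(clause)
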

Short Papers

Improving Document Search Using Social Bookmarking

During the last decade, the use of community-based techniques has emerged in various data mining and search systems. Nowadays, many web search engines use social network analysis to improve their search results. The present work incorporates a popular collaborative tool, social bookmarking, into search: we discuss a technique that uses social bookmarking information to improve document search.

Hamidreza Baghi, Yevgen Biletskiy
Rank-Based Transformation in Measuring Semantic Relatedness

Rank weight functions have been shown to increase the accuracy of measures of semantic relatedness for Polish. We present a generalised ranking principle and demonstrate its effect on a range of established measures of semantic relatedness, and on a different language. The results confirm that the generalised transformation method based on ranking brings an improvement over several well-known measures.

Bartosz Broda, Maciej Piasecki, Stan Szpakowicz
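
A minimal sketch of the ranking idea: each word's raw relatedness scores to its neighbours are replaced by a weight that decreases with rank, which dampens scale effects in the raw measure. The 1/rank weight here is an illustrative choice, not the paper's exact transformation.

    # Replace raw relatedness scores by rank-based weights (sketch).
    import numpy as np

    def rank_transform(sim_row):
        order = np.argsort(-sim_row)            # best neighbour first
        ranks = np.empty_like(order)
        ranks[order] = np.arange(1, len(sim_row) + 1)
        return 1.0 / ranks                      # weight = 1/rank (assumption)

    raw = np.array([0.91, 0.12, 0.55, 0.40])    # relatedness of w to 4 words
    print(rank_transform(raw))                  # [1.   0.25 0.5  0.333...]
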
Optimizing a Pseudo Financial Factor Model with Support Vector Machines and Genetic Programming

We compare the effectiveness of Support Vector Machines (SVM) and Tree-based Genetic Programming (GP) in making accurate predictions on the movement of the Dow Jones Industrial Average (DJIA). The approach is facilitated through a novel representation of the data as a pseudo financial factor model, based on a linear factor model for representing correlations between the returns of different assets. To demonstrate the effectiveness of the data representation, the results are compared to models developed using only the monthly returns of the inputs. Principal Component Analysis (PCA) is initially used to translate the data into PC space to remove the excess noise that is inherent in financial data. The results show that the algorithms were able to achieve superior investment returns and higher classification accuracy with the aid of the pseudo financial factor model. Both models also outperformed the market benchmark, but ultimately the SVM methodology was superior in terms of accuracy and investment returns.

Matthew Butler, Vlado Kešelj
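
The processing chain described above (PCA to strip noise, then a classifier on the transformed inputs) is straightforward to sketch. The data below is synthetic and the pseudo financial factor model construction itself is not reproduced.

    # PCA denoising followed by an SVM direction classifier (sketch).
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.normal(size=(240, 30))              # 20 years of monthly inputs
    y = (rng.random(240) > 0.5).astype(int)     # DJIA up/down (synthetic)

    model = make_pipeline(PCA(n_components=10), SVC(kernel="rbf"))
    model.fit(X[:200], y[:200])
    print("held-out accuracy:", model.score(X[200:], y[200:]))
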
Novice-Friendly Natural Language Generation Template Authoring Environment

Natural Language Generation (NLG) systems can make data accessible in an easily digestible textual form; but using such systems requires sophisticated linguistic and sometimes even programming knowledge. We have designed and implemented an environment for creating and modifying NLG templates that requires no programming knowledge, and can operate with a minimum of linguistic knowledge. It allows specifying templates with any number of variables and dependencies between them. It internally uses SimpleNLG to provide the linguistic background knowledge. We test the performance of our system in the context of an interactive simulation game.

Maria Fernanda Caropreso, Diana Inkpen, Shahzad Khan, Fazel Keshtkar
A SVM-Based Ensemble Approach to Multi-Document Summarization

In this paper, we present a Support Vector Machine (SVM) based ensemble approach to the extractive multi-document summarization problem. Although an SVM can have good generalization ability, it may experience performance degradation through wrong classifications. We use a committee of several SVMs, i.e. Cross-Validation Committees (CVC), to form an ensemble of classifiers, where the strategy is to improve performance by correcting the errors of one classifier with the accurate output of others. The practicality and effectiveness of this technique are demonstrated by the experimental results.

Yllias Chali, Sadid A. Hasan, Shafiq R. Joty
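
A cross-validation committee can be sketched directly: each SVM trains on a different (K-1)/K portion of the data and the committee predicts by majority vote. Synthetic data stands in for the summarization features here; this illustrates the ensemble strategy, not the paper's system.

    # Cross-Validation Committee (CVC) of SVMs with majority voting.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import KFold
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, random_state=0)
    committee = [SVC().fit(X[tr], y[tr])
                 for tr, _ in KFold(n_splits=5, shuffle=True,
                                    random_state=0).split(X)]

    votes = np.stack([m.predict(X) for m in committee])
    majority = (votes.mean(axis=0) > 0.5).astype(int)   # 5 members, no ties
    print("committee accuracy:", (majority == y).mean())
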
Co-Training on Handwritten Digit Recognition

In this paper, we apply a semi-supervised learning paradigm, co-training, to handwritten digit recognition, so as to construct a high-performance recognition model with very few labeled images. Experimental results show that, given an arbitrary pair of feature types, co-training can always achieve high accuracy. Thus, it provides a generic and robust approach for constructing a high-performance model with very few labeled handwritten digit images.

Jun Du, Charles X. Ling
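
A minimal co-training loop in the style of Blum and Mitchell: two classifiers, one per feature view, repeatedly pseudo-label the unlabeled examples they are most confident about. The views, classifier and per-round budget below are placeholder choices, not the paper's setup.

    # Hedged co-training sketch on two synthetic feature views.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    def co_train(X1, X2, y, labeled_idx, rounds=10, per_round=5):
        y_cur = np.full(len(y), -1)               # -1 marks "unlabeled"
        y_cur[labeled_idx] = y[labeled_idx]
        for _ in range(rounds):
            for Xv in (X1, X2):                   # alternate the two views
                lab, unl = np.where(y_cur >= 0)[0], np.where(y_cur < 0)[0]
                if len(unl) == 0:
                    return y_cur
                clf = LogisticRegression().fit(Xv[lab], y_cur[lab])
                conf = clf.predict_proba(Xv[unl]).max(axis=1)
                top = unl[np.argsort(-conf)[:per_round]]   # most confident
                y_cur[top] = clf.predict(Xv[top])          # pseudo-label them
        return y_cur

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    X1, X2 = X[:, :5], X[:, 5:]                   # two feature "views"
    seed = np.r_[np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]]
    pred = co_train(X1, X2, y, seed)
    mask = pred >= 0
    print("labels assigned:", mask.sum(),
          "agreement:", (pred[mask] == y[mask]).mean())
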
Evaluation Methods for Ordinal Classification

Ordinal classification is a form of multi-class classification where there is an inherent ordering between the classes, but no meaningful numeric difference between them. Little attention has been paid to how to evaluate these problems, with many authors simply reporting accuracy, which does not account for the severity of the error. Several evaluation metrics are compared on a dataset for a problem of classifying user reviews, where the data is highly skewed towards the highest values. Mean squared error is found to be the best metric when we prefer many smaller errors in order to reduce the number of large ones, while mean absolute error is a good metric if we instead prefer fewer errors overall, with more tolerance for large errors.

Lisa Gaudette, Nathalie Japkowicz
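
A tiny worked example of the point about error severity: two models with identical accuracy on 1-5 star labels are separated by MAE and MSE, with MSE penalizing the single large error hardest.

    # Accuracy ignores error distance; MAE and MSE do not.
    import numpy as np

    y_true = np.array([5, 5, 4, 2, 1])
    a      = np.array([5, 4, 4, 2, 1])   # one small error (off by 1)
    b      = np.array([5, 1, 4, 2, 1])   # one large error (off by 4)

    for name, y_pred in [("small-error model", a), ("large-error model", b)]:
        acc = (y_pred == y_true).mean()
        mae = np.abs(y_pred - y_true).mean()
        mse = ((y_pred - y_true) ** 2).mean()
        print(f"{name}: accuracy={acc:.2f} MAE={mae:.2f} MSE={mse:.2f}")
    # both models score accuracy 0.80, but
    # model a: MAE 0.20, MSE 0.20 -- model b: MAE 0.80, MSE 3.20
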
STFLS: A Heuristic Method for Static and Transportation Facility Location Allocation in Large Spatial Datasets

This paper solves a static and transportation facility location allocation problem defined as follows: given a set of locations Loc and a set of demand objects D located in Loc, the goal is to allocate a set of static facilities S and a set of transportation facilities T to the locations in Loc, which minimizes both the average travelling distance from D to S and the maximum transportation travelling distance between D and S through T. The problem is challenging because two types of facilities are involved and cooperate with each other. In this paper, we propose a static and transportation facility location allocation algorithm, called STFLS, to solve the problem. The method uses two steps of searching for static facility and transportation facility locations. Experiments demonstrate the efficiency and practicality of the algorithm.

Wei Gu, Xin Wang, Liqiang Geng
An Ontology-Based Spatial Clustering Selection System

Spatial clustering, which groups similar spatial objects into classes, is an important research topic in spatial data mining. Many spatial clustering methods have been developed recently. However, many users do not know how to choose the spatial clustering method best suited to their own projects, due to a lack of expertise in the area. In order to reduce the difficulty of choosing, linking and executing appropriate programs, we build a spatial clustering ontology to formalize a set of concepts and relationships in the spatial clustering domain. Based on this ontology, we implement an ontology-based spatial clustering selection system (OSCS) to guide users in selecting an appropriate spatial clustering algorithm. The system consists of the following parts: a spatial clustering ontology, an ontology reasoner using a task model, a web server and a user interface. Preliminary experiments have been conducted to demonstrate the efficiency and practicality of the system.

Wei Gu, Xin Wang, Danielle Ziébelin
Exploratory Analysis of Co-Change Graphs for Code Refactoring

Version Control Systems (VCS) have always played an essential role in developing reliable software. Recently, many new ways of utilizing the information hidden in VCS have been discovered. Clustering layouts of software systems using VCS is one of them: it reveals groups of related artifacts of the software system, which can be visualized for easier exploration. In this paper we use an Expectation Maximization (EM) based probabilistic clustering algorithm and visualize the clustered modules using a compound node layout algorithm. Our experiments with the repositories of two medium-sized software tools give promising results, indicating improvements over many previous approaches.

Hassan Khosravi, Recep Colak
Classifying Biomedical Abstracts Using Committees of Classifiers and Collective Ranking Techniques

The purpose of this work is to reduce the workload of human experts in building systematic reviews from published articles, used in evidence-based medicine. We propose to use a committee of classifiers to rank biomedical abstracts based on the predicted relevance to the topic under review. In our approach, we identify two subsets of abstracts: one that represents the top, and another that represents the bottom of the ranked list. These subsets, identified using machine learning (ML) techniques, are considered zones where abstracts are labeled with high confidence as relevant or irrelevant to the topic of the review. Early experiments with this approach using different classifiers and different representation techniques show significant workload reduction.

Alexandre Kouznetsov, Stan Matwin, Diana Inkpen, Amir H. Razavi, Oana Frunza, Morvarid Sehatkar, Leanne Seaward, Peter O’Blenis
Large Neighborhood Search Using Constraint Satisfaction Techniques in Vehicle Routing Problem

The Vehicle Routing Problem (VRP) is a well-known NP-hard problem for which an optimal solution cannot be achieved in reasonable time as the problem size increases. Because of this, many researchers have proposed heuristics based on local search. In this paper, we propose a Constraint Satisfaction Problem (CSP) model for Large Neighborhood Search. This model enables us to reduce the size of the local search space. In addition, it enables easy handling of the many constraints that arise in the real world.

Hyun-Jin Lee, Sang-Jin Cha, Young-Hoon Yu, Geun-Sik Jo
Valuable Change Detection in Keyword Map Animation

This paper proposes a map animation interface that supports interpretation of the differences between two keyword relationships with varying viewpoints. Finding keywords whose relationships change drastically is crucial, because the value of keywords mainly consists in their relationships in networks. Therefore, this interface marks keywords whose relations change drastically when it shows animations between two relationships.

Takuya Nishikido, Wataru Sunayama, Yoko Nishihara
The WordNet Weaver: Multi-criteria Voting for Semi-automatic Extension of a Wordnet

The WordNet Weaver application supports the extension of a new wordnet. One of its functions is to suggest lexical units semantically close to a given unit. Suggestions arise from activation-area attachment: multi-criteria voting based on several algorithms that score semantic relatedness. We present the contributing algorithms and the method of combining them. Starting from a manually constructed core wordnet and a list of over 1000 units to add, we observed a linguist at work.

Maciej Piasecki, Bartosz Broda, Michał Marcińczuk, Stan Szpakowicz
Active Learning with Automatic Soft Labeling for Induction of Decision Trees

Decision trees have been widely used in many data mining applications due to their interpretable representation. However, learning an accurate decision tree model often requires a large amount of labeled training data, and labeling data is costly and time consuming. In this paper, we study learning decision trees at a lower labeling cost from two perspectives: data quality and data quantity. At each step of the active learning process, we learn a random forest and then use it to label a large quantity of unlabeled data. To overcome the large tree size caused by machine labeling, we generate weighted (soft) labeled data using the prediction confidence of the labeling classifier. Empirical studies show that our method can significantly improve active learning in terms of labeling cost for decision tree learning, and that the improvement does not come at the cost of larger decision trees.

Jiang Su, Sayyad Shirabad Jelber, Stan Matwin, Jin Huang
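
The core trick, machine-labeling the pool and turning prediction confidence into instance weights, can be sketched as one labeling step; this is an illustration under synthetic data, not the authors' full active-learning loop.

    # Soft labeling: forest confidence becomes a sample weight for the tree.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, random_state=0)
    lab, unl = np.arange(50), np.arange(50, 500)   # few labels, big pool

    forest = RandomForestClassifier(random_state=0).fit(X[lab], y[lab])
    proba = forest.predict_proba(X[unl])
    machine_y = forest.classes_[proba.argmax(axis=1)]  # machine labels
    weights = proba.max(axis=1)                        # confidence = soft weight

    tree = DecisionTreeClassifier(random_state=0).fit(
        np.vstack([X[lab], X[unl]]),
        np.concatenate([y[lab], machine_y]),
        sample_weight=np.concatenate([np.ones(len(lab)), weights]))
    print("tree size (nodes):", tree.tree_.node_count)
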
A Procedural Planning System for Goal Oriented Agents in Games

This paper explores a procedural planning system for the control of game agents in real time. The planning methodology we present is based on offline goal-oriented behavioral design, and is implemented as a real-time procedural planning system. Our design aims to achieve efficient planning in order to generate smooth run-time animation. Our experimental results show that the approach is capable of improving agent control for real-time goal processing in games.

Yingying She, Peter Grogono
An Empirical Study of Category Skew on Feature Selection for Text Categorization

In this paper, we present an empirical comparison of the effects of category skew on six feature selection methods. The methods were evaluated on 36 datasets generated from the 20 Newsgroups, OHSUMED, and Reuters-21578 text corpora. The datasets were generated to possess particular category skew characteristics (i.e., the number of documents assigned to each category). Our objective was to determine the best performance of the six feature selection methods, as measured by F-measure and Precision, regardless of the number of features needed to produce the best performance. We found the highest F-measure values were obtained by bi-normal separation and information gain and the highest Precision values were obtained by categorical proportional difference and chi-squared.

Mondelle Simeon, Robert Hilderman
Opinion Learning without Emotional Words

This paper shows that a detailed, although non-emotional, description of an event or an action can be a reliable source for learning opinions. Empirical results illustrate the practical utility of our approach and its competitiveness in comparison with previously used methods.

Marina Sokolova, Guy Lapalme
Belief Rough Set Classifier

In this paper, we propose a new rough set classifier induced from a partially uncertain decision system. The proposed classifier aims at simplifying the uncertain decision system and generating more significant belief decision rules for the classification process. The uncertainty is represented by belief functions and exists only in the decision attribute, not in the condition attribute values.

Salsabil Trabelsi, Zied Elouedi, Pawan Lingras

Graduate Student Symposium

Automatic Extraction of Lexical Relations from Analytical Definitions Using a Constraint Grammar

In this paper, we present the preliminary results of a rule-based method for identifying and tagging elements in a hyponymy-hypernymy lexical relation for Spanish. As a starting point, we take into account the role that verbal patterns have in analytical definitions. Such patterns connect each term with its possible genus. We built a constraint grammar based on the results of a syntactic parser. The results demonstrate that constraint grammars provide an efficient mechanism for doing more precise analysis of syntactic structures following a regular pattern.

Olga Acosta
Grid-Enabled Adaptive Metamodeling and Active Learning for Computer Based Design

Many complex, real-world phenomena are difficult to study directly using controlled experiments. Instead, the use of computer simulations has become commonplace as a feasible alternative. However, due to the computational cost of these high-fidelity simulations, the use of neural networks, kernel methods, and other surrogate modeling techniques has become indispensable. Surrogate models are compact and cheap to evaluate, and have proven very useful for tasks such as optimization, design space exploration, prototyping, and sensitivity analysis. Consequently, in many scientific fields there is great interest in techniques that facilitate the construction of such regression models, while minimizing the computational cost and maximizing model accuracy. This paper presents a fully automated machine learning toolkit for regression modeling and active learning to tackle these issues. A strong focus is placed on adaptivity, self-tuning and robustness in order to maximize efficiency and make the algorithms and tools easily accessible to other scientists in computational science and engineering.

Dirk Gorissen
Reasoning about Movement in Two-Dimensions

Reasoning about movement in two dimensions can be broken down into depth and spatial relations between objects. This paper covers previous work based on formal logic descriptions in the situation calculus. Also covered is a simulator that produces data for a logical reasoner to process and potentially make decisions about motion in two dimensions.

Joshua Gross
Executable Specifications of Fully General Attribute Grammars with Ambiguity and Left-Recursion

A top-down parsing algorithm has been constructed to accommodate any form of ambiguous context-free grammar, augmented with semantic rules with arbitrary attribute dependencies. A memoization technique is used with this non-strict method for efficiently processing ambiguous input. This one-pass approach allows Natural Language (NL) processors to be constructed as executable, modular and declarative specifications of Attribute Grammars.

Rahmatullah Hafiz
${\cal K}$-${\cal MORPH}$: A Semantic Web Based Knowledge Representation and Context-Driven Morphing Framework

A knowledge-intensive problem is often not solved by an individual knowledge artifact; rather, the solution needs to draw upon multiple, and even heterogeneous, knowledge artifacts. The synthesis of multiple knowledge artifacts to derive a 'comprehensive' knowledge artifact is a non-trivial problem. We discuss the need for knowledge morphing, and propose a Semantic Web based framework, ${\cal K}$-${\cal MORPH}$, for deriving a context-driven integration of multiple knowledge artifacts.

Sajjad Hussain
Background Knowledge Enriched Data Mining for Interactome Analysis

In recent years, the amount of new information generated by biological experiments has kept growing. High-throughput techniques have been developed and are now widely used to screen biological systems at the genome-wide level. Extracting structured knowledge from this wealth of experimental information is a major challenge for bioinformatics. In this work we propose a novel approach to analyzing protein interactome data. The main goal of our research is to provide a biologically meaningful explanation for the phenomena captured by high-throughput screens. We propose to reformulate several interactome analysis problems as classification problems. Consequently, we develop a transparent classification model which, while perhaps sacrificing some accuracy, minimizes the amount of routine, trivial and inconsequential reasoning that must be done by a human expert. The key to designing a transparent classification model that can be easily understood by a human expert is the use of the Inductive Logic Programming approach, coupled with significant involvement of background knowledge in the classification process.

Mikhail Jiline
Modeling and Inference with Relational Dynamic Bayesian Networks

The explicit recognition of the relationships between interacting objects can improve the understanding of their dynamic domain. In this work, we investigate the use of Relational Dynamic Bayesian Networks to represent the dependencies between agents' behaviors in the context of multi-agent tracking. We propose a new formulation of the transition model that accommodates relations, and we extend the Particle Filter algorithm in order to directly track relations between the agents.

Many applications can benefit from this work, including terrorist activity recognition, traffic monitoring, strategic analysis and sports.

Cristina Manfredotti
A Semi-supervised Approach to Bengali-English Phrase-Based Statistical Machine Translation

Large amounts of bilingual data and monolingual data in the target language are usually used to train statistical machine translation systems. In this paper we propose several semi-supervised techniques within a Bengali-English Phrase-based Statistical Machine Translation (SMT) system in order to improve translation quality. We conduct experiments on a Bengali-English dataset, and our initial experimental results show an improvement in translation quality.

Maxim Roy
Backmatter
Metadata
Title
Advances in Artificial Intelligence
Edited by
Yong Gao
Nathalie Japkowicz
Copyright Year
2009
Publisher
Springer Berlin Heidelberg
Electronic ISBN
978-3-642-01818-3
Print ISBN
978-3-642-01817-6
DOI
https://doi.org/10.1007/978-3-642-01818-3