
2012 | Book

Formal Concept Analysis

10th International Conference, ICFCA 2012, Leuven, Belgium, May 7-10, 2012. Proceedings

Editors: Florent Domenach, Dmitry I. Ignatov, Jonas Poelmans

Publisher: Springer Berlin Heidelberg

Book Series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 10th International Conference on Formal Concept Analysis, ICFCA 2012, held in Leuven, Belgium, in May 2012. The 20 revised full papers presented together with 6 invited talks were carefully reviewed and selected from 68 submissions. The topics covered in this volume range from recent advances in machine learning and data mining, mining terrorist networks and revealing criminals, and concept-based process mining to scalability issues in FCA and rough sets.

Table of Contents

Frontmatter

Invited Talks

Dark Web: Exploring and Mining the Dark Side of the Web
Abstract
This talk will review the emerging research in Terrorism Informatics based on a web mining perspective. Recent progress in the internationally renowned Dark Web project will be reviewed, including: deep/dark web spidering (web sites, forums, YouTube, virtual worlds), web metrics analysis, dark network analysis, web-based authorship analysis, and sentiment and affect analysis for terrorism tracking. In collaboration with selected international terrorism research centers and intelligence agencies, the Dark Web project has generated one of the largest databases in the world about extremist/terrorist-generated Internet contents (web sites, forums, blogs, and multimedia documents). Dark Web research has received significant international press coverage, including: Associated Press, USA Today, The Economist, NSF Press, Washington Post, Fox News, BBC, PBS, Business Week, Discover magazine, WIRED magazine, Government Computing Week, Second German TV (ZDF), Toronto Star, and Arizona Daily Star, among others. For more Dark Web project information, please see http://ai.eller.arizona.edu/research/terror/.
Hsinchun Chen
Declarative Modeling for Machine Learning and Data Mining
Abstract
Despite the popularity of machine learning and data mining today, it remains challenging to develop applications and software that incorporate machine learning or data mining techniques. This is because machine learning and data mining have focused on developing high-performance algorithms for solving particular tasks rather than on developing general principles and techniques.
I propose to alleviate these problems by applying the constraint programming methodology to machine learning and data mining and to specify machine learning and data mining problems as constraint satisfaction and optimization problems. What is essential is that the user be provided with a way to declaratively specify what the machine learning or data mining problem is rather than having to outline how that solution needs to be computed. This corresponds to a model + solver-based approach to machine learning and data mining, in which the user specifies the problem in a high level modeling language and the system automatically transforms such models into a format that can be used by a solver to efficiently generate a solution. This should be much easier for the user than having to implement or adapt an algorithm that computes a particular solution to a specific problem.
I shall illustrate this using our results on constraint programming for itemset mining [1] and probabilistic programming. Some further ideas along these lines are contained in [2].
Luc De Raedt
Can Concepts Reveal Criminals?
Abstract
In 2005 the Amsterdam-Amstelland police introduced Intelligence-led Policing (ILP) as a management paradigm. The goal of ILP is to optimally use the information which becomes available after police patrols, motor vehicle inspections, video camera recordings, etc. to prevent crimes where possible and to optimally allocate available resources. This policy has resulted in an increasing number of textual reports, video materials, etc. every year. Until now, this vast amount of information was not easily accessible because good analysis methods were missing; as a consequence, it was hardly used by the criminal intelligence departments. In the first part of this talk I will give a short overview of traditional statistical methods, such as hot spot analysis, which have been used to make this information accessible and steer police actions. In the second part I will present, using some real-life cases, how FCA was used to identify criminals involved in human trafficking, terrorism, robberies, etc. In the third part I would like to evoke a lively discussion on the potential of FCA-related algorithms and methods for analyzing textual reports, internet data such as Twitter feeds, browsing behavior of visitors of radical Islamic websites, etc.
Paul Elzinga
Cartification: From Similarities to Itemset Frequencies
Abstract
Suppose we are given a multi-dimensional dataset. For every point in the dataset, we create a transaction, or cart, in which we store the k-nearest neighbors of that point for one of the given dimensions. The resulting collection of carts can then be used to mine frequent itemsets; that is, sets of points that are frequently seen together in some dimensions. Experimentation shows that finding clusters, outliers, cluster centers, or even subspace clustering becomes easy on the cartified dataset using state-of-the-art techniques in mining interesting itemsets.
Bart Goethals
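The cartification transform described in the abstract can be sketched directly: each point yields one transaction ("cart") per dimension, holding the ids of its k nearest neighbours along that dimension; frequent itemsets over the carts are then sets of points that are repeatedly near each other. This is an illustrative reading of the abstract, not the authors' implementation; all function names and the naive miner are hypothetical.

```python
from itertools import combinations

def cartify(points, dim, k):
    """For each point, build a 'cart': the ids of its k nearest
    neighbours (including itself) along the given dimension."""
    carts = []
    for p in points:
        # rank all point ids by one-dimensional distance to p
        ranked = sorted(range(len(points)),
                        key=lambda j: abs(points[j][dim] - p[dim]))
        carts.append(frozenset(ranked[:k]))
    return carts

def frequent_itemsets(carts, min_support, max_size=2):
    """Naive enumeration of itemsets (sets of point ids) contained
    in at least min_support carts."""
    items = sorted(set().union(*carts))
    frequent = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            support = sum(1 for c in carts if set(combo) <= c)
            if support >= min_support:
                frequent[combo] = support
    return frequent
```

In practice one would feed the carts to a state-of-the-art itemset miner rather than this exhaustive enumeration; the sketch only shows the data transformation.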
Processes Are Concepts, Aren’t They?
Abstract
Data Discovery is an information/technical approach to the important managerial problem of decision making under not only uncertainty but, in fact, “unknown unknowns”. Formal Concept Analysis (FCA) is an elegant, mathematically grounded theory that complements Data Discovery particularly well, especially for data with no predefined structure. It allows for discovering emergent, a priori unknown concepts in these data. Process Discovery deals with emergent behaviour in business workflows. In its current state of the art, Process Discovery combines machine-learning techniques utilizing Hidden Markov Model (HMM) representations of processes. In one of our research lines, we investigate how FCA can improve and complement Process Discovery. Although the inclusion of temporal data and events in FCA allows for the creation of “early warning” and trend models, HMMs are needed for a deep understanding of the processes and their intrinsic complexities. However, FCA can assist significantly in the post-processing and understanding of HMMs, in particular in the presence of process variations and exceptions. FCA also allows for the detection of recurrent, coherent process steps, which can be considered “service steps” in business processes. Ultimately, an appropriate mathematical representation of HMMs should allow for the application of algebraic extensions of FCA for discovering processes and their variations as mathematical concepts as well. Some initial work on process patterns gives inspiring research directions. Real-life case materials from healthcare administration, customer contact centers, and financial services illustrate this keynote lecture.
Ir. Edward Peters
Rough Sets and FCA – Scalability Challenges
Abstract
Rough Sets (RS) [1,2,3] and Formal Concept Analysis (FCA) [4,5] provide foundations for a number of methods useful in data mining and knowledge discovery at different stages of data preprocessing, classification and representation. RS and FCA are often applied together with other techniques in order to cope with real-world challenges. It is therefore important to investigate various ways of extending RS/FCA notions and algorithms in order to facilitate dealing with truly large and complex data. This talk attempts to categorize some ideas of how to scale RS and FCA methods with respect to a number of objects and attributes, as well as types and cardinalities of attribute values. We discuss a usage of analytical database engines [6] and randomized heuristics [7] to compute approximate, yet meaningful results. We also discuss differences and similarities in algorithmic bottlenecks related to RS and FCA, illustrating that these approaches should be regarded as complementary rather than competing methodologies. As a case study, we consider the tasks of data analysis and knowledge representation arising within a research project aiming at enhancing semantic search of diverse types of content in a large repository of scientific articles [8].
Dominik Ślęzak

Regular Papers

Approximating Concept Stability
Abstract
Concept stability has been used in numerous applications for selecting concepts as biclusters of similar objects. However, scalability remains a challenge for computing stability. The best algorithms known so far have algorithmic complexity quadratic in the size of the lattice. In this paper the problem of approximate stability computation is analyzed. An approximate algorithm for computing stability is proposed, and its computational complexity, together with results of computer experiments comparing the stability index with its approximations, is discussed.
Mikhail A. Babin, Sergei O. Kuznetsov
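For readers unfamiliar with the stability index: the stability of a concept with extent A and intent B is the fraction of subsets of A whose derivation equals B, which is exponential to compute exactly but easy to estimate by sampling random subsets. The sketch below illustrates both, under the standard definition; it is not the approximation algorithm of the paper, and all names are hypothetical.

```python
import random
from itertools import chain, combinations

def intent(objects, context):
    """Derivation operator: attributes shared by all given objects.
    context maps each object to its set of attributes."""
    if not objects:
        return set().union(*context.values())  # empty extent -> all attributes
    return set.intersection(*(set(context[o]) for o in objects))

def exact_stability(extent, intent_set, context):
    """Fraction of subsets of the extent whose derivation equals the intent."""
    hits = total = 0
    for sub in chain.from_iterable(
            combinations(extent, r) for r in range(len(extent) + 1)):
        total += 1
        if intent(set(sub), context) == intent_set:
            hits += 1
    return hits / total

def approx_stability(extent, intent_set, context, samples=1000, seed=0):
    """Monte Carlo estimate: sample uniform random subsets of the extent."""
    rng = random.Random(seed)
    ext = list(extent)
    hits = sum(1 for _ in range(samples)
               if intent({o for o in ext if rng.random() < 0.5},
                         context) == intent_set)
    return hits / samples
```

The Monte Carlo estimator converges at the usual O(1/sqrt(samples)) rate, which is what makes approximation attractive when extents are large.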
Logical Analysis of Concept Lattices by Factorization
Abstract
Reducing the size of concept lattices is a well-known problem in Formal Concept Analysis. A particular instance of this problem is the size reduction of concept lattices using factorization by complete tolerances. We show that all complete tolerances on a complete lattice (i.e., all possible ways of factorizing the lattice) with a naturally-defined operation of multiplication form a residuated lattice. This allows looking at the set of all complete tolerances as a scale of truth degrees using which we can evaluate formulas of predicate logic specifying the desired parameters of the factorization. We present an illustrative example to clarify our approach.
Eduard Bartl, Michal Krupka
Basic Level of Concepts in Formal Concept Analysis
Abstract
The paper presents a preliminary study on basic level of concepts in the framework of formal concept analysis (FCA). The basic level of concepts is an important phenomenon studied in the psychology of concepts. We argue that this phenomenon may be utilized in FCA for selecting important concepts. In the other direction, we argue that FCA provides a simple framework for studying the phenomenon itself. In this preliminary study, we attempt to draw the attention of researchers in FCA to the basic level of concepts. Furthermore, we propose a formalization of one of the several existing psychological views of the basic level, present experiments, and outline future research in this direction.
Radim Belohlavek, Martin Trnecka
A Peep through the Looking Glass: Articulation Points in Lattices
Abstract
We define an ‘articulation point’ of a lattice as an element which is comparable to all the other elements but is not an extremum.
We investigate a property which holds for both the lattice of a binary relation and for the lattice of the complement relation (which we call the mirror relation): one has an articulation point if and only if the other has one also.
We give efficient algorithms to generate all the articulation points. We discuss artificially creating such an articulation point by adding or removing crosses of the relation, and also creating a chain lattice.
We establish the strong relationships with bipartite and co-bipartite graphs; in particular, we derive efficient algorithms to compute a minimal triangulation and a maximal sub-triangulation of a co-bipartite graph, as well as to find the clique minimal separators and the corresponding decomposition.
Anne Berry, Alain Sigayret
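The definition of an articulation point admits a direct brute-force reading: keep the elements comparable to every other element, then discard the two extrema. The sketch below implements only that definition, not the paper's efficient algorithms; the function name and the encoding of the order as a predicate are hypothetical.

```python
def articulation_points(elements, leq):
    """Elements comparable to every other element, excluding the
    extrema (top and bottom) of the lattice. leq(x, y) tests x <= y."""
    comparable = [x for x in elements
                  if all(leq(x, y) or leq(y, x) for y in elements)]
    bottom = next(x for x in comparable if all(leq(x, y) for y in elements))
    top = next(x for x in comparable if all(leq(y, x) for y in elements))
    return [x for x in comparable if x not in (bottom, top)]
```

On a lattice where a and b are incomparable but both lie below a common element m (with m below the top), only m qualifies. This O(n^2) check is for illustration; the paper's point is that articulation points can be generated efficiently from the relation itself.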
Practical Use of Formal Concept Analysis in Service-Oriented Computing
Abstract
Pervasive applications are encountered in a number of settings, including smart houses, intelligent buildings or connected plants. Service-Oriented Computing is today the technology of choice for implementing and exposing resources in such environments. The selection of appropriate services at the right moment in order to compose meaningful applications is however a real issue. In this paper, we propose a FCA-based solution to this problem. We have integrated FCA algorithms in our pervasive gateways and adapted them in order to allow efficient runtime selection of heterogeneous and dynamic services. This work has been applied to realistic use cases in the scope of a European project.
Stéphanie Chollet, Vincent Lestideau, Yoann Maurel, Etienne Gandrille, Philippe Lalanda, Olivier Raynaud
Publication Analysis of the Formal Concept Analysis Community
Abstract
We present an analysis of the publication and citation networks of all previous editions of the three conferences most relevant to the FCA community: ICFCA, ICCS and CLA. Using data mining methods from FCA and graph analysis, we investigate patterns and communities among authors, we identify and visualize influential publications and authors, and we give a statistical summary of the conferences’ history.
Stephan Doerfel, Robert Jäschke, Gerd Stumme
Understanding the Semantic Structure of Human fMRI Brain Recordings with Formal Concept Analysis
Abstract
We investigate whether semantic information related to object categories can be obtained from human fMRI BOLD responses with Formal Concept Analysis (FCA). While the BOLD response provides only an indirect measure of neural activity on a relatively coarse spatio-temporal scale, it has the advantage that it can be recorded from humans, who can be questioned about their perceptions during the experiment, thereby obviating the need to interpret animal behavioral responses. Furthermore, the BOLD signal can be recorded from the whole brain simultaneously. In our experiment, a single human subject was scanned while viewing 72 gray-scale pictures of animate and inanimate objects in a target detection task. These pictures comprise the formal objects for FCA. We computed formal attributes by learning a hierarchical Bayesian classifier, which maps BOLD responses onto binary features, and these features onto object labels. The connectivity matrix between the binary features and the object labels can then serve as the formal context. In line with previous reports, FCA revealed a clear dissociation between animate and inanimate objects, with the inanimate category also including plants. Furthermore, we found that the inanimate category was subdivided between plants and non-plants when we increased the number of attributes extracted from the BOLD response. FCA also allows for the display of organizational differences between high-level and low-level visual processing areas. We show that subjective familiarity and similarity ratings are strongly correlated with the attribute structure computed from the BOLD signal.
Dominik Endres, Ruth Adam, Martin A. Giese, Uta Noppeney
Cubes of Concepts: Multi-dimensional Exploration of Multi-valued Contexts
Abstract
A number of information systems offer a limited exploration in that users can only navigate from one object to another object, e.g. navigating from folder to folder in file systems, or from page to page on the Web. An advantage of conceptual information systems is to provide navigation from concept to concept, and therefore from set of objects to set of objects. The main contribution of this paper is to push the exploration capability one step further, by providing navigation from set of concepts to set of concepts. Those sets of concepts are structured along a number of dimensions, thus forming a cube of concepts. We describe a number of representations of concepts, such as sets of objects, multisets of values, and aggregated values. We apply our approach to multi-valued contexts, which stand at an intermediate position between many-valued contexts and logical contexts. We explain how users can navigate from one cube of concepts to another. We show that this navigation includes and extends both conceptual navigation and OLAP operations on cubes.
Sébastien Ferré, Pierre Allard, Olivier Ridoux
Ordinal Factor Analysis
Abstract
We build on investigations by Keprt, Snásel, Belohlavek, and Vychodil on Boolean Factor Analysis. Rather than minimising the number of Boolean factors we aim at many-valued factorisations with a small number of ordinal factors.
Bernhard Ganter, Cynthia Vera Glodeanu
A Macroscopic Approach to FCA and Its Various Fuzzifications
Abstract
We promote biresiduation as a fundamental unifying principle in Formal Concept Analysis, including fuzzification and factor analysis. In particular, we show that maximal formal rectangles are exactly formal concepts within the presented framework of biresiduated maps on ordered sets. Macroscopic implications yield the particular derivation operators in specific settings such as Fuzzy Formal Concept Analysis, Factor Analysis, and degree of containment (i.e. degree of being a subset).
Tim B. Kaiser, Stefan E. Schmidt
A Connection between Clone Theory and FCA Provided by Duality Theory
Abstract
The aim of this paper is to show how Formal Concept Analysis can be used for the benefit of clone theory. More precisely, we show how a recently developed duality theory for clones can be used to dualize clones over bounded lattices into the framework of Formal Concept Analysis, where they can be investigated with techniques very different from those that universal algebraists are usually armed with. We also illustrate this approach with some small examples.
Sebastian Kerkhoff
Formal Concept Discovery in Semantic Web Data
Abstract
Semantic Web efforts aim to bring the WWW to a state in which all its content can be interpreted by machines, the ultimate goal being a machine-processable Web of Knowledge. We strongly believe that adding a mechanism to extract and compute concepts from the Semantic Web will help to achieve this vision. However, there are a number of open questions that need to be answered first. In this paper we establish partial answers to the following questions: 1) Is it feasible to obtain data from the Web (instantaneously) and compute formal concepts without considerable overhead? 2) Do data sets found on the Web have distinct properties and, if so, how do these properties affect the performance of concept discovery algorithms? 3) Do state-of-the-art concept discovery algorithms scale with respect to the number of data objects found on the Web?
Markus Kirchberg, Erwin Leonardi, Yu Shyang Tan, Sebastian Link, Ryan K. L. Ko, Bu Sung Lee
Concept Lattices of Incomplete Data
Abstract
We present a method of constructing a concept lattice of a formal context with incomplete data. The lattice reduces to a classical concept lattice when the missing values are completed. The lattice also can reflect any known dependencies between the missing values. We show some experiments indicating that in most cases, when the number of missing values is not large, the size of the incomplete concept lattice is not substantially greater than the size of the concept lattice of completed data.
Michal Krupka, Jan Lastovicka
Formal Concept Analysis as a Framework for Business Intelligence Technologies
Abstract
Numerical datasets in data mining are handled using various methods. In this paper, data mining of numerical data using FCA in combination with some interesting ideas from OLAP technology is proposed. This novel method is an enhancement of FCA, in which measures are assigned to objects and/or attributes and then various numeric operations are applied to these measures (e.g. summarization, aggregation functions etc.). This new approach results in a structure, which is a concept lattice and where the extent and/or intent have aggregated values assigned to them. This structure could be seen as a generalization of OLAP technology. A concept lattice can be constrained by using various closure operators. The new closure operators presented here are based on values with very clear meaning for the user. Finally, a fuzzy OLAP formalization based on FCA in a fuzzy setting and using measures is proposed. Examples are shown for each introduced topic.
Juraj Macko
Good Classification Tests as Formal Concepts
Abstract
The interconnection between the Diagnostic (Classification) Test Approach to Data Analysis and Formal Concept Analysis (FCA) is considered. The definition of a good classification test is given via Galois correspondences. Next, we discuss the relations between good tests and formal concepts. A good classification test is understood as a good approximation of a given classification on a given set of examples. Classification tests serve as a basis for inferring implicative and functional dependencies and association rules from datasets. This approach makes it possible to directly control the data analysis process by specifying object classifications.
Xenia A. Naidenova
Modeling Preferences over Attribute Sets in Formal Concept Analysis
Abstract
In this paper, we consider two types of preferences from preference logic and propose their interpretation in terms of formal concept analysis. We are concerned only with preferences between sets of attributes, or, viewed logically, between conjunctions of atomic formulas. We provide inference systems for the two types of preferences and study their relation to implications.
Sergei Obiedkov
Finding Top-N Colossal Patterns Based on Clique Search with Dynamic Update of Graph
Abstract
In this paper, we discuss a method for finding top-N colossal frequent patterns. A colossal pattern we try to extract is a maximal pattern whose length is among the N largest. Since colossal patterns can be found in relatively lower areas of an itemset (concept) lattice, an efficient method with some effective pruning mechanisms is desired.
We design a depth-first branch-and-bound algorithm for finding colossal patterns with top-N length, where a notion of pattern graph plays an important role. A pattern graph is a compact representation of the class of frequent patterns with a designated length. A colossal pattern can be found as a clique in a pattern graph satisfying a certain condition. From this observation, we design an algorithm for finding our target patterns by examining cliques in a graph defined from the pattern graph. The algorithm is based on a depth-first branch-and-bound method for finding a maximum clique. It should be noted that as our search progresses, the graph we are concerned with is dynamically updated into a sparser one which makes our task of finding cliques much easier and the branch-and-bound pruning more powerful. To the best of our knowledge, it is the first algorithm tailored for the problem which can exactly identify top-N colossal patterns. In our experimentation, we compare our algorithm with famous maximal frequent itemset miners from the viewpoint of computational efficiency for a synthetic and a benchmark dataset.
Yoshiaki Okubo, Makoto Haraguchi
Quantitative Concept Analysis
Abstract
Formal Concept Analysis (FCA) begins from a context, given as a binary relation between some objects and some attributes, and derives a lattice of concepts, where each concept is given as a set of objects and a set of attributes, such that the first set consists of all objects that satisfy all attributes in the second, and vice versa. Many applications, though, provide contexts with quantitative information, telling not just whether an object satisfies an attribute, but also quantifying this satisfaction. Contexts in this form arise as rating matrices in recommender systems, as occurrence matrices in text analysis, as pixel intensity matrices in digital image processing, etc. Such applications have attracted a lot of attention, and several numeric extensions of FCA have been proposed. We propose the framework of proximity sets (proxets), which subsume partially ordered sets (posets) as well as metric spaces. One feature of this approach is that it extracts from quantified contexts quantified concepts, and thus allows full use of the available information. Another feature is that the categorical approach allows analyzing any universal properties that the classical FCA and the new versions may have, and thus provides structural guidance for aligning and combining the approaches.
Dusko Pavlovic
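The classical FCA setting that this abstract generalizes can be stated in a few lines of code: from a binary context (which objects have which attributes), every intent is an intersection of object intents, and each intent determines its extent. This is a textbook sketch of the classical case only, not of the proxet framework; the names are hypothetical.

```python
def concepts(context, attributes):
    """All formal concepts of a binary context, given as
    {object: set_of_attributes}. Returns (extent, intent) pairs."""
    # intents form a closure system: all intersections of object
    # intents, plus the full attribute set (intent of the empty extent)
    intents = {frozenset(attributes)}
    for attrs in context.values():
        intents |= {frozenset(attrs) & b for b in intents}
    result = []
    for b in intents:
        # extent: exactly the objects possessing every attribute in b
        extent = frozenset(o for o, a in context.items() if b <= a)
        result.append((extent, b))
    return sorted(result, key=lambda c: (len(c[1]), sorted(c[1])))
```

For the context {g1: {a, b}, g2: {b, c}} this yields four concepts, with ({g1, g2}, {b}) as the largest extent; the quantitative extensions discussed in the paper replace the binary incidence by graded values.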
Some Notes on Managing Closure Operators
Abstract
It is widely known that closure operators on finite sets can be represented by sets of implications (also known as inclusion dependencies) as well as by formal contexts. In this paper we survey known results and present new findings concerning time and space requirements of diverse tasks for managing closure operators, given in contextual, implicational, or black-box representation. These tasks include closure computation, size minimization, finer-coarser-comparison, modification by “adding” closed sets or implications, and conversion from one representation into another.
Sebastian Rudolph
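One of the representations surveyed above, a closure operator given by a set of implications, supports closure computation by simple forward chaining: repeatedly fire every implication whose premise is contained in the current set. The sketch below shows this naive version (LinClosure-style algorithms do it in linear time); it is illustrative only, and the names are hypothetical.

```python
def closure(attrs, implications):
    """Close a set under implications, given as (premise, conclusion)
    pairs of sets, by naive forward chaining to a fixpoint."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            # fire the implication if its premise holds but its
            # conclusion is not yet fully present
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed
```

With implications a -> b and bc -> d, closing {a, c} first adds b and then d, showing why the loop must re-scan until nothing changes.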
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework
Abstract
While many existing formal concept analysis algorithms are efficient, they are typically unsuitable for distributed implementation. Taking the MapReduce (MR) framework as our inspiration, we introduce a distributed approach for performing formal concept mining. Our method is novel in that we use a light-weight MapReduce runtime called Twister which is better suited to iterative algorithms than recent distributed approaches. First, we describe the theoretical foundations underpinning our distributed formal concept analysis approach. Second, we provide a representative example of how a classic centralized algorithm can be implemented in a distributed fashion using our methodology: we modify Ganter’s classic algorithm by introducing a family of MR* algorithms, namely MRGanter and MRGanter+, where the prefix denotes the algorithm’s lineage. To evaluate the factors that impact distributed algorithm performance, we compare our MR* algorithms with the state of the art. Experiments conducted on real datasets demonstrate that MRGanter+ is efficient, scalable, and an appealing algorithm for distributed problems.
Biao Xu, Ruairí de Fréin, Eric Robson, Mícheál Ó Foghlú
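The centralized algorithm that MRGanter distributes is Ganter's classic NextClosure, which enumerates all closed sets of a closure operator in lectic order. Below is a textbook sketch of the sequential algorithm only (the distributed MR* variants are the paper's contribution and are not reproduced here); the demo context and all names are hypothetical.

```python
def next_closure(A, attributes, close):
    """Ganter's NextClosure: given closed set A, return the lectically
    next closed set under the ordering of `attributes`, or None."""
    for i in range(len(attributes) - 1, -1, -1):
        m = attributes[i]
        if m in A:
            A = A - {m}          # strip attributes from the tail down
        else:
            B = close(A | {m})
            # lectic test: B may only add attributes at position >= i
            if all(b in A or attributes.index(b) >= i for b in B):
                return B
    return None

def all_closed_sets(attributes, close):
    """Enumerate every closed set, starting from the closure of {}."""
    A = close(set())
    result = [A]
    while True:
        A = next_closure(A, attributes, close)
        if A is None:
            return result
        result.append(A)

# demo closure operator from a tiny formal context (hypothetical data):
# close(X) = attributes common to all objects having every attribute in X
context = {'g1': {'a', 'b'}, 'g2': {'b', 'c'}}
attrs = ['a', 'b', 'c']

def close(X):
    rows = [g for g in context.values() if set(X) <= g]
    return set.intersection(*rows) if rows else set(attrs)
```

Each call to next_closure touches only the current closed set, which is exactly the iteration structure the paper maps onto Twister's iterative MapReduce rounds.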
Backmatter
Metadata

Title: Formal Concept Analysis
Editors: Florent Domenach, Dmitry I. Ignatov, Jonas Poelmans
Copyright Year: 2012
Publisher: Springer Berlin Heidelberg
Electronic ISBN: 978-3-642-29892-9
Print ISBN: 978-3-642-29891-2
DOI: https://doi.org/10.1007/978-3-642-29892-9
