Fast algorithm for computing fixpoints of Galois connections induced by object-attribute relational data

doi:10.1016/j.ins.2011.09.023

Information Sciences

Volume 185, Issue 1, 15 February 2012, Pages 114-127

https://doi.org/10.1016/j.ins.2011.09.023

Abstract

Fixpoints of Galois connections induced by object-attribute data tables represent important patterns that can be found in relational data. Such patterns are used in several data mining disciplines including formal concept analysis, frequent itemset and association rule mining, and Boolean factor analysis. In this paper we propose an efficient algorithm for listing all fixpoints of Galois connections induced by object-attribute data. The algorithm, called FCbO, is a modification of Kuznetsov’s CbO that uses a more efficient canonicity test. We describe the algorithm, prove its correctness, discuss efficiency issues, and present an experimental evaluation of its performance and a comparison with other algorithms.

Section snippets

Introduction and Preliminaries

This paper describes a new algorithm for computing fixpoints of Galois connections. In particular, we focus on Galois connections [5], [12], [26], [33] that appear in formal concept analysis (FCA) – a method of qualitative analysis of object-attribute relational data [10], [33]. In a broader sense, the algorithm belongs to an important family of algorithms for listing combinatorial structures [11] and algorithms for biclustering [3], [29]. The algorithm we propose is a refinement of Kuznetsov’s

Canonicity test and CbO

In this section we recall CbO [19], [21] and the canonicity test. The next section will describe the new algorithm. In the sequel, we assume that X = {0, 1, … , m} and Y = {0, 1, … , n} are finite nonempty sets of objects and attributes, respectively, and I ⊆ X × Y. Since I is fixed, the concept-forming operators $^{\uparrow_I}$ and $^{\downarrow_I}$ defined by (5), (6) will be denoted just by $^{\uparrow}$ and $^{\downarrow}$, respectively. The set of all formal concepts in I will be denoted by $B(X, Y, I)$.
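Definitions (5) and (6) are not reproduced in this snippet. The Python sketch below assumes the standard FCA definitions: A^↑ is the set of attributes shared by all objects in A, and B^↓ is the set of objects having all attributes in B. The function names `up`, `down` and `is_formal_concept` and the small example relation are illustrative assumptions, not part of the paper.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# I ⊆ X × Y, stored as a set of (object, attribute) pairs (an assumed encoding).
Relation = Set[Tuple[int, int]]


def up(A: Iterable[int], Y: Iterable[int], I: Relation) -> FrozenSet[int]:
    """A^↑: the attributes shared by every object in A."""
    A = list(A)
    return frozenset(y for y in Y if all((x, y) in I for x in A))


def down(B: Iterable[int], X: Iterable[int], I: Relation) -> FrozenSet[int]:
    """B^↓: the objects that have every attribute in B."""
    B = list(B)
    return frozenset(x for x in X if all((x, y) in I for y in B))


def is_formal_concept(A, B, X, Y, I: Relation) -> bool:
    """(A, B) is a fixpoint of the Galois connection iff A^↑ = B and B^↓ = A."""
    return up(A, Y, I) == frozenset(B) and down(B, X, I) == frozenset(A)


# Hypothetical 3 × 3 example with X = Y = {0, 1, 2} (so m = n = 2).
X = range(3)
Y = range(3)
I = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)}

print(up({0}, Y, I))                          # attributes of object 0
print(is_formal_concept({0}, {0, 1}, X, Y, I))
```

In this example the pair ({0}, {0, 1}) is a formal concept, whereas ({0, 2}, {1}) is not, because attribute 1 is shared by all three objects and so {1}^↓ is larger than {0, 2}.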

CbO has been introduced in [19] (a paper in Russian) and
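As a rough illustration of the algorithm being recalled, here is a minimal Python sketch of CbO in the recursive form in which it is commonly presented. It assumes the standard concept-forming operators and the encoding X = {0, …, m}, Y = {0, …, n} used above; the names `cbo`, `generate_from` and `col` are hypothetical. The canonicity test accepts a newly computed intent D at attribute j only if D agrees with the current intent B on all attributes smaller than j.

```python
def cbo(m, n, I):
    """List all formal concepts of I ⊆ X × Y, X = {0..m}, Y = {0..n} (sketch)."""
    X = frozenset(range(m + 1))
    Y = frozenset(range(n + 1))
    # col[y] = {y}^↓, the objects having attribute y.
    col = [frozenset(x for x in X if (x, y) in I) for y in range(n + 1)]

    def up(A):
        # A^↑: attributes shared by all objects in A.
        return frozenset(y for y in Y if A <= col[y])

    concepts = []

    def generate_from(A, B, y):
        concepts.append((A, B))
        if B == Y or y > n:
            return
        for j in range(y, n + 1):
            if j in B:
                continue
            C = A & col[j]          # new extent
            D = up(C)               # new intent (closure)
            # Canonicity test: B and D must agree on attributes 0..j-1.
            if all((i in B) == (i in D) for i in range(j)):
                generate_from(C, D, j + 1)

    generate_from(X, up(X), 0)      # start from the topmost concept
    return concepts


if __name__ == "__main__":
    example = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 1)}
    for extent, intent in cbo(2, 2, example):
        print(sorted(extent), sorted(intent))
```

Note that the closure D is always computed before the test is applied, so a failed test means the closure computation was wasted work.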

Improved canonicity test and FCbO

In this section, we propose an improvement of the canonicity test used by CbO that reduces the number of formal concepts computed multiple times. In a call tree like that in Fig. 1, such formal concepts are depicted by the black-square nodes. Our new test and the improved algorithm will reduce the number of such nodes in the call tree without altering the rest of the tree. The major problem with the original canonicity test used by CbO is that it is always used after a new formal concept is
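The snippet above is cut off before the improved test is stated. The Python sketch below shows one way such a test can be arranged, following the scheme commonly described for FCbO: an intent that fails the canonicity test at attribute j is remembered and passed to descendant calls, where it can rule out attribute j before any closure is computed. The names `fcbo`, `fast_generate_from`, `N`, `M` and `below`, and the list-based bookkeeping, are assumptions of this sketch rather than details taken from the text.

```python
def fcbo(m, n, I):
    """FCbO-style listing of formal concepts, X = {0..m}, Y = {0..n} (sketch)."""
    X = frozenset(range(m + 1))
    Y = frozenset(range(n + 1))
    col = [frozenset(x for x in X if (x, y) in I) for y in range(n + 1)]

    def up(A):
        return frozenset(y for y in Y if A <= col[y])

    def below(S, j):
        # S ∩ {0, ..., j-1}
        return frozenset(i for i in S if i < j)

    concepts = []

    def fast_generate_from(A, B, y, N):
        concepts.append((A, B))
        if B == Y or y > n:
            return
        M = list(N)                 # failed intents handed down to children
        queue = []
        for j in range(y, n + 1):
            # Early test: skip j if a remembered failed intent already
            # guarantees that the canonicity test would fail again.
            if j in B or not below(N[j], j) <= B:
                continue
            C = A & col[j]
            D = up(C)
            if below(B, j) == below(D, j):
                queue.append((C, D, j))   # canonical: recurse later
            else:
                M[j] = D                  # remember the failed intent
        # Recurse only after the whole level has been processed, so every
        # child sees all failures recorded on this level.
        for C, D, j in queue:
            fast_generate_from(C, D, j + 1, M)

    fast_generate_from(X, up(X), 0, [frozenset()] * (n + 1))
    return concepts
```

Under these assumptions the sketch lists the same concepts as plain CbO, each exactly once. The saving comes from the early test, which avoids recomputing some closures that are bound to fail.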

Complexity and efficiency issues

It is well known that the limiting factor in computing all formal concepts is that the corresponding counting problem is #P-complete [18], [20]. Fortunately, if |I| is sufficiently small, one can obtain the set of all formal concepts in reasonable time even if X and Y are large. Therefore, various algorithms for FCA specialized for sparse incidence data have been proposed. FCbO performs well on both sparse and dense data of reasonable size. From the point of view of the
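To make the role of the size of I concrete, the brute-force Python sketch below counts formal concepts for two hypothetical k × k tables: a dense one, I = {(x, y) | x ≠ y}, in which every attribute subset is an intent, and a sparse diagonal one, I = {(x, x)}. The helper names and the choice of examples are assumptions made for illustration. The counter enumerates all attribute subsets and is only meant for very small k.

```python
def count_concepts(k, I):
    """Count formal concepts of I ⊆ {0..k-1} × {0..k-1} by brute force."""
    intents = set()
    for mask in range(1 << k):
        B = [y for y in range(k) if (mask >> y) & 1]
        A = [x for x in range(k) if all((x, y) in I for y in B)]      # B^↓
        closed = frozenset(
            y for y in range(k) if all((x, y) in I for x in A)        # B^↓↑
        )
        intents.add(closed)
    return len(intents)


def dense_table(k):
    """Every object has every attribute except the one with its own index."""
    return {(x, y) for x in range(k) for y in range(k) if x != y}


def diagonal_table(k):
    """Every object has exactly one attribute, the one with its own index."""
    return {(x, x) for x in range(k)}


if __name__ == "__main__":
    for k in (2, 4, 6):
        print(k, count_concepts(k, dense_table(k)), count_concepts(k, diagonal_table(k)))
```

For k = 4 the dense table has 16 = 2^4 concepts while the diagonal table has only 6. The first count doubles with every added attribute, whereas the second grows by one.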

Conclusions

We have introduced an algorithm called FCbO for computing formal concepts in object-attribute data tables. The algorithm results from CbO [19], [21] by introducing a new canonicity test. We have proved correctness of the algorithm and presented an experimental evaluation of its performance compared to the original CbO, Ganter’s NextClosure and also to Andrews’s In-Close, another contemporary derivative of CbO. The experiments have shown that FCbO significantly reduces the number of computed

Acknowledgment

Supported by Grant Nos. P103/10/1056 and P202/10/P360 of the Czech Science Foundation and by Grant No. MSM 6198959214. The algorithm described in this paper has been presented during ICCS 2009 and CLA 2010 [16] conferences. However, in ICCS 2009, the algorithm took part in a performance contest only and was not published in the proceedings, and the CLA 2010 paper contains the pseudocode and a brief summary of the algorithm without further analysis or proofs. The present paper is aimed as a

References (34)

F. Angiulli et al.
Random walk biclustering for microarray data
Information Sciences
(2008)
R. Belohlavek et al.
Evaluation of IPAQ questionnaires supported by formal concept analysis
Inform. Sci.
(2011)
R. Belohlavek et al.
Discovery of optimal factors in binary data via a novel method of matrix decomposition
J. Comput. Syst. Sci.
(2010)
D.S. Johnson et al.
On generating all maximal independent sets
Inform. Process. Lett.
(1988)
H. Liu et al.
Top–down mining of frequent closed patterns from very high dimensional data
Inform. Sci.
(2009)
J. Medina et al.
Multi-adjoint t-concept lattices
Inform. Sci.
(2010)
R. Agrawal, T. Imielinski, A.N. Swami, Mining association rules between sets of items in large databases, in:...
S. Andrews, In-Close, a Fast Algorithm for Computing Formal Concepts, in: S. Rudolph, F. Dau, S.O. Kuznetsov (Eds.):...
A. Asuncion et al.
UCI Machine Learning Repository
(2007)
R. Belohlavek
Lattices of fixed points of fuzzy Galois connections
Math. Logic Quarterly
(2001)
J.H. Correia et al.
Conceptual knowledge discovery – A human-centered approach
Appl. Artif. Intell.
(2003)
B. Ganter, Two basic algorithms in concept analysis. (Technical Report FB4-Preprint No. 831). TH Darmstadt,...
B. Ganter et al.
Formal Concept Analysis
(1999)
L.A. Goldberg
Efficient Algorithms for Listing Combinatorial Structures
(1993)
G.A. Gratzer
General Lattice Theory
(1998)
S. Hettich, S.D. Bay, The UCI KDD Archive University of California, Irvine, School of Information and Computer...
P. Krajca et al.
Parallel algorithm for computing fixpoints of Galois connections
Ann. Math. Artif. Intell.
(2010)

Cited by (94)

Pruning techniques in LinCbO for the computation of the Duquenne-Guigues basis
2022, Information Sciences
In our previous work, we introduced LinCbO—an algorithm for fast computation of the Duquenne-Guigues basis based on Close-by-One and LinClosure. We also proposed additional speed-ups using pruning techniques, like those used in FCbO, In-Close, or LCM. In the present work, we describe the pruning techniques for LinCbO and experimentally evaluate their effect. Additionally, we describe and evaluate algorithms using a similar idea to LinCbO but using naïve closure or Wild’s closure instead of the LinClosure. According to our extensive experimental evaluation, LinCbO with pruning techniques performs significantly faster than already known algorithms. The differences between the performance of the four pruning techniques seem negligible.
Revisiting the GreCon algorithm for Boolean matrix factorization
2022, Knowledge-Based Systems
Over the past decade, the most fundamental Boolean matrix factorization (BMF) algorithms GreCon and GreConD were proposed. Whereas GreConD has become a popular and widely used BMF algorithm, GreCon – the algorithm on which the GreConD is built – is somewhat forgotten in contemporary BMF research; however, GreCon may produce better results than GreConD. The main disadvantage of GreCon algorithm is a slow running time. In the paper, we argue that the search strategy of GreConD, notwithstanding it provides a good result, is limited. We show that the reasons for not using GreCon algorithm are no longer the truth. We revise the algorithm and propose a new approach to storing and updating data required for factor enumeration. By various experiments, we demonstrate that the revised version is competitive with contemporary BMF algorithms in terms of running time. Moreover, in some cases, the revised GreCon outperforms GreConD—one of the fastest BMF algorithms. Furthermore, we show that our novel approach to GreCon enables the utilization of novel approaches to BMF.
A new parallel algorithm for computing formal concepts based on two parallel stages
2022, Information Sciences
The fixpoints of Galois connections induced by binary relational data are called formal concepts in formal concept analysis (FCA). Computing formal concepts is one of the most important issues in FCA, while Close-by-One (CbO) and its variants are usually considered as the most efficient serial algorithms for this task. Current approaches to parallelization of CbO-type algorithms such as PCbO (Parallel CbO) usually enter the parallel stage after a serial stage which could be a possible bottleneck. In this paper, we propose a new parallel algorithm for computing formal concepts, which is composed of two parallel phases. The new algorithm parallelizes both the computations of the top L recursion levels and the workload distribution, which decouples worker threads from the main thread. We describe the algorithm and present an experimental evaluation of its performance and comparison with PCbO on various datasets. Results indicate that our algorithm performs better, especially when a dataset is dense or has a large size.
LCM from FCA point of view: A CbO-style algorithm with speed-up features
2022, International Journal of Approximate Reasoning
LCM is an algorithm for enumeration of frequent closed itemsets in transaction databases. It is well known that when we ignore the required frequency, the closed itemsets are exactly intents of formal concepts in Formal Concept Analysis (FCA). We describe LCM in terms of FCA and show that LCM is basically the Close-by-One algorithm with multiple speed-up features for processing sparse data. We analyze the speed-up features and compare them with those of similar FCA algorithms, like FCbO and algorithms from the In-Close family.
Systematic categorization and evaluation of CbO-based algorithms in FCA
2021, Information Sciences
Algorithms based on Close-by-One (CbO) are polynomial delay algorithms used for the enumeration of closed sets, particularly formal concepts in Formal Concept Analysis. We describe and categorize their distinctive features. We experimentally evaluate the influence of the features on the computation time. We show that via the study of individual features, we can design new and more efficient algorithms.
LinCbO: Fast algorithm for computation of the Duquenne-Guigues basis
2021, Information Sciences
Citation Excerpt:
Originally, we believed that CbO itself can make the computation faster. This motivation came from the paper by Outrata & Vychodil [24], where CbO is shown to be significantly faster than NextClosure when computing intents (see Table 5). The main reason for the speed-up is the fact that CbO uses set intersection to efficiently obtain extents during the tree descent.
We propose and evaluate a novel algorithm for computation of the Duquenne-Guigues basis which combines Close-by-One and LinClosure algorithms. This combination enables us to reuse attribute counters used in LinClosure and speed up the computation. Our experimental evaluation shows that it is the most efficient algorithm for computation of the Duquenne-Guigues basis. keyword: non-redundancy; attribute implications; minimalization; closures.
