Abstract

We present a new method based on fuzzy transforms for detecting coarse-grained association rules in datasets. The fuzzy association rules are represented in the form of linguistic expressions, and we introduce a pre-processing phase to determine the optimal fuzzy partition of the domains of the quantitative attributes. In the extraction of the fuzzy association rules we use the AprioriGen algorithm and a confidence index calculated via the inverse fuzzy transform. Our method is applied to datasets of the 2001 census database of the district of Naples (Italy); the results show that the extracted fuzzy association rules provide a correct coarse-grained view of the data association rule set.

1. Introduction

Fuzzy association rule extraction [1] is a fundamental process in data mining for many tasks such as classification and information retrieval. Many techniques have been presented for extracting fuzzy association rules from datasets and databases [1–21]; some authors use soft computing approaches, such as evolutionary methods [14, 21–26] and clustering algorithms [24, 27, 28], for creating fuzzy partitions of data attribute domains. In many practical cases the user does not need a detailed fuzzy partition of the attribute domains or a fine exploration of the fuzzy association rules between attributes in datasets. Indeed, sometimes the user's purpose is to acquire a more immediate coarse-grained knowledge of hidden relations in the data by creating a coarse-grained fuzzy partition of each attribute domain and by estimating fuzzy association rules with evaluative linguistic expressions.

Here we propose a new approach for detecting coarse-grained fuzzy association rules in datasets, based on fuzzy transforms (for short, F-transforms), which are already used for image analysis [18, 29–34], data analysis [31, 35], and forecasting [34, 36]. In particular, in [31] a way of extracting fuzzy association rules in a coarse-grained manner by using F-transforms is proposed. In this paper we follow this approach; our framework is composed of a pre-processing phase (necessary in order to obtain the optimal cardinality of the fuzzy partitions of the data attribute domains) and of two successive processes for extracting the fuzzy association rules. Let us consider a dataset represented by Table 1.

We define the context of the given attribute by setting and . Thus we can consider a fuzzy partition of fuzzy sets for each context . A fuzzy association rule between two disjoint sets of attributes and can be generally expressed as

In [31] the F-transform approach is used to detect fuzzy association rules from a dataset in a coarse-grained view and to construct a framework of fuzzy association rules between two disjoint sets of attributes and , in the form: where (in accordance with [18, 29]) is the basic function of the uniform partition of the context associated with the node . Each clause in the antecedent assumes the linguistic meaning “ is approximately ,” with being the evaluative expression assigned to the fuzzy set () and a pure evaluative linguistic expression that characterizes the component of the F-transform corresponding to the fuzzy sets . The term “mean” in the consequent derives from the fact that the component is obtained as a mean of the values of the item weighted over the basic functions . The symbol represents an association between attributes obtained with the F-transforms. In other words, a fuzzy association rule expressed in the form (2) provides a synthetic evaluation of the association between attributes, with the associations expressed linguistically with the model of [18, 29] and the fuzzy sets in the antecedent being the basic functions of the uniform fuzzy partition of the corresponding context.

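Following the definitions and notations of [32], let $n \geq 2$ and $x_1, x_2, \dots, x_n$ be points of a specific context $[a, b]$, called nodes, such that $x_1 = a < x_2 < \dots < x_n = b$. The assigned family of fuzzy sets $A_1, \dots, A_n : [a, b] \to [0, 1]$, called basic functions, is a fuzzy partition of $[a, b]$ if the following holds:
(A) $A_i(x_i) = 1$ for every $i = 1, 2, \dots, n$;
(B) $A_i(x) = 0$ if $x \notin (x_{i-1}, x_{i+1})$ for $i = 2, \dots, n-1$;
(C) $A_i(x)$ is a continuous function on $[a, b]$;
(D) $A_i(x)$ strictly increases on $[x_{i-1}, x_i]$ for $i = 2, \dots, n$ and strictly decreases on $[x_i, x_{i+1}]$ for $i = 1, \dots, n-1$;
(E) $\sum_{i=1}^{n} A_i(x) = 1$ for every $x \in [a, b]$.
We say that the fuzzy sets $A_1, \dots, A_n$ form a uniform fuzzy partition if
(F) $n \geq 3$ and $x_i = a + h \cdot (i - 1)$, where $h = (b - a)/(n - 1)$ and $i = 1, 2, \dots, n$ (equidistance of the nodes);
(G) $A_i(x_i - x) = A_i(x_i + x)$ for every $x \in [0, h]$ and $i = 2, \dots, n-1$;
(H) $A_{i+1}(x) = A_i(x - h)$ for every $x \in [x_i, x_{i+1}]$ and $i = 1, 2, \dots, n-1$.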

An example of basic functions is given by triangular fuzzy sets. For example, if , , and , we obtain . Table 2 shows the nodes that characterize each basic function.

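If we define basic functions of triangular form, as an example, we have, for nodes $x_1 < x_2 < \dots < x_n$ at constant distance $h$ and for $j = 2, \dots, n-1$:

$$
\begin{aligned}
A_1(x) &= \begin{cases} 1 - \dfrac{x - x_1}{h} & \text{if } x \in [x_1, x_2], \\ 0 & \text{otherwise,} \end{cases} \\
A_j(x) &= \begin{cases} \dfrac{x - x_{j-1}}{h} & \text{if } x \in [x_{j-1}, x_j], \\ 1 - \dfrac{x - x_j}{h} & \text{if } x \in [x_j, x_{j+1}], \\ 0 & \text{otherwise,} \end{cases} \\
A_n(x) &= \begin{cases} \dfrac{x - x_{n-1}}{h} & \text{if } x \in [x_{n-1}, x_n], \\ 0 & \text{otherwise.} \end{cases}
\end{aligned}
\tag{3}
$$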

Figure 1 shows the four basic functions forming the fuzzy partition of the context [8, 37] given in (3) for .
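As a rough illustration of a uniform fuzzy partition made of triangular basic functions, the following minimal Python sketch builds the nodes and the membership functions for a generic context. The function name `triangular_partition` and the toy context [0, 3] with four nodes are assumptions made only for this example; they are not taken from the dataset discussed in the paper.

```python
def triangular_partition(a, b, n):
    """Return the nodes and the n triangular basic functions of a uniform
    fuzzy partition of [a, b] (hypothetical helper; n >= 3 is assumed)."""
    if n < 3 or not a < b:
        raise ValueError("need n >= 3 and a < b")
    h = (b - a) / (n - 1)  # constant distance between consecutive nodes
    nodes = [a + h * i for i in range(n - 1)] + [b]

    def make_basic_function(center):
        def basic_function(x):
            if x < a or x > b:
                return 0.0
            # Triangle of half-width h centred on the node, clipped at 0.
            return max(0.0, 1.0 - abs(x - center) / h)
        return basic_function

    return nodes, [make_basic_function(c) for c in nodes]


# Toy context [0, 3] with four basic functions (illustrative values only).
nodes, basic = triangular_partition(0.0, 3.0, 4)
x = 1.25
memberships = [A(x) for A in basic]
print(nodes, memberships, sum(memberships))
```

In this sketch each point of the context belongs to at most two adjacent basic functions, and its membership degrees add up to one, which is the property a fuzzy partition is required to satisfy.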

The method given in [31] can be very useful when we need to extract fuzzy association rules in an approximate way from a dataset; for each attribute a coarse-grained uniform fuzzy partition of its context is created and the evaluative linguistic expression in the consequent represents a weighted mean of the values of the attribute . Nevertheless, as pointed out in [35], this approach does not take into account the need for the data to be sufficiently dense with respect to the chosen fuzzy partition; otherwise the F-transforms cannot be used. To avoid choosing a fuzzy partition of the contexts that is either too fine or too coarse, it is necessary to define a pre-processing phase which determines the optimal fuzzy partition with respect to the density of the data. Here we propose a technique based on F-transforms to detect the framework of strong fuzzy association rules from a dataset in the form (2). In our method we determine the best uniform fuzzy partition of each context, formed by triangular fuzzy sets as in Figure 1. For each context , of the dataset in Table 1, an initial uniform fuzzy partition of triangular fuzzy sets (3) is established.

To check that the data are sufficiently dense with respect to the chosen partition, we must verify that for each combination , there exists at least one data point such that . In principle this check should be extended to combinations, where is the cardinality of the fuzzy partition of the context ; in order to reduce the computational complexity, we consider fuzzy sets for each context, thus the number of the possible combinations is .

In Section 2 we outline our method for extracting fuzzy association rules. In Section 3 we recall the definitions of F-transforms in k (≥2) variables. In Section 4 we present the extraction process of the fuzzy association rules. In Section 5 we show some results of the tests conducted on the databases of the 2001 census data related to the city of Naples (Italy), supplied by ISTAT (Istituto Nazionale di Statistica). These databases contain information on population, buildings, housing, family, and employment for each census zone of Naples. The reliability of the fuzzy association rules is also discussed.

2. Pre-Processing and Fuzzy Association Rule Extraction Phases

We use a pre-processing phase that determines the optimal value of n by starting from an initial cardinality of the fuzzy partitions. Furthermore this phase ensures that the data are sufficiently dense with respect to the chosen fuzzy partitions, so that for each combination of basic functions with respect to a minimal density of data points . For each combination , we calculate the value which is the number of data points for which . The user can use a minimal density ρε of data, in the sense that and the fulfillment of this constraint is to determine a value for the parameter which is consistent with the distribution of the data correspondent to the uniform fuzzy partitions. The use of allows us to control how the uniform fuzzy partition set of the attributes should be finer. If , we obtain the more coarse-grained fuzzy partition set in accordance to this constraint. As examples, in Figures 2 and 3 we show the case of two attributes, and . Each object of the dataset is represented as a point in the Cartesian graph of and and in both examples we consider . In Figure 2 we have a too fine partition for ; in fact by considering the combination of triangular fuzzy sets . In Figure 3 we have a more coarse-grained fuzzy partition for which should be optimal. A finer partition set does not satisfy the constraint of minimum density of data and a coarser partition set would be under-sampled with respect to the dataset dimension. In this pre-processing phase we start with a value of the parameter . If the partition set is too fine, we decrement the value of by 1 until we determine the optimal partition. If , the dataset is too coarse grained for the fuzzy rules extraction via F-transforms and the process is stopped (inconsistent dataset), otherwise we use the optimal partition in the successive step of fuzzy rules extraction. Figure 4 gives the schema of the pre-processing phase.

Following [31], in the extraction process we establish fuzzy association rules of the form (2) by calculating the support index as the percentage of objects in the dataset for which the antecedent in the fuzzy association rule (2) is not null, that is, for , and it is defined as where is the dimension of the dataset. Then we calculate the confidence of each rule to evaluate the precision of a potential fuzzy association rule. Normally the confidence index is given by the ratio of the number of objects in the dataset for which both the antecedent and the consequent in the fuzzy association rule (2) are not null to the number of objects in the dataset for which the antecedent is not null. Here we use the confidence index proposed in [31], given by where is the value of the inverse F-transform applied on the th data object and is the fuzzy transform component associated with the fuzzy sets . The multidimensional inverse F-transform has already been used in [31, 35] to model the dependency of the attribute via the predictors like , where is a function estimated via a suitable fuzzy partition of the independent attribute domains. The formula (7) provides an estimate of the grade of precision of a potential fuzzy association rule. If the above index is equal to 1, then the error in the approximation of obtained with the inverse fuzzy transform is null.
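The support index described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dataset is a list of tuples, every context carries a uniform triangular partition with the same cardinality, and the names `tri`, `support`, `rows`, `contexts`, and `antecedent` are hypothetical. The confidence index (7) is not sketched here.

```python
def tri(x, center, h):
    """Triangular basic function of half-width h centred on a node."""
    return max(0.0, 1.0 - abs(x - center) / h)


def support(rows, contexts, n, antecedent):
    """Fraction of rows for which the antecedent is not null.

    rows       -- list of tuples, one value per attribute
    contexts   -- list of (a_i, b_i) pairs, one per attribute
    n          -- common cardinality of the uniform fuzzy partitions
    antecedent -- dict {attribute index: basic-function index (0-based)}
    """
    hits = 0
    for row in rows:
        degree = 1.0
        for attr, j in antecedent.items():
            a, b = contexts[attr]
            h = (b - a) / (n - 1)
            degree *= tri(row[attr], a + j * h, h)
        if degree > 0.0:  # every clause of the antecedent is satisfied
            hits += 1
    return hits / len(rows)


# Toy data (illustrative values only): two attributes on [0, 1].
rows = [(0.1, 0.9), (0.2, 0.2), (0.8, 0.7), (0.4, 0.95)]
contexts = [(0.0, 1.0), (0.0, 1.0)]
# Antecedent: attribute 0 near the node 0.0 and attribute 1 near the node 1.0.
s = support(rows, contexts, 3, {0: 0, 1: 2})
print(s)
```

A candidate antecedent would then be retained only if the returned value reaches the chosen support threshold.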

In our framework we also use two sub-processes for extracting the fuzzy association rules. In the first sub-process we use the AprioriGen algorithm to extract the candidate fuzzy association rule with maximal dimension and support greater than or equal to a threshold . In the successive sub-process, the corresponding inverse fuzzy transform and the confidence index (7) are calculated for each fuzzy association rule candidate. The fuzzy association rule is extracted if the index (7) is greater than or equal to a threshold and in this case we determine the evaluative linguistic expression with the component of the F-transform corresponding to the fuzzy sets (this modality of assignment is described in Section 4 with many details). In Figure 5 we show the processes used for extracting fuzzy association rules in the form (2).

Summarizing, in the first step the AprioriGen algorithm is applied to determine the set of potential fuzzy association rules; in the successive step the direct F-transform, the inverse F-transform, and the index “con” are calculated for each potential fuzzy association rule. If “con” is lower than the threshold conε, the potential fuzzy association rule is discarded; otherwise it is extracted and inserted as a strong fuzzy association rule in the final set of rules.

3. Discrete F-Transforms in Several Variables

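We first deal with functions of one variable and only with the discrete case; indeed, we know that a function $f$ assumes given values in the set $P$ of points $p_1, \dots, p_m$ of the interval $[a, b]$. If $P$ is sufficiently dense with respect to the fixed partition $\{A_1, \dots, A_n\}$, that is, for each $i \in \{1, \dots, n\}$ there exists an index $j \in \{1, \dots, m\}$ such that $A_i(p_j) > 0$, we can define the $n$-tuple $[F_1, \dots, F_n]$ as the discrete F-transform of $f$ with respect to the basic functions $\{A_1, \dots, A_n\}$, where each $F_i$ is given by

$$F_i = \frac{\sum_{j=1}^{m} f(p_j)\, A_i(p_j)}{\sum_{j=1}^{m} A_i(p_j)} \tag{8}$$

for $i = 1, \dots, n$. We also define the inverse F-transform of $f$ with respect to the basic functions $\{A_1, \dots, A_n\}$ by setting

$$f^F_n(p_j) = \sum_{i=1}^{n} F_i\, A_i(p_j) \tag{9}$$

for every $j \in \{1, \dots, m\}$. We have the following theorem [32, Theorem 5].

Theorem 1. Let $f$ be a function assigned on the set $P$ of points $p_1, \dots, p_m$ of $[a, b]$. Then for every $\varepsilon > 0$, there exist an integer $n(\varepsilon)$ and a related fuzzy partition $\{A_1, \dots, A_{n(\varepsilon)}\}$ of $[a, b]$ such that $P$ is sufficiently dense with respect to $\{A_1, \dots, A_{n(\varepsilon)}\}$ and for every $p_j \in [a, b]$, $j = 1, \dots, m$, the following inequality holds:

$$\left| f(p_j) - f^F_{n(\varepsilon)}(p_j) \right| < \varepsilon. \tag{10}$$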
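The one-dimensional discrete F-transform and its inverse can be sketched as follows. The sketch assumes uniform triangular basic functions; the function names and the toy data (a quadratic function sampled at 21 points of [0, 1]) are hypothetical and serve only to show the mechanics, not to reproduce any result of the paper.

```python
def tri(x, center, h):
    """Triangular basic function of half-width h centred on a node."""
    return max(0.0, 1.0 - abs(x - center) / h)


def f_transform(points, values, a, b, n):
    """Direct discrete F-transform: the mean of the values weighted over
    each of the n basic functions of a uniform partition of [a, b]."""
    h = (b - a) / (n - 1)
    components = []
    for i in range(n):
        weights = [tri(p, a + i * h, h) for p in points]
        total = sum(weights)
        if total == 0.0:
            raise ValueError("data not sufficiently dense for this partition")
        components.append(sum(w * v for w, v in zip(weights, values)) / total)
    return components


def inverse_f_transform(x, components, a, b):
    """Inverse F-transform evaluated at the point x."""
    n = len(components)
    h = (b - a) / (n - 1)
    return sum(F * tri(x, a + i * h, h) for i, F in enumerate(components))


# Toy data (illustrative only): f(p) = p^2 sampled at 21 points of [0, 1].
points = [i / 20 for i in range(21)]
values = [p * p for p in points]


def max_error(n):
    """Largest absolute error of the inverse F-transform on the data."""
    F = f_transform(points, values, 0.0, 1.0, n)
    return max(abs(v - inverse_f_transform(p, F, 0.0, 1.0))
               for p, v in zip(points, values))


coarse, fine = max_error(5), max_error(11)
print(coarse, fine)
```

On this toy sample the finer partition gives the smaller maximal error, which is in line with the statement of Theorem 1 but is of course not a proof of it.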

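Now we consider functions in $k$ ($\geq 2$) variables. The universe of discourse is given by the Cartesian product $[a_1, b_1] \times [a_2, b_2] \times \dots \times [a_k, b_k]$, where $[a_i, b_i]$, $i = 1, \dots, k$, is the domain of the $i$th variable. Let $x_{i1}, x_{i2}, \dots, x_{i n_i} \in [a_i, b_i]$ be $n_i$ assigned points, called nodes, such that $x_{i1} = a_i < x_{i2} < \dots < x_{i n_i} = b_i$ for $i = 1, \dots, k$. Furthermore, let $\{A_{i1}, A_{i2}, \dots, A_{i n_i}\}$ be a fuzzy partition of $[a_i, b_i]$ for $i = 1, \dots, k$. We assume that the function $f$ is known on the set $P = \{p_1, \dots, p_m\}$, that is, at the $m$ points $p_j = (p_{j1}, p_{j2}, \dots, p_{jk})$, $j = 1, \dots, m$. We say that $P$ is sufficiently dense with respect to the chosen partitions $\{A_{11}, \dots, A_{1 n_1}\}, \dots, \{A_{k1}, \dots, A_{k n_k}\}$ if for each $k$-tuple $(h_1, \dots, h_k) \in \{1, \dots, n_1\} \times \dots \times \{1, \dots, n_k\}$ there exists a point $p_j \in P$ such that $A_{1 h_1}(p_{j1}) \cdot A_{2 h_2}(p_{j2}) \cdots A_{k h_k}(p_{jk}) > 0$. In this case we define $F_{h_1 h_2 \dots h_k}$ as the $(h_1, h_2, \dots, h_k)$th component of the discrete F-transform of $f$ with respect to these partitions, given by

$$F_{h_1 h_2 \dots h_k} = \frac{\sum_{j=1}^{m} f(p_{j1}, \dots, p_{jk})\, A_{1 h_1}(p_{j1}) \cdots A_{k h_k}(p_{jk})}{\sum_{j=1}^{m} A_{1 h_1}(p_{j1}) \cdots A_{k h_k}(p_{jk})}. \tag{11}$$

Now we define the inverse F-transform of $f$ with respect to the basic functions $\{A_{11}, \dots, A_{1 n_1}\}, \dots, \{A_{k1}, \dots, A_{k n_k}\}$ to be the function defined by setting for each point $p_j \in P$:

$$f^F_{n_1 \dots n_k}(p_{j1}, \dots, p_{jk}) = \sum_{h_1=1}^{n_1} \cdots \sum_{h_k=1}^{n_k} F_{h_1 \dots h_k}\, A_{1 h_1}(p_{j1}) \cdots A_{k h_k}(p_{jk}) \tag{12}$$

for $j = 1, \dots, m$. As in [32], it is possible to prove the following generalization of Theorem 1.

Theorem 2. Let $f$ be a function assigned on the set of points $P = \{p_1, \dots, p_m\}$, $p_j = (p_{j1}, \dots, p_{jk}) \in [a_1, b_1] \times \dots \times [a_k, b_k]$, $j = 1, \dots, m$. Then for every $\varepsilon > 0$, there exist $k$ integers $n_1 = n_1(\varepsilon), \dots, n_k = n_k(\varepsilon)$ and related fuzzy partitions

$$\{A_{11}, \dots, A_{1 n_1}\}, \dots, \{A_{k1}, \dots, A_{k n_k}\} \tag{13}$$

such that the set $P$ is sufficiently dense with respect to the fuzzy partitions (13) and for every $p_j \in P$, $j = 1, \dots, m$, the following inequality holds:

$$\left| f(p_{j1}, \dots, p_{jk}) - f^F_{n_1 \dots n_k}(p_{j1}, \dots, p_{jk}) \right| < \varepsilon. \tag{14}$$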

Strictly speaking, Theorems 1 and 2 assure that a discrete (even multi-dimensional) function can be approximated with arbitrary precision by a suitable inverse (multi-dimensional) F-transform, provided that a convenient fuzzy partition of the universe of discourse is found via related basic functions. As pointed out in our previous papers [35, 36, 38–40], unfortunately Theorem 1 (resp., Theorem 2) is not constructive, in the sense that it does not give a tool to find an integer $n(\varepsilon)$ (resp., integers $n_1(\varepsilon), \dots, n_k(\varepsilon)$) and basic functions $A_1, \dots, A_{n(\varepsilon)}$ (resp., (13)) such that (10) (resp., (14)) holds for an arbitrary $\varepsilon > 0$. In practice, we try several values of $n$ (resp., $n_1, \dots, n_k$), testing that the set $P$ is sufficiently dense with respect to the related fuzzy partition (resp., partitions), and then we use the two indexes (6) and (7) for controlling the choice of the best fuzzy partition (resp., partitions).
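The multi-dimensional direct and inverse F-transforms can be sketched along the same lines. The sketch below assumes uniform triangular partitions with the same cardinality on every context; the names `weight`, `f_transform_nd`, and `inverse_f_transform_nd`, as well as the toy grid, are hypothetical. The density test is performed implicitly: a combination of basic functions that covers no data point raises an error.

```python
from itertools import product
from math import prod


def tri(x, center, h):
    """Triangular basic function of half-width h centred on a node."""
    return max(0.0, 1.0 - abs(x - center) / h)


def weight(point, contexts, n, combo):
    """Product of the memberships of a point in one combination of basic
    functions (one 0-based basic-function index per attribute)."""
    factors = []
    for x, (a, b), j in zip(point, contexts, combo):
        h = (b - a) / (n - 1)
        factors.append(tri(x, a + j * h, h))
    return prod(factors)


def f_transform_nd(points, values, contexts, n):
    """Direct F-transform: one component per combination of basic functions."""
    components = {}
    for combo in product(range(n), repeat=len(contexts)):
        weights = [weight(p, contexts, n, combo) for p in points]
        total = sum(weights)
        if total == 0.0:
            raise ValueError(f"data not sufficiently dense for {combo}")
        components[combo] = sum(w * z for w, z in zip(weights, values)) / total
    return components


def inverse_f_transform_nd(point, components, contexts, n):
    """Inverse F-transform evaluated at one point."""
    return sum(F * weight(point, contexts, n, combo)
               for combo, F in components.items())


# Toy data (illustrative only): z = x + y on a 5 x 5 grid of [0, 1] x [0, 1].
grid = [i / 4 for i in range(5)]
points = [(x, y) for x in grid for y in grid]
values = [x + y for x, y in points]
contexts = [(0.0, 1.0), (0.0, 1.0)]
F = f_transform_nd(points, values, contexts, 3)
approx = inverse_f_transform_nd((0.5, 0.5), F, contexts, 3)
print(len(F), approx)
```

Enumerating every combination is exponential in the number of attributes, which is one reason for keeping a single, small cardinality for all the contexts in a sketch of this kind.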

4. Fuzzy Association Rules Extraction Process

We use the multi-dimensional direct and inverse F-transform for extracting fuzzy association rules in the form (2) from the dataset represented as in Table 1, where , is the uniform fuzzy partition of the context of the given attribute by setting , with the basic functions (3). For computational simplicity, we assume constant the cardinality of the fuzzy partition of each contexts , that is card . Let the multi-dimensional direct F-transform defined by (11) be corresponding to the combination and if is the given (expected) value, then the formula (11) is reduced to the following one:

In other words, is a mean of the values of the attribute weighted over . Following [18, 29], is given by the combination of one of the following linguistic hedges: Ex (extremely), Si (significantly), Ve (very), empty hedge, ML (more or less), Ro (roughly), QR (quite roughly), VR (very roughly) with one of the following expressions: Sm (small), Me (medium), Bi (big). Each linguistic hedge is modelled with a continuous function $\nu_{a,b,c}$ defined by means of three parameters $a$, $b$, $c$, with $0 \le a < b < c \le 1$, as

The fuzzy sets of each combination of linguistic hedges with one of the expressions “Small,” “Medium,” “Big” are defined by where the locution “Int” stands for the intensity of the linguistic expression and

The extension of a linguistic expression is obtained via a simple linear transformation defined, considering the context of the attribute , as

The linguistic expression represents a weighted mean of the values of the attribute , in which the weights are given by the membership values of the basic functions. In Figure 6 we show the three linear functions LH, MH, RH.

As an example, in Figure 7 (resp., Figure 8) we show the fuzzy sets “Ro small,” “Ro medium,” and “Ro big” (resp., “Ex small,” “Ex medium,” and “Ex big”) determined by setting . In other words, the experts can assign specific labels to the linguistic expressions which are representative of their reasoning, or can use the same label for several fuzzy sets. For example, to the fuzzy set “Ex medium” they can associate the label “perfectly on the average.” In accordance with [18, 29], we define a partial ordering “≤” in the set of the linguistic hedges as where is one of the following expressions: “small,” “medium,” or “big.” If we obtain the same membership degree for two or more linguistic expressions, we assign to the linguistic expression of the “lowest fuzzy set,” which is the sharpest evaluative expression with respect to the partial ordering “≤.” For example, if we have , we assign the linguistic expression “Ex big” to because “Ex big” is the lowest fuzzy set among all the fuzzy sets “ big” (like “Ex big” and “Ro big”) assuming the value 1. In order to use the formulae (18), the dataset has to be sufficiently dense with respect to the chosen fuzzy partitions of basic functions.
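The tie-breaking rule just described can be sketched as follows. The sketch assumes that the hedges are ordered from the sharpest to the vaguest exactly as they are listed in the text (Ex, Si, Ve, empty hedge, ML, Ro, QR, VR); this ordering, the function name `sharpest_expression`, and the membership degrees of the example are assumptions made for illustration. How the membership degrees themselves are computed is not covered here.

```python
# Hedges from the sharpest to the vaguest (assumed ordering); the empty
# hedge is written as an empty string.
HEDGES = ["Ex", "Si", "Ve", "", "ML", "Ro", "QR", "VR"]


def sharpest_expression(degrees):
    """Pick the expression with the highest membership degree; ties are
    broken in favour of the sharpest hedge.

    degrees -- dict {(hedge, base): membership degree}, with base in
               {"small", "medium", "big"}
    """
    best = max(degrees.values())
    tied = [key for key, degree in degrees.items() if degree == best]
    hedge, base = min(tied, key=lambda key: HEDGES.index(key[0]))
    return f"{hedge} {base}".strip()


# Hypothetical membership degrees of one F-transform component.
degrees = {("Ro", "big"): 1.0, ("Ex", "big"): 1.0, ("VR", "medium"): 0.4}
label = sharpest_expression(degrees)
print(label)  # Ex big
```

The ordering is only defined among expressions sharing the same base term, so a tie between, say, “Ex small” and “Ex medium” is resolved arbitrarily by this sketch.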

We set the optimal value of the parameter n using the pre-processing phase schematized in Figure 4, in which we check that, for each combination of basic functions, the number of data points defined in (4) is not lower than a prefixed threshold; otherwise the cardinality of each fuzzy partition is decremented and the process is iterated. In this way we impose that the data points are sufficiently dense with respect to the set of fuzzy partitions, and we are sure that the fuzzy partition created is neither too fine for the data nor coarser than necessary. We set the initial value of the cardinality of the fuzzy partitions to . The pseudocode of the algorithm of the pre-processing phase is reported below.
(1) Set n to the initial cardinality
(2) Set the minimal data point density ρε
(3) For each combination of basic functions
    (a) calculate the number of data points defined in (4)
    (b) if this number is lower than ρε, then
        (i) decrement n by 1
        (ii) if n is below the minimum admissible cardinality, exit (inconsistent dataset)
        (iii) exit for
    (c) end if
(4) Next
(5) Return n
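The pre-processing phase can be sketched in Python as a loop that lowers the cardinality until every combination of basic functions is covered by enough data points. Several details are assumptions of this sketch rather than statements of the paper: the density threshold is compared with a raw count of data points, a data point is counted for a combination when all its memberships are positive, and the smallest admissible cardinality `n_min` is set to 3. All names are hypothetical.

```python
from itertools import product


def tri(x, center, h):
    """Triangular basic function of half-width h centred on a node."""
    return max(0.0, 1.0 - abs(x - center) / h)


def counts_per_combination(rows, contexts, n):
    """Number of rows with positive membership in each combination of
    basic functions, for uniform triangular partitions of cardinality n."""
    steps = [(b - a) / (n - 1) for a, b in contexts]
    counts = {combo: 0 for combo in product(range(n), repeat=len(contexts))}
    for row in rows:
        # Basic functions with positive membership, attribute by attribute.
        active = [
            [j for j in range(n) if tri(x, a + j * h, h) > 0.0]
            for x, (a, _), h in zip(row, contexts, steps)
        ]
        for combo in product(*active):
            counts[combo] += 1
    return counts


def optimal_cardinality(rows, contexts, n_start, min_density, n_min=3):
    """Decrease n from n_start until every combination holds at least
    min_density data points; return None if no n >= n_min qualifies."""
    n = n_start
    while n >= n_min:
        counts = counts_per_combination(rows, contexts, n)
        if min(counts.values()) >= min_density:
            return n
        n -= 1
    return None  # inconsistent dataset: even the coarsest partition fails


# Toy data (illustrative only): a 9 x 9 grid of points in [0, 1] x [0, 1].
rows = [(i / 8, j / 8) for i in range(9) for j in range(9)]
contexts = [(0.0, 1.0), (0.0, 1.0)]
n_opt = optimal_cardinality(rows, contexts, n_start=6, min_density=5)
print(n_opt)
```

Unlike the flat pseudocode, the sketch restarts the check over all the combinations after each decrement, so the returned cardinality is the largest one, not exceeding the initial value, that satisfies the density constraint.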

The successive extraction process of the fuzzy association rules is composed of two sub-processes, schematized in Figure 5. In the first sub-process we use the AprioriGen algorithm for selecting the candidate fuzzy association rules and for choosing the antecedents with maximal dimension and support greater than or equal to the threshold . The AprioriGen algorithm is composed of two steps: the join step and the pruning step. In the join step a candidate itemset of $r$ attributes is generated by merging two $(r-1)$-itemsets having the same first $r-2$ attributes. In the pruning step all the candidates having an $(r-1)$-subset that is not large (that is, whose support is below the threshold) are deleted. In the successive sub-process the fuzzy association rule is extracted if the grade of confidence (7) is greater than or equal to the threshold conε; it is calculated via the inverse F-transform [31, 35] given by the following formula (similar to (12)): obtained considering all the multi-dimensional F-transform components of the combination of basic functions . If the confidence index (7) is greater than or equal to the threshold , the F-transform component corresponding to the basic functions in the antecedent of the potential fuzzy association rule is used for determining the linguistic expression of the consequent of the final fuzzy association rule. If we obtain the same membership degree for two or more linguistic expressions, we assign the linguistic expression of the lowest fuzzy set to according to the above partial ordering “≤.” The related pseudocode is reported below.
(1) Set the support and confidence thresholds
(2) Apply the AprioriGen algorithm
(3) Return the potential fuzzy association rule set
(4) For each potential fuzzy association rule
    (5) For each combination of basic functions, calculate the direct F-transform component
    (6) Next
    (7) For each data object, calculate the inverse F-transform
    (8) Next
    (9) Calculate the confidence index (7) and call it “con”
    (10) If con ≥ conε, then
        (11) assign the linguistic expression determined by the F-transform component to the consequent
        (12) insert the fuzzy association rule in the fuzzy association rule set
    (13) End if
(14) Next
(15) Return the fuzzy association rule set
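The join and pruning steps of the candidate generation can be sketched as follows, in the classical formulation of Apriori-style candidate generation. Itemsets are represented as sorted tuples of generic item identifiers; in the setting of this paper an item would be an (attribute, basic function) pair, but that encoding and the function name `apriori_gen` are assumptions of the sketch.

```python
from itertools import combinations


def apriori_gen(frequent):
    """Generate candidate (r+1)-itemsets from the large r-itemsets.

    frequent -- collection of sorted tuples, all of the same length r
    """
    frequent = set(frequent)
    ordered = sorted(frequent)
    candidates = set()
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # Join step: merge two itemsets sharing the same first r-1 items.
            if first[:-1] != second[:-1]:
                break  # sorted order: no later itemset shares this prefix
            candidate = first + (second[-1],)
            # Pruning step: every r-subset of the candidate must be large.
            if all(sub in frequent
                   for sub in combinations(candidate, len(first))):
                candidates.add(candidate)
    return candidates


large_3 = {(1, 2, 3), (1, 2, 4), (1, 3, 4), (1, 3, 5), (2, 3, 4)}
print(apriori_gen(large_3))  # {(1, 2, 3, 4)}
```

In the example the join step produces (1, 2, 3, 4) and (1, 3, 4, 5); the latter is pruned because its subset (1, 4, 5) is not among the large 3-itemsets.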

In Section 5 we present some results obtained from datasets relative to the 2001 census ISTAT (Istituto Nazionale di Statistica) concerning the municipalities of the district of Naples (Italy).

5. A Simulation Result

We consider a first dataset of the last ISTAT census database of the municipalities of the district of Naples (Italy). This dataset is obtained by extracting the information about residents with jobs and about families. We use the following notation: stands for census code, stands for the percentage of not employed, for the percentage of managers and professionals, stands for the percentage of women employed, stands for the percentage of graduates employed, stands for the percentage of residential houses of property, stands for the percentage of families with two or more houses of property, and finally stands for the percentage of families with more than two sons. In the pre-processing phase we set , obtaining n = 5 as the optimal cardinality of the partition of each attribute. Each fuzzy partition is uniform and constructed by using five triangular fuzzy sets of the form (3). We set both values of and to 0.1. The domain expert has suggested the values reported in Table 3 for the parameters a, b, c.

After the extraction process we obtain four fuzzy association rules (cf. Table 4). To verify the reliability of these results, in Table 5 we report the values (as percentages) of the confidence index given by (7), obtained for each basic function of the attribute , in the consequent of the extracted fuzzy association rules. The linguistic expression in the consequents of the fuzzy association rules in Table 4 can be roughly interpreted as a mean of the fuzzy sets weighted by the value of the confidence index. In Table 6 we report the percentages of the inclusion areas of each basic function with the fuzzy set associated to the linguistic expression in the consequent with respect to the area of the basic function. Each inclusion area is the area given by the intersection between the basic function and the fuzzy set.

From the comparison of Tables 5 and 6, the confidence index is approximately equal to the corresponding percentage of inclusion. In Figure 9 (resp., Figure 10) we show graphically the inclusion areas for the association rule R2 (resp., R3). The fuzzy set associated with the linguistic expression in the consequent is in orange colour. We can then state that the fuzzy association rules reported in Table 4 can be interpreted as coarse-grained fuzzy association rules in which the linguistic expression in the consequent approximates a mean of the finer fuzzy sets given by the basic functions (3) for the attribute . This approximation clearly depends on the values of a, b, c.

The next dataset consists of attributes describing characteristics of residential buildings and houses. The notation concerning the attributes is the following: stands for census code, stands for percentage of residential buildings constructed during the last 5 years, stands for percentage of residential buildings with maintenance during the last 5 years, stands for mean year of last maintenance, stands for mean number of residential houses, stands for mean surface of residential houses, stands for percentage of residential houses with central heating. In the pre-processing phase we set , obtaining n = 6 as the optimal cardinality of the partition of each attribute. In Figure 11 we show the six basic functions of the form (3) which give the uniform fuzzy partition of each context. After the extraction process we obtain three fuzzy association rules (cf. Table 7).

Then we calculate the confidence index (7) for all the antecedents of the fuzzy association rules reported in Table 7. In Table 8 we report as percentages the values of the confidence index obtained for each basic function of the attribute in the consequent of the extracted fuzzy association rules.

In Table 9 we report the inclusion areas of each basic function with the fuzzy set associated to the linguistic expression in the consequent with respect to the area of the basic function.

By comparing Tables 8 and 9, we can state that for this dataset as well the linguistic expression in the fuzzy association rules can be roughly interpreted as a mean of the fuzzy sets weighted by the value of the confidence index. In Figure 12 (resp., Figure 13) we show graphically the inclusion areas for the association rule R1 (resp., R2). The fuzzy set associated with the linguistic expression in the consequent is represented in orange colour.

The results confirm that the F-transforms can be used to extract fuzzy association rules in the form (2) in a coarse-grained view from datasets. The comparison of the results suggests that the linguistic evaluation used for the attribute in the consequent, calculated using the inverse F-transform, can be estimated as a weighted mean of the finer fuzzy sets given by the basic functions (3), where the weights are given by the confidence index (7).

6. Conclusions

We propose the use of multi-dimensional F-transforms, which allow the extraction of fuzzy association rules from datasets in a coarse-grained form. Our approach always allows us to check that the set of the assigned points is sufficiently dense with respect to the basic functions of the partition, and we use the support and confidence indexes for selecting and analyzing fuzzy association rules.

This method can be used in data mining processes in which a fine exploration of fuzzy association rules between attributes in the datasets is not necessary. In future work the authors intend to explore the performance of this method on very large datasets and to compare the results with the ones obtained by using other well-known existing methods, such as clustering-based and evolutionary-based ones.