Introduction
Related work
Proposed methodology
Association rules mining
FP-growth algorithm
Apache Spark
Spark architecture
Multi-criteria decision analysis approach (MCDA)
| | \(F_1\) | \(F_2\) | … | \(F_k\) | … | \(F_q\) |
---|---|---|---|---|---|---|
\(A_1\) | \(f_1(a_1)\) | \(f_2(a_1)\) | … | \(f_k(a_1)\) | … | \(f_q(a_1)\) |
\(A_2\) | \(f_1(a_2)\) | \(f_2(a_2)\) | … | \(f_k(a_2)\) | … | \(f_q(a_2)\) |
… | … | … | … | … | … | … |
\(A_n\) | \(f_1(a_n)\) | \(f_2(a_n)\) | … | \(f_k(a_n)\) | … | \(f_q(a_n)\) |
Quality measurements
Measure | Formula | Eq. |
---|---|---|
Support: the proportion of transactions in the database that contain all items of A and B [2] | \(Support(A \to B) = \frac{\lvert t(A \cup B) \rvert}{\lvert D \rvert}\), where \(t(X)\) is the set of transactions containing \(X\) and \(\lvert D \rvert\) is the total number of transactions | (1) |
Confidence: how frequently the items in B appear in transactions that contain A [2]; it ranges from 0 to 1 | \(Confidence(A \to B) = \frac{Support(A \cup B)}{Support(A)}\) | (2) |
Lift: how far A and B are from independence [16]; it ranges within [0, +∞) | \(Lift(A \to B) = \frac{Support(A \cup B)}{Support(A) \times Support(B)}\) | (9) |
Laplace: a confidence estimator that takes support into account [17]; it ranges within [0, 1] | \(lapl(A \to B) = \frac{Support(A \cup B) + 1}{Support(A) + 2}\) | (10) |
Conviction: the degree of implication of a rule [18]; it ranges within [0.25, +∞) | \(conv(A \to B) = \frac{1 - Support(B)}{1 - Confidence(A \to B)}\) | (11) |
Leverage: how much more often the antecedent and consequent co-occur than would be expected if they were independent [19] | \(leve(A \to B) = Support(A \cup B) - Support(A) \times Support(B)\) | (12) |
Jaccard: the degree of overlap between the cases covered by the antecedent and by the consequent [20]; it takes values in [0, 1] | \(Jacc(A \to B) = \frac{Support(A \cup B)}{Support(A) + Support(B) - Support(A \cup B)}\) | (13) |
Phi-coefficient | \(\phi(A \to B) = \frac{leve(A \to B)}{\sqrt{Support(A) \times Support(B) \times (1 - Support(A)) \times (1 - Support(B))}}\) | (14) |
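To make the definitions above concrete, the following is a minimal Python sketch that evaluates the eight measures for one rule over a small list of transactions. The function name `rule_measures` and the toy transactions are hypothetical and are not taken from the study. Two assumptions are made: supports are relative frequencies (count divided by the number of transactions), and the Laplace estimator is computed on absolute counts, as is customary, rather than on relative supports.

```python
from math import inf, sqrt


def rule_measures(transactions, antecedent, consequent):
    """Interestingness measures for the rule antecedent -> consequent.

    Assumes both sides occur in at least one transaction.
    """
    a, b = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)          # transactions containing A
    n_b = sum(1 for t in transactions if b <= t)          # transactions containing B
    n_ab = sum(1 for t in transactions if (a | b) <= t)   # transactions containing A and B
    if n_a == 0 or n_b == 0:
        raise ValueError("antecedent and consequent must each occur at least once")

    s_a, s_b, s_ab = n_a / n, n_b / n, n_ab / n
    conf = s_ab / s_a
    leverage = s_ab - s_a * s_b
    denom = sqrt(s_a * s_b * (1 - s_a) * (1 - s_b))
    return {
        "support": s_ab,
        "confidence": conf,
        "lift": s_ab / (s_a * s_b),
        "laplace": (n_ab + 1) / (n_a + 2),  # assumption: absolute counts
        "conviction": inf if conf == 1 else (1 - s_b) / (1 - conf),
        "leverage": leverage,
        "jaccard": s_ab / (s_a + s_b - s_ab),
        "phi": leverage / denom if denom > 0 else 0.0,
    }


# Hypothetical toy data, for illustration only.
transactions = [
    {"Summer", "Day", "Clear"},
    {"Summer", "Day"},
    {"Summer", "Night"},
    {"Winter", "Day", "Rain"},
    {"Winter", "Night", "Rain"},
]
measures = rule_measures(transactions, {"Summer"}, {"Day"})
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Conviction is reported as infinity when the confidence equals 1, since the denominator of Eq. (11) vanishes in that case.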
Proposed approach
Empirical study: road accident analysis
Attribute name | Values | Description |
---|---|---|
Accident_ID | Integer | Identification of accident |
Accident_Type | Fatal, Injury, Property damage | Accident type |
Driver_Age | < 20, [21–27], [28–60], > 61 | Driver age |
Driver_Sex | M, F | Driver sex |
Driver_Experience | < 1, [2–4], > 5 | Driver experience |
Vehicle_Age | [1–2], [3–4], [5–6], > 7 | Service year of the vehicle |
Vehicle_Type | Car, Trucks, Motorcycles, Other | Type of the vehicle |
Light_Condition | Daylight, Twilight, Public lighting, Night | Light conditions |
Weather_Condition | Normal weather, Rain, Fog, Wind, Snow | Weather conditions |
Road_Condition | Highway, Ice Road, Collapse Road, Unpaved Road | Road conditions |
Road_Geometry | Horizontal, Alignment, Bridge, Tunnel | Road geometry |
Road_Age | [1–2], [3–5], [6–10], [11–20], > 20 | The age of road |
Time | [00–6], [6–12], [12–18], [18–00] | Accident time |
City | Marrakesh, Casablanca, Rabat… | Name of city where accident occurred |
Particular_Area | School, Market, Shops… | Where the accident occurred |
Season | Autumn, Spring, Summer, Winter | Seasons of year |
Day | Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday | Days of week |
Accident_Causes | Alcohol effects, Fatigue, Loss of control, Speed, Pushed by another vehicle, Brake failure | Causes of accident |
Number_of_injuries | 1, [2–5], [6–10], > 10 | Number of injuries |
Number_of_deaths | 1, [2–5], [6–10], > 10 | Number of deaths |
Victim_Age | < 1, [1–2], [3–5], > 5 | Victim age |
Results and discussion
Software | Node environment |
---|---|
Apache Spark 2.0 | Single node |
Scala IDE 4.4.1 | Memory: 12 GB |
SBT 0.13 | OS: Ubuntu 16.04 LTS |
| | CPU: 2.7 GHz, i7 |
Id | Frequent itemset | Support |
---|---|---|
1 | [Collapse Road] | 40 |
2 | [Collapse Road, Clear] | 28 |
3 | [Collapse Road, Summer] | 27 |
4 | [Collapse Road, M] | 35 |
5 | [Collapse Road, M, Day] | 28 |
6 | [Collapse Road, Day] | 31 |
7 | [ [21–27], M] | 27 |
8 | [ [21–27], Day] | 32 |
9 | [Clear] | 54 |
10 | [Clear, M] | 37 |
11 | [Clear, M, Day] | 30 |
12 | [Clear, Day] | 45 |
13 | [Horizontal, [21–27]] | 27 |
14 | [Horizontal, M] | 27 |
15 | [Summer] | 52 |
16 | [Summer, Clear] | 39 |
17 | [Summer, Clear, M] | 27 |
18 | [Summer, Clear, Day] | 35 |
19 | [Summer, M] | 36 |
20 | [Summer, M, Day] | 32 |
21 | [Summer, Car, Day] | 30 |
22 | [Summer, Day] | 48 |
23 | [S] | 38 |
24 | [S, Day] | 32 |
25 | [M] | 62 |
… | … | … |
27 | [M, Day] | 47 |
28 | [Car, Clear] | 35 |
29 | [Car, Clear, M] | 27 |
30 | [Car, Clear, Day] | 30 |
31 | [Car, M] | 38 |
32 | [Car, M, Day] | 29 |
33 | [Car, Day] | 41 |
34 | [Unpaved Road] | 37 |
35 | [Unpaved Road, Day] | 27 |
36 | [Fatal] | 42 |
37 | [Fatal, Clear] | 33 |
38 | [Fatal, Clear, M] | 28 |
39 | [Fatal, Clear, Day] | 28 |
40 | [Fatal, Summer] | 29 |
N | Antecedent | Consequent | Conf |
---|---|---|---|
1 | Fatal | M | 0.85 |
2 | Fatal, Day | Clear | 0.90 |
3 | Clear, M | Day | 0.81 |
4 | Car, Clear | Day | 0.85 |
5 | Fatal, Summer | Clear | 0.93 |
6 | [12–18], Summer | Day | 0.90 |
7 | Summer, Clear | Day | 0.89 |
8 | Clear | Day | 0.83 |
9 | 3 | Car | 1.0 |
10 | 2, Day | Clear | 0.96 |
11 | Collapse Road, Day | M | 0.90 |
12 | Summer, Car | Day | 0.96 |
13 | Collapse Road, M | Day | 0.80 |
14 | [6–12] | Day | 0.93 |
15 | [12–18], Day | Summer | 0.83 |
16 | < 2, Clear | Day | 0.96 |
17 | [21–27] | Day | 0.84 |
18 | Summer, M | Day | 0.88 |
19 | Summer | Day | 0.92 |
20 | S | Day | 0.84 |
21 | [12–18] | Day | 0.81 |
22 | < 2 | Clear | 0.90 |
23 | < 2 | Car | 0.93 |
24 | < 2 | Day | 0.90 |
25 | Collapse Road | M | 0.87 |
26 | Md | M | 0.86 |
27 | Fatal, Clear | M | 0.84 |
28 | Fatal, Clear | Day | 0.84 |
29 | Fatal, Clear | Summer | 0.81 |
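The Conf column can be cross-checked against the frequent-itemset table, since the confidence of a rule is the support of the full itemset divided by the support of its antecedent. The sketch below does this for a few itemsets whose supports are copied from that table. The helper name `single_consequent_rules` and the 0.75 threshold are illustrative assumptions, not settings reported in the study.

```python
# Supports copied from the frequent-itemset table (a small subset).
itemset_support = {
    frozenset(["Summer"]): 52,
    frozenset(["Summer", "Day"]): 48,
    frozenset(["Clear"]): 54,
    frozenset(["Clear", "Day"]): 45,
    frozenset(["Collapse Road", "M"]): 35,
    frozenset(["Collapse Road", "M", "Day"]): 28,
}


def single_consequent_rules(supports, min_conf):
    """Yield (antecedent, consequent, confidence) for rules with one item
    on the right-hand side whose antecedent is itself a listed itemset."""
    for itemset in supports:
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            if antecedent in supports:
                conf = supports[itemset] / supports[antecedent]
                if conf >= min_conf:
                    yield sorted(antecedent), item, round(conf, 3)


# 0.75 is a hypothetical threshold chosen for this illustration.
rules = list(single_consequent_rules(itemset_support, min_conf=0.75))
for antecedent, consequent, conf in rules:
    print(f"{', '.join(antecedent)} -> {consequent}: {conf}")
```

For these itemsets the sketch gives 48/52 ≈ 0.923 for Summer → Day, 45/54 ≈ 0.833 for Clear → Day and 28/35 = 0.80 for Collapse Road, M → Day, which agrees with rules 19, 8 and 13 of the table up to the two decimals shown there.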
Rule\criteria | Support | Confidence | Laplace | Lift | Conviction | Leverage | Jaccard | Phi-coeff |
---|---|---|---|---|---|---|---|---|
Rule1 | 36 | 85 | 97 | 2 | 87 | 30 | 50 | 19 |
Rule2 | 31 | 90 | 96 | 2 | 93 | 52 | 55 | 18 |
Rule3 | 37 | 81 | 97 | 2 | 83 | 66 | 65 | 78 |
Rule4 | 35 | 85 | 97 | 2 | 87 | 90 | 36 | 98 |
Rule5 | 29 | 93 | 96 | 3 | 96 | 90 | 45 | 69 |
Rule6 | 33 | 90 | 97 | 2 | 92 | 12 | 89 | 98 |
Rule7 | 39 | 89 | 97 | 2 | 91 | 93 | 45 | 89 |
Rule8 | 54 | 83 | 98 | 1 | 84 | 10 | 65 | 98 |
Rule9 | 29 | 100 | 96 | 3 | 2 | 59 | 43 | 85 |
Rule10 | 30 | 96 | 96 | 3 | 1 | 0 | 45 | 97 |
Rule11 | 31 | 90 | 96 | 2 | 93 | 20 | 16 | 58 |
Rule12 | 31 | 96 | 96 | 3 | 1 | 33 | 99 | 56 |
Rule13 | 35 | 80 | 97 | 2 | 82 | 99 | 89 | 45 |
Rule14 | 32 | 93 | 97 | 2 | 95 | 46 | 78 | 68 |
Rule15 | 36 | 83 | 97 | 2 | 85 | 85 | 98 | 86 |
Rule16 | 30 | 96 | 96 | 3 | 0 | 75 | 69 | 87 |
Rule17 | 38 | 84 | 97 | 2 | 86 | 56 | 68 | 84 |
Rule18 | 36 | 88 | 97 | 2 | 90 | 89 | 94 | 98 |
Rule19 | 52 | 92 | 98 | 1 | 93 | 58 | 59 | 56 |
Rule20 | 38 | 84 | 97 | 2 | 86 | 87 | 93 | 54 |
Rule21 | 44 | 81 | 97 | 1 | 83 | 69 | 98 | 36 |
Rule22 | 30 | 90 | 96 | 3 | 93 | 58 | 97 | 57 |
Rule23 | 30 | 93 | 96 | 3 | 96 | 69 | 94 | 58 |
Rule24 | 30 | 90 | 96 | 3 | 93 | 39 | 98 | 98 |
Rule25 | 40 | 87 | 94 | 2 | 92 | 93 | 93 | 97 |
Rule26 | 36 | 86 | 97 | 2 | 88 | 58 | 89 | 95 |
Rule27 | 33 | 84 | 97 | 2 | 86 | 79 | 97 | 97 |
Rule28 | 33 | 84 | 97 | 2 | 86 | 87 | 36 | 65 |
Rule29 | 33 | 81 | 97 | 2 | 83 | 89 | 91 | 95 |
Criteria | Support | Confidence | Laplace | Lift | Conviction | Leverage | Jaccard | Phi-coeff |
---|---|---|---|---|---|---|---|---|
Weight | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
No. | Support | Confidence | Laplace | Lift | Conviction | Leverage | Jaccard | Phi-coeff |
---|---|---|---|---|---|---|---|---|
Rule1 | 0.3214 | − 0.2500 | 0.3214 | − 0.7857 | − 0.2500 | − 0.7857 | − 0.5714 | − 0.8929 |
Rule2 | − 0.4286 | 0.2857 | − 0.6071 | 0.3929 | 0.3571 | − 0.5714 | − 0.5000 | − 0.9643 |
Rule3 | 0.5000 | − 0.8571 | 0.3214 | 0.2857 | − 0.8571 | − 0.1429 | − 0.3214 | − 0.0357 |
Rule4 | 0.1071 | − 0.2500 | 0.3214 | 0.0000 | − 0.2500 | 0.6071 | − 0.8929 | 0.8214 |
Rule5 | − 0.9643 | 0.6429 | − 0.6071 | 0.8571 | 0.6786 | 0.6071 | − 0.7143 | − 0.1071 |
Rule6 | − 0.1071 | 0.2857 | 0.3214 | 0.2143 | 0.1071 | − 0.9286 | 0.0714 | 0.8214 |
Rule7 | 0.7143 | 0.0714 | 0.3214 | − 0.3214 | 0.0000 | 0.7500 | − 0.7143 | 0.3214 |
Rule8 | 1.0000 | − 0.6786 | 0.9643 | − 1.0000 | − 0.7143 | − 1.0000 | − 0.3214 | 0.8214 |
Rule9 | − 0.9643 | 1.0000 | − 0.6071 | 1.0000 | 1.0000 | − 0.2143 | 1.0000 | 0.1071 |
Rule10 | − 0.7143 | 0.8571 | − 0.6071 | 0.8571 | 0.8571 | 0.9643 | − 0.7143 | 0.5714 |
Rule11 | − 0.4286 | 0.2857 | − 0.6071 | 0.3929 | 0.3571 | − 0.8571 | − 1.0000 | − 0.3571 |
Rule12 | − 0.4286 | 0.8571 | − 0.6071 | 0.6429 | 0.8571 | 0.9643 | 0.9286 | − 0.5714 |
Rule13 | 0.1071 | − 1.0000 | 0.3214 | − 0.3214 | − 1.0000 | 0.8571 | 0.0714 | − 0.7500 |
Rule14 | − 0.2857 | 0.6429 | 0.3214 | − 0.1786 | 0.5714 | − 0.6429 | − 0.0714 | − 0.1786 |
Rule15 | 0.3214 | − 0.6786 | 0.3214 | − 0.1786 | − 0.6429 | 0.2143 | 0.7857 | 0.1786 |
Rule16 | − 0.7143 | 0.8571 | − 0.6071 | 0.8571 | 0.8571 | 0.0714 | − 0.1429 | 0.2500 |
Rule17 | 0.6071 | − 0.4643 | 0.3214 | − 0.5357 | − 0.4643 | − 0.5000 | − 0.2143 | 0.0357 |
Rule18 | 0.3214 | 0.0000 | 0.3214 | 0.0714 | − 0.0714 | 0.4643 | 0.4643 | 0.8214 |
Rule19 | 0.9286 | 0.5000 | 0.9643 | − 0.9286 | 0.3571 | − 0.3571 | − 0.4286 | − 0.5714 |
Rule20 | 0.6071 | − 0.4643 | 0.3214 | − 0.5357 | − 0.4643 | 0.3214 | 0.3214 | − 0.6786 |
Rule21 | 0.8571 | − 0.8571 | 0.3214 | − 0.8571 | − 0.8571 | − 0.0357 | 0.7857 | − 0.8214 |
Rule22 | − 0.7143 | 0.2857 | − 0.6071 | 0.5357 | 0.3571 | − 0.3571 | 0.6071 | − 0.4643 |
Rule23 | − 0.7143 | 0.6429 | − 0.6071 | 0.7143 | 0.6786 | − 0.0357 | 0.4643 | − 0.3571 |
Rule24 | − 0.7143 | 0.2857 | − 0.6071 | 0.5357 | 0.3571 | − 0.7143 | 0.7857 | 0.8214 |
Rule25 | 0.7857 | − 0.0714 | − 1.0000 | − 0.7143 | 0.1071 | 0.7500 | 0.3214 | 0.0000 |
Rule26 | 0.3214 | − 0.1429 | 0.3214 | − 0.0714 | − 0.1429 | − 0.3571 | 0.0714 | 0.4286 |
Rule27 | − 0.1071 | − 0.4643 | 0.3214 | − 0.5357 | − 0.4643 | 0.1429 | 0.6071 | 0.5714 |
Rule28 | − 0.1071 | − 0.4643 | 0.3214 | − 0.5357 | − 0.4643 | 0.3214 | − 0.8929 | − 0.2500 |
Rule29 | − 0.1071 | − 0.8571 | 0.3214 | 0.1429 | − 0.8571 | 0.4643 | 0.2143 | 0.4286 |
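The values in this table behave like single-criterion net flows. As a hedged illustration, the Python sketch below assumes a strict "usual" preference function (a rule is preferred to another on a criterion whenever its value is strictly larger) and computes, for each rule, the number of rules it beats minus the number that beat it, divided by n − 1 = 28. The function name is hypothetical, and the source does not state which preference function was actually used. The input is the Support column of the rule/criteria table.

```python
# Support column of the rule/criteria table (Rule1 ... Rule29), copied as-is.
support = [36, 31, 37, 35, 29, 33, 39, 54, 29, 30, 31, 31, 35, 32, 36,
           30, 38, 36, 52, 38, 44, 30, 30, 30, 40, 36, 33, 33, 33]


def single_criterion_net_flows(values):
    """Net flow of each alternative on one criterion to be maximised,
    assuming the usual (strict) preference function."""
    n = len(values)
    flows = []
    for x in values:
        wins = sum(1 for y in values if x > y)    # alternatives that x outranks
        losses = sum(1 for y in values if x < y)  # alternatives that outrank x
        flows.append((wins - losses) / (n - 1))
    return flows


flows = single_criterion_net_flows(support)
print([round(f, 4) for f in flows[:3]])  # → [0.3214, -0.4286, 0.5]
```

Under this assumption the sketch returns 0.3214 for Rule1, 1.0000 for Rule8 and −0.9643 for Rule5, which matches the Support column above. Criteria that are shown only as rounded integers in the rule/criteria table cannot be checked this way, because ties introduced by rounding change the counts.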
Order | Rules | Phi | Phi+ | Phi− |
---|---|---|---|---|
1 | Rule12 | 0.3571 | 0.6384 | 0.2813 |
2 | Rule18 | 0.3170 | 0.6027 | 0.2857 |
3 | Rule10 | 0.2768 | 0.5848 | 0.3080 |
4 | Rule16 | 0.2054 | 0.5580 | 0.3527 |
5 | Rule7 | 0.1518 | 0.5313 | 0.3795 |
6 | Rule23 | 0.1250 | 0.5179 | 0.3929 |
7 | Rule6 | 0.1161 | 0.4911 | 0.3750 |
8 | Rule24 | 0.1116 | 0.4911 | 0.3795 |
9 | Rule26 | 0.0714 | 0.4821 | 0.4107 |
10 | Rule5 | 0.0670 | 0.4911 | 0.4241 |
11 | Rule4 | 0.0670 | 0.4777 | 0.4107 |
12 | Rule19 | 0.0670 | 0.5134 | 0.4464 |
13 | Rule15 | 0.0536 | 0.4732 | 0.4196 |
14 | Rule25 | 0.0402 | 0.4509 | 0.4107 |
15 | Rule14 | 0.0402 | 0.4777 | 0.4375 |
16 | Rule27 | 0.0268 | 0.4464 | 0.4196 |
17 | Rule29 | − 0.0223 | 0.4330 | 0.4554 |
18 | Rule22 | − 0.0268 | 0.4286 | 0.4554 |
19 | Rule20 | − 0.0536 | 0.4107 | 0.4643 |
20 | Rule9 | − 0.1071 | 0.4196 | 0.5268 |
21 | Rule8 | − 0.1161 | 0.4241 | 0.5402 |
22 | Rule17 | − 0.1339 | 0.3750 | 0.5089 |
23 | Rule3 | − 0.1384 | 0.3839 | 0.5223 |
24 | Rule21 | − 0.1741 | 0.3616 | 0.5357 |
25 | Rule13 | − 0.2054 | 0.3527 | 0.5580 |
26 | Rule2 | − 0.2455 | 0.3304 | 0.5759 |
27 | Rule28 | − 0.2500 | 0.3080 | 0.5580 |
28 | Rule11 | − 0.2679 | 0.3170 | 0.5848 |
29 | Rule1 | − 0.3527 | 0.2768 | 0.6295 |
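In the ranking above, the net flow of every rule equals the difference between its positive and negative flows (for Rule12, 0.6384 − 0.2813 = 0.3571). The following Python sketch shows how such a ranking can be produced in the PROMETHEE II style. It is a simplified illustration under stated assumptions: the usual (strict) preference function, all criteria maximised, and equal weights as in the weight table. The rule names and score values are made up for the example and the function name is hypothetical, so the output is not meant to reproduce the table.

```python
def promethee_net_flows(scores, weights):
    """Return {name: (phi, phi_plus, phi_minus)} for alternatives scored on
    several criteria, assuming the usual preference function."""
    names = list(scores)
    n = len(names)
    total_weight = sum(weights)

    def preference(a, b):
        # Weighted share of criteria on which a is strictly better than b.
        better = sum(w for w, x, y in zip(weights, scores[a], scores[b]) if x > y)
        return better / total_weight

    flows = {}
    for a in names:
        others = [b for b in names if b != a]
        phi_plus = sum(preference(a, b) for b in others) / (n - 1)
        phi_minus = sum(preference(b, a) for b in others) / (n - 1)
        flows[a] = (phi_plus - phi_minus, phi_plus, phi_minus)
    return flows


# Hypothetical scores on three criteria, for illustration only.
scores = {
    "rule_a": [36, 85, 30],
    "rule_b": [54, 83, 10],
    "rule_c": [31, 96, 33],
}
flows = promethee_net_flows(scores, weights=[1.0, 1.0, 1.0])
ranking = sorted(flows, key=lambda name: flows[name][0], reverse=True)
for order, name in enumerate(ranking, start=1):
    phi, phi_plus, phi_minus = flows[name]
    print(order, name, round(phi, 4), round(phi_plus, 4), round(phi_minus, 4))
```

Sorting by the net flow gives a complete order of the rules, which is the property that makes this kind of outranking method convenient for prioritising a large rule set.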
- Manage complex decision situations by taking into account all objective and subjective factors.
- Mine interesting association rules from big data.
- Improve the response time of iterative algorithms.
- Improve road safety.