1 Introduction
2 Identification of categories, choices, and their relations
3 Previous work on category and choice identification
4 Experimental settings
4.1 The present studies
MOS), which is being used by an international company providing catering service for many different airlines. The main function of MOS is to help the catering company determine the types (such as normal, child, and vegetarian) and numbers of meals to be prepared and loaded onto each flight served by the company.
4.2 Our previous studies
5 Terminology and definitions
- (One of the overlapping choices) Given a category \(Q\), two distinct choices \(Q_x\) and \(Q_y\) are said to be overlapping if there exists a common element in both \(Q_x\) and \(Q_y\) (that is, the two sets of possible values are not disjoint).
- (One of the combinable choices) Given a category \(Q\), two distinct choices \(Q_x\) and \(Q_y\) are said to be combinable if, for any complete test frames \(B^{c}_1\) and \(B^{c}_2\) containing \(Q_x\) and \(Q_y\), respectively, such that \(B^{c}_1 \setminus \{Q_x\} = B^{c}_2 \setminus \{Q_y\}\), they are associated with the same function rule in the specification. (The mapping between a given set of system inputs and the corresponding set of system outputs is expressed by means of a function rule. This rule states precisely the preconditions for the function to execute and how the outputs are related to the inputs (Chen et al. 2004).) In this case, we should replace the individual \(Q_x\) and \(Q_y\) by a combined \(Q_z = Q_x \cup Q_y\) so as to reduce the number of complete test frames and, hence, save testing effort.
- (A composite choice) Given a category \(Q\), any choice \(Q_z\) is said to be composite if there exist valid, nonoverlapping, and noncombinable choices \(Q_x\) and \(Q_y\) such that \(Q_x \cup Q_y \subseteq Q_z\). (Thus, we should replace \(Q_z\) by \(Q_x\) and \(Q_y\) in order to increase the comprehensiveness of the resulting set of complete test frames.)
- It is an invalid choice.
- It is one of the overlapping choices.
- It is one of the combinable choices.
- It is a composite choice.
- It is an irrelevant category.
- It is a category with missing choices.
- It is a category with problematic choices.
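The overlapping and composite definitions above are stated in terms of sets of possible values, so the set-theoretic part can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category `meal_count`, its choices, and both function names are hypothetical, and the second check covers only the containment condition \(Q_x \cup Q_y \subseteq Q_z\) for disjoint \(Q_x\) and \(Q_y\). Validity and noncombinability depend on the function rules in the specification and are not modelled here.

```python
from itertools import combinations


def overlapping_pairs(choices):
    """Return the name pairs of choices whose value sets share an element."""
    return [
        (name_x, name_y)
        for (name_x, q_x), (name_y, q_y) in combinations(choices.items(), 2)
        if q_x & q_y  # non-empty intersection means the choices overlap
    ]


def covers_two_disjoint_choices(q_z, other_choices):
    """True if two disjoint choices exist whose union is contained in q_z.

    This is only the set-containment part of the composite-choice
    definition; validity and noncombinability are NOT checked.
    """
    return any(
        not (q_x & q_y) and (q_x | q_y) <= q_z
        for q_x, q_y in combinations(other_choices, 2)
    )


# Hypothetical category with choices modelled as finite sets of integers.
meal_count = {
    "low": set(range(0, 51)),
    "medium": set(range(50, 101)),  # shares the value 50 with "low"
    "high": set(range(101, 201)),
}

pairs = overlapping_pairs(meal_count)
print(pairs)

any_count = set(range(0, 201))  # hypothetical coarse choice
finer = [set(range(0, 50)), set(range(101, 201))]
print(covers_two_disjoint_choices(any_count, finer))
```

Modelling choices as finite sets keeps the checks simple; choices over unbounded or continuous domains would need an interval representation instead.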
6 Study 1: Effect of tester experience
6.1 Objective and steps
MOS contains numerous modules and is fairly complex in logic, we first decompose \({\mathbb{S}}_{\rm{\tt MOS}}\) into several functional units. For instance, there is a functional unit \({\mathbb{U}}_{\rm{\tt MEAL}}\) directly related to the generation of daily meal schedules and other units related to the maintenance of the airline codes and city codes. Such decomposition does not apply to \({\mathbb{S}}_{\rm{\tt TRADE}}\) and \({\mathbb{S}}_{\rm{\tt PURCHASE}}\) because their corresponding systems are less complex and, hence, can be tested in their entirety. Thus, we treat \({\mathbb{S}}_{\rm{\tt TRADE}}\) and \({\mathbb{S}}_{\rm{\tt PURCHASE}}\) as functional units denoted by \({\mathbb{U}}_{\rm{\tt TRADE}}\) and \({\mathbb{U}}_{\rm{\tt PURCHASE}}\), respectively. After the subjects have learned CHOC’LATE and CTM, we ask each of them to do the first round of identification exercises according to the following scheme: MOS.
6.2 Findings and discussion
6.2.1 Potential categories and choices
All entries are given as inexperienced testers/experienced testers.

| Functional unit | Numbers of PCs\(^a\) | Potential categories (choices): totals | Potential categories (choices): means\(^b\) | Potential categories (choices): ranges | Potential categories (choices): standard deviations |
|---|---|---|---|---|---|
| \({\mathbb{U}}_{\rm{\tt TRADE}}\) | 48/8 | 265 (579)/54 (124) | 5.5 (12.1)/6.8 (15.5) | 5 (10)/2 (8) | 0.9 (1.5)/0.7 (2.5) |
| \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 48/8 | 475 (1 138)/101 (278) | 9.9 (23.7)/12.6 (34.8) | 8 (20)/4 (11) | 2.0 (4.4)/1.3 (3.1) |
| \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 44/8 | 615 (1 488)/134 (299) | 14.0 (33.8)/16.8 (37.4) | 36 (73)/3 (10) | 7.8 (16.7)/1.5 (3.7) |
| Averages | | | 9.8 (23.2)/12.0 (29.2) | 16.3 (34.3)/3.0 (9.7) | 3.6 (7.5)/1.2 (3.1) |
- When the complexity of the functional units increases, the numbers of potential categories and choices identified by both groups of subjects also increase.
- There are large variations in the numbers of potential categories and choices identified by the subjects, and the variations are generally much larger for inexperienced testers than for experienced testers.
- Compared with inexperienced testers, experienced testers are able to identify more potential categories and choices.
6.2.2 Missing categories
All entries are given as inexperienced testers/experienced testers.

| Functional unit | Total numbers of missing categories | Mean numbers of missing categories in each PC\(^a\) | % of mean numbers of missing categories in each PC\(^a\) in relation to mean numbers of potential categories in each PC\(^a\) |
|---|---|---|---|
| \({\mathbb{U}}_{\rm{\tt TRADE}}\) | 1/5 | 0.02/0.63 | 0.38%/9.26% |
| \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 33/5 | 0.69/0.63 | 6.95%/4.95% |
| \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 158/11 | 3.59/1.38 | 25.69%/8.21% |
| Averages | | 1.43/0.88 | 11.01%/7.47% |
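The per-PC means and the missing-category percentages reported for inexperienced testers can be reproduced from the tabulated totals. The short sketch below shows the arithmetic; the dictionary layout and the function name `per_pc_stats` are assumptions made for illustration, while the numbers are copied from the tables in Sects. 6.2.1 and 6.2.2.

```python
# Totals for inexperienced testers, copied from the tables above.
totals = {
    "TRADE": {"pcs": 48, "potential": 265, "missing": 1},
    "PURCHASE": {"pcs": 48, "potential": 475, "missing": 33},
    "MEAL": {"pcs": 44, "potential": 615, "missing": 158},
}


def per_pc_stats(row):
    """Return (mean potential, mean missing, missing as % of potential)."""
    mean_potential = row["potential"] / row["pcs"]
    mean_missing = row["missing"] / row["pcs"]
    pct_missing = 100 * mean_missing / mean_potential
    return round(mean_potential, 1), round(mean_missing, 2), round(pct_missing, 2)


stats = {unit: per_pc_stats(row) for unit, row in totals.items()}
for unit, values in stats.items():
    print(unit, values)
```

Because both means share the same number of PCs, the percentage reduces to the ratio of the two totals (for example, 158/615 for \({\mathbb{U}}_{\rm{\tt MEAL}}\)).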
6.2.3 Problematic and non-problematic categories
All entries are given as inexperienced testers/experienced testers.

| Functional unit | Problematic categories: total numbers | Problematic categories: mean numbers in each PC\(^a\) | Problematic categories: mean % among all potential categories | Non-problematic categories: total numbers | Non-problematic categories: mean numbers in each PC\(^a\) | Non-problematic categories: mean % among all potential categories |
|---|---|---|---|---|---|---|
| \({\mathbb{U}}_{\rm{\tt TRADE}}\) | 43/5 | 0.90/0.63 | 16.23%/9.26% | 222/49 | 4.63/6.13 | 83.77%/90.74% |
| \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 79/12 | 1.65/1.50 | 16.63%/11.88% | 396/89 | 8.25/11.13 | 83.37%/88.12% |
| \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 158/28 | 3.59/3.50 | 25.69%/20.90% | 457/106 | 10.39/13.25 | 74.31%/79.10% |
| Averages | | 2.04/1.88 | 19.52%/14.01% | | 7.75/10.17 | 80.48%/85.99% |
All entries are given as inexperienced testers/experienced testers.

| Functional unit | % increase in mean numbers of potential categories in each PC\(^a\) | % increase in mean numbers of missing categories in each PC\(^a\) | % increase in mean numbers of problematic categories in each PC\(^a\) | % increase in mean percentages of problematic categories among all potential categories | % increase in mean numbers of non-problematic categories in each PC\(^a\) | % decrease in mean percentages of non-problematic categories among all potential categories |
|---|---|---|---|---|---|---|
| From \({\mathbb{U}}_{\rm{\tt TRADE}}\) to \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 80%/85% | 3 350%/0% | 83%/138% | 2%/28% | 78%/82% | 0.5%/2.9% |
| From \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) to \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 41%/33% | 420%/119% | 118%/133% | 54%/76% | 26%/19% | 10.9%/10.2% |
- Complexity of the functional units. Consider observations 1, 5, 7, 10, 13, and 15. When the complexity of a functional unit increases, there are more aspects to be tested. On the one hand, more testing aspects normally lead to more categories and choices to be identified, as shown by the number of potential categories and choices (observation 1) and the number of non-problematic categories (observation 10). On the other hand, more testing aspects also increase the chances of mistakes, as shown by the number of missing categories (observations 5 and 13) and the number of problematic categories (observations 7 and 15).
- Experience of the subjects. Consider observations 6, 8, 9, 11, 12, and 14. When compared with inexperienced testers, experienced testers have fewer missing categories (observations 6 and 12) and fewer problematic categories (observation 8), but more non-problematic categories (observation 11). Hence, experience in software development and testing does help in the ad hoc identification exercises. It should be noted, however, that the contribution of experience to the performance of an ad hoc identification approach decreases with the complexity of the functional unit (observations 9 and 14). Thus, not only are systematic and more effective identification techniques generally needed, but the need for them grows with the complexity of the specifications.
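The percentage increases tabulated above can be recomputed from the per-PC means in the earlier tables. The sketch below does this for the potential and missing categories; the helper name `pct_increase` is an assumption, and the tabulated figures appear consistent with working from the rounded means rather than from the raw totals.

```python
def pct_increase(old, new):
    """Percentage increase from old to new, rounded to the nearest integer."""
    return round(100 * (new - old) / old)


# Mean numbers in each PC as (inexperienced, experienced), from the tables above.
mean_potential = {"TRADE": (5.5, 6.8), "PURCHASE": (9.9, 12.6), "MEAL": (14.0, 16.8)}
mean_missing = {"TRADE": (0.02, 0.63), "PURCHASE": (0.69, 0.63), "MEAL": (3.59, 1.38)}


def increases(means, source, target):
    """Return the (inexperienced, experienced) % increases between two units."""
    return tuple(
        pct_increase(old, new) for old, new in zip(means[source], means[target])
    )


for source, target in [("TRADE", "PURCHASE"), ("PURCHASE", "MEAL")]:
    print(source, "->", target,
          increases(mean_potential, source, target),
          increases(mean_missing, source, target))
```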
6.2.4 Types of problematic categories
Total numbers of different types of problematic categories (mean numbers of different types of problematic categories in each PC\(^a\)). All entries are given as inexperienced testers/experienced testers.

| Functional unit | Irrelevant categories | With missing choices | With invalid choices | With overlapping choices | With combinable choices | With composite choices |
|---|---|---|---|---|---|---|
| \({\mathbb{U}}_{\rm{\tt TRADE}}\) | 0/0 (0.00/0.00) | 3/0 (0.06/0.00) | 0/0 (0.00/0.00) | 6/0 (0.13/0.00) | 0/0 (0.00/0.00) | 34/5 (0.71/0.63) |
| \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 0/1 (0.00/0.13) | 9/1 (0.19/0.13) | 2/1 (0.04/0.13) | 26/2 (0.54/0.25) | 0/1 (0.00/0.13) | 42/8 (0.88/1.00) |
| \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 123/14 (2.80/1.75) | 12/2 (0.27/0.25) | 14/4 (0.32/0.50) | 4/1 (0.09/0.13) | 5/2 (0.11/0.25) | 4/7 (0.09/0.88) |
| Averages | (0.93/0.63) | (0.17/0.13) | (0.12/0.21) | (0.25/0.13) | (0.04/0.13) | (0.56/0.83) |
% of different types of problematic categories among all potential (problematic) categories. All entries are given as inexperienced testers/experienced testers.

| Functional unit | Irrelevant categories | With missing choices | With invalid choices | With overlapping choices | With combinable choices | With composite choices |
|---|---|---|---|---|---|---|
| \({\mathbb{U}}_{\rm{\tt TRADE}}\) | 0.0%/0.0% (0.0%/0.0%) | 1.1%/0.0% (7.0%/0.0%) | 0.0%/0.0% (0.0%/0.0%) | 2.3%/0.0% (14.0%/0.0%) | 0.0%/0.0% (0.0%/0.0%) | 12.8%/9.3% (79.1%/100.0%) |
| \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 0.0%/1.0% (0.0%/9.1%) | 1.9%/1.0% (11.4%/9.1%) | 0.4%/1.0% (2.5%/9.1%) | 5.5%/2.0% (32.9%/18.2%) | 0.0%/1.0% (0.0%/9.1%) | 8.8%/7.9% (53.2%/66.7%) |
| \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 20.0%/10.4% (77.8%/50.0%) | 2.0%/1.5% (7.6%/7.1%) | 2.3%/3.0% (8.9%/14.3%) | 0.7%/0.7% (2.5%/3.8%) | 0.8%/1.5% (3.2%/7.1%) | 0.7%/5.2% (2.5%/25.0%) |
| Averages | 6.7%/3.8% (25.9%/19.7%) | 1.7%/0.8% (8.7%/5.4%) | 0.9%/1.3% (3.8%/7.8%) | 2.8%/0.9% (16.5%/7.3%) | 0.3%/0.8% (1.1%/5.4%) | 7.4%/7.5% (44.9%/63.9%) |
- Experienced testers are not necessarily better than inexperienced ones in every aspect. Experienced testers have identified fewer irrelevant categories, categories with missing choices, and categories with overlapping choices than inexperienced testers, but more categories with invalid choices, categories with combinable choices, and categories with composite choices.
- Among the different types of problematic categories, categories with composite choices occur the most, while categories with combinable choices occur the least.
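Each cell of the percentage table above carries two figures: the share of a problem type among all potential categories, and (in parentheses) its share among the problematic categories only. As a minimal sketch, the following reproduces the \({\mathbb{U}}_{\rm{\tt TRADE}}\) entries for inexperienced testers from the totals reported earlier (265 potential and 43 problematic categories); the variable and function names are assumptions for illustration.

```python
# Inexperienced testers, functional unit TRADE (totals from the tables above).
potential_total = 265
problematic_total = 43
type_counts = {
    "missing choices": 3,
    "overlapping choices": 6,
    "composite choices": 34,
}


def shares(count):
    """Return (% among potential categories, % among problematic categories)."""
    return (
        round(100 * count / potential_total, 1),
        round(100 * count / problematic_total, 1),
    )


type_shares = {name: shares(count) for name, count in type_counts.items()}
for name, (among_potential, among_problematic) in type_shares.items():
    print(f"{name}: {among_potential}% ({among_problematic}%)")
```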
7 Study 2: Effectiveness of checklist guideline
7.1 Objective and steps
7.2 Findings and discussions
All entries are given as without checklist/with checklist.

| Functional unit | Total numbers of missing categories |
|---|---|
| \({\mathbb{U}}_{\rm{\tt TRADE}}\) | 5/1 |
| \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 5/2 |
| \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 11/4 |
| Totals | 21/7 |
Total numbers of different types of problematic categories. All entries are given as without checklist/with checklist.

| Functional unit | Irrelevant categories | With missing choices | With invalid choices | With overlapping choices | With combinable choices | With composite choices |
|---|---|---|---|---|---|---|
| \({\mathbb{U}}_{\rm{\tt TRADE}}\) | 0/1 | 0/0 | 0/0 | 0/0 | 0/0 | 5/2 |
| \({\mathbb{U}}_{\rm{\tt PURCHASE}}\) | 1/1 | 1/0 | 1/1 | 2/0 | 1/0 | 8/3 |
| \({\mathbb{U}}_{\rm{\tt MEAL}}\) | 14/9 | 2/0 | 4/1 | 1/0 | 2/0 | 7/3 |
| Totals | 15/11 | 3/0 | 5/2 | 3/0 | 3/0 | 20/8 |
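One way to read the totals in the two tables above is as relative reductions from the "without checklist" figure to the "with checklist" figure. The sketch below computes these reductions; they are derived here purely as illustrative arithmetic on the tabulated totals and are not figures reported in the study. The names `totals_by_type` and `pct_reduction` are assumptions.

```python
# (without checklist, with checklist) totals copied from the tables above.
totals_by_type = {
    "missing categories": (21, 7),
    "irrelevant categories": (15, 11),
    "with missing choices": (3, 0),
    "with invalid choices": (5, 2),
    "with overlapping choices": (3, 0),
    "with combinable choices": (3, 0),
    "with composite choices": (20, 8),
}


def pct_reduction(without, with_checklist):
    """Relative reduction in percent, rounded to one decimal place."""
    return round(100 * (without - with_checklist) / without, 1)


reductions = {
    name: pct_reduction(without, with_checklist)
    for name, (without, with_checklist) in totals_by_type.items()
}
for name, value in reductions.items():
    print(f"{name}: {value}% fewer with the checklist")
```

With only eight experienced testers in each condition, these ratios rest on small counts and should be read with that in mind.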