1 Introduction
1.1 Contributions
- This is the first empirical study that compares a design language that expresses predicates graphically against the UML and its OCL satellite notation.
- This is the first empirical demonstration suggesting benefits of a diagrammatic approach to the modelling of operations using design-by-contract (Meyer 1992).
- This is the first empirical demonstration that suggests that a diagrammatic modelling approach benefits tasks associated with the usage of models. The paper shows that VCL performed significantly better than UML+OCL in tasks related to end-user model comprehension and defect detection.
1.2 Outline
2 Background
2.1 Diagrams and their Cognitive Effectiveness
2.2 A Primer on VCL
Bank imports sets from CommonTypes. Package CommonTypes provides definitions common across the model; Bank focuses on those concerns specific to banking; other packages in Amálio (2011) address security concerns and the modular weaving of security and banking.

Bank (Fig. 1b) has class sets for banking entities, namely, customers, accounts and transactions, and uses value sets from the SD of CommonTypes (Fig. 1c; e.g. CT::Name). Objects and values are depicted as rectangles. Sets for dates and times of Fig. 1c use VCL’s predicate language to define the novel derived sets. Set Month, for instance, is defined as the natural numbers (set Nat) from 1 to 12 in a graphical depiction of a set comprehension: Month = {n : Nat | n ≥ 1 ∧ n ≤ 12}. The arrows emanating from Nat (predicate edges) refer to the source and are combined through conjunction.

HasCurrentBefSavings, SavingsArePositive and CorporateHaveNoSavings, defined in ADs of Fig. 1d, e and f, embody relevant invariants: customers must hold a current account prior to opening a savings account (HasCurrentBefSavings), savings accounts must not have a negative balance (SavingsArePositive) and corporate customers must not hold savings accounts (CorporateHaveNoSavings).

Bank (Fig. 2a) includes the operations available to the environment, namely, create customer, open account, deposit, withdraw, delete account, view an account’s balance, get accounts in debt, and get accounts of a customer. OpenAccount does all that is involved in opening actual bank accounts and the New operation inside Account, a sort of sub-operation, creates new account objects only. Modifier operations are defined in contract diagrams (CDs); operations CreateCustomer, Customer.New, AccWithdraw and Account.Withdraw of Fig. 2a are defined in CDs of Fig. 2b, c, d and e, respectively. Observe operations are defined in ADs; ADs of Fig. 2h and g define the local Account.GetBalance and the global AccGetBalance of Fig. 2a; AD of Fig. 2f defines GetAccountGivenAccNo imported in Fig. 2d and g.

SavingsArePositive (Fig. 1e) expresses a local invariant of Account. It has no declarations; the predicate contains a pictorial propositional logic formula made up of individual atomic statements involving predicate edges that are combined with an implication to say that if the account’s type is savings then its balance must not be negative: aType = savings ⇒ balance ≥ 0.

CorporateHaveNoSavings (Fig. 1f) expresses a set formula. Relation Holds, defined in the SD of Fig. 1b, which denotes a set of pairs, is restricted to those pairs with corporate customers and savings accounts (a sort of filtering); the restricted relation is then required to be empty; hence, corporate customers must not hold savings accounts. In Fig. 1f the restrictions are performed using domain restriction (symbol \(\lhd \)) and range restriction (\(\rhd \)) using edge modifiers (represented as double arrows and denoting functions) and the outer shading indicates that the restricted set or relation must be empty. The resulting formula is: \(\{o : Customer | o.cType = corporate\}\lhd Holds \rhd \{o : Account | o.aType = savings\} = \emptyset \).

HasCurrentBefSavings (Fig. 1d) says that the set of customers with savings accounts is a subset of the customers with current accounts; hence, a customer must hold a current account prior to holding a savings account. This involves two internal set variables (included in the declarations compartment) to represent customers with current accounts (custsCurr) and customers with savings accounts (custsSav), which are defined in the predicate in a similar way: relation Holds is range-restricted using edge modifiers (symbol \(\rhd \)) to accounts that are either current or savings, and the actual sets are obtained from these restricted relations using the domain relational operator (symbol \(\leftarrow \)); finally, the bottom-most formula says, using enclosure (or insideness), that custsSav must be a subset of custsCurr. This results in the following formulas:
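Reconstructed from the description of HasCurrentBefSavings above, as a sketch only: the value current for aType and the name dom for the domain operator (drawn as \(\leftarrow \) in the diagram) are assumptions, chosen by analogy with the formula given for CorporateHaveNoSavings.

\[ custsCurr = \mathrm{dom}\,(Holds \rhd \{o : Account \mid o.aType = current\}) \]
\[ custsSav = \mathrm{dom}\,(Holds \rhd \{o : Account \mid o.aType = savings\}) \]
\[ custsSav \subseteq custsCurr \]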
GetAccountGivenAccNo (Fig. 2f) fetches the Account object corresponding to aNo? into output a!: a! ∈ {o : Account | o.accNo = aNo?}. AD Account.GetBalance (Fig. 2h), which stores the account’s balance in the output bal!, is imported in global AccGetBalance (Fig. 2g), which fetches the account object corresponding to the aNo? input (via imported GetAccountGivenAccNo) and calls Account.GetBalance on this object.

Customer.New (Fig. 2c), a constructor, says how a new Customer object (c!) should be initialised in the post-condition; using predicate edges, the after-state (represented in bold) of custNo is set non-deterministically, and those of name, addr and cType are set to the corresponding inputs. CD Account.Withdraw (Fig. 2e) declares an amount natural number input and says in the post-condition that the new account’s balance (bold line around rectangle) is the old balance minus the requested amount. These two local operations are brought into the global context (of a system or package) through CreateCustomer (CD in Fig. 2b) and AccWithdraw (CD in Fig. 2d). CD of CreateCustomer declares inputs required from the environment and imports Customer.New; CD of AccWithdraw declares the aNo? input for the account from which money is to be withdrawn, and imports assertion GetAccountGivenAccNo (Fig. 2f).
2.3 VCL and its Visual Effectiveness
- VCL tries to be well-matched to meaning, following CDN’s closeness of mapping and PoN’s semantic transparency, by conveying the underlying mathematics. For example: VCL’s round contour set construct (e.g. Customer, Account, CustId and CustType in Fig. 1b, and Holds, Account and Customer in Fig. 1f) taps into similar shapes of mathematics (Venn or Euler circles); the shading of Venn diagrams is used in Fig. 1f to indicate that the set must be empty; the single and double lines of assertions and contracts (see Fig. 2a), respectively, refer to the fact that assertions involve a single set of states, whereas contracts involve the state-sets of pre- and post-conditions.
- VCL’s graphical primitives follow PoN’s principle of semiotic clarity. In the different diagrams of Figs. 1 and 2, sets are consistently rendered as rounded shapes, and values and objects (members of a set) as rectangles; constraints upon single states are hexagons (independently of whether they denote invariants or observe operations). Furthermore, VCL’s primitives have a core meaning that varies slightly with the context, enabling users to infer the meaning of graphical expressions in different contexts, following CDN’s consistency and PoN’s graphical economy. The round contours of Fig. 1b do not mean exactly the same as the several round shapes of ADs and CDs of Figs. 1 and 2, but they all have set-like meanings.
- PoN’s principles of perceptual discriminability and visual expressiveness can be observed in the panoply of shapes and colours used across the different VCL diagram elements illustrated in Fig. 1 — packages are green clouds, sets are rounded blue contours, assertions are red hexagons and contracts are brown hexagons, objects are yellow rectangles, shading distinguishes empty from non-empty sets, and size and brightness differentiate types of sets and edges.
- PoN’s cognitive integration (integration of different pieces of information) and CDN’s role-expressiveness (how pieces contribute to the whole) are intrinsic to VCL. In SDs, as illustrated in Fig. 1b, the assertions of Bank’s SD, defined separately in ADs, explicitly say that the AD-defined invariants constrain the SD’s defined state space. BDs, on the other hand, identify all the different pieces of behaviour defined separately in ADs and CDs, as illustrated in Fig. 2a. The importing mechanism in ADs and CDs (illustrated in Fig. 2b, d and g) integrates pieces defined elsewhere to make compound definitions. Furthermore, the ‘+’ attached to packages (clouds), assertions and contracts (elongated hexagons) of SDs, BDs and PDs provides a navigation clue that a separate diagram is opened upon double-clicking. CDN’s role-expressiveness is also manifested in SDs when line size and brightness are used to distinguish class from value sets (class contours are thicker) to give relevance to classes, which are the major abstractions of a domain and act as beacons.
- PoN’s dual coding (text complements graphics to strengthen communication) and CDN’s secondary notation are applied in SDs of Fig. 1b and c to distinguish the different kinds of sets and to reinforce the multiplicity constraints of relations; sets include a word to indicate set-kind, class sets are bold-lined and remaining sets have lines with normal thickness; relation multiplicity is conveyed both visually and textually. Figure 1f reinforces the empty set meaning through both shading and symbol ∅.
- PoN’s complexity management (representing information without overloading the human mind) and CDN’s abstraction gradient are intrinsic to VCL. Statics and dynamics are clearly separated through VCL structural and behaviour diagrams; UML represents data and operations in class diagrams that tend to become severely cluttered even for medium to small models. VCL’s package construct (represented as clouds, Fig. 1a) defines large modules to keep the contents of each package manageable; reference sets (symbol ↑) enable references to sets from other packages. VCL operations are constructed modularly; operations and assertions may be composed of other modules (operations or assertions); for example, in Fig. 2d, operation AccWithdraw is made up of observe operation GetAccountGivenAccNo and local contract Account.Withdraw.
- VCL addresses CDN’s hard mental operations dimension as part of its raison d’être, as it tries to improve the usability of formal software design, with its inherently hard underlying mathematics, through visualisation. However, this per se does not totally solve the problem, and two principles discussed above, CDN’s abstraction gradient and PoN’s complexity management, are key in giving VCL an abstract and modular ethos, which helps in dealing with hard mental operations. The notion of separation, inherent to modularity, is manifested in the different compartments of ADs and CDs (Figs. 1 and 2), which ease the hardness of the task through order and focus, as each compartment expresses something specific that is relevant to the whole (also an application of the cognitive integration principle). Furthermore, modellers are encouraged to come up with abstractions and express designs that are modular, as VCL provides the means to break down potentially overwhelmingly complex problems into manageable and meaningful chunks. BDs, for instance, encourage abstraction and modularity by letting the modeller focus on the different pieces that make up an overall behaviour. Despite this, there are many ways in which VCL could be improved to ease hard mental operations.
2.4 VCL’s Trajectory, Founding Ideas and Suitability as Visual Notation
3 The Experiment’s Scope
3.1 Objective
Evaluate the effectiveness of VCL on user performance using a set of tasks associated with constructing and using design models by comparing VCL against UML and its satellite textual language OCL.
3.2 Research Questions
- RQ1: Is the performance of modellers in building software designs better with VCL than UML+OCL?
- RQ2: Is the comprehension of the problem accrued from modelling better with VCL than UML+OCL?
- RQ3: Is end-users’ performance in tasks related to usage of software designs, namely defect detection and model comprehension, better with VCL than UML+OCL?
- RQ4: Is VCL perceived as being more useful and easier to use than UML+OCL?
- RQ5: Is VCL’s usability better than UML+OCL’s?
- RQ6: What is the overall perception of VCL in comparison to UML+OCL?
3.3 Dependent Variables
Category/RQ | Definition | Description |
---|---|---|
Completeness (Co), RQ1 | \(\frac {score}{maxScore}\) | Aggregate proportion obtained from measuring satisfaction of each expected model piece on an ordinal scale from 4 (fully satisfied) to 0 (unsatisfied). Variables: CoS, CoI, CoO |
Accuracy (Ac), RQ1 | \(\frac {score}{maxScore}\) | Aggregate proportion obtained from evaluating test cases on an ordinal scale from 4 (fully satisfied) to 0 (unsatisfied). Variables: AcS, AcI, AcO |
Perceived modelling (PM), RQ1 | {VCL,UML,NP} | Notation perceived as providing better modelling performance. Variables: PMS, PMI, PMO |
Problem comprehension (PC), RQ2 | \(\frac {correctAnswers}{noQuestions}\) | Proportion of correct answers in PC questionnaire. Variables: PC |
Perceived PC (PPC), RQ2 | {VCL,UML,NP} | Notation perceived as providing better PC performance. Variables: PPC |
Model comprehension (MC), RQ3 | \(\frac {correctAnswers}{noQuestions}\) | Proportion of correct answers in MC questionnaire. Variables: MC |
Defect detection (DD), RQ3 | \(\frac {foundDefects}{totalDefects}\) | Proportion of identified defects in model with seeded defects (DD). Variables: DD |
Perceived usage, RQ3 | {VCL,UML,NP} | Notation perceived as providing better DD or MC performance. Variables: PDD, PMC |
Usefulness (U) and ease of use (EoU), RQ4 | \(\frac {score}{maxScore}\) | Calculated from responses on a Likert Scale from 1 (strongly agree) to 5 (strongly disagree). Variables: U, EoU |
Usability (Us), RQ5 | {VCL,UML,NP} | Notation perceived as better for different usability criteria. Variables: UsR, UsN, UsMOs, UsLEC, UsLF, UsC, UsL, UsCS |
Appraisal (Appr), RQ6 | \(\{Positive, Negative, Neutral\} \rightarrow Integer\) | Subject’s appraisal of VCL in comments to open questions to obtain number of positive, negative and neutral comments. Variables: Appr |
Preferred notation (PN), RQ6 | {VCL,UML,NP} | Preferred notation at state space (S), invariants (I), operations (O), overall and to use in the future. Variables: PNS, PNI, PNO, PN, FN |
3.3.1 RQ1, Modelling
- Completeness (Co) measures how much is modelled. This is based on a breakdown of requirements and corresponding modelling pieces, whose satisfaction is marked manually on an ordinal scale from 4 (fully satisfied) to 0 (unsatisfied).
- Accuracy (Ac) measures the quality of what is modelled based on a partitioning of requirements into aspects of interest exercised by test cases, which are evaluated manually on an ordinal scale from 4 (fully satisfied) to 0 (unsatisfied).
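Both measures reduce to the proportion score/maxScore over marks on the 0 to 4 scale. A minimal sketch of that aggregation follows; the function name and the example marks are hypothetical and do not come from the experiment's marking sheets.

```python
from typing import Sequence

MAX_MARK = 4  # top of the ordinal scale: 4 = fully satisfied, 0 = unsatisfied


def aggregate_proportion(marks: Sequence[int]) -> float:
    """Return score / maxScore for a list of per-piece marks in 0..4."""
    if not marks:
        raise ValueError("at least one mark is required")
    if any(m < 0 or m > MAX_MARK for m in marks):
        raise ValueError("marks must lie between 0 and 4")
    # maxScore is the score a subject would get with every piece fully satisfied.
    return sum(marks) / (MAX_MARK * len(marks))


# Hypothetical marks for five expected model pieces of one subject.
example_marks = [4, 3, 0, 2, 1]
completeness = aggregate_proportion(example_marks)  # 10 / 20
```

The same function would serve for accuracy, with one mark per test case instead of one mark per expected model piece.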
3.3.2 RQ2, Problem Comprehension
3.3.3 RQ3, Model Usage
- Defect detection (DD), related to model inspection, consists of identifying defects in a model with seeded errors.
- Model Comprehension (MC), or how well end-users understand given models, is assessed through multiple choice questions about a given design.
3.3.4 RQ4, Usefulness and Ease of Use
3.3.5 RQ5, Usability
3.3.6 RQ6, Overall Perception
3.4 Hypotheses
3.5 Recruitment
4 Experiment Design
4.1 Case Studies
- University Library (UL). A university library system enables members to borrow and return books, renew borrowings and recall books unavailable for loan.
- Flight Booking (FB). A system to manage flight bookings of different airlines.
4.2 Experimental Tasks and Time Allocation
Task 1 | Task 2 | Task 3 | Task 4 | Task 5 |
---|---|---|---|---|
Modelling of state space and invariants, 30 minutes | Modelling of operations, 35 minutes | Problem comprehension, 15 minutes | Defect Detection, 25 minutes | Model Comprehension, 15 minutes |
4.3 Design Type and Scheduling
Group | Day 1 | Day 2 | Day 3 | Day 4 |
---|---|---|---|---|
Group A | VCL, UL | UML+OCL, UL | UML+OCL, FB | VCL, FB |
Group B | UML+OCL, UL | VCL, UL | VCL, FB | UML+OCL, FB |
4.4 Modelling Tools
4.5 Training
Institution | Session 1 | Session 2 | Session 3 |
---|---|---|---|
FCUL | VCL (2 hours) | UML+OCL (2 hours) | – |
FCT/UNL | VCL (2 hours) | UML+OCL (2 hours) | 1 hour VCL followed by 1 hour UML+OCL (optional) |
U. Luxembourg | VCL (2 hours) | UML+OCL (2 hours) | 1 hour VCL followed by 1 hour UML+OCL (optional) |
U. York | UML+OCL (3 hours) | VCL (3 hours) | – |
5 Instrumentation
5.1 Case Study Narratives and Sample Models
Author is missing altogether. Table 5 gives the frequency distribution of seeded defects across the different modelling aspects being analysed (state space, invariants and operations). To avoid bias, a χ2 goodness-of-fit test checked that seeded defects are evenly distributed across notations and case studies; no significant differences were found (see the p-values in Table 5).
Modelling aspect | University Library, VCL | University Library, UML+OCL | Flight Booking, VCL | Flight Booking, UML+OCL |
---|---|---|---|---|
State space | 8 | 7 | 13 | 13 |
Invariants | 7 | 5 | 7 | 6 |
Operations | 21 | 24 | 19 | 20 |
Total | 36 | 36 | 39 | 39 |
Goodness of fit p-value | .51 | | .9 | |
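The goodness-of-fit check can be sketched from the counts in Table 5. The sketch assumes that, within each case study, the VCL counts are treated as observed and the UML+OCL counts as expected; the text does not state which way round the test was set up, and the function name is hypothetical. With three categories the test has 2 degrees of freedom, for which the chi-square tail probability has the closed form exp(−x/2), so no statistics library is needed.

```python
import math

# Seeded-defect counts from Table 5: (state space, invariants, operations).
DEFECTS = {
    "University Library": {"VCL": (8, 7, 21), "UML+OCL": (7, 5, 24)},
    "Flight Booking": {"VCL": (13, 7, 19), "UML+OCL": (13, 6, 20)},
}


def chi2_gof_p_df2(observed, expected):
    """Chi-square goodness-of-fit p-value for exactly three categories.

    With three categories there are 2 degrees of freedom, and the
    chi-square survival function has the closed form exp(-x / 2).
    """
    if len(observed) != 3 or len(expected) != 3:
        raise ValueError("this closed form only holds for three categories")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.exp(-statistic / 2)


# Assumption: VCL counts are the observed values, UML+OCL counts the expected.
p_values = {
    case: chi2_gof_p_df2(counts["VCL"], counts["UML+OCL"])
    for case, counts in DEFECTS.items()
}
```

Under this assumption the p-values come out at roughly .52 for University Library and .90 for Flight Booking, in line with the values reported in Table 5.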
5.2 Comprehension Questionnaires
Copy is recalled, its status changes from onloan to recalled.
5.3 Ability Questionnaire
5.4 Debriefing Survey
6 Statistical Analysis
6.1 Means, Proportions and their CIs
6.2 Hypothesis Testing
6.3 Effect Sizes (ESs)
6.4 Statistical Graphs
- Histograms (example in Fig. 9c), typically used to depict frequency distributions, use the area of the bars to represent proportions.
- Plots of point estimates and confidence intervals (example in Fig. 9e) depict point estimates (e.g. means) as dots and 95% confidence intervals (CIs) as error bars to convey the uncertainty of the sampled point estimates. They display the different samples in the abscissa and the units of the analysed data in the ordinate (Fig. 9e uses subject’s proficiency scores).
- Forest plots (example in Fig. 12a) display point estimates and their corresponding 95% CIs. Point estimates are represented as circles and CIs as error bars; the abscissa conveys the scale of the pictured result.
6.5 Symbols
- For statistical significance, given a p-value p we have: ns = not significant (p ≥ .05); * = p < .05; ** = p < .01, *** = p < .001.
- For ESs, we use symbols to denote the attained magnitude level. Given a value es we have: ø = null (|es|≤ .05); ∙ = small (.05 < |es|≤ .2); ∙+= small to medium (.2 < |es| < .4); ∙∙= medium (.4 ≤|es|≤ .6); ∙∙+= medium to large (.6 < |es| < .8); ∙∙∙= large (|es|≥ .8)
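The two symbol schemes above can be written down as lookup functions. This is a small sketch with hypothetical function names; the thresholds are copied from the list above.

```python
def significance_symbol(p: float) -> str:
    """Map a p-value (or q-value) to the significance symbol used in the paper."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


def effect_size_symbol(es: float) -> str:
    """Map an effect size to its magnitude symbol; only |es| matters."""
    size = abs(es)
    if size <= 0.05:
        return "ø"      # null
    if size <= 0.2:
        return "∙"      # small
    if size < 0.4:
        return "∙+"     # small to medium
    if size <= 0.6:
        return "∙∙"     # medium
    if size < 0.8:
        return "∙∙+"    # medium to large
    return "∙∙∙"        # large
```

For example, an effect size of −.13 maps to ∙ (small) and a p-value of .0025 maps to **.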
7 Participants
8 Results
8.1 Results: a Bird’s Eye View
8.2 Modelling
8.2.1 Completeness
- In state space (Fig. 10a), VCL (M = .64, 95% CI [.6,.69], sd = .22) is very close to UML+OCL (M = .67, CI [.63,.71], sd=.19) — the mean of differences (MD) is − .03 (CI [−.07,.02]). Cohen’s d ES of − .13 (CI [−.35,.08], ∙) indicates a small effect.
- In invariants (Fig. 10c), VCL (M = .1, CI[.07,.13], sd = .14) is close to UML+OCL (M=.13, CI[.09,.18], sd = .22) — MD = −.03 (CI[−.08,.01]), small effect (d = −.18, CI[−.38,.05], ∙).
- In operations (Fig. 10e), VCL’s advantage (M = .16, CI [.13,.19], sd = .14) over UML+OCL (M = .11, CI [.09,.13], sd = .1) is significant: MD = .05[.03,.08], W p = \(3 \times 10^{-6}\) (***), BY q = \(7 \times 10^{-5}\) (***), d = .45[.26,.71] (∙∙).
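The BY q-values reported alongside the p-values are taken here to be Benjamini-Yekutieli adjusted p-values. The sketch below shows the standard form of that adjustment, assuming it is the one meant; the function name and the example p-values are hypothetical, and the family of tests actually used for the correction is not given here, so the reported q-values cannot be reproduced from this sketch.

```python
from typing import List, Sequence


def benjamini_yekutieli(p_values: Sequence[float]) -> List[float]:
    """Return BY-adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    if m == 0:
        return []
    # Harmonic correction factor c(m) = 1 + 1/2 + ... + 1/m.
    c_m = sum(1.0 / k for k in range(1, m + 1))
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        adjusted = p_values[idx] * m * c_m / rank
        running_min = min(running_min, adjusted)
        q_values[idx] = running_min
    return q_values


# Hypothetical family of three tests (not the paper's family of tests).
example_q = benjamini_yekutieli([0.01, 0.04, 0.03])
```

Because the adjustment grows with the number of tests in the family, a p-value below .05 can yield a q-value above .05, which is why some results in the following sections carry a significant p but a non-significant q.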
8.2.2 Accuracy
- In state space (Fig. 10b), VCL (M = .38[.33,.43], sd = .23) is quasi-equal to UML (M = .38[.34,.43], sd = .22) — MD = 0[−.05,.05], d = −.01[−.22,.21] (ø). Likewise for invariants (Fig. 10d) — VCL (M = .2[.16,.24], sd = .18), UML (M = .2[.15,.24], sd = .21), MD = .01[−.04,.05], d = .03[−.19,.24] (ø).
- Operations (Fig. 10f) highlight a non-significant VCL advantage (M = .11 [.08,.14], sd = .13) to UML+OCL (M = .09[.06,.11], sd = .11) — MD = .02[0,.04], W p = .07 (ns), d = .17[−.03,.4] (∙).
8.2.3 Perceived Performance
- In state space (PMS), a higher but non-significant proportion of subjects perceived a better performance with VCL (22/43 = .51[.39,.67]) than UML+OCL (15/43 = .35[.15,.4]) and no preference (6): proportion difference (PD) = .16[−.11,.41], χ2 p = .011 (*), BY q = .088 (ns), h = .33[.03,.63] (∙+).
- In invariants (PMI), a higher but non-significant proportion of subjects perceived a better performance with VCL (23/43 = .53[.39,.67]) than UML + OCL (11/43 = .26[.15,.4]) and no preference (9) — PD = .28[.02,.5], χ2p = .018 (*), BY q = .13 (ns), h = .58[.28,.88] (∙∙).
- In operations (PMO), VCL (26/43 = .6[.46,.74]) significantly outperformed UML+OCL (12/43 = .28[.17,.43]) and no preference (5) — PD = .33[.05,.55], χ2p = .00034 (***), BY q = .0061 (**), h = .67[.37,.97] (∙∙+).
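The effect size h reported for these proportions is consistent with Cohen's h, the difference between arcsine-transformed proportions. A minimal sketch with a hypothetical function name, using the counts given above:

```python
import math


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# Operations (PMO): 26 of 43 subjects favoured VCL, 12 of 43 UML+OCL.
h_pmo = cohens_h(26 / 43, 12 / 43)
# State space (PMS): 22 of 43 favoured VCL, 15 of 43 UML+OCL.
h_pms = cohens_h(22 / 43, 15 / 43)
```

These evaluate to about .67 and .33, matching the point estimates reported for PMO and PMS.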
8.3 Comprehension and Model Usage
8.3.1 Objective Performance
- In problem comprehension (PC, Fig. 13a), VCL (M = .65[.61,.69], sd = .19) is quasi-equal to UML+OCL (M = .64[.6,.68], sd = .2): MD = .01[−.04,.05].
- In defect detection (DD, Fig. 13b), VCL’s (M = .27[.24,.3], sd = .13) difference to UML (M = .19[.17,.21], sd = .1) is highly significant: MD = .08[.05,.1], W p = \(5 \times 10^{-7}\) (***), BY q = \(2 \times 10^{-5}\) (***), d = .67[.41,.87] (∙∙+).
- In model comprehension (MC, Fig. 13c), VCL (M = .67[.64,.71], sd = .16) significantly outperforms UML+OCL (M = .61[.58,.64], sd = .16): MD = .06[.02,.1], W p = .0025 (**), BY q = .022 (*), d = .37[.09,.52] (∙+).
8.3.2 Perceived Performance
- In PC, VCL (P = .47[.33,.61]) outperformed UML+OCL (P = .12[.05,.24]) non-significantly — PD = .35[.13,.53], χ2p = .0098 (**), BY q = .081 (ns), h = .81[.51,1.1] (∙∙∙).
- In DD, VCL (P = .56[.41,.7]) outperformed UML+OCL (P = .23[.13,.38]) non-significantly — PD = .33[.06,.54], χ2p = .0074 (**), BY q = .066 (ns), h = .68[.38,.98] (∙∙+).
- In MC, VCL (P = .44[.3,.59]) outperformed UML+OCL (P = .14[.07,.27]) non-significantly: PD = .3[.08,.49], χ2 p = .026 (*), BY q = .17 (ns), h = .69[.39,.99] (∙∙+).
8.4 Usefulness and Ease of Use
- In U (Fig. 16a), VCL (M = .68[.64,.73], sd = .15) outperforms UML+OCL (M = .66[.61,.7], sd = .14), but not significantly — MD = .03[−.04,.09] (sd=.14), W p = .19 (ns), d = .18[−.18,.42] (∙).
- In EoU (Fig. 16b), VCL (M = .58[.55,.62], sd = .12) significantly outperforms UML+OCL (M = .49[.45,.54], sd = .14): MD = .09[.04,.15] (sd = .14), W p = .0026 (**), BY q = .022 (*), d = .7[.21,.85] (∙∙+).
8.5 Usability
- In reading (R), VCL’s proportion of .53 (CI [.39,.67]) against UML+OCL’s .33 (CI [.19,.47]) yields a non-significant proportion difference (PD) of .21[−.07,.45] — χ2p = .0064 (**), BY q = .066 (ns), h = .43[.13,.73] (∙∙).
- In navigation (N), VCL’s (P = .69[.54,.81]) difference to UML+OCL (P = .14[.07,.28]) is significant: PD = .55[.29,.72], χ2 p = \(6 \times 10^{-6}\) (***), BY q = .00014 (***), h = 1.19[.88,1.49] (∙∙∙).
- In maps and overviews (MOs), VCL’s (P = .56[.41,.7]) difference to UML+OCL (P = .21[.11,.35]) is not significant: PD = .35[.09,.56], χ2 p = .0074 (**), BY q = .066 (ns), h = .74[.31,1.16] (∙∙+).
- In live error checking (LEC), VCL (P = .51[.37,.65]) significantly outperformed UML (P = .09[.04,.22]) — PD = .42[.2,.59], χ2p = .0024 (**), BY q = .03 (*), h = .97[.68,1.27] (∙∙∙).
- In look and feel (LaF), VCL (P = .7[.55,.81]) is significantly higher than UML+OCL (P = .16[.08,.3]): PD = .53[.27,.72], χ2 p = \(3 \times 10^{-6}\) (***), BY q = \(9 \times 10^{-5}\) (***), h = 1.15[.85,1.45] (∙∙∙).
- In cohesion (C), VCL (P = .37[.24,.52]) non-significantly outperforms UML (P = .27[.16,.42]) — PD = .1[−.14,.32], χ2p = .68 (ns), h = .21[−.1,.52] (∙+).
- In learnability (L), VCL (P = .46[.3,.64]) non-significantly outperforms UML+OCL (P = .32[.18,.51]) — PD = .14[−.18,.43], χ2p = .27, h = .29[−.08,.66] (∙+).
- In comfort/satisfaction (CS), VCL’s (P = .54[.36,.7]) advantage to UML (P = .25[.13,.43]) is not significant — PD = .29[−.04,.55], χ2p = .074 (ns), h = .6[.22,.97] (∙∙).
8.6 Overall Perception
8.6.1 Preferred Notation
- In the state space (PNS), VCL (P = .42[.28,.57]) under-performs UML+OCL (P = .44[.3,.59]), but non-significantly — PD = −.02[−.29,.24], χ2p = .026 (*), BY q = .17 (ns), h = −.05[−.35,.25] (ø).
- In invariants (PNI), VCL (P = .65[.5,.78]) significantly outperforms UML+OCL (P = .19[.1,.33]): PD = .47[.2,.66], χ2 p = \(6 \times 10^{-5}\) (***), BY q = .0012 (**), h = .99[.69,1.28] (∙∙∙).
- In operations (PNO), VCL (P = .58[.43,.72]) significantly outperforms UML (P = .21[.11,.35]): PD = .37[.11,.58], χ2 p = .0026 (**), BY q = .03 (*), h = .78[.49,1.08] (∙∙+).
- Overall, VCL’s (P = .49[.35,.63]) advantage to UML (P = .28[.17,.43]) is not significant— PD = .21[−.05,.44], χ2p = .091 (ns), h = .43, CI [.14,.73] (∙∙).
- There is a tie in the notation to be used in the future (NF): \(P_{VCL} = P_{UML}\) = .33[.2,.47], PD = 0[−.23,.23], h = 0[−.3,.3] (ø).
8.6.2 Positive and negative aspects
- From a total of 385 comments, 215 were positive (P = .56[.51,.61]), 121 negative (P = .31[.27,.36]) and 49 were neutral — Fig. 19a and b.
- As signalled by Fig. 19c, positives significantly surpass the negatives: PD = .24[.15,.33], χ2 p = \(4 \times 10^{-24}\) (***), BY q = \(4 \times 10^{-22}\) (***), h = .5[.4,.6] (∙∙).
- By dissecting the positive comments (Fig. 19d), we can see that understanding (U, 21), ease of use (oU, 20) and ease of finding errors (EFE, 19) were the most frequently remarked. Participants also appraised positively VCL’s modelling of behaviour (MB, 18) and invariants (MI, 14), and VCL’s visualisations (V, 14), usability with respect to ease of access to information (EAI, 13) and navigability (N, 12), while appreciating VCL as an overall language (OL, 10). UML’s Papyrus tool was perceived by many as difficult (TD, 10). Some participants appreciated VCL’s overall modelling (M, 8), organisation (Or, 8), user interface (UI, 6), appeal (A, 5), and its state modelling approach (MS, 5), while remarking UML+OCL’s bad usability (BU, 6). A few participants found it comfortable to work with VCL (Ct, 4), appreciated VCL’s cohesion (Cn, 4) and capacity to provide overviews (Ov, 4) while remarking that OCL is difficult (D, 4).
- In terms of VCL’s negatives (Fig. 19e), the most frequently remarked aspects were familiarity with UML (F, 16), the fact that UML is more widely known (MK, 9) and VCL’s bad usability (BU, 9). Some participants remarked UML’s ease of use (EU, 8), modelling of state (MS, 8) and behaviour (MB, 7), and cohesion (Cn, 7), while remarking VCL’s cumbersome modelling of behaviour (CB, 8) and cumbersome editing (Ed, 7). Some participants appraised positively UML’s Papyrus tool (T, 6) and UML’s understanding (U, 5), while recognising that VCL’s tool is difficult (TD, 6). A few praised UML as an overall language (OL, 4), its capacity to provide overviews (Ov, 3) and its expressivity (Ex, 3), and that UML is okay and does the job (Ok, 3), while emphasising that they felt comfortable using UML (Ct, 3) and that it is easy to find errors in UML models (EFE, 3).
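The proportions and 95% CIs given for the comment counts at the start of this subsection can be approximated with a Wilson score interval. That the analysis used this particular interval is an assumption, made because it agrees with the reported figures to two decimals; the function name is hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Comment counts reported above: 215 positive and 121 negative out of 385.
positive_ci = wilson_interval(215, 385)
negative_ci = wilson_interval(121, 385)
```

The resulting bounds are close to [.51,.61] for the positive comments and [.27,.36] for the negative ones.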
8.7 Learning effects
9 Threats to Validity
9.1 Conclusion Validity
9.2 Construct Validity
Id | Question | N | Mean | SD |
---|---|---|---|---|
Training | ||||
T1 | The training allowed me to carry out the tasks of the experiment competently | 43 | 3.21[2.91, 3.51]∗∗∗,∙+ | .99 |
T2 | I had enough training in UML | 43 | 2.56[2.24, 2.88]∗,∙∙ | 1.08 |
T3 | I had enough training in OCL | 43 | 3.67[3.38, 3.96]∗∗∗,∙∙∙ | .97 |
T4 | I had enough training in VCL | 43 | 3.35[3.09, 3.61]∗,∙∙ | .87 |
T5 | I would have performed better if I were given more hours of training | 43 | 1.63[1.39, 1.86]∗∗∗,∙∙∙ | .79 |
Case studies | ||||
CS1 | I had enough time to read through the requirements descriptions of the given case studies | 43 | 2.02 [1.74, 2.31]∗∗∗,∙∙∙ | .96 |
CS2 | I fully understood the systems described in the requirements documents | 43 | 2.21[1.97, 2.45]∗∗∗,∙∙∙ | .8 |
CS3 | In general, I had no problems carrying out the experiment’s tasks | 43 | 2.79[2.52, 3.06]NS,∙+ | .91 |
CS4 | The instructions were clear and easy to follow | 43 | 1.77[1.56, 1.97]∗∗∗,∙∙∙ | .68 |
CS5 | The university library system was fairly easy to understand | 43 | 2.02[1.79, 2.25]∗∗∗,∙∙∙ | .77 |
CS6 | The flight booking system was fairly easy to understand | 43 | 2.14[1.87, 2.41]∗∗∗,∙∙∙ | .89 |
CS7 | The university library case study was realistic; a good example of a real-world case study | 7 | 1.57[.99, 2.15]∗,∙∙∙ | .79 |
CS8 | The flight booking case study was realistic; a good example of a real-world case study | 7 | 1.86[1.19, 2.52]NS,∙∙∙ | .9 |
Defect detection | ||||
DD1 | I fully understood the task | 43 | 1.81[1.61, 2.02]∗∗∗,∙∙∙ | .7 |
DD2 | I was not sure what type of errors I was looking for | 43 | 3.02[2.65, 3.4]NS,ø | .03 |
Modelling tools | ||||
MT1 | The modelling tools were fairly stable | 43 | 2.23[1.96, 2.51]∗∗∗,∙∙∙ | .92 |
9.3 Internal Validity
Id | Question | N | Mean | SD |
---|---|---|---|---|
E1 | I liked the experiment | 35 | 1.89[1.61, 2.16]∗∗∗,∙∙∙ | .83 |
E2 | The experiment was interesting | 43 | 1.93[1.66, 2.2]∗∗∗,∙∙∙ | .91 |
E3 | During the experiment I was motivated, committed and I felt challenged | 43 | 2.23[1.94, 2.52]∗∗∗,∙∙∙ | .97 |
E4 | I felt it was a good learning experience | 43 | 1.95[1.7, 2.21]∗∗∗,∙∙∙ | .84 |
9.4 External Validity
10 Related Work
- It covers comprehension, a focus of many studies (Purchase et al. 2002; Purchase et al. 2001; Torchiano 2004; Staron et al. 2006; Ricca et al. 2007; 2010) that investigate whether either notation or modelling per se are fulfilling their aims. This paper emphasises end-user comprehension (either through model comprehension or defect detection tasks), but goes beyond it to explore the problem comprehension gained from modelling.
- It goes beyond data modelling, sharing with Kim and March (1995) the emphasis on the modeller perspective, with Otero and Dolado (2002) the focus on dynamic modelling and with Briand et al. (2005, 2011) the focus on constraints and design by contract, but exploring novel graphical notations for the modelling of invariants and operations not covered by any other study.
Study | Goal | Perspective | Tasks | Participants |
---|---|---|---|---|
Kim and March (1995) | Investigate effectiveness of data (or state space) modelling notations on comprehension and modelling. | end-user, modeller | (i) modelling from textual requirements, (ii) questionnaire on model comprehension, (iii) detecting model defects against textual requirements. | 28 postgraduate students (users) and 26 professionals (modellers). |
Moody (2002) | Investigate comprehension of large data modelling mechanisms | end-user | (i) questionnaire on model comprehension, (ii) defect detection | 60 students |
Investigate comprehension of variants of UML class and collaboration diagrams. | end-user | (i) identifying correct and incorrect diagrams against textual requirements | ||
Otero and Dolado (2002) | Investigate comprehension of UML dynamic notations (statecharts, and sequence and communication diagrams) | end-user | (i) model comprehension questionnaires | 18 final year undergraduates |
Tilley and Huang (2003) | Investigate effectiveness of UML in facilitating program understanding. | end-user, programmer | (i) a questionnaire on impact of changes to a system from code and UML-based documentation. | 15 expert academics |
Torchiano (2004) | Study comprehension of UML class models accompanied by object diagrams. | end-user | (i) model comprehension questionnaires, (ii) debriefing survey | 17 masters students |
Briand et al (2005) | Study comprehension of UML-based models with OCL | end-user | (i) questionnaires on model comprehension and (ii) maintenance (impact of changes to a model), (iii) defect detection and (iv) debriefing survey. | 38 final-year undergraduates |
Arisholm et al (2006) | Investigate effectiveness of UML models with respect to maintenance. | end-user, programmer | (i) code and (ii) design modifications on given design and code, (iii) post-task survey | 96 undergraduate students. |
Staron et al (2006) | Investigate comprehension of stereotyped UML class models | end-user | (i) model comprehension questionnaires, (ii) debriefing survey | 44 students and 4 industry professionals |
Investigate comprehension of stereotyped UML class models | end-user | (i) model comprehension questionnaires, (ii) debriefing survey | 51 subjects among under- and post-graduates, and researchers. | |
Briand et al (2011) | Study system sequence diagrams and natural language contracts on quality of UML-based models. | end-user, modeller | (i) modelling | 223 final year undergraduates |
11 Conclusions
RQ | Findings |
---|---|
RQ1 | On modelling of state space, invariants and operations: (i) VCL’s structural diagrams (SDs) were quasi-equal to UML class diagrams (variables CoS, AcS and PMS); (ii) on invariants, no significant VCL benefits were encountered; (iii) on operations, significant VCL benefits were encountered in completeness (CoO), but not accuracy (AcO), and a significant proportion of subjects perceived a better VCL performance (PMO). |
RQ2 | On the modeller’s comprehension of the modelled problem, no significant differences were encountered in either objective (PC) or subjective (PPC) measurements. |
RQ3 | On model usage, significant VCL benefits were encountered in objective measurements of defect detection (DD) and model comprehension (MC). |
RQ4 | In usefulness (U), no significant VCL benefits were encountered. In ease of use (EoU), VCL was perceived as being significantly better than UML+OCL. |
RQ5 | Usability was analysed from different angles. Significant VCL benefits were found for navigation (UsN), live error checking (UsLEC) and look and feel (UsLF). |
RQ6 | On the overall perception: (i) a significantly larger proportion of participants preferred VCL as a notation to express invariants (PNI) and operations (PNO); (ii) in the appraisal of positive and negative aspects (Appr), VCL was appraised favourably, as positives significantly surpassed negatives. |