Human factors research and debate on mental workload have been ongoing since the 1960s (McKenzie et al. 1966) and continue today (Finomore et al. 2013). The issues raised are whether the concept is useful (“do we need the concept of mental workload or do we have to banish it for good?” Leplat 2002), whether it is scientifically credible (Dekker et al. 2010), and, if so, how to measure it (Jex 1988).

In the context of the driving task, human factors research is abundant and diversified. It aims at a better understanding of driver behavior and functional capacities in terms of perception, cognition, and motor processes in order to improve road safety (Lee 2008), and drivers’ mental workload is an important issue to consider in this framework (de Waard 1996). The issue has become even more crucial since the deployment of onboard Intelligent Transport Systems in vehicles (Carsten and Nilsson 2001): human factors specialists are responsible for evaluating whether these innovative systems really support the driving task or, on the contrary, lead to distraction and increased mental workload, with potentially dramatic consequences for road safety. Since the beginning of research in this area, the objective has therefore been to establish methods of assessing fluctuations in mental workload that are sensitive to the various aspects of attentional processing requirements, in relation to both external environmental conditions, such as traffic density, and in-vehicle conditions, such as competing visual and auditory displays (Pauzié and Amditis 2010).

Workload can be defined as a hypothetical construct that represents the cost incurred by a human operator to achieve a particular level of performance (Hart 1986). Driving performance and the driver’s mental workload are both relevant and complementary parameters to consider, since they can vary independently (Yeh and Wickens 1988). Indeed, if the complexity of the task increases, the driver can maintain stable performance up to a certain point by increasing effort, so that workload rises while measured performance remains unchanged.

Assessing performance alone, for example by measuring deviations in vehicle trajectory during visual consultation of an in-vehicle system, or by quantifying and qualifying driving errors, is not sufficient. These variables reveal only very poor system designs that induce highly critical situations. The ambition of human factors is, first, to be subtler in assessing a system’s usability and, second, to obtain indications about which parts of the design require improvement: visual and/or auditory features, timing of the messages, etc. Assessing the driver’s mental workload provides data on the level of effort, if any, needed to maintain performance, and indications of the origin of that cost (Pauzié 2008).

Furthermore, analysis of observable driver behavior, such as visual strategy in terms of glance duration and glance frequency, provides useful data on the visual demand of a display (Parkes et al. 1991). It can lead to a performance standard under which a system is considered unsafe because it requires too much reading time to be used (Zwahlen and Rockwell 1977). Nevertheless, workload needs to be evaluated in parallel in order to also take into account the potential cognitive support brought by the displayed information, in comparison with a reference situation where there is no system and therefore no visual load toward a display. An example is the decrease in workload at the strategic level of the orientation process provided by a correctly designed navigation system (Pauzié and Pachiaudi 1997).

For all these reasons, in addition to data coming from performance and behavioral analysis, complementary measurements are necessary in order to identify the driver’s workload. Workload assessment techniques abound for this purpose (Cain 2007; Tsang and Wilson 1997). However, subjective ratings are the most commonly used method: they may come closest to tapping the essence of mental workload and would provide the most generally valid and sensitive indicator (Johanssen and Moray 1979), and they are also very easy to use in practice, especially in real road experiments.

The NASA-TLX is the most popular of these subjective evaluation methods. It assumes that workload is influenced by several factors, namely mental demand, physical demand, temporal demand, performance, frustration level and effort, and that some combination of these dimensions is likely to represent the “workload” experienced by most people performing most tasks (Hart and Staveland 1988). Originally, this method was tested and used by the army, which considered it superior to other subjective methods in terms of sensitivity and well accepted by operators (Hill et al. 1992).
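As an illustration of how such a combination can be computed, the sketch below follows the weighting procedure commonly described for the NASA-TLX: each dimension is rated on a 0 to 100 scale, the respondent picks the more relevant dimension in each of the 15 possible pairs, and the overall score is the weighted mean of the ratings. The function names are hypothetical and the ratings and choices are invented for the example; they are not data from any study cited here.

```python
from itertools import combinations

# The six NASA-TLX dimensions, in the order listed in the text.
DIMENSIONS = (
    "mental demand", "physical demand", "temporal demand",
    "performance", "frustration", "effort",
)


def tlx_weights(choices):
    """Count how often each dimension was chosen over the 15 pairwise comparisons.

    `choices` maps each pair (a, b) from combinations(DIMENSIONS, 2)
    to the dimension the respondent judged more relevant to workload.
    """
    weights = dict.fromkeys(DIMENSIONS, 0)
    for pair in combinations(DIMENSIONS, 2):
        winner = choices[pair]
        if winner not in pair:
            raise ValueError(f"{winner!r} is not a member of {pair!r}")
        weights[winner] += 1
    return weights


def weighted_tlx(ratings, weights):
    """Weighted mean of the ratings; the weights sum to 15 by construction."""
    total_weight = sum(weights.values())
    return sum(ratings[d] * weights[d] for d in DIMENSIONS) / total_weight


def raw_tlx(ratings):
    """Unweighted mean of the six ratings (no pairwise comparison step)."""
    return sum(ratings[d] for d in DIMENSIONS) / len(DIMENSIONS)


# Invented example: the respondent always prefers the first dimension of each pair.
example_choices = {pair: pair[0] for pair in combinations(DIMENSIONS, 2)}
example_ratings = {
    "mental demand": 70, "physical demand": 20, "temporal demand": 60,
    "performance": 40, "frustration": 30, "effort": 50,
}

example_weights = tlx_weights(example_choices)
print(weighted_tlx(example_ratings, example_weights))  # 48.0
print(raw_tlx(example_ratings))                        # 45.0
```

The weighted and unweighted scores differ here because the dimensions rated highest are also the ones the respondent judged most relevant; when the weights are close to uniform, the two scores converge.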

Since its creation, use of this method has grown explosively in human factors research across several areas. Hart (2006) investigated this phenomenon 20 years later and noted that simply “Googling” the phrase “NASA-TLX” returned 82,900 citations, 44,000 of which were in English, from diversified areas of investigation and fields of research.

This tool has also been widely used in the context of road safety and the design of In-Vehicle Information Systems (IVIS) and Advanced Driver Assistance Systems (ADAS). For IVIS, for example, it was used to evaluate navigation system usability and human–machine interface design (Park and Cha 1998); for ADAS, it showed that an Adaptive Cruise Control (ACC) system substantially decreased workload among drivers with lower limb disabilities (Peters 2001). In a field experiment, the method was used in conjunction with objective methods to assess the efficiency of an adaptive HMI that filters information presentation according to situational requirements (Piechulla et al. 2003). NASA-TLX ratings also demonstrated an increase in driver workload in a study of mobile phone use while driving (Alm and Nilsson 1994). These are only some examples of the large body of knowledge gathered thanks to this method.

It has to be stressed that mental workload is multidimensional and, among other things, depends upon the type of task. Since the NASA-TLX was initially designed for use in aviation, its original factors were tailored to that context. The Driving Activity Load Index (DALI), a revised version of the NASA-TLX, was created specifically for the driving context, with factors defined according to the specific dimensions of this task that could potentially induce workload (Pauzié and Marin-Lamellet 1989). The basic principle is the same as that of the NASA-TLX, with a scale rating procedure for six pre-defined factors; the weighting procedure was eliminated after several analyses showed that this second part of the tool was not useful. The six DALI factors are: Effort of attention, Visual demand, Auditory demand, Temporal demand, Interference and Situational stress.
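Because the DALI keeps only the rating step, a typical analysis compares factor-by-factor profiles between driving conditions rather than a single weighted score. The sketch below is a minimal illustration under stated assumptions: the 0 to 5 rating range, the averaging across participants, the function names, and all the ratings are hypothetical and are not taken from the DALI publications cited here.

```python
from statistics import mean

# The six DALI factors as listed in the text.
DALI_FACTORS = (
    "effort of attention", "visual demand", "auditory demand",
    "temporal demand", "interference", "situational stress",
)

# Assumed rating range; the actual scale anchors should be taken
# from the questionnaire version in use.
SCALE_MIN, SCALE_MAX = 0, 5


def dali_profile(responses):
    """Average each factor across participants for one driving condition."""
    for response in responses:
        for factor in DALI_FACTORS:
            if not SCALE_MIN <= response[factor] <= SCALE_MAX:
                raise ValueError(f"{factor!r} rating out of range: {response[factor]}")
    return {f: mean(r[f] for r in responses) for f in DALI_FACTORS}


def profile_difference(condition, reference):
    """Per-factor change of a condition relative to a reference condition."""
    return {f: condition[f] - reference[f] for f in DALI_FACTORS}


def make_response(*ratings):
    """Build one participant's response from six ratings in DALI_FACTORS order."""
    return dict(zip(DALI_FACTORS, ratings))


# Invented ratings from two drivers: without any system, then with a navigation system.
reference_responses = [make_response(3, 2, 1, 2, 1, 3), make_response(3, 2, 1, 4, 1, 3)]
system_responses = [make_response(2, 4, 2, 2, 2, 1), make_response(2, 4, 2, 2, 2, 3)]

difference = profile_difference(
    dali_profile(system_responses), dali_profile(reference_responses)
)
for factor, delta in difference.items():
    print(f"{factor}: {delta:+}")
```

In this invented example the system raises visual demand while lowering effort of attention, temporal demand, and situational stress. A single global score would hide that pattern, which is one reason to keep the factors separate when looking for the part of a design that needs improvement.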

A real road experiment was conducted to define the advantages and limits of the DALI method for evaluating drivers’ mental workload (Pauzié and Manzano 2006). As the objective of the experiment was to test the method, the driving conditions were deliberately set up to induce diverse levels of workload for the driver. Results showed that the tool correctly reflected what was expected from the a priori complexity of the driving context, with the subjective evaluation data matching the characteristics of the various experimental conditions.

The DALI has been used in various research investigations, testing in-vehicle navigation and guidance systems used by young and elderly drivers, and hands-free car phone use while driving (Pauzié and Pachiaudi 1997). The method has also been applied to the evaluation of IVIS usability (Harvey et al. 2011) and to the study of the relationship between cognitive workload and driving performance (Gabaude et al. 2012). Tretten (2011) and Tretten et al. (2009) analyzed DALI scores, in addition to eye-tracking data and interviews, to compare four different Head-Up Display (HUD) areas in research investigating the safe use of IVIS.

In a study comparing navigation based on a 2D map display with an Augmented Reality (AR) display in which virtual objects are superimposed on the real scene, DALI data indicated that AR navigation was visually and temporally more demanding than the 2D map. Analysis of drivers’ eye movements confirmed this result, with AR navigation attracting the driver’s visual attention more frequently than map navigation (Kim and Wohn 2011).

Kern et al. (2010) investigated an innovative mode of driver/system dialogue in which tracking of the driver’s eye gaze, combined with a button on the steering wheel as explicit input, replaces interaction with the touch screen and thus allows the hands to remain on the steering wheel. Using the DALI, they concluded that this mode of interaction is more distracting than a touch screen, and performance data indicated that it is also slightly slower.

Subjective workload evaluation defined by factors based on task specificity is efficient and useful for the researcher; this has been demonstrated for the driving task, and the process is ongoing for the riding task. Indeed, research has recently been conducted on the adaptation and implementation for Powered Two-Wheelers of appropriate ADAS/IVIS technologies, renamed Advanced Rider Assistant System (ARAS) and On-Bike Information System (OBIS), which might contribute to a significant enhancement of riders’ safety. In this framework, it has to be stressed that safety consequences must be carefully considered, as riding is a very sensitive task in which any unexpected distraction or change in the rider’s dynamic motion can potentially lead to loss of control.

In order to evaluate riders’ workload, the Riding Activity Load Index (RALI) was developed after discussion with experts in this area (Pauzié et al. 2009). The aim was to adapt the tool to the specificity of the riding task in the same way the NASA-TLX was adapted to the driving task. The first version of the RALI has the following factors: Visual Demand, Auditory Demand, Temporal Demand, System Interference, Effort Of Attention, Situation Own Coping, Situational Stress, Emotions Handling Vehicle. Two main factors were added to the tool, as they seem to be typical of the riding context: “Situation own coping”, related to “evaluate the workload induced for coping with the other vehicles and with the complexity of the environment”, and “Emotions handling vehicle”, related to “evaluate the level of negative emotions linked to the control and the handling of the motorbike”. Preliminary tests have been conducted with motorbike manufacturers; additional experiments would allow the method to be improved and validated.

1 Conclusion

From an operational perspective in road safety, human factors constructs related to the evaluation of drivers’ mental workload are essential for prediction. Workload evaluation makes it possible to identify and characterize critical contexts inducing high mental workload that might not be detected through performance or behavior, and thus to propose concrete and effective improvements in system design, training and adaptation procedures. As Hart and Wickens (1990) pointed out, designers, manufacturers, and operators, who are ultimately interested in system performance, need answers about operator workload.

Subjective measures of mental workload allow “evaluation” rather than “measurement,” by establishing relative comparisons between situations and producing a global and even “crude” criterion. Nevertheless, they are a powerful and easy-to-use tool, able to detect changes such as resource allocation that would be impossible to identify by direct observation.

Several decades of human factors research in the automotive domain have demonstrated the success of driver workload evaluation in providing reliable and efficient data for avoiding design flaws in in-vehicle systems and for better understanding drivers’ behavior, with the purpose of improving road safety. The methods of workload evaluation have been diverse, with nevertheless a strong preference for subjective evaluation tools, and more specifically the NASA-TLX. This preference probably reflects a Matthew effect, in addition to researchers’ wish to remain consistent with other studies conducted in the area.

Of course, it is still relevant and desirable to work on the robustness, adaptation and reliability of the existing tools, techniques and procedures available to evaluate mental workload; the creation of the DALI for the driving context and of the RALI for the riding context are examples of work toward this objective.

Research on human factors constructs also gains from building up knowledge on the representational side of the representational–operational continuum, in addition to this pragmatic operational approach. For example, the neuroergonomics approach deserves to be investigated in the coming decades: thanks to the great improvement of physiological recording, it can bring innovative knowledge to the theoretical background of drivers’ cognitive processes (Fort et al. 2010).