We will now provide a brief review of a number of key behavioural phenomena. This list is not meant to be complete and the inclusion of topics is unavoidably selective. Each time, we seek to discuss the behavioural relevance of the topic, the likely impact of not accommodating the phenomenon in our models, and an overview of attempts (if any) to represent the effect in the choice modelling literature. With regard to the last point, we specifically look at the implications of such efforts on maintaining consistency with utility maximisation. We group the phenomena together according to whether or not they are theoretically consistent with RUM.
3.2.1 Generally fully consistent
Anchoring effects
Anchoring effects refer to the phenomenon that individuals’ decisions could be affected by external cues. A crucial initial investigation came in the work of Tversky and Kahneman (
1974), who demonstrated that students’ judgements of the percentage of African countries in the United Nations were biased towards a random number generated by a ‘wheel of fortune’. Since then, behavioural economists and psychologists have found salient and robust anchoring effects in both experiments and real world choices.
In the context of the choice modelling literature, the main focus on anchoring effects has been how a previous choice setting can influence preferences in a subsequent choice setting. A key example comes in value of time work, especially where based on stated choice data. If a respondent is faced with a choice in task 1 where he/she can
purchase a reduction in travel time at a cost of £x/h, then this may influence his/her willingness to purchase a reduction at a cost of £y/h (where £y may be smaller or larger than £x) in subsequent tasks. Anchors may form specifically the first time a respondent faces a given type of choice, but subsequent choices may refine the anchor. The influence of anchoring on the value of time has been considered in some depth by Van de Kaa (2005).
The specification of anchors may vary, and an anchor could be formed either by what a decision-maker ‘sees’ in a given choice task or by what he/she chooses. An anchor may also be constant (formed the first time a respondent faces a particular choice) or evolve over time (e.g. changing with each choice situation). If, in each choice situation, the choice is modelled with a RUM structure, then the actual choice is consistent with RUM, but the sequence is not consistent with a single definition of utility, as utility gets redefined over time, either just once for all choices following the initial choice, or after each choice. This is consistent with the original Block and Marschak (
1960) interpretation of RUM. Either way, such heterogeneity in valuations over time is not in principle inconsistent with RUM.
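As an illustration of the distinction between constant and evolving anchors, a simple updating rule can be sketched as follows. The exponential-smoothing form and the persistence parameter lam are our own illustrative assumptions, not a specification from the literature reviewed here:

```python
def update_anchor(anchor, observed_value, lam=0.8):
    """Illustrative anchor update (our own assumption, not from the literature).

    anchor: current reference value (e.g. an implied value of time in GBP/h)
    observed_value: the value 'seen' or chosen in the latest choice task
    lam: persistence of the existing anchor; lam=1 gives a constant anchor
         formed in the first task, while lam<1 lets the anchor evolve with
         each choice situation.
    """
    return lam * anchor + (1.0 - lam) * observed_value
```

Under this sketch, each choice task is still evaluated with a RUM structure given the current anchor; it is only the definition of utility that is redefined between tasks.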
Zero cost/price effects
In an example made famous by Dan Ariely’s book
‘Predictably Irrational’ (Ariely
2008), and based on Shampanier et al. (
2007), individuals’ choices between two chocolate products changed substantially when an equal reduction in the cost (i.e. price) of the two products led to a zero cost for one of the two. Such effects are also visible in many stated choice surveys where one or more of the alternatives in a choice task have a zero cost to the respondent, be it in the case of toll road studies (e.g. Hess et al.
2008) or the numerous environmental economics datasets including a zero cost status quo alternative (see the discussion on confounding in Hess and Beharry-Borg
2012). The behaviour exhibited by this effect is not consistent with a linear cost sensitivity, which is a core assumption in many applications of choice models. However, it can easily be accommodated through a non-linear specification and does not lead to violations of utility maximisation.
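As a minimal illustration of such a non-linear specification, the cost contribution to utility could be augmented with a dummy for free alternatives. The dummy formulation and the parameter values below are our own illustrative assumptions:

```python
def cost_contribution(cost, beta_cost=-0.1, beta_free=0.5):
    """Illustrative non-linear cost term accommodating a zero-price effect.

    A strictly linear specification would return beta_cost * cost only;
    the dummy adds an extra utility boost when the alternative is free.
    Parameter values are hypothetical, not estimates from any study.
    """
    return beta_cost * cost + (beta_free if cost == 0.0 else 0.0)
```

With this form, an equal price reduction across two alternatives changes their utilities unequally once one of them reaches a zero cost, while the model remains consistent with utility maximisation.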
Status quo bias
Status quo bias refers to the phenomenon that individuals have a strong propensity to choose the alternative that describes their current situation. It was first demonstrated by Samuelson and Zeckhauser (
1988), but is commonly observed in many stated choice surveys, especially when the status quo alternative is explicitly labelled as such. The fact that individuals attach undue weight to their current situation does not lead to any issues from a utility maximisation perspective, and is routinely accommodated in models. A different issue of course applies if these models are used in forecasting, where the status quo is unknown. Applications looking at this issue are common in environmental economics, see for example Meyerhoff and Liebe (
2009).
Mental accounting
Mental accounting refers to the cognitive process by which individuals allocate their overall money budget into different mental accounts. It is a common empirical finding that money in one mental account is not a perfect substitute for money in another account, thus violating the principle of fungibility (Thaler
1985). This effect is commonly observed in transport choice models with multiple cost components (e.g. different responses to fuel costs and toll costs) and has for example been studied in a stated choice context by Hess et al. (
2012). While this behavioural effect poses issues from an economic theory perspective, it does not pose any particular issues for a theoretical RUM-consistent model of choice behaviour.
Elimination by aspects
Elimination by aspects (EBA), which was proposed by Tversky (
1972a,
b), posits that an agent successively eliminates alternatives that fail to possess aspects that the agent finds necessary or important. Noting that the elimination process establishes a branching choice structure, several authors have suggested similarity with the nesting structures of McFadden's (1978) RUM-consistent GEV model. This suggestion was investigated in detail by Batley and Daly (
2006), who found that there was equivalence between ‘hierarchical’ EBA (where there is a unique sequence of eliminations to reach each alternative) and ‘tree’ Nested Logit models (where again there is a unique choice sequence). Although more general EBA and Cross-Nested Logit models are not necessarily equivalent, despite the apparent similarity of structure, Tversky (
1972a) presented a much-neglected proof that, by re-interpreting EBA as a ranking model, general consistency between EBA and RUM can be established.
3.2.2 Not consistent in general or in some cases
Lexicography and extreme sensitivities
Lexicography refers to the case where, typically in an experimental setting, a decision-maker evaluates the alternatives on the basis of a subset of attributes (e.g. Sælensminde
2006). Common examples include respondents who always choose the cheapest alternative irrespective of the other attributes shown, or travellers who always choose the fastest alternative. Lexicography may also manifest itself as non-trading if, for example, respondents always choose the same mode in a transport setting. This type of behaviour may be consistent with utility maximisation if it reflects true preferences, i.e. extremely high sensitivities to given attributes, such that a change in behaviour would arise only with a sufficiently large incentive. If, however, it is caused by strategic behaviour in a survey context, violations of RUM may arise. Lexicographic behaviour may also be the result of choice set complexity, leading to decision-makers adopting processing heuristics, an issue we return to below.
Reference-dependent preferences and loss aversion
The topics of reference dependence and loss aversion are generally attributed to Tversky and Kahneman (
1991) and have become a widely studied topic in choice modelling in recent years. The central argument is that when individuals evaluate their response to a given stimulus, i.e. the value of an attribute \(x_\mathrm{jntk}\) (the \(k\)th component of \(x_\mathrm{jnt}\)), this valuation depends not just on the absolute value of this attribute, but also on its value relative to a reference point, say \(r_\mathrm{nk}\). For an undesirable attribute, respondents are expected to react negatively to increases in \(x_\mathrm{jntk}\) and positively to decreases. When these reactions are symmetrical, we return to the standard specification, where the contribution to the utility of alternative \(j\) is given by \(\beta _k x_\mathrm{jntk}\) (under the assumption of a linear specification). Loss aversion postulates that losses are more painful than gains are pleasurable, and we then instead have that the contribution is driven by separate loss (\(\beta _\mathrm{k,loss}\)) and gain (\(\beta _\mathrm{k,gain}\)) parameters: \(\beta _\mathrm{k,loss} \left( x_\mathrm{jntk} -r_\mathrm{nk} \right) \) if \(x_\mathrm{jntk} >r_\mathrm{nk}\), and \(\beta _\mathrm{k,gain} \left( r_\mathrm{nk} -x_\mathrm{jntk} \right) \) if \(x_\mathrm{jntk} <r_\mathrm{nk}\), where we would expect that \(\beta _\mathrm{k,loss} \le 0\le \beta _\mathrm{k,gain}\) and \(\left| \beta _\mathrm{k,loss} \right| \ge \left| \beta _\mathrm{k,gain} \right| \).
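The piecewise specification above can be sketched directly; this is a minimal illustration, with parameter values that are hypothetical rather than estimates:

```python
def ref_contribution(x, r, beta_loss, beta_gain):
    """Reference-dependent utility contribution of a single attribute.

    Returns beta_loss * (x - r) when x lies above the reference r (a loss,
    for an undesirable attribute) and beta_gain * (r - x) when x lies below
    it (a gain); the contribution is zero exactly at the reference point.
    """
    if x > r:
        return beta_loss * (x - r)
    if x < r:
        return beta_gain * (r - x)
    return 0.0
```

Note that symmetric responses (beta_loss = -beta_gain) collapse this to the linear specification in differences from the reference, while loss aversion corresponds to |beta_loss| > |beta_gain|.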
Empirical support for reference dependence and loss aversion is widespread in the choice modelling literature (e.g. Hess et al.
2008) and has also led to the development of bespoke modelling approaches (cf. de Borger and Fosgerau
2008). What has received little or no attention is the impact on consistency with utility maximisation. With reference dependence, the utility of an alternative depends on the characteristics of the alternative and the reference point. It should be clear that if the reference point is independent of the composition of the choice task, then the inclusion of reference dependence in a model will not lead to a violation of utility maximisation. Indeed, the addition of an alternative into the choice set will not change the utilities of other alternatives, and the probabilities of all existing alternatives (prior to the new one being added) will not increase—thereby complying with regularity. This applies whether or not the reference alternative itself is included in the choice task, or indeed if the reference alternative is the alternative that is being added. If the reference point changes over time, then preferences will of course change too, but this is not a problem for utility maximisation. As a final point, if the reference alternative is included in the choice task, say as alternative 1, then a standard implementation of a model for such data (as in Hess et al.
2008) is in effect a Mother Logit structure, where, e.g. \(g_\mathrm{int} =\sum \nolimits _k \left[ \beta _\mathrm{k,inc} \cdot \max \left( x_\mathrm{intk} -x_\mathrm{1ntk} ,0 \right) +\beta _\mathrm{k,dec} \cdot \max \left( x_\mathrm{1ntk} -x_\mathrm{intk} ,0 \right) \right] \), where \(k\) is an index over attributes. This is thus an example where a Mother Logit structure is consistent with utility maximisation, as the utility for alternative \(i\) is only a function of its own attributes and the fixed attributes of the reference alternative. Effectively, the reference alternative becomes part of the preference structure at the moment of choice, and the alternatives are evaluated in that preference structure using only their own attributes.
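This \(g_\mathrm{int}\) term can be sketched as follows, with attribute values and parameters that are purely illustrative:

```python
def g_ref(x_i, x_ref, beta_inc, beta_dec):
    """g_int for alternative i against a fixed reference alternative 1.

    Sums, over attributes k, beta_inc[k] * max(x_ik - x_1k, 0) for increases
    and beta_dec[k] * max(x_1k - x_ik, 0) for decreases. Only the attributes
    of alternative i and of the fixed reference enter, which is why this
    Mother Logit form remains consistent with utility maximisation.
    All inputs here are hypothetical illustrations.
    """
    return sum(
        bi * max(xi - xr, 0.0) + bd * max(xr - xi, 0.0)
        for xi, xr, bi, bd in zip(x_i, x_ref, beta_inc, beta_dec)
    )
```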
Decoy, context and framing effects
The term ‘decoy effects’ has been used to describe a set of slightly different effects, including asymmetric dominance effects, attraction effects, compromise effects and phantom decoy effects. Asymmetric dominance effects were first described by Huber et al. (
1982), who found that in a binary choice task, adding a third alternative (i.e. decoy) that is dominated by one alternative but not the other can shift individuals’ preferences towards the alternative that dominates the decoy. An attraction effect (Huber and Puto
1983) arises when the decoy is ‘nearly dominated’ rather than ‘fully dominated’ by one alternative in the choice set but not the other, i.e. if it is outperformed by one alternative on all its characteristics except one, where it only has a small advantage for the latter. A further possibility is that of a ‘phantom decoy’ effect (Pratkanis and Farquhar
1992), where the decoy can be ‘seen’ but is unavailable for choice. Finally, in a compromise setting, the decoy is not dominating or dominated by any alternative, but has a combination of small advantages and disadvantages in relation to the other alternatives. Such compromise alternatives can have increased probability of being chosen when individuals are averse to extreme outcomes.
Decoy effects in discrete choice modelling have been studied by Guevara and Fukushi (
2016) and Rooderkerk et al. (
2011), as well as by Chorus and Bierlaire (
2013) in the context of compromise effects. The presence of decoy alternatives will lead to changes in the relative probabilities of other alternatives and, with the exception of the phantom decoy which cannot be chosen, their inclusion in the choice set has the potential to lead to an increase in the probability of one or more alternatives; this breaches regularity and makes such effects inconsistent with RUM.
Context effects cover a broader range of issues that relate to the fact that the relative choice probabilities across alternatives may differ depending on the presence or absence in the choice set of other alternatives. They cover attraction, compromise and similarity effects, some of which can also be classified under the decoy points above. Similarity effects are at the heart of the development of nested logit structures in choice modelling. If the effect is captured purely through the error structure of the model, and if specific conditions on the nesting structure are satisfied (Batley and Hess
2016), then the model remains consistent with utility maximisation.
Problems arise when the cross-substitution effects are captured through the observed component of utility, since the size and sign of associated coefficients can lead to preference reversals. Examples in the mainstream choice modelling literature include models used for route choice behaviour, where the impact of the overlap of different routes is captured in the observed utility component. Two popular examples are the C-Logit model developed by Cascetta et al. (
1996) and the path-size approach of Ben-Akiva and Bierlaire (
1999a). Both approaches include in the utility function of alternative \(i\) a measure of the similarity/overlap with other alternatives (\(j\ne i\)) and thus open up the possibility of preference reversals, as this component depends on the attributes of other alternatives in the choice set (again in the manner of Mother Logit) and on changes in the composition of the choice set.
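As an illustration of how such overlap terms enter the utility, a basic path-size measure in the spirit of Ben-Akiva and Bierlaire (1999a) can be sketched. This simplified form is our own reading for illustration, not their exact specification:

```python
def path_size(routes, link_length):
    """Simplified path-size measure for each route (illustrative sketch).

    routes: list of routes, each a set of link ids
    link_length: dict mapping link id to its length
    For route i with total length L_i, PS_i sums, over its links a,
    (l_a / L_i) * 1 / N_a, where N_a counts the routes in the choice set
    that use link a. A term in ln(PS_i) is then added to the utility of
    route i, so the utility depends on the other routes in the set.
    """
    use_count = {}
    for r in routes:
        for a in r:
            use_count[a] = use_count.get(a, 0) + 1
    ps = []
    for r in routes:
        total_length = sum(link_length[a] for a in r)
        ps.append(sum(link_length[a] / total_length / use_count[a] for a in r))
    return ps
```

Fully distinct routes have PS = 1 (so the correction vanishes), while overlap pushes PS below 1; because PS changes with the choice set, preference reversals become possible.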
Framing effects refer to the phenomenon that individuals’ judgements and decisions could be affected by changes to the descriptions of the same piece of information. Framing effects violate the normative principle of description invariance (Tversky and Kahneman
1981), but do not affect consistency with utility maximisation.
Regret
Loomes and Sugden (
1982) put forward the notion that an individual’s utility is not only derived from the chosen alternative but also from the regret or the ‘rejoicing’ generated from the differences between the chosen alternative and the alternative he/she forgoes.
Regret has received widespread attention in choice modelling in recent years, with the development of successive versions of a Random Regret Minimisation (RRM) framework (cf. Chorus
2010).
In the most widely used implementation (Chorus
2010), the regret associated with alternative
i in choice task
t for agent
n is instead obtained as:
$$\begin{aligned} R_\mathrm{int} =\sum \nolimits _{j\ne i} \sum \nolimits _k \ln \left( 1+\exp \left[ \beta _k \cdot \left( x_\mathrm{jntk} -x_\mathrm{intk} \right) \right] \right) \end{aligned}$$
(5)
where
k is an index of attributes. With the assumption of a type I extreme value error and the notion of regret minimisation rather than maximisation, we then have (with either (
2) or (
3)) that:
$$\begin{aligned} P_\mathrm{int} =\frac{\exp \left( -R_\mathrm{int} \right) }{\sum \nolimits _{j=1}^{J} \exp \left( -R_\mathrm{jnt} \right) }. \end{aligned}$$
(6)
It can clearly be seen from (
5) that the RRM model is in fact a specific version of a Mother Logit model, with the utility of an alternative depending on the attributes of other alternatives, where \(g_\mathrm{int} =-\sum \nolimits _{j\ne i} \sum \nolimits _k \ln \left( 1+\exp \left[ \beta _k \cdot \left( x_\mathrm{jntk} -x_\mathrm{intk} \right) \right] \right) \). RRM is thus not a novel type of model but remains a Logit model, albeit one that, like most Mother Logit specifications, is not consistent with utility maximisation. While this lack of consistency has been acknowledged by authors using RRM, and indeed seen as an advantage, this link with Mother Logit has not previously been made to the best of our knowledge. A special case arises when
\(J=2\), where RRM is formally equivalent to a RUM-consistent Logit model with (
5) (cf. Chorus
2010). With RRM, it is easy to see how the inclusion of an additional alternative can increase the choice probability of one or more of the alternatives already in the choice set, i.e. the model does not exhibit regularity, given that the regret needs to be recalculated for all alternatives in the choice set.
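Equations (5) and (6) can be sketched as follows; the attribute values and coefficients passed to the function are illustrative inputs, not estimates from any study:

```python
import math

def rrm_probs(X, beta):
    """Choice probabilities under the RRM form of eqs. (5)-(6).

    X: list of alternatives, each a list of attribute values x_intk
    beta: list of attribute coefficients beta_k
    """
    def regret(i):
        # Eq. (5): sum over all other alternatives j and attributes k
        return sum(
            math.log(1.0 + math.exp(b * (xj - xi)))
            for j, alt in enumerate(X) if j != i
            for xi, xj, b in zip(X[i], alt, beta)
        )
    # Eq. (6): Logit over negative regrets (regret minimisation)
    expneg = [math.exp(-regret(i)) for i in range(len(X))]
    total = sum(expneg)
    return [e / total for e in expneg]
```

Because every regret term compares pairs of alternatives, adding an alternative to the choice set forces all regrets to be recalculated, which is how regularity can be breached.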
Complexity, simplification of choice tasks and heuristics
A number of authors have addressed the issue of choice complexity, especially in the context of stated choice surveys (e.g. Rose et al.
2008). These papers have looked at the impact that the composition of the choice environment, in terms of number of alternatives, attributes and attribute levels, has on the level of noise in the data (i.e. model scale) as well as substantive outputs (e.g. willingness-to-pay measures). At the same time, there is a growing literature in choice modelling looking at how individual decision-makers process the information describing the choices they face and what heuristics they may use (e.g. potential attribute ‘non-attendance’). Other work has looked at the role of choice set generation, where individuals may look at only a subset of the available alternatives (e.g. Manzini and Mariotti
2014).
The majority of the above work has been conducted with the use of random utility models. The focus has generally been on behaviour within a given context and by a given person, e.g. making the heuristic specific to a given individual. However, if one makes the link between the literature on choice task complexity and the literature on choice process, then it is clear that the presence of such effects may in fact lead to violations of key principles of utility maximisation. As an example, if the inclusion of additional alternatives into a choice set changes the way in which respondents make their choice, i.e. leading to the application of a
different RUM, and if this effect differs across alternatives (due to differing attribute values), then the potential for preference reversals clearly exists, as the utility functions become dependent on attributes of other alternatives. On the other hand, it is also worth noting the existence of work looking at the role of inattention (which can link to complexity) and incorporating this in an Additive Random Utility Model (ARUM) context (cf. Matejka and McKay
2014).