Introduction

According to dual-system theories of instrumental action control, everyday decision making is determined by the balance between parallel goal-directed and habitual systems in which associations build up concurrently. This notion has been captured by the associative-cybernetic model (de Wit & Dickinson, 2009) and by computational models (Daw, Gershman, Seymour, Dayan, & Dolan, 2011; Daw, Niv, & Dayan, 2005). When the goal-directed system is engaged, actions are selected on the basis of knowledge of the causal relationship between actions and their consequences (cognitive criterion) and on the basis of the current desirability of the anticipated outcomes (motivational criterion). In contrast, when the habitual system exerts dominant control, behavior is based on inflexible stimulus–response (S–R) associations that are “stamped in” through behavioral repetition (Heyes & Dickinson, 1990).

William James wrote already in 1890: “Could the young but realize how soon they will become mere walking bundles of habits, they would give more heed to their conduct while in the plastic state.” This ominous warning to the young has since received indirect support from studies showing that aging affects controlled, effortful, conscious processing (Band, Ridderinkhof, & Segalowitz, 2002; Span, Ridderinkhof, & van der Molen, 2004), while there is some evidence that aging may spare the acquisition and performance of simple, automatic behaviors (e.g., Kester, Benjamin, Castel, & Craik, 2002) and procedural memories (Smith et al., 2005). These findings would be in line with observations that skills, such as typing and playing the piano, can be maintained into advanced age at a high level if practiced on a regular basis (Krampe & Ericsson, 1996; Salthouse, 1984). Related findings from the computational modeling literature also suggest relatively impaired goal-directed action control. Specifically, Eppinger, Walter, Heekeren, and Li (2013) tested younger and older participants on a two-stage Markov decision task and provided evidence that aging is associated with impaired model-based, as opposed to model-free, decision making (but see Worthy, Gorlick, Pacheco, Schnyer, & Maddox, 2011)—two computational strategies that have been proposed to underlie goal-directed versus habitual action control, respectively (Daw et al., 2011; Daw et al., 2005; Dezfouli & Balleine, 2013; Dolan & Dayan, 2013). However, the literature is mixed, since some authors have claimed to find evidence for impaired skill learning (but see Ghisletta, Kennedy, Rodrigue, Lindenberger, & Raz, 2010; Kennedy, Partridge, & Raz, 2008).

Dissociating the contributions of goal-directed and habitual mechanisms to instrumental learning is challenging but was recently accomplished with an experimental paradigm that evokes outcome-induced response conflict in the goal-directed system (de Wit, Niry, Wariyar, Aitken, & Dickinson, 2007). In the present study, we used this paradigm to further investigate the dual-system balance in action control in young versus older adults (see Fig. 1). In the first stage of this computerized task (Fig. 1a, b), pictures of fruit on the computer screen function both as discriminative stimuli (S) that signal that certain instrumental response (R)–outcome (O) contingencies are in effect and as the outcomes (S:R→O). For example, grapes as the discriminative stimulus signal that pressing right leads to the cherries outcome, while an apple signals that pressing left leads to the melon outcome. Participants are instructed to learn by trial and error which key to press to earn the different outcomes that are worth credits for a financial bonus at the end of the game. Performance on this standard discrimination (grapes:right→cherries; apple:left→melon) can be supported both by the habit (S–R) system (grapes–right and apple–left) and by the goal-directed (S–O–R) system that does encode and evaluate the outcomes (grapes–cherries–right; apple–melon–left). However, simply inspecting the level of performance on this discrimination does not inform us about the relative contributions of those systems. For this reason, our instrumental task includes another, incongruent discrimination that can be supported only by the habit system (de Wit & Dickinson, 2009; but for an alternative, propositional account, see de Wit, Ridderinkhof, Fletcher, & Dickinson, 2013). This was achieved by introducing response conflict by having the same events function both as stimuli and as outcomes for opposite responses (for example, orange:right→pineapple; pineapple:left→orange).
These incongruent contingencies should cause the encoding of the outcome of each action to interfere with learning of the appropriate S–R relationships. The incongruent discrimination should, therefore, be hard to solve in a goal-directed manner, and we should expect participants to be impaired in learning this discrimination, relative to the standard one. This prediction about a congruence effect has been confirmed in previous studies (de Wit, Corlett, Aitken, Dickinson, & Fletcher, 2009; de Wit et al., 2007; de Wit, Standing, et al., 2012; de Wit, Watson, et al., 2012; Sjoerds et al., 2013). Therefore, if healthy aging is indeed characterized by a relative impairment in goal-directed control, we should expect the congruence effect to be attenuated in older relative to young participants. Finally, our instrumental task also includes a third, congruent discrimination in which the stimuli and outcomes are the same for each response (e.g., bananas:right→bananas; pear:left→pear). In this discrimination, the stimulus essentially acts as a reminder of the available outcome, and as a consequence, it should be relatively easy, for both young and older adults, to learn this discrimination.
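The three discrimination types can be written down as a small lookup table. The sketch below is purely illustrative: it uses the example fruits named in the text, and the names `CONTINGENCIES` and `has_response_conflict` are hypothetical and not taken from the task software.

```python
# Illustrative S:R->O contingencies, using the example fruits from the text.
# Each entry maps a discriminative stimulus to (correct response, outcome).
CONTINGENCIES = {
    "standard": {"grapes": ("right", "cherries"), "apple": ("left", "melon")},
    "congruent": {"bananas": ("right", "bananas"), "pear": ("left", "pear")},
    "incongruent": {"orange": ("right", "pineapple"), "pineapple": ("left", "orange")},
}


def has_response_conflict(pairs):
    """True if a fruit is the stimulus for one response and the outcome of the other."""
    # Which response earns each outcome within this discrimination.
    response_for_outcome = {outcome: resp for resp, outcome in pairs.values()}
    return any(
        stim in response_for_outcome and response_for_outcome[stim] != resp
        for stim, (resp, _outcome) in pairs.items()
    )


for name, pairs in CONTINGENCIES.items():
    print(name, has_response_conflict(pairs))
```

Only the incongruent discrimination produces a conflict under this check, which is the property that should make it hard to solve by encoding outcomes.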

Fig. 1

Illustration of the instrumental learning task. a In the first stage of the task, participants were trained concurrently on a standard, congruent, and incongruent discrimination. b In this example of a training trial, grapes on the outside of a closed box signaled that pressing right (R+), but not pressing left (L−), would be rewarded with cherries on the inside. c In this example of a trial of the outcome-devaluation test, cherries are devalued, while melon is still worth points. The correct response would be to press the left key. d At the start of each block of the slips-of-action test, all six fruit outcomes were shown on the inside of the boxes, with a cross superimposed on two of these to signal that these were now devalued and would lead to deduction of points. During the test trials, participants were shown the boxes with the discriminative stimuli on the outside, and they were instructed to continue to press the appropriate keys for the boxes that contained the still-valuable outcomes but to refrain from responding for the now-devalued outcomes

Importantly, so far, it has not been investigated whether healthy older adults show the behavioral autonomy that characterizes inflexible S–R habits. In other words, does the behavior persist even when the consequences are no longer in line with current motivation (Tricomi, Balleine, & O'Doherty, 2009)? To address this important, outstanding issue, we conducted two critical tests: the (instructed) outcome-devaluation test and the slips-of-action test.

The outcome-devaluation test was conducted to assess response–outcome (R–O) knowledge. On each trial, one of the fruit outcomes was no longer worth credits, and participants had to use their R–O knowledge, which they had acquired during training, to select the opposite response for a still-valuable outcome. Therefore, goal-directed action control is reflected in the ability to direct one’s actions toward still-valuable outcomes, as opposed to no-longer-valuable outcomes. In the example in Fig. 1c, the cherries were devalued, and participants should therefore press the left key for the still-valuable melon outcome. The second test, the slips-of-action test, was designed to assess more directly the balance between goal-directed and habitual control. Once again, some of the outcomes were no longer valuable (and in fact, even led to the subtraction of financial credits), but this time participants were shown the discriminative stimuli from the original learning stage and were asked to selectively respond to those stimuli that signaled the availability of still-valuable outcomes. As illustrated in Fig. 1d, participants should press the appropriate key when shown a box with the apple on the outside (go trial), but they should refrain from pressing a key when shown a box with grapes on the outside (no-go trial) because they previously learned to expect the now-devalued cherries inside the latter box. If older participants were relatively strongly reliant on S–R associations, they should be particularly prone to slips of action, reflected in a failure to withhold responses to stimuli that signaled no-longer-valuable outcomes. We expected older adults to be impaired on both tests.
Furthermore, these tests serve to confirm our assumption that goal-directed control is generally impaired on incongruent trials, relative to standard (as in previous studies: de Wit, Barker, Dickinson, & Cools, 2011; de Wit et al., 2009; de Wit et al., 2007; de Wit, Watson, et al., 2012; Gillan et al., 2011; Sjoerds et al., 2013).

To summarize, the present study was conducted to assess the relative contribution of goal-directed versus habitual systems to action control in healthy aging. To this end, we trained and tested young and older participants on an instrumental learning paradigm. Furthermore, we controlled for differences in intelligence and working memory.

Method

Participants

Twenty-four young control volunteers were recruited through local advertisement in the University of Amsterdam. These students participated either for study credits or for 24.50 Euros. Two participants were excluded on the basis of drug use, 1 because of dyscalculia and 1 because he was not a native speaker. The remaining 20 young participants were between the ages of 17 and 24 years (11 male; mean age = 20, SEM = 0.5).

Twenty older adults between 69 and 84 years of age (9 male; mean age = 75, SEM = 1.1) were paid 30 Euros for their participation. They were recruited via a database that contains contact details of older adults who are willing to participate in psychological research at the University of Amsterdam (Seniorlab.nl).

Older participants completed the Cognitive Screening Test for dementia (CST-20), a Dutch 20-item questionnaire to test memory and orientation (Deelman, Maring, & Otten, 1989). All participants obtained a score above the threshold score for cognitive decline (all scores ≥ 16).

We also conducted an extensive interview about current and previous neurological and psychological illnesses and current use of medication. None of the older participants reported a significant neurological history. All participants evaluated their current happiness with a 7 or higher on a 10-point scale (M = 7.81, SD = 0.87), and it is therefore unlikely that participants suffered from untreated depression. They also rated their physical health with a 7 or higher (M = 8.00, SD = 0.95) and physical fitness with a score of 6 or higher (M = 7.95, SD = 0.97).

Education levels of the older adults were lower and more variable than those of the younger adults (university students who had all completed 6 years of secondary school). All older adults completed secondary school; 3 completed additional applied education, and 8 completed higher education. Their average duration of schooling was 6.8 years (SEM = 0.63). All older adults reported their income as average/above average.

According to self-report, all participants had normal or corrected-to-normal vision. If needed, participants wore their glasses during the task.

All participants were tested at the University of Amsterdam. They all gave written consent. The study was approved by the local ethics committee and was conducted in accord with relevant laws and institutional guidelines.

Study design

The (~30-min) computerized instrumental learning task was very similar to versions adopted in previous studies (de Wit et al., 2011; de Wit et al., 2009; de Wit et al., 2007; de Wit, Standing, et al., 2012; Gillan et al., 2011) but will be briefly described below (see Fig. 1 for a schematic illustration). In addition to this task, all volunteers received several background neuropsychological tests. We used the NLV, or “Nederlandse Leestest voor Volwassenen” (Schmand, Lindeboom, & van Harskamp, 1992), which is the Dutch equivalent of the National Adult Reading Test, and Raven’s Progressive Matrices (Raven, Raven, & Court, 1988) to assess IQ. The operation span (O-span) task was used to assess working memory (Turner & Engle, 1989).

Instrumental discrimination training and outcome-devaluation test

During training, participants were shown boxes bearing a fruit icon (on the outside) and were required to use this information to select the left or right keypress. A correct response led to a 1-s presentation of another fruit icon (inside the box) and credits. Incorrect responses led to an empty box on the screen and zero credits (see Fig. 1b). On the basis of this feedback, participants could learn by trial and error which key to press for each stimulus. They were instructed to try to gain as many fruit icons and credits as possible by responding as quickly and accurately as they could (fast, correct responses earned more credits). For each credit, they received 1 euro-cent.

Three biconditional discriminations were trained together: cue–outcome incongruent, cue–outcome congruent, and standard (see Fig. 1a). Performing the correct response to a fruit (discriminative) stimulus yielded the same fruit icon as the outcome in the congruent discrimination but the opposite fruit icon in the incongruent discrimination. In the standard discrimination, two fruit icons acted as the stimuli and the other two as the outcomes, with the assignment of S–O pairs remaining constant across training (for a detailed description, see, e.g., de Wit et al., 2009; de Wit et al., 2007). Discrimination training consisted of six 12-trial blocks. Within each block, there were 2 trials with each of the six stimuli, which were presented in a random order that varied across participants and with an intertrial interval of 1.5 s.

Following discrimination training, the outcome-devaluation test was conducted to test whether participants had learned the R–O contingencies (see Fig. 1c). On each trial, participants were shown two open boxes with fruit outcomes inside. A cross was superimposed on one of these, signaling that it was no longer worth credits. Participants were asked to select the key that had previously earned the still-valuable fruit. This test was conducted in nominal extinction, which means that they were no longer given feedback on their performance. The outcome-devaluation test consisted of 4 trials from each of the three discriminations, two with one of the outcomes devalued and two with the other outcome devalued. These 12 trials were presented in a different random order for each participant.
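The logic of a single outcome-devaluation trial can be sketched as follows. This is an illustration under stated assumptions, not the task code: the R–O pairs are the standard-discrimination example from Fig. 1, and `RESPONSE_FOR_OUTCOME`, `correct_key`, and `percent_correct` are hypothetical names.

```python
# Illustrative R-O pairs for the standard discrimination (example fruits from Fig. 1).
RESPONSE_FOR_OUTCOME = {"cherries": "right", "melon": "left"}


def correct_key(shown_outcomes, devalued):
    """Return the key that previously earned the still-valuable outcome."""
    valuable = [o for o in shown_outcomes if o != devalued]
    if len(valuable) != 1:
        raise ValueError("expected exactly one still-valuable outcome")
    return RESPONSE_FOR_OUTCOME[valuable[0]]


def percent_correct(responses, answers):
    """Accuracy as correct responses / total * 100."""
    hits = sum(r == a for r, a in zip(responses, answers))
    return 100.0 * hits / len(answers)


# Fig. 1c example: cherries are devalued, so the melon key (left) is correct.
print(correct_key(("cherries", "melon"), devalued="cherries"))
```

Because each trial offers two keys, guessing yields 50 % correct on this measure.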

Prior to training, participants received a short task demonstration (of the training as well as the outcome-devaluation test) with pictures of drinks instead of fruits, to ensure that the instructions were clear.

Slips-of-action test

This test was designed to assess directly the balance between habitual and goal-directed control (see Fig. 1d). Once again, prior to this stage of the task, participants received a short demonstration with drinks instead of fruits.

At the start of each of the six test blocks, all six fruit outcomes inside the boxes were shown on the screen, but a red cross was superimposed on two of these to indicate that these would now lead to subtraction of points. The experimenter emphasized that if participants saw a box during the test in which they suspected to find one of the two devalued fruits, they should not press a key but wait for the next box to appear. Conversely, they could still earn credits by pressing the appropriate keys in order to open boxes that contained still-valuable fruit outcomes. When necessary, the experimenter repeated the instructions until the participants clearly understood them. Subsequently, a series of closed boxes with the fruit stimuli on the front was shown in rapid succession. Each box was shown for 1 s, and the intertrial interval was also fixed at 1 s. Each of the six stimuli was shown four times per block, and across blocks, each of the outcomes was devalued twice.
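The block structure described above implies some simple counts, which the following sketch makes explicit. The pairing rule is an assumption made only for illustration (outcomes are shuffled and each block devalues two neighbours on a ring); how outcomes were actually paired across blocks is not stated here. The go/no-go counts assume that each outcome is signaled by exactly one stimulus.

```python
import random

OUTCOMES = ["cherries", "melon", "bananas", "pear", "pineapple", "orange"]
PRESENTATIONS_PER_STIMULUS = 4  # each of the six stimuli is shown four times per block
DEVALUED_PER_BLOCK = 2


def devaluation_schedule(outcomes, rng):
    """Pick two devalued outcomes per block so that each outcome is devalued twice.

    Assumed rule (hypothetical): shuffle the outcomes, then let block i devalue
    ring positions i and i + 1.
    """
    ring = list(outcomes)
    rng.shuffle(ring)
    n = len(ring)
    return [(ring[i], ring[(i + 1) % n]) for i in range(n)]


schedule = devaluation_schedule(OUTCOMES, random.Random(0))
trials_per_block = len(OUTCOMES) * PRESENTATIONS_PER_STIMULUS
no_go_per_block = DEVALUED_PER_BLOCK * PRESENTATIONS_PER_STIMULUS
go_per_block = trials_per_block - no_go_per_block

print(len(schedule), trials_per_block, go_per_block, no_go_per_block)
```

With six outcomes the ring rule yields exactly six blocks, matching the number of test blocks, with 24 trials per block.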

Strong response activation via direct S–R associations should lead to commission errors on trials with the devalued outcomes. Conversely, successful selective inhibition on the basis of outcome value should be indicative of dominant goal-directed control that is mediated by anticipation and evaluation of the consequent outcome (for a more detailed rationale and description of the slips-of-action test, see, e.g., de Wit, Standing, et al., 2012; de Wit, Watson, et al., 2012; Gillan et al., 2011).

Operation span test of working memory

The O-span task (Turner & Engle, 1989) was administered to control for potential working memory impairments in healthy aging (Span et al., 2004). During this task, participants viewed combinations of mathematical operations and words. After each set of two to six combinations of operations and words (three sets of each length), participants were asked to write down the words that they still remembered in the correct order. We employed the partial-credit unit scoring method to calculate the percentage of words per series that were remembered in the correct position and averaged this to calculate a total score as an indicator of working memory span.
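A minimal sketch of partial-credit unit scoring, as described above, is given here. The function name and the example words are hypothetical; the sketch assumes that a word counts only when it is written in the same serial position in which it was presented.

```python
def partial_credit_unit_score(presented_sets, recalled_sets):
    """Mean, across sets, of the proportion of words recalled in the correct position."""
    proportions = []
    for presented, recalled in zip(presented_sets, recalled_sets):
        hits = sum(
            1
            for pos, word in enumerate(presented)
            if pos < len(recalled) and recalled[pos] == word
        )
        proportions.append(hits / len(presented))
    return sum(proportions) / len(proportions)


# Made-up example: a perfect two-word set and a four-word set with one word in place.
presented = [["cat", "dog"], ["sun", "map", "key", "jar"]]
recalled = [["cat", "dog"], ["sun", "key", "map"]]
print(partial_credit_unit_score(presented, recalled))
```

Averaging proportions rather than raw word counts gives short and long sets equal weight in the total score.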

Statistical analysis

Statistical analysis was performed using SPSS 15.0. We employed mixed-design analyses of variance (ANOVAs), complemented with two-tailed t-tests, to investigate whether older and young participants differed in accuracy (percentages of correct responses per block of training) and reaction times (RTs; in seconds) during discrimination training and in accuracy (correct responses/total * 100) during the outcome-devaluation test. To assess performance on the slips-of-action test, we calculated percentages of responses made toward valuable versus devalued outcomes. All p-values involving repeated measures factors are based on Greenhouse–Geisser sphericity corrections, and all significant (p < .05) effects involving the factors of interest (age, young vs. older; discrimination type, congruent, standard, or incongruent) are reported. Post hoc pairwise comparisons are Bonferroni-corrected.
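For readers unfamiliar with the Bonferroni correction, the following sketch shows its usual form for reported p-values. It assumes the common convention of multiplying each uncorrected p-value by the number of comparisons and capping the result at 1; the function name and input values are made up for illustration and are not data from this study.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Three pairwise comparisons among three discrimination types (made-up p-values).
print(bonferroni_adjust([0.01, 0.02, 0.5]))
```

Under this convention a corrected value can equal exactly 1.0, which is how a large uncorrected p-value appears after adjustment.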

Results

Instrumental discrimination training

As can be seen in Fig. 2, we replicated the earlier observed congruence effect in the younger participants. In contrast, not only did older participants show a general performance decrement, but more interestingly, their pattern of performance across the discrimination types looked qualitatively different. These observations were confirmed by statistical analysis. An ANOVA with age as a between-subjects factor and discrimination and block as within-subjects factors yielded a significant main effect of age, F(1, 38) = 13.3, MSE = 2,995, p = .001. A three-way age × discrimination × block interaction, F(10, 380) = 2.43, MSE = 318.1, p < .05, was further investigated with separate analyses of the two age groups. In young participants, the discrimination × block interaction failed to reach significance, F < 1, but there was a highly significant main effect of discrimination, F(2, 38) = 11.1, MSE = 527.5, p < .0005. Post hoc pairwise comparisons show that their performance on standard and congruent trials was superior to that on incongruent trials, ps = .02 and .001, respectively (with average percentages correct of 88 %, 81 %, and 74 % for the congruent, standard, and incongruent discriminations, respectively).

Fig. 2

Results from the instrumental discrimination training. These graphs show the percentage of correct responses (number of responses signaled to be correct by the discriminative stimuli/total number of responses × 100) across the six blocks of training for the three discrimination types (standard, congruent, and incongruent), separately for the young participants (left panel) and older participants (right panel). The error bars represent SEMs

Analysis of the older participants did yield a discrimination × block interaction, F(10, 190) = 2.51, MSE = 407.4, p < .05, but separate block analyses revealed significant effects only for blocks 3 and 6, Fs(2, 38) = 4.89 and 10.2, MSEs = 462.2 and 477.5, ps < .05 and .0005, which reflected a congruent advantage relative to standard and to incongruent performance (on block 3, p < .05, and block 6, p < .0005, respectively). Importantly, in line with a goal-directed impairment, standard performance of older participants was statistically indistinguishable from incongruent performance. Finally, post hoc comparisons of overall performance on the three discrimination types in older adults showed that there was a significant difference only between congruent and standard performance (p < .05), while there was no difference between standard and incongruent (p = 1.0). Therefore, these results support our hypothesis of impaired goal-directed control in aging. It appears that, in contrast to their younger counterparts, older adults were not able to benefit from additional goal-directed control on standard trials, relative to the incongruent trials (with average percentages correct of 73 %, 61 %, and 63 % for the congruent, standard, and incongruent discriminations, respectively).

The effect of discrimination was not due to a speed–accuracy trade-off. RT analysis yielded a significant effect of discrimination, F(2, 76) = 3.95, MSE = .124, p < .05, but this was due to faster performance on the congruent, relative to the incongruent, trials, p < .05. A main effect of age indicated that older adults did perform more slowly overall than younger adults, F(1, 38) = 38.8, MSE = 1.532, p < .0005, with average RTs of 1.3 and 0.8 s, respectively. Finally, this analysis yielded a significant effect of block, F(5, 190) = 25.3, MSE = .236, p < .0005, reflecting gradual speeding of responding in the course of training.

The slower responding overall in the older adults may indicate that the time pressure rendered the task relatively challenging for them. To explore this issue, we performed an additional analysis in which we compared accuracy of responding during training in relatively slow versus fast responders among the older adults (using a median-split procedure based on the average RT during the acquisition stage). This analysis did not yield a main effect of speed of responding, F(1, 18) = 2.4, MSE = 3,191, p = .1, ηp² = .12, or an interaction with discrimination, F < 1, p = .8, ηp² = .014. Although the small sample size means that these results should be treated with caution, this exploratory analysis suggests that the impaired goal-directed control in older adults (as reflected in the absence of a congruence effect) was not simply the result of general slowing of responding.
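The partial eta squared effect size can be recovered from an F value and its degrees of freedom with the standard identity F × df_effect / (F × df_effect + df_error). The sketch below applies it to the reported main effect of speed of responding; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of speed of responding: F(1, 18) = 2.4.
print(round(partial_eta_squared(2.4, 1, 18), 2))
```

The result rounds to .12, in agreement with the effect size reported for that test.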

Outcome-devaluation test

As can be seen in Fig. 3, older adults were generally impaired on the outcome-devaluation test of R–O knowledge, F(1, 38) = 29.1, MSE = 623.6, p < .0005. This overall impairment is consistent with a goal-directed deficit.

Fig. 3

Results from the outcome-devaluation test. These graphs show the percentage of correct responses (responses toward valuable outcomes/total number of responses × 100) for the three discrimination types (standard, congruent, and incongruent), separately for the young participants (left panel) and older participants (right panel). The error bars represent SEMs. Asterisks indicate above 50 % chance level performance

In contrast to performance during the training phase, the pattern of test performance across discriminations looked very similar between young and older participants. There was a significant overall effect of discrimination, F(2, 76) = 12.9, MSE = 601.4, p < .0005, and pairwise comparisons revealed that the pattern was in the expected direction: Congruent performance was superior to standard performance, which, in turn, was superior to incongruent (ps ≤ .05).

Additional one-sample t-tests showed that older adults performed above chance only on congruent, t = 4.36, p < .0005, and standard, t = 2.77, p < .05, trials, but not incongruent trials, t = −.38. In contrast, young participants performed above chance on all trial types (congruent, t = 39, p < .0005; standard, t = 8.72, p < .0005; incongruent, t = 2.74, p = .01). Finally, analysis of the RTs just yielded a main effect of age, F(1, 38) = 12.2, MSE = 11.28, p = .001, indicating relatively slow performance, overall, of the older group.

Slips-of-action test

Figure 4 displays the percentages of responses that were made toward valuable and devalued outcomes. Older adults were markedly impaired on this task, as was reflected in both relatively low percentages of responses toward still-valuable outcomes and high percentages toward devalued outcomes. Statistical analysis revealed significant interactions between the factor devaluation and that of age, F(1, 38) = 30.8, MSE = 839.0, p < .0005, and that of discrimination, F(2, 76) = 22.13, MSE = 234.41, p < .0005. Separate analyses of still-valuable and devalued trials confirmed that the older participants made more omission errors than did young participants on the former, F(1, 38) = 74.3, MSE = 315.8, p < .0005, and more commission errors on the latter, F(1, 38) = 5.4, MSE = 1,028, p < .05.

Fig. 4

Results from the slips-of-action test. These graphs show the percentages of responses (number of responses/total number of trials × 100) toward still-valuable outcomes (filled bars) versus now-devalued outcomes (empty bars), for the three discrimination types (standard, congruent, and incongruent), separately for the young participants (left panel) and older participants (right panel). The error bars represent SEMs. Asterisks indicate significant differences between GO percentages for still-valuable versus now-devalued outcomes

Paired-samples t-tests showed that young participants were able to selectively withhold responses toward devalued outcomes on all trial types (congruent, t = 11.47, p < .0005; standard, t = 5.01, p < .0005; incongruent, t = 3.69, p < .005). In contrast, older participants were unable to do so on standard trials (t = 1.39). These trials are easiest to interpret in terms of competition between goal-directed and habitual systems, since these lack the S–O confound that is inherent to congruent and incongruent trials. When the need to accurately predict available outcomes was removed on congruent trials, such that older participants could base their responses on the identity of the discriminative stimuli, they were able to base responding on the current value, t = 5.18, p < .0005. Finally, on incongruent trials, S–R priming overrode outcome-based responding in the older participants, thereby giving rise to a significantly higher percentage of responses toward no-longer-valuable outcomes, relative to still-valuable outcomes, t = −3.14, p = .005.

In conclusion, older participants showed a deficit in responding on the basis of current outcome value. Therefore, these results are in line with those from the acquisition and outcome-devaluation test stages of the task, which also point toward a goal-directed deficit.

Controlling for individual differences in intelligence and working memory

Average levels of schooling were higher in the younger subjects (university students) than in the older subjects. However, level of schooling does not necessarily reflect cognitive abilities, and for the younger generation, it is far more common to study at universities. To check for group differences in intelligence, we investigated performance on tests of crystallized (NLV) and fluid (Raven) intelligence. We did not find evidence for a group difference in (crystallized) intelligence using the NLV, t = −1.06 (with average scores of 86 and 84 [SEMs = 10 and 4] for the older and younger groups, respectively). However, mean Raven scores did differ significantly between the younger and older group, suggesting that levels of (fluid) intelligence were higher in young participants, with average scores of 23 in the young (SEM = 0.37) and 20 in the older (SEM = 0.53) participants, t = 4.66, p < .0005.

To exclude the possibility that any group differences in task performance were due to differences in intelligence, individual Raven scores and years of schooling were entered into separate ANOVAs as a covariate. Importantly, these exploratory ANCOVA analyses did not yield any interactions with Raven and schooling, nor did inclusion of these factors affect our pattern of results. For completeness’ sake, we report the ANCOVA results with the Raven and schooling covariates, respectively, for the three main findings of the previous ANOVA analyses: (1) a three-way age × discrimination × block interaction during training, Fs(10, 370) = 2.3 and 2.8, MSEs = 315.7 and 314.7, ps < .05; (2) a main effect of age during the outcome-devaluation test, Fs(1, 37) = 11.6 and 26.0, MSEs = 596.1 and 615.9, ps < .005; (3) an age × devaluation interaction during the slips-of-action test, Fs(1, 37) = 16.5 and 29.9, MSEs = 854.7 and 857.8, ps < .0005.

As would be expected on the basis of the aging literature, older participants also performed significantly worse than the young participants on the O-span test of working memory, t = 3.37, p < .005, with average scores of .55 (SEM = .04) and .72 (SEM = .03), respectively. Inclusion of this factor as a covariate in an exploratory ANCOVA analysis did not affect the basic patterns of results during training, the outcome-devaluation test, and the slips-of-action test, with the following statistical results for the main findings of the previous ANOVAs: (1) a three-way age × discrimination × block interaction during training, F(10, 370) = 2.07, MSE = 310.9, p < .05; (2) a main effect of age during the outcome-devaluation test, F(1, 37) = 19.2, MSE = 632.9, p < .0005; and (3) a two-way age × devaluation interaction during the slips-of-action test, F(1, 37) = 15.8, MSE = 726.2, p < .0005.

Discussion

The present study provides evidence for a qualitative effect of aging on the balance between goal-directed and habitual action control during instrumental learning. Specifically, our results point to a reliance on inflexible, S–R habit learning, as opposed to goal-directed action that is motivated by anticipation of a currently desirable behavioral outcome.

The strongest evidence for this dual-system imbalance comes from the training phase. In line with previous studies into feedback-based learning (e.g., Schmitt-Eliassen, Ferstl, Wiesner, Deuschl, & Witt, 2007), the older participants showed a general performance decrement, as well as general slowing. More interestingly, older adults failed to show superior performance on standard trials, relative to incongruent trials, pointing to a failure to engage outcome-based goal-directed control. In contrast, young participants did show a congruence effect, which indicates that they were able to benefit from dual-system support on standard trials (de Wit et al., 2009; de Wit et al., 2007; de Wit, Standing, et al., 2012).

Our findings are in line with a previous study with this paradigm in which Parkinson disease patients were compared with age-matched healthy controls (de Wit et al., 2011). In that study, the average age was relatively high (63 years), and in line with the present study, we found convincing evidence only for a congruent (not for a standard) advantage over the incongruent discrimination during acquisition, across patients and healthy controls.

In light of the age-related impairment during the initial learning stage, we should treat the test results with caution. However, these do corroborate the notion of an imbalance in dual-system action control. In line with a goal-directed deficit, the older participants were generally impaired on the outcome-devaluation test of R–O knowledge. Surprisingly, however, they were able to perform above chance level on both standard and congruent test trials, and they showed superior R–O knowledge of these contingencies, relative to the incongruent contingencies. These test results raise the question of why the older participants failed to show a congruence effect during the learning phase. A possible answer is suggested by the pattern of performance on the slips-of-action test. In this test, action selection based on stimulus-elicited anticipation of still-valuable versus no-longer-valuable outcomes competes directly with S–R habits. Here, older adults were no longer able to perform above chance level on the standard trials. From their above-chance performance on congruent trials and below-chance performance on incongruent trials, it is clear that they based their responding on the identity of the stimuli, as opposed to outcomes. These results raise the possibility that the lack of a congruence effect during training is the result not only of impaired control by the goal-directed system, but also of relatively strong S–R habits. When these two are in direct competition (in the presence of the discriminative stimuli both during the slips-of-action test and during the learning phase), habitual control may completely bypass the goal-directed system. This speculative account should be further investigated in future studies. Also, it remains to be established how this instrumental dual-system balance is related to the balance between “explicit” versus “implicit” processes in category learning. It is interesting in this respect that Maddox, Pacheco, Reeves, Zhu, and Schnyer (2010) have provided evidence for age-related impairments in both processes.

An important issue that remains to be explored in the context of the dual-system balance in aging is that of speed or time pressure. In the present study, we opted for self-paced acquisition with time constraints for earning a certain number of financial credits, which we believe mimics most everyday situations in which decision making is self-paced but still inherently under a certain time pressure. However, it is possible that these time constraints posed a greater challenge for older, relative to younger, adults. This issue applies even more clearly to the slips-of-action test, during which we deliberately exerted time pressure in order to evoke stimulus-induced slips of action. Since the age-related impairment on this test was due to a combination of omission errors on valuable trials and commission errors on devalued trials, it seems unlikely that older adults were simply unable to prepare the motor response in time. It remains possible, however, that the time limit forced them to rely on the faster, more efficient habit system. Future studies could investigate this issue by assessing the effect of varying time limits during training and test (or indeed removing time constraints altogether) on the performance of older versus younger participants. In short, the role of “global slowing” in impaired goal-directed action in healthy aging remains to be uncovered in future research.

Impaired performance on the slips-of-action test could also be related to other cognitive functions that are known to be affected in healthy aging, most notably working memory (Braver & West, 2008). However, the observed goal-directed deficits on the outcome-devaluation test and the slips-of-action test were not related to performance on the O-span, a standard measure of working memory, suggesting that the underlying cause was not simply an inability to maintain goals in working memory (Frank & Claus, 2006). This finding accords with that of a recent study in which an age-related impairment in model-based decision making on the two-stage Markov decision task was likewise found to be unrelated to impairments in working memory (Eppinger et al., 2013). Another possible cause of the lack of selective response suppression in the slips-of-action test is a general impairment in response inhibition (Kramer, Humphrey, Larish, Logan, & Strayer, 1994; Potter & Grealy, 2008; Span et al., 2004). Importantly, in the present study, older participants were able to perform above chance level on congruent trials of the slips-of-action test, on which the stimuli served as an explicit reminder of the available outcome, but not on standard and incongruent trials that required active goal retrieval. Therefore, the present results support the notion of an age-related deficit in the ability to associatively retrieve and evaluate the motivational significance of the consequences of one’s actions, over and above a simple inhibitory deficit.

Goal-directed action control is essential in everyday life, and consequently, our results suggest that older adults face challenges when changes in the desirability of outcomes require immediate, flexible behavioral adjustment, in line with previous suggestions that reduced intentional processing in older people leads to failures to “break” automatic modes of behavior (Craik, 1986; Kester et al., 2002). Relatedly, we showed in the present study that providing an external reminder of the goal (in the congruent discrimination) helps older adults to achieve goal-directed action control. This finding is in line with goal-neglect theory, according to which aging affects internal goal activation and maintenance (de Jong, 2001). On the other hand, the ability to acquire automatic S–R habits appears to be relatively intact in healthy aging. While such habitual behavior is relatively inflexible, it is thought to offer the advantages of automatic skills (Seger & Spiering, 2011), allowing for fast selection of appropriate behaviors in stable contexts, while requiring relatively little cognitive effort. In line with the idea that reliance on habits may be an advantageous strategy in older adults, Einstein, McDaniel, Smith, and Shaw (1998) showed that extensive repetition of a prospective memory task significantly improves their performance. The transformation of everyday activities into habits can also serve to compensate for impaired memory of prior events; for example, routinely placing one’s medication or keys in a specific location may prevent the necessity of relying on an episodic memory of where one has previously left these by allowing one to rely, instead, on a simple S–R habit (McDaniel, Einstein, & Jacoby, 2008).

Our evidence for a dual-system imbalance in healthy aging converges with neuroscientific evidence for age-related deterioration of the prefrontal cortex (PFC; Grieve, Williams, Paul, Clark, & Gordon, 2007; Span et al., 2004) and, more specifically, the ventromedial aspect (vmPFC; including the orbitofrontal cortex) (Denburg et al., 2007; Resnick, Lamar, & Driscoll, 2007; Salat et al., 2005). Importantly, intact functioning of the latter region is thought to be a prerequisite for the ability to act in a goal-directed manner. Functional MRI studies have shown that this area is engaged during goal-directed action (Tricomi et al., 2009; Valentin, Dickinson, & O'Doherty, 2007), and animal studies have revealed that lesions to the functional homologue of the vmPFC in rodents, the prelimbic cortex, significantly impair goal-directed action (Balleine & O'Doherty, 2010). In line with the idea that age-related deterioration of the vmPFC underlies the absence of a congruence effect in older adults, we have previously shown that the vmPFC is preferentially engaged during the acquisition of standard discriminations, relative to incongruent discriminations (de Wit et al., 2009; Sjoerds et al., 2013). Furthermore, lesions to the prelimbic cortex in rats abolish the congruence effect in the rodent version of our paradigm (Dwyer, Dunn, Rhodes, & Killcross, 2010).

Clearly, the vmPFC does not operate in isolation. A recent diffusion tensor imaging study of white matter integrity provided evidence that goal-directed action is supported by a corticostriatal network that includes the caudate nucleus. Habit learning, on the other hand, appeared to be supported by a corticostriatal network that includes the motor cortex and the posterior putamen (de Wit, Watson, et al., 2012). Future structural MRI studies may investigate whether relatively strong degradation of the goal-directed pathway underlies the dual-system imbalance in healthy aging. Recent studies have already provided evidence that structural white matter integrity plays a role in related cognitive impairments in aging. For example, Samanez-Larkin, Levens, Perry, Dougherty, and Knutson (2012) presented evidence to suggest that declining white matter integrity of frontostriatal circuitry in aging underlies impaired reward learning (for a discussion, see also Eppinger et al., 2013). Furthermore, Daselaar et al. (2013) related decreased frontal and temporal white matter integrity in aging to executive functioning. Interestingly, they also provided evidence for compensatory, increased activity in these same areas. This complex relationship between corticostriatal integrity and function in aging remains to be explored in the context of habitual and goal-directed control.

Another potentially important factor to consider in the neural basis of dual-system balance is the role of dopamine. Prefrontal dopamine is thought to be crucially involved in goal-directed action (Cheer et al., 2007), and indeed, we found that dopamine depletion in healthy young (female) volunteers (through a dietary intervention) reduced goal-directed action control as measured by the outcome-devaluation test and slips-of-action test (de Wit, Standing, et al., 2012). Furthermore, Wunderlich, Smittenaar, and Dolan (2012) provided evidence that L-DOPA (a dopamine-enhancing drug) favors model-based over model-free behavior, and Chowdhury et al. (2013) found that impaired reward prediction error signaling in healthy aging (see also Nieuwenhuis et al., 2002) can be restored with L-DOPA. The present findings may therefore be related to the loss of prefrontal dopamine function in aging (Kaasinen et al., 2000). However, this picture is complicated by evidence that dopamine in a nigrostriatal pathway that involves the putamen is important for the reinforcement of S–R habits (Faure, Haberland, Conde, & El Massioui, 2005), together with evidence for a more general disruption of the dopamine system in aging, with receptors being lost not only in cortical regions, but also in the striatum (Backman et al., 2000; Volkow et al., 1998). Therefore, future research needs to further clarify the role of dopamine, as well as that of corticostriatal networks, in age-related impairments in goal-directed action.

In conclusion, we provide evidence for a shift in the dual-system balance away from flexible goal-directed control toward reliance on S–R habits in healthy aging. Our findings are, therefore, in line with a recent study that suggested that aging is associated with impaired model-based, as opposed to model-free, decision making (Eppinger et al., 2013). Whereas Eppinger and his colleagues used a two-stage Markov decision task to reach this conclusion, we present the first investigation of this issue with an instrumental discrimination learning task that introduces outcome-induced conflict when learned in a goal-directed manner. Our results suggest that older adults, as opposed to younger adults, did not recruit goal-directed support during acquisition of instrumental discriminations. Furthermore, during subsequent outcome-devaluation and slips-of-action tests, they exhibited a high degree of behavioral autonomy from current outcome value, which is indicative of reliance on S–R habits. These findings are in line with a “last-in-first-out” developmental hypothesis of goal-directed control, since previous studies have provided evidence that the ability to behave in a goal-directed manner develops later in early childhood than S–R habit learning (Klossek & Dickinson, 2011; Klossek, Russell, & Dickinson, 2008). We hope that this research will inspire further investigations of the important issue of flexible motivational modulation of action in aging. We recommend that future research aim to replicate our findings with more participants across different age groups, in order to allow for investigation of individual differences in cognitive function throughout older adulthood in relation to brain function, an approach that is increasingly being viewed as essential in the field of psychology of aging.