Actions can create preferences, increasing the value ascribed to commodities acquired at greater cost. This behavioural finding has been observed in a variety of species; however, the causal factors underlying the phenomenon are relatively unknown. We sought to develop a behavioural platform to examine the relationship between effort and reinforcer value in mice trained under demanding or lenient schedules of reinforcement to obtain food. In the initial experiment, expenditure of effort enhanced the value of the associated food via relatively lasting changes in its hedonic attributes, promoting an acquired preference for these reinforcers when tested outside of the training environment. Moreover, otherwise neutral cues associated with those reinforcers during training similarly acquired greater reinforcing value, as assessed under conditioned reinforcement. In a separate experiment, expenditure of effort was also capable of enhancing the value of less-preferred low-caloric reinforcers. Analysis of licking microstructure revealed the basis for this increased valuation was, in part, due to increased palatability of the associated reinforcer. This change in the hedonic taste properties of the food can not only serve as a basis for preference, but also guide decision-making and foraging behaviour by coordinating a potentially adaptive repertoire of incentive motivation, goal-directed action and consumption.
Most theories of reinforcement and decision-making posit that performance or demand for a particular commodity is inversely related to its expected cost [1–3]. Conversely, actions can create preferences, increasing the value ascribed to commodities acquired at greater cost/effort [4–6]. Shifts in preferences following greater effort are observed across a variety of species. For example, laboratory rats trained to pull a heavy (80 g) weighted harness down an alley to receive food will subsequently consume more food in a later test, compared with rats trained under light-weight (5 g) conditions. Similarly, when provided with a choice between cues previously associated with high or low effort, pigeons and starlings preferentially select visual cues associated with high effort [4,5]. When analogous experiments have been conducted with human participants, stimuli associated with greater effort are preferred over those that were easier to obtain [6,8].
According to cognitive dissonance theory, individuals assign greater value to high-effort outcomes in an attempt to reduce the ‘dissonance’ (conflict) that would otherwise arise from working hard to gain a goal. Alternative hypotheses propose that the hedonic state of the animal at the time of food delivery mediates the development of preferences, or that the increased utility of the outcome under high-effort conditions leads to the development of value preference. While these hypotheses attempt to account for the reported findings, the basis for this behavioural phenomenon is still relatively unexplored. Thus, we devised a setting to examine the relationship between effort and reinforcer value.
To this end, mice were trained in separate sessions with different levers, where each response (i.e. fixed-ratio 1; FR-1) led to the delivery of discriminably different cues and reinforcers (e.g. left lever, FR-1 → polycose + tone; right lever, FR-1 → sucrose + noise). For one of the levers, the effort required to obtain reinforcement was gradually increased, such that prior to test, 15 responses (i.e. FR-15) were required for reinforcer delivery (e.g. right lever, FR-15 → sucrose + noise), while the reinforcement schedule remained unchanged for the low-effort lever (e.g. left lever, FR-1 → polycose + tone). At test, mice were presented with free access to each reinforcer and the consumption during this period was calculated. We also tested whether the mice would learn to nose-poke to earn presentation of the cues (i.e. noise and tone) previously associated with different levels of effort (i.e. conditioned reinforcement). In a separate experiment, we assessed whether increases in effort were capable of enhancing the value of less-preferred low-caloric reinforcers, and whether a refined microstructural analysis of meal intake [11,12] might elucidate how the change in reinforcer value is mediated.
2. Material and methods
Behavioural testing was conducted in male C57BL/6J mice (n = 48), purchased from Jackson Laboratories. Mice were transferred to the Neurogenetics and Behaviour Center, Johns Hopkins University, at six to eight weeks of age, and housed in groups of three to four per cage under a 12 h light/dark cycle (lights on: 7.00–19.00). Mice were handled for one week prior to food deprivation, at which time they were restricted to a single daily meal for 3 days prior to behavioural testing, resulting in a reduction to 85 per cent of their ad libitum body weight. All handling and behavioural testing occurred within the light cycle between 9.00 and 17.00. Experiment 1 was conducted in a single study (n = 16) and experiment 2 was conducted in two replications (n = 32), under protocols approved by the Institutional Animal Care and Use Committee.
For both experiments, instrumental conditioning took place in eight identical chambers. Each chamber consisted of aluminium front and back walls, clear polycarbonate sides and ceiling, and the floor consisted of parallel, stainless-steel rods, all housed in sound-attenuating shells (Med Associates, St Albans, VT, USA). Chambers were modified to include a programmable food cup into which 50 µl of liquid reward could be delivered. Food cups were connected to vacuums, allowing for the removal of liquid when desired. Infrared photocells installed in the food cup monitored the time spent and number of entries into the cup. Within each chamber, retractable ultra-sensitive mouse levers (Med Associates) were available to the right and left sides of the food cup. Each chamber also contained a speaker that delivered a 3 kHz tone or white noise (amplitude set at ≈80 dB) mounted outside of the chamber on the opposite side from the food cup. Ambient light was supplied by a 28 V, 100 mA house light mounted inside the sound-attenuating shell.
For the test stage of experiment 1, individual homecages were set up to include two cubes, positioned at opposite sides of the homecage. Each cube was capable of holding 5 ml of liquid. For the conditioned reinforcement test session, two nose-poke devices were placed at the locations of the two levers. Each nose-poke device contained an illuminated yellow stimulus LED located at the rear of the recessed hole and a photo beam sensor to monitor nose-poke entries. In experiment 2, a dedicated single automated consummatory chamber was used that contained a lickometer, and used fibre optics to introduce a light beam through the fluid–air interface of a fluid bolus in the food cup. This system allowed for the detection and accurate time-stamping of licks, detected as disturbances in the amplified light surface within the interface when the fluid was contacted. The time-stamped data were available for analysis of licking microstructure using custom-made programmes. An IBM-compatible computer, with Med PC software, controlled the apparatus and recorded data.
(c) Behavioural procedures: experiment 1
Initially, all mice received a single 40 min food cup training session on each of two separate days. For half the mice, in session 1, deliveries of sucrose (50 µl of 10% w/v for 10 s) were available on a random time 30 s (RT-30) schedule, whereas 10 per cent w/v polycose was available in session 2. For the remaining mice, the order of the sessions was reversed. Following food cup training, mice received two sessions of instrumental conditioning each day, one on the left lever and a second on the right lever. The session commenced with the availability of the lever, to which a single response (i.e. fixed-ratio 1; FR-1) resulted in reinforcer delivery and the contemporaneous presentation of either a 3 kHz tone or white noise (amplitude set 5 dB above background; approx. 80 dB), followed by the immediate retraction of the lever for a 20 s period. During this time, the reinforcer was available, and the auditory cue was presented for a 10 s period. The session was completed following either 50 reinforcer deliveries or after 60 min had passed. Each lever resulted in a particular stimulus–outcome delivery. For example, if left-lever responses resulted in delivery of sucrose and presentation of white noise, then right-lever responses resulted in polycose delivery and tone presentation. The response–stimulus–outcome contingencies were fully counterbalanced across mice. The order of each instrumental training session was also reversed each day. Following three sessions of training, the maximum session duration was reduced to 45 min, and for one lever (high effort), the FR schedule was incrementally increased to FR-5 (for three sessions), FR-10 (for four sessions) and then FR-15 (for 10 sessions). The FR schedule for the alternate (low effort) lever remained unchanged. The assignment of high- and low-effort levers was fully counterbalanced across mice.
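The progression of the high-effort schedule can be laid out session by session. The following is a minimal sketch in Python, not the software used in the study: the function and variable names are hypothetical, the stage lengths and the cap of 50 reinforcers are taken from the procedure above, and sessions are assumed to be counted per lever.

```python
# Illustrative sketch only; names are hypothetical, numbers come from the text.
# Each stage is (fixed-ratio requirement, number of sessions at that ratio).
HIGH_EFFORT_STAGES = [(1, 3), (5, 3), (10, 4), (15, 10)]


def ratio_schedule(stages):
    """Expand (ratio, n_sessions) stages into one ratio per session."""
    return [ratio for ratio, n_sessions in stages for _ in range(n_sessions)]


def reinforcers_earned(n_responses, ratio, max_reinforcers=50):
    """Reinforcers delivered for n_responses lever presses under a fixed
    ratio, capped at the per-session maximum."""
    return min(n_responses // ratio, max_reinforcers)


high_effort = ratio_schedule(HIGH_EFFORT_STAGES)
low_effort = [1] * len(high_effort)  # low-effort lever stays at FR-1

print(len(high_effort), high_effort[-1])  # sessions per lever, final ratio
print(50 * high_effort[-1])               # presses needed for 50 reinforcers at FR-15
```

Under these assumptions, a mouse must make 750 presses to earn all 50 reinforcers in an FR-15 session, against 50 presses on the FR-1 lever.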
Following training, a 30 min reinforcer choice test was administered. Each mouse was placed in an individual homecage, where 5 ml of each reward was located in separate drinking cubes. The position of sucrose and polycose was counterbalanced with respect to their location in the homecage. Once completed, any remaining reinforcer was measured and the experimenter calculated the total consumption of each reinforcer. Finally, the ability of the high- and low-effort paired cues to serve as conditioned reinforcers for the acquisition of an instrumental nose-poke response was assessed in a single 40 min conditioned reinforcement test session. Nose-poke responses were monitored in the conditioning chamber, where for half the mice, each nose-poke to the left port resulted in the brief (3 s) presentation of the tone cue, and each right nose-poke response produced a 3 s noise presentation. For the remaining mice, the response–stimulus contingencies were reversed. The response–stimulus–outcome conditioning histories were also counterbalanced with respect to left or right nose-poke responses. Nose-pokes made during a cue presentation were recorded but had no programmed consequences.
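The nose-poke contingency in the conditioned reinforcement test can be sketched as a simple scoring routine. This is an illustrative Python sketch rather than the Med PC program used in the study; the function name, the data layout and the treatment of a poke made at the exact moment a cue ends are assumptions.

```python
# Illustrative sketch only; names and data layout are hypothetical.
CUE_DURATION_S = 3.0  # each effective nose-poke produces a 3 s cue


def score_session(pokes, ports=("left", "right")):
    """pokes: list of (time_s, port) tuples sorted by time.

    Every poke is recorded, but a poke made while a cue is still playing
    has no programmed consequence and earns no new cue.
    Returns (total pokes per port, cues earned per port).
    """
    totals = {port: 0 for port in ports}
    earned = {port: 0 for port in ports}
    cue_ends_at = float("-inf")
    for time_s, port in pokes:
        totals[port] += 1
        if time_s >= cue_ends_at:  # assumption: cue offset frees the ports
            earned[port] += 1
            cue_ends_at = time_s + CUE_DURATION_S
    return totals, earned


example = [(0.0, "left"), (1.0, "left"), (2.5, "right"), (3.5, "right"), (10.0, "left")]
totals, earned = score_session(example)
print(totals, earned)
```

In this example the pokes at 1.0 s and 2.5 s fall inside the first cue, so they are counted as responses but do not produce a cue.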
(d) Behavioural procedures: experiment 2
Mice were first assigned two reinforcers: a relatively high-caloric and low-caloric reinforcer. For half the mice, the high-caloric reinforcer was 6 per cent polycose and the low-caloric was 1 per cent sucrose. For the remaining mice, a 5 per cent sucrose solution served as the high-caloric reinforcer and a 2 per cent polycose served as the low-caloric reinforcer. These concentrations were chosen as pilot studies had identified that the high-caloric reinforcers (i.e. 5% sucrose and 6% polycose) were consumed at similar levels, as were the low-caloric reinforcers (i.e. 2% polycose and 1% sucrose). Importantly, these same studies revealed a significant preference and enhanced intake for the high-caloric reinforcers.
Initially, all mice received food cup training similar to the previous experiment, with the low- and high-caloric reinforcers delivered in separate sessions. Following food cup training, all mice received single daily 10 min baseline consumption sessions for 4 days in the automated consummatory chamber. During each consumption session, the photo-beam lickometer was used and the tested reinforcer was continuously available in the food cup. At the start of the session, 50 µl of the tested reinforcer was available in the food cup, and additional 50 µl deliveries occurred every 25 licks as mice consumed the liquid. We previously conducted an extensive set of parametric studies to validate the lickometer counts (A. W. Johnson 2007, unpublished data), using a comparison of lick count to slow-motion video of the mice's licking that showed accurate lick counting and the continuous availability of reinforcer using this 25 lick per reinforcer criterion. For the first session, half the mice consumed the high-caloric reinforcer, and the remainder consumed the low-caloric reinforcer. On the second day, the alternate reinforcer was consumed in the chamber. This order was then repeated on sessions 3 and 4, respectively.
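The lick-contingent delivery rule in the consumption sessions amounts to simple arithmetic. A minimal Python sketch, assuming one 50 µl bolus at session start and one further bolus for every 25 licks completed; the function names are hypothetical, and the result is the volume delivered, which is an upper bound on intake and not the intake measure itself.

```python
# Illustrative sketch only; names are hypothetical, values come from the text.
BOLUS_UL = 50          # volume of each delivery, in microlitres
LICKS_PER_BOLUS = 25   # licks required to trigger the next delivery


def boluses_delivered(n_licks):
    """One bolus is present at session start; one more per 25 licks."""
    return 1 + n_licks // LICKS_PER_BOLUS


def volume_delivered_ml(n_licks):
    """Total volume delivered to the food cup, in millilitres."""
    return boluses_delivered(n_licks) * BOLUS_UL / 1000


print(boluses_delivered(24), boluses_delivered(25))
print(volume_delivered_ml(100))
```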
Mice were then divided into two groups based on their mean level of consumption for each reinforcer. In one group, the high-effort lever was subsequently associated with the low-caloric reinforcer, while the low-effort lever was subsequently associated with the high-caloric reinforcer. For the second group, the response–outcome contingencies were reversed. Mice then received two instrumental training sessions each day, one on each lever. For half the mice in each group, a single response (FR-1) to the left lever resulted in delivery of the low-caloric reinforcer for a 10 s period and the retraction of the lever for 20 s, whereas right-lever responses (FR-1) led to high-caloric reinforcer delivery. The session was completed following either 25 reinforcer deliveries or after 45 min had passed. The order of the training sessions was reversed each day. Following four sessions of training, the FR schedule was increased to FR-5 (for three sessions); FR-10 (for three sessions) and FR-15 (for 10 sessions) on the high-effort lever. The low-effort lever remained at FR-1 for the duration of behavioural training (i.e. 20 sessions). On completion of training, mice received two single-access consumption test sessions (each on a separate day) in the automated consummatory chamber, administered separately for the high- and low-caloric reinforcers (counterbalanced across groups). Licking microstructure analysis was then conducted on intake data collected from the consummatory chamber.
(e) Licking microstructure analysis
Total intake (in ml) during the consumption of high- and low-caloric reinforcers was calculated. We also calculated the first minute licks for each reinforcer, as this measure reflects tastant palatability prior to the development of inhibitory feedback owing to the collection of fluid in the gastrointestinal tract. In addition, we examined the duration of discontinuous licking bursts, as this measure is also thought to reflect tastant palatability. We defined a termination in an otherwise steady stream of licking as any interlick interval (ILI) that exceeded 1 s. In addition, the number of ILIs that exceeded 1 s (burst number) was also calculated for each consumption session. This latter measure is thought to reflect post-oral ingestive factors that are known to influence tastant intake. We have recently conducted an extensive parametric study supporting the suitability of the 1 s pause criterion in C57BL/6J mice. A licking burst was defined as two or more consecutive licks, with pauses greater than 1 s determining the licking burst termination. The burst duration was calculated by summing the duration of licking within each burst during the 10 min baseline and test consumption sessions.
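The burst definitions above can be illustrated with a short Python sketch operating on time-stamped licks. This is one possible reading of the criteria, not the custom programs used in the study: the function names are hypothetical, the first minute is assumed to run from session start, and a burst's duration is assumed to be the time from its first to its last lick.

```python
# Illustrative sketch only; names and boundary conventions are assumptions.
PAUSE_CRITERION_S = 1.0  # an interlick interval above this ends a burst


def split_into_runs(lick_times, pause_s=PAUSE_CRITERION_S):
    """Group sorted lick timestamps into runs separated by ILIs > pause_s."""
    runs = []
    for t in lick_times:
        if runs and t - runs[-1][-1] <= pause_s:
            runs[-1].append(t)
        else:
            runs.append([t])
    return runs


def microstructure(lick_times, session_start_s=0.0):
    """Summarise one consumption session from lick timestamps (seconds)."""
    licks = sorted(lick_times)
    runs = split_into_runs(licks)
    bursts = [run for run in runs if len(run) >= 2]  # two or more licks
    first_minute_end = session_start_s + 60.0
    return {
        "first_minute_licks": sum(
            1 for t in licks if session_start_s <= t < first_minute_end
        ),
        "pause_count": max(len(runs) - 1, 0),  # ILIs exceeding 1 s
        "burst_count": len(bursts),
        "burst_duration_s": sum(run[-1] - run[0] for run in bursts),
    }


example_licks = [0.0, 0.5, 1.0, 5.0, 5.5, 9.0, 70.0, 70.5]
summary = microstructure(example_licks)
print(summary)
```

In the example, the isolated lick at 9.0 s is bounded by pauses on both sides, so it contributes to the pause count but does not form a burst.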
3. Results
(a) Experiment 1: reinforcer preference and conditioned reinforcement
There were no differences in the percentage of high- (94.31 ± 4.85) or low-effort reinforcers (97.19 ± 1.96) obtained in the final sessions in the operant task. However, when presented at test with free access to each reinforcer in a different context, mice showed a significant consumption preference for the reinforcer associated with high-effort instrumental responding (Z = 2.25; p = 0.02; figure 1a). Since this occurred in a different homecage setting from the training environment with free access to food (i.e. no response contingencies), 24 h following training itself, the results demonstrate an acquired preference for reinforcers associated with higher effort.
We also examined whether neutral cues associated with those reinforcers during training similarly acquired greater reinforcing value, by assessing whether the mice would learn to nose-poke to earn presentation of the cues (i.e. noise and tone) associated with different levels of effort (figure 1b). Mice responded preferentially to the poke associated with presentations of the high-effort cue (Z = 2.21; p = 0.03). This occurred over a protracted period of time (40 min) in the absence of primary reinforcement, indicating that cues associated with high-effort responding were more effective conditioned reinforcers than those associated with low-effort contingencies. Collectively, the results from experiment 1 support a concurrent effect of high effort on both reinforcer preference and associated cues.
(b) Experiment 2: acquired preferences for low-caloric reinforcers
In experiment 2, we initially assessed whether increases in effort were capable of enhancing the value of less-preferred low-caloric reinforcers. Mice were given ‘baseline’ consumption tests in automated consummatory chambers where, in separate sessions, each mouse had free access to relatively high- (e.g. 6% polycose) and low-caloric (e.g. 1% sucrose) reinforcers. Using those two reinforcers, mice then received instrumental training with the high- and low-effort levers followed by the final ‘test’ sessions in the apparatus used to measure consumption (figure 2a). At test, when consumption of the low-caloric reinforcer was measured (figure 2b), previous increases in effort significantly enhanced value as reflected by an increase from baseline to test following high, but not low-effort training; significant group × stage interaction (F1,30 = 4.67, p < 0.05) owing to increase in consumption relative to baseline for high-effort (p < 0.001) but not low-effort group (p = 0.27). By contrast, consumption of the high-caloric reinforcer was unaffected by effort demand during training, which was probably due to relatively high intake of that reinforcer irrespective of effort demand (figure 2c).
(c) Experiment 2: analysis of licking microstructure
By testing in the consummatory chambers, we were able to accurately time-stamp individual licks and subsequently conduct a microstructural analysis on the time-stamped licking data to determine whether increases in the value of reinforcers after high effort were based on a change in palatability [11,12]. To this end, we examined two measures that are commonly used to index palatability: (i) first minute licks and (ii) duration of licking bursts. First minute licks reflect palatability uncontaminated by any unconditioned inhibitory feedback owing to the collection of fluid in the gastrointestinal tract. This measure of palatability increased for the low-caloric reinforcer (figure 3a); the group × stage interaction (F1,30 = 16.72, p < 0.001) was significant owing to an increase in initial lick rate for low-caloric reinforcer when trained with high (p = 0.0001) but not low effort (p = 0.98). However, initial lick rate did not interact with prior effort training for the high-caloric reinforcer (figure 3b). Our second measure of palatability, the licking burst duration for each consumption session, reflects the total duration of licks that occurred within each licking burst. Burst duration increased for the low-caloric reinforcer (figure 3c); as indicated by the significant group × stage interaction (F1,30 = 3.75, p = 0.05) owing to an increase in burst duration for high-effort (p < 0.001) but not low-effort paired mice (p = 0.2). As with the previous measure and consistent with overall intake, burst duration for the high-caloric reinforcer (figure 3d) was unaffected by prior effort, most probably because high palatability produced a ceiling on those measures under the test conditions used.
While initial licks and burst duration were each increased in the high-effort condition for the low-caloric reinforcer, there was no effect on numbers of bursts for either the low-caloric (main effect of stage only; F1,30 = 15.22, p < 0.001) or the high-caloric reinforcer (main effect of stage only; F1,30 = 10.98, p < 0.01), a pattern consistent with a selective effect of effort on licking microstructure measures associated with taste and palatability. Finally, other behavioural measures we recorded, which included food cup entries and activity, were also not influenced by effort (figure 4).
4. Discussion
The current study advances our understanding of the relationship between effort and reinforcer value in several ways. Increases in the value of a reinforcer were manifest at test as a choice preference for the outcome associated with high-effort training. This occurred in an environmental setting (homecage) outside of the training context. Thus, the expression of reinforcer preference could not be explained in terms of an effect of reward-related cues (in the training environment) that are known to influence decision-making processes. Measures of microstructural taste reactions in consumption tests further identified a basis for such preference (experiment 2). Both the initial lick rate and the burst duration increased selectively as a function of high effort. These measures have been well documented in the literature [11–13] as measures of palatability (i.e. taste) independent of any post-oral factors that may contribute to intake (e.g. [14,17]). Thus, the pattern of licking demonstrates that animals exhibited greater taste reactivity indicative of increased palatability, which could serve as a basis for enhanced consumption of a less-preferred food after high-effort experience. At the same time, we independently confirmed an enhanced value of other cues associated with high effort (experiment 1); cues associated with high effort proved capable of maintaining new behaviour for a protracted period of time in the absence of primary reinforcement. This boost in conditioned reinforcement extends the effect of high effort to learning in a novel behavioural context.
It is well known that the hedonics of taste vary depending on motivational state at the time of consumption. A prime example is salt appetite, in which a normally aversive salty substance can dramatically gain hedonic value when the organism is physiologically salt-deprived. In the current case, some models would predict that motivational conditions in the task would lead to an increase in value of food after greater effort. For instance, a within-trial contrast model assumes that the state of the organism following higher effort trials is more negative than lower effort trials such that a greater shift in hedonic state in the high-effort trials results in increased value. What is particularly striking in this work is the lasting effect on the hedonic attributes of food. Independent of the motivational conditions and performance demands in later tests, greater palatability is expressed, suggesting a change in the representation of the food itself.
Here, we observed that greater effort augments both consummatory and appetitive behaviours, which are generally thought to rely on different neural systems. Opioid and endocannabinoid signals within rostrodorsal portions of the nucleus accumbens shell can increase consumption by modulating the perceived palatability of food [19,20], independent of metabolic energy or nutritional demands. By contrast, a dopamine-dependent mechanism boosts the incentive value of reward-related events. This cue-triggered incentive process is manifest as potentiation of conditioned reinforcers that support goal-directed behaviour. Interference with learning-based mechanisms that drive appetitive behaviour (seeking or so-called ‘wanting’) can spare consumption that is based on ‘liking’. The current data suggest that effort modulates brain systems mediating both liking and wanting. However, both the shift in preference and greater cue value could be due to increased palatability of the food itself, not only serving as a basis for behaviour in consumption tests (figure 1a), but also augmenting learned effects on the value of cues that guide appetitive behaviour and decision-making (figure 1b). These results should be of interest to the field of neuroeconomics, which examines the computational and neurobiological basis of value-based decision-making.
How might an organism benefit from such effort-mediated modulation of cue and reinforcer value? In evolutionary terms, such mechanisms could benefit survival under conditions of scarcity when the chances of acquiring food are probably related to increased foraging effort. An increase in palatability accruing to lower value foods under such conditions would serve a dual purpose, both to boost consumption and to confer greater learning connected to the prevailing cues in the environment. In the first instance, a change in palatability would prolong meal duration and/or consumption of certain (e.g. less valued) foods that might otherwise be rejected under conditions when food supply is plentiful. As a consequence of learning, cues could also come to support and prolong effort in seeking food in the absence of reinforcement itself when a scarcity of resources prevails. Thus, a rather primitive and direct effect on the affective taste properties of food would ensure the coordination of a behavioural repertoire for both foraging and food intake.
We thank Peter Holland for helpful discussions of this manuscript. This work was supported by NIH grants DK084415 and MG MH60179, MH84018 to A.W.J.
- Received July 23, 2010.
- Accepted October 12, 2010.
- This Journal is © 2010 The Royal Society