Consistent with human gambling behaviour but contrary to optimal foraging theory, pigeons showed maladaptive choice behaviour in experiment 1 by choosing an alternative that provided on average two food pellets over an alternative that provided a certain three food pellets. On 20 per cent of the trials, choice of the two-pellet alternative resulted in a stimulus that always predicted ten food pellets; on the remaining 80 per cent of the trials, the two-pellet alternative resulted in a different stimulus that always predicted zero food pellets. Choice of the three-pellet alternative always resulted in three food pellets. This choice behaviour mimics human monetary gambling in which the infrequent occurrence of a stimulus signalling the winning event (10 pellets) is overemphasized and the more frequent occurrence of a stimulus signalling the losing event (zero pellets) is underemphasized, compared with the certain outcome associated with not gambling (the signal for three pellets). In experiment 2, choice of the two-pellet alternative resulted in ten pellets with a probability of 20 per cent following presentation of either stimulus. Choice of the three-pellet alternative continued to result in three food pellets. In this case, the pigeons reliably chose the alternative that provided a certain three pellets over the alternative that provided an average of two pellets. Thus, in experiment 1, the pigeons were responding to obtain the discriminative stimuli signalling reinforcement and the absence of reinforcement, rather than to obtain the variability in reinforcement.
Maladaptive human decision-making (e.g. buying lottery tickets or playing slot machines) has two important characteristics: on average, the net return is less than the amount wagered; and having sufficient experience with the game to learn about the odds of winning does not appear to reduce its frequency (paradoxically, it may actually increase the tendency to gamble). Those who gamble tend to overvalue the winning outcomes and undervalue the losing outcomes [1–3]. This bias may be related to the availability heuristic, the fact that wins are more salient than losses. In addition, the extra value of winning may be strengthened by the social attention (reinforcement) that often comes with winning. Gamblers may also believe superstitiously that they have some control over the outcome of a gamble and continue wagering in the face of a string of losses because they may not understand the principle of randomness or the independence of events (e.g. the gambler's fallacy).
Although many people gamble occasionally for its entertainment value, others become addicted and are debilitated by it. In some cases, pathological gambling has been associated with substance abuse and with general psychiatric problems including suicide.
By contrast, non-human animals should be less susceptible to the attraction of a poor gamble because it is likely to affect their survival. According to optimal foraging theory, animals should make more ‘rational’ choices because evolution should have favoured the survival of animals that do. Given appropriate experience, non-human animals are presumed to be sensitive to the relative amounts of food obtained from different alternatives (or patches).
But there is also evidence that animals prefer alternatives that lead to signals for reinforcement, even if on other trials those alternatives lead to signals for the absence of reinforcement [12,13]. For example, pigeons prefer an alternative that sometimes produces a green light that always predicts reinforcement and equally often produces a red light that predicts the absence of reinforcement (discriminative stimuli) over an alternative that produces a blue or yellow light, each of which predicts reinforcement 50 per cent of the time (non-discriminative stimuli; see figure 1). In this case, the preference for the ‘informative’ red or green light over the ‘uninformative’ blue or yellow light incurs no added cost to the animal because the probability of reinforcement associated with the two alternatives is equal (50%). But would the pigeon prefer the informative alternative if there was a substantial cost involved in the form of loss of food?
Recently, when choice of one alternative provided 75 per cent reinforcement and the other only 50 per cent reinforcement but was followed by discriminative stimuli (one followed by 100% reinforcement, the other by 0% reinforcement), pigeons showed a reliable preference for the 50 per cent reinforcement alternative. Stagner & Zentall found similar results when pigeons chose a 20 per cent reinforcement alternative with discriminative stimuli over a 50 per cent reinforcement alternative with non-discriminative stimuli (see also [16–18]).
When humans gamble, they choose between a certain outcome (keeping the money in their pocket) and an uncertain outcome (the gamble). In both the studies mentioned above [14,15], however, the pigeons chose between uncertain outcomes. In each case the higher probability outcome was still uncertain (75% in one case, 50% in the other). Thus, unlike human gambling behaviour, the pigeons may have been avoiding the stimuli associated with the uncertain (and unsignalled) outcome by choosing discriminative over non-discriminative stimuli.
Thus, the purpose of these experiments was to determine if the results of earlier research resulted from pigeons' avoidance of the uncertainty of reinforcement associated with the higher probability of reinforcement alternative or from the preference for the possible appearance of a stimulus that predicted a high probability (or high magnitude) of reinforcement.
2. Experiment 1
A better analogy to human gambling would be to give pigeons a choice between one alternative associated with a certain outcome (analogous to money in one's pocket) and a second alternative associated with a low-probability, high-payoff stimulus on some trials or a high-probability, zero-payoff stimulus on others. To be analogous to maladaptive human gambling, the average payoff for the first alternative should be greater than that of the second alternative. The question is whether pigeons will prefer the first alternative with the greater net payoff (which they should, according to optimal foraging theory) or the second alternative involving low probability and high payoff (which they may if they behave like human gamblers).
The subjects were eight unsexed white carneau pigeons ranging from 5 to 8 years of age purchased from the Palmetto Pigeon Plant, Sumter, South Carolina. The pigeons were kept on a 12 L : 12 D cycle and were maintained at 80 per cent of their free-feeding body weight. They had free access to grit and water, and were cared for in accordance with the University of Kentucky's animal care guidelines.
A Med Associates (St Albans, VT) ENV–008 modular operant test cage was used. The response panel in the chamber had a horizontal row of three response keys. Behind each key was a 12-stimulus inline projector (Industrial Electronics Engineering, Van Nuys, CA) that projected vertical and horizontal lines as well as red, yellow, blue and green hues. Reinforcement was delivered by a pellet dispenser mounted behind the response panel (Med Associates ENV–45). A 28 V, 0.1 A houselight was centred above the response panel. A microcomputer in the adjacent room controlled the experiment.
All pigeons received two pretraining sessions in which they were required to peck once for reinforcement on each side key to one of the four hues (yellow, red, green and blue) and to one of the two line orientations (horizontal and vertical), as well as to a triangle stimulus on the centre key. There were 78 stimulus presentations per session: six involving each hue and line stimulus on each of the side keys and six of the triangle stimulus on the centre key. Each response was reinforced with three food pellets.
The design of the training phase appears in figure 2 (experiment 1). All trials were initiated by presentation of a triangle stimulus on the centre key. A single peck to the triangle turned it off. On forced trials (40 per session), either a vertical or horizontal line was presented on one of the side keys. Each line orientation could appear equally often on the left or right. One peck to this stimulus initiated a coloured stimulus on the same key for 10 s. For half of the pigeons, if a vertical line had been presented (20 trials per session), 20 per cent of the time the colour would be, for example, red (four trials per session) and after 10 s, 10 pellets would be presented. The other 80 per cent of the time that a vertical line appeared, the colour that followed would be, for example, green (16 trials per session) and after 10 s no pellets would be presented. If a horizontal line had been presented (20 trials per session), a peck would change the stimulus to, for example, blue 20 per cent of the time (four trials per session) and to, for example, yellow 80 per cent of the time (16 trials per session). In each case, after 10 s, three pellets always would be presented. The colours that signalled 10, three and zero pellets were counterbalanced over subjects, as were the line orientations that signalled the discriminative stimuli and the non-discriminative stimuli.
Each session also included 20 choice trials. Choice trials were also initiated by pecking the triangle stimulus. When pecked, both vertical and horizontal lines were presented simultaneously on the left and right keys. A single peck to either line orientation stimulus resulted in presentation of one of the colours with the same probability as on forced trials. The unchosen key was darkened. The choice trials were presented randomly among the forced trials. All pigeons were trained with this procedure for 40 sessions.
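The payoff structure of this procedure can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and was not part of the experimental software; the names `TERMINAL_LINKS` and `expected_pellets` are hypothetical, while the probabilities and pellet counts are those stated in the procedure.

```python
from fractions import Fraction

# Experiment 1 terminal stimuli as described in the text.
# Each entry is (probability of the colour, pellets delivered after 10 s).
# The dictionary keys are illustrative labels, not terms from the study software.
TERMINAL_LINKS = {
    "discriminative": [(Fraction(1, 5), 10), (Fraction(4, 5), 0)],
    "non_discriminative": [(Fraction(1, 5), 3), (Fraction(4, 5), 3)],
}

def expected_pellets(links):
    """Mean pellets per trial for one initial-link alternative."""
    return sum(p * pellets for p, pellets in links)

ev = {name: expected_pellets(links) for name, links in TERMINAL_LINKS.items()}
for name, value in ev.items():
    print(f"{name}: {float(value):.1f} pellets per trial")

# Relative advantage of the certain alternative over the gamble.
advantage = ev["non_discriminative"] / ev["discriminative"] - 1
print(f"certain alternative yields {float(advantage):.0%} more food")
```

Exact fractions are used so that the means come out as whole numbers (two and three pellets) rather than rounded floating-point values.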
The pigeons developed a strong preference for the alternative associated with a mean of two pellets of reinforcement per trial (10 pellets with a probability of 20%) over the alternative associated with a consistent three pellets per trial. Acquisition of the preference for the alternative with the discriminative stimuli is presented in figure 3. When the preferences were pooled over the last five training sessions, the preference was 82.2 per cent and significantly different from chance (as determined by a single-sample two-tailed t-test: t(7) = 4.23, p = 0.004, effect size r = 0.83). Of the eight pigeons in this experiment, six showed a clear preference for the alternative with the discriminative stimuli (more than 92%) over the last five sessions, whereas two pigeons were relatively indifferent (45% and 50% choice of the discriminative stimuli).
The results of experiment 1 indicate that pigeons prefer an alternative providing discriminative stimuli that signal an occasional large reinforcement (20% of the trials) or its absence (80% of the trials) over an alternative providing non-discriminative stimuli that always signal a smaller reinforcement, in spite of the fact that the alternative associated with the consistent smaller reinforcement provided 50 per cent more reinforcement than the other alternative.
That two of the eight pigeons did not prefer the probabilistic 10 pellets over certain three pellets suggests that there may be interesting individual differences in this form of gambling behaviour. The small number of pigeons that avoided the lure of the large probabilistic reinforcement precluded further differentiation of their behaviour, but it should be noted that none of the pigeons preferred the greater reward associated with the certain three-pellet alternative.
6. Experiment 2
To what extent could the preferences found in experiment 1 be due to a preference for a variable magnitude of reinforcement (zero or 10 pellets) over a fixed magnitude of reinforcement (three pellets) independent of the stimuli that signalled those magnitudes? In experiment 2, the probability of reinforcement associated with the two colours that had been associated with 10 and zero pellets was equated. That is, if in experiment 1 the red stimulus had been associated with 10 pellets and the green stimulus had been associated with zero pellets, then in experiment 2 presentation of either colour was followed by 10 pellets 20 per cent of the time. Thus, the probability of receiving 10 pellets associated with the initial choice was still 20 per cent, as it was in experiment 1, but the large magnitude of reinforcement (as well as the absence of reinforcement) was no longer signalled.
(a) Subjects and apparatus
The subjects were seven new unsexed white carneau pigeons similar to those used in experiment 1. The apparatus was the same as that used in experiment 1.
The procedure was the same as that of experiment 1 with the following exceptions. Choice of the variable-outcome alternative resulted in the presentation of one of two colours (one on 20 per cent of the trials and the other on 80 per cent of the trials), but 10 pellets followed either colour on 20 per cent of the trials. Thus, those colours were no longer discriminative of reinforcement but the probability of 10 pellets of reinforcement for choice of that alternative continued to be 20 per cent (see figure 2, experiment 2). That is, in each session, for the variable-outcome alternative there were 20 forced trial presentations of the more frequently presented colour, 16 of which were followed by zero pellets of reinforcement and four of which were followed by 10 pellets of reinforcement. For that alternative, there were also five forced trial presentations of the less frequently presented colour, four of which were followed by zero pellets of reinforcement and one of which was followed by 10 pellets of reinforcement. These 25 forced trials to the variable-outcome alternative were matched by 25 forced trials to the alternative associated with a constant three pellets of reinforcement (as in experiment 1). There were also 25 choice trials, for a total of 75 trials per session. The change from experiment 1 in the number of trials of each type and the number of total trials was necessitated by the change in the probability of reinforcement associated with the colours that signalled 20 per cent reinforcement (from 100% to 20% reinforcement and from 0% to 20% reinforcement). In all other respects, the procedure was the same as in experiment 1. The experiment consisted of 40 sessions of training.
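The trial counts above can be tallied to confirm that each colour, and the variable-outcome alternative as a whole, carried a 20 per cent probability of 10 pellets. The following Python sketch is only a bookkeeping aid under the counts stated in the text; the labels `frequent_colour` and `infrequent_colour` and the variable names are hypothetical.

```python
from fractions import Fraction

# Forced trials per session for the variable-outcome alternative in experiment 2.
# Each entry: illustrative colour label -> (trials followed by 10 pellets,
#                                           trials followed by zero pellets).
VARIABLE_FORCED = {
    "frequent_colour": (4, 16),
    "infrequent_colour": (1, 4),
}

# Probability of 10 pellets given each colour.
per_colour = {
    colour: Fraction(wins, wins + losses)
    for colour, (wins, losses) in VARIABLE_FORCED.items()
}

total_wins = sum(wins for wins, _ in VARIABLE_FORCED.values())
variable_forced_trials = sum(w + l for w, l in VARIABLE_FORCED.values())
overall = Fraction(total_wins, variable_forced_trials)
mean_pellets = 10 * overall

# Session total: variable forced + constant forced + choice trials.
session_trials = variable_forced_trials + 25 + 25

print(per_colour)
print(variable_forced_trials, overall, mean_pellets, session_trials)
```

Because both colours yield the same probability, neither carries information about the outcome, which is the manipulation that distinguishes experiment 2 from experiment 1.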
When the two discriminative colours associated with 10 and zero pellets were equated at 20 per cent probability of obtaining 10 pellets, in contrast to the pigeons in experiment 1, these pigeons showed a strong preference for the alternative associated with a consistent three pellets. When pooled over the last five sessions of training, the mean choice of the alternative associated with a consistent three pellets was 79.9 per cent. This preference differed significantly from chance (t(6) = 3.32, p = 0.02, effect size r = 0.78). Of the seven pigeons, six showed a strong preference for the three-pellet alternative over the two-pellet alternative. Thus, when reinforcement was not differentially signalled, the pigeons preferred the alternative associated with a higher mean number of pellets per trial. Acquisition of the preference for the alternative associated with a consistent three pellets appears in figure 3.
The results of experiment 2 indicated that the preference for an average of two pellets over a consistent three pellets found in experiment 1 did not depend on the variability of reinforcement associated with the two-pellet alternative. Instead the results indicate that the preference for the two-pellet alternative depended on the signalling value of the stimulus that differentially predicted that 10 pellets would be coming.
10. General discussion
The results of these experiments suggest that pigeons show a tendency to make maladaptive decisions similar to those of humans. That is, pigeons prefer a signal for a low-probability, high-payoff alternative over a signal for a certain low-payoff alternative that on average provides 50 per cent more reinforcement. The results of experiment 2 suggest that it was not the variability in reinforcement that was responsible for the preference for less over more reinforcement in experiment 1 but the preference for the discriminative stimuli that reliably signalled the presence and the absence of reinforcement to follow.
The results of experiment 2 may have additional implications for human gambling. The importance of the conditioned reinforcer in demonstrating this maladaptive choice by pigeons suggests that similar processes may be affecting the behaviour of human gamblers. For example, matching a winning lottery ticket comes well before obtaining the reward, and watching the winning pattern as it comes up on a slot machine comes before the money is received. Would gamblers wager as often as they do if there were no signals for the occasions of their winning (e.g. slot machine gamblers could not see the three wheels, roulette players could not see the ball skipping over the slots in the wheel)?
The results of these experiments complement the results of earlier research which show that pigeons prefer an alternative that provides 50 per cent reinforcement (when followed by a stimulus that predicts 100% reinforcement and equally often by a stimulus that predicts 0% reinforcement) over an alternative that provides 75 per cent reinforcement (independent of the stimulus that follows). Similarly, the results of these experiments are consistent with research that found that pigeons prefer an alternative that provides 20 per cent reinforcement (when followed on 20% of the trials by a stimulus that predicts 100% reinforcement and on 80% of the trials by a stimulus that predicts 0% reinforcement) over an alternative that provides 50 per cent reinforcement (independent of the stimulus that follows).
Behavioural ecologists might argue that this kind of choice (and those involved in most human games of chance) does not occur in nature. Although animals in nature may confront differential probabilistic outcomes of different magnitudes associated with their choices, it is often the case that approaching a low-probability, high-value outcome will increase the probability of its occurrence by bringing the animal into closer proximity to a high-value patch. Thus, what may induce animals (including humans) to pursue a large low-probability reward is that, in nature (but not typically in the laboratory or casino), such behaviour often increases the probability of the large reward.
One could also argue that to the degree that the pigeons do not get sufficient food during the experimental session, they will eventually be provided with additional food in the home cage, and this may have caused them to be willing to forgo food during the experimental sessions. However, research on delay discounting in which pigeons are given a choice between a small immediate reinforcer and a large delayed reinforcer suggests that reinforcement that is delayed by as much as 5 s has very little value to a pigeon. That is, pigeons have a very short time horizon and generally will choose a small amount of food right away over a much larger amount of food at a later time.
In one sense, the present results are different from human gambling decisions because the pigeon does not yet have the reward prior to making its decision, whereas the human has to give up already earned money. However, this difference should make the wager even less likely because humans are forgoing a present (immediate) reward (the money in their pocket) for the (however unlikely) possibility of a larger future gain.
We believe that the mechanism responsible for this maladaptive behaviour is the contrast between the expectation of reinforcement associated with the alternative prior to its selection and the expectation of reinforcement associated with the stimulus that appears after the alternative is chosen. Thus, in the present experiment 1, for the two-pellet alternative, when the stimulus that is associated with 10 pellets appears, there is a large amount of positive contrast. However, when the stimulus that is associated with zero pellets appears the resulting negative contrast is relatively small. Thus, the net effect will tend to be positive contrast. For the three-pellet alternative there should be no contrast because on every trial three pellets are expected and three pellets are obtained. The net positive contrast thus favours the two-pellet alternative. In experiment 2, without discriminative stimuli, the expectation of reinforcement associated with the two-pellet alternative is on average two pellets, but because both of the stimuli that follow that choice are also associated with on average two pellets (a 20% chance of obtaining 10 pellets) there should be no contrast and thus the pigeons should choose the alternative associated with the greater amount of food (the consistent three-pellet alternative).
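The size of each contrast described in this account can be illustrated numerically. The sketch below assumes, purely for illustration, that contrast is the simple difference between the pellets signalled by the stimulus and the pellets expected before the choice; the text does not give a formula, nor does it specify how positive and negative contrast are weighted into a net effect, so the function `contrasts` and its definition are hypothetical.

```python
from fractions import Fraction

def contrasts(terminal_links):
    """Return (expectation before choice, per-stimulus contrast).

    terminal_links is a list of (probability of the stimulus,
    expected pellets signalled by that stimulus).
    Contrast is ASSUMED here to be signalled pellets minus prior expectation.
    """
    prior = sum(p * pellets for p, pellets in terminal_links)
    return prior, [pellets - prior for _, pellets in terminal_links]

one_fifth, four_fifths = Fraction(1, 5), Fraction(4, 5)

# Experiment 1, two-pellet alternative: stimuli signal 10 or zero pellets.
exp1_gamble = contrasts([(one_fifth, 10), (four_fifths, 0)])
# Three-pellet alternative: every stimulus signals three pellets.
certain = contrasts([(one_fifth, 3), (four_fifths, 3)])
# Experiment 2, two-pellet alternative: each stimulus signals a mean of two pellets.
exp2_gamble = contrasts([(one_fifth, 2), (four_fifths, 2)])

print(exp1_gamble)  # prior of 2; contrasts of +8 and -2
print(certain)      # prior of 3; no contrast
print(exp2_gamble)  # prior of 2; no contrast
```

Under this assumed definition, the signal for 10 pellets exceeds the prior expectation by eight pellets, whereas the signal for zero pellets falls short of it by only two, which matches the description of a large positive and a relatively small negative contrast.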
A similar contrast effect may account for human gambling. As the expectation of a win is quite small, there is large positive contrast when there is a win but there is very small negative contrast when there is a loss. More specifically, human gamblers tend to overvalue their wins and undervalue their considerably more frequent losses (see also prospect theory).
This research was supported by the National Institute of Child Health and Human Development Grant HD060996. We thank Mikael Molet and Jennifer Laude for their comments on an earlier draft of this manuscript.
- Received July 27, 2010.
- Accepted September 20, 2010.
- This Journal is © 2010 The Royal Society