Royal Society Publishing

Adaptive numerical competency in a food-hoarding songbird

Simon Hunt, Jason Low, K.C. Burns


Most animals can distinguish between small quantities (less than four) innately. Many animals can also distinguish between larger quantities after extensive training. However, the adaptive significance of numerical discriminations in wild animals is almost completely unknown. We conducted a series of experiments to test whether a food-hoarding songbird, the New Zealand robin Petroica australis, uses numerical judgements when retrieving and pilfering cached food. Different numbers of mealworms were presented sequentially to wild birds in a pair of artificial cache sites, which were then obscured from view. Robins frequently chose the site containing more prey, and the accuracy of their number discriminations declined linearly with the total number of prey concealed, rising above-chance expectations in trials containing up to 12 prey items. A series of complementary experiments showed that these results could not be explained by time, volume, orientation, order or sensory confounds. Lastly, a violation of expectancy experiment, in which birds were allowed to retrieve a fraction of the prey they were originally offered, showed that birds searched for longer when they expected to retrieve more prey. Overall results indicate that New Zealand robins use a sophisticated numerical sense to retrieve and pilfer stored food, thus providing a critical link in understanding the evolution of numerical competency.


1. Introduction

Food hoarding is a key behaviour in the study of animal cognition. Valuable insights into a variety of complex cognitive traits, e.g. spatial memory and its neurological basis (Dally et al. 2006; Smulders 2006; Dochtermann & Jenkins 2007; Martin & Wallace 2007), have arisen from investigations of the strategies that animals use to protect their own caches and to pilfer caches made by other animals. However, the insights into animal cognition based on investigations of food hoarding have largely been restricted to aviary-based experiments on memory-related tasks in corvids and parids. Field experiments on other species might generate new insight into different cognitive traits, such as numerical competency.

One approach to studying quantitative judgements in animals is to document relative numerosity judgements, i.e. their dichotomous judgements of inequality in magnitude (e.g. deciding which has more versus less; Davis & Pérusse 1988). Many studies have found that a range of non-human animals as diverse as chimpanzees (Rumbaugh et al. 1987) and salamanders (Uller et al. 2003) can reliably choose the larger of two simultaneously visible sets of items. Several of these studies suggest that there is an upper limit in numerical discriminations between three and four objects (e.g. Uller et al. 2003; Agrillo et al. 2007). These findings support the object-file system of number representation, in which individual items are encoded as separate tokens in capacity-limited working memory, resulting in a ceiling of discriminations at three to four elements in any single set (Trick & Pylyshyn 1994). Accurate numerosity judgements that are limited to four countable items have led to suggestions that different mechanisms may be responsible for the representation of small versus large number sets (Feigenson et al. 2004).

Other studies have shown that animals can discern beyond four items. Alex, an African grey parrot, after many years of training could make numerical discriminations involving sets up to nine items by associating abstract numerical symbols to discrete number quantities (Pepperberg 2006). Apes can make accurate relative numerosity judgements even when the elements in sets are displayed sequentially and then obscured from view. Moreover, their performance over the entire number range is related to the ratio between quantities and, to some extent, total set size (e.g. Beran 2004, 2007; Anderson et al. 2007; Hanus & Call 2007). Different corroborative evidence that accuracy in numerical judgements smoothly decreases with increasing magnitude, rather than breaking down after a set size limit of four items, has been documented in experiments investigating numerical ordering tasks (e.g. Brannon & Terrace 2000) and violation of expectancy search tasks (e.g. Lewis et al. 2005). These studies have led many investigators to instead propose that an analogue magnitude system such as the accumulator model is responsible for quantity judgements in animals. Under this analogue model, there is no a priori limit in relative numerosity judgements. Rather, the accuracy of discriminations decreases with increasing quantities (Gallistel & Gelman 2000; Hanus & Call 2007). However, support for an analogue magnitude system may be criticized because accuracy beyond the three- to four-item limit in these studies may be due to significant training histories of the participants used. In addition, several studies supporting the analogue model failed to account for confounds with non-numerical dimensions such as timing or volume (e.g. Beran 2007; Hanus & Call 2007). 
Other studies have documented evidence against the analogue model by showing that animals are incapable of discerning between sets of three or four sequentially presented items, even after controlling for non-numerical dimensions (Hauser et al. 2000). Despite the mounting interest in relative numerosity judgements, it is still not fully clear whether wild animals can represent quantities larger than four items. Whether wild animals use numerical judgements in their daily lives, and whether number-based decision making has any adaptive significance, is also unclear (Hauser 2000; Lyon 2003).
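
As a rough illustration of the analogue-magnitude account discussed above, the sketch below uses one common textbook formalization, not an analysis from this study. It assumes each quantity n is represented as a Gaussian with mean n and standard deviation w·n, where the Weber fraction `w` is a hypothetical value chosen only for illustration. Under these assumptions, predicted accuracy depends on the ratio between the two quantities rather than on a fixed set-size limit.

```python
import math

def p_choose_larger(a: int, b: int, w: float = 0.3) -> float:
    """Predicted probability of picking the larger of two quantities.

    Assumes scalar variability: quantity n is represented as a Gaussian
    with mean n and standard deviation w * n. The Weber fraction w is a
    hypothetical value, not one estimated from the robin data.
    """
    small, large = sorted((a, b))
    if small == large:
        return 0.5
    # Difference of two independent Gaussians: mean (large - small),
    # standard deviation w * sqrt(small^2 + large^2).
    z = (large - small) / (w * math.hypot(small, large))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for pair in [(1, 2), (4, 8), (3, 4), (8, 10)]:
    print(pair, round(p_choose_larger(*pair), 3))
```

With these assumptions, 1 versus 2 and 4 versus 8 receive the same predicted accuracy because they share a ratio, whereas an object-file account would predict a breakdown once either set exceeds three to four items.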

The New Zealand robin (Petroica australis) is one of a very small number of food-hoarding birds in the Southern Hemisphere (Vander Wall 1990). It is a medium-sized insectivorous passerine that is endemic to New Zealand. Robins are monogamous and mated pairs reside on exclusive territories year-round (Higgins & Peter 2002). Robins forage mostly on the forest floor and their diet includes some of the world's largest terrestrial invertebrates, including giant earthworms and flightless grasshoppers (Lee 1959; Gibbs 1998). Because their prey are often too large to be consumed whole, they are typically dismembered, and pieces of unconsumed prey are cached in depressions in tree branches (Powlesland 1980; Alexander et al. 2005). Like many other animals endemic to isolated oceanic islands, robins lack pronounced anti-predatory behaviours and are fearless of humans (Carlquist 1965). Therefore, activities that are difficult to document in most songbirds (e.g. mating, nuptial feeding) are readily observed in wild New Zealand robins (Powlesland 1980, 1981). They will also consume and cache food offered to them by hand, providing a unique opportunity to conduct field experiments on their food hoarding and cache retrieval strategies.

Previous experiments have shown that robins rely heavily on food hoarding and accurate cache retrieval in winter, when temperatures drop, days become shorter and food is in reduced supply (Burns & Steer 2006; Burns & van Horik 2006; van Horik & Burns 2007; Steer & Burns 2008). Males and females compete for food on their winter territories, and both sexes retrieve their own caches and steal caches made by their mate. Cache pilferage is therefore ‘reciprocal’ (sensu Vander Wall & Jenkins 2003), and advanced numerical competency could provide an advantage to birds when prioritizing which cache sites to retrieve or pilfer.

We tested the extent of numerical discrimination capabilities in New Zealand robins by conducting a series of field experiments in which we presented solitary wild birds with different numbers of mealworm prey in artificial cache sites. Each of two cache sites was filled sequentially in full view of the subject; both sites were then concealed, and subjects were allowed to choose between them (Hauser et al. 2000). Additional experiments were conducted to control for time, volume, sensory, order and orientation confounds. Lastly, we conducted a violation of expectancy experiment (Lewis et al. 2005), in which prey were displayed sequentially to subjects but birds were only allowed to retrieve a subset of the prey shown to them, to test whether subjects searched for longer when they ‘expected’ to find more food.

2. Material and methods

Experiments were conducted between May and August (i.e. winter) in 2006 and 2007 in the Karori Wildlife Sanctuary (KWS). KWS is located on the southern tip of the North Island of New Zealand (41°18′ S, 174°44′ E), and at the time of the experiment, it housed a population of approximately 70–100 colour-banded robins. Birds used in trials were located audibly and/or visually along a series of footpaths traversing KWS. More detailed descriptions of the study site, the robin population and the somewhat unusual circumstances surrounding field experiments on wild, yet tame, robins can be found elsewhere (Burns & Steer 2006; Burns & van Horik 2006; van Horik & Burns 2007).

In experiment 1, we presented mealworm prey to wild birds in an experimental arena comprising a tree branch containing two artificial cache sites (figure 1). Each artificial cache site was a circular opening 7 cm in diameter and 7 cm deep, which could be covered by a piece of leather attached to a swivel. Wild robins frequently turn over leaves in search of prey on the forest floor. As a result, all birds readily removed leather lids to access prey without training. Mealworms (Tenebrio molitor larvae) were individually placed in each site sequentially, at an approximate rate of 5 s per item, so that birds saw each individual mealworm as they were being transferred. Eight number combinations were presented to 14 colour-banded birds in the following randomly selected order: 4 versus 6; 4 versus 8; 6 versus 8; 8 versus 10; 4 versus 5; 2 versus 3; 3 versus 4; and 1 versus 2. All 14 birds were subject to all eight treatments. We randomized the order of treatments in an attempt to control for observational learning. If trials were conducted in a systematic order, and birds learned to perform differently throughout the course of the experiment, the order in which treatments were conducted may confound magnitude effects. The order in which each individual was sampled within each treatment was chosen according to the order in which they were encountered in the field. We specifically defined decisions (i.e. the cache site chosen) as the first leather cover removed. Birds were only allowed to access the first cache site chosen; the apparatus was removed before the bird had an opportunity to recover prey located in the other site. Robins always retrieved caches immediately after they were given the opportunity to do so.

Figure 1

A New Zealand robin choosing an artificial cache site during a field trial.

In experiment 2, we controlled for the potential confounding effects of the time taken to fill each cache site with prey, as well as the total volume of items stored in each site. All aspects of this experiment were similar to the first, except in this instance we dropped inanimate objects (i.e. small rocks that were approximately the same size and shape as mealworms) into the cache site containing the smaller number of mealworms. The number of rocks used in each treatment varied such that there was an equivalent number of items in each site (e.g. three worms versus two worms and one rock; Hauser et al. 2000). Consequently, the total amount of time taken to fill each cache site, and the total volume of all items stored in each, never differed. The orientation (left or right) and sequence of cache fill (the larger number cached first or second) were randomized separately for each trial in experiments 1 and 2; otherwise birds might have associated the larger magnitude with a particular cache orientation or fill sequence during the course of the experiment.
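
The control design described above can be sketched as a small trial-building routine. This is only an illustration under stated assumptions: the function name, the dictionary layout and the use of Python's `random` module are hypothetical and do not describe how trials were actually scheduled in the field.

```python
import random

def build_control_trial(small: int, large: int, rng: random.Random) -> dict:
    """Build one experiment-2 style trial (hypothetical layout).

    Rocks are added to the site with fewer mealworms so that both sites
    receive the same number of items, equalizing fill time and volume.
    """
    if small >= large:
        raise ValueError("small must be less than large")
    rocks = large - small
    sides = ["left", "right"]
    rng.shuffle(sides)  # randomize orientation of the larger cache
    return {
        "larger_site": {"side": sides[0], "worms": large, "rocks": 0},
        "smaller_site": {"side": sides[1], "worms": small, "rocks": rocks},
        # randomize whether the larger cache is filled first or second
        "larger_filled_first": rng.choice([True, False]),
    }

trial = build_control_trial(2, 3, random.Random())
print(trial)
```

For the 3 versus 2 example given above, the smaller site receives two worms and one rock, so both sites hold three items.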

In experiment 3, we tested for potential orientation preferences (i.e. cache sites oriented left or right) or order preferences (cache sites filled first or second). This experiment was identical to the first two, except that an equal number of mealworms was placed in each site (1 versus 1, 2 versus 2, 3 versus 3, 4 versus 4, 5 versus 5, 6 versus 6, 8 versus 8 and 10 versus 10). Separate binomial tests were used to test for non-random orientation and order preferences in each treatment. In experiment 4, we tested whether birds could sense the number of prey in each cache site, even though they were covered. One cache site was randomly chosen and filled with six mealworms, and both sites were concealed prior to engaging the subject. The other site was left empty and birds were allowed to choose between sites. As in experiment 3, binomial tests were used to establish whether birds chose the filled cache site preferentially.

If robins choose cache sites based on numerical judgements, then they should search for longer when they are allowed to retrieve only a fraction of the prey they are shown. To test this hypothesis, we conducted a fifth, ‘violation of expectancy’ experiment. We offered mealworm prey to 10 subjects in a similar experimental display in a different branch containing only a single cache site and a trapdoor. All other attributes of this arena were identical to the first, except that in this experiment some prey items were hidden behind a trapdoor after being shown to birds.

In the first trial, birds were shown one prey item and allowed to retrieve one. In the second, they were shown two prey items but allowed to retrieve only one (the other was hidden underneath the trapdoor). Differences in the amount of time spent searching for prey at the retrieval stage were then compared between trials with a paired t-test. Three further sets of paired trials, identical in design, were conducted with larger numbers of hidden prey (i.e. shown 2, allowed 2 versus shown 3, allowed 2; shown 4, allowed 4 versus shown 6, allowed 4; shown 4, allowed 4 versus shown 8, allowed 4).
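
The paired comparison described above can be sketched as follows. The search times in the example are made-up placeholder values, not data from this study, and the function name is hypothetical. The sketch only shows how a paired t statistic is formed from within-bird differences.

```python
import math
from statistics import mean, stdev

def paired_t(expected: list[float], violated: list[float]) -> tuple[float, int]:
    """Return the paired t statistic and its degrees of freedom.

    Each index is one bird measured in both trials, so the test is run
    on the per-bird differences rather than on the two group means.
    """
    if len(expected) != len(violated) or len(expected) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [v - e for e, v in zip(expected, violated)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1

# Placeholder search times in seconds (illustrative only, not real data).
shown_1_allowed_1 = [4.0, 5.0, 3.0, 6.0]
shown_2_allowed_1 = [9.0, 12.0, 7.0, 15.0]
t_stat, df = paired_t(shown_1_allowed_1, shown_2_allowed_1)
print(f"t = {t_stat:.3f}, df = {df}")
```

With 10 birds, as in this experiment, the statistic would have 9 degrees of freedom.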

In addition to these four pairs of trials, two sets of control trials were conducted. The first controlled for volume confounds. It compared search times between a trial where birds were shown one mealworm, which they were allowed to retrieve, to a second trial where birds were shown two small mealworms (which when summed matched the weight of prey items used previously), but allowed to retrieve one normal sized mealworm. The second controlled for sensory confounds. It compared search times between a trial where birds were shown one mealworm and allowed to retrieve one, to a second trial where birds were offered one mealworm and allowed to retrieve one when six additional mealworms were placed in the hidden compartment below.

3. Results

Results from experiment 1 showed that in four of the eight treatments (1 versus 2, 2 versus 3, 3 versus 4 and 4 versus 8), birds selected the site containing more prey at frequencies above-chance expectations (figure 2; binomial p<0.05 for all, n=14). A multiple regression model, with total set size and set ratio as predictors, explained a significant amount of variation in the proportion of birds making ‘correct’ decisions ($r^2=0.754$, $F_{2,5}=11.752$, p=0.013). However, only the total number of prey stored across both cache sites contributed to the model (p=0.013), not the ratio between the two numbers stored (p=0.234). Set size was not correlated with set ratio (n=8, r=0.375, p=0.360). Therefore, the accuracy of number discriminations made by birds declined linearly with the total number of prey items stored in both cache sites.

Figure 2

Results from two field experiments where birds were allowed to choose between two cache sites containing different numbers of prey. The percentage of birds that chose the cache site containing the greater number of prey is shown on the y-axis. The total number of prey used in each treatment is shown on the x-axis. Treatments are labelled by the number of prey hidden in each of the two compartments. Treatments located above the dotted line (11 or more ‘correct’ decisions) had a greater number of correct decisions than expected by chance based on the binomial distribution (i.e. p<0.05). (a) Results from experiment 1, where only mealworms were presented to birds (uncontrolled). (b) Results from experiment 2, where small rocks were placed in wells containing fewer mealworms to control for time and volume confounds.
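
The 11-of-14 threshold marked by the dotted line can be reproduced with an exact binomial tail calculation. The sketch below assumes a one-tailed test against a chance probability of 0.5; the tailedness is an assumption, since it is not stated explicitly here, and the function names are illustrative.

```python
from math import comb

def upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """Exact probability of k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def min_correct_for_significance(n: int, alpha: float = 0.05) -> int:
    """Smallest number of correct choices whose upper tail is below alpha."""
    for k in range(n + 1):
        if upper_tail(k, n) < alpha:
            return k
    raise ValueError("no count reaches significance at this alpha")

n_birds = 14
threshold = min_correct_for_significance(n_birds)
print(threshold, round(upper_tail(threshold, n_birds), 4))
print(round(upper_tail(threshold - 1, n_birds), 4))
```

Under these assumptions, 11 or more correct choices out of 14 gives a tail probability of about 0.029, whereas 10 of 14 gives about 0.090, which is why the line falls at 11.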

Results from experiment 2 were similar to experiment 1. In five of nine treatments (0 versus 1, 1 versus 2, 2 versus 3, 3 versus 4 and 4 versus 6), birds selected the site containing more prey at frequencies above-chance expectations (binomial p<0.05 for all, n=14). A regression analysis showed that the proportion of birds making correct decisions during each treatment declined with the total number of items (prey and small rocks) stored in both cache sites, albeit more weakly ($r^2=0.352$, $F_{1,7}=5.343$, p=0.053). The ratio of items stored in each site was not included because it never differed. Therefore, results from the first experiment cannot be attributed to volume or time confounds.

Results from experiment 3 showed that orientation preferences (left versus right) did not occur in any treatment (p>0.122 for all, n=14). Birds also did not choose cache sites based on the order in which they were filled (p>0.061 for all, n=14). Results from experiment 4 showed that birds did not choose the cache site containing hidden prey (p=0.183, n=14). In addition, the proportion of birds making correct decisions in experiments 1 and 2 was unrelated to the order in which treatments were conducted ($r^2=0.240$, $F_{1,6}=3.209$, p=0.123; $r^2=0.028$, $F_{1,6}=0.779$, p=0.407, respectively), suggesting that results were not confounded by between-trial learning.

Results from experiment 5 showed that robins searched the experimental arena for longer periods when they were allowed to retrieve fewer prey than they were shown (figure 3). They searched over four times longer when they were presented with two prey items but allowed to retrieve one, compared to when they were shown one and allowed to retrieve one (n=10, t=6.813, p<0.001). Similar results were obtained when they were presented with three mealworms but allowed to retrieve two, relative to when they were shown two but allowed to retrieve two (n=10, t=4.531, p=0.001). They searched for twice as long when they were presented with six mealworms but allowed to retrieve four, compared to when they were shown four but allowed to retrieve four (n=10, t=3.102, p=0.013). Search times were similar when they were shown four and allowed to retrieve four, compared to when they were shown eight but allowed to retrieve four (n=10, t=0.984, p=0.351).

Figure 3

Results from a ‘violation of expectancy’ experiment, in which birds were shown a particular number of prey items that were then hidden in an artificial cache site containing a trapdoor. In some trials, birds were allowed to retrieve all the prey items they were shown. In other trials, several prey items were hidden behind the trapdoor, thus allowing birds to retrieve only a fraction of what they were shown. The amount of time spent searching for hidden prey is shown on the x-axis (±s.e.; ***p<0.001; **p<0.01; *p<0.05; n.s., not significant). Birds spent more time searching when they were shown more prey than they were allowed to retrieve. However, this effect declined with the total number of prey items they were shown. The bottom pair of treatments controlled for the possibility that birds detected prey hidden behind the trapdoor; 1(6h) refers to a treatment where one prey item was shown to the subject, while six identical prey items were hidden behind the trapdoor in the absence of robins. The pair of treatments listed second from the bottom controlled for the possibility that birds made decisions based on differences in volume; 2(s) refers to a treatment where two small prey items, each measuring half the volume of the prey items used in all other treatments, were used.

The control treatments indicated that volume or sensory confounds are unlikely explanations for the results. Birds searched for longer when they were shown two small prey items, but allowed to retrieve one normal sized prey item, compared to when they were shown one normal sized prey item and allowed to retrieve one. Search times were also similar when birds were shown one item and allowed to retrieve one, compared to when they were shown one item and allowed to retrieve one above the hidden compartment containing six prey. In addition, search times were similar among the four trials where birds were shown one item and allowed to retrieve one, as well as the two trials where they were shown four and allowed to retrieve four, suggesting a high degree of repeatability to results (figure 3).

4. Discussion

In experiments 1 and 2, New Zealand robins often selected the larger of two quantities, even when food items were cached sequentially so that the contents of each site were never simultaneously visible. Such sophisticated performance implies that robins encoded information about the quantity stored at each cache site and mentally compared them to make a relative numerosity judgement in order to retrieve the larger amount of food (Hanus & Call 2007). We found that total set size was negatively correlated with the accuracy of numerical judgements. Therefore, for a given difference between two quantities, the ability of New Zealand robins to discriminate between them worsens as their sum increases (Dehaene et al. 1998).

Primates are capable of tracking more than four items when presented sequentially after extensive training (e.g. Beran 2004). In the wild, relative number judgements involving sequentially presented items have only been tested with rhesus macaques, revealing a set size limit of four items (Hauser et al. 2000). Our field experiments are the first to demonstrate that wild animals regularly exercise sophisticated numerical abilities in the absence of training.

Our results may, to a certain extent, be explained by analogue systems of number representation. For example, in the accumulator model, there is no strict upper limit in representing numerosity, but judgements become systematically less precise and noisier with increasing numbers (Gallistel & Gelman 2000). Ratio effects (see Beran 2001, 2004, 2008) may also be indicative of analogue representations. Although the multiple regression analysis in experiment 1 indicated that set size rather than ratio predicted the accuracy of numerical judgements, all comparisons with ratios at 0.50, half at 0.66 and 0.75 and none at 0.80 were significant. This suggests that ratios, to some extent, exert an influence in the relative number judgements of New Zealand robins, but it was undetected due to the limited range of ratios we worked with (M. J. Beran 2008, personal communication).
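
The ratio pattern described above can be tabulated directly from the eight treatments of experiment 1. In this sketch the ratio is taken as smaller divided by larger and rounded to two decimals (so 2 versus 3 appears as 0.67 rather than the 0.66 quoted above); the function and variable names are illustrative.

```python
from collections import defaultdict

# Experiment 1 treatments (smaller, larger) and whether birds chose the
# larger cache above chance, as reported in the results.
treatments = {
    (1, 2): True, (2, 3): True, (3, 4): True, (4, 5): False,
    (4, 6): False, (4, 8): True, (6, 8): False, (8, 10): False,
}

def significant_share_by_ratio(results: dict) -> dict:
    """Group treatments by smaller/larger ratio and count significant ones."""
    groups = defaultdict(list)
    for (small, large), significant in results.items():
        groups[round(small / large, 2)].append(significant)
    return {ratio: (sum(flags), len(flags)) for ratio, flags in sorted(groups.items())}

for ratio, (n_sig, n_total) in significant_share_by_ratio(treatments).items():
    print(f"ratio {ratio:.2f}: {n_sig} of {n_total} treatments above chance")
```

Each ratio is represented by only two treatments, which illustrates how narrow the sampled range of ratios was.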

The object-file system of number representation, which argues that accurate number discriminations are limited to three to four elements (Trick & Pylyshyn 1994), cannot explain our findings completely. We found no evidence for a sharp discontinuity in performance between trials involving small (e.g. 1 versus 2) versus large (e.g. 4 versus 8) number combinations. However, all treatments in which the robins performed above-chance expectations were between two numbers below or equal to 4 (except for the 4 versus 8 in experiment 1, and even then one of the numbers is 4). A number sense up to 4 (and perhaps also vaguely defining something as ‘larger than 4’) could therefore be sufficient to account for our results.

Results may also be explained by observationally acquired associative strengths among treatments (e.g. Browne 1976; Hauser et al. 2000; van Marle et al. 2006; Tomonaga 2008), where each pairing of a cache site with a mealworm may represent a learning episode. The acquisition of associative strength is negatively accelerated (Rescorla & Wagner 1972; Gallistel et al. 2004), with the initial pairings of a cache site with a mealworm within a trial producing the greatest increments in associative strength. Thus, acquisition of associative strengths could account for the decline in performance with the absolute magnitude of the caches as indicated in figure 2. Although the accuracy of performance was statistically unrelated to the order in which treatments were conducted, we suspect that observational memory plays a key role in enhancing robins' ability to make numerical discriminations. Robins regularly pilfer food cached by other robins, indicating that they may be ‘trained’ naturally and develop more sophisticated numerical abilities as they age.
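
The negatively accelerated acquisition curve can be illustrated with the standard Rescorla-Wagner update, in which each pairing adds a fixed fraction of the remaining distance to an asymptote. The learning rate and asymptote below are arbitrary illustrative values, not estimates from the robin data, and treating each mealworm as one pairing is a simplifying assumption.

```python
def rescorla_wagner(n_pairings: int, rate: float = 0.3, asymptote: float = 1.0) -> list[float]:
    """Associative strength after each pairing under the Rescorla-Wagner rule.

    Each pairing adds rate * (asymptote - current strength), so the gain
    per pairing shrinks as strength approaches the asymptote.
    """
    strength = 0.0
    history = []
    for _ in range(n_pairings):
        strength += rate * (asymptote - strength)
        history.append(strength)
    return history

# Compare the strength gap for small versus large caches (illustrative).
curve = rescorla_wagner(10)
gap_1_vs_2 = curve[1] - curve[0]   # second mealworm added to a site
gap_8_vs_10 = curve[9] - curve[7]  # ninth and tenth mealworms added
print(round(gap_1_vs_2, 3), round(gap_8_vs_10, 3))
```

Under these assumptions, the difference in associative strength between sites holding 1 and 2 items is much larger than between sites holding 8 and 10, which is the sense in which this account predicts declining accuracy with absolute magnitude.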

Evidence for an upper limit to numerical discriminations varied somewhat between experiments. For example, the 4 versus 6 comparison did not exceed chance expectations in experiment 1, but did so in experiment 2. Conversely, robins were able to discriminate between 4 and 8 items in experiment 1, yet they were unable to discriminate between 4 and 8 in the violation of expectancy experiment or experiment 2. One explanation for these discrepancies is that these number pairs are at the upper threshold of their numerical abilities.

The level of numerical competency shown by New Zealand robins appears to be higher than that recorded for any other wild animal. The wild robins in our study can discern between groups of items that total up to 12 items, without training by humans. Clear evidence of adaptive numerosity judgements among animals in the wild has been limited to ecological demands associated with parental investment and intergroup aggression (Hauser 2000; Lyon 2003). A potential explanation for the sophisticated number sense in New Zealand robins is that numerical discriminations form an integral part of their cache retrieval strategy. Although age-related differences in numerical discriminations have yet to be established, birds may learn to use numerical discriminations naturally, during daily cache retrieval and pilfering activities. If an animal knows how many pieces of dismembered prey items are in each of its cache sites, it would help prioritize efficient cache retrieval. It would also help prioritize raids on caches made by its mate, given that males and females compete for food in winter (Alexander et al. 2005; Burns & Steer 2006; Burns & van Horik 2006; Steer & Burns 2008). Robins only store insect prey that are highly perishable. Therefore, knowing how many items are stored in particular cache sites would also help prioritize cache retrieval to minimize the losses to spoilage.

On a cautionary note, similar experiments on birds that do not hoard food have yet to be conducted. Therefore, attributing the advanced numerical skills of robins to food hoarding per se remains speculative. In addition, there may be circumstances where robins could benefit more from making decisions based on volume than on number (Stevens et al. 2007). It would be interesting to evaluate under what ecological conditions robins use volume rather than number to make cache retrieval decisions. For example, it is conceivable that volume might be a better discriminatory cue when the birds dismember prey into different-sized caches (e.g. leg versus abdomen). Finally, recent work has shown that some animals can suppress their tendency to select higher quantities and instead pursue lower quantities (e.g. Boysen et al. 1999; Uher & Call 2008). In experiment 1, we observed a trend for a reversed quantity judgement in the 8 versus 10 condition, although it was not statistically significant. One possible benefit of selecting the cache site with fewer prey may be that the 10-item site incurs greater handling costs associated with re-caching. Additional experiments are needed to identify when birds might strategically ‘go for more’ and when they might ‘go for less’.

The use of advanced numerical discriminations in New Zealand robins provides a clear and critical link in understanding the evolution of number sense in animals, by identifying a way in which number discriminations might be used by wild animals. Experiments on food hoarding in corvids have identified several other sophisticated cognitive traits that were previously thought to be restricted to higher primates. For example, scrub jays can remember the precise locations of food stored by other animals (see Emery (2006), for a review). They are also capable of ‘mental time travel’, or recalling past experiences of other animals that they have observed, which they incorporate into their own strategies of minimizing the loss of caches to pilferers (e.g. Clayton & Dickinson 1998; Clayton et al. 2005; Dally et al. 2006). Results reported here indicate that New Zealand robins can remember the number of items stored by other animals (i.e. humans), adding to a growing list of sophisticated cognitive traits used by food-hoarding birds and supporting suggestions that many birds have sophisticated cognitive abilities comparable to higher primates (Emery & Clayton 2004).


All of our experiments were conducted under the approval of the Karori Wildlife Sanctuary and the Victoria University of Wellington animal ethics committee.


    • Received May 22, 2008.
    • Accepted June 18, 2008.

