In the absence of language, the comprehension of symbols is difficult to demonstrate. Tokens can be considered symbols since they arbitrarily stand for something else without having any iconic relation to their referent. We assessed whether capuchin monkeys (Cebus apella) can use tokens as symbols to represent and combine quantities. Our paradigm involved choices between various combinations of tokens A and B, worth one and three rewards, respectively. Pay-off maximization required the assessment of the value of each offer by (i) estimating token numerousness, (ii) representing what each token stands for and (iii) making simple computations. When one token B was presented against one to five tokens A (experiment 1), four out of ten capuchins relied on a flexible strategy that allowed them to maximize their pay-off, i.e. they preferred one token B over one or two tokens A, and they preferred four or five tokens A over one token B. Moreover, when two tokens B were presented against three to six tokens A (experiment 2), two out of six capuchins performed summation over representation of quantities. These findings suggest that capuchins can use tokens as symbols to flexibly combine quantities.
1. Introduction
Following classical semiotic theory (Peirce 1931–58, see also Bates 1979), a symbol is a sign related to its referent through the conventions agreed upon by a community of users. The relationship is truly arbitrary since the sign and its referent do not physically resemble each other. It has also been argued that the use of symbols should expand intellectual horizons in both time and space (DeLoache 2002). Tokens (i.e. inherently non-valuable objects that acquire an associative value upon exchange with the experimenter; Brosnan & de Waal 2004) can be considered symbols (sensu Peirce) since they arbitrarily stand for something else without having any iconic relation to their referent. Moreover, token exchange may be regarded as a proto-convention since after having learned it many individuals in the same group regularly exhibit this behaviour when appropriate; however, token exchange lacks a normative nature since group members do not expect other members to abide by it (see Itkonen (2003) for the characteristics of human conventions).
In non-human primates, the paradigm of token exchange has been repeatedly adopted to investigate topics such as the effectiveness of tokens as secondary reinforcement (Wolfe 1936; Cowles 1937; Kelleher 1956, 1957a,b; Sousa & Matsuzawa 2001), requesting a specific food (Brosnan & de Waal 2004, 2005) or a tool (Westergaard et al. 1998) by returning the corresponding token, and cognitive biases in economic behaviour (Chen et al. 2006). However, besides the pioneering work of Carpenter & Locke (1937), no other study has focused on whether tokens do indeed function as symbols.
In Carpenter & Locke's (1937) study, a male capuchin monkey (Cebus apella; Trader), who learned to associate several tokens with different foods, was unable to use tokens as symbols since (i) the order in which he exchanged the tokens did not show a clear pattern of token-mediated food preference, (ii) he also exchanged tokens associated with a low-preferred food (that was not eaten afterwards) and (iii) he also exchanged tokens not associated with a food (its exchange did not result in any reward). It is possible that Trader's high motivation to exchange no matter what precluded the assessment of his understanding that tokens may function as symbols. Similarly, capuchins and chimpanzees (Pan troglodytes) trained to exchange two different tokens for a low-preferred and a high-preferred food failed to return the token corresponding to the food offered by the experimenter (Brosnan & de Waal 2004, 2005).
So far, the most convincing evidence of symbol use comes from the long series of studies on language-trained apes (Savage-Rumbaugh 1986). Chimpanzees trained to sort out real foods from real tools and to identify each of them by choosing one out of two lexigrams (one generically indicating ‘food’ and the other generically indicating ‘tool’) kept categorizing using the correct lexigrams also when presented with new items (Savage-Rumbaugh et al. 1980; Savage-Rumbaugh 1986). Furthermore, Boysen & Berntson (1995) have elegantly demonstrated that the use of symbols allowed Sheba, a chimpanzee previously trained to associate Arabic numerals with the corresponding amount of food, to overcome her strong motivation to choose the larger of two food arrays. In fact, though Sheba failed to select a smaller food array in order to receive a larger one, her performance immediately improved as soon as symbols (namely Arabic numerals) substituted food. Therefore, the use of symbols seemed to expand Sheba's capacities, allowing an otherwise difficult achievement.
In the present study, we aimed to assess whether capuchins can flexibly use tokens as symbols to represent and combine quantities. From an evolutionary point of view, cognitive mechanisms allowing animals to represent and compare quantities are adaptive for species, such as capuchins, that forage on patchily distributed sources (Olthof et al. 1997; Call 2000). Moreover, action enables the development of abstract relational mental activity (Johnson 1989; Fragaszy et al. 2004) and, in this respect, capuchins are likely candidates for the emergence of abstract thought since they have a strong tendency to engage the world through direct physical contact and to combine objects (Fragaszy et al. 2004). Up to now, evidence for abstract reasoning in capuchins is contradictory; for example, they do not develop an abstract appreciation of the causal relations among actions, objects and surfaces as readily as humans (Fragaszy et al. 2004), but when tested with a matching-to-sample paradigm they can form abstract, conceptual-like representations for above and below spatial relations (Spinozzi et al. 2004).
Our paradigm involved a choice between offers of different quantities of tokens A and B, worth one and three rewards, respectively. Reward maximization required a flexible strategy based on (i) estimating token numerousness, (ii) representing what each token stands for and (iii) making simple computations. In particular, in experiment 1 we presented capuchins with choices between one token B and one to five tokens A. To maximize their pay-off, capuchins had to choose the single token B in the conditions 1B versus 1A and 1B versus 2A, and the tokens A in the conditions 1B versus 4A and 1B versus 5A, whereas the simpler strategies of always selecting the most numerous offer or the most valuable token (namely the token B) did not allow pay-off maximization.
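As an illustration only, the pay-off logic of experiment 1 can be sketched in a few lines of Python. The token values (A = 1, B = 3) and the five conditions come from the text; the function names and the tie rule used for the 'most numerous' strategy (choose B when both offers contain the same number of tokens) are assumptions introduced for this sketch.

```python
# Token values as stated in the text: A = 1 reward, B = 3 rewards.
TOKEN_VALUE = {"A": 1, "B": 3}

def offer_value(token, count):
    """Total rewards obtained by exchanging `count` tokens of one type."""
    return TOKEN_VALUE[token] * count

# Candidate choice rules; each returns "A" or "B" for an offer pair.
def always_b(n_b, n_a):
    return "B"

def most_numerous(n_b, n_a):
    # Assumed tie rule: pick B when both offers hold the same number of tokens.
    return "A" if n_a > n_b else "B"

def maximizer(n_b, n_a):
    # Compares offer values; on equal values either choice pays the same.
    return "A" if offer_value("A", n_a) > offer_value("B", n_b) else "B"

def total_payoff(strategy, conditions):
    """Sum the rewards a fixed choice rule earns over one pass of the conditions."""
    total = 0
    for n_b, n_a in conditions:
        if strategy(n_b, n_a) == "B":
            total += offer_value("B", n_b)
        else:
            total += offer_value("A", n_a)
    return total

# Experiment 1: one token B against one to five tokens A.
EXPERIMENT_1 = [(1, n_a) for n_a in range(1, 6)]

payoffs = {
    rule.__name__: total_payoff(rule, EXPERIMENT_1)
    for rule in (always_b, most_numerous, maximizer)
}
```

Under these assumptions, only the value-based rule earns the maximum over one pass through the five conditions; the two simpler rules each lose rewards in at least one condition.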
In experiment 2, capuchins had to choose between two tokens B and three to six tokens A; therefore, the value of each type of token needed to be summed with the value of tokens of the same type, and the sums obtained to be compared. Capuchins faced three new conditions (2B versus 4A, 2B versus 5A and 2B versus 6A) interspersed with three familiar conditions (1B versus 5A, 1B versus 4A and 1B versus 3A). To maximize their pay-off, capuchins had to select the tokens B in the conditions 2B versus 4A and 2B versus 5A, and the tokens A in the conditions 1B versus 4A and 1B versus 5A, whereas always selecting token(s) B did not result in pay-off maximization. In both experiments, estimation of the value of the token A offers is based on perception since the offer value always corresponds to token numerousness. Estimation of the value of token B offers is also based on perception but, since the offer value never corresponds to token numerousness, it also requires the representation of what token(s) B stand(s) for.
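The summation and comparison required in experiment 2 can be made explicit with a short sketch. The conditions and token values are taken from the text; the helper names and the dictionary layout are hypothetical and serve only to tabulate the arithmetic.

```python
# Token values as stated in the text.
VALUE_A, VALUE_B = 1, 3

# (number of tokens B, number of tokens A) for each condition.
NEW_CONDITIONS = [(2, 4), (2, 5), (2, 6)]
FAMILIAR_CONDITIONS = [(1, 5), (1, 4), (1, 3)]

def describe(n_b, n_a):
    """Sum the value of each offer and report which offer is worth more."""
    value_b = n_b * VALUE_B
    value_a = n_a * VALUE_A
    if value_b > value_a:
        better = "B"
    elif value_a > value_b:
        better = "A"
    else:
        better = "equal"
    return {
        "condition": f"{n_b}B versus {n_a}A",
        "value_b": value_b,
        "value_a": value_a,
        "difference": abs(value_b - value_a),
        "better": better,
    }

table = [describe(n_b, n_a) for n_b, n_a in NEW_CONDITIONS + FAMILIAR_CONDITIONS]

for row in table:
    print(row)
```

The tabulation shows that the offer worth more switches from tokens B in the new conditions to tokens A in the familiar ones, which is why a fixed preference for token B cannot maximize pay-off.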
2. Material and methods
Ten captive-born capuchin monkeys (five males, five females, average 16.4 years, range 5–27) were tested. They lived in three social groups at the Primate Centre of the Institute of Cognitive Sciences and Technologies of CNR, Rome; each group was housed in indoor–outdoor compartments (65.4–139.5 m³, depending on group size) and tested in one of the two indoor compartments (12.2 m³ each, for all groups). All compartments were furnished with wooden perches, tree trunks and branches. Separation for individual testing was achieved by first splitting the group into smaller units by means of sliding doors and then allowing one individual to enter the indoor compartment. Monkeys were not food deprived for testing. The main meal took place in the afternoon when fresh fruits, vegetables and monkey chow were provided. Water was available ad libitum.
This study complied with protocols approved by the Italian Health Ministry and all procedures were performed in full accordance with the European law on humane care and use of laboratory animals.
Tokens were objects of similar dimensions, differing in shape, material and colour. In particular, we used a blue plastic poker chip (3.7 cm in diameter), a yellow plastic poker chip (same diameter), a grey PVC cylinder (3 cm in diameter; 0.7 cm in height), a brass plug (2 cm in diameter) and a metal nut (2 cm in diameter). We randomly assigned two tokens to each subject.
Subjects were trained to exchange two types of tokens. Token A was exchanged with one reward and token B with three rewards. A reward consisted of one-eighth of a peanut seed and weighed on average 0.11±0.004 g. All capuchins had previously participated in a study on food and token relative numerousness judgments in which only tokens A were used (Addessi et al. submitted). Since all capuchins were already familiar with the object used as token A in the above study, we kept using the same type of object as token A also in the present investigation.
The training procedure consisted of placing 12 tokens into the indoor compartment, and repeatedly saying ‘give me’ to the monkey while requesting a token, with left hand outstretched and palm up. The right hand stayed in the lab coat pocket (where the rewards were hidden) until rewarding the monkey. The reward was given upon the placement of one token into the experimenter's left hand. There was a 10 s interval between one trial and the next one. Incorrect exchanges, in which tokens were thrown or incorrectly placed into the experimenter's hand, were not rewarded. Moreover, when the subject did not exchange a token within 30 s, the trial was considered incorrect and a new trial started after 10 s.
Subjects received one training session per day. Each session consisted of two blocks of 12 trials each, for a total of 24 trials. Criterion was set at 90% correct responses within two consecutive sessions. Each subject was trained to exchange tokens A until criterion and then tokens B. When criterion was reached for both types of token, subjects received six sessions of consolidation with tokens A or B alternated across days. Then, we assessed whether token B was preferred over token A; to do so, we placed 20 tokens, 10 of each type, on the cage floor, and then allowed the subject to enter and to exchange 10 tokens with the experimenter. Criterion was set at 90% correct responses (i.e. choosing token B over token A) within two consecutive sessions.
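One possible reading of the training criterion can be sketched as follows. The sketch assumes that each of two consecutive sessions must independently reach 90% correct (the text could also be read as 90% pooled over both sessions), and the session scores used in the example are hypothetical, not data from this study.

```python
def sessions_to_criterion(correct_per_session, trials_per_session=24, threshold=0.90):
    """Return the number of sessions needed to reach criterion, or None.

    Assumed reading: criterion is met at the second of two consecutive
    sessions that each reach `threshold` proportion correct.
    """
    for i in range(1, len(correct_per_session)):
        previous_ok = correct_per_session[i - 1] / trials_per_session >= threshold
        current_ok = correct_per_session[i] / trials_per_session >= threshold
        if previous_ok and current_ok:
            return i + 1  # sessions are counted from 1
    return None

# Hypothetical counts of correct exchanges out of 24 trials per session.
example_scores = [18, 22, 20, 22, 23]
print(sessions_to_criterion(example_scores))
```

With 24 trials per session, 22 correct exchanges (91.7%) is the smallest count that satisfies the 90% threshold under this reading.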
(b) Experiment 1
(i) Subjects and apparatus
Ten capuchins were tested. The apparatus was a black PVC table (65 cm×64 cm×13.5 cm) with two sliding aluminium trays (6.5×40 cm; 2.5 cm high), positioned at 32 cm distance from one another. Each tray had a hole (1.4 cm in diameter) at each end; the hole close to the experimental subject facilitated its pulling (see §2b(ii)) whereas that on the other side allowed the experimenter to block the tray by inserting a pin into the hole (figure 1).
In each trial, capuchins faced a binary choice between one token B and one to five token(s) A, depending on the experimental condition. Capuchins were tested in five conditions (1B versus 1A, 1B versus 2A, 1B versus 3A, 1B versus 4A and 1B versus 5A), each presented four times for a total of 20 trials in a pseudo-random order. Each subject received one session a day for a total of 40 sessions. Two experimenters tested the subjects: experimenter 1 sat in front of the subject's indoor enclosure, with the apparatus placed on the floor in between the experimenter and the enclosure (figure 1). Placed next to the experimenter was an opaque container with pieces of peanut inside. Experimenter 2 sat next to experimenter 1 and blocked the subject's visual access to the apparatus by an opaque screen so that subjects could neither observe the baiting process nor reach the trays during baiting. After baiting, experimenter 2 lifted the opaque screen and experimenter 1 pushed the apparatus towards the wire mesh, so that the monkey could pull one of the two trays. Both experimenters refrained from looking at the apparatus so as not to provide cues to the subject. After choosing an offer, the monkey was requested to exchange a token at once with the experimenter as in the training phase (see §2a(ii)).
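The composition of a single session (five conditions, each presented four times, in pseudo-random order) can be sketched as below. The text does not state which constraints the pseudo-randomization obeyed, so a plain seeded shuffle is used here as an assumption; the side on which each offer appeared is not modelled, and the function name is hypothetical.

```python
import random
from collections import Counter

def build_session(conditions, repeats, seed=None):
    """Return a shuffled list in which every condition appears `repeats` times.

    Assumption: an unconstrained shuffle stands in for the unspecified
    pseudo-randomization procedure.
    """
    rng = random.Random(seed)
    trials = [condition for condition in conditions for _ in range(repeats)]
    rng.shuffle(trials)
    return trials

# Experiment 1 conditions as (tokens B, tokens A): 1B versus 1A ... 1B versus 5A.
EXPERIMENT_1 = [(1, n_a) for n_a in range(1, 6)]

session = build_session(EXPERIMENT_1, repeats=4, seed=0)
counts = Counter(session)
print(len(session), dict(counts))
```

Passing a fixed seed makes the order reproducible, which is convenient when a trial sequence needs to be checked or reused.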
(c) Experiment 2
Six capuchins participated in experiment 2 since we excluded the subjects that in experiment 1, regardless of condition, preferred token B. We used the same apparatus as in experiment 1.
In each trial, capuchins faced a binary choice between one or two token(s) B and three to six tokens A, depending on the experimental condition. They received three new conditions (2B versus 4A, 2B versus 5A and 2B versus 6A), each presented twice; moreover, three familiar conditions (1B versus 5A, 1B versus 4A and 1B versus 3A), each presented twice, were interspersed in each session for a total of 12 trials presented in a pseudo-random order. In both the new and the familiar conditions, values differed by two units (2B versus 4A and 1B versus 5A), one unit (2B versus 5A and 1B versus 4A) or were the same (2B versus 6A and 1B versus 3A). Each subject received one session a day for a total of 20 sessions. All other features of the procedure were the same as in experiment 1.
3. Results
(a) Training
Each subject was trained to exchange two types of token: token A (worth one reward each) and then token B (worth three rewards each). All subjects reached criterion in an average of 6.3±1.2 sessions for token A (range 2–14) and in an average of 2.6±0.4 sessions for token B (range 2–4). Moreover, all subjects preferred token B over token A within a few sessions (average 3.4±0.4, range 2–6).
(b) Experiment 1
For each condition (1B versus 1A, 1B versus 2A, 1B versus 3A, 1B versus 4A and 1B versus 5A), we used a one-sample t-test to evaluate whether the percentage of choices for the token B differed significantly from a theoretical chance distribution with a mean of 50%. Moreover, to evaluate capuchins' initial performance, we analysed the frequency of correct choices (i.e. token B in the conditions 1B versus 1A and 1B versus 2A, and tokens A in the conditions 1B versus 4A and 1B versus 5A) in the first 12 trials (corresponding to the first three sessions) of each condition by a binomial test. All analyses were performed at the individual level.
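The two tests named above can be sketched with the Python standard library alone. The sketch rests on several assumptions that the text does not confirm: that the observations for the t-test are per-session percentages of token B choices, that the binomial test is two-sided against a chance probability of 0.5, and that the example numbers (which are hypothetical) resemble real data. Only the t statistic and its degrees of freedom are computed; obtaining a p-value would require a t-distribution table or a statistics package.

```python
from math import comb, sqrt
from statistics import mean, stdev

def one_sample_t(values, mu=50.0):
    """Return (t statistic, degrees of freedom) for a one-sample t-test against mu."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    return (mean(values) - mu) / standard_error, n - 1

def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value: sum of outcomes no more likely than observed."""
    probabilities = [
        comb(trials, k) * p**k * (1 - p) ** (trials - k) for k in range(trials + 1)
    ]
    observed = probabilities[successes]
    # Small tolerance guards against floating-point ties.
    total = sum(q for q in probabilities if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical per-session percentages of token B choices (not study data).
t_statistic, degrees_of_freedom = one_sample_t([100.0, 75.0, 100.0, 75.0])

# Hypothetical outcome: 10 correct choices in the first 12 trials of a condition.
p_value = binomial_two_sided_p(10, 12)
print(round(t_statistic, 3), degrees_of_freedom, round(p_value, 4))
```

Under these assumptions, 10 correct choices out of 12 falls below the conventional 0.05 threshold, whereas 9 out of 12 would not.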
As shown in table 1, the 10 subjects behaved according to three main strategies. Four capuchins (Sandokan, Gal, Paprica and Pepe) maximized their pay-off by selecting the token B when this was advantageous (i.e. when one token B was presented against one or two token(s) A), and selecting tokens A when this led to more than the three rewards of the token B (i.e. when four or five tokens A were presented against one token B). Three of these subjects were at chance level when the two offers had the same value, whereas a fourth (Sandokan) significantly preferred three tokens A over one token B. Moreover, the performance of the four maximizing capuchins depended to some extent on the numerical ratio of the food quantities represented by token offers (see electronic supplementary material and figure 2a).
Four other capuchins (Virginia, Pippi, Robinia and Cammello) preferred token B regardless of condition (table 1), although their performance showed some differences between the conditions (see electronic supplementary material). Finally, two capuchins (Robot and Carlotta) significantly preferred token B only when presented against one token A; in all other conditions (except for Carlotta in 1B versus 2A), they significantly preferred tokens A (table 1) and their performance significantly increased with the number of tokens A presented (see electronic supplementary material).
Although no subject maximized its pay-off in all conditions from the very beginning of experiment 1 (table 2), three out of the four maximizing subjects significantly preferred token B when presented with 1B versus 1A and 1B versus 2A, and one and two subjects significantly chose tokens A when presented with 1B versus 4A and 1B versus 5A, respectively. The most problematic condition was 1B versus 4A, in which only one capuchin preferred tokens A over token B above chance level; this difficulty is not unexpected since this condition has the highest numerical ratio (75%) among the unequal-value conditions presented in experiment 1.
(c) Experiment 2
For each condition (2B versus 4A, 2B versus 5A, 2B versus 6A, 1B versus 5A, 1B versus 4A and 1B versus 3A), we used a one-sample t-test to evaluate whether the percentage of choices for the token B differed significantly from a theoretical chance distribution with a mean of 50%. Moreover, to evaluate capuchins' initial performance, we analysed the frequency of correct choices (i.e. token B in the conditions 2B versus 4A and 2B versus 5A, and tokens A in the conditions 1B versus 4A and 1B versus 5A) in the first 12 trials (corresponding to the first six sessions) of each condition by a binomial test. All analyses were performed at the individual level.
As shown in table 3, only one subject (Sandokan) always maximized his pay-off by selecting the token B when this was advantageous (i.e. when two tokens B were presented against four or five tokens A), and selecting tokens A when four or five tokens A were presented against one token B. Carlotta behaved similarly to Sandokan in the conditions 2B versus 4A, 1B versus 4A and 1B versus 5A, but her performance did not reach statistical significance for B choices in condition 2B versus 5A. As in experiment 1, the performance of the maximizing capuchins depended to some extent on the numerical ratio of the food quantities represented by token offers (see electronic supplementary material and figure 2b).
Of the remaining four subjects, Pepe was at chance level in all conditions and Paprica, Gal and Robot developed a preference for token B regardless of condition. A comparison of their performance in conditions 1B versus 3A, 1B versus 4A and 1B versus 5A between the last 20 sessions of experiments 1 and 2 showed that their preference for token B increased significantly in experiment 2 (Paprica: 1B versus 3A t19=10.2, p<0.0001; 1B versus 4A t19=11.7, p<0.0001; 1B versus 5A t19=6.0, p<0.0001; Gal: 1B versus 3A t19=5.0, p<0.0001; 1B versus 4A t19=3.1, p<0.01; 1B versus 5A t19=4.7, p<0.001; Robot: 1B versus 3A t19=3.9, p<0.01; 1B versus 4A t19=4.3, p<0.001; 1B versus 5A t19=2.6, p<0.05).
The analysis of the first 12 trials, corresponding to the first six sessions of experiment 2 (table 4), showed that only Sandokan significantly maximized his pay-off from the very beginning in all conditions except 2B versus 5A; however, in this condition his performance improved over time (sessions 1–10 versus sessions 11–20: t9=−5.0, p<0.001).
4. Discussion
Our study evaluated whether capuchins, after learning to exchange two types of tokens (A and B) for different quantities of the same food, maximized their pay-off when offered binary choices between various quantities of tokens A and B. Since in both experiments token numerousness and the value of the corresponding offer were often discordant, the assessment of the offer value required both the estimation of token quantities and the representation of what each token stands for.
In experiment 1, four out of ten capuchins maximized pay-off by taking into account both token numerousness and what each token represents; at least in some conditions, they correctly chose the highest value offers from the first sessions. This flexible strategy indicates a relativistic concept of relations, i.e. the appreciation that an object (such as token B) can have different properties depending on the properties of the object(s) it is put in relation with (Piaget & Inhelder 1974). Moreover, we found that, at least to some extent, capuchins' performance depended on the ratio between the number of food items represented by token offers. This is in agreement with recent studies showing that non-human primates rely on the analogue magnitude system (based on a ratio signature) also when representing small quantities (e.g. vanMarle et al. 2006; Tomonaga 2007; Addessi et al. submitted). However, in our study, capuchins' performance was not only ratio dependent. In fact, when comparing conditions 1B versus 2A and 1B versus 4A, capuchins showed a pattern opposite to that expected on the basis of the numerical ratio, i.e. their performance was higher in the condition 1B versus 4A (having the highest ratio, 75%) than in the condition 1B versus 2A (whose ratio was 66.7%), as if condition 1B versus 2A was the most difficult one. The reason for this apparent inconsistency might be that in the condition 1B versus 2A, before representing the total amount of food they can get for each choice, capuchins also have to inhibit their spontaneous tendency to take the two tokens A (i.e. the most numerous offer).
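The ratios quoted in the text are consistent with dividing the smaller offer value by the larger one. A minimal sketch of that calculation follows; the function name is hypothetical, and the formula is inferred from the reported percentages rather than stated explicitly in the text.

```python
def numerical_ratio(n_b, n_a, value_b=3, value_a=1):
    """Ratio (smaller / larger) between the food quantities represented by two offers."""
    total_b = n_b * value_b
    total_a = n_a * value_a
    return min(total_b, total_a) / max(total_b, total_a)

# Conditions whose ratios are reported in the text, as (tokens B, tokens A).
reported = {
    "1B versus 2A": (1, 2),
    "1B versus 4A": (1, 4),
    "2B versus 4A": (2, 4),
    "2B versus 5A": (2, 5),
}

percentages = {
    name: round(100 * numerical_ratio(n_b, n_a), 1)
    for name, (n_b, n_a) in reported.items()
}
print(percentages)
```

A ratio closer to 100% means the two offers are closer in value and, under a ratio-based account, harder to discriminate.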
Of the remaining six capuchins, two consistently used token numerousness as a criterion for choice (‘numerousness’ strategy), i.e. their performance significantly increased with the number of tokens A presented. This indicates that these two capuchins could not take into account the value of each offer, and that they were unable to inhibit the choice of the highest quantity of tokens (even though it did not represent the highest pay-off, as in the condition 1B versus 2A). Finally, the last four capuchins preferred the token B (‘B-only’ strategy) regardless of the number of tokens A presented. These subjects behaved as if they had a categorical concept of relations, i.e. as if they considered relations in terms of absolute properties or attributes of objects, as children do at the preoperational stage (Piaget & Inhelder 1974). Similarly, the capuchins and the chimpanzees tested by Brosnan & de Waal (2004, 2005) systematically chose the most valuable token instead of exchanging the token corresponding to the food shown by the experimenter. However, since in our study capuchins' choice of token B differed to some extent between conditions, it cannot be excluded that the four capuchins behaving according to the B-only strategy were quite sensitive to the discounting cost (i.e. the time required to exchange each token with the experimenter) and/or to the transaction cost (i.e. the effort made when exchanging) associated with the task.
Experiment 2 was more difficult than experiment 1 since in the new conditions proposed (namely 2B versus 4A and 2B versus 5A), in addition to estimating token quantities and representing what each token stands for, capuchins had to combine the representation of quantities and to compare them before making a choice. Nonetheless, in the new conditions, one capuchin (Sandokan) always maximized his pay-off; he did so from the first trials in the condition 2B versus 4A, whereas in the condition 2B versus 5A (the one with the highest ratio, 83.3%) he became proficient later on. Another capuchin (Carlotta) learned over time to maximize her pay-off in all conditions except 2B versus 5A, and both successful capuchins kept correctly choosing tokens A in the familiar conditions 1B versus 4A and 1B versus 5A. The performance of these two maximizing capuchins partly depended on the numerical ratio between the food quantities. However, when comparing conditions 1B versus 4A and 2B versus 4A (whose ratios were 75 and 66.7%, respectively), Sandokan behaved similarly in both conditions, whereas Carlotta showed a pattern opposite to that expected on the basis of the ratio between quantities (performing better in the condition 1B versus 4A than in the condition 2B versus 4A). These results can be explained by the fact that performance in the condition 2B versus 4A does not depend only on the numerical ratio, but also requires combining quantities. Of the remaining four capuchins, three developed the simple strategy of always selecting token B (no matter how many tokens A were presented), and one individual was at chance level.
By taking into account both numerical dimensions of the problem, i.e. the amount of food each token represents and the number of tokens presented, the two successful capuchins outperformed the two chimpanzees and the rhesus macaque (Macaca mulatta) tested by Beran et al. (2005). In one condition of their summation task, subjects were required to choose between one container A of a given colour, representing a certain quantity, and two containers B of another colour, each representing a smaller quantity, which summed together yielded a greater quantity than container A. The chimpanzees and the macaque were unable to maximize their pay-off since they relied on one stimulus dimension only (colour) without taking into account the respective numerousness of containers A and B.
The above findings mirror and extend the results of the few other studies on summation in monkeys. In Olthof et al.'s study (1997), two squirrel monkeys (Saimiri sciureus) trained to select the larger between Arabic numeral pairs, when presented with pairs of stimuli representing multiple Arabic numerals, preferred the stimulus representing the largest sum. Similarly, in studies employing the expectancy violation paradigm (in which longer looking times at impossible events are expected in comparison with possible test events), rhesus monkeys and cotton-top tamarins (Saguinus oedipus) responded differently to the presentation of events violating simple arithmetic laws and to events that were consistent with such laws, suggesting an understanding of combination and dissociation operations on small and large numbers (e.g. Uller et al. 2001; Flombaum et al. 2005).
In conclusion, at least some of the capuchins maximized their pay-off by using tokens as symbols (sensu Peirce; see § 1); to do so, they reasoned about token quantities and flexibly combined them. Future studies should evaluate whether capuchins take advantage of the use of symbols (DeLoache 2002) by directly comparing their behaviour with real food and tokens (as in Boysen & Berntson 1995). Nonetheless, our innovative paradigm to test quantitative and representational skills might offer new insights into the limits of numerical competence and abstract reasoning in non-human primates.
We thank Dan Ariely for inspiring discussions and constructive comments. We are grateful to Valentina Truppa and Francesco Natale for their statistical advice, and to Elena Gonzalez Torres, Monica Maranesi, Francesca Virgili and Alessandra Mancini for data collection. We especially thank Gloria Sabbatini for drawing figure 1 and two anonymous referees whose thoughtful comments greatly improved a previous version of the manuscript. We also thank the Bioparco SPA for hosting our Primate Centre and our keepers Massimiliano Bianchi and Simone Catarinacci. Funded by VI Framework, NEST Pathfinder Initiative ‘What it means to be human’, contract no. 12984, ‘Stages in the Evolution and Development of Sign Use’—SEDSU.