Royal Society Publishing

Tokens improve capuchin performance in the reverse–reward contingency task

Elsa Addessi, Sabrina Rossi

Abstract

In humans and apes, one of the most adaptive functions of symbols is to inhibit strong behavioural predispositions. However, to our knowledge, no study has yet investigated whether using symbols provides some advantage to non-ape primates. We aimed to trace the evolutionary roots of symbolic competence by examining whether tokens improve performance in the reverse–reward contingency task in capuchin monkeys, which diverged from the human lineage approximately 35 Ma. Eight capuchins chose between: (i) two food quantities, (ii) two quantities of ‘low-symbolic distance tokens’ (each corresponding to one unit of food), and (iii) two ‘high-symbolic distance tokens’ (each corresponding to a different amount of food). In all conditions, subjects had to select the smaller quantity to obtain the larger reward. No procedural modifications were employed. Tokens did improve performance: five subjects succeeded with high-symbolic distance tokens, though only one succeeded with food, and none succeeded with low-symbolic distance tokens. Moreover, two of the five subjects transferred the rule to novel token combinations. Learning effects or preference reversals could not account for the successful performance with high-symbolic distance tokens. This is, to our knowledge, the first demonstration that tokens do allow monkeys to inhibit strong behavioural predispositions, as occurs in chimpanzees and children.

1. Introduction

In the endless search for what makes humans unique, a long-debated issue is whether non-human animals comprehend and use symbols. Although animals lack a fully-fledged symbolic sign system [1], understanding how they process meaningful sign–object associations is fundamental to tracing the evolutionary roots of human symbolic competence.

The broadest research programme on the use of symbolic stimuli by non-human animals has been undertaken with apes [2–4], whereas symbolic competence in monkeys has so far been scarcely investigated (e.g. [5,6]). However, to understand which factors shaped complex cognitive skills such as symbolic competence, it is important to assess whether species evolutionarily more distant from humans than apes have also evolved these capacities.

Capuchin monkeys are a valuable model to investigate the evolution of traits that are considered uniquely human because, despite 35 Myr of independent evolution, they show many striking analogies with us in terms of encephalization index, ontogeny, lifespan, omnivorous diet, manipulative skills and stone tool use [7,8]. Recently, we began to investigate capuchin symbolic competence by using tokens [9–11], i.e. inherently non-valuable objects that acquire an arbitrary value upon exchange with the experimenter [12]. When presented with choices between various amounts and combinations of tokens, capuchins flexibly operated on them to represent, estimate and combine quantities [9,10]. Furthermore, their preferences for foods and tokens were qualitatively similar and satisfied transitivity, a fundamental trait of rational decision-making [11].

Nevertheless, it is still unclear whether symbolic stimuli allow capuchins to inhibit strong behavioural predispositions, i.e. whether they serve one of the most adaptive functions of abstract representations (e.g. [13]). Thus, we aimed to assess whether capuchins can take advantage of the use of tokens in the reverse–reward contingency task, an inhibition task in which subjects presented with two different quantities of items must select the smaller quantity to obtain the larger reward (e.g. [3]). Most individuals tested in this paradigm with food were unable to succeed spontaneously, and they could inhibit the selection of the larger amount only when procedural modifications, such as the large-or-none contingency or correction trials, were implemented (e.g. [14–16]). Only sporadic cases of success have been reported, and these occurred after hundreds or even thousands of trials [17].

The advantage provided by using symbolic stimuli in the reverse–reward contingency task has thus far been investigated extensively only in chimpanzees (Pan troglodytes) (e.g. [3]). Whereas chimpanzees tested with food stimuli could not inhibit their strong motivation to choose the larger array, when tested with Arabic numerals they immediately succeeded by selecting the smaller numeral, which led to the larger reward. Thus, symbols allowed chimpanzees to achieve psychological distancing from the incentive features of food (e.g. [13]), improving their performance. By contrast, symbolic stimuli closely resembling the food, such as rocks having a one-to-one correspondence with food, did not ameliorate performance [3]. The above findings are supported by a more recent study [18] in which apes performed better when two coloured dishes representing food quantities replaced the food, and similar results have been obtained in 3-year-old children [19]. However, attempts to show that the use of coloured-food associations helps monkeys to succeed in the reverse–reward contingency task failed unless procedural modifications were used [16].

We tested capuchins in the reverse–reward contingency task with food stimuli and with two types of symbolic stimuli (‘low-symbolic distance tokens’ and ‘high-symbolic distance tokens’) having a different level of abstraction from real food. Specifically, the low-symbolic distance tokens corresponded to one unit of food each and thus closely resembled the food arrays (as the rocks in Boysen [3]), whereas the high-symbolic distance tokens corresponded to a different amount of food each, being more distant representations of the real food (as the Arabic numerals in Boysen [3]). We expected capuchin performance to improve with high-symbolic distance tokens, but not with low-symbolic distance tokens, as in chimpanzees and 3-year-old children tested in the same task [3,19]. In the latter studies, the perceptual features of the array mass (either food or inedible objects) interfered with performance, and individuals achieved success only when the array mass was replaced by symbols having the same reward contingency, but different perceptual features. Furthermore, we expected successful capuchins to transfer the reverse–reward contingency rule to novel quantity pairs of high-symbolic distance tokens but not from high-symbolic distance tokens to food quantities.

2. Material and methods

(a) Subjects

We tested eight capuchins housed at the Istituto di Scienze e Tecnologie della Cognizione of CNR, Rome (electronic supplementary material, table S1).

(b) Apparatus and general procedure

We used a modified version of the reverse–reward contingency paradigm (e.g. [3]). There was an experimental phase with three conditions: Food, Low-symbolic distance tokens and High-symbolic distance tokens (hereafter FOOD, LSDT and HSDT), a control phase and three generalization phases with two conditions (FOOD and HSDT; electronic supplementary material, table S2). Depending on the condition, capuchins faced choices between two arrays of food items (one-eighth of peanut seed each), two arrays of LSDT, i.e. objects corresponding to one-eighth of peanut seed each, or two HSDT, i.e. objects worth two different amounts of food (hereafter, high-value and low-value tokens). Familiar objects differing in shape, material and colour were used as tokens; their assignment was counterbalanced across subjects.

The apparatus was a platform (62 × 40 × 15 cm) with two transparent boxes (12 × 20 × 15 cm), 28 cm apart, where different quantities of food or different quantities/types of tokens (according to phase) were available (electronic supplementary material, video clips S1, S2 and S3). The left–right position of the two options was counterbalanced across trials. The subject made its choice by inserting a finger through one of two openings in the wire mesh (8.5 × 3.8 cm) into a small hole (diameter 2 cm) in the selected box.

A successive procedure was used (e.g. [3]). Once the subject made its choice, the selected item(s) was removed from the box in full view of the subject and placed in a container positioned behind the apparatus, and then the items present in the non-selected box (either food items or a number of food items corresponding to the non-selected token or token array) were given to the subject. No procedural modifications (such as the large-or-none contingency or correction trials, e.g. [14–16]; see the electronic supplementary material) were employed.
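The payoff rule of the successive procedure can be sketched in a few lines. This is only an illustration of the contingency described above, not the authors' materials; the function name `reward_for_choice` and the tuple encoding of the two options are assumptions made for the example.

```python
def reward_for_choice(values, chosen):
    """Return the number of food items delivered after a choice.

    values: food-equivalent amounts of the two boxes, e.g. (2, 5).
            For tokens this is the food each option is worth.
    chosen: index (0 or 1) of the box the subject selected.

    Under the reverse-reward contingency the subject receives the
    contents (or food equivalent) of the box it did NOT select.
    """
    if chosen not in (0, 1):
        raise ValueError("chosen must be 0 or 1")
    return values[1 - chosen]


pair = (2, 5)  # the 2 : 5 quantity pair used in the experimental phase
print(reward_for_choice(pair, 0))  # selecting the smaller option
print(reward_for_choice(pair, 1))  # selecting the larger option
```

Selecting the smaller option is therefore the only way to obtain the larger reward, which is what makes the task a test of inhibition.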

For each session, the number of correct choices and the number of choices on the right side were scored. The latter measure allowed us to calculate a side bias index (|number of right choices − number of left choices|/total number of trials). This index varies between 0 and 1, where ‘0’ corresponds to the absence of side bias and ‘1’ indicates an absolute preference for one of the two sides.
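As a worked illustration of the index, the sketch below takes the absolute difference between right and left choices, which is what keeps the value between 0 and 1 as stated. It assumes every trial ends in either a left or a right choice; the function name is hypothetical.

```python
def side_bias_index(right_choices, left_choices):
    """Side bias index: |right - left| / total trials.

    Assumes each trial is scored as exactly one left or one right
    choice, so the total is simply right + left.
    0 means no side bias; 1 means every choice went to the same side.
    """
    total = right_choices + left_choices
    if total == 0:
        raise ValueError("at least one trial is required")
    return abs(right_choices - left_choices) / total


# Hypothetical 20-trial sessions
print(side_bias_index(10, 10))  # balanced choices
print(side_bias_index(15, 5))   # moderate bias to the right
print(side_bias_index(0, 20))   # exclusive choice of the left side
```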

(c) Token training

Before the onset of the study, subjects learned to exchange two HSDT for two and five food items, respectively (electronic supplementary material), until a criterion of 90 per cent correct responses in two consecutive 24-trial sessions was reached. When criterion was reached for both types of token, each subject received six sessions of consolidation, in which the same procedure employed during training was used and the two types of token alternated across days. After training, we assessed whether the high-value token was indeed preferred to the low-value token by placing 20 tokens, 10 of each type, on the compartment floor, and allowing each subject to exchange 10 tokens with the experimenter until a criterion of 90 per cent exchanges of the high-value token for two consecutive sessions was reached. All subjects were already familiar with the tokens worth one and three food items [9].

(d) Experimental phase

In the experimental phase, subjects were presented with choices between the same quantity pair (2 : 5) of (i) peanuts (FOOD-condition) or (ii) identical tokens (LSDT-condition), or (iii) between a high-value token, worth five food items, and a low-value token, worth two food items (HSDT-condition; electronic supplementary material, table S2). In each condition, subjects were tested until they reached at least 85 per cent correct responses (i.e. choice of the smaller quantity) over five consecutive sessions, or until they had received twenty 20-trial sessions. In the LSDT- and HSDT-conditions, token exchange was not required.
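The stopping rule can be made concrete with a short sketch. It adopts one reading of the criterion, namely that each of five consecutive sessions must score at least 85 per cent; the text could also be read as a pooled percentage across the five sessions. The function name, parameter names and example scores are hypothetical.

```python
def session_of_criterion(correct_per_session, trials=20, threshold_pct=85,
                         window=5, max_sessions=20):
    """Return the 1-based session on which criterion is met, or None.

    Assumed reading: criterion is met at the end of the first run of
    `window` consecutive sessions that each score >= threshold_pct.
    Sessions beyond `max_sessions` are ignored, mirroring the cap of
    twenty sessions per condition.
    """
    run = 0  # length of the current streak of passing sessions
    for session, correct in enumerate(correct_per_session[:max_sessions], start=1):
        # Integer comparison avoids floating-point rounding at exactly 85%.
        if correct * 100 >= threshold_pct * trials:
            run += 1
        else:
            run = 0
        if run == window:
            return session
    return None


# Hypothetical subject: 11 weak sessions, then five sessions at 18/20 correct.
hypothetical = [8] * 11 + [18] * 5
print(session_of_criterion(hypothetical))
```

Under this reading, a subject whose first passing session is session 12 and who keeps passing reaches criterion on session 16.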

(e) Control phase

To assess to what extent capuchin performance in the HSDT-condition of the experimental phase was improved by the use of tokens rather than by the experience gathered during the FOOD-condition and the LSDT-condition, after the completion of the experimental phase each subject received a total of nine 20-trial control sessions with food, LSDT and HSDT alternated across days (FOOD-condition, LSDT-condition and HSDT-condition). The order of presentation of the three conditions was counterbalanced across subjects.

(f) Generalization phases

The five successful capuchins in the experimental phase and/or in the control phase (Robinia, Carlotta, Paprica, Robot and Sandokan) were tested in generalization phase 1 with another item pair (1 : 3) and with the familiar item pair (2 : 5) (either food or HSDT, depending on the condition in which they previously succeeded; electronic supplementary material, table S2). To participate in generalization phase 1, capuchins had either to have reached the criterion in the experimental phase or to have shown, in the control phase, an overall performance significantly above chance with a significant number of correct choices in at least one session.

The two successful capuchins of generalization phase 1 (Robinia and Sandokan) were tested in generalization phase 2 with a novel item pair (1 : 2) and with the familiar item pair (2 : 5), and in generalization phase 3 with multiple item pairs. Robinia was tested in the FOOD-condition with all the possible pairs of quantities between one and five food items (1 : 2, 1 : 3, 1 : 4, 1 : 5, 2 : 3, 2 : 4, 2 : 5, 3 : 4, 3 : 5 and 4 : 5) and Sandokan was tested in the HSDT-condition with the novel token pairs 2 : 3 and 3 : 5 and the familiar token pairs 1 : 2, 1 : 3 and 2 : 5 (electronic supplementary material, table S2). In the HSDT-condition, token exchange was not required.
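The ten food comparisons listed for Robinia are exactly the unordered pairs of distinct quantities from one to five. A minimal sketch that enumerates them (variable names are illustrative only):

```python
from itertools import combinations

# All unordered pairs of distinct quantities from 1 to 5 food items.
food_pairs = list(combinations(range(1, 6), 2))

for smaller, larger in food_pairs:
    print(f"{smaller} : {larger}")

print(len(food_pairs), "pairs in total")
```

Because `combinations` yields pairs in lexicographic order, the sequence matches the order in which the pairs are listed in the text.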

(g) Experimental and generalization phases: preliminary preference test

In all phases, before the onset of the HSDT-condition, we carried out a preliminary preference test to verify that the high-value token was preferred to the low-value token. When multiple token pairs were used, preference was assessed separately for each token pair. We placed 20 tokens, 10 of each type, on the compartment floor, and then allowed the subject to exchange 10 tokens with the experimenter. For each token pair, criterion was set at 80 per cent correct responses (i.e. exchange of the high-value token) in a single 10-trial session.

(h) Experimental and generalization phases: final preference test

In all phases, to exclude the possibility that a positive performance in the HSDT-condition was owing to a preference reversal between the two tokens, we carried out a final preference test to assess whether capuchins still preferred the high-value token to the low-value token by using the same procedure as in the preliminary preference test.

3. Results

(a) Experimental phase

In the FOOD-condition, all subjects performed below chance except for Robinia (figure 1 and electronic supplementary material, figure S1). She mastered the task, performing above chance for the first time on session 12 and reaching criterion on session 16 (electronic supplementary material, table S3). In the LSDT-condition, no subject mastered the task (figure 1 and electronic supplementary material, figure S2 and table S4). In the HSDT-condition, subjects completed training (including the six sessions of consolidation and the preference test) in an average of 13.5 ± 0.7 sessions (range: 12–18). During testing, three subjects (Carlotta, Robot and Sandokan) performed above chance (figure 1). Specifically, they correctly chose the low-value token above chance for the first time on sessions 1, 6 and 16 and reached criterion on sessions 5, 10 and 20, respectively (electronic supplementary material, figure S3 and table S5). In the final preference test, they maintained their preference for the high-value token, which was chosen and exchanged in 80 per cent of the trials.

Figure 1.

Experimental phase. Percentage of correct choices in the last five sessions of each condition (FOOD, LSDT, HSDT). The dotted line depicts the chance level: *p < 0.05; **p < 0.005. Unfilled bars, FOOD; light grey bars, LSDT; black bars, HSDT.
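The text here does not state which test underlies the comparison with chance. One common choice for a two-option task is a one-tailed exact binomial test against a chance level of 0.5; the sketch below works under that assumption only, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p).

    Used here as a one-tailed exact test of whether the number of
    correct choices exceeds what chance responding would produce.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical 20-trial session with 15 correct choices
print(binomial_upper_tail(15, 20))
```

The same function can be applied to pooled trials (for example, the last five sessions) by passing the pooled counts instead of a single session's counts.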

(b) Experimental phase versus control phase

Experience apparently affected capuchin performance with food; the percentage of correct choices differed between the FOOD-condition of the experimental phase and the same condition of the control phase (t7 = −2.93, p = 0.02). However, this apparent improvement was owing to a significant increase in the side bias index in the control phase (t7 = −2.98, p = 0.02). Percentage of correct choices, but not side bias index, marginally differed between the LSDT-condition of the experimental phase and the same condition of the control phase (correct choices: t7 = −2.31, p = 0.05; side bias index: t7 = −1.66, p = 0.14), whereas percentage of correct choices and side bias index did not significantly differ between the HSDT-condition of the experimental phase and the same condition of the control phase (correct choices: t7 = −1.64, p = 0.14; side bias index: t7 = 0.49, p = 0.64).
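The reported statistics have 7 degrees of freedom, which is what a paired comparison across eight subjects gives. The sketch below shows how such a paired t statistic is computed; the scores are invented for illustration and are not the study's data, and the direction of subtraction (experimental minus control) is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the per-subject
    differences first - second and sd is the sample standard deviation.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical percentages of correct choices for eight subjects
experimental = [20, 25, 30, 15, 35, 40, 22, 28]
control = [30, 28, 45, 20, 33, 55, 30, 41]
t_value, df = paired_t(experimental, control)
print(f"t{df} = {t_value:.2f}")
```

With this subtraction order, a negative t indicates higher scores in the control phase than in the experimental phase.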

Nevertheless, some subjects improved their performance from the experimental phase to the control phase. In particular, three subjects performed above chance: Robot in the FOOD-condition and in the LSDT-condition, and Paprica and Robinia in the HSDT-condition. However, when looking at individual performance in each session, only Paprica and Robinia in the HSDT-condition performed above chance in one out of three sessions and in two out of three sessions, respectively. Conversely, one subject (Carlotta) performed worse in the HSDT-condition of the control phase than in the same condition of the experimental phase (electronic supplementary material, table S6).

(c) Generalization phases

In generalization phase 1, in the FOOD-condition, Robinia successfully transferred the reverse–reward contingency rule to the novel pair of food items (1 : 3) (figure 2). She performed above chance beginning in session 1 and reached criterion in session 5 (electronic supplementary material, figure S4 and table S7). In the HSDT-condition, all subjects (n = 5) successfully completed the preliminary preference test in 2.2 ± 1.0 sessions (range: 1–4) for the token pair 1 : 3, and in 3.4 ± 1.5 sessions (range: 1–11) for the token pair 2 : 5. During testing, two subjects out of five (Sandokan and Robinia) successfully transferred the reverse–reward contingency rule to the novel token pair (1 : 3) (figure 2). Specifically, Sandokan began to perform above chance in session 1 and reached criterion in session 5, whereas Robinia performed above chance beginning in session 3 and reached criterion in session 8 (electronic supplementary material, table S7). By contrast, Carlotta, Paprica and Robot did not generalize to the token pair 1 : 3 (electronic supplementary material, figure S4 and table S7). In the final preference test, Sandokan and Robinia maintained their preference for the high-value token, which was chosen and exchanged in 100 per cent of the trials (token pair 1 : 3) and in 90 per cent of the trials (token pair 2 : 5).

Figure 2.

Generalization phases 1 and 2. Percentage of correct choices in the last five sessions of the FOOD- and HSDT-conditions. The dotted line depicts the chance level: **p < 0.005. Unfilled bars, generalization phase 1–1 : 3; dark grey bars, generalization phase 1–2 : 5; black bars, generalization phase 2–1 : 2; light grey bars, generalization phase 2–2 : 5.

In generalization phase 2, in the FOOD-condition, Robinia successfully transferred the reverse–reward contingency rule to the novel pair of food items (1 : 2) (figure 2). She performed above chance starting in session 1 and reached criterion on session 5 (electronic supplementary material, figure S5 and table S8). In the HSDT-condition, both subjects (Sandokan and Robinia) successfully completed the preliminary preference test in four sessions for the novel token pair 1 : 2, and in a single session for the familiar token pair 2 : 5. During testing, only Sandokan successfully transferred the reverse–reward contingency rule to the novel token pair (1 : 2) (figure 2). He performed above chance beginning in session 8 and reached criterion in session 12 (electronic supplementary material, table S8). By contrast, Robinia did not generalize to the novel token pair 1 : 2, and her performance also deteriorated with the familiar token pair 2 : 5 (figure 2 and electronic supplementary material, figure S5 and table S8). In the final preference test, Sandokan maintained his preference for the high-value token, which was chosen and exchanged in 80 per cent of the trials (token pair 1 : 2) and in 100 per cent of the trials (token pair 2 : 5).

In generalization phase 3, in the FOOD-condition, Robinia performed above chance beginning in session 1 and reached criterion in session 5. She successfully transferred the reverse–reward contingency rule to all the novel pairs of food items, with the only exception of the 4 : 5 food pair, for which her performance was at chance level (figure 3 and electronic supplementary material, table S9). In the HSDT-condition, Sandokan successfully completed the preliminary preference test in one or two sessions with all the token pairs. During testing, Sandokan performed above chance in session 1 and reached criterion on session 5 (figure 3 and electronic supplementary material, table S9). He successfully transferred the reverse–reward contingency rule to the novel token pairs (2 : 3 and 3 : 5). In the final preference test, Sandokan maintained his preference for the high-value token, which was chosen and exchanged in 80 per cent of the trials (token pairs 1 : 3 and 2 : 3), in 90 per cent of the trials (token pairs 1 : 2 and 3 : 5) and in 100 per cent of the trials (token pair 2 : 5).

Figure 3.

Generalization phase 3. Percentage of correct choices in the last five sessions of the FOOD- and HSDT-conditions. The dotted line depicts the chance level: ***p < 0.001. Unfilled bars, Sandokan; black bars, Robinia.

4. Discussion

This was, to our knowledge, the first study to demonstrate that symbolic stimuli improve spontaneous performance of monkeys in the reverse–reward contingency task. Although in the FOOD-condition only one subject mastered the task, in the HSDT-condition five subjects succeeded. Thus, tokens allowed capuchins to achieve psychological distancing from the incentive features of food (e.g. [13]), leading them to avoid impulsive choices in favour of more advantageous alternatives. Similar results have never been obtained with animals other than apes or young children [3,18,19]. In fact, previous attempts to show that monkeys can solve the reverse–reward contingency task when arbitrary representation of the food replaced actual food did not produce positive results unless procedural modifications were employed [16].

Interestingly, no capuchin mastered the task in the LSDT-condition in which we presented choices between two arrays of identical tokens, each corresponding to one piece of food, as the perceptual features of the array mass interfered with performance, as shown for apes and children. In chimpanzees tested with symbolic objects having a one-to-one correspondence with real food (i.e. rocks), only two subjects showed a better performance with the rocks than with food stimuli, although they did not perform as well as with Arabic numerals [3]. Likewise, preschoolers tested in the reverse–reward contingency with rocks or with arrays of dots did not perform better than when tested with food [19].

Although capuchins received the HSDT-condition as the last condition of the experimental phase, their better performance was not owing to the familiarity with the reverse–reward contingency procedure gained during the FOOD-condition and the LSDT-condition, as shown when comparing the experimental phase with the control phase. The apparent improvement in performance observed in the FOOD-condition was in fact owing to a significant increase of side bias, and only in the HSDT-condition did two subjects improve their performance over time. The possibility that reversal learning accounted for the successful performance of capuchins in the HSDT-condition can also be excluded because all subjects maintained their preference for the high-value token in the preference test carried out after the completion of the test phase. This was expected based on previous results showing that reverse–reward contingency and reversal learning are based on different neural substrates [20].

Notably, in the HSDT-condition, two of the five successful subjects (Sandokan and Robinia) generalized the reverse–reward rule to novel combinations of tokens. In generalization phase 1, they performed successfully with the token pair 1 : 3, which they had encountered in a previous study on numerical competence [9] where, in order to maximize their reward, they had to obey the rule ‘choose 3 over 1’, opposite to that employed in the present study (‘choose 1 over 3’). Sandokan performed correctly beginning with the first trial, and Robinia quickly transferred the reverse–reward rule to the novel token combination. Moreover, in the challenging generalization phase 2, Sandokan transferred the reverse–reward rule to the token pair 1 : 2 (in which the previously correct response—‘choose 2’—now leads to an incorrect outcome), and subsequently, in generalization phase 3, he immediately performed well with other token pairs, including combinations where the ratio between the rewards (2 : 3) was much smaller than the original one (2 : 5).

However, the difficulty of transferring the reverse–reward rule to novel token quantities was experienced by all successful individuals but Sandokan, and the decrease in performance over time experienced by Carlotta with HSDT reflects a limited mastery of the dual nature of tokens [21]. A similar phenomenon has been observed previously when the same population of capuchins was tested in two other tasks involving the comparison between a ‘real’ and a ‘symbolic’ condition [10,11]. When presented with relative numerousness judgements with food and tokens, capuchins performed better with food than with tokens [9]. Similarly, when assessing preference transitivity, their preferences for food and tokens were qualitatively similar; quantitatively, however, values measured with food and with tokens differed systematically [10,11].

In the FOOD-condition, the only successful subject (Robinia) immediately generalized the reverse–reward contingency rule to virtually all the novel food comparisons, as has been demonstrated in other species (sea lions, [22]; mangabeys, [23]; squirrel monkeys, [24]; lemurs, [25]; apes, [26]). As in chimpanzees [27], the numerical ratio between quantities affected Robinia's performance with different food pairs (but not Sandokan's performance with different combinations of HSDT).

Robinia also transferred the reverse–reward rule from the FOOD-condition to the HSDT-condition, although only for two token pairs (2 : 5 and 1 : 3). In a different task, lemurs trained to solve a reverse–reward contingency problem with two different quantities of the same food (Quantity condition) readily transferred the rule to a qualitative version of the task, in which they received two units of differently preferred foods (Quality condition) [28]. By contrast, none of the capuchins who solved the task in the HSDT-condition transferred the reverse–reward rule to the FOOD-condition. Likewise, both chimpanzees that succeeded with Arabic numerals [27] and the capuchins that solved a qualitative version of the reverse–reward contingency task [29] regressed to a below-chance performance when presented with two different food quantities. Thus, both the symbolic versions and the qualitative versions of the reverse–reward contingency task are easier than its quantitative version [3,28,29]. Again, the perceptual features of food arrays in the quantitative version might make it difficult to prevent the impulsive response towards the larger quantity [3]. Nevertheless, 3-year-old children who succeeded in a symbolic version of the reverse–reward contingency task were still successful when presented with different food quantities [19], thus showing better inhibition skills than non-human primates.

In conclusion, although capuchin monkeys are quite distantly related to humans, they can take advantage of the use of symbolic stimuli. The impact of this capacity on the daily life of wild capuchins is hard to envisage, but its emergence might well be a by-product of their sophisticated cognitive skills [7,8].

Acknowledgements

All procedures complied with protocols approved by the Italian Health Ministry (license no. 63/2007-C) and were performed in full accordance with the European law on humane care and use of laboratory animals.

We thank Elisabetta Visalberghi for thoughtful discussion and useful suggestions; Luigi Baciadonna, Sabrina Bechtel and Valentina Focaroli for data collection; Valentina Truppa and Francesco Natale for statistical advice; and Michael Beran, Camillo Padoa-Schioppa and Fabio Paglieri for valuable comments. We also thank the Fondazione Bioparco, our keepers Massimiliano Bianchi and Simone Catarinacci and our technician Luigi Fidanza. Funded by EU FP6 NEST Programme, ANALOGY (no. 029088).

  • Received July 26, 2010.
  • Accepted September 2, 2010.
