For decades, attempts to understand cooperation between non-kin have generated substantial theoretical and empirical interest in the evolutionary mechanisms of reciprocal altruism. There is growing evidence that the cognitive limitations of animals can hinder direct and indirect reciprocity because the necessary mental capacity is costly. Here, we show that cooperation can evolve by generalized reciprocity (help anyone, if helped by someone) even in large groups, if individuals base their decision to cooperate on a state variable updated by the outcome of the last interaction with an anonymous partner. We demonstrate that this alternative mechanism emerges through small evolutionary steps under a wide range of conditions. Since this state-based generalized reciprocity works without advanced cognitive abilities, it may help to explain the evolution of complex social behaviour in a wide range of organisms.
1. Introduction
Cooperation by definition embodies an interaction between individuals that provides a benefit to the recipient but not necessarily to the donor. If cooperation is altruistic (costs temporarily surpass benefits for the donor), the temptation to cheat is high, because defection provides an immediate benefit [2,3]. Natural selection does not favour behaviour solely for the benefit of another individual; therefore cooperation can evolve through natural selection only if special mechanisms exist to prevent it from being exploited. Kin selection can facilitate cooperation between genetically related individuals through indirect fitness benefits gained by the actor who performs the altruistic behaviour. Among unrelated individuals, cooperation will be favoured by natural selection only if in the long run the actor gains direct fitness benefits. This can be achieved by means of reciprocity, where cooperation is conditional upon the previous cooperative behaviour of others; individuals preferentially aid those who have previously helped either them (direct reciprocity) or others (indirect reciprocity) [4,5]. Direct reciprocity explains cooperation between the same, repeatedly interacting individuals, while indirect reciprocity means that cooperation is based on knowledge about a social partner's behaviour towards others (i.e. it involves reputation).
The empirical evidence for the importance of direct reciprocity in non-human animals is controversial, while indirect reciprocity has so far only been documented in humans and in a multi-species mutualism between cleaner fish and their hosts. The reason behind this could be the cognitive complexity inherent in these mechanisms [6,9,10]: individuals have to recognize social partners and remember their previous behaviour towards themselves or others. This may be so difficult or costly that direct and indirect reciprocity are of minor importance for the social evolution of non-human animals.
A recent empirical study has shown, however, that cooperation in rats can work between unrelated conspecifics by a simple mechanism lacking higher cognitive demands: generalized reciprocity. This means that an individual who received help in the past is more likely to help any new partner subsequently. The identity of the partners is irrelevant, so the mechanism requires neither special cognitive abilities nor repeated interactions between the same partners. Generalized reciprocity has also been experimentally demonstrated in several studies of human behaviour [13–15] and it is probably a mechanism responsible for many altruistic services on the Internet (e.g. generation of encyclopaedic databases).
It is highly probable that the proximate mechanism of generalized reciprocity is based on changes of the individuals' physiological/neurological state [12,14,15]. Models of cooperation, however, usually neglect the explicit consideration of the physiological/neurological state of the organisms. Nevertheless, a growing body of empirical research recognizes that the individual's state, influenced, for example, by experience and hormone titres, can affect social and cooperative behaviour. Recently, the neurotransmitter serotonin was shown to trigger human social behaviour, just as oxytocin can mediate positive social interactions and cooperation in human and non-human animals [17,18]. Positive emotions, like gratitude, can increase the propensity to perform costly social behaviour: after a positive social experience (like receiving a cookie from someone), humans are more helpful and generous in anonymous cooperative tasks [14,15]. If the internal state can motivate individuals to cooperate even with unknown partners, then because of its simplicity, a mechanism of generalized reciprocity merely based on a change of the internal state contingent on the experience of received help would seem much more likely than the direct or indirect types of reciprocity, which require much more specific memory and cognitive ability and effort. Consequently, unravelling the origin and stability of generalized reciprocity as an evolved state-based mechanism could help to understand cooperation among humans and other animals.
The evolutionary processes underlying generalized reciprocity, however, are still far from clear. Previous theoretical models have shown that generalized reciprocity can help the evolution of cooperation under rather specific conditions: if groups in which individuals interact are very small (two to four individuals), if the individuals' contingent decisions to cooperate and to disperse evolve independently and concurrently, as a by-product enhancing other mechanisms of cooperation, or if behavioural tactics are somewhat assorted, e.g. by population viscosity. None of these models is explicitly based on internal state and, more generally, to our knowledge, no study has yet attempted to unravel the evolution of state-based mechanisms of cooperation.
2. The model
We use an evolutionary simulation to investigate the formation of a decision-making mechanism based on internal state, and the rise of cooperation via this mechanism in an initially entirely non-cooperative population. The population consists of N individuals. In each generation, the population is randomly divided into groups of M individuals for the duration of a game ([25,26]; groups are randomly re-formed at the beginning of each generation to avoid the effects of permanent assortment). In this state-based game, there are n pairwise interactions between group members. For an interaction, a pair of individuals is randomly chosen, one to be the actor, the other the receiver. The actor decides whether or not to help the receiver. In our game, individuals cooperate according to their internal state, the actual cooperativeness, Kact (0 ≤ Kact ≤ 1). As it is unrealistic to assume that the actor has perfect control over its own behaviour [27,28], we allow the decision to be probabilistic, so that increasing values of Kact steeply increase the probability of helping from zero to one around Kact = 0.5 (electronic supplementary material, figure S1; our results are robust against changes in the level of error, electronic supplementary material, figure S2). We assume that the mechanism producing social behaviour is at least partly determined by genes [16,29], hence Kact is set and updated according to three genetically determined traits: the initial cooperativeness, Kini (0 ≤ Kini ≤ 1), which specifies the individual's initial willingness to cooperate (i.e. Kact is set to Kini at the beginning of each generation), the increment, Kinc (0 ≤ Kinc ≤ 1), and decrement, Kdec (0 ≤ Kdec ≤ 1), of cooperativeness, which specify how the actual cooperativeness, Kact, changes after favourable (being helped) or unfavourable (not being helped) outcomes of an interaction (electronic supplementary material, figure S3). 
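The probabilistic decision rule described above can be sketched as follows. The exact curve is given only in the electronic supplementary material (figure S1), so the logistic form and the `steepness` parameter used here are assumptions, chosen merely to reproduce the stated behaviour: the probability of helping rises steeply from zero to one around Kact = 0.5.

```python
import math
import random


def help_probability(k_act: float, steepness: float = 40.0) -> float:
    """Probability that an actor with internal state k_act (0..1) helps.

    Assumption: a logistic curve centred on 0.5. `steepness` is a
    hypothetical parameter standing in for the level of decision error
    (larger values mean fewer errors).
    """
    # Bound the exponent so that very large steepness values cannot overflow.
    z = max(-700.0, min(700.0, -steepness * (k_act - 0.5)))
    return 1.0 / (1.0 + math.exp(z))


def decides_to_help(k_act: float, rng: random.Random,
                    steepness: float = 40.0) -> bool:
    """Draw one probabilistic helping decision for the given state."""
    return rng.random() < help_probability(k_act, steepness)
```

With this assumed curve, an individual whose Kact is well below 0.5 almost never helps and one whose Kact is well above 0.5 almost always helps; lowering `steepness` is one possible way to model a higher level of error.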
To avoid biased mutation at the boundaries of zero and one of the genetically determined traits, the actual values of the alleles range between −0.1 and 1.1. Values of less than zero (or larger than one) are remapped to zero (or one) when the state is updated. At the beginning of each simulation, the population is non-cooperative, i.e. Kini, Kinc and Kdec are all set to zero. If the actor helps, its fitness decreases by c, while the receiver's fitness increases by b (b > c > 0). If the receiver has been helped, its actual cooperativeness increases by Kinc (up to a maximum of Kact = 1). If the actor does not help, neither pair member's fitness is altered, but the receiver's actual cooperativeness decreases by Kdec (down to a minimum of Kact = 0). So the outcome of the interaction influences a receiver's helping behaviour in the next interaction. After finishing the game, the groups dissolve and all individuals in the population form a single mating pool. The probabilities of reproduction and survival are proportional to the fitness reached at the end of each generation. Individuals chosen for reproduction are paired at random. Pair members reproduce sexually with a recombination probability of 0.5. During reproduction, mutation can occur with a probability of 0.01. Mutation changes an allele by a random amount drawn from a normal distribution with zero mean and standard deviation 0.025; i.e. we consider the behavioural traits as being determined by many loci (additional computations show that our results are robust against changes in the parameters of mutation). At the end of the generation, 10 per cent of the population die and offspring replace those who die so that population size remains constant.
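A minimal sketch of one state-based game within a single group, together with the mutation step, may make the update rules concrete. It is not the authors' simulation code. The names `Individual`, `play_game`, `mutate` and `logistic_help_prob` are hypothetical, the logistic decision curve is an assumption (the actual curve is in the electronic supplementary material), and clipping mutated alleles to the range −0.1 to 1.1 is an assumed way of enforcing the stated allele range. Selection, recombination and replacement of 10 per cent of the population are omitted.

```python
import math
import random
from dataclasses import dataclass


def clamp01(x: float) -> float:
    """Remap values below zero (or above one) to zero (or one)."""
    return min(1.0, max(0.0, x))


def logistic_help_prob(k_act: float, steepness: float = 40.0) -> float:
    """Assumed decision curve: steep rise in helping around k_act = 0.5."""
    return 1.0 / (1.0 + math.exp(-steepness * (k_act - 0.5)))


@dataclass
class Individual:
    k_ini: float          # initial cooperativeness allele
    k_inc: float          # increment allele (applied after being helped)
    k_dec: float          # decrement allele (applied after not being helped)
    k_act: float = 0.0    # actual cooperativeness (internal state)
    payoff: float = 0.0   # accumulated fitness change


def play_game(group, n, b, c, rng, help_prob=logistic_help_prob):
    """Play n pairwise interactions in one group; return the number of helps."""
    for ind in group:
        ind.k_act = clamp01(ind.k_ini)  # state is reset to Kini at the start
    helps = 0
    for _ in range(n):
        actor, receiver = rng.sample(group, 2)  # random actor and receiver
        if rng.random() < help_prob(actor.k_act):
            actor.payoff -= c       # the actor pays the cost
            receiver.payoff += b    # the receiver gains the benefit
            receiver.k_act = clamp01(receiver.k_act + clamp01(receiver.k_inc))
            helps += 1
        else:
            # No pay-off change; only the receiver's state decreases.
            receiver.k_act = clamp01(receiver.k_act - clamp01(receiver.k_dec))
    return helps


def mutate(allele, rng, p_mut=0.01, sd=0.025):
    """With probability p_mut add Gaussian noise; keep the allele in [-0.1, 1.1]."""
    if rng.random() < p_mut:
        allele += rng.gauss(0.0, sd)
    return min(1.1, max(-0.1, allele))
```

A group can be built as, for example, `[Individual(0.0, 0.0, 0.0) for _ in range(M)]` for the initial non-cooperative population; passing a different `help_prob` function is a simple way to vary the assumed level of decision error.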
3. Results
The results of the simulations show that cooperation arises under a wide range of conditions (figure 1). Detailed analysis reveals the sequence of evolutionary events (figure 2). First, in a non-cooperative population it is optimal to avoid unconditional cooperation, i.e. to have low Kini. However, owing to the inevitable variation caused by mutation, unconditional cooperators (i.e. individuals with high Kini) appear even if the population consists largely of non-cooperators. In such a case, a relatively high Kdec is beneficial, because this prevents prolonged exploitation of cooperative individuals; they will stop helping immediately after experiencing defection (i.e. the probability of helping after defection is very close to zero; figure 2b). Hence, the population evolves to a state where the average initial cooperativeness, Kini, is small, but the decrement of cooperativeness, Kdec, is high enough to prevent exploitation (figure 2a). Under this condition, the chance that an individual initiates a cooperative interaction is very low, hence helping occurs rarely. Consequently, the effect of the increment of cooperativeness, Kinc, is negligible (it almost never takes effect), so Kinc is free to evolve. This means that, by random drift, a population can reach a state where the initial cooperativeness is low, while the value of the increment of cooperativeness is high enough (figure 2a) so that most members of the population are initially non-cooperators but can turn into cooperators by receiving helpful interactions (figure 2b). Under these conditions, as the simulations show, the initial cooperativeness, Kini, rises quickly in the population and so cooperation evolves rapidly (figure 2). Cooperation also evolves if a small proportion of unconditional defectors continuously arise in the population by back mutation (electronic supplementary material, figure S4).
After its evolution, cooperation persists even if a significant proportion of the cooperative population is replaced by unconditional defectors (electronic supplementary material, figure S5).
In our simulations, once most members of the population can turn into cooperators by receiving helpful interactions, the scene is set for cooperation to rapidly evolve. This phenomenon can be understood by considering a simplified formal model (see the analytical model in the electronic supplementary material). In this model at a given time each individual is in one of the two states: cooperator (C) or defector (D). A defector only becomes a cooperator if it receives help, while a cooperator only becomes a defector by not receiving help. Otherwise, the game proceeds as outlined above. We investigate the quantities $H_C$, $H_D$ (the mean numbers of times an initially cooperator/defector individual gives help) and $R_C$, $R_D$ (the mean numbers of times an initially cooperator/defector individual receives help) to see under what conditions an individual that is initially C in a group that is otherwise D gets enough help back to more than offset the cost of the initial act of helping. Provided that there is at least one individual in state C and one in state D at the start of the first interaction, we have equation (3.1) (see electronic supplementary material). This result establishes that an initial cooperator has only a transient disadvantage compared with defectors because it either stops cooperating after experiencing a defection or it starts a chain reaction resulting in a group where everybody helps. Now consider a group in which there is initially one individual in state C and $M - 1$ individuals in state D. Then $H_C < n/M^2 + 1$ and $R_C > n/M^2 - 1$ (see electronic supplementary material). As a corollary to this result we can investigate when selection favours a rare C. Since the fitness increment in all-D groups is zero, C is favoured when $bR_C - cH_C > 0$. By the above, a sufficient condition for this to hold is $(b - c)(n/M^2) > b + c$. Assuming that $b > c$, we have
$$n > M^2 \, \frac{b + c}{b - c}. \qquad (3.2)$$
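The simplified two-state model lends itself to a small Monte Carlo sketch. The code below is an illustration under stated assumptions rather than the analytical treatment of the supplementary material: states are deterministic (a C actor always helps, a D actor never does), and the function names `simulate_two_state_game` and `estimate_focal_means` are hypothetical. It estimates the mean numbers of helps given and received by a single initial cooperator in a group that is otherwise in state D.

```python
import random


def simulate_two_state_game(initial_states, n, rng):
    """Run n interactions; states are booleans with True = C and False = D.

    Returns two lists: helps given and helps received by each individual.
    """
    states = list(initial_states)
    m = len(states)
    given = [0] * m
    received = [0] * m
    for _ in range(n):
        actor, receiver = rng.sample(range(m), 2)
        if states[actor]:
            # A cooperator helps; the receiver becomes (or stays) C.
            given[actor] += 1
            received[receiver] += 1
            states[receiver] = True
        else:
            # A defector does not help; the receiver becomes (or stays) D.
            states[receiver] = False
    return given, received


def estimate_focal_means(m, n, runs, seed=0):
    """Estimate mean helps given and received by one initial C among m - 1 Ds."""
    rng = random.Random(seed)
    initial = [True] + [False] * (m - 1)
    total_given = 0
    total_received = 0
    for _ in range(runs):
        given, received = simulate_two_state_game(initial, n, rng)
        total_given += given[0]
        total_received += received[0]
    return total_given / runs, total_received / runs
```

For instance, `estimate_focal_means(m=5, n=200, runs=20000)` returns estimates that can be compared with the bounds n/M² + 1 and n/M² − 1 quoted above; the estimates carry sampling error, so small deviations near a bound should not be over-interpreted.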
This result shows that (given that b > c) selection will favour an initially C individual provided that the game has sufficiently many interactions. This critical number of interactions increases rapidly with group size, M, but its expected value per individual, n/M, increases only linearly with group size. Further calculation (see electronic supplementary material) shows that the population of conditional cooperators can resist the invasion by a rare unconditional defector mutant if the condition of equation (3.3) is fulfilled.
As the right-hand side of equation (3.2) is larger than the right-hand side of equation (3.3) (given that c > 0), the condition of equation (3.3) is always fulfilled if cooperation has already spread in the population.
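As a worked illustration of the sufficient condition (b − c)(n/M²) > b + c stated above, the following sketch computes the number of interactions above which a rare initial cooperator is favoured. The function names are hypothetical, and because the condition is sufficient rather than necessary, cooperation may also be favoured below this threshold.

```python
def critical_interactions(m: int, b: float, c: float) -> float:
    """Threshold on n implied by (b - c) * n / m**2 > b + c.

    Rearranged for b > c: n > m**2 * (b + c) / (b - c).
    """
    if not (b > c > 0):
        raise ValueError("the model assumes b > c > 0")
    return m * m * (b + c) / (b - c)


def rare_cooperator_favoured(n: int, m: int, b: float, c: float) -> bool:
    """True if the sufficient condition holds for n interactions in a group of m."""
    return (b - c) * n / (m * m) > b + c


# Example: group size 10, benefit 3, cost 1.
threshold = critical_interactions(10, 3.0, 1.0)  # 100 * 4 / 2 = 200.0
per_individual = threshold / 10                  # 20.0 interactions per group member
```

Doubling the group size to 20 with the same b and c quadruples the threshold on n (800) but only doubles the threshold per individual (40), which is the linear scaling of n/M noted in the text.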
Understanding the rise of cooperation elucidates the effects of group size, number of interactions and benefit-to-cost ratio (figure 1). Large group size and a low number of interactions reduce the probability that cooperative individuals will get help back in the future (figure 1a,b; equation (3.2)). High benefit-to-cost ratios decrease the disadvantage of individuals with high initial cooperativeness (figure 1c; equation (3.2)). The observation that cooperation still arises if the state-based game of n interactions is played more than once during a generation in a population remixed at the beginning of each game emphasizes that the establishment of cooperation does not depend on the competition between permanent groups (figure 1a; blue-dashed line). These results seem to be robust against changes in the size of the population (electronic supplementary material, figure S6).
4. Discussion
In contrast to direct and indirect reciprocity, generalized reciprocity can generate cooperation when advanced cognitive abilities do not exist or when they entail non-trivial costs. By assuming a very simple framework that requires only one internal state variable we have shown that cooperation can evolve under less-specific conditions, even in large groups of anonymous individuals. We have established how the necessary conditions for cooperation change with group size. We have also shown that the required decision-making mechanism, a suitable set of update rules for the internal state variable, can gradually evolve through simple steps.
The finding that the appearance of a strong negative response against non-cooperators (high Kdec) is the first crucial step towards the emergence of cooperation underlines the importance of the detection of cheating, which is apparently a widespread component of the maintenance of cooperation in human societies. The recent empirical evidence showing that the appearance of cheaters in populations of social amoebae can select for cheater resistance indicates that this mechanism can be important in other organisms as well. Cooperation in our simulations appears to be resistant against the invasion of unconditional non-cooperators. As the formal model shows, this resistance relies on the finding that once conditional cooperators have spread in the population, the pay-off of individuals in pure groups of conditional cooperators will always be higher than the pay-off of unconditional defectors in mixed groups, since conditional cooperators terminate the beneficial act of helping after being cheated.
In our model the tit-for-tat (TFT) strategy of the iterated prisoner's dilemma (IPD) game arises as a special case when group size is two. Consequently, our model can provide a possible scenario of how TFT evolves in a non-cooperative population. Several models have been proposed to generalize the IPD to N persons (e.g. [21,34–39]). These N-player iterated prisoner's dilemma (NIPD) games differ from our model in several respects. In many NIPD games, cooperation depends on the proportion of cooperative and defective individuals: players cooperate if at least a certain number of partners cooperated last time (e.g. [35,36]). In most of the NIPD games, benefit is received by all individuals, while cost is paid by the cooperators only, which poses a public goods game to the participants (e.g. [34,37,39]). In other versions of NIPD, players act on a lattice or are arranged along a ring, and a player's behaviour depends on the previous actions of its neighbours; if a neighbour has a higher pay-off, the player may adopt that neighbour's strategy (e.g. [21,38]). By contrast, in our model, interactions happen between two individuals, and in each interaction cost is paid only by the actor and benefit is received only by the receiver. Individuals do not need to know about the proportion of cooperative individuals in the group and they do not compare their strategies. Actors' behaviour depends only on the outcome of previous interactions experienced as a receiver with any partner from the group, i.e. we assume a cognitively less-demanding mechanism. So our state-based generalized reciprocity scenario in a group composed of more than two individuals seems to be a more natural generalization of a two-person TFT scenario.
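The correspondence with TFT for a group of two can be illustrated with a short sketch. It rests on simplifying assumptions that are not part of the simulations: decisions are deterministic (help if and only if Kact exceeds 0.5, i.e. no decision error) and the traits take the extreme values Kini = Kinc = Kdec = 1. The function names are hypothetical. Under these assumptions, the focal individual's next decision as actor always copies what its single partner last did to it.

```python
def clamp01(x: float) -> float:
    """Keep the internal state within [0, 1]."""
    return min(1.0, max(0.0, x))


def focal_decisions(received, k_ini=1.0, k_inc=1.0, k_dec=1.0):
    """Decisions the focal individual would take as actor.

    `received` lists the outcomes experienced as receiver (True = helped).
    The first decision is taken before any experience; each later decision
    follows one received outcome. Assumes a deterministic 0.5 threshold.
    """
    k_act = clamp01(k_ini)
    decisions = [k_act > 0.5]
    for helped in received:
        k_act = clamp01(k_act + k_inc) if helped else clamp01(k_act - k_dec)
        decisions.append(k_act > 0.5)
    return decisions


def tit_for_tat(partner_moves):
    """Classic TFT: cooperate first, then copy the partner's previous move."""
    return [True] + list(partner_moves)


partner = [True, False, False, True, True, False]
same_behaviour = focal_decisions(partner) == tit_for_tat(partner)
```

With intermediate trait values or a non-zero level of decision error the match is no longer exact, so the sketch illustrates the limiting case only.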
State-based generalized reciprocity may be biologically significant for the following two reasons. First of all, this mechanism is cognitively much less demanding than direct or indirect reciprocity. The cognitive capabilities (like memory and recognition) required by direct or indirect reciprocity seem to be costly, so it is plausible that generalized reciprocity is a mechanism allowing cooperation among animals that do not fulfil the requirements of more advanced types of reciprocity. Direct and indirect reciprocity might be more effective mechanisms in terms of supporting the evolution of cooperation, but it is difficult to compare their evolutionary plausibility to that of generalized reciprocity because current models usually neglect the cost of generating and maintaining the neural and behavioural mechanisms required for these more-demanding reciprocity mechanisms to work. The other reason, as Nowak & Roch have shown, is that generalized reciprocity can be important in stabilizing direct reciprocity by a synergistic effect, since generalized reciprocators help not only those who helped them, but also several more individuals. Accordingly, generalized reciprocity decreases the benefit-to-cost ratio needed for the emergence of cooperation by direct reciprocity. However, in this model, the cost of capabilities needed for direct reciprocity was also ignored.
To summarize, the different types of reciprocity seem to be advantageous under different conditions, but this does not mean that these mechanisms have to be mutually exclusive. Our state-based approach supports the idea that generalized reciprocity is an important mechanism among organisms without advanced cognitive capabilities or in situations where the acquisition of information about social partners is costly, since the mechanism requires only a state variable, which is updated by the outcome of the last interaction with an anonymous partner. Therefore, state-dependent generalized reciprocity provides a basis for the evolution of complex social structures in a wide range of taxa, including our own species. The spreading of altruism in extended organ donor chains among anonymous patients, for instance, illustrates the potential power of cooperation based on mental state in modern human society.
We thank T. Bereczkei, S. Van Doorn, L. Fromhage, A. Gardner, L.-A. Giraldeau, I. Hamilton, A. I. Houston, J. Marshall and an anonymous referee for comments on previous versions of this paper. The cooperation between the authors was supported by the EU FP6/INCORE. Computations were supported by the Hungarian Scientific Research Fund (OTKA grant K75 696 to Z.B.). M.T. was supported by the Swiss National Science Foundation. The work is partially supported by the TÁMOP 4.2.1./B-09/1/KONV-2010-0007 project. The project is implemented through the New Hungary Development Plan, co-financed by the European Social Fund and the European Regional Development Fund.
- Received July 30, 2010.
- Accepted September 1, 2010.
- © 2010 The Royal Society