High strength-of-ties and low mobility enable the evolution of third-party punishment

Patrick Roos, Michele Gelfand, Dana Nau, Ryan Carr


As punishment can be essential to cooperation and norm maintenance but costly to the punisher, many evolutionary game-theoretic studies have explored how direct punishment can evolve in populations. Compared to direct punishment, in which an agent acts to punish another for an interaction in which both parties were involved, the evolution of third-party punishment (3PP) is even more puzzling, because the punishing agent itself was not involved in the original interaction. Despite significant empirical studies of 3PP, little is known about the conditions under which it can evolve. We find that punishment reputation is not, by itself, sufficient for the evolution of 3PP. Drawing on research streams in sociology and psychology, we implement a structured population model and show that high strength-of-ties and low mobility are critical for the evolution of responsible 3PP. Only in such settings of high social-structural constraint are punishers able to induce self-interested agents toward cooperation, making responsible 3PP ultimately beneficial to individuals as well as the collective. Our results illuminate the conditions under which 3PP is evolutionarily adaptive in populations. Responsible 3PP can evolve and induce cooperation in cases where other mechanisms alone fail to do so.

1. Introduction

Punishment can be essential to cooperation and norm maintenance [1–12] but costly to the punisher. Hence, a considerable effort has been made to understand how costly punishment can emerge and be maintained in populations where behaviours or individuals are subjected to biological or socio-cultural evolutionary pressures. Many empirical and evolutionary game-theoretic studies have extensively considered the case of direct punishment, in which an agent punishes another for an interaction in which both parties were involved. Much recent empirical attention in both the human and animal world, however, has turned to the case of third-party punishment (3PP), in which an agent incurs a cost to punish another for an interaction that the agent itself was not involved in [13–16]. 3PP can arguably be more effective at norm maintenance than direct punishment, because a norm-violator might be punished by multiple other agents in a population of third-party punishers. Yet, while significant empirical evidence exists that humans [13,14] and non-human species [16] are indeed willing and act to punish in the role of an uninvolved third party, very little is known about the evolutionary conditions under which 3PP can evolve.

A natural approach to the puzzle of the evolution of 3PP is to draw on insights from the evolutionary game literature on direct punishment. Recent research showed for the first time how responsible direct punishment (punishment of non-cooperators only) can evolve even when allowing for the possibility of antisocial punishment (punishment of cooperators) [17]. The key to the evolution of responsible direct punishment in this work was the existence of punishment reputation. Accordingly, we asked: can punishment reputation also account for the evolution of responsible 3PP? Our results show that punishment reputation on its own cannot, leaving a puzzle concerning the conditions that lead to the evolution of 3PP.

To address this puzzle, we draw on classic sociological and psychological theory and theorize that social-structural constraints in human populations play a crucial role in enabling the evolution of responsible 3PP. The constraints we are interested in specifically are strength-of-ties [18] and mobility [19], which have been shown to have wide-ranging consequences for humans (see [20] for reviews). For centuries, humans lived in social-structural contexts characterized by strong social ties (where people interact in great frequency) and low mobility (where people are unable to exit or switch social groups with ease). Such strong social-structural constraints can be a consequence of kinship, which plays an important role in the evolution of cooperation and related behaviours [21,22]. It is precisely under these conditions that we anticipate 3PP to be adaptive, because punishers can more effectively induce self-interested agents to cooperate under such constraints. More specifically, only in contexts of high social-structural constraint can punishment reputation foster a culture that incentivizes self-interested agents to cooperate, and hence make responsible punishment beneficial to both the individual and the collective. By contrast, in social-structural contexts characterized by low strength-of-ties (i.e. where agents do not interact in great frequency) and/or high mobility (i.e. where they can exit the group with ease), any given individual's predisposition to punish misbehaviours will not have the same motivational force to sway self-interested agents towards cooperation, rendering responsible punishment ultimately costly to individuals and hindering the evolution of such 3PP.

To evaluate our hypothesis that these constraints are critical for the evolution of responsible 3PP, we implement variable notions of strength-of-ties and mobility in a structured population model. There exists a large evolutionary game literature exploring effects of population structure on evolutionary outcomes (see [23,24] for reviews), but structured population models on punishment have only considered direct punishment and not 3PP [4,25–29]. Our model results show that responsible 3PP can evolve and induce cooperation in structured populations with the help of punishment reputation, even when other mechanisms alone are unable to do so. However, high strength-of-ties and low mobility are critical for this process. When responsible 3PP evolves, it does so as an ultimately non-altruistic trait. The behaviour acts as a signal to potential co-players in the neighbourhood that non-cooperation will not be tolerated. High strength-of-ties and low mobility allow clustered agents engaging in responsible 3PP to induce cooperation in their neighbourhood. By inducing such local cooperation, clusters of 3PP agents increase their own pay-off and spread. This process leads to the emergence of responsible 3PP in the population as a whole. By contrast, low strength-of-ties and high mobility prevent clusters of 3PP agents from inducing a local culture of cooperation, and hence responsible 3PP does not evolve. To our knowledge, this work is also one of the first to illuminate the conditions under which 3PP evolves while allowing for non-responsible punishing strategies.

2. Material and methods

To study the evolution of 3PP in structured populations, we extend a recent evolutionary game model of direct punishment [17]. In their model, at each generation, agents interact in a game phase and then a punishment phase. In the game phase, agents are randomly paired to interact in a classic two-player cooperation game. In the cooperation game, agents can either cooperate, paying a cost c to bestow a benefit of cooperation b upon the other agent, or defect, not paying the cooperation cost c, but receiving any potential benefit from the other agent's action. In the punishment phase, agents get a chance to punish their interaction partner. The definition of a punishment action is the usual one in evolutionary game approaches: agents get an opportunity to punish other agents by an amount ρ at a cost of λ to the punishing agent, and it is generally assumed that ρ > λ.

As we are interested in 3PP in structured populations and not just direct punishment, we must make several changes to the above model. Instead of using a well-mixed population in which any individual may interact with any other individual, we structure the population on a graph where agents occupy nodes and edges represent social connections, i.e. the other agents with whom an agent can interact. Instead of pairing each agent with the same partner for both the game phase and the punishment phase, we pair each agent a with one of its neighbours during the game phase (except in rare circumstances that preclude this, see the electronic supplementary material for details), and then in the punishment phase a randomly chosen neighbour (who may or may not be the same as the first neighbour) receives a chance to punish a. This allows for the possibility of 3PP. Punishers in our model punish on behalf of others as well as themselves. (We also explore the case when traits for 3PP and direct punishment exist and (co)evolve separately, and obtain similar results, see the electronic supplementary material.) Each agent interacts once per generation and is punished at most once.

The strategy set available to agents in our model is a straightforward extension of the strategy set used in [17] to the case of 3PP. A complete strategy determining an agent's actions in this environment consists of a strategy for the cooperation game phase and one for the punishment phase. In the cooperation game phase, there are three possible strategies: (C)ooperate, (D)efect and an (O)pportunistic strategy, as described in table 1. Cooperators and Defectors simply always cooperate or defect, respectively. Opportunistic agents take the punishment reputation of neighbours into account when deciding to cooperate or defect in the cooperation game phase. We assume that punishment reputation always exists, i.e. agents know the punishing strategies of their neighbours. Opportunistic agents choose the action that gives them the higher expected pay-off given this information. In the punishment phase, there are four possible strategies that may condition the decision to punish or not on the action of the other agent in their cooperation game: agents can punish (R)esponsibly (only punish Defectors), (A)ntisocially (only punish Cooperators), (S)pitefully (punish indiscriminately), or they can be (N)on-punishing (punish no one), as listed in table 2.

Table 1.

Cooperation game phase strategies.

Table 2.

Punishment phase strategies.
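The strategy space described above (three game phase strategies combined with four punishment phase strategies) can be enumerated in a few lines. The following is a minimal sketch rather than the authors' implementation; the names `GAME_STRATEGIES`, `PUNISH_STRATEGIES`, `ALL_STRATEGIES` and `punishes` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Labels follow the text: game phase C/D/O, punishment phase R/A/S/N.
GAME_STRATEGIES = ("C", "D", "O")
PUNISH_STRATEGIES = ("R", "A", "S", "N")

# A complete strategy pairs one game phase rule with one punishment phase rule.
ALL_STRATEGIES = list(product(GAME_STRATEGIES, PUNISH_STRATEGIES))


def punishes(punish_strategy: str, target_cooperated: bool) -> bool:
    """Return True if a punisher with this strategy punishes the target.

    target_cooperated is the action the target took in its cooperation game.
    """
    if punish_strategy == "R":  # responsible: punish defection only
        return not target_cooperated
    if punish_strategy == "A":  # antisocial: punish cooperation only
        return target_cooperated
    if punish_strategy == "S":  # spiteful: punish indiscriminately
        return True
    if punish_strategy == "N":  # non-punishing: never punish
        return False
    raise ValueError(f"unknown punishment strategy: {punish_strategy!r}")
```

Under this encoding there are 3 × 4 = 12 complete strategies, and the population used to initialize the simulations (opportunistic non-punishers) corresponds to the pair `("O", "N")`.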

After the game and punishment phases, the population changes under a combination of pay-off-proportional imitation of neighbours and random exploration of strategies. Pay-off-proportional imitation can be viewed as a process of social learning. Each agent is assigned a random neighbour as a potential teacher, and then copies this neighbour's strategy with a probability that increases with how much higher the neighbour's pay-off is compared with the agent's own pay-off. Specifically, we use the Fermi rule (as in, e.g. [17,30–32]): the probability that an agent copies a potential teacher's strategy is $1/(1 + e^{-s(\pi_w - \pi_v)})$, where $\pi_v$ is the total pay-off of the agent, $\pi_w$ is the total pay-off of the potential teacher and s ≥ 0 is a general parameter determining the selection or imitation strength. We assume all agents update their strategy in such a way simultaneously. We also include a rate of random exploration of strategies [33], which is analogous to random mutation: with probability μ, instead of imitating, an agent chooses both a game phase strategy and a punishment phase strategy at random from the available strategies. Thus, random exploration is equally likely to result in any of the strategy types.
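The update step can be sketched as follows. This is an illustrative sketch assuming the standard form of the Fermi rule, 1/(1 + exp(−s(πw − πv))); the function names `fermi_copy_probability` and `updated_strategy`, and the argument layout, are hypothetical and not taken from the authors' code.

```python
import math
import random


def fermi_copy_probability(pi_v: float, pi_w: float, s: float) -> float:
    """Probability that an agent with pay-off pi_v copies a teacher with pay-off pi_w.

    Assumes the standard Fermi form; s >= 0 is the imitation strength.
    """
    return 1.0 / (1.0 + math.exp(-s * (pi_w - pi_v)))


def updated_strategy(own, teacher, pi_v, pi_w, s, mu, strategies, rng):
    """Return one agent's strategy for the next generation.

    With probability mu the agent explores (uniform draw over all complete
    strategies); otherwise it imitates the teacher with the Fermi probability.
    """
    if rng.random() < mu:
        return rng.choice(strategies)
    if rng.random() < fermi_copy_probability(pi_v, pi_w, s):
        return teacher
    return own


# Example call with the parameter values reported in the figure captions.
rng = random.Random(0)
result = updated_strategy("own", "teacher", 1.0, 3.0, 0.5, 0.01,
                          ["own", "teacher", "other"], rng)
```

With equal pay-offs the copy probability is exactly 1/2, and with s = 0 imitation is independent of pay-off. To reproduce the simultaneous updating described above, all new strategies would be computed from the current generation before any is written back.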

With a population of agents interacting and updating their strategies as described, we can observe the evolutionary dynamics of behaviours (i.e. strategies) under different conditions. The main results of this paper are based on varying the social-structural constraints of strength-of-ties and mobility (described below) in our models to examine their effect on the evolution of 3PP. Using a structured population model is necessary to implement these social-structural constraints. Thus, as mentioned above, we implement population structure by placing agents on a graph, following the large literature on spatially structured evolutionary games [4,23,25,27–29,34], and pairings for the game interaction and punishment opportunities can only occur between agents that are connected on the graph. A complete graph, where all agents are connected to all others, is the equivalent of a well-mixed or non-structured population, as used in [17].

(a) Strength-of-ties

In his classic sociological work, Granovetter [18] measured tie strength between two humans in terms of how often they interacted with each other during a period of time. As our model assumes that in general each agent has an equal number of interactions in a given time period, this means that in a given time period or generation, an agent with few connections has a relatively high number of interactions with its few neighbours, while an agent with many connections has a relatively low number of interactions with a greater variety of agents. Thus, by Granovetter's definition, the former agent has high strength-of-ties, whereas the latter has low strength-of-ties. The degree of a node is hence inversely related to the associated agent's average strength-of-ties. (Note that if agents were paired with all their neighbours for interaction in each generation, as is often done in the evolutionary game literature, the concept of strength-of-ties would be eliminated, as all agent pairs would have an equal number of interactions in any given time period.) We shall denote the average strength-of-ties in a population as 1/d, where d is the average node degree of the graph representing the population structure. As a complete graph has the highest possible average degree, a non-structured or well-mixed population of size n has the lowest strength-of-ties possible, approximately 1/n (exactly 1/(n − 1), as each agent then has n − 1 neighbours).

(b) Mobility

As a conceptual replication, we also explore the second form of social-structural constraint: residential mobility [19]. Residential mobility is the degree to which humans are able to change their location and, as a result, their position in the social network of a population. Some human populations, particularly those that are individualistic, have very high mobility where people can easily exit the group, whereas others, particularly collectivistic cultures, are much more dependent on others and are less able to easily exit the group [20,35–37]. In mobile populations, humans may change their location for a multitude of reasons. We implement a simple model of the concept of residential mobility using a probability m with which, at the beginning of each generation, an agent switches position with a randomly chosen other agent in the population.
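The mobility step can be sketched as a position swap on the list of node occupants. This is a minimal sketch under stated assumptions: nodes are visited in index order and swaps take effect immediately, which the text does not specify; `apply_mobility` and `occupants` are hypothetical names.

```python
import random


def apply_mobility(occupants, m, rng):
    """Swap agents between nodes in place and return the list.

    occupants[i] is the agent (or its strategy) currently at node i.
    For each node, with probability m its occupant trades places with the
    occupant of a different node chosen uniformly at random. Sequential
    processing is an assumption of this sketch.
    """
    n = len(occupants)
    if n < 2:
        return occupants
    for i in range(n):
        if rng.random() < m:
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # shift past i so the partner is always another node
            occupants[i], occupants[j] = occupants[j], occupants[i]
    return occupants


rng = random.Random(42)
population = [("O", "N")] * 6 + [("O", "R")] * 2
shuffled = apply_mobility(list(population), m=0.5, rng=rng)
```

Because only positions change, the strategy composition of the population is untouched; what mobility alters is who is adjacent to whom, which is exactly what disrupts local clusters of responsible punishers.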

3. Results

Our model results show that the evolution of responsible 3PP critically depends on conditions of high social-structural constraint, i.e. high average strength-of-ties and low mobility. Figure 1 plots the average long-term proportion of responsible punishers in the population under (i) varying strength-of-ties and (ii) varying mobility. We vary strength-of-ties by structuring the population on graphs of different average node degree. In order to use population structures with realistic social-network characteristics, we used Watts–Strogatz small-world networks [38]: each agent is connected to its d nearest neighbours on a ring, and then each edge (holding one end fixed) is reattached to a random node with probability 0.1, giving average strength-of-ties 1/d. The degree of mobility is varied through our mobility parameter m. Populations were initialized with all opportunistic defectors and non-punishers (as in [17]). We can observe that conditions of high social-structural constraint, i.e. high average strength-of-ties and low mobility, enable the evolution of responsible 3PP. The higher the strength-of-ties and the lower the mobility (m), the easier it is for responsible punishment to evolve and be sustained at a high proportion of the population. The benefit of cooperation b quantifies the effectiveness of cooperation: the lower b, the more difficult it is for cooperation and responsible punishment to evolve. The rate of cooperation (percentage of cooperative actions) throughout these simulations is virtually identical to the proportion of third-party punishers in the population, hence only this quantity is shown.
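The network construction described above can be sketched with the standard library alone. This is a hedged sketch of the usual Watts–Strogatz procedure, not the authors' code: `watts_strogatz` is a hypothetical helper, d is assumed to be even, and rewiring avoids self-loops and duplicate edges, details the text does not state.

```python
import random


def watts_strogatz(n, d, p, rng):
    """Return a list of neighbour sets for a small-world graph.

    Start from a ring where each node links to its d nearest neighbours
    (d/2 on each side), then rewire each edge with probability p, keeping
    one end fixed and choosing a new, not yet connected, other end.
    """
    if d % 2 != 0 or not 0 < d < n:
        raise ValueError("d must be even with 0 < d < n")
    adj = [set() for _ in range(n)]
    for v in range(n):
        for k in range(1, d // 2 + 1):
            w = (v + k) % n
            adj[v].add(w)
            adj[w].add(v)
    for k in range(1, d // 2 + 1):
        for v in range(n):
            w = (v + k) % n
            if w in adj[v] and rng.random() < p:
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    u = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(u)
                    adj[u].add(v)
    return adj


graph = watts_strogatz(n=1000, d=4, p=0.1, rng=random.Random(0))
average_degree = sum(len(nbrs) for nbrs in graph) / len(graph)
strength_of_ties = 1 / average_degree
```

Rewiring moves edges without creating or destroying them, so the average degree stays at d and the average strength-of-ties stays at 1/d, although individual node degrees vary after rewiring.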

Figure 1.

Surface plot of long-term average population proportions of responsible 3PP under varying constraint conditions. The z-axis (height) shows the long-term average proportion of responsible 3PP in populations under varying b and average tie strength (a) or mobility rate (b). Populations initialized with all opportunistic defectors and non-punishers (as in [17]). Populations are structured on Watts–Strogatz networks. Higher locations (lighter colours) mean higher population proportion. Cooperation rates (not shown) are virtually equivalent to the proportion of responsible 3PP. Long-run average proportions were obtained by averaging 100 simulation runs over 5000 generations for populations of 1000 agents with model parameters: c = λ = 1, ρ = 3, μ = 0.01 and s = 0.5. For (a) m = 0 and for (b) d = 4.

Figure 2 shows representative evolutionary trajectories for single simulation runs under high strength-of-ties sufficient for the evolution of 3PP (figure 2a) and under low strength-of-ties not sufficient (figure 2b). Under high strength-of-ties, responsible 3PP quickly invades the population and remains the prominent punishment strategy, while under low strength-of-ties, non-punishers and even antisocial punishers comprise the prominent punishment strategies. Again, the percentage of cooperative actions in a population closely approximates the percentage of responsible punishers in the population at that time.

Figure 2.

Typical evolutionary trajectories for a single model simulation run under (a) high strength-of-ties that enable and (b) low strength-of-ties that prevent the evolution of responsible 3PP. For readability, the plots show the aggregated proportion of punishment phase (i) and cooperation game phase (ii) strategies over time separately. Panel (ii) also shows the average cooperation rate (percentage of cooperative actions) in black. Model parameters are b = 4, c = λ = 1, ρ = 3, µ = 0.01 and s = 0.5. Populations are 1000 agents and initialized with all opportunistic defectors and non-punishers. Populations are structured on Watts–Strogatz networks of d = 4 (a) and d = 14 (b), giving average strength-of-ties 1/4 and 1/14, respectively.

As an example of how responsible 3PP can induce cooperation and proliferate, see the illustration in figure 3, a small part of a network under different configurations showing how responsible (R) 3PP affects local pay-offs. A lone R punisher, as shown in the topmost configuration, is not enough to induce cooperation and actually suffers a relative pay-off loss compared with its neighbours. However, if the R punisher is joined by another R punisher (e.g. see middle configuration) in the neighbourhood, together they can induce cooperation, gain a large relative pay-off advantage, and hence be likely to spread. If the pay-off advantage allows the R punishers in the neighbourhood to increase in number, the relative pay-off advantage can become even greater (e.g. see bottommost panel). Put simply, an R punisher increases the likelihood that nearby R punishers will be able to induce cooperation in their co-players. Hence, the agent promotes the existence of other local R punishers, which in turn encourages local cooperation further. Through this, responsible 3PP and cooperation can spread throughout the population as a whole.

Figure 3.

Examples of a small part of a network under different configurations of punishers, showing how the existence of responsible 3PP agents affects local pay-offs and cooperation. All nodes are assumed to be opportunistic and the node labels designate the punishment phase strategy: R, responsible; N, non-punishing. Non-labelled nodes are assumed to be non-punishing. Blue nodes choose to cooperate based on the punishment reputation of neighbours, red nodes defect. Expected pay-off calculations are shown next to the nodes. For example, in the middle configuration, the topmost right agent defects because $P(R) - P(A) = 1/4 < c/\rho = 1/3$ (equation (3.1)). The agent has an expected pay-off of $(2/4)b - (1/4)\rho - (2/4)\lambda = 0.75$, because it has a 2/4 chance of being paired with a cooperating agent in the game phase, giving b; and in the punishment phase it has a 1/4 chance of being paired with an R agent who will punish it by ρ and a 2/4 chance of being paired with an agent who defected, in which case it will punish at a cost λ. Relevant model parameters are b = 4, c = λ = 1, ρ = 3.

Responsible 3PP does not evolve in well-mixed populations, which have the lowest possible strength-of-ties. Thus, the availability of punishment reputation and the ability of opportunistic agents to take this information into account in their cooperation game decision are not sufficient for the evolution of responsible 3PP. Punishing responsibly as a third party is only ultimately beneficial to agents, and hence can spread, if there exist enough other responsible punishers to induce cooperation in potential co-players. If there are not enough other responsible punishers, punishing responsibly is a wasteful and ultimately costly act to the punisher; non-punishers would have a pay-off advantage and quickly begin invading the population (the second-order free-rider problem). In an unstructured or well-mixed population, as we elaborate below, arriving at a state where there exist sufficient responsible punishers is extremely difficult to achieve from a population of non-punishers.

The fact that high social-structural constraint enables the evolution of responsible 3PP is a direct consequence of these constraints enabling responsible punishers to encourage self-interested opportunistic agents towards cooperation. Recall that with punishment reputation, opportunistic agents cooperate or defect depending on which action they expect to result in the better outcome (i.e. pay-off) for themselves. Thus, an opportunistic agent decides to cooperate rather than defect when

\[ P(R) - P(A) > \frac{c}{\rho} \tag{3.1} \]

(derived in detail in the electronic supplementary material), where P(R) is the likelihood that the agent will be paired for punishment with a neighbour that will punish the agent responsibly, and P(A) is the likelihood that the agent will be paired with a neighbour that will punish the agent antisocially. In a well-mixed population, P(R) and P(A) amount to the proportion of responsible punishers $x_R$ and the proportion of antisocial punishers $x_A$ in the population, respectively. Thus, with c = 1 and ρ = 3, an opportunistic agent would require $x_R - x_A > 1/3$. This means that, to induce cooperation in opportunistic agents, even with zero antisocial punishers, more than one-third of the entire population must be responsible punishers. In any sizable population, the likelihood that random exploration would lead to this ratio from a population of non-punishers is vanishingly small. High strength-of-ties, however, alleviates this problem.
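The cooperation threshold can be recovered from a short expected pay-off comparison. The following is a sketch under stated assumptions, not the derivation given in the electronic supplementary material: P(S) is a symbol introduced here for the likelihood of being paired for punishment with a spiteful neighbour, and K collects every pay-off term that does not depend on the agent's own game action (the benefit received from a co-player and any cost of punishing others).

```latex
\begin{align*}
E[\pi \mid \text{cooperate}] &= K - c - \rho\,\bigl(P(A) + P(S)\bigr), \\
E[\pi \mid \text{defect}]    &= K - \rho\,\bigl(P(R) + P(S)\bigr), \\
E[\pi \mid \text{cooperate}] > E[\pi \mid \text{defect}]
  &\iff \rho\,\bigl(P(R) - P(A)\bigr) > c
   \iff P(R) - P(A) > \frac{c}{\rho}.
\end{align*}
```

Spiteful punishers cancel because they punish either action. With c = 1 and ρ = 3 the right-hand side is 1/3, which matches the one-third threshold quoted above; in a well-mixed population of 1000 agents with no antisocial punishers, that corresponds to at least 334 responsible punishers.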

In a structured population, the quantities P(R) and P(A) in opportunistic agents' decision calculation depend on the punishment strategy of the neighbours. Specifically, for an agent v with neighbourhood N(v) and degree d(v), if $R_{N(v)}$ is the number of responsible punishers in N(v), then $P(R) = R_{N(v)}/d(v)$. Similarly, $P(A) = A_{N(v)}/d(v)$, thus $P(R) - P(A) = (R_{N(v)} - A_{N(v)})/d(v)$, and we have the following version of equation (3.1) for agents on structured populations:

\[ \frac{R_{N(v)} - A_{N(v)}}{d(v)} > \frac{c}{\rho} \tag{3.2} \]

The variable d(v) here, which represents the inverse of strength-of-ties, is crucial: a higher d(v) (lower strength-of-ties) means a lower probability of interacting with any given agent in the neighbourhood. As equation (3.2) must hold for an opportunistic agent to cooperate, the higher d(v), the more responsible punishers must exist in order for self-interested, opportunistic agents to be induced towards cooperation. The lower d(v) however, the fewer responsible punishers are needed. As punishing responsibly is only ultimately beneficial to the punishing individual if there are enough other similar punishers in the neighbourhood, the lower d(v), the more favourable the conditions are for the evolution of responsible 3PP.
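The dependence on d(v) can be made concrete by computing the smallest number of responsible neighbours that satisfies the threshold condition for a given degree. This is an illustrative sketch assuming the condition takes the form ρ(R − A)/d > c, which is consistent with the one-third threshold stated for c = 1 and ρ = 3; the function names are hypothetical.

```python
def opportunist_cooperates(n_responsible, n_antisocial, degree, c, rho):
    """True if rho * (R - A) / d > c, written without division."""
    return rho * (n_responsible - n_antisocial) > c * degree


def min_responsible_neighbours(degree, c, rho, n_antisocial=0):
    """Smallest number of responsible neighbours that induces cooperation.

    Returns None if the neighbourhood is too small to meet the threshold
    given the number of antisocial neighbours.
    """
    for r in range(degree - n_antisocial + 1):
        if opportunist_cooperates(r, n_antisocial, degree, c, rho):
            return r
    return None


# Parameter values from the figure captions: c = 1, rho = 3.
print(min_responsible_neighbours(4, 1, 3))    # prints 2
print(min_responsible_neighbours(14, 1, 3))   # prints 5
print(min_responsible_neighbours(999, 1, 3))  # prints 334
```

With d = 4 two responsible neighbours suffice, which agrees with the middle configuration of figure 3, whereas d = 14 requires five. The last call treats a well-mixed population of 1000 agents as a complete graph of degree 999.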

High mobility similarly hinders the evolution of responsible 3PP because, like low strength-of-ties, it renders the signalling of responsible punishers useless in promoting a sustained culture of cooperation in their neighbourhood. Inducing cooperation in opportunistic agents requires the symbiotic existence of several responsible punishers in a neighbourhood. When agents are highly mobile, it is difficult for punishers to maintain such localized coordination. Either the fellow responsible punishers that are needed frequently move away, or non-cooperative agents frequently replace neighbours that had been induced to cooperate. Similar to conditions of low strength-of-ties, high mobility ultimately renders the cost of punishing responsibly fruitless, preventing the evolution of responsible 3PP.

Finally, we have conducted several experiments to further test the robustness of our findings, results of which are provided in the electronic supplementary material. First, as population structure alone can aid cooperation under certain conditions, we also provide results for baseline experiments without 3PP in order to untangle the effects of 3PP from effects of population structure alone. Repeating our simulations with identical conditions but without the punishment phase shows that population structure alone does not account for the evolution of cooperation in the presented model. Even with high strength-of-ties and low mobility, cooperation does not emerge without 3PP (see the electronic supplementary material, figures S1 and S2). Hence, the existence of 3PP is pivotal in the emergence of cooperation and increases overall pay-off. Similarly, to unconfound the effects of 3PP and direct punishment, we have repeated these simulations with only direct punishers. Our results show that cooperation and responsible direct punishment cannot evolve alone in our model. This is because, unlike in the model of [17], our model does not guarantee agents a chance to punish directly. When this is the case, 3PP is critically necessary for the evolution of responsible punishment and cooperation (see the electronic supplementary material, figures S3 and S4). Lastly, we have explored the evolution of 3PP when a separate trait for direct punishment can co-evolve, see the electronic supplementary material, figures S6–S13. We find that while the existence of direct-only punishers decreases the overall prevalence of responsible 3PP, responsible 3PP remains necessary to induce cooperation. Hence, responsible 3PP still evolves and promotes a high level of cooperation in the population as a whole under conditions of high social-structural constraint.

4. Discussion

While the evolution of direct punishment has received considerable attention, the evolution of 3PP has not been well understood. Through a structured population model that implements variable degrees of social-structural constraint, we have found that high strength-of-ties and low mobility can provide a solution to the puzzle of the evolution of responsible 3PP. Responsible 3PP can evolve and induce cooperation with the help of punishment reputation when other mechanisms (e.g. population structure or direct punishment) alone fail to do so, but high strength-of-ties and low mobility are critical for this process.

In addition to focusing on 3PP, our model differs from related work on punishment in the literature [3,39] by allowing agents to punish antisocially. Allowing for the existence of antisocial punishment is crucial based on recent developments in the punishment-related literature. Empirical evidence from across the globe has shown that humans do sometimes engage in antisocial punishment [40], and antisocial punishment has been shown to potentially have a destructive effect on the evolution of responsible punishment [25,41]. Additionally, unlike the model of [3], our model does not rely on a form of group selection argued to be independent of kinship, which is important because the role of such group selection in the evolution of behaviours is disputed [21,42]. Consequently, in contrast to the interpretation of altruistic punishment [3], our results suggest that 3PP evolves because it ultimately bestows an evolutionary benefit to the individual engaging in it in contexts of high social-structural constraint. This interpretation is more in line with the theoretical work of [39], where punishment is viewed as a form of ‘social investment’. Interestingly, in terms of empirical results, the fact that 3PP can evolve as a non-altruistic trait has only been shown in non-human species, where 3PP induces female cleaner fish to cooperate, and hence bestows a direct benefit to males [16]. Thus, an interesting avenue for future empirical work is to test whether the same holds in humans, considering the crucial context of the interplay between social-structural constraint and reputation as illustrated in this paper.

Another unique aspect of our model is that agent interactions were sampled among neighbours in a structured population. This enabled us to implement varying degrees of strength-of-ties, a classic concept in sociological research [18], and allowed us to examine the evolution of 3PP. By contrast, other work has focused exclusively on direct punishment [4,9–12,17,25,29,41,43–45] or ignored population structure [9–12,17,43,45]. Furthermore, evolutionary game studies with population structure often assume that agents interact with all of their neighbours in each generation [25,29,46,47]. This effectively eliminates the concept of strength-of-ties, since all pairs of agents have an equal number of interactions in a given time frame. As the results in this research show, strength-of-ties can have a pivotal impact on evolutionary outcomes. We suspect that many other related studies may benefit from examining similar effects of varying strength-of-ties. Relatedly, the specific types of population structures used in the presented results are Watts–Strogatz small-world networks [38]. While the qualitative results that high social constraint facilitates the evolution of 3PP hold on other network structures (see the electronic supplementary material), it is possible that other network properties also play a crucial role in the evolutionary trajectory of behaviours. Examining this is an interesting area for future research.

We have also explored the implications of mobility, an important research stream in cultural psychology [19,20,35–37], on the evolution of 3PP. Compared with conditions of high mobility, conditions of low mobility probably increase kinship ties in populations, i.e. interacting individuals are likely to be reproductively related [21]. Kinship plays an important role in the evolution of cooperation [21,22] and is also key to the evolution of 3PP in our model. Future models would benefit from more detailed empirical studies on when and how agents are likely to move in social networks, and extensions to population structures in which connections change over time are probably necessary for more accurate models. We would also add that, much as our work has benefited from the insights of cultural psychology research, laboratory and field studies in cultural psychology would benefit from incorporating insights from evolutionary game theory in studies of conflict and punishment across cultures [48,49].

Our model is an extension of a recent model for the evolution of direct punishment [17] to the case of 3PP. In the direct punishment case, opportunistic agents could know the punishment reputation of their game interaction partner. To capture an equivalent notion of punishment reputation, we allowed opportunistic agents to know the punishment strategies of their neighbours with whom they may be paired for punishment. This raises the question of what happens if agents can only estimate this information. Our main results, that 3PP cannot evolve in unconstrained populations but high social-structural constraint makes it possible, hold even with the existence of significant noise on agents’ estimation of the punishment strategies of their neighbours (see the electronic supplementary material).

In summary, when punishment evolves in our model, this happens because punishing responsibly fosters a culture of cooperation in the agent's neighbourhood, by signalling that defection is not tolerated. Agents learn to punish responsibly because it ultimately provides them with an evolutionary benefit even if the action is immediately costly. This occurs because responsible 3PP induces a local ‘culture of cooperation’ that, under conditions of high social-structural constraint, is able to proliferate in the population as a whole. Thus, in line with recent theoretical work on direct punishment [17] and studies of 3PP in animals [16], responsible 3PP in our study evolves because it is ultimately beneficial to the individual engaging in it. When punishing responsibly cannot induce cooperative behaviour in an agent's neighbourhood, either owing to weak strength-of-ties or high mobility, responsible 3PP cannot evolve or be sustained in the population. Our results are hence also consistent with recent empirical data showing that human subjects do not exhibit 3PP when great care is taken to ensure that subjects are aware that interactions are completely anonymous [50]. 3PP can only emerge and persist in a context of punishment reputation, high strength-of-ties and low mobility.

Funding statement

This research was based on work supported in part by US Air Force grant no. FA95501210021, the US Army Research Laboratory and the US Army Research Office under grant nos. W911NF0810144 and W911NF1110344. The authors thank the anonymous reviewers for their helpful insights and suggestions and Plato D for facilitating this research.

  • Received October 10, 2013.
  • Accepted November 13, 2013.

