Understanding how societies resolve conflicts between individual and common interests remains one of the most fundamental issues across disciplines. The observation that humans readily incur costs to sanction uncooperative individuals without tangible individual benefits has attracted considerable attention as a proximate cause as to why cooperative behaviours might evolve. However, the proliferation of individually costly punishment has been difficult to explain. Several studies over the last decade employing experimental designs with isolated groups have found clear evidence that the costs of punishment often nullify the benefits of increased cooperation, rendering the strong human tendency to punish a thorny evolutionary puzzle. Here, we show that group competition enhances the effectiveness of punishment so that when groups are in direct competition, individuals belonging to a group with punishment opportunity prevail over individuals in a group without this opportunity. In addition to competitive superiority in between-group competition, punishment reduces within-group variation in success, creating circumstances that are highly favourable for the evolution of accompanying group-functional behaviours. We find that the individual willingness to engage in costly punishment increases with tightening competitive pressure between groups. Our results suggest the importance of intergroup conflict behind the emergence of costly punishment and human cooperation.
The ability of humans to uphold cooperative relationships among large numbers of unrelated partners is an evolutionary puzzle. Among numerous proposed solutions to the problem of cooperation, punishment of uncooperative individuals has attracted considerable attention as a proximate reason why cooperative behaviours might proliferate [2–5]. While abundant experimental evidence [6,7] and direct neurobiological measurements indicate that human readiness to incur costs to sanction uncooperative individuals is motivated by emotional mechanisms, the evolution of individually costly punishment has been difficult to explain. Theoretical research [9–11] suggests that the evolutionary origin of group-beneficial behavioural traits and traditions is embedded in intergroup conflict. Consequently, costly behaviours that increase cooperation in groups may proliferate at the expense of less cooperative groups and individuals through extinction and emulation. This process is often seen as a consequence of military, economic and other forms of intergroup rivalries.
Warlike activity recorded among prehistoric humans [12,13] and quasi-experimental preference measurements in modern conflict areas suggest that intergroup conflict has shaped human behaviour during the evolutionary history of the species. At the same time, anthropological annals [15,16], present-day field observations [17,18] and behavioural experiments among diverse human populations [19,20] all portray a picture of considerable variation in social organization and livelihood between human communities that explicitly differ in their willingness to sanction norm-violating behaviour. While a variety of different punishment mechanisms has been identified, sanctioning in many primitive societies without a judicial system as well as in groups managing common-pool resources is not coordinated by a central authority, but by individual group members or informal coalitions.
Since punishment is costly both for those who punish and typically even more so for those who are punished, punishment is expected to evolve only when the benefits of increased cooperation outweigh its costs. Several recent papers employing experimental set-ups in isolated groups have found clear evidence that while costly punishment increases the level of cooperation, the net effect in terms of material pay-off is often negative, decreasing the success of groups and individuals [23–26]. These results have cast doubt on the idea that costly punishment could evolve as a group-beneficial trait. Concurrently, arguments have been presented that the reduction in group and individual success owing to costs of punishment is likely to be overcome under longer time horizons and coordinated punishment activity. The disparity in the conclusions from experiments with isolated groups underscores the lack of direct evidence on the selective merit of costly punishment in intergroup interactions.
In light of the consistent and strong empirical evidence, there is no question whether humans readily sacrifice individual resources to discipline uncooperative individuals without immediate tangible benefits. The unresolved question, however, is why humans incur costs to discipline their fellows. In this study, we explicitly address the possibility that costly punishment could evolve as a group-functional trait in intergroup conflict. For this purpose, we conducted a series of public goods experiments where we systematically varied the competitive environment between groups and individual opportunities to punish fellow group members. Our experimental design excludes the effects of direct reciprocation, reputation scores, communication and other conceivable proximate mechanisms potentially supporting human cooperation in an effort to consistently focus on the importance of intergroup conflict in determining the selective benefit of costly punishment.
2. Material and methods
(a) Experimental design
To study the effectiveness of costly punishment in direct intergroup conflict, we conducted a series of public goods experiments with and without group competition. Most importantly, we varied whether individuals could engage in costly punishment towards fellow group members. Altogether, five different treatments were conducted (table 1).
In two control treatments, there was no group competition. In the no-punishment (NOPUN) treatment, participants played the linear public goods game without punishment opportunity, whereas in the punishment (PUN) treatment an equivalent public goods game was played with an opportunity to punish. These control treatments allow comparison with studies that have explored the effects of costly punishment in public goods games without incorporating between-group competition [5,23,24,27]. More importantly, by comparing the behaviour in group competition treatments with control treatments, we test the importance of group competition for the effectiveness of costly punishment.
In the asymmetric group competition treatment (PUN–NOPUN), participants of one group had the possibility to engage in costly punishment, while members of the rival group did not have this opportunity. Consequently, the comparison of net earnings of individuals in groups with contrasting punishment possibilities gives direct evidence on the benefits of costly punishment in intergroup competition, and shows whether costly punishment could evolve by influencing the success of individuals through direct intergroup competition.
In our symmetric group competition treatments (PUN–PUN and NOPUN–NOPUN), all members of the competing groups either had or did not have the opportunity to punish their fellow group members, respectively. The symmetric competition treatment without punishment (NOPUN–NOPUN) establishes an important benchmark to study if intergroup conflict is sufficient by itself to stabilize cooperation within strategically interdependent groups. The treatment with symmetric punishment opportunities (PUN–PUN) reveals the effects of punishment in a population where all competing groups have equal punishment opportunities.
In all treatments, eight participants played the game within their own group. The composition of groups stayed intact throughout the game. Participants' identities within a group were shuffled between periods. The game lasted for 30 identical periods. In the beginning of each period, participants received 20 monetary units (MUs) and simultaneously allocated these between group and personal accounts. The total amount allocated to group account was doubled by the experimenter and divided equally among all group members. In treatments with punishment, participants could then assign deduction points to their own group members after each period. Punishment was costly. Each deduction point cost the punisher 1 MU and reduced the earnings of the receiver by 3 MUs. We apply the 3 : 1 punishment ratio as well as the identity shuffling to facilitate the compatibility of our results with the pertinent literature [5,24,27].
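The pay-off rules of a single period can be sketched in a few lines of Python. This is an illustrative reconstruction from the description above, not the z-Tree code used in the experiment. The function and variable names are hypothetical, and two details are assumptions that the text does not state: pay-offs are allowed to fall below zero, and no participant assigns deduction points to themselves.

```python
ENDOWMENT = 20      # MUs received at the beginning of each period
MULTIPLIER = 2      # the group account is doubled by the experimenter
PUNISH_COST = 1     # MUs paid by the punisher per deduction point
PUNISH_IMPACT = 3   # MUs lost by the receiver per deduction point


def period_payoffs(contributions, points=None):
    """Net pay-offs of all group members for one period.

    contributions[i] -- MUs that member i allocates to the group account
    points[i][j]     -- deduction points member i assigns to member j
                        (None in treatments without punishment)
    Assumption: pay-offs are not floored at zero.
    """
    n = len(contributions)
    share = MULTIPLIER * sum(contributions) / n   # equal split of the doubled account
    payoffs = [ENDOWMENT - c + share for c in contributions]
    if points is not None:
        for i in range(n):
            for j in range(n):
                payoffs[i] -= PUNISH_COST * points[i][j]     # cost to the punisher
                payoffs[j] -= PUNISH_IMPACT * points[i][j]   # loss to the receiver
    return payoffs


# Seven full contributors and one free-rider in a group of eight.
contributions = [20] * 7 + [0]
print(period_payoffs(contributions))  # contributors earn 35.0, the free-rider 55.0

# Member 0 assigns two deduction points to the free-rider (member 7).
points = [[0] * 8 for _ in range(8)]
points[0][7] = 2
print(period_payoffs(contributions, points))  # member 0: 33.0, member 7: 49.0
```

With eight members and a doubled group account, each MU contributed returns only 2/8 = 0.25 MU to the contributor, which is why free-riding is individually profitable when there are no sanctions.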
In all treatments with group competition, the performances of competing groups were compared in each period after the public goods game was played within groups. The group with more MUs invested into group account won twice the difference in total investments. The group with lower investments lost an equivalent amount of MUs. Wins and losses from group competition were divided equally among the group members. Thus, the pay-off consequences of conflict were endogenously determined by the performance of the groups. This model of group competition introduces two important improvements to other existing experimental models of group competition [30,31]. First, as group competition does not involve an external prize, earnings can readily be compared between treatments with and without group competition. Second, the effect of group competition depends linearly on the difference between group performances. In other words, the more unequal the performances of the competing groups are, the more impact group competition has on individual earnings. When the group performances differ only slightly, group competition has only a minor effect on earnings. Finally, when the group performances are tied, group competition has no effect on earnings. In many scenarios relevant to the study of human behaviour, this structure can be seen as a more suitable way of modelling group competition than the probabilistic intergroup conflict used to model ‘winner-takes-it-all’ situations [32,33].
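The competition stage described above can be illustrated with a short Python sketch. The function name and signature are hypothetical and serve only to make the arithmetic concrete. The sketch assumes that the transfer is computed from the two groups' total group-account investments in the current period and split equally among the eight members of each group.

```python
def competition_transfer(total_a, total_b, group_size=8):
    """Per-member transfer from the group competition stage.

    total_a, total_b -- total MUs invested in the group account by groups A and B
    Returns (transfer to each member of A, transfer to each member of B).
    The winning group gains twice the difference in total investments and
    the losing group loses the same amount, so the two transfers sum to zero.
    """
    per_member = 2 * (total_a - total_b) / group_size
    return per_member, -per_member


# Group A invests 150 MUs in total and group B invests 110 MUs:
# the prize is 2 * 40 = 80 MUs, i.e. 10 MUs per member.
print(competition_transfer(150, 110))  # (10.0, -10.0)
```

Because the transfer is linear in the difference, a tie leaves earnings unchanged, and no MUs enter or leave the pair of competing groups as a whole.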
After the group competition stage, participants were informed about the contributions of their fellow group members and the total contribution made by the competing group. In groups with punishment opportunity, participants could then assign deduction points to their own group members using the same procedure as in treatments without competition. This means that the winner in the group competition was the group with the highest level of cooperation before subtracting the costs of assigned and received punishments from the individual payments. It appears natural to assume that during the course of human evolution the success in group conflicts has been primarily determined by the coordination and cooperation among the group members, whereas net wealth has not been easily transformable to fighting power before the emergence of industrial societies. Likewise, it is unlikely that individuals mete out punishments at the time when informal punishments directly jeopardize individual or group success. This intuition is further supported by our data showing that the actual outcome of the conflict affects the likelihood and severity of assigned punishments (see §3 below). Notably, however, when we make inferences about the selective benefits of costly punishment we always compare net earnings that account for the costs of assigned and received punishments. Overall, the design reflects the importance of cooperation in surviving periodic war and abrupt environmental crises in conditions likely to have been experienced by late Pleistocene and early Holocene humans. For a more detailed discussion pertaining to the group conflict model and punishment, see the electronic supplementary material.
(b) Experimental procedure
The experiment was conducted at the laboratory of the Max Planck Institute of Economics in Jena (Germany). In 12 different experimental sessions, a total number of 288 participants took part in the experiment. The vast majority of the 169 female and 119 male participants were undergraduate students studying a range of different disciplines. None of the participants had previous experience with social dilemma experiments. In all treatments, participants were informed about the individual contributions and corresponding earnings in their own group after the contribution stage. In addition, the total amount of contributions in the directly competing group was revealed to participants. Participants were not informed about the individual punishment decisions of other participants. In the asymmetric competition treatment (PUN–NOPUN), where punishing and non-punishing groups were compared, no information about the punishment opportunity was revealed to the group without punishment.
The total earnings of a participant equalled the sum of net pay-offs over all 30 periods in all treatments. One experimental session lasted on average 90 min. Earnings per participant ranged from €9 to €36 with an average of €20. The experiment was programmed and run using z-Tree. A full description of the experimental procedure including sample instructions is available in the electronic supplementary material.
3. Results and discussion
Examining the behaviour in isolated PUN and NOPUN groups, we find that the effect of costly punishment on cooperation was, on average, substantial but only marginally significant owing to large variation between groups with punishment opportunity (figure 1a, mean contributions in PUN 14.5 MUs and in NOPUN 7.7 MUs; Mann–Whitney U6,6 = 30, exact p = 0.065, two-tailed). Consequently, punishment did not significantly increase the net material pay-off that accounts for the cost of assigned and received punishment points (figure 1b, mean total net pay-offs in PUN 989 MUs and in NOPUN 830 MUs; Mann–Whitney U6,6 = 28, exact p = 0.132, two-tailed). A closer look at the distribution of net pay-offs reveals that the pay-offs vary widely among groups with punishment (figure 2b). The inconsistent effect of punishment in the absence of group competition is further illustrated by individual-level data showing that the highest individual pay-off was earned by an individual belonging to a group without punishment (compare figure 2a,b). These findings are consistent with several previous studies that have not found substantial benefits of punishment in isolated groups [4,23,24,36]. It is noteworthy that the games in our experiments lasted for 30 periods. Thus, the non-significant effect of punishment cannot be explained by short game duration. Our results from isolated groups using a larger group size than typical, at eight participants, suggest that the findings stressing the long-term benefits of punishment perhaps apply to small groups (group size of three), but do not readily extrapolate to larger groups.
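For readers who wish to check how an exact two-tailed p-value follows from a reported Mann–Whitney U with six groups per treatment, the following Python sketch enumerates the full null distribution of U. It is an illustration, not the statistical software used for the analysis, which is not named in the text. The function name is hypothetical, and the sketch assumes that there are no ties among the group-level observations.

```python
from itertools import combinations


def exact_mann_whitney_p(u, n1, n2):
    """Exact two-tailed p-value for an observed Mann-Whitney U.

    Assumes no ties, so that under the null hypothesis every assignment
    of ranks 1..(n1 + n2) to the first sample is equally likely.
    """
    counts = {}
    for ranks in combinations(range(1, n1 + n2 + 1), n1):
        # U for the first sample: rank sum minus its minimum possible value.
        u_val = sum(ranks) - n1 * (n1 + 1) // 2
        counts[u_val] = counts.get(u_val, 0) + 1
    total = sum(counts.values())          # C(12, 6) = 924 for n1 = n2 = 6
    mean = n1 * n2 / 2                    # U is symmetric around this value
    distance = abs(u - mean)
    extreme = sum(c for v, c in counts.items() if abs(v - mean) >= distance)
    return extreme / total


print(round(exact_mann_whitney_p(30, 6, 6), 3))  # 0.065
print(round(exact_mann_whitney_p(28, 6, 6), 3))  # 0.132
```

With six observations per treatment there are only 924 equally likely rank assignments, so full enumeration is cheap and no normal approximation is needed.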
In the asymmetric group competition treatment (PUN–NOPUN), the possibility of punishment had a dramatic effect on cooperation. In groups with punishment opportunity, contributions to the group account quickly rose and levelled off close to the maximum investment, significantly exceeding the contributions of non-punishing groups (figure 1c, groups with punishment 19.3 MUs, groups without punishment 13.6 MUs, Wilcoxon signed-rank test for six matched observations, t = −21, exact p = 0.031, two-tailed). The effect of punishment on net pay-offs was even more pronounced (figure 1d, groups with punishment 1485 MUs, groups without punishment 586 MUs, Wilcoxon signed-rank test for six matched observations, t = −21, exact p = 0.031, two-tailed). Importantly, even the lowest earning individual in groups with punishment opportunity earned more than the highest earning individual in groups without punishment (figure 2c). The data unequivocally reveal that in an asymmetric group conflict where groups with and without punishment are in direct competition, punishment opportunity benefits both the group and the individual.
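The exact p-value of the signed-rank test with six matched pairs can be reproduced by enumerating all sign patterns, as in the Python sketch below. The function name is hypothetical, and the sketch assumes no tied or zero differences. It is offered as a check on the arithmetic rather than as the procedure actually used for the analysis.

```python
from itertools import product


def exact_signed_rank_p(w_plus, n):
    """Exact two-tailed p-value for a Wilcoxon signed-rank statistic.

    w_plus -- observed sum of the ranks of the positive differences
    n      -- number of matched pairs (no ties or zero differences assumed)
    Under the null hypothesis each of the 2**n sign patterns is equally likely.
    """
    mean = n * (n + 1) / 4                # expected rank sum under the null
    distance = abs(w_plus - mean)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(rank for rank, s in zip(range(1, n + 1), signs) if s)
        if abs(w - mean) >= distance:
            extreme += 1
    return extreme / 2 ** n


# All six differences point in the same direction: rank sum 1 + 2 + ... + 6 = 21.
print(exact_signed_rank_p(21, 6))  # 0.03125
```

With six pairs, 2/64 = 0.03125 is the smallest two-tailed p-value this test can produce, so the reported value of 0.031 corresponds to the punishing group outperforming its rival in every one of the six matched pairs.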
We provide further evidence on the effects of punishment in intergroup conflicts and test if intergroup competition alone suffices to maintain cooperation by examining the behaviour in symmetric conflicts where both of the groups either had or did not have the opportunity to punish. While group competition alone had a weak tendency to increase net pay-offs when compared with a situation without competition (mean total net pay-offs in NOPUN 830 MUs and in NOPUN–NOPUN 998 MUs; Mann–Whitney test assuming independence of observations: U6,6 = 29, exact p = 0.093, two-tailed), it was not sufficient to maintain stable cooperation (figure 1e). By contrast, in PUN–PUN, cooperation quickly stabilized near maximum contributions (figure 1e). Comparing the symmetric competition treatments reveals that mean contributions were higher in punishing groups than in groups without punishment (PUN–PUN 19.4 MUs and NOPUN–NOPUN 13.3 MUs, Mann–Whitney test assuming independence of observations: U6,6 = 36, exact p = 0.002, two-tailed). Punishment maintains a high level of cooperation even when both parties of the group conflict adopt the culture of peer-punishment. However, the comparison of symmetric competition treatments in terms of net pay-offs reveals that the pay-off superiority of punishing groups becomes effective only with a longer time horizon and remains modest, even though growing, over time (figure 1f, mean total net pay-off in PUN–PUN 1113 MUs and in NOPUN–NOPUN 998 MUs; Mann–Whitney test assuming independence of observations: U6,6 = 30, exact p = 0.065, two-tailed). The narrow benefit of punishment over non-punishment in symmetric competition treatments may lead one to (erroneously) conclude that the disposition to punish may not need to proliferate in the long run, as tribes composed of several interacting groups with a practice to sanction uncooperative individuals do not significantly outperform tribes without punishment.
However, rivalrous interactions do not occur only within tribes, but also along the boundaries of their tribal territories. In conflict along the boundary, groups with punishment would prevail and punishment would thus inevitably spread to the entire population. Further, in a population consisting only of punishers, any group or tribe renouncing punishment would perish as evidenced by the major advantage for individuals in punishing groups in the asymmetric competition treatment (figures 1d and 2c). In sum, punishment appears to be very resistant to invasion in an environment characterized by group conflict.
An intriguing and robust finding is that the within-group variation in individual pay-offs is substantially smaller in groups with punishment than in groups without punishment (see the caption of figure 2). Clearly, the propensity to incur costs in order to sanction not only ensures a higher level of cooperation and net earnings in group conflicts, but also generates substantially greater equality within the group vis-à-vis a group without such opportunities. The result that punishment decreases within-group variation in success may have important ramifications for understanding the evolution of group-functional behaviours in humans. Selection favours group-beneficial but individually costly traits (the cost being relative to other members of the group) only when variation in success is low within and high among groups. Consequently, by repressing within-group differences in success, punishment attenuates selection operating against individually costly but group-beneficial traits. In other words, punishment can function as a form of reproductive levelling that is likely to change the selective environment so that it becomes more favourable to the evolution of behaviours that increase the success of the group relative to other groups. While the idea that the repression of within-group competition shifts competition (and selection) to between-group level is in general well developed in social evolution theory, it has thus far been largely overlooked in an effort to understand the origin and effects of costly punishment. Our results suggest a fundamental role for costly punishment in shaping the selective regime in which human social behaviour has evolved.
To explore the factors motivating observed punishment behaviour more closely, we constructed various regression models that account for the fact that both individuals and groups undergo repeated measurements and each conflict pair creates a cluster of related groups (table 2). All models control for individual demographic factors referred to as controls (age, gender and cultural background). Given the collected data and our regression-based statistical models, we do not find any consistent demographic differences among our participants with respect to cooperation, punishment or response to punishment (see electronic supplementary material, tables S2 and S3 for more detailed information and estimates). Table 2 indicates that free-riding (negative deviation from the average contribution) was a major factor explaining the number of received punishment points in all treatments. Unlike many previous studies [38–40], we found no evidence for antisocial punishment targeted towards cooperators (electronic supplementary material, figure S1). The group's average contribution proves to be significant only in competing groups (table 2: models 2, 3 and 4) where it is proportional to the cooperative effort in the rival group, indicating that the motivations to punish are qualitatively different between groups with and without competitive pressure. In all treatments, the number of received punishment points decreased as the game proceeded (see also figure 3). In competing groups, the outcome of group competition (amount of MUs transferred to/from each group member) affected participants' eagerness to impose a penalty on their peers so that the severity of defeat increased the harshness of individual punishments (model 4). Overall, the intensity of group conflict had a marked effect on the willingness to punish uncooperative group members (figure 3).
In PUN–PUN, where rivalry between groups was pronounced, the same amount of free-riding was punished more severely than in PUN–NOPUN where groups with punishment opportunity dominated in group competition (models 4 and 5 in table 2). The results suggest that the growing external threat to group and individual success increases readiness to sacrifice individual resources to disciplinary action. This result corresponds with the real-life observations of heightened sanctions adopted in times of group conflict, like the increased social sanctioning targeted towards surviving deserters.
Darwin suggested that competition between bands could select for individual traits such as courage and faithfulness, which benefit the group in conflict situations. The little direct historical evidence available on intergroup variation and patterns of extinctions over the evolution of human social behaviour stresses the prospect that lethal group conflict may have been frequent enough to allow the proliferation of individually costly, but group-beneficial traits [11,43]. Likewise, the importance of behavioural traditions for group cohesiveness is attested in many avenues of present-day social life. Intergroup rivalries are present at various organizational levels including war, competition for foreign direct investment, promotion tournaments in labour markets, and team sports.
The inclination to punish norm violators is a human universal, but accounting for its evolution is an evolutionary puzzle. Our experimental results from direct intergroup conflict demonstrate that costly punishment entails indisputable individual benefits in a population of competing groups with repeated intra- and intergroup interactions. Moreover, we find that the competitive pressure between conflicting groups evokes qualitative differences in the use of punishment. Given the robust result that costly punishment generates significantly more equal distributions of material pay-offs, it prepares the ground for the evolution of parallel group-functional traits and traditions. These results support the importance of intergroup competition in the emergence of costly punishment and human cooperation, stressing the prospect that parts of the human behavioural repertoire have evolved as group-functional traits through conflicts between human communities.
This paper has created an illustrative setting to shed light on the ultimate cause behind the widely observed costly punishment. At the same time, it prepares the ground for more comprehensive experimental investigations to identify various parallel evolutionary causes such as reputation, signalling and moral standards that may as well maintain cooperation. In fact, earlier research suggests that various proximate causes may efficiently interact to boost cooperation. Our aim has been to conduct an experiment as parsimonious as possible without neglecting any of the design principles important for the emergence of costly punishment. We demonstrate the principle, not the actual course of human history. The results might have been different, had we, for instance, allowed more direct behavioural means for retaliation [46,47]. Likewise, the detrimental habit of punishing cooperators in some human societies is shown to create a possible caveat to the coevolution of punishment and cooperation. Consequently, in future studies it would be worthwhile to examine the effect of between-group competition on the degree of antisocial punishment in societies where antisocial punishment is common.
The demonstrated success of costly punishment in situations where groups interact should not be understood as something that inevitably leads to behavioural adaptations, helping humans to establish a culture of cooperation. The characteristics of human evolution and socio-ecological trajectories are utterly complex phenomena where cooperative predisposition may concurrently co-evolve with destructive elements leading to ruinous rivalries [32,34]. A deeper understanding of these phenomena will help us to prevent tragedies.
The authors would like to thank Samuel Bowles, Werner Güth, Sebastian Krügel, Kari Nissinen, Elinor Ostrom, James Walker and Johannes Weisser for suggestions and discussions when preparing and revising the manuscript. The paper has benefited from comments made by the participants of the Experimental Reading Group at the Workshop in Political Theory and Policy Analysis during the autumn semester 2009 as well as the seminar participants at the Max Planck Institute of Economics, the ESA North-American Meeting 2009 and the IMEBE 2010 conference. Authors are indebted to Rico Löbel for his research assistance and to Abelheid Baker for her editing help. Financial support from the Max Planck Society and the Center of Excellence in Evolutionary Research, Academy of Finland, is gratefully acknowledged. The authors declare no conflict of interest.
- Received February 4, 2011.
- Accepted March 10, 2011.
- This Journal is © 2011 The Royal Society