An evolutionary model explaining the Neolithic transition from egalitarianism to leadership and despotism

Simon T. Powers, Laurent Lehmann


The Neolithic was marked by a transition from small and relatively egalitarian groups to much larger groups with increased stratification. But the dynamics of this transition remain poorly understood. It is hard to see how despotism can arise without coercion, yet coercion could not easily have occurred in an egalitarian setting. Using a quantitative model of evolution in a patch-structured population, we demonstrate that the interaction between demographic and ecological factors can overcome this conundrum. We model the coevolution of individual preferences for hierarchy alongside the degree of despotism of leaders, and the dispersal preferences of followers. We show that voluntary leadership without coercion can evolve in small groups, when leaders help to solve coordination problems related to resource production. An example is coordinating construction of an irrigation system. Our model predicts that the transition to larger despotic groups will then occur when: (i) surplus resources lead to demographic expansion of groups, removing the viability of an acephalous niche in the same area and so locking individuals into hierarchy; (ii) high dispersal costs limit followers' ability to escape a despot. Empirical evidence suggests that these conditions were probably met, for the first time, during the subsistence intensification of the Neolithic.

1. Introduction

Understanding how leadership and dominance behaviours in humans have changed over evolutionary time is relevant to both biology and the social sciences. What drove the transition from largely egalitarian hunter–gatherer groups, where leadership was facultative and dominance attenuated [1], to the hereditary and more despotic forms of leadership that arose during the Neolithic [2,3]?

On the one hand, ‘coercive’ (or ‘agency’) theories have focused on the development of inequality that was made possible with the origin of food storage and agriculture, allowing dominant individuals to build up resource surpluses that could be used to consolidate their power [4–6]. On the other hand, ‘functional’ (or ‘integrative’) theories have addressed the benefits that leaders provide to other group members. In particular, as human group size increased during the Neolithic [7,8], the resulting scalar stress [9] would have necessitated increased hierarchy in order to solve various coordination and collective action problems [10–16]. Leadership could have been favoured to solve problems, including the coordinated harvesting of marine resources [17–19], the construction of irrigation systems [20–23] and defensive warfare [24,25].

But when considered alone as competing theories, both coercive and functional models struggle to explain the transition to despotism seen during the Neolithic. Purely coercive theories cannot explain why individuals would initially choose to follow a despot [16,26]. Boehm [1] presents evidence suggesting that present-day hunter–gatherers actively form coalitions to suppress would-be dominants, and argues that prehistoric hunter–gatherers did likewise. Moreover, the advent of projectile weapons is likely to have made such coalitions particularly effective [27], tipping the balance of power away from an individual dominant. Thus, the question is, why would individuals not continue to prevent despotic behaviour? But if individuals are unconstrained in their choice of leader, then it is difficult to see how despotism could develop.

Several authors have argued that an adequate model of the origin of increased social stratification must incorporate both functional and coercive aspects [15,22,28]. There is evidence that aspiring leaders drove the development of technology that increased subsistence intensification and raised population carrying capacity [17,22]. For example, construction of irrigation systems would have allowed more land to be used for agriculture, providing an incentive for individuals to follow the leader. This fits with functional theories. On the other hand, the surplus resources that this provided could then be appropriated by leaders to further their own ends and consolidate their power. This is particularly the case given that irrigation farmers would be tied to the system, making dispersal away from a despot difficult. Spencer [22] developed a verbal model of this for the case of irrigation systems in prehispanic Mexico, and warfare in prehispanic Venezuela. However, the feedbacks between population size, functional aspects of leadership and the development of despotism remain poorly understood and are difficult to capture with verbal models.

Here, we present an evolutionary model of the dynamics of the transition from small-scale egalitarian to larger-scale hierarchical groups, which integrates both functional and coercive aspects of leadership. We use a demographically explicit model of a patch-structured population, in which surplus resources translate into increased reproductive output for those who receive them, as has been common throughout human history [29,30]. Unlike previous work, this allows us to capture the ecological and demographic interactions between subsistence intensification, dispersal costs and the evolution of despotic behaviour.

2. The model

(a) Life cycle and social traits

We consider a population that is subdivided into a finite number, Np, of patches, which are subjected to local stochastic demography (as per [31,32]). The life cycle consists of discrete and non-overlapping generations, as follows: (i) social interactions occur on each patch with its members possibly choosing a leader, who may affect local resource production; (ii) each individual on a patch has a Poisson-distributed number of offspring, with the mean determined by the outcome of social interactions and local resource abundance (defined explicitly below); (iii) adults of the previous generation perish; and (iv) individuals of the descendant generation may disperse, conditional on the result of the stage of social interactions. Dispersing individuals suffer a cost CD, such that individuals survive dispersal with probability 1 − CD, and then enter a patch taken at random from the population (excluding the natal patch).

Each individual in this population carries a cultural trait, h. This takes the value 0 or 1, and determines whether the carrier has a preference for hierarchy (h = 1) or acephalous (h = 0) social organization. In each generation and for each patch, one individual is chosen at random from the subset of individuals with a preference for hierarchy (h = 1) to act as the leader (this could be an individual with unusual characteristics such as strong organizational abilities). There are then up to three classes of individuals on a patch: (i) the individual chosen as the leader (l class); (ii) the remaining individuals with (h = 1) that act as followers (f class); and (iii) acephalous individuals (with h = 0) that choose not to have a leader (a class).
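The assignment of individuals to the l, f and a classes can be sketched as follows. This is a minimal illustration rather than the authors' simulation code: the function name `assign_classes` and the list-of-traits representation of a patch are assumptions made for the example.

```python
import random

def assign_classes(h_traits, rng):
    """Assign each individual on a patch to class 'l', 'f' or 'a'.

    h_traits: list of 0/1 hierarchy preferences, one per individual.
    One individual with h = 1 is drawn uniformly at random as leader;
    the other h = 1 individuals are followers; h = 0 individuals are acephalous.
    """
    hierarchical = [i for i, h in enumerate(h_traits) if h == 1]
    leader = rng.choice(hierarchical) if hierarchical else None
    classes = []
    for i, h in enumerate(h_traits):
        if h == 0:
            classes.append("a")
        elif i == leader:
            classes.append("l")
        else:
            classes.append("f")
    return classes
```

A patch with no h = 1 individuals simply has no leader, so all of its members fall in the a class.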

When in the role of a leader, an individual is assumed to express a culturally inherited trait, z, which represents the proportion of the surplus it generated that it keeps for itself. This is a continuous variable between 0 and 1. Offspring of the leader are assumed to remain philopatric, but offspring of followers or acephalous individuals may disperse. We denote by df the conditional dispersal strategy of the offspring of a follower. Specifically, df is the maximum proportion of the surplus that an individual will tolerate the leader of the parental generation taking, and is thus continuous between 0 and 1. This assumption accords with evidence from social psychology that individuals tend to disperse from groups with autocratic leaders [33]. Finally, da determines the unconditional dispersal probability of the offspring of an acephalous individual, which is independent of the outcome of social interactions. The assumption that the offspring of a leader remain philopatric is appropriate in this model, because by remaining philopatric, they increase the probability that one of their lineage will be chosen as leader on that patch in the next generation. Moreover, because offspring inherit the z trait of their parent, it is less biologically realistic that an individual would disperse based on how much of the surplus their parent took, when they themselves would take the same amount. We have, however, also investigated the effects of relaxing the assumption that the offspring of a leader must remain philopatric (electronic supplementary material, appendix S3).

Each individual carries all four cultural traits (h, z, df and da) which are all assumed to be transmitted vertically from parent to offspring [34] with independent probability 1 − μ. When a mutation occurs at trait h (probability μ), an offspring adopts the opposite trait. When a mutation occurs at the three remaining continuous traits, Gaussian mutation is performed by addition of a truncated Gaussian-distributed random variable centred around the current trait value, with variance 0.1.
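The transmission rule can be illustrated with a short sketch. It is not the authors' code. The function names are hypothetical, truncation is implemented by redrawing until the value falls in [0, 1] (one common reading of 'truncated Gaussian'), and the standard deviation is taken as the square root of the stated variance of 0.1.

```python
import math
import random

MU = 0.01               # mutation probability per trait (value given in the Results)
SIGMA = math.sqrt(0.1)  # standard deviation implied by a variance of 0.1

def mutate_binary(h, rng, mu=MU):
    """Flip the hierarchy preference h (0 or 1) with probability mu."""
    return 1 - h if rng.random() < mu else h

def mutate_continuous(x, rng, mu=MU, sigma=SIGMA):
    """With probability mu, redraw x from a Gaussian centred on x, kept within [0, 1]."""
    if rng.random() >= mu:
        return x
    while True:  # assumption: truncation by rejection sampling
        y = rng.gauss(x, sigma)
        if 0.0 <= y <= 1.0:
            return y

def inherit(parent, rng):
    """Return offspring traits (h, z, d_f, d_a); each trait mutates independently."""
    h, z, d_f, d_a = parent
    return (mutate_binary(h, rng),
            mutate_continuous(z, rng),
            mutate_continuous(d_f, rng),
            mutate_continuous(d_a, rng))
```

Drawing an independent mutation event for each trait matches the statement that each trait is transmitted with independent probability 1 − μ.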

Our model aims to capture qualitative behavioural trends. A more quantitatively accurate model would include individual and social learning of behavioural traits within a generation. For example, hierarchy preference could be a continuous trait updated by an individual's estimate of the likely pay-off from following a leader, and from copying the behaviour of more successful individuals. However, these processes would largely result in the same qualitative outcome as the vertical transmission with differential reproduction that we model, apart from the fact that they operate on a much shorter timescale.

(b) Reproduction

The mean number of offspring produced (of the Poisson distribution in stage two of the life cycle) by individuals within patches is assumed to follow a Beverton–Holt model, with two niches [31,35]. The two niches correspond to either having a leader (individuals of the l and f classes), which we refer to as the hierarchical niche H, or remaining acephalous (acephalous niche A, containing individuals of class a). The degree of competition between the niches is set by two parameters, αAH and αHA, which represent the per capita effects of individuals in the hierarchical niche on those in the acephalous niche, and vice versa, respectively. The total number of individuals in the hierarchical niche (leader plus followers) on patch j at time t is denoted by nHj(t), and the number of individuals in the acephalous niche by nAj(t).

According to these assumptions, we write the mean number of offspring produced, respectively, by a leader, a follower and an acephalous individual on patch j at time t as

[Embedded image: equation (2.1)]

The numerator in each expression can be thought of as the maximal birth rate of an individual in the corresponding class. For followers and acephalous individuals, this is given by a constant rb, whereas for the leader, this depends upon the outcome of surplus production, as defined below. The denominator in each expression can be thought of as the intensity of density-dependent competition. This depends on a time-dependent variable Kij(t), which is a proxy for the carrying capacity of niche i on patch j (maximum population size). The exact carrying capacity in the Beverton–Holt model is a function of all fitness parameters, but increases directly with the K variable for each niche. In the classical one niche deterministic case, K gives the carrying capacity when rb = 2, which is a value we use throughout. Hence, we refer (loosely) to K as the ‘carrying capacity’. Kij(t) is affected by surplus resource production (detailed below), which allows for local demographic expansions owing to social interactions [31,32].
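The remark that K equals the carrying capacity when rb = 2 can be checked numerically. The sketch below uses the textbook one-niche deterministic Beverton–Holt recursion, n(t + 1) = rb n(t) / (1 + n(t)/K), whose equilibrium is K(rb − 1). It is not the two-niche expression of equation (2.1), and the function names are chosen for this example only.

```python
def beverton_holt_step(n, r_b=2.0, K=20.0):
    """One generation of the one-niche deterministic Beverton-Holt recursion."""
    return r_b * n / (1.0 + n / K)

def iterate(n0, steps, r_b=2.0, K=20.0):
    """Apply the recursion repeatedly, starting from population size n0."""
    n = n0
    for _ in range(steps):
        n = beverton_holt_step(n, r_b, K)
    return n
```

Starting from any positive size, for example `iterate(5.0, 200)`, the population approaches K = 20. With rb = 3 the same recursion settles at 2K, which is why K is only loosely a carrying capacity in general.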

(c) Surplus production

In each patch, individuals take part in a social enterprise which may generate surplus resources for their niche. Individuals may also fail to produce this surplus, and to capture these two cases in a probabilistic way, we let [embedded image], where ϕτj(t) is the indicator random variable taking the value one if the surplus is produced in niche τ ∈ {A, H} on patch j at time t, and zero otherwise. Surplus production occurs with probability

[Embedded image: equation (2.2)]

where gτ is a parameter giving the gradient of how the probability of surplus generation changes with the number of individuals in the niche (‘social group size’). We assume that gτ is positive, such that the probability of success decreases with increasing social group size. This represents the effects of scalar stress. We further assume that gH < gA, such that the success probability declines at a slower rate with increasing group size in the presence of a leader, and that for a given group size, groups with a leader are more likely to generate the surplus.

(i) How surplus affects acephalous individuals

We relate surplus production to the ‘carrying capacity’ of acephalous individuals by assuming that

[Embedded image: equation (2.3)]

where Kb is the baseline capacity. If the surplus is generated, then this is increased by βk(1 − exp[−γknAj(t)]), which is a positive concave function of γknAj (entailing diminishing returns), where γk sets the gradient of the carrying capacity increase, and nAj is taken as the amount of surplus resource produced. Alternatively, the surplus can be thought of as proportional to population size, with conversion factor γk (this is assumed to hold for both niches). The parameter βk sets the maximum possible increase in carrying capacity. If the surplus is not successfully generated, the carrying capacity is then given by [embedded image], where [embedded image] is the surplus decay rate from one generation to the next, and KAj(0) = Kb (if [embedded image], there is some ecological inheritance of modified carrying capacity).
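The surplus-generated case described in the text can be sketched as a small function. The default values Kb = 20 and γk = 0.05 are those listed in the Results, and βk = 100 is the value used in figure 1. The function name is hypothetical, and the case in which the surplus is not generated is omitted because its expression is not available here.

```python
import math

def carrying_capacity_with_surplus(n, K_b=20.0, beta_k=100.0, gamma_k=0.05):
    """Carrying-capacity proxy K when the surplus is generated by n individuals.

    K_b is the baseline; the increase beta_k * (1 - exp(-gamma_k * n)) is
    concave in n and can never exceed beta_k.
    """
    return K_b + beta_k * (1.0 - math.exp(-gamma_k * n))
```

Under these defaults the first ten individuals raise K by more than the next ten do, which is the diminishing-returns property noted in the text.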

(ii) How surplus affects leaders and followers

For individuals in the hierarchical niche, the leader keeps a proportion of any surplus for itself, as given by the value of its z-trait. Letting zlj(t) denote the z-trait of the leader on patch j at time t, the carrying capacity of individuals in the hierarchical niche is given by an analogous expression to that of acephalous individuals (equation (2.3)), namely

[Embedded image: equation (2.4)]

where {1 − zlj(t)}nHj(t) is the amount of surplus used to increase the carrying capacity of the leader and its followers. The remainder zlj(t)nHj(t) of the surplus is retained by the leader and used to increase its own birth rate (which has occurred throughout human history [29,30]) as follows:

[Embedded image: equation (2.5)]

where γr gives the gradient of the increase in birth rate with respect to the absolute magnitude of the surplus that the leader takes. The parameter βr gives the maximal possible increase in the leader's birth rate. This represents the maximum degree of despotism that it is possible for a leader to exert. This will depend upon both ecological and social factors, and in particular, on the degree to which followers are able to resist coercion. Where followers have little power to resist the leader, then we would expect a large value of βr. Conversely, if followers are able to resist coercion to a large degree, for example by forming coalitions, then a smaller value of βr would be more plausible.
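The split of the surplus between the group and the leader can be illustrated as follows. This sketch covers only the case in which the surplus is generated. The capacity function follows the form the text gives for acephalous individuals, with (1 − z)nH in place of nA. The saturating form used for the leader's birth rate is an assumption made by analogy, chosen so that βr is the maximal increase and γr the gradient as described; it should not be read as equation (2.5) itself. Function names are hypothetical.

```python
import math

def hierarchical_capacity(z, n_H, K_b=20.0, beta_k=100.0, gamma_k=0.05):
    """Carrying-capacity proxy for the hierarchical niche when surplus is generated.

    Only the shared part (1 - z) * n_H of the surplus raises capacity.
    """
    shared = (1.0 - z) * n_H
    return K_b + beta_k * (1.0 - math.exp(-gamma_k * shared))

def leader_birth_rate(z, n_H, r_b=2.0, beta_r=5.0, gamma_r=0.1):
    """Leader's maximal birth rate (assumed saturating form, see lead-in).

    The retained part z * n_H raises the rate above r_b by at most beta_r.
    """
    kept = z * n_H
    return r_b + beta_r * (1.0 - math.exp(-gamma_r * kept))
```

Raising z moves the two quantities in opposite directions: the leader's birth rate rises while the capacity available to the whole hierarchical niche falls, which is the tension explored in the Results.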

(d) Conditional dispersal of followers

To close the model, it only remains to specify how offspring of followers disperse conditionally on leader behaviour (offspring of acephalous individuals disperse unconditionally, and the offspring of the leader remain philopatric). Denoting by df,ij(t) the dispersal preference of follower offspring i on patch j at time t, that offspring is assumed to disperse if zlj(t) > df,ij(t), that is, if the leader of its parent took more than its threshold value.
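The dispersal stage for a follower's offspring can be sketched in three steps: the threshold rule, survival with probability 1 − CD, and a uniformly random destination that excludes the natal patch. The function names and the integer indexing of patches are assumptions made for this example.

```python
import random

def follower_offspring_disperses(z_leader, d_f):
    """Disperse if the parent's leader kept more than the tolerated proportion d_f."""
    return z_leader > d_f

def survives_dispersal(rng, C_D):
    """A disperser survives with probability 1 - C_D."""
    return rng.random() < 1.0 - C_D

def choose_destination(rng, natal_patch, n_patches):
    """Pick a patch uniformly at random from all patches except the natal one."""
    other = rng.randrange(n_patches - 1)  # index among the n_patches - 1 other patches
    return other if other < natal_patch else other + 1
```

For example, with Np = 50 patches, `choose_destination(rng, 3, 50)` returns any index from 0 to 49 other than 3, each with equal probability.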

The model defines a stochastic process for the four evolving traits (h, z, da, df), the number of individuals in each niche (nA, nH), and their respective carrying capacities (KA, KH). Because of the nonlinearities of the model, which result from the interactions of all of these variables, we analyse it using individual-based simulations.

3. Results

We focus on the effect that the following demographic and ecological parameters have on the transition to despotism: (i) the effect that a leader has on surplus generation (gA relative to gH); (ii) the degree to which surplus resources produce demographic expansion (βk); and (iii) the cost of dispersal (CD). The other parameters used in the simulations, unless otherwise specified, are Kb = 20, rb = 2, γk = 0.05, γr = 0.1, gH = 0.01, αAH = αHA = 0.03, [embedded image], μ = 0.01, Np = 50.

(a) The voluntary creation of hierarchy through cultural evolution

Figure 1a,c illustrates that when leaders confer a large advantage in surplus generation (gA is large relative to gH), hierarchical individuals can invade a population of acephalous individuals. This is because for a given group size, hierarchical individuals are more likely to produce a surplus than acephalous individuals on their patch (equation (2.2)). Individuals that receive surplus resources then enjoy a fitness increase, mediated by a reduction in the intensity of density-dependent competition in their niche. Consequently, they produce more offspring than individuals that do not receive a surplus. In this way, when leaders increase the likelihood of surplus generation, and share some of this surplus with their followers, then hierarchical individuals can outcompete acephalous individuals.

Figure 1.

Illustration of ecological conditions under which either hierarchical (a–d) or acephalous (e–h) individuals are favoured by the coevolution of culturally transmitted behavioural traits with demography. When the presence of a leader confers a large advantage in surplus generation (gH much smaller than gA), then individuals with a preference for hierarchy can invade an acephalous population (a,c). Successful generation of the surplus then drives an increase in population size (b,d). The degree of despotism, measured by the amount of surplus the leader monopolizes for its own reproduction, increases with increasing dispersal cost (a,c). Conversely, if the presence of a leader does not confer a large advantage in surplus generation, then hierarchy fails to invade (e–h), and groups remain acephalous. Parameters: βr = 5, βk = 100.

Crucially, this can occur even when leaders evolve to retain a large proportion of the surplus for themselves (figure 1c). This is because even when leaders retain some of the surplus, followers can still each receive more extra resource than they would in acephalous groups, where the surplus would be generated less frequently. This demonstrates the voluntary creation of hierarchy, where individuals that accept inequality in their groups are better off than those that remain egalitarian. Whether or not this is the case depends upon the magnitude of the advantage that leaders confer in surplus generation.

Figure 1e,g illustrates the case where leaders do not provide much advantage in surplus generation. In this situation, acephalous individuals each receive, on average, a larger amount of surplus resources than followers of a leader. This is because acephalous groups are almost as likely to generate the surplus as hierarchical groups, but all of the surplus is shared among themselves rather than some being retained by a leader. Consequently, hierarchy is not favoured, and the unconditional dispersal probability trait of acephalous individuals, da, depends mainly on the dispersal cost and decreases as the cost increases (figure 1e,g; further discussion in the electronic supplementary material, appendix S1). We discuss the conditional dispersal trait of followers, and its coevolution with the proportion of surplus that leaders retain, below.

(b) The coevolution of group size and hierarchy

When individuals receive surplus resources, this leads to a reduction in competition for resources with other individuals on the patch in their niche. As a result, their niche can support a larger number of individuals (equations (2.3) and (2.4)), leading to an increase in group size. Figure 1b,d illustrates that when hierarchy invades, it drives an increase in group size. For example, in figure 1b, the population initially starts out fixed for acephalous individuals, who produce some surplus. This surplus drives an increase in their local number from the base value of 20, to around 40. But because of the problems of coordinating in large groups without a leader (represented by a large value of gA), they are unable to reliably generate the surplus in groups above this size. Thus, their group size stabilizes around this value. However, as hierarchy invades, group size increases up to 80 individuals. This is because the coordination advantages of having a leader (gH < gA) mean that hierarchical individuals are able to continue generating the surplus in larger groups.

The increase in group size is driven by a positive feedback loop in which surplus production increases carrying capacity, causing an increase in group size, which then in turn allows greater amounts of surplus to be generated. This positive feedback loop stops when either (i) groups are too large for additional surplus to be reliably generated (equation (2.2)), or (ii) diminishing returns in the value of the surplus mean that the extra surplus produced by one more individual is not enough to increase carrying capacity by at least one individual (equations (2.3) and (2.4)). When gH is smaller than gA, then the feedback loop can stop at a larger group size for hierarchical individuals than for acephalous individuals. Thus, the ability of leaders to solve coordination problems in larger groups, combined with the effects of surplus resources on demography, means that the invasion of hierarchy produces a transition to larger-scale social groups.
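Stopping condition (ii) of the feedback loop can be illustrated with a deliberately simplified deterministic sketch. It assumes that the surplus is always generated, ignores competition between niches, and uses the one-niche property that equilibrium group size equals K when rb = 2, so that group size is repeatedly set to the carrying capacity it produces. The leader's retained proportion z enters as in the description of the hierarchical niche. The function name is hypothetical, and the resulting numbers are not comparable with the stochastic simulation results.

```python
import math

def equilibrium_group_size(z=0.0, K_b=20.0, beta_k=100.0, gamma_k=0.05,
                           tol=1e-9, max_iter=10_000):
    """Iterate n -> K_b + beta_k * (1 - exp(-gamma_k * (1 - z) * n)) to a fixed point.

    The loop stops when one more round of surplus no longer changes the group
    size, i.e. when diminishing returns have exhausted the feedback.
    """
    n = K_b
    for _ in range(max_iter):
        n_next = K_b + beta_k * (1.0 - math.exp(-gamma_k * (1.0 - z) * n))
        if abs(n_next - n) < tol:
            return n_next
        n = n_next
    return n
```

In this sketch the fixed point always lies between Kb and Kb + βk, and it shrinks as z grows: a leader that keeps everything (z = 1) leaves the group at the baseline size.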

The transition to a larger group size is crucial to the stability of hierarchy. This is because acephalous individuals experience density-dependent competition with hierarchical individuals on their patch, and vice versa (equation (2.1)). So the larger the absolute number of hierarchical individuals, the more they suppress the fitness of acephalous individuals by outcompeting them for shared resources, such as space. Conversely, when there are few hierarchical individuals, then it is relatively easy for acephalous individuals to reinvade and hierarchy to collapse. The parameter βk controls the extent to which surplus production can increase group size. As figure 2 shows, when this is low then although hierarchy can invade, it does not remain stable. As βk increases, however, then the invasion of hierarchy brings about a large increase in group size that suppresses mutant acephalous individuals. The transition to larger groups thus locks individuals into hierarchy.

Figure 2.

Stable hierarchy requires that surplus resources translate into demographic expansion of group size (large value of βk). Demographic expansion removes the viability of the acephalous niche on a patch, locking individuals into hierarchy. Panels show the stability of hierarchy on a single patch in the metapopulation. Parameters: gA = 0.15, βr = 2.

The degree to which group size increases when hierarchy invades also depends upon how much of the surplus the leader retains for itself. Specifically, when leaders evolve to share more surplus resources with their followers, then the group can grow to a larger size (figure 1a,b, compared with 1c,d).

(c) When does cultural evolution lead to despotism?

What determines how much of the surplus the leader takes? A selection pressure exists for a leader to take more of the surplus, because this translates into an increased birth rate (equation (2.5)) and hence a greater number of offspring relative to the other hierarchical individuals on its patch (equation (2.1)). Moreover, because the leader of the next generation is chosen by random sampling of the offspring of hierarchical individuals on the patch, this increased reproduction also increases the probability that one of the current leader's offspring will remain as leader in the next generation. This continued occupancy of the leader role then increases the reproductive share of the leader's lineage even further.

However, a pressure also exists for the leader to take less surplus. This is because the total amount of surplus generated increases with increasing group size, which provides an incentive for a leader to have more followers. Followers, however, have a choice of leader because they may disperse from the group and join a different one, conditional on the amount of surplus that the leader takes (as given by their df trait). Thus, if the leader takes too much of the surplus then it will lose followers. This then means that less surplus will be generated for hierarchical individuals in the next generation, which can cause hierarchical individuals to be outcompeted by acephalous individuals on their patch.

The proportion of surplus that the leader takes is therefore a trade-off between opposing selection pressures. The balance depends upon the cost of dispersal—how easily individuals may leave one leader and follow another. If the cost of dispersal is low, then leaders are constrained in how much of the surplus they can monopolize. This is because when dispersal costs are low then followers evolve low tolerance values of df, such that they readily disperse if leaders retain a larger proportion of the surplus (figure 1a). Consequently, leaders evolve to share a large fraction of the surplus with their followers in order to prevent them from dispersing. On the other hand, as dispersal cost increases, then followers evolve larger tolerance values of df in order to avoid paying a high dispersal cost (figure 1c). As a result, the strategy of leaders coevolves to appropriate more of the surplus for their own reproduction, because their followers will not readily disperse to other groups.

Thus, in an ecology where dispersal is costly, evolution leads to more despotic groups. Moreover, this increased despotism is voluntarily tolerated by followers, in the sense that individuals which allow the leader to retain more surplus before dispersing outcompete both acephalous individuals, and followers that more readily disperse. Figure 3 demonstrates this coevolution of follower dispersal preference and leader strategy for the full range of dispersal costs.

Figure 3.

As dispersal cost increases, followers tolerate their leader behaving more despotically (a). This in turn means that they enjoy a smaller increase in their carrying capacity, as the leader is able to direct more of the surplus into increasing its own reproductive success relative to that of its followers (b). Results show the long-run time averages over 3 × 10^6 generations of the stochastic simulation. Parameters: βr = 20, gA = 0.15, βk = 100.

(d) Sensitivity to parameters and model assumptions

We systematically varied the advantage in surplus production that leaders confer relative to acephalous groups (gA). When leadership does not confer much advantage in surplus production, then acephalous individuals outcompete hierarchical individuals (electronic supplementary material, figure S2). We also investigated the effect of varying the coercive power of the leader (electronic supplementary material, figure S3), as measured by the maximal birth rate advantage it can enjoy from surplus production (βr). As this increases, then, for a given dispersal cost, leaders evolve to retain more of the surplus for themselves. Further, we investigated the effects of varying the intergenerational decay in surplus resources, [embedded image], including allowing for complete decay (electronic supplementary material, appendix S2 and figure S4). Finally, we allowed the offspring of a leader to disperse (electronic supplementary material, appendix S3 and figure S5). We found that varying all of these does not qualitatively affect our main results.

4. Discussion

We have presented a model that captures the dynamics of the transition from small egalitarian to larger despotic groups. In line with work by Hooper et al. [15], our model demonstrates that hierarchical systems of social organization can be voluntarily created by followers, rather than having to be imposed by a leader through coercion. This is in contrast to the current trend in archaeology that focuses on ‘agency’, that is, on how leaders promote their own interests at the expense of others. By such accounts, leadership is seen as benefiting the leader rather than the followers [4,6]. Yet, while it is certainly the case that leaders should be expected to promote their own ends, the agency of followers must also be considered [1,28,36]. If leadership provides no benefit to followers, then it is hard to see why previously egalitarian individuals would accept despotic appropriation of resources, unless there were coercive institutions such as a military already in place. But such institutional coercion could not have been paid for before a leader appropriated surplus resources, making it hard to see how hierarchy could become established [16,26].

The origin of despotism in human societies is similar to the problem addressed by reproductive skew theory [30]. In skew models, despotism is measured in terms of how much of the reproduction within a group is monopolized by a dominant individual. This is constrained by the outside options that subordinates have, either to live alone or in a different group. Skew models predict that dominants should behave more despotically as the feasibility of outside options decreases [37]. However, they do not consider the benefits leaders can provide to other group members in terms of surplus production, and so do not address how despotic leadership could evolve from an initial stable state of egalitarianism. Here, we have extended the basic logic of skew theory to incorporate the feedback between surplus production and demography that was likely to have been important during the Neolithic.

Previous work has explicitly modelled the formation of institutions to solve various collective action problems related to food production, as relevant to demographic growth in the Neolithic [31]. It was shown that groups could evolve institutionally coordinated punishment to secure cooperation in generating surplus resources, driving demographic expansion. This paper builds upon these results by investigating the political ecology of such institutions, in terms of the opportunities that they create for despotism as group size increases.

Hooper et al. [15] showed that hierarchy can evolve if leaders help to secure cooperation in the production of large-scale public goods, using a model with complete dispersal between groups every generation. Their static analysis implied that despotism should rise as the cost for followers of switching to a different leader increases. Our model has independently confirmed that this prediction holds in a demographically realistic setting, where the cost of switching leader is given a biological basis in terms of dispersal cost. Moreover, our model incorporates dynamic group size alongside explicit coevolution of leader despotism and follower tolerances. This framework has allowed us to demonstrate that the equilibrium of large groups with despotic leadership can actually be reached by gradual evolution, from an initial state of small egalitarian groups. Understanding the dynamics of this transition is one of the most pressing issues in Neolithic social evolution [16,26]. But previous models have not addressed the interaction between subsistence intensification, population size and dispersal costs. Our results demonstrate that the interaction between these factors provides a cogent explanation for the transition to large and despotic groups. We now turn to discuss the empirical evidence for this interaction during the Neolithic.

There is strong evidence that the presence of a leader conferred advantages in solving coordination problems related to food production in both complex hunter–gatherers [28,18,38] and agriculturalists [21–23]. Arnold [17] stresses the role of leaders in technological innovation that increased carrying capacity. For example, the Chumash, a maritime culture in the North American Pacific, developed large boats made of rare materials, which required teams of specialists to construct. Consequently, only high-status individuals could finance and organize their construction. The boats greatly increased productivity by allowing access to new marine resources, and by increasing the amount of resource that could be transferred simultaneously. This increased carrying capacity [17], but also led to increased stratification by providing surplus resources that boat owners could monopolize.

There is also evidence that leaders coordinated the construction of irrigation systems [21–23], even if not in the state-building sense argued by Wittfogel [20]. Spencer [22] presents archaeological evidence that the Purrón Dam, an irrigation system in prehispanic Mexico, was constructed by a faction that aspired to leadership. Because canal irrigation was essential for agriculture in this area, other individuals would have benefitted from following this faction in order to gain access to water [22]. Spencer presents evidence that population growth subsequently occurred, causing the leadership faction to coordinate many followers in the construction of a larger dam. Moreover, there is evidence that this expansion of both population size and the irrigation system led to increased social stratification, with elites beginning to trade surpluses that they controlled for prestige goods [22]. This fits the feedback between demographic expansion and hierarchy formation captured by our model.

An important question is why despotic hierarchy evolved under intensive food production, but not under hunting and gathering. Our results suggest that demography plays an important role in the stability of despotism. When groups are small, hierarchy can easily collapse if despots take too much resource. But if groups are larger, then density-dependent competition means that hierarchical individuals can outcompete acephalous individuals for shared resources, even when despots retain most of the surplus. Demographic expansion can therefore cause individuals to become locked into hierarchy, by destroying the viability of a previous non-hierarchical niche. Although human health appears to have declined with the origin of agriculture [39], and agriculture may initially have been less productive than hunter–gathering [40], cemetery data strongly imply that a demographic expansion indeed occurred during the Neolithic [8]. Other data indicate that the population density of hunter–gatherer groups is usually below 0.1 person per square mile, whereas that of early dry farmers is around 4 persons per square mile, and that of early irrigation farmers from 6 to 25 persons per square mile [7]. The construction of irrigation systems, for example, could thus trigger the coevolution of demographic expansion and despotism.
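To give a sense of scale for the density figures quoted above from [7], the following short sketch computes the implied fold increases. The constant and function names are assumptions introduced for illustration. Because hunter–gatherer density is reported as usually below 0.1 person per square mile, using 0.1 as the baseline gives lower bounds on the increase.

```python
# Population densities in persons per square mile, as quoted in the text [7].
HUNTER_GATHERER_UPPER = 0.1       # "usually below" this value
EARLY_DRY_FARMING = 4.0           # "around" this value
EARLY_IRRIGATION = (6.0, 25.0)    # reported range


def fold_increase(density, baseline=HUNTER_GATHERER_UPPER):
    """Ratio of a density to the hunter-gatherer baseline (hypothetical helper)."""
    return density / baseline


if __name__ == "__main__":
    print("dry farming: at least ~%.0f-fold" % fold_increase(EARLY_DRY_FARMING))
    low, high = (fold_increase(d) for d in EARLY_IRRIGATION)
    print("irrigation farming: at least ~%.0f- to ~%.0f-fold" % (low, high))
```

On these figures, early dry farming corresponds to a density roughly 40 times the hunter–gatherer upper bound, and early irrigation farming to roughly 60 to 250 times that bound.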

Our model predicts that despotism should increase with increasing dispersal costs, for which there is strong empirical support [5,30,41]. Carneiro [41] presents evidence that state formation (increased hierarchy) happens when relatively small areas of productive agricultural land are surrounded by geographical barriers. This then allows leaders to extract tribute from other individuals, whose options to leave the group are limited. For example, in Peru, early states evolved where agriculture was practiced in narrow valleys, making dispersal difficult. By contrast, states did not so readily evolve in the Amazon basin, where there were large expanses of agricultural land available, making dispersal relatively easy [41]. Allen [5] also stresses the role of dispersal costs in the creation of the despotic ancient Egyptian state. He argues that the deserts bordering the Nile made dispersal very costly, thus allowing the Pharaohs to extract a large surplus from agriculturalists. Similarly, technological development can increase dispersal costs. For example, irrigation farming was likely to tie agriculturalists to the irrigation system, again limiting free movement and choice of leader [20,22].

In conclusion, our model predicts that despotic social organization will evolve from an initial state of egalitarianism when: (i) leaders generate surplus resources leading to demographic expansion of their groups, which removes the viability of an acephalous niche in the same area; and (ii) high dispersal costs subsequently limit outside options for followers by restricting choice of leader. The empirical evidence reviewed here suggests that these conditions were likely to have been satisfied during the Neolithic.

Funding statement

This work was supported by Swiss NSF grant no. PP00P3-123344.


We thank Mark van Vugt and two anonymous reviewers for useful comments on the manuscript. The computations were performed at the Vital-IT Center for high-performance computing of the SIB Swiss Institute of Bioinformatics.

  • Received June 5, 2014.
  • Accepted July 15, 2014.

