Optimal behaviour can violate the principle of regularity

Pete C. Trimmer

Abstract

Understanding decisions is a fundamental aim of behavioural ecology, psychology and economics. The regularity axiom of utility theory holds that a preference between options should be maintained when other options are made available. Empirical studies have shown that animals violate regularity, but this has not been understood from a theoretical perspective, so such decisions have been labelled as irrational. Here, I use models of state-dependent behaviour to demonstrate that choices can violate regularity even when behavioural strategies are optimal. I also show that the range of conditions over which regularity should be violated can be larger when options do not always persist into the future. Consequently, utility theory (based on axioms including transitivity, regularity and the independence of irrelevant alternatives) is undermined, because even alternatives that are never chosen by an animal (in its current state) can be relevant to a decision.

1. Introduction

We have penetrated far less deeply into the regularities obtaining within the realm of living things, but deeply enough nevertheless to sense at least the rule of fixed necessity … what is still lacking here is a grasp of the connections of profound generality. Albert Einstein

Animals must often choose from discrete options: where to hunt, what to eat, with whom to mate and whether to hide, flee or fight. The value of a particular option will typically depend on the ecological setting. I consider how optimal choices can be modified by the range of available options in the current environment.

The principle of regularity states that ‘the addition of an option to a choice set should never increase the probability of selecting an option from the original set’ [1, p. 664]. Regularity is an axiom of rational choice and is therefore a cornerstone of utility theory. In its simplest form, regularity means that if option B is chosen from A and B, then if another option, C, is also available, B should still be preferable to A, so either B or C should be chosen.
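The definition quoted above can be turned into a direct check on observed choice probabilities. The following is a minimal sketch rather than anything taken from the paper: the function name `regularity_violations` and the example probabilities are illustrative assumptions.

```python
def regularity_violations(p_original, p_expanded, tol=1e-12):
    """List the original options whose choice probability increased.

    p_original: dict mapping each option in the original choice set to the
                probability with which it is chosen.
    p_expanded: dict of choice probabilities after extra options are added.
    Regularity holds when the returned list is empty.
    """
    return sorted(
        option
        for option, prob in p_original.items()
        if p_expanded.get(option, 0.0) > prob + tol
    )


# Illustrative (assumed) deterministic choices of the kind considered here:
# B is always chosen from {A, B}, but A is always chosen from {A, B, C}.
binary = {"A": 0.0, "B": 1.0}
ternary = {"A": 1.0, "B": 0.0, "C": 0.0}
print(regularity_violations(binary, ternary))  # ['A']
```

An empty list means that no original option became more likely to be chosen, which is what regularity requires.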

Despite the rational underpinnings of the principle, empirical studies have shown regularity to be violated in a wide range of taxa, including honeybees and grey jays (Perisoreus canadensis) [2], hummingbirds [3,4] and human beings [5–7]. Such behaviour is deemed irrational and thought to be inexplicable without assuming suboptimal choices. For instance, by varying the amount of food and distance required to move into a tube to retrieve it, Shafir et al. [2] have shown that grey jays violate regularity and conclude that their behaviour contradicts normative accounts of behaviour.

This paper provides normative models which show that regularity should be violated under particular conditions, i.e. conditions exist whereby it is optimal to choose option B when options A and B are the only alternatives available, but A should be chosen when an additional option C is also available. Violation of regularity is demonstrated in two situations: with or without persistence of the available options. I also show that, counterintuitively, non-persistence of the options can increase the range of conditions under which optimal behaviour should violate regularity.

In each demonstration, I make use of state-dependent modelling [8]. It has long been known that animals in different states will place different values on the same options; as Bernoulli [9] put it, ‘a gain of one thousand ducats is more significant to a pauper than to a rich man’. Altering choice according to one's state (e.g. reserves) would not be a violation of regularity. Here, I am careful to compare the choices of animals which are in the same condition.

2. Models

I consider a predator choosing which herd of prey to attack, and suppose that there are (at most) three types of herd from which to choose. Each option has two aspects: the probability of catching a prey item and the probability of being killed while making an attack, as shown in table 1.

Table 1.

Fitness-related aspects of three options.

Food items vary in size and have a mean size of 2. (When an item is obtained, the size of the reward is: 1 with probability 0.25, 2 with probability 0.5, and 3 with probability 0.25.)

The predator is assumed to use one unit of reserves per time step. At each time step, it can choose which type of herd (of those available) to attack. The predator has a maximum limit to its reserves of 50, and will starve to death if reserves fall to zero. The time-independent strategy which maximizes the probability of survival over a long period is calculated using dynamic programming [8,10].
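The calculation described above can be sketched as backward induction over a long horizon. This is an illustrative sketch only, not the author's implementation. The values in `OPTIONS` are hypothetical placeholders, because the entries of table 1 are not reproduced in this text. The order of events within a time step (survive the attack, then possibly catch prey) is also an assumption. Rescaling the value function at each iteration keeps the numbers away from underflow and does not change which option is best.

```python
MAX_RESERVES = 50
# Reward sizes and probabilities as stated in the text (mean size 2).
REWARDS = ((1, 0.25), (2, 0.50), (3, 0.25))

# HYPOTHETICAL parameters: (probability of catching prey, probability of
# being killed during an attack). These are NOT the values of table 1.
OPTIONS = {
    "A": (0.40, 0.001),
    "B": (0.50, 0.003),
    "C": (0.80, 0.020),
}


def value_of_option(x, option, V):
    """Relative survival value of taking `option` at reserve level x >= 1."""
    p_food, p_death = OPTIONS[option]
    # One unit of reserves is used; any reward is capped at MAX_RESERVES.
    fed = sum(prob * V[min(x - 1 + size, MAX_RESERVES)] for size, prob in REWARDS)
    unfed = V[x - 1]  # V[0] is 0: the predator has starved
    return (1.0 - p_death) * (p_food * fed + (1.0 - p_food) * unfed)


def optimal_strategy(available, horizon=1000):
    """Return {reserve level: best option} after `horizon` backward steps."""
    V = [0.0] + [1.0] * MAX_RESERVES  # terminal reward: 1 if alive, 0 if dead
    policy = {}
    for _ in range(horizon):
        new_V = [0.0] * (MAX_RESERVES + 1)
        for x in range(1, MAX_RESERVES + 1):
            values = {o: value_of_option(x, o, V) for o in available}
            policy[x] = max(values, key=values.get)
            new_V[x] = values[policy[x]]
        top = max(new_V)
        V = [v / top for v in new_V]  # rescale; the argmax is unaffected
    return policy


policy_ab = optimal_strategy(("A", "B"))
policy_abc = optimal_strategy(("A", "B", "C"))
print(policy_ab[25], policy_abc[25])
```

Comparing `policy_ab` with `policy_abc` level by level shows whether the placeholder parameters produce a violation of regularity; no particular outcome is implied for these assumed values.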

(a) Model 1: persistent options

Model 1 assumes that the current herds will always be present and that no other herd will arrive, i.e. the available options persist into the future. For a given scenario (e.g. options A and B available), the optimal state-dependent strategy is calculated to identify which option should be chosen at each reserve level.

(b) Model 2: non-persistent options

In the real world, feeding options appear and disappear over time (e.g. owing to the movement of herds of prey). In model 2, the herds can appear or disappear with time.

It is assumed that each type of herd moves independently of the others, and that there are never two herds of the same type present at one time. When a particular herd is not present, it can re-appear at the next time step with some probability, p, and when a herd is present, it will disappear in the next time step with probability q. I assume that p and q are small, so the environment is auto-correlated. When there are no options, the predator loses one unit of reserves.
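Under these assumptions, the presence of each herd type follows a two-state Markov chain. The sketch below illustrates why small p and q make the environment autocorrelated; the function names are illustrative assumptions, and the formulae are the standard results for a two-state chain rather than anything specific to this paper.

```python
def stationary_presence(p, q):
    """Long-run probability that a herd is present."""
    return p / (p + q)


def presence_autocorrelation(p, q, lag):
    """Correlation between presence now and presence `lag` steps later."""
    return (1.0 - p - q) ** lag


def presence_after(present_now, p, q, steps):
    """Probability that the herd is present after `steps` time steps."""
    prob = 1.0 if present_now else 0.0
    for _ in range(steps):
        # Present herds stay with probability 1 - q; absent herds arrive with p.
        prob = prob * (1.0 - q) + (1.0 - prob) * p
    return prob


p = q = 0.001  # the values used for figure 2
print(stationary_presence(p, q))
print(presence_autocorrelation(p, q, 1))
print(presence_after(True, p, q, 100))
```

When p + q is close to zero, the correlation decays slowly with lag, so a herd that is present now is very likely to be present for many steps to come.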

For the sake of simplicity, it is assumed that there are no game-theoretic effects of herds responding to the predator's choices (e.g. by being more likely to disappear following an attack) and that each herd is sufficiently large that the predator will not run out of potential prey within a herd.

In this case, the optimal long-term strategy depends not only on the predator's reserve level but also on the current presence or absence of each of the three options. Having identified the optimal strategy, it is then possible to identify which option should be chosen at each reserve level, either when both options A and B are present, or when options A, B and C are present.
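The enlarged state space can be made concrete by enumerating the eight possible combinations of herds and the probability of moving between them. This is a sketch under the independence assumption stated for model 2; the names `presence_states` and `transition_probability` are illustrative assumptions.

```python
from itertools import product

HERDS = ("A", "B", "C")


def presence_states():
    """All 2**3 combinations of herds that can be present at one time."""
    return [
        frozenset(h for h, here in zip(HERDS, flags) if here)
        for flags in product((False, True), repeat=len(HERDS))
    ]


def transition_probability(current, following, p, q):
    """Probability of moving from one set of present herds to another.

    Herds move independently: an absent herd appears with probability p,
    and a present herd disappears with probability q.
    """
    prob = 1.0
    for herd in HERDS:
        if herd in current:
            prob *= (1.0 - q) if herd in following else q
        else:
            prob *= p if herd in following else (1.0 - p)
    return prob


states = presence_states()
both = frozenset({"A", "B"})
print(len(states))
print(transition_probability(both, both, 0.001, 0.001))
```

A full state in model 2 is then a pair (reserve level, set of present herds), and the dynamic programming step averages future values over these transition probabilities.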

3. Results

(a) Model 1: persistent options

Figure 1 shows which option should be taken at each level of reserves, either when only options A and B are available, or all three options are available.

Figure 1.

Violation of regularity at intermediate reserves. For reserves between the dashed lines, option B should be chosen if only options A and B are available (left column), but option A should be chosen if option C is also available (right column).

Option A should be chosen at high reserves, as this provides the least risk of predation while still potentially providing food. When option C is also available, it should be chosen at very low reserves owing to the risk of starvation, but it is not a good option at most reserve levels owing to the high risk of predation.

Figure 1 shows that for a range of reserve levels, B should be chosen from A and B, whereas A should be chosen from A, B and C. Thus, the optimal strategy produces behaviour which violates regularity.
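Given two state-dependent strategies, the reserve levels at which regularity is violated can be read off mechanically. The sketch below uses made-up threshold strategies purely for illustration; the thresholds are assumptions and are not the values shown in figure 1.

```python
def violation_range(policy_ab, policy_abc):
    """Reserve levels where B is chosen from {A, B} but A from {A, B, C}."""
    return [
        x
        for x in sorted(policy_ab)
        if policy_ab[x] == "B" and policy_abc.get(x) == "A"
    ]


# HYPOTHETICAL threshold strategies over reserve levels 1..50.
toy_ab = {x: ("B" if x <= 30 else "A") for x in range(1, 51)}
toy_abc = {x: ("C" if x <= 3 else "B" if x <= 12 else "A") for x in range(1, 51)}

levels = violation_range(toy_ab, toy_abc)
print(levels[0], levels[-1])  # 13 30
```

In this toy case, option C is never chosen at the violating reserve levels, yet its availability changes the choice between A and B there.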

The violation occurs because the value of an option depends on the options available to the animal in the future. Although option A provides the lowest risk of immediate mortality, reserves will tend to decrease under this option, so alternatives will sometimes need to be taken to prevent starvation. When option B is the only other available option, this must be taken even at quite high reserves, as it only provides an intermediate amount of food. When option C is also available, option A can be taken until reserves are relatively low before taking other options, because if the animal has been unfortunate (during a period of taking option B), then option C can be taken if the animal is about to starve. Consequently, although option C will not be taken at high reserves, its presence or absence can affect whether it is better to choose A or B.

If the risk of predation associated with option C were reduced (e.g. to 0.003), then option A should be taken right the way down to reserves of 1 when all three options were present, i.e. the effect would be even more pronounced.

Also, it is worth noting that if option C were replaced by a ‘hide’ option, where no food would be gained (so the predator is guaranteed to lose one unit of reserves in that time step) but there would be no risk of predation, then for any reserve level less than the maximum, it is better to take option A or B than to do nothing. At maximum reserves, it is better to hide than to seek food; this is not surprising, as the best that could be achieved would be maintenance of maximum reserves. However, the change also affects choices at lower reserves. At a reserve level of 32, given a choice of A or B, the predator should choose A, but with options of A, B or hide, it should choose B.

Thus, seemingly irrelevant options, which will not be chosen by a predator in a state anywhere near its current state, can be relevant.

(b) Model 2: non-persistent options

The optimal choices with respect to reserves are shown in figure 2, for situations where options A and B, or all three options, are currently available.

Figure 2.

Violation of regularity with non-persistent options. For a reserve level between the dashed lines, option B should be chosen when only options A and B are available (left column), but option A should be taken if all three options were available. Parameters: p = q = 0.001.

Contrasting figures 1 and 2, we see that regularity is violated over a greater range of reserve levels when the options are non-persistent. This occurs because with non-persistent options, the predator should choose option C at much higher reserve levels, whereas it is not necessary to take option B (rather than option A) at much higher reserves when all three options are present.

4. Discussion

The results show that optimal behaviour can violate the principle of regularity; given the choice of options A and B, it can be best to switch choices depending on whether another option, C (which will not be chosen at the current time), is present. Therefore, if rational behaviour is equated to optimal behaviour, the principle of regularity cannot be an a priori axiom of rational behaviour.

Bateson et al. [4] have shown that rufous hummingbirds (Selasphorus rufus) violate regularity, and conclude that they have identified ‘irrational choices in hummingbird foraging behaviour’. Similarly, Shafir et al. [2] have shown that grey jays (P. canadensis) and honeybees (Apis mellifera) violate regularity and conclude that the behaviour contradicts normative accounts of behaviour. By demonstrating that it can be best to violate regularity (using a normative model), I have shown that the animals are not necessarily acting suboptimally, as has previously been assumed.

Morgan et al. [11] show that the choices of rufous hummingbirds can violate regularity even when options differ in only one dimension, such as food concentration. This initially appears puzzling, as a single dimension makes the measure sound similar to utility. However, if in the normal ecology, concentration is related not only to food but is also correlated with other factors such as risk of mortality (or strain on the digestive organs of the bird), we can again see how options differing in only one dimension (from the experimenter's perspective; cf. [12]) could lead to a violation of regularity.

In individual choice theory, the principle of independence of irrelevant alternatives (IIA) means that ‘if an alternative x chosen from a set T is an element of a subset S of T, then x must be chosen from S’ [13, p. 4, 17]; this is also known as Sen's property α. Equating T with the set {A, B, C} in this paper, and S with {A, B}, we see that this definition of IIA is equivalent to regularity, i.e. the principle of IIA can also be violated by optimal behaviour. Note that others define IIA somewhat differently, leading to seemingly conflicting statements in different papers, e.g. ‘regularity is a special case of the principle of independence of irrelevant alternatives’ [4, p. 588] or ‘there is no logical connection between IIA and regularity’ [1, p. 632]. For a summary of various definitions and their logical differences, see [14]. Schuck-Paim et al. [15] show that European starlings (Sturnus vulgaris) violate regularity and also violate a stronger form of IIA known as the constant-ratio rule. This is also related to the ‘choice axiom’, whereby ‘if a choice set S contains two elements, a and b, such that a is never chosen over b when the choice is restricted to just a and b, then a can be deleted from S without affecting any of the choice probabilities’ [16]. I have dealt with absolute choices here (in a given state, 100% or 0% of choices are for a particular option), so the demonstration covers all forms of regularity, Sen's property α, the choice axiom and the constant-ratio rule. In each case, the principle should be violated by optimal behaviour.

This study is similar to Houston et al.'s [17] demonstration that optimal behaviour can violate the principle of transitivity (i.e. A is preferred to B, B is preferred to C, but C is preferred to A). In each case, the demonstration rests upon the autocorrelation of choices stretching into the future. A formal logician might argue that neither principle has been shown to be violated: by assuming that options persist into the future, Houston et al.'s [17] study shows that given a choice of A or B, with a continuing choice of A and B into the future, the animal should choose A. Thus, knowing that the animal should choose B of B and C when the future holds choices of only B and C formally tells us nothing about whether A or C should be chosen when the future consists of choices between A and C. However, from an operational perspective (i.e. measuring the preferences of animals by giving them options), it is clear that the principles of transitivity and regularity should sometimes be violated if fitness is to be maximized.

Violation of regularity is more general than the violation of transitivity, inasmuch as the violation of transitivity implies the violation of regularity (see the electronic supplementary material). It is therefore possible to conclude from other studies (showing violation of transitivity) that animals previously not known to violate regularity must do so (e.g. pigeons; see [18]). This is also consistent with Latty & Beekman [19] finding that under particular conditions, slime mould violates regularity while conforming to the principle of transitivity.

Some authors identify that optimal state-dependent behaviour can cause what seem like violations of rationality if the experiment has inadvertently shifted the reserves of the animal [15,20,21]. Here, I show that such violations can occur even when the reserves have not been altered and no additional learning has taken place. Others have attempted to explain the violation of regularity as a behavioural error which makes sense from the perspective of efficiency. For instance, Nicolis et al. [22] show that violations of transitivity can be caused by positive feedback in a population (e.g. in slime mould) and argue that the use of such feedback can be regarded as ‘a heuristic which often produces fast and accurate decisions’. Similarly, Edwards & Pratt [23] regard the violation of transitivity in ants as the result of cognitive constraints, with comparative evaluation of nest sites being a generally efficient shortcut to approximating absolute fitness, which they presume would cause no such violations (also expressed by Latty & Beekman [19]). Livnat & Pippenger [24] also identify that systematic errors will tend to be generated by agents with bounded rationality. Rather than explain away such violations, this paper shows that such behaviour can be optimal.

Houston [25] shows that rather than options having absolute fitness consequences, the value of an option depends on its context, i.e. the fitness consequences of an option can depend on what other options are likely to be available in the future. Thus, the condition of an animal is not sufficient to specify the absolute worth of an option. To make this clear in relation to regularity, let us imagine that there are just two vitamins which are crucial for health; say vitamin C (available in oranges) and vitamin B2 (available in meat or eggs). Let us assume that, without eating, we deplete our reserves of each vitamin at a similar rate. Given a choice of oranges or eggs, an individual may want to eat them both—but while eating one of the options, there may be a significant risk of the other option being eaten by someone else. If the individual has slightly lower levels of vitamin B2 than vitamin C, then they should choose the eggs. However, if meat were also available, with little risk of both the eggs and the meat being eaten if they choose the oranges first, then the oranges can be a better choice. Thus, we would again see a violation of regularity: animals in the same condition could choose eggs when faced with eggs or oranges, but oranges when faced with eggs, meat or oranges. This makes perfect sense if we regard the remaining options as probabilistic reserves (i.e. additional reserves to those within the body). Thus, the state of an individual is effectively linked to the state of the environment.

Without autocorrelation (i.e. if the options did not tend to persist over time, being independent from one time to another), this effect would disappear and every decision would, optimally, be transitive and regular. However, I have also shown that with non-persistent options, the scope of the effect can be greater (in terms of the range of body condition) than when the options are entirely persistent. Thus, the amount of autocorrelation in options over time can have subtle, and sometimes counterintuitive, consequences. (Even without autocorrelation of options into the future, it is best not to be entirely regular in many zero-sum games, as an opponent could exploit the regularity—but it is clear that it is best not to be too predictable in games, so I have not focused on the effect here.)

This work also indicates that when a behaviour is difficult to understand or explain, it may help to consider options which the animal would take when its condition had altered significantly, such as when it was starving. As a hypothetical example, let us say that lions often attack zebra or wildebeest; in one location, they seem to prefer wildebeest, whereas in another, near a watering hole where crocodiles reside, they tend to choose zebra. In trying to explain the difference in preference, it might be natural to presume that the crocodiles are in some way affecting the behaviour of the zebra or wildebeest which, in turn, affects how easily each prey type is attacked. However, the second model shows that if a starving lion has the additional (desperate) choice of attacking crocodiles, this alone could explain why its preferences differ even when it is not starving.

Utility theory assumes that each option has absolute fitness consequences, i.e. that each option has a fixed fitness value, irrespective of the other options present. If this were true, then regularity, a cornerstone of utility theory, would hold [1]. Other authors assume that it is suboptimal to violate regularity and hence propose reasons for the effect, such as decision heuristics, speed–accuracy trade-offs resulting in comparative methods being better [26], and effects of attention mechanisms (various reasons are summarized by Rieskamp et al. [1]). However, having identified that options do not have absolute fitness consequences (because the value of an option can depend on the availability of other options), we know that utility theory does not hold, so it is no longer clear whether the empirical findings of violation of regularity indicate any suboptimal behaviour.

This paper has considered regularity in a simple manner by assuming that choices in a given state are always the same, so an option is taken either 100 per cent or 0 per cent of the time by an animal in a particular state. However, behavioural studies of animals often produce probabilistic results, so given the choice of options A and B, for instance, A may be taken 80 per cent of the time, whereas B is chosen 20 per cent of the time. The principle of regularity then requires that an additional option, C, should not increase either of these percentages. However, it can be difficult to predict how the addition of another option should influence choices in these probabilistic scenarios. I leave this analysis to future papers.

Acknowledgements

This work was supported by the European Research Council (Evomech Advanced grant no. 250209 to A. I. Houston). Many thanks to Tim Fawcett, Andy Higginson, Alasdair Houston and anonymous referees for useful suggestions.

  • Received April 5, 2013.
  • Accepted May 14, 2013.

References
