How Rationality Leads to Sub-Optimal Outcomes

Human rationality: the exercise of reason, the unique human capacity to make sense of things, to establish and verify facts, and to justify beliefs and actions. Labelled ‘homo economicus’, or ‘economic man’, the concept of humans as rational beings goes further than our mere ability to reason: it depicts humans as self-interested actors who maximise their desired outcomes and minimise their costs. Weighing the costs and benefits of an action in this way is the core of rational choice theory, and a rational human being is one who compares these potential outcomes and selects the decision that best serves his goals, or more bluntly put, his selfish interests.

This being said, how is it possible that rational, self-interested human beings, making logical and seemingly sensible decisions, can produce an outcome that is not optimal, with benefits exceeding costs, but sub-optimal, both for the individual and for society? In the following examples I will demonstrate, through the theory of games, that such a result is perfectly achievable, and will highlight the concept of ‘rational irrationality’, whereby the application of self-interest leads to an inferior and socially irrational outcome. Originally drawn up in 1950 by Merrill Flood and Melvin Dresher, working at RAND, a US ‘think tank’, the prisoner’s dilemma is a classic analogy that demonstrates why two individuals acting rationally may not cooperate, even when it appears to be in their best interests to do so.

The situation is thus:

Two men are arrested but the police do not have enough evidence
to sentence either man. The pair are split up and placed in two
separate cells where they cannot communicate with one another
and the police officers offer each man a deal. They are given two
options: confess to the crime, or deny the crime. If both men confess,
they each will serve 5 years. If they both deny committing the crime,
each man receives a year on a lesser charge. However, if one man
confesses and the other one denies, the confessor gets off free and
the other man gets the biggest sentence of 10 years. What is the best
strategy for each man to adopt?

This can be shown using a pay-off matrix:

                          Prisoner 2
                       Deny       Confess
Prisoner 1  Deny       1, 1       10, 0
            Confess    0, 10      5, 5

(Each entry gives the sentences in years as: Prisoner 1, Prisoner 2.)

As we can see, the most desirable outcome is the one in the top-left entry, as both prisoners receive the shortest sentence and therefore minimise their cost. But can this optimal outcome be achieved?

The key to analysing any game in game theory is to work out what your opponent is likely to do and then select your best option in response. So, assume you are Prisoner 1. The severity of your sentence depends on Prisoner 2’s decision, and he has two options. If he sticks to denial, your options are limited to the first column, where it is clear that you should confess, as you would get away free rather than ending up with 1 year in jail. If, instead, Prisoner 2 confesses, your choice is reduced to column two. Deny the charge and you receive the maximum 10-year sentence, the worst possible outcome; confess and you receive 5 years. Once again, confessing is the better option. So you conclude that whatever Prisoner 2 does, your best option is to confess. As both prisoners are rational actors, both deduce that confessing is the best option. Thus confessing is the ‘dominant strategy’, even though it leads to a sub-optimal outcome for both players. Even if this game is repeated, the prisoners both realise that trying to cooperate with somebody whose actions they cannot control does not pay off, and they opt for the ‘confess’ strategy, and rationally so.
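This best-response reasoning can be sketched in a few lines of code. The payoffs below are the sentences from the matrix above (lower is better); the function names are purely illustrative:

```python
# Prisoner's dilemma payoffs: SENTENCES[(my_move, their_move)] gives
# (my_sentence, their_sentence) in years of jail, so lower is better.
SENTENCES = {
    ("deny", "deny"): (1, 1),
    ("deny", "confess"): (10, 0),
    ("confess", "deny"): (0, 10),
    ("confess", "confess"): (5, 5),
}

def best_response(their_move):
    """Return the move that minimises my own sentence,
    taking the opponent's move as fixed."""
    return min(["deny", "confess"],
               key=lambda my_move: SENTENCES[(my_move, their_move)][0])

# Whichever move the opponent picks, confessing is the best response,
# which is exactly what makes 'confess' the dominant strategy.
for their_move in ("deny", "confess"):
    print("if he plays", their_move, "-> play", best_response(their_move))
```

Running the loop shows that ‘confess’ is the best response to both of the opponent’s moves, confirming the dominant strategy.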

One criticism made of the prisoner’s dilemma is that the solution of ‘confess-confess’, resulting in 5 years apiece, surely cannot be the rational outcome, because rational decision-makers like our two prisoners would never deliberately choose an inferior result. This point would be valid only if the pair of criminals could communicate with one another, but such cooperation is ruled out by the assumption that both prisoners are kept in separate cells and cannot talk to each other. If they were allowed to talk, both men could collude and promise to ‘deny’, meaning each would receive the light sentence.

The prisoner’s dilemma is not merely an analogy depicting rational irrationality; it forms the basis of many real-life situations. The installation game is another example in which rationality does not lead to the optimal outcome.

Consider the following:

Two firms each own a coal-burning power plant and are in direct
competition to supply power to a small town. Currently the firms
split the market, each making £20 million in annual profits.
However, the local residents are pursuing a long-running lawsuit
to try and stop the firms polluting the area. Defending the lawsuit
is expensive, and will cost each firm £10 million a year. There is an
alternative: at £5 million each per year, the firms could install a filtering
system that would reduce their harmful emissions. The residents would then drop the lawsuit, and each firm would make net
profits of £15 million per annum. If one firm installs the filter but the other does not, the latter will be able to supply power at a lower price, thus increasing its market share. Its profits will rise to £20 million, while the rival firm that installed the filters will lose out, making only £5 million annually. What decision should each firm take?

                                 Firm B
                          Install       Don’t install
Firm A  Install           15, 15        5, 20
        Don’t install     20, 5         10, 10

A similar pay-off matrix can be drawn up; the numbers are annual profits in £ million, listed as (Firm A, Firm B).

As in the prisoner’s dilemma, the top-left entry is the most desirable: the total payout if both firms install the filters is £30 million, higher than in any other outcome. In addition, the local town would be better off too, as the power plants’ harmful emissions would be reduced. But again we must put ourselves in the shoes of the CEO of Firm A. Imagine you have just convinced yourself that installing filters is the best way forward, and since the CEO of Firm B is as professional and skilled as you are, it is rational to assume he has reached the same conclusion. If this were the case, however, you would then decide your firm ought to betray your plans for conservation and choose not to install your filters, thereby increasing your profits from £15 million to £20 million and shoving your rival’s face in the dirt, causing him to lose £10 million. Congratulations on your cunning plan!

But hang on, this is not the end. If you have worked out this logical plan of action, then so has the head of Firm B, which means he too will not install his filters. This is a real possibility, which means you should not even consider installing your filters, lest your rival pull the same stunt you were initially planning to exact on him and your profits plummet to just £5 million. Whatever Firm B does, it is in your best interest to choose the ‘don’t install’ option; this now becomes the dominant strategy, analogous to the ‘confess’ option in our prisoner’s dilemma.
So, as you can see, the CEO of each firm has acted rationally in his decision not to install filters, but this is rational irrationality at work: the best possible outcome of £15 million profit each and a cleaner environment for our town is not achieved; rather one in which each firm receives only £10 million and continues to pollute our nice little town. The fact of the matter is that the temptation to steal extra market share from your opponent, or the threat of him doing the same to you, outweighs the cooperative decision and the optimal outcome.

Despite this, the prisoner’s dilemma is not always a bad thing. In many industries, firms would like to restrict supply and raise the price of their goods or services, but such a strategy of fiendish collusion is difficult to create and maintain. This is because each of the actors in the market has an incentive to betray its opponents and raise its output, gaining more market share. This is especially true in industries with homogeneous goods, such as oil and airlines.
                       Firm B
                 Low           High
Firm A  Low      100, 100      25, 200
        High     200, 25       50, 50

The table above shows a typical production game. Each player again has a choice of two strategies: ‘low’ or ‘high’ output. It is clear that the mutually beneficial outcome is for both firms to run a ‘low’ output strategy, making profits of £100 million each. But if either firm plays ‘low’, the other now has an incentive to play ‘high’, as the firm with the higher output would see its profits double to £200 million while the other makes only £25 million. Therefore the dominant strategy for both firms is to play ‘high’, resulting in a sub-optimal outcome (for the firms) of £50 million each.
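The same best-response check applies here, except that payoffs are profits, so each firm maximises rather than minimises. A small sketch using the figures from the table (in £ million; function names are illustrative):

```python
# Production game payoffs: PROFITS[(a, b)] = (Firm A's profit, Firm B's profit)
# in GBP millions, matching the table above.
PROFITS = {
    ("low", "low"): (100, 100),
    ("low", "high"): (25, 200),
    ("high", "low"): (200, 25),
    ("high", "high"): (50, 50),
}

def best_output(rival_output):
    """Firm A's profit-maximising output level,
    taking the rival's output as fixed."""
    return max(["low", "high"],
               key=lambda mine: PROFITS[(mine, rival_output)][0])

# 'high' is the best response to either rival strategy,
# so 'high' is the dominant strategy for both firms.
print(best_output("low"), best_output("high"))
```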
In industries such as oil and airlines, prisoner’s dilemmas help to prevent cartels from operating effectively.

OPEC, the association of oil-producing countries, is one such example. Frequently the members of the association get together and make loud claims about cutting output and raising prices, but this rarely happens, thanks to our prisoner’s dilemma: ‘high’ remains the dominant strategy. For western consumers this is a blessing; here, the prisoner’s dilemma ensures a ready supply of cheap oil for our cars, heating and power.
Unfortunately, increased output can sometimes over-supply the market and lead to socially undesirable effects. For instance, when house prices are high and interest rates are low, banks will lower their credit standards in order to seek out quick profits. Borrowing becomes all too easy, and even borrowers who are desperately close to defaulting are allowed to take out big loans from the banks with the lowest credit standards. This is a classic example of ‘moral hazard’: a situation in which an operator protected from risk behaves differently from how it would behave if it were fully exposed to that risk. Initially, the more responsible banks hold back their lending and adopt a ‘low’ output strategy. However, as soon as even the most responsible banks start seeing their competitors gaining market share, the pressure to switch to ‘high’ output becomes irresistible. Eventually, almost all the banks are operating on a ‘high’ strategy, with disastrous consequences for everyone, as we saw with the economic recession of 2007-09, from which we are still reeling today. Although perhaps irresponsible, each individual bank operated rationally in response to its competitors’ decisions: it increased output to ensure its own survival, and in the process loans were handed out to people who had no chance of paying them back, an issue that was simply ignored.

So far these examples have all centred on big firms making big decisions; one may reasonably think the issue of rational irrationality has nothing to do with ordinary people. But this problem of our rational brains steering our actions towards a sub-optimal outcome plagues us even at a low, personal level.

One of Merrill Flood and Melvin Dresher’s first experiments in their theorising about cooperative gameplay involved Dresher’s three teenage children. Would they work together and reach a mutually beneficial solution? Or would they fall into the same trap as our cut-throat CEO pairings in the previous examples? He offered them a simple proposition: a babysitting job for which he was prepared to pay a maximum of $4. To decide which of the children got the job, he arranged a reverse auction, asking each teenager to submit the lowest wage they would accept to do the job. Despite forceful encouragement from Dresher to get together and cooperate, the young Dreshers became involved in a bidding war, submitting competing bids. The winning bid, which secured the job, was a mere $0.90! Had the three teens agreed to bid the full $4, they could have divided the money equally, receiving $1.33 each; instead, only one child was paid at all, and he earned $0.43 less than that. This time, even three inexperienced teenagers failed to achieve the optimal outcome, despite the fact that they were allowed, and explicitly encouraged, to communicate.

A further example is the auctioning of a £10 note. Designed by the economist Martin Shubik, this game illustrates a scenario in which players with perfect information are compelled to make an ultimately irrational decision based entirely on a sequence of rational choices made throughout the game.

An auctioneer is auctioning off a £10 note with the following rules: the note goes to the highest bidder, who pays the amount he bid. The second-highest bidder must also pay his highest bid, but receives nothing in return. To make matters easier, the bidders must bid in £1 increments.

Suppose the first player bids £1 to begin the game, hoping for a tidy £9 profit to take home. He is quickly outbid by the second player, who bids £2 in search of an £8 profit. The first bidder then seeks to overturn his current £1 loss by bidding £3 to gain a £7 profit, and in this way a series of bids is sustained. A clear problem arises when one bidder declares a £9 bid: his opponent, having previously bid £8, now faces a choice, either to back out and lose his £8, or to bid £10 for a £10 note and break even. Rationally, he elects to bid £10. But then the player who bid £9 will rationally choose to overpay for the note by bidding £11, losing only £1 instead of his £9. From this point the players continue to raise their bids, pushing the price far beyond £10, and neither stands to profit. A clearly irrational decision, to pay more than £10 for £10 in return, has been made via an entirely rational sequence of decisions.
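The escalation above can be simulated with a simple rule: a player raises by £1 whenever overbidding would lose him less than walking away from his standing bid. This is a sketch under that assumption, not Shubik’s own formulation; the `stop_at` cap exists only because, by this logic, the bidding never ends:

```python
def dollar_auction(prize=10, stop_at=15):
    """Simulate Shubik's all-pay escalation for a note worth `prize` GBP.
    Each player keeps raising by 1 while overbidding would cost less
    (bid - prize, if he wins) than abandoning his current standing bid."""
    bids = [0, 0]          # highest bid so far by player 0 and player 1
    bid, turn = 1, 0
    while bid <= stop_at:
        # Walking away forfeits bids[turn]; winning at `bid` costs bid - prize.
        # Stop only if overbidding is at least as costly as quitting
        # (which, in fact, never happens: hence the cap).
        if bid - prize >= bids[turn]:
            break
        bids[turn] = bid
        bid += 1
        turn = 1 - turn
    return max(bids)

# The winning bid climbs past the GBP 10 value of the note itself.
print(dollar_auction())
```

Raising `stop_at` shows the bidding simply climbs to whatever cap is imposed, which is the irrational outcome reached through rational steps.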

I had the pleasure of being fooled by such a game at the University of York, where I instantly jumped at the chance of making a nice sum of money to round off what had been a useful day out. It was not until I confidently yelled a triumphant bid of £9 that I realised what I had done, and what I now could not get out of. Thankfully, the lecturer did not force my opponent or me to pay up, but this was valuable first-hand experience of rational irrationality, and one of the many factors that stimulated my interest in this area of study.

So far, to capture the essential foundations of the prisoner’s dilemma, I have outlined two-player games. In reality, banks, CEOs and other actors are constantly facing an ‘n-person game’. We have seen how hard it is to sustain a cooperative outcome with just two players; with 10 or 50 or 100, it is almost impossible. Herein lies the crux of the matter: multiple individuals, acting independently and rationally in pursuit of their own self-interest, will ultimately deplete a shared, limited resource, even when it is clear that it is in no one’s long-term interest for this to happen.

In 1968, the Texan ecologist Garrett Hardin posed this issue in an article called ‘The Tragedy of the Commons’. He demonstrated the unsustainability of shared resources with the example of a pasture shared by local farmers. The size of the pasture is limited, and all the farmers know that overgrazing will render it useless for everyone. At the same time, though, each farmer’s income is determined by the size of his individual herd, creating an incentive to add more livestock to the pasture. Before adding another animal to his herd, each farmer considers the consequences of doing so. On the positive side, once the animal is fully grown he can sell it and keep all the profits for himself. On the negative side, the extra animal adds to the risk of overgrazing, which damages the pasture for all the farmers. Hardin was careful to point out that each farmer is primarily concerned with his own welfare.
As a consequence, the prospect of profit from another animal is far more inviting than worrying about the quality of the pasture for the other farmers. The rational farmer decides that the only sensible strategy is to add another animal to the pasture… and another. But this is the conclusion that every farmer reaches, and so every rational herdsman keeps increasing the size of his herd until the pasture is ruined by overgrazing. Therein lies the tragedy: each farmer is locked into a system that compels him to increase his herd without limit, in a world which is limited.
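The dynamic can be shown in a toy simulation. The numbers here (five farmers, a pasture capacity of fifty animals) are purely illustrative assumptions, not Hardin’s: the point is only that each individually rational addition carries the total past the limit.

```python
def graze(farmers=5, seasons=10, capacity=50):
    """Toy tragedy of the commons: every season, every farmer adds an
    animal, because the private gain from one more animal always exceeds
    his individual share of the shared cost of overgrazing."""
    herds = [1] * farmers          # each farmer starts with one animal
    for _ in range(seasons):
        for i in range(farmers):
            herds[i] += 1          # the individually rational choice
    total = sum(herds)
    overgrazed = total > capacity  # the collectively ruinous result
    return total, overgrazed

total, ruined = graze()
print(total, "animals;", "pasture ruined" if ruined else "pasture healthy")
```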

Although the story of the herdsmen is just a parable, it is a valuable analogy for the sustainability of our planet. Rivers, lakes, mountains, the oceans and oil reserves are all commons, yet we display rational irrationality by exploiting these vital resources through logging, pollution and deforestation. Blind reliance on self-interest is a recipe for further environmental catastrophes.

Perhaps the first step towards dealing with such issues is to recognise the slightly perverse nature of rational irrationality and how hard it is to tackle. Critics of game theorists often ask how it can possibly be rational for society to engineer its own ruin; surely everyone would be better off if everybody grabbed less of our ever-depleting resources. The error in such an argument is to assume that any one person can stand for ‘everybody’. We are all separate individuals with our own aims and desires, and we therefore act in our own way, for our own reasons. If we fail to grasp this key point, we will never get to grips with the Tragedy of the Commons.