Judgment Under Uncertainty: Heuristics and Biases by Amos Tversky and Daniel Kahneman

This article first appeared in Science, volume 185, in 1974. Tversky and Kahneman had been working for some time on unconscious biases in human judgement, and this paper summarises the findings of a number of their experiments. The paper was reprinted as an appendix to Kahneman’s 2011 book, Thinking, Fast and Slow. It is overflowing with ideas and insights about key aspects of how humans think. In the authors’ own words:

This article shows that people rely on a limited number of heuristic principles which reduce the complex tasks of assessing probabilities and predicting values to simpler judgmental operations.

The article focuses on three ‘heuristics’ which people use to assess probabilities and predict values and highlights their flaws and limitations. What is a heuristic? An intellectual short cut, a rule of thumb, a quick practical way of solving a problem.

The three heuristics discussed by the article are:

  1. Representativeness
  2. Availability
  3. Adjustment and anchoring

1. Representativeness

People make estimates and judgments of things and other people, based on their similarity to existing stereotypes, to representative types. This is the representativeness heuristic or, as it has come to be known, the representativeness bias. In doing so, people tend to ignore statistical and probabilistic factors which ought, in more rational thinking, to carry more weight.

T&K gave experimental subjects a description of ‘Steve’, describing him as shy and timid, meek and helpful and interested in order. The subjects were then asked to guess Steve’s profession from a list which included librarian and farmer. Most subjects guessed he was a librarian on the basis of his closeness to a pre-existing stereotype. But, given that there are ten times as many farmers in the U.S. as librarians and in the absence of any definitive evidence, in terms of pure probability, subjects should have realised that Steve is much more likely to be a farmer than a librarian.

In making this mistake, the subjects let the representativeness heuristic overshadow considerations of basic probability theory.
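To see the force of the base-rate argument, it helps to plug some numbers into Bayes’ theorem. The figures below (a 10:1 ratio of farmers to librarians, and my own guesses at how likely each group is to fit the ‘shy and meek’ description) are purely illustrative, not T&K’s, but they show how even a strong stereotype match can lose to a lopsided base rate:

```python
# Illustrative only: the base rate and the 'fits the description' percentages
# below are my own assumptions, not figures from Tversky and Kahneman.
p_farmer = 10 / 11              # assumed base rate: 10 farmers for every librarian
p_librarian = 1 / 11

p_desc_given_librarian = 0.40   # assumed: 40% of librarians fit the 'shy and meek' sketch
p_desc_given_farmer = 0.10      # assumed: 10% of farmers fit it

# Bayes' theorem: P(librarian | description)
p_desc = p_desc_given_librarian * p_librarian + p_desc_given_farmer * p_farmer
p_librarian_given_desc = p_desc_given_librarian * p_librarian / p_desc

print(round(p_librarian_given_desc, 2))  # ~0.29: even with a strong stereotype match,
                                         # Steve is still more likely to be a farmer
```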

Insensitivity to prior probability of outcomes

The prior probability, or base-rate frequency, is the background likelihood of an outcome before any specific evidence about the case in hand is taken into account: its basic probability.

T&K told experimental subjects there were ten people in a room, nine men and one woman. Then T&K told the subjects that one of these ten people is caring and sharing, kind and nurturing, and asked the subjects who the description was of. Without any concrete evidence, the chance of it being the woman is the same as it being any of the men i.e. 1 in 10. But the representativeness heuristic overrode an understanding of base rate probability, and most of the subjects confidently said this description must be of the woman. They were overwhelmingly swayed by the description’s conformity to stereotype.

Insensitivity to sample size

People don’t understand the significant difference which sample size makes to any calculation of probability.

Imagine a town has two hospitals, one large, one small. In the large one about 45 babies are born every day, in the small one about 15 babies. Now, the ratio of boy to girl babies born anywhere is usually around 50/50, but on particular days it can vary. Over a year, which hospital do you think had more days on which 60% or more of the babies born were boys?

When students were asked this question, 21 said the large hospital, 21 said the small hospital and 53 said it would be the same at both. The correct answer is the small hospital. Why? Because smaller samples are more likely to be unrepresentative, to have ‘freakish’ aberrations from the norm. T&K conclude that:

This fundamental notion of statistics is evidently not part of people’s repertoire of intuitions.
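A quick simulation makes the point concrete. This is just a sketch of mine using the figures above (roughly 45 and 15 births a day, a 50/50 sex ratio): it counts the share of simulated days on which at least 60% of the babies born are boys.

```python
import random

def share_of_60pct_boy_days(births_per_day, days=10_000):
    """Fraction of simulated days on which at least 60% of babies born are boys."""
    hits = 0
    for _ in range(days):
        boys = sum(random.random() < 0.5 for _ in range(births_per_day))
        if boys / births_per_day >= 0.6:
            hits += 1
    return hits / days

print("large hospital (45/day):", share_of_60pct_boy_days(45))  # roughly 0.12
print("small hospital (15/day):", share_of_60pct_boy_days(15))  # roughly 0.30
```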

Imagine an urn filled with balls. Two-thirds are one colour and one-third another; say the colours are red and white. One subject draws five balls and finds that four are red and one is white. Another subject draws 20 balls and finds that 12 are red and 8 are white. Which subject should feel more confident that 2/3 of the balls in the urn are red, and why?

Most people think it’s the first subject who should feel more confident. Four to one feels like – and is – a bigger ratio. Big is good. But they’re wrong. The second subject should feel more confident because, although his ratio is smaller – 3 to 2 – his sample size is larger. The larger the sample size, the closer you are likely to get to an accurate picture.
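You can check this with Bayes’ theorem. Assuming the two possibilities (‘2/3 red’ versus ‘1/3 red’) start off equally likely, the posterior odds in favour of ‘2/3 red’ come out as 2 raised to the power of (red draws minus white draws), so the 12-red, 8-white sample really is the stronger evidence. A sketch of that calculation (mine, not T&K’s):

```python
def posterior_odds_two_thirds_red(red, white):
    """Posterior odds that the urn is 2/3 red rather than 1/3 red,
    assuming both hypotheses start off equally likely."""
    likelihood_ratio = ((2 / 3) ** red * (1 / 3) ** white) / ((1 / 3) ** red * (2 / 3) ** white)
    return likelihood_ratio  # algebraically this is just 2 ** (red - white)

print(posterior_odds_two_thirds_red(4, 1))    # ~8  -> odds of 8:1 that the urn is 2/3 red
print(posterior_odds_two_thirds_red(12, 8))   # ~16 -> odds of 16:1, the stronger evidence
```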

Misconception of chance

Here are three sets of results from tossing a coin six times in a row, where T stands for tails and H stands for heads. Ask a selection of people which of the three sets is the random one.

  1. TTTTTT
  2. TTTHHH
  3. THHTTH

Most people will choose set 3 because it feels random. But, of course, all three are equally likely or unlikely. Tversky and Kahneman speculate that this is because people have in mind a representation of what randomness ought to look like, and let this override their statistical understanding (if they have any) that the total randomness of a system need not be exactly replicated at every level. In other words, a random series of tossing coins might well throw up sequences which appear to have order.

The gambler’s fallacy is the mistaken belief that, if you toss enough coins and get nothing but heads, the probability increases that the next result will be tails, because you expect the series to ‘correct’ itself.

People who fall for this fallacy are using a representation of fairness (just as in the example above they use a representation of chaos) and letting it override what ought to be a basic piece of statistical knowledge: each coin toss stands on its own and has its own probability, i.e. 50/50 or 0.5. The fact that someone has tossed a long run of heads is no reason at all to believe that their next toss will be tails.

(In reality we all know that sooner or later a tails is likely to appear, because of the law of large numbers: if you repeat a probabilistic event enough times, the overall proportion of outcomes tends towards the expected average. T&K shed light on the interaction of the gambler’s fallacy and the law of large numbers by clarifying that an unusual run of results is not ‘corrected’ by the coin (which obviously has no memory or intention); such runs are simply diluted, dissolved in the context of larger and larger samples.)
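A small simulation illustrates the ‘diluted, not corrected’ point. This sketch of mine starts every sequence with ten heads and then keeps tossing a fair coin: the next toss is always still 50/50, and the early surplus of heads is never paid back, it simply shrinks as a proportion of an ever larger sample.

```python
import random

def proportion_heads_after_streak(extra_tosses, streak=10):
    """Start with `streak` heads already thrown, then toss a fair coin
    `extra_tosses` more times; return the overall proportion of heads."""
    heads = streak + sum(random.random() < 0.5 for _ in range(extra_tosses))
    return heads / (streak + extra_tosses)

for n in (10, 100, 10_000, 1_000_000):
    print(n, round(proportion_heads_after_streak(n), 4))

# The proportion drifts back towards 0.5 not because tails become more likely
# after the streak, but because the initial 10 heads are swamped (diluted)
# by the growing number of later tosses.
```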

Insensitivity to predictability

Subjects were given descriptions of two companies, one described in glowing terms, one in mediocre terms, and then asked about their future profitability. Although neither description mentioned anything about profitability, most subjects were swayed by the representativeness heuristic to predict that the positively described company would have higher profits.

Two groups of subjects were given descriptions of a single practice lesson given by each of several student teachers. One group was asked to rate each teacher’s performance based on that one class; the other group was asked to predict each teacher’s relative standing five years in the future. The two groups’ ratings agreed. Despite the wild improbability of being able to predict anything five years ahead from one provisional piece of evidence, the subjects did just that.

The illusion of validity

People make judgments or predictions based on the degree of representativeness (the quality of the match between the selected outcome and the input) with no regard for probability or all the other factors which limit predictability. The illusion of validity is the unwarranted confidence produced when the input information closely matches a representative model (a stereotype). In other words, if it matches a stereotype, people will believe it.

Misconceptions of regression

Most people don’t a) understand where ‘regression to the mean’ applies or b) recognise it when they see it, preferring instead to give all sorts of spurious explanations. For example, a sportsman has a great season – the commentators laud him, he wins sportsman of the year – but his next season is lousy. Critics and commentators come up with all kinds of reasons to explain this decline, but the good year may simply have been a freak, and now he has regressed closer to his mean, average level of ability.
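Regression to the mean falls out of a very simple model: if each season’s performance is underlying ability plus random luck, then whoever tops the table in year one was probably lucky as well as good, and will usually score lower in year two. The following is an illustrative model of my own, not an example from the paper:

```python
import random

random.seed(1)

def one_season(ability):
    """A season's performance: underlying ability plus random luck."""
    return ability + random.gauss(0, 10)

abilities = [random.gauss(50, 10) for _ in range(200)]   # 200 players' true ability
year1 = [one_season(a) for a in abilities]
year2 = [one_season(a) for a in abilities]

best = max(range(len(abilities)), key=lambda i: year1[i])  # 'sportsman of the year'
print("year 1 score:", round(year1[best], 1))
print("year 2 score:", round(year2[best], 1))
# The year-2 score is almost always lower: not a mysterious slump,
# just regression towards the player's mean level of ability.
```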

2. Availability

Broadly speaking, this means going with the first thing that comes to mind. Like the two other heuristics, the availability heuristic has evolved because, in evolutionary terms, it is quick and useful. It does, however, in our complex industrial societies, lead to all kinds of biases and errors.

Biases due to the retrievability of instances

Experimenters read out a list of men’s and women’s names to two groups without telling them that the list contained exactly 25 men and 25 women, then asked the groups to estimate the ratio of the sexes. If the list included some famous men, the group was influenced to think there were more men; if it included a sprinkling of famous women, the group thought there were more women than men. Why? Because the famous names are more easily recalled, and so sway people into thinking there are more of them.

Salience

Seeing a house on fire makes people think about the danger of burning houses. Driving past a motorway accident makes people stop and think and drive more carefully (for a while). Then it wears off.

Biases due to the effectiveness of a search set

Imagine we sample words from a random text. Will there be more words starting with r, or more with r in the third position? For most people it is easier to call to mind words starting with r, so they think there are more of them, but there aren’t: there are more English words with r in the third position than words that start with r.

Asked to estimate whether abstract words like ‘love’ or concrete words like ‘door’ are more common, most subjects guess incorrectly that abstract words are more common. This is because abstract words are more salient – love, fear, hate – they carry more weight in the mind and are more available to conscious thought.

Biases of imaginability

Say you’ve got a room of ten people who have to be formed into committees. How many distinct committees can be created consisting of between 2 and 8 people? Almost everyone presented with this problem estimated there were many more possible committees of 2 than of 8, which is incorrect: there are exactly 45 possible committees of 2 and 45 possible committees of 8, because choosing the 2 people who sit on a committee is the same as choosing the 8 who are left off it. People favoured 2 because it was easier to quickly begin working out combinations of 2, and then extrapolate to the whole problem. This bias is very important when it comes to estimating the risk of any action, since we are programmed to call to mind big, striking, easy-to-imagine risks and often overlook hard-to-imagine risks (which is why risk factors should be written down and worked through as logically as possible).
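The committee numbers are easy to verify, since picking the 2 people included is the same decision as picking the 8 people excluded. A one-line check (mine, not the paper’s):

```python
from math import comb

print(comb(10, 2))   # 45 ways to pick a committee of 2 from 10 people
print(comb(10, 8))   # 45 ways to pick a committee of 8: the same number, because
                     # choosing the 2 included is the same as choosing the 8 excluded
```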

Illusory correlation

Subjects were given written profiles of several hypothetical mental patients along with drawings the patients were supposed to have made. When asked to associate the pictures with the diagnoses, subjects came up with all kinds of spurious connections: for example, told that one patient was paranoid and suspicious, many of the subjects read ‘suspiciousness’ into one of the drawings and associated it with that patient, and so on.

But there were no real connections between the profiles and the drawings; any apparent correlations were spurious. That didn’t stop the subjects from constructing complex and plausible networks of connections and correlations.

Psychologists speculate that this tendency to attribute meaning is because we experience some strong correlations, especially early in life, and then project them onto every situation we encounter, regardless of factuality or probability.

It’s worth quoting T&K’s conclusion in full:

Lifelong experience has taught us that, in general, instances of large classes are recalled better and faster than instances of less frequent classes; that likely occurrences are easier to imagine than unlikely ones; and that the associative connections between events are strengthened when the events frequently co-occur. As a result, man has at his disposal a procedure (the availability heuristic) for estimating the numerosity of a class, the likelihood of an event, or the frequency of co-occurrences, by the ease with which the relevant mental operations of retrieval, construction, or association can be performed.

However, as the preceding examples have demonstrated, this valuable estimation procedure results in systematic errors.

3. Adjustment and Anchoring

In making estimates and calculations people tend to start from whatever initial value they have been given. All too often this value is not only irrelevant or wrong, but people are also reluctant to adjust very far away from it. This is the anchor effect.

Insufficient adjustment

Groups were given estimating tasks, i.e. asked to estimate various quantities. Before each estimate the group watched the experimenter spin a wheel and land on a number entirely at random. Two groups were asked to estimate the percentage of African nations in the United Nations. The group which had watched the wheel land on 10 gave a median estimate of 25%; the group which had watched it land on 65 gave a median estimate of 45%.

Two groups of high school students were asked to estimate, within 5 seconds, the result of a multiplication: the first group was shown 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8, the second group 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1. With no time to complete the calculation, both groups extrapolated from the first few steps: the first group’s median guess was 512, the second group’s was 2,250. (Both were far too low: the answer is 40,320.)
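The arithmetic behind the two guesses can be spelled out. Both sequences multiply to 40,320, but a student who only gets a few steps in before extrapolating is anchored on very different partial products, as this small check shows:

```python
from math import prod

ascending = [1, 2, 3, 4, 5, 6, 7, 8]
descending = [8, 7, 6, 5, 4, 3, 2, 1]

print(prod(ascending))        # 40320, the true answer for both sequences
print(prod(ascending[:4]))    # 24, the partial product the first group anchors on
print(prod(descending[:4]))   # 1680, the much larger anchor for the second group
```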

Biases in the evaluation of conjunctive and disjunctive events

People tend to overestimate the probability of conjunctive events and underestimate the probability of disjunctive events. I found their explanation a little hard to follow here, but it means that when several events must all occur to produce a certain outcome, we overestimate the likelihood that all of them will happen; when only one of many events needs to occur, we underestimate that likelihood.

Thus: subjects were asked to take part in the following activities:

  • simple event: pull a red marble from a bag containing half red marbles and half white marbles
  • conjunctive event: pulling a red marble seven times in succession from a bag containing 90% red and 10% white marbles – the event only counts as a success if a red comes up all seven times
  • disjunctive event: pulling a red marble at least once in seven successive goes from a bag containing 10% red and 90% white marbles

So the simple event is a yes-no result with 50/50 odds; the conjunctive event requires seven successes in a row (a probability of just under a half, though it feels higher); and the disjunctive event needs only a single success in seven tries (a probability of just over a half, though it feels lower). Almost everyone overestimated the chances of the seven-in-succession event relative to the at-least-once-in-seven outcome.
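The actual probabilities are simple to compute: the conjunctive event multiplies seven 0.9 chances together, while the disjunctive event is one minus the chance of drawing no red at all in seven goes from the 10%-red bag. A quick check (assuming the bag compositions described above):

```python
p_simple = 0.5                  # one draw from a 50/50 bag
p_conjunctive = 0.9 ** 7        # seven reds in a row from the 90%-red bag
p_disjunctive = 1 - 0.9 ** 7    # at least one red in seven draws from the 10%-red bag

print(round(p_conjunctive, 3), p_simple, round(p_disjunctive, 3))
# 0.478 0.5 0.522: yet subjects preferred to bet on the conjunctive event
# and shied away from the disjunctive one.
```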

They then explain the real world significance of this finding. The development of a new product is a typically conjunctive event: a whole string of things must go right in order for the product to work. People’s tendency to overestimate conjunctive events leads to unwarranted optimism, which sometimes results in failure.

By contrast, disjunctive structures typically appear in the calculation of risk. In a complex system, just one component has to fail for the whole to fail. The chance of failure in each individual component might be low, but the chances compound across many components, producing a high probability that something will go wrong somewhere.

Yet people consistently underestimate the probability of disjunctive events, thus underestimating risk.

This explains why estimates for the completion of big, complex projects always tend to be over-optimistic – think Crossrail.
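Here is a worked example of the disjunctive risk point, with purely illustrative numbers of my own: a system of 50 components, each with just a 1% chance of failing, still has roughly a 40% chance that at least one of them (and so the whole system) fails.

```python
n_components = 50            # illustrative figures, not from the paper
p_component_fails = 0.01     # each component looks very reliable on its own

p_system_fails = 1 - (1 - p_component_fails) ** n_components
print(round(p_system_fails, 2))   # ~0.39: reliable-looking parts, risky whole
```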

Anchoring in the assessment of subjective probability distributions

This is the most technical section of the paper. The idea is to test how well calibrated people’s estimates are: subjects are asked to name values they are, say, 98% confident will bracket some unknown quantity, so a well-calibrated judge should be caught out only about 2% of the time. T&K report that people’s stated ranges are far too narrow (the true value falls outside them far more often than it should) and that the severity of this miscalibration depends on how the problem anchors the judge’s thinking.

Discussion

At the end of the summary of experiments, Tversky and Kahneman discuss their findings. This part was tricky to follow because they don’t discuss their findings’ impact on ordinary life in terms you or I might understand, but instead assess the impact of their findings on what appears to have been (back in 1974) modern decision theory.

I think the idea is that modern decision theory was based on a model of human rationality which was itself based on an idealised notion of logical thinking calculated from an assessment or ‘calibration’ of subjective decision-making.

Modern decision theory regards subjective probability as the quantified opinion of an ideal person.

I found it impossible to grasp the detail of this idea, maybe because they don’t explain it very well, assuming that the audience for this kind of specialised research paper would already be familiar with it. Anyway, Tversky and Kahneman say that their findings undermine the coherence of this model of ‘modern decision theory’, explaining why in technical detail which, again, I found hard to follow.

Instead, for the lay reader like myself, the examples they’ve assembled, and the types of cognitive and logical and probabilistic errors they describe, give precision and detail enough to support one’s intuition that people (including oneself) are profoundly, alarmingly, irrational.

Summary

In their words:

This article described three heuristics that are employed in making judgements under uncertainty: (i) representativeness, which is usually employed when people are asked to judge the probability that an object or event A belongs to class or process B; (ii) availability of instances or scenarios, which is often employed when people are asked to assess the frequency of a class or the plausibility of a particular development; and (iii) adjustment from an anchor, which is usually employed in numerical prediction when a relevant value is available.

These heuristics are highly economical and usually effective, but they lead to systematic and predictable errors. A better understanding of these heuristics and of the biases to which they lead could improve judgments and decisions in situations of uncertainty.

My thoughts

1. The most obvious thing to me, fresh from reading John Allen Paulos’s two books about innumeracy and Stuart Sutherland’s book on irrationality, is how much the examples used by Tversky and Kahneman are repeated almost verbatim in those books, and thus what a rich source of data this article was for later writers.

2. The next thought is that this is because those books, especially the Sutherland, copy the way that Tversky and Kahneman use each heuristic as the basis for a section of their text, which they then subdivide into component parts, or variations on the basic idea.

Reading this paper made me realise this is exactly the approach that Sutherland uses in his book, taking one ‘error’ or bias at a time, and then working through all the sub-types and examples.

3. My next thought is the way Sutherland and Paulos only use some of the examples in this paper, the ones – reasonably enough – which are most comprehensible. Thus the final section in Tversky and Kahneman’s paper – about subjective probability distributions – is not picked up in the other books because it is couched in such dense mathematical terminology as to be almost impenetrable and because the idea they are critiquing – 1970s decision making theory – is too remote from most people’s everyday concerns.

So: having already read Paulos and Sutherland, I found that not many of the examples Tversky and Kahneman use came as a surprise, nor did the basic idea of the availability error or representative error or the anchor effect.

But what did come over as new – what I found thought provoking – was the emphasis they put throughout on the fundamental usefulness of the heuristics.

Up till now – in Paulos and Sutherland – I had only heard negative things about these cognitive errors and prejudices and biases. It was a new experience to read Tversky and Kahneman explaining that these heuristics, these mental shortcuts, although often prone to error, have nonetheless evolved deep in our minds because they are fundamentally useful.

That set off a new train of thought, and made me reflect that Paulos, Sutherland, and Tversky and Kahneman all dwell on the drawbacks and limitations of these heuristics, leaving undescribed the many situations in which they are helpful.

Now, as Sutherland repeats again and again – we should never let ourselves be dazzled by salient and striking results (such as coincidences and extreme results), we should always look at the full set of all the data, we should make sure we consider all the negative incidents where nothing dramatic or interesting happened, in order to make a correct calculation of probabilities.

So it struck me that you could argue that all these books and articles which focus on cognitive errors are, in their own way, rather unscientific, or lack a proper sample size – because they only focus on the times when the heuristics result in errors (and, also, that these errors are themselves measured in highly unrealistic conditions, in psychology labs, using highly unrepresentative samples of university students).

What I’m saying is that for a proper assessment of the real place of these heuristics in actual life, you would have to take into account all the numberless times when they have worked – when these short-cut, rule-of-thumb guesstimates, actually produce positive and beneficial results.

It may be that for every time a psychology professor conducts a highly restricted and unrealistic psychology experiment on high school students or undergraduates which results in them making howling errors in probability or misunderstanding the law of large numbers or whatever — it may just be that on that day literally billions of ‘ordinary’ people are using the same heuristic in the kind of real world situations most of us encounter in our day-to-day lives, to make the right decisions for us, and to achieve positive outcomes.

The drawbacks of these heuristics are front and centre of Paulos and Sutherland and Tversky and Kahneman’s works – but who’s measuring the advantages?

