Judgement Under Uncertainty: Heuristics and Biases by Amos Tversky and Daniel Kahneman

This article first appeared in Science, volume 185, in 1974. Tversky and Kahneman had been working for some time on unconscious biases in cognitive thinking and this paper summarises the findings of a number of their experiments. The paper was reprinted as an appendix in Kahneman’s 2011 book, Thinking, Fast and Slow. It is overflowing with ideas and insights about key aspects of how humans think, to be precise:

This article shows that people rely on a limited number of heuristic principles which reduce the complex tasks of assessing probabilities and predicting values to simpler judgmental operations.

The article focuses on three ‘heuristics’ which people use to assess probabilities and predict values and highlights their flaws and limitations. What is a heuristic? An intellectual short cut, a rule of thumb, a quick practical way of solving a problem.

The three heuristics discussed by the article are:

  1. Representativeness
  2. Availability
  3. Adjustment and anchoring

1. Representativeness

People make estimates and judgments of things and other people, based on their similarity to existing stereotypes, to representative types. This is the representativeness heuristic or, as it’s come to be known, the representativeness bias. In doing so, people tend to completely ignore statistical and probabilistic factors which ought, in more rational thinking, to carry more weight.

T&K gave experimental subjects a description of ‘Steve’, describing him as shy and timid, meek and helpful and interested in order. The subjects were then asked to guess Steve’s profession from a list which included librarian and farmer. Most subjects guessed he was a librarian on the basis of his closeness to a pre-existing stereotype. But, given that there are ten times as many farmers in the U.S. as librarians and in the absence of any definitive evidence, in terms of pure probability, subjects should have realised that Steve is much more likely to be a farmer than a librarian.

In making this mistake, the subjects let the representativeness heuristic overshadow considerations of basic probability theory.

Insensitivity to prior probability of outcomes The prior probability or base rate frequency describes the likely occurrence of the event being assessed, the likelihood of an event occurring without any other intervention, its basic probability.

T&K told experimental subjects there were ten people in a room, nine men and one woman. Then T&K told the subjects that one of these ten people is caring and sharing, kind and nurturing, and asked the subjects who the description was of. Without any concrete evidence, the chance of it being the woman is the same as it being any of the men i.e. 1 in 10. But the representativeness heuristic overrode an understanding of base rate probability, and most of the subjects confidently said this description must be of the woman. They were overwhelmingly swayed by the description’s conformity to stereotype.

Insensitivity to sample size People don’t understand the significant difference which sample size makes to any calculation of probability.

Imagine a town has two hospitals, one large, one small. In the large one about 45 babies are born every day, in the small one about 15 babies. Now, the ratio of boys and girl babies born anywhere is usually around 50/50, but on particular days it can vary. Over a year, which hospital do you think had more days on which 60% or more of the babies born were boys?

When students were asked this question, 21 said the large hospital, 21 said the small hospital and 53 said it would be the same at both. The correct answer is the small hospital. Why? Because smaller samples are more likely to be unrepresentative, to have ‘freakish’ aberrations from the norm. T&K conclude that:

This fundamental notion of statistics is evidently not part of people’s repertoire of intuitions.
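The hospital answer can be checked with a few lines of arithmetic. This is only a rough sketch: it assumes every birth is an independent 50/50 event and that the hospitals deliver exactly 15 and 45 babies a day; the function name `prob_boy_heavy_day` is my own invention, not anything from the paper.

```python
from math import comb


def prob_boy_heavy_day(births: int, percent: int = 60) -> float:
    """Chance that at least `percent` per cent of a day's births are boys.

    Assumes each birth is an independent 50/50 event (a simplification).
    """
    # Smallest whole number of boys that reaches the threshold (ceiling division).
    min_boys = -(-births * percent // 100)
    # Count the equally likely boy/girl sequences with that many boys or more.
    favourable = sum(comb(births, k) for k in range(min_boys, births + 1))
    return favourable / 2 ** births


for name, births in [("small", 15), ("large", 45)]:
    p = prob_boy_heavy_day(births)
    print(f"{name} hospital: {p:.3f} per day, about {365 * p:.0f} days a year")
```

Under these assumptions the small hospital has a boy-heavy day roughly 30% of the time, and the large hospital noticeably less often, which is the point T&K are making.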

Imagine an urn filled with balls. Two thirds are one colour, a third are another. A subject draws five balls and finds 4 are red and one is white. Another subject draws 20 balls and finds that 12 are red and 8 are white. Which subject should feel more confident that 2/3 of the balls in the urn are red, and why?

Most people think it’s the first subject who should feel more confident. Four to one feels like – and is – a bigger ratio. Big is good. But they’re wrong. The second subject should feel more confident because, although his ratio is smaller – 3 to 2 – his sample size is larger. The larger the sample size, the closer you are likely to get to an accurate picture.
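To see why the bigger sample wins, here is a small sketch of the odds calculation. It assumes the set-up in the original paper, which the summary above skips: the urn is either two thirds red or two thirds white, both possibilities are equally likely beforehand, and the balls are drawn independently. The function name is hypothetical.

```python
from fractions import Fraction


def odds_mostly_red(red: int, white: int) -> Fraction:
    """Posterior odds that the urn is 2/3 red rather than 2/3 white.

    Assumes equal prior odds and independent draws, so the posterior odds
    are simply the ratio of the two likelihoods.
    """
    majority, minority = Fraction(2, 3), Fraction(1, 3)
    if_mostly_red = majority ** red * minority ** white
    if_mostly_white = minority ** red * majority ** white
    return if_mostly_red / if_mostly_white


print(odds_mostly_red(4, 1))    # first subject: 5 balls  -> 8
print(odds_mostly_red(12, 8))   # second subject: 20 balls -> 16
```

On these assumptions the first subject's sample gives odds of 8 to 1 and the second subject's gives 16 to 1: what counts is the difference between the number of red and white balls drawn, not the ratio.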

Misconception of chance Here are three sets of results from tossing a coin six times in a row, where T stands for tails and H stands for heads. Ask a selection of people which of the three sets is the random one.

  1. TTTTTT
  2. TTTHHH
  3. THHTTH

Most people will choose set 3 because it feels random. But, of course, all three are equally likely or unlikely. Tversky and Kahneman speculate that this is because people have in mind a representation of what randomness ought to look like, and let this override their statistical understanding (if they have any) that the total randomness of a system need not be exactly replicated at every level. In other words, a random series of tossing coins might well throw up sequences which appear to have order.

The gambler’s fallacy is the mistaken belief that, if you toss a coin repeatedly and get nothing but heads, the probability increases that the next result will be tails, because you expect the series to ‘correct’ itself.

People who fall for this fallacy are using a representation of fairness (just as in the example above they use a representation of chaos) and letting it override what ought to be a basic knowledge of statistics, which is that each coin toss stands on its own and has its own probability i.e. 50/50 or 0.5. Just because someone tosses an increasing number of heads in a row is no reason at all for believing their next toss will be tails.

(In reality we all know that sooner or later a tails is likely to appear due to the law of large numbers, namely that if you perform probabilistic events enough times the overall proportion of outcomes is likely to converge on the expected average. T&K shed light on the interaction of the gambler’s fallacy and the law of large numbers by clarifying that an unusual run of results is not ‘corrected’ by the coin (which obviously has no memory or intention) – such runs are diluted by a large number of occurrences, they are dissolved in the context of larger and larger samples.)
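A quick simulation makes the point that the coin has no memory. This is an illustrative sketch, not anything from the paper: the function name, the streak length of five and the fixed random seed are all my own choices.

```python
import random


def heads_rate_after_streak(flips: int, streak: int, seed: int = 0) -> float:
    """Share of heads on tosses that come straight after `streak` or more heads."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    run = 0                     # length of the current run of heads
    followups = 0               # tosses made right after a long run of heads
    followup_heads = 0          # how many of those tosses were heads
    for _ in range(flips):
        is_heads = rng.random() < 0.5
        if run >= streak:
            followups += 1
            followup_heads += int(is_heads)
        run = run + 1 if is_heads else 0
    return followup_heads / followups


print(heads_rate_after_streak(flips=200_000, streak=5))
```

The rate that comes out sits close to 0.5: a run of five heads does nothing to make tails ‘due’.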

Insensitivity to predictability Subjects were given descriptions of two companies, one described in glowing terms, one in mediocre terms, and then asked about their future profitability. Although neither description mentioned anything about profitability, most subjects were swayed by the representativeness heuristic to predict that the positively described company would have higher profits.

Two groups of subjects were given descriptions of one practice lesson given by several student teachers. One group was asked to rate the teachers’ performances based on this one class, the other group was asked to predict the relative standing of the teachers five years in the future. The ratings of the groups agreed. Despite the wild improbability of being able to predict anything in five years time from one provisional piece of evidence, the subjects did just that.

The illusion of validity People make judgments or predictions based on the degree of representativeness (the quality of the match between the selected outcome and the input) with no regard for probability or all the other factors which limit predictability. The illusion of validity is the profound mental conviction engendered when the ‘input information’ approaches representative models (stereotypes). I.e. if it matches a stereotype, people will believe it.

Misconceptions of regression Most people don’t a) understand where ‘regression to the mean’ applies or b) recognise it when they see it, preferring to give all sorts of spurious explanations. For example, a sportsman has a great season – the commentators laud him, he wins sportsman of the year – but his next season is lousy. Critics and commentators come up with all kinds of reasons to explain this performance, but the good year might just have been a freak and now he has regressed closer to his average, mean ability.
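The sportsman example can be mimicked with a toy model. Everything here is an assumption for illustration: each player's season score is a fixed ‘skill’ plus fresh random ‘luck’, both drawn from the same bell curve, and the names and numbers are made up.

```python
import random
from statistics import mean


def star_players_next_season(players: int = 10_000, seed: int = 1):
    """Average score of season one's top 1%, in season one and in season two."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(players)]
    # Each season's score is the same underlying skill plus independent luck.
    season_one = [s + rng.gauss(0, 1) for s in skill]
    season_two = [s + rng.gauss(0, 1) for s in skill]
    # Pick the stars purely on their season-one results.
    ranked = sorted(range(players), key=lambda i: season_one[i], reverse=True)
    stars = ranked[: players // 100]
    return (mean(season_one[i] for i in stars),
            mean(season_two[i] for i in stars))


first, second = star_players_next_season()
print(f"stars in season one: {first:.2f}, same players in season two: {second:.2f}")
```

In this toy model the stars are still better than average in season two, but clearly worse than their own freak season, with no explanation needed beyond luck not repeating itself.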

2. Availability

Broadly speaking, this means going with the first thing that comes to mind. Like the two other heuristics, the availability heuristic has evolved because, in evolutionary terms, it is quick and useful. It does, however, in our complex industrial societies, lead to all kinds of biases and errors.

Biases due to the retrievability of incidences Experimenters read out a list of men and women to two groups without telling them that the list contained exactly 25 men and 25 women, then asked the groups to guess the ratio of the sexes. If the list included some famous men, the group was influenced to think there were more men; if the list included a sprinkling of famous women, the group thought there were more women than men. Why? Because the famous names carry more weight and influence people into thinking there are more of them.

Salience Seeing a house on fire makes people think about the danger of burning houses. Driving past a motorway accident makes people stop and think and drive more carefully (for a while). Then it wears off.

Biases due to the availability of a search set Imagine we sample words from a random text. Will there be more words starting with r or with r in the third position? For most people it is easier to call to mind words starting with r, so they think there are more of them, but there aren’t: there are more words in the English language with r in the third position than those which start with r.

Asked to estimate which are more common, abstract words like ‘love’ or concrete words like ‘door’, most subjects guess incorrectly that abstract words are more common. This is because they are more salient – love, fear, hate – and have more power in the mind: they are more available to conscious thought.

Biases of imaginability Say you’ve got a room of ten people. They have got to be formed into committees. How many committees can be created which consist of between 2 and 8 people? Almost all people presented with this problem estimated there were many more possible committees of 2 than of 8, which is incorrect. There are 45 possible ways to create committees of 2 and 45 ways to create committees of 8 (apparently). People prioritised 2 because it was easier to quickly begin working out combinations of 2, and then extrapolate this to the whole sample. This bias is very important when it comes to estimating the risk of any action, since we are programmed to call to mind big, striking, easy-to-imagine risks and often overlook hard-to-imagine risks (which is why risk factors should be written down and worked through as logically as possible).
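The committee figures are easy to confirm. A minimal sketch, using Python's built-in `math.comb` to count the ways of choosing k people from ten:

```python
from math import comb

people = 10
# Number of distinct committees of each size from 2 to 8.
committees = {size: comb(people, size) for size in range(2, 9)}

for size, count in committees.items():
    print(f"committees of {size}: {count}")
```

Choosing 2 people to be on a committee is the same as choosing 8 people to leave off it, which is why both sizes give 45. The count peaks in the middle, at 252 committees of 5.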

Illusory correlation Subjects were given written profiles of several hypothetical mental patients along with drawings the patients were supposed to have made. When asked to associate the pictures with the diagnoses, subjects came up with all kinds of spurious connections: for example, told that one patient was paranoid and suspicious, many of the subjects read ‘suspiciousness’ into one of the drawings and associated it with that patient, and so on.

But there were no connections. Both profiles and drawings were utterly spurious. But this didn’t stop all the subjects from making complex and plausible networks of connections and correlations.

Psychologists speculate that this tendency to attribute meaning is because we experience some strong correlations, especially early in life, and then project them onto every situation we encounter, regardless of factuality or probability.

It’s worth quoting T&K’s conclusion in full:

Lifelong experience has taught us that, in general, instances of large classes are recalled better and faster than instances of less frequent classes; that likely occurrences are easier to imagine than unlikely ones; and that the associative connections between events are strengthened when the events frequently co-occur. As a result, man has at his disposal a procedure (the availability heuristic) for estimating the numerosity of a class, the likelihood of an event, or the frequency of co-occurrences, by the ease with which the relevant mental operations of retrieval, construction, or association can be performed.

However, as the preceding examples have demonstrated, this valuable estimation procedure results in systematic errors.

3. Adjustment and Anchoring

In making estimates and calculations people tend to start from whatever initial value they have been given. All too often this value is wrong, yet people are reluctant to move far away from it. This is the anchor effect.

Insufficient adjustment Groups were given estimating tasks i.e. told to estimate various fairly easy values. Before each guess the group watched the invigilator spin a roulette wheel and pick a number entirely at random. Two groups were asked to estimate the number of African nations in the United Nations. The group which had watched the invigilator spin a roulette number of 10 guessed the number of nations at 25, the group which had watched him land a 65, guessed there were 45 nations.

Two groups of high school students were given these sums to calculate in 5 seconds: first group 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8, second group 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1. Without time to complete the sum both groups extrapolated from the part-completed task: first group guessed 512, second group guessed 2,250. (Both were wrong: it’s 40,320).
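Here is one way to picture why the two groups anchored so differently. The sketch simply prints the running product after each step of the sum; the idea that the students got through only the first few steps in 5 seconds is the explanation T&K offer, and the variable names are mine.

```python
from itertools import accumulate
from operator import mul

# Running products, left to right, for each version of the sum.
ascending = list(accumulate(range(1, 9), mul))       # 1 x 2 x 3 x ... x 8
descending = list(accumulate(range(8, 0, -1), mul))  # 8 x 7 x 6 x ... x 1

print("ascending: ", ascending)
print("descending:", descending)
```

After four steps the ascending group has only reached 24 while the descending group is already at 1,680, so it is no surprise the second group's guesses were higher, even though both sums end at 40,320.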

Biases in the evaluation of conjunctive and disjunctive events People tend to overestimate the probability of conjunctive events and underestimate the probability of disjunctive events. I found their explanation a little hard to follow here, but it seems to mean that when several events all need to occur in order to result in a certain outcome, we overestimate the likelihood that all of them will happen. If only one of many events needs to occur, we underestimate that probability.

Thus: subjects were asked to take part in the following activities:

  • simple event: pull a red marble from a bag containing half red marbles and half white marbles
  • conjunctive event: pulling a red marble seven times in succession from a bag containing 90% red and 10% whites – the point is, that this is only an event if it happens seven times in succession
  • disjunctive event: pulling a red marble at least once in seven successive goes

So the simple event is a yes-no result, with 50/50 odds; the conjunctive event requires that seven things happen in succession (a chance of about 48%, since 0.9 multiplied by itself seven times is roughly 0.48); and the disjunctive event requires only one success in seven attempts. Almost everyone overestimated the chances of the seven-times-in-succession event compared to the at-least-once-in-seven outcome.
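The three probabilities can be worked out directly. One assumption to flag: the summary above does not say what is in the bag for the disjunctive event, so this sketch uses the bag from the original paper (10% red, 90% white), and it assumes each marble is put back before the next draw.

```python
draws = 7

# Simple event: one draw from a bag that is half red.
p_simple = 0.5

# Conjunctive event: red on all seven draws from a bag that is 90% red.
p_conjunctive = 0.9 ** draws

# Disjunctive event: at least one red in seven draws from a bag that is
# 10% red (assumed, following the original paper). Easiest to compute as
# one minus the chance of seven whites in a row.
p_disjunctive = 1 - (1 - 0.1) ** draws

print(f"simple:      {p_simple:.3f}")
print(f"conjunctive: {p_conjunctive:.3f}")
print(f"disjunctive: {p_disjunctive:.3f}")
```

On these figures the conjunctive event (about 0.48) is actually the least likely of the three and the disjunctive event (about 0.52) the most likely, the reverse of what most subjects bet on.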

They then explain the real world significance of this finding. The development of a new product is a typically conjunctive event: a whole string of things must go right in order for the product to work. People’s tendency to overestimate conjunctive events leads to unwarranted optimism, which sometimes results in failure.

By contrast disjunctive structures are typically used in the calculation of risk. In a complex system, just one thing has to fail for the whole to fail. The chances of failure in each individual component might be low, but compounding those chances across many components results in a high probability that something will go wrong, somewhere.

Yet people consistently underestimate the probability of disjunctive events, thus underestimating risk.
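A back-of-envelope sketch shows how quickly small risks pile up. The numbers are made up for illustration (a 1% failure chance per component), and the calculation assumes components fail independently of each other, which real systems often don't.

```python
def prob_any_failure(p_fail: float, components: int) -> float:
    """Chance that at least one of `components` independent parts fails."""
    # The system survives only if every single part survives.
    p_all_survive = (1 - p_fail) ** components
    return 1 - p_all_survive


for n in (1, 10, 100, 500):
    print(f"{n:>3} components at 1% each: {prob_any_failure(0.01, n):.1%}")
```

With a hundred such components the chance that something, somewhere, fails is already well over half.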

This explains why estimates for the completion of big, complex projects always tend to be over-optimistic – think Crossrail.

Anchoring in the assessment of subjective probability distributions This is an advanced statistical concept which they did not explain very well. I think it was to do with how you set a kind of basic value for a person’s guesses and estimates, and T&K then proceed to show that these kinds of calibrations are often wildly inaccurate.

Discussion

At the end of the summary of experiments, Tversky and Kahneman discuss their findings. This part was tricky to follow because they don’t discuss their findings’ impact on ordinary life, in terms you or I might understand, but instead assess the impact of their findings on what appears to have been (back in 1974) modern decision theory.

I think the idea is that modern decision theory was based on a model of human rationality which was itself based on an idealised notion of logical thinking calculated from an assessment or ‘calibration’ of subjective decision-making.

Modern decision theory regards subjective probability as the quantified opinion of an ideal person.

I found it impossible to grasp the detail of this idea, maybe because they don’t explain it very well, assuming that the audience for this kind of specialised research paper would be totally au fait with it. Anyway, Tversky and Kahneman say that their findings undermine the coherence of this model of ‘modern decision theory’, explaining why in technical detail which, again, I found hard to follow.

Obviously, for the lay reader like myself, the examples they’ve assembled, and the types of cognitive and logical and probabilistic errors they describe, give precision and detail enough to support one’s intuition that people (including oneself) are profoundly, alarmingly, irrational.

Summary

In their words:

This article described three heuristics that are employed in making judgements under uncertainty: (i) representativeness, which is usually employed when people are asked to judge the probability that an object or event A belongs to class or process B; (ii) availability of instances or scenarios, which is often employed when people are asked to assess the frequency of a class or the plausibility of a particular development; and (iii) adjustment from an anchor, which is usually employed in numerical prediction when a relevant value is available.

These heuristics are highly economical and usually effective, but they lead to systematic and predictable errors. A better understanding of these heuristics and of the biases to which they lead could improve judgments and decisions in situations of uncertainty.

My thoughts

1. The most obvious thing to me, fresh from reading John Allen Paulos’s two books about innumeracy and Stuart Sutherland’s book on irrationality, is how much the examples used by Tversky and Kahneman are repeated almost verbatim in those books, and thus what a rich source of data this article was for later writers.

2. The next thought is that this is because those books, especially the Sutherland, copy the way that Tversky and Kahneman use each heuristic as the basis for a section of their text, which they then sub-divide down into component parts, or variations on the basic idea.

Reading this paper made me realise this is exactly the approach that Sutherland uses in his book, taking one ‘error’ or bias at a time, and then working through all the sub-types and examples.

3. My next thought is the way Sutherland and Paulos only use some of the examples in this paper, the ones – reasonably enough – which are most comprehensible. Thus the final section in Tversky and Kahneman’s paper – about subjective probability distributions – is not picked up in the other books because it is couched in such dense mathematical terminology as to be almost impenetrable and because the idea they are critiquing – 1970s decision making theory – is too remote from most people’s everyday concerns.

So: having already read Paulos and Sutherland, not many of the examples Tversky and Kahneman use came as a surprise, nor did the basic idea of the availability error or representative error or the anchor effect.

But what did come over as new – what I found thought provoking – was the emphasis they put throughout on the fundamental usefulness of the heuristics.

Up till now – in Paulos and Sutherland – I had only heard negative things about these cognitive errors and prejudices and biases. It was a new experience to read Tversky and Kahneman explaining that these heuristics – these mental shortcuts – although they are often prone to error – nonetheless, have evolved deep in our minds because they are fundamentally useful.

That set off a new train of thought, and made me reflect that Paulos, Sutherland and Tversky and Kahneman are all dwelling on the drawbacks and limitations of these heuristics, leaving the many situations in which they are helpful, undescribed.

Now, as Sutherland repeats again and again – we should never let ourselves be dazzled by salient and striking results (such as coincidences and extreme results), we should always look at the full set of all the data, we should make sure we consider all the negative incidents where nothing dramatic or interesting happened, in order to make a correct calculation of probabilities.

So it struck me that you could argue that all these books and articles which focus on cognitive errors are, in their own way, rather unscientific, or lack a proper sample size – because they only focus on the times when the heuristics result in errors (and, also, that these errors are themselves measured in highly unrealistic conditions, in psychology labs, using highly unrepresentative samples of university students).

What I’m saying is that for a proper assessment of the real place of these heuristics in actual life, you would have to take into account all the numberless times when they have worked – when these short-cut, rule-of-thumb guesstimates, actually produce positive and beneficial results.

It may be that for every time a psychology professor conducts a highly restricted and unrealistic psychology experiment on high school students or undergraduates which results in them making howling errors in probability or misunderstanding the law of large numbers or whatever – it may just be that on that day literally billions of ‘ordinary’ people are using the same heuristic in the kind of real world situations most of us encounter in our day-to-day lives, to make the right decisions for us, and to achieve positive outcomes.

The drawbacks of these heuristics are front and centre in Paulos and Sutherland and Tversky and Kahneman’s works – but who’s measuring the advantages?



Alex’s Adventures In Numberland by Alex Bellos (2010)

Alexander Bellos (born in 1969) is a British writer and broadcaster. He is the author of books about Brazil and mathematics, as well as having a column in The Guardian newspaper. After adventures in Brazil (see his Wikipedia page) he returned to England in 2007 and wrote this, his first book. It spent four months in the Sunday Times bestseller list and led on to five more popular maths books.

It’s a hugely enjoyable read for three reasons:

  1. Bellos immediately establishes a candid, open, good bloke persona, sharing stories from his early job as a reporter on the Brighton Argus, telling some colourful anecdotes about his time in Brazil and then being surprisingly open about the way that, when he moved back to Britain, he had no idea what to do. The tone of the book is immediately modern, accessible and friendly.
  2. However this doesn’t mean he is verbose. The opposite. The book is packed with fascinating information. Every single paragraph, almost every sentence contains a fact or insight which makes you sit up and marvel. It is stuffed with good things.
  3. Lastly, although its central theme is mathematics, it approaches this through a wealth of information from the humanities. There is as much history and psychology and anthropology and cultural studies and philosophy as there is actual maths, and these are all subjects which the average humanities graduate can immediately relate to and assimilate.

Chapter Zero – A Head for Numbers

Alex meets Pierre Pica, a linguist who’s studied the Munduruku people of the Amazon and discovered they have little or no sense of numbers. They only have names for numbers up to five. Also, they cluster numbers together logarithmically i.e. the higher the numbers, the closer together they cluster them. The same thing is done by kindergarten children, who only slowly learn that numbers are evenly spaced, in a linear way.

This may be because small children and the Munduruku don’t count so much as estimate using the ratios between numbers.

It may also be because above a certain number (five) Stone Age man needed to make quick estimates along the lines of, Are there more wild animals / members of the other gang, than us?

Another possibility is that distance appears to us to be logarithmic due to perspective: the first fifty yards we see in close detail, the next fifty yards not so detailed, beyond 100 yards looking smaller, and so on.

It appears that we have to be actively taught when young to overcome our logarithmic instincts, and to apply the rule that each successive whole number is an equal distance from its predecessor and successor i.e. the whole numbers lie along a straight line at regular intervals.

More proof that the logarithmic approach is the deep, hard-wired one is the way most of us revert to its perspective when considering big numbers. As John Allen Paulos laments, people make no end of fuss about discrepancies between 2 or 3 or 4 – but are often merrily oblivious to the difference between a million and a billion, let alone a trillion. For most of us these numbers are just ‘big’.

He goes on to describe experiments done on chimpanzees, monkeys and lions which appear to show that animals have the ability to estimate numbers. And then onto experiments with small babies which appear to show that as soon as they can focus on the outside world, babies can detect changes in number of objects.

And it appears that we also have a further number skill, that of guesstimating things – the journey takes 30 or 40 minutes, there were twenty or thirty people at the party, you get a hundred, maybe a hundred and fifty peas in a sack. When it comes to these figures almost all of us give rough estimates.

To summarise:

  • we are sensitive to small numbers, acutely so of 1, 2, 3, 4, less so of 5, 6, 7, 8, 9
  • left to our own devices we think logarithmically about larger numbers i.e. lose the sense of distinction between them, clump them together
  • we have a good ability to guesstimate medium size numbers – 30, 40, 100

But it was only with the invention of notation, a way of writing numbers down, that we were able to create the linear system of counting (where every number is 1 larger than its predecessor, laid out in a straight line, at regular intervals).

This cultural invention enabled human beings to transcend our vague guesstimating abilities, and laid the basis for the systematic manipulation of the world which followed.

Chapter One – The Counter Culture

The probable origins of counting lie in stock taking in the early agricultural revolution some 8,000 years ago.

We nowadays count using a number base 10 i.e. the decimal system. But other bases have their virtues, especially base 12. It has more factors i.e. is easier to divide: 12 can be divided neatly by 2, 3, 4 and 6. A quarter of 10 is 2.5 but of 12 is 3. A third of 10 is 3.333 but of 12 is 4. Striking that a version of the duodecimal system (pounds, shillings and pence) hung on in Britain till we finally went metric in the 1970s. There is even a Duodecimal Society of America which still actively campaigns for the superiority of a base 12 counting scheme.
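The claim that 12 divides more neatly than 10 takes one line to check. A tiny sketch (the helper name is my own):

```python
def neat_divisors(base: int) -> list[int]:
    """Whole numbers, other than 1 and the base itself, that divide it exactly."""
    return [d for d in range(2, base) if base % d == 0]


print("base 10:", neat_divisors(10))
print("base 12:", neat_divisors(12))
```

Ten can only be split cleanly into halves and fifths, whereas twelve also gives you thirds, quarters and sixths.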

Bellos describes a bewildering variety of other counting systems and bases. In 1716 King Charles XII of Sweden asked Emanuel Swedenborg to devise a new counting system with a base of 64. The Arara in the Amazon count in pairs. The Renaissance author Luca Pacioli was just one of hundreds who have devised finger-based systems of counting – indeed, the widespread use of base 10 probably stems from the fact that we have ten fingers and toes.

He describes a complicated Chinese system where every part of the hand and fingers has a value which allows you to count up to nearly a billion – on one hand!

The Yupno system attributes a different value to each part of the body up to its highest number, 33, represented by the penis.

Diagram showing numbers attributed to parts of the body by the Yupno tribe

There’s another point to make about his whole approach which comes out if we compare him with the popular maths books by John Allen Paulos which I’ve just read.

Paulos clearly sees the need to leaven his explanations of comparative probability and Arrow’s Theorem and so on with lighter material and so his strategy is to chuck into his text things which interest him: corny jokes, anecdotes about baseball, casual random digressions which occur to him in mid-flow. But all his examples clearly 1. emanate from Paulos’s own interests and hobby horses (especially baseball) and 2. are tacked onto the subjects being discussed.

Bellos, also, has grasped that the general reader needs to be spoonfed maths via generous helpings of other, more easily digestible material. But Bellos’s choice of material arises naturally from the topic under discussion. The humour emerges naturally and easily from the subject matter instead of being tacked on in the form of bad jokes.

You feel yourself in the hands of a master storyteller who has all sorts of wonderful things to explain to you.

In the fourth millennium BC, an early counting system was created by pressing a reed into soft clay. By 2700 BC the Sumerians were using cuneiform. And they had number symbols for 1, 10, 60 and 3,600 – a mix of decimal and sexagesimal systems.

Why the Sumerians grouped their numbers in 60s has been described as one of the greatest unresolved mysteries in the history of arithmetic. (p.58)

Measuring in 60s was inherited by the Babylonians, the Egyptians and the Greeks and is why we still measure hours in 60 minutes and the divisions of a circle by 360 degrees.

I didn’t know that after the French Revolution, when the National Convention introduced the decimal system of weights and measures, it also tried to decimalise time, introducing a new system whereby every day would be divided into ten hours, each of a hundred minutes, each divided into 100 seconds. Thus there were a very neat 10 x 100 x 100 = 100,000 seconds in a day. But it failed. An hour of 60 minutes turns out to be a deeply useful division of time, intuitively measurable, and a reasonable amount of time to spend on tasks. The reform was quietly dropped after six months, although revolutionary decimal clocks still exist.
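For anyone curious what the time would be on a revolutionary clock, here is a rough conversion sketch. It relies only on the figures above (10 hours of 100 minutes of 100 seconds) and the ordinary 24 x 60 x 60 day; the function name is hypothetical.

```python
DECIMAL_SECONDS_PER_DAY = 10 * 100 * 100    # 100,000
STANDARD_SECONDS_PER_DAY = 24 * 60 * 60     # 86,400


def to_decimal_time(hours: int, minutes: int, seconds: int = 0):
    """Convert ordinary clock time to (decimal hours, minutes, seconds)."""
    elapsed = hours * 3600 + minutes * 60 + seconds
    # Scale the elapsed part of the day onto the 100,000-second decimal day.
    decimal_elapsed = elapsed * DECIMAL_SECONDS_PER_DAY // STANDARD_SECONDS_PER_DAY
    decimal_hours, remainder = divmod(decimal_elapsed, 100 * 100)
    decimal_minutes, decimal_seconds = divmod(remainder, 100)
    return decimal_hours, decimal_minutes, decimal_seconds


print(to_decimal_time(12, 0))   # noon
print(to_decimal_time(18, 0))   # six in the evening
```

Noon comes out as 5 o'clock and six in the evening as 7:50, which gives some idea of how alien the system must have felt.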

Studies consistently show that Chinese children find it easier to count than European children. This may be because of our system of notation, or the structure of number names. Instead of eleven or twelve, Chinese, Japanese and Koreans say the equivalent of ten one, ten two. 21 and 22 become two ten one and two ten two. It has been shown that this makes it a lot simpler and more intuitive to do basic addition and subtraction.

Bellos goes on to describe the various systems of abacuses which have developed in different cultures, before explaining the phenomenal popularity of abacus counting, abacus clubs, and abacus championships in Japan, which help kids develop the ability to perform anzan, using the mental image of an abacus to do sums at phenomenal speed.

Chapter Two – Behold!

The mystical sense of the deep meaning of numbers, from Pythagoras with his vegetarian religious cult of numbers in 6th century BC southern Italy to Jerome Carter who advises leading rap stars about the numerological significance of their names.

Euclid and the elegant and pure way he deduced mathematical theorems from a handful of basic axioms.

A description of the basic Platonic shapes leads into the nature of tessellating tiles, and the Arab pioneering of abstract design. The complex designs of the Sierpinski carpet and the Menger sponge. And then the complex and sophisticated world of origami, which has its traditionalists, its pioneers and surprising applications to various fields of advanced science, introducing us to the American guru of modern origami, Robert Lang, and the Japanese rebel, Kazuo Haga, father of Haga’s Theorem.

Chapter Three – Something About Nothing

A bombardment of information about the counting systems of ancient Hindus, Buddhists, about number symbols in Sanskrit, Hebrew, Greek and Latin. How the concept of zero was slowly evolved in India and moved to the Muslim world with the result that the symbols we use nowadays are known as the Arabic numerals.

A digression into ‘a set of arithmetical tricks known as Vedic Mathematics’ devised by a young Indian swami at the start of the twentieth century, Bharati Krishna Tirthaji, based on a series of 16 aphorisms which he found in the ancient holy texts known as the Vedas.

Shankaracharya is a commonly used title of heads of monasteries called mathas in the Advaita Vedanta tradition. Tirthaji was the Shankaracharya of the monastery at Puri. Bellos goes to visit the current Shankaracharya who explains the closeness, in fact the identity, of mathematics and Hindu spirituality.

Chapter Four – Life of Pi

An entire chapter about pi which turns out not only to be a fundamental aspect of calculating the circumferences and areas of circles and the volumes of spheres, but also to have a long history of mathematicians vying with each other to work out its value to as many decimal places as possible (we currently know the value of pi to 2.7 trillion decimal places) and the surprising history of people who have set records reciting the value of pi.

Thus, in 2006, retired Japanese engineer Akira Haraguchi set a world record for reciting the value of pi to the first 100,000 decimal places from memory! It took 16 hours with five-minute breaks every two hours to eat rice balls and drink some water.

There are several types or classes of numbers:

  • natural numbers – 1, 2, 3, 4, 5, 6, 7…
  • integers – all the natural numbers, but including the negative ones as well – …-3, -2, -1, 0, 1, 2, 3…
  • fractions, which are also called rational numbers
  • irrational numbers – numbers which cannot be written as fractions
  • transcendental numbers – ‘a transcendental number is an irrational number that cannot be described by an equation with a finite number of terms’

The qualities of the heptagonal 50p coin and the related qualities of the Reuleaux triangle.

Chapter Five – The x-factor

The origin of algebra (in Arab mathematicians).

Bellos makes the big historical point that for the Greeks (Pythagoras, Plato, Euclid) maths was geometric. They thought of maths as being about shapes – circles, triangles, squares and so on. These shapes had hidden properties which maths revealed, thus giving – the Pythagoreans thought – insight into the secret deeper values of the world.

It is only with the introduction of algebra in the 17th century (Bellos attributes its widespread adoption to Descartes’s Method in the 1640s) that it is possible to fly free of shapes into whole new worlds of abstract numbers and formulae.

Logarithms turn the difficult operation of multiplication into the simpler operation of addition. If X x Y = Z, then log X + log Y = log Z. They were invented by a Scottish laird John Napier, and publicised in a huge book of logarithmic tables published in 1614. Englishman Henry Briggs established logarithms to base 10 in 1628. In 1620 Englishman Edmund Gunter marked logarithms on a ruler. Later in the 1620s Englishman William Oughtred placed two logarithmic rulers next to each other to create the slide rule.
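To see the trick at work, here is a minimal Python sketch (an illustration added here, not taken from the book; the function name `multiply_via_logs` is made up). It multiplies two positive numbers by adding their base-10 logarithms and then taking the antilog, which is roughly what a slide rule does mechanically.

```python
import math


def multiply_via_logs(x: float, y: float) -> float:
    """Multiply two positive numbers by adding their base-10 logarithms."""
    log_sum = math.log10(x) + math.log10(y)  # log X + log Y = log Z
    return 10 ** log_sum                     # the antilog recovers Z


# Agrees with 37 x 52 = 1924 up to floating-point rounding
print(multiply_via_logs(37.0, 52.0))
```

The answer is only as exact as the logarithms used, which is why printed log tables and slide rules gave results to a limited number of significant figures.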

Three hundred years of dominance by the slide rule was brought to a screeching halt by the launch of the first pocket calculator in 1972.

Quadratic equations are equations with an x and an x², e.g. 3x² + 2x – 4 = 0. ‘Quadratics have become so crucial to the understanding of the world, that it is no exaggeration to say that they underpin modern science’ (p.200).
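As a rough sketch of what solving one of these involves, the standard quadratic formula can be applied to the example above. The function name `solve_quadratic` is an invention for illustration, and the code assumes real coefficients with a non-zero x² term.

```python
import math


def solve_quadratic(a: float, b: float, c: float) -> list[float]:
    """Return the real roots of a*x**2 + b*x + c = 0 (assumes a != 0)."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []  # no real roots
    root = math.sqrt(discriminant)
    # A set removes the duplicate when the discriminant is exactly zero
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


roots = solve_quadratic(3, 2, -4)
print(roots)  # two real roots, roughly -1.535 and 0.869
```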

Chapter Six – Playtime

Number games. The origin of Sudoku, which is Japanese for ‘the number must appear only once’. There are some 5 billion ways for numbers to be arranged in a table of nine cells so that the sum of any row or column is the same.

There have, apparently, only been four international puzzle crazes with a mathematical slant – the tangram, the Fifteen puzzle, Rubik’s cube and Sudoku – and Bellos describes the origin and nature and solutions to all four. More than 300 million cubes have been sold since Ernö Rubik came up with the idea in 1974. Bellos gives us the latest records set in the hyper-competitive sport of speedcubing: the current record for restoring a completely scrambled cube to order (i.e. all the faces of one colour) is 7.08 seconds, a record held by Erik Akkersdijk, a 19-year-old Dutch student.

A visit to the annual Gathering for Gardner, honouring Martin Gardner, one of the greatest popularisers of mathematical games and puzzles who Bellos visits. The origin of the ambigram, and the computer game Tetris.

Chapter Seven – Secrets of Succession

The joy of sequences. Prime numbers.

The fundamental theorem of arithmetic – Also called the unique factorization theorem or the unique-prime-factorization theorem, this states that every integer greater than 1 either is a prime number itself or can be represented as a product of prime numbers, and that this representation is unique apart from the order of the factors.
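A small sketch of the theorem in action, using plain trial division (the function name `prime_factors` is made up for this illustration, and the method is only practical for smallish numbers):

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (n > 1) in ascending order, with repeats."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:  # strip out every copy of this divisor
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:  # whatever is left over is itself prime
        factors.append(n)
    return factors


print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
print(prime_factors(97))   # [97], since 97 is itself prime
```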

The Goldbach conjecture – one of the oldest and best-known unsolved problems in number theory and all of mathematics. It states that every even integer greater than 2 can be expressed as the sum of two primes. The conjecture has been shown to hold for all integers less than 4 × 10¹⁸, but remains unproven despite considerable effort.
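Checking the conjecture for small even numbers is easy, even though proving it for all of them is not. A minimal sketch (the function names are invented here, and the brute-force search is purely illustrative):

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, fine for small n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def goldbach_pair(even: int):
    """Return one pair of primes that sums to `even`, or None if none is found."""
    for p in range(2, even // 2 + 1):
        if is_prime(p) and is_prime(even - p):
            return p, even - p
    return None


for n in (4, 28, 100):
    print(n, goldbach_pair(n))  # (2, 2), then (5, 23), then (3, 97)
```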

Neil Sloane’s idea of persistence – The number of steps it takes to get to a single digit by multiplying all the digits of the preceding number to obtain a second number, then multiplying all the digits of that number to get a third number, and so on until you get down to a single digit. 88 has a persistence of three.

88 → 8 x 8 = 64 → 6 x 4 = 24 → 2 x 4 = 8
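The same counting can be written as a few lines of Python (a sketch only; the function name `persistence` is chosen here for illustration):

```python
def persistence(n: int) -> int:
    """Count how many digit-multiplication steps it takes to reach a single digit."""
    steps = 0
    while n >= 10:
        product = 1
        for digit in str(n):
            product *= int(digit)
        n = product
        steps += 1
    return steps


print(persistence(88))  # 3, via 88 -> 64 -> 24 -> 8
```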

John Horton Conway’s idea of the powertrain – For any number abcd its powertrain goes to aᵇcᵈ; in the case of numbers with an odd number of digits the final one has no power, so abcde’s powertrain is aᵇcᵈe.
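In other words, pair the digits off from the left, raise the first digit of each pair to the power of the second, and multiply the results together. A minimal sketch (the function name `powertrain` is made up, and it relies on Python's convention that 0 ** 0 is 1):

```python
def powertrain(n: int) -> int:
    """Apply one powertrain step to a non-negative integer."""
    digits = [int(ch) for ch in str(n)]
    result = 1
    # Walk through the digits two at a time: base, exponent
    for i in range(0, len(digits) - 1, 2):
        result *= digits[i] ** digits[i + 1]
    if len(digits) % 2 == 1:  # an unpaired final digit has no power
        result *= digits[-1]
    return result


print(powertrain(12345))  # 1**2 * 3**4 * 5 = 405
print(powertrain(2592))   # 2**5 * 9**2 = 2592, so 2592 maps to itself
```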

The Recamán sequence – Start at 0. At step n, subtract n from the previous term if you can, that is, unless a) it would result in a negative number or b) the number is already in the sequence; otherwise add n. The result is:

0, 1, 3, 6, 2, 7, 13, 20, 12, 21, 11….
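The rule is easier to follow as code. This sketch assumes the usual formulation: start at 0 and, at step n, subtract n if the result is not negative and not already present, otherwise add n. The function name `recaman` is invented for the illustration.

```python
def recaman(count: int) -> list[int]:
    """Return the first `count` terms of the Recamán sequence (count >= 1)."""
    seq = [0]
    seen = {0}
    for n in range(1, count):
        candidate = seq[-1] - n          # try to subtract first
        if candidate < 0 or candidate in seen:
            candidate = seq[-1] + n      # otherwise add
        seq.append(candidate)
        seen.add(candidate)
    return seq


print(recaman(11))  # [0, 1, 3, 6, 2, 7, 13, 20, 12, 21, 11]
```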

Gijswijt’s sequence – A self-describing sequence where each term counts the maximum number of repeated blocks of numbers in the sequence immediately preceding that term.

1, 1, 2, 1, 1, 2, 2, 2, 3, 1, 1, 2, 1, 1, 2, 2, 2, 3, 2, 1, …
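A brute-force sketch of that definition, assuming the standard reading: the first term is 1, and each later term is the largest k such that the sequence so far ends in k back-to-back copies of some block. The function name `gijswijt` is made up, and this is far from an efficient method.

```python
def gijswijt(count: int) -> list[int]:
    """Return the first `count` terms of Gijswijt's sequence by brute force."""
    seq = []
    for _ in range(count):
        n = len(seq)
        best = 1  # any non-empty sequence trivially ends in one copy of a block
        for block in range(1, n // 2 + 1):
            tail = seq[n - block:]
            k = 1
            # Count how many copies of `tail` sit back to back at the end
            while (k + 1) * block <= n and seq[n - (k + 1) * block:n - k * block] == tail:
                k += 1
            best = max(best, k)
        seq.append(best)
    return seq


print(gijswijt(20))
```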

Perfect number – A perfect number is any number that is equal to the sum of its factors (not counting the number itself). Take 6: its factors (the numbers which divide into it) are 1, 2 and 3, which also add up to 6. The next perfect number is 28 because its factors – 1, 2, 4, 7, 14 – add up to 28. And so on.

Amicable numbers – Two numbers are amicable if the sum of the factors of the first number equals the second number, and the sum of the factors of the second number equals the first. The factors of 220 are 1, 2, 4, 5, 10, 11, 20, 22, 44, 55 and 110. Added together these make 284. The factors of 284 are 1, 2, 4, 71 and 142. Added together they make 220!

Sociable numbers – In 1918 Paul Poulet invented the term sociable numbers. ‘The members of aliquot cycles of length greater than 2 are often called sociable numbers. The smallest two such cycles have length 5 and 28’
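Perfect, amicable and sociable numbers are all the same game played with cycles of different lengths: keep replacing a number with the sum of its factors (excluding itself) and see whether you come back to where you started. A sketch, with invented function names and an arbitrary `max_steps` cut-off:

```python
def sum_proper_divisors(n: int) -> int:
    """Sum of the divisors of n, excluding n itself."""
    if n < 2:
        return 0
    total = 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
        d += 1
    return total


def aliquot_cycle(start: int, max_steps: int = 50) -> list[int]:
    """Return the cycle through `start`, or [] if the chain does not return to it."""
    chain = [start]
    current = sum_proper_divisors(start)
    while current != start and len(chain) < max_steps:
        chain.append(current)
        current = sum_proper_divisors(current)
    return chain if current == start else []


print(aliquot_cycle(6))      # [6]: a perfect number is a cycle of length 1
print(aliquot_cycle(220))    # [220, 284]: an amicable pair
print(aliquot_cycle(12496))  # a sociable cycle of length 5
```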

Mersenne primes – A Mersenne prime is a prime number that is one less than a power of two. That is, it is a prime number of the form Mₙ = 2ⁿ − 1 for some integer n. The exponents n which give Mersenne primes are 2, 3, 5, 7, 13, 17, 19, 31, … and the resulting Mersenne primes are 3, 7, 31, 127, 8191, 131071, 524287, 2147483647, …
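The first few can be found by brute force: raise 2 to each exponent, subtract 1, and test whether the result is prime. This is a sketch using simple trial division (the function names are made up), which is fine up to an exponent of 31 but hopeless for the enormous Mersenne primes found by modern searches.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def mersenne_exponents(limit: int) -> list[int]:
    """Exponents n <= limit for which 2**n - 1 is prime."""
    return [n for n in range(2, limit + 1) if is_prime(2 ** n - 1)]


exponents = mersenne_exponents(31)
print(exponents)                        # [2, 3, 5, 7, 13, 17, 19, 31]
print([2 ** n - 1 for n in exponents])  # 3, 7, 31, 127, ...
```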

These and every other sequence ever created by humankind are documented on The On-Line Encyclopedia of Integer Sequences (OEIS), also cited simply as Sloane’s. This is an online database of integer sequences, created and maintained by Neil Sloane while a researcher at AT&T Labs.

Chapter Eight – Gold Finger

The golden section – A number found by dividing a line into two parts so that the longer part divided by the smaller part is also equal to the whole length divided by the longer part.

Phi – The number is often symbolized using phi, after the 21st letter of the Greek alphabet. In equation form:

a/b = (a+b)/a = 1.6180339887498948482…

As with pi (the ratio of the circumference of a circle to its diameter), the digits go on and on, theoretically into infinity. Phi is usually rounded off to 1.618.

The Fibonacci sequence – Each number in the sequence is the sum of the two numbers that precede it. So the sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, and so on. The mathematical equation describing it is Xₙ₊₂ = Xₙ₊₁ + Xₙ.

The sequence turns up as the basis of seeds in flowerheads, the arrangement of leaves round a stem, the design of the nautilus shell and much more.
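The reason the Fibonacci sequence sits in a chapter about the golden section is a well-known fact which the summary above doesn't spell out: the ratio of each Fibonacci number to the one before it gets closer and closer to phi. A quick sketch (the function name `fibonacci` is made up, and phi is computed from the standard closed form (1 + √5) / 2):

```python
def fibonacci(count: int) -> list[int]:
    """Return the first `count` Fibonacci numbers, starting 0, 1."""
    seq = [0, 1]
    while len(seq) < count:
        seq.append(seq[-1] + seq[-2])  # each term is the sum of the previous two
    return seq[:count]


fib = fibonacci(40)
phi = (1 + 5 ** 0.5) / 2

print(fib[:10])          # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(fib[-1] / fib[-2]) # ratio of the last two terms
print(phi)               # the two agree to many decimal places
```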

Chapter Nine – Chance Is A Fine Thing

A chapter about probability and gambling.

Impossibility has a value 0, certainty a value 1, everything else is in between. Probabilities can be expressed as fractions e.g. 1/6 chance of rolling a 6 on a die, or as percentages, 16.6%, or as decimals, 0.16…

The probability of something not happening is 1 minus the probability of that thing happening.

Probability was defined and given mathematical form in the 17th century. One contribution was the questions the Chevalier de Méré asked the mathematical prodigy Blaise Pascal. Pascal corresponded with his friend, Pierre de Fermat, and they worked out the bases of probability theory.

Expected value is what you can expect to get out of a bet. Bellos takes us on a tour of the usual suspects – rolling dice, tossing coins, and roulette (invented in France).

Payback percentage – If you bet £10 at craps, you can expect – over time – to receive an average of about £9.86 back. In other words craps has a payback percentage of 98.6 percent. European roulette has a payback percentage of 97.3 percent. American roulette, 94.7 percent. In other words, gambling is a fancy way of giving your money away. A miserly slot machine has a payback percentage of 85%. The National Lottery has a payback percentage of 50%.
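The two roulette figures can be reproduced from first principles. The sketch below rests on assumptions not stated above: a European wheel has 37 pockets (one zero), an American wheel has 38 (two zeros), and a bet on a single number pays 35 to 1. The function name `payback_percentage` is made up.

```python
from fractions import Fraction


def payback_percentage(pockets: int, payout_to_one: int = 35) -> float:
    """Expected return, as a percentage of the stake, on a single-number bet."""
    win_probability = Fraction(1, pockets)
    # A win returns the stake plus `payout_to_one` times the stake; a loss returns nothing
    expected_return = win_probability * (payout_to_one + 1)
    return float(expected_return * 100)


print(round(payback_percentage(37), 1))  # European wheel: 97.3
print(round(payback_percentage(38), 1))  # American wheel: 94.7
```

The extra zero on the American wheel is the whole difference: the payout stays the same while the chance of winning drops.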

The law of large numbers – The more you play a game of chance, the more likely the results will approach the statistical probability. Toss a coin three times, you might get three heads. Toss a coin a thousand times, the chances are you will get very close to the statistical probability of 50% heads.
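A simulation makes the point better than any formula. This sketch uses Python's pseudo-random number generator with a fixed seed so that runs are repeatable; the function name `fraction_of_heads` and the choice of seed are arbitrary.

```python
import random


def fraction_of_heads(tosses: int, seed: int = 0) -> float:
    """Simulate `tosses` fair coin flips and return the fraction that land heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses


# Small samples can stray a long way from 0.5; large samples rarely do
for n in (10, 1_000, 100_000):
    print(n, fraction_of_heads(n))
```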

The law of very large numbers – With a large enough sample, outrageous coincidences become likely.

The gambler’s fallacy – The mistaken belief that, if something happens more frequently than normal during a given period, it will happen less frequently in the future (or vice versa). In other words, that a random process becomes less random, and more predictable, the more it is repeated.

The birthday paradox – This concerns the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367 (since there are only 366 possible birthdays, including February 29). However, 99.9% probability is reached with just 70 people, and 50% probability with 23 people. (These conclusions are based on the assumption that each day of the year (excluding February 29) is equally probable for a birthday.) In other words you only need a group of 23 people to have an evens chance that two of them share a birthday.
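The usual way to calculate this is to work out the chance that everybody has a different birthday and subtract it from 1. A sketch under the same assumption of 365 equally likely birthdays (the function name `shared_birthday_probability` is made up):

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    if people > days:
        return 1.0  # pigeonhole principle: a clash is certain
    p_all_different = 1.0
    for i in range(people):
        # Person i must avoid the i birthdays already taken
        p_all_different *= (days - i) / days
    return 1.0 - p_all_different


for n in (22, 23, 70):
    print(n, shared_birthday_probability(n))
```

With 22 people the result is just under a half, with 23 it is just over, and with 70 it is above 99.9%.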

The drunkard’s walk

The difficulty of attaining true randomness and the human addiction to finding meaning in anything.

The distinction between playing strategy (best strategy to win a game) and betting strategy (best strategy to maximise your winnings), not always the same.

Chapter Ten – Situation Normal

Carl Friedrich Gauss, the bell curve, normal distribution aka Gaussian distribution. Normal or Gaussian distribution results in a bell curve. Bellos describes the invention and refinement of the bell curve (he explains that ‘the long tail’ results from a mathematician who envisioned a thin bell curve as looking like two kangaroos facing each other with their long tails heading off in opposite directions). And why

Regression to the mean – if the outcome of an event is determined at least in part by random factors, then an extreme event will probably be followed by one that is less extreme. And recent devastating analyses which show how startlingly random sports achievements are, from leading baseball hitters to Simon Kuper and Stefan Szymanski’s analysis of the form of the England soccer team.

Chapter Eleven – The End of the Line

Two breakthroughs which paved the way for modern, i.e. 20th century, maths: the invention of non-Euclidean geometry, specifically the concept of hyperbolic geometry. To picture this, draw a triangle on a Pringle. It is recognisably a triangle but its angles do not add up to 180°, therefore it defies, escapes, eludes all the rules of Euclidean geometry, which were designed for flat 2D surfaces.

Bellos introduces us to Daina Taimina, a maths prof at Cornell University, who invented a way of crocheting hyperbolic surfaces. The result looks curly, like curly kale or the surface of coral.

Anyway, the breakaway from flat 2-D Euclidean space led to theories about curved geometry, either convex like a sphere, or hyperbolic like the pringle. It was this notion of curved space, which paved the way for Einstein’s breakthrough ideas in the early 20th century.

The second big breakthrough was Georg Cantor’s discovery that you can have many different types of infinity. Until Cantor the mathematical tradition from the ancient Greeks to Galileo and Newton had fought shy of infinity which threatened to disrupt so many formulae.

Cantor’s breakthrough was to stop thinking about numbers, and instead think of sets. This is demonstrated through the paradoxes of Hilbert’s Hotel. You need to buckle your safety belt to understand it.

Thoughts

This is easily the best book about maths I’ve ever read. It gives you a panoramic history of the subject which starts with innumerate cavemen and takes us to the edge of Einstein’s great discoveries. But Bellos adds to it all kinds of levels and abilities.

He is engaging and candid and funny. He is fantastically authoritative, taking us gently into forests of daunting mathematical theory without placing a foot wrong. He’s a great explainer. He knows a good story when he sees one, and how to tell it engagingly. And in every chapter there is a ‘human angle’ as he describes his own personal meetings and interviews with many of the (living) key players in the world of contemporary maths, games and puzzles.

Like the Ian Stewart book but on a vastly bigger scale, Bellos makes you feel what it is like to be a mathematician, not just interested in nature’s patterns (the basis of Stewart’s book, Nature’s Numbers) but in the beauty of mathematical theories and discoveries for their own sakes. (This comes over very strongly in chapter seven with its description of some of the weirdest and wackiest number sequences dreamed up by the human mind.) I’ve often read scientists describing the beauty of mathematical theories, but Bellos’s book really helps you develop a feel for this kind of beauty.

For me, I think three broad conclusions emerged:

1. Most mathematicians are in it for the fun. Setting yourself, and solving, mathematical puzzles is obviously extremely rewarding. Maths includes the vast territory of puzzles and games, such as the Sudoku and so on he describes in chapter six. Obviously it has all sorts of real-world application in physics, engineering and so on, but Bellos’s book really brings over that a true understanding of maths begins in puzzles, games and patterns, and often remains there for a lifetime. Like everything else maths is now highly professionalised, the property of tenured professors in universities; and yet even to this day – as throughout its history – contributions can be made by enthusiastic amateurs.

2. As he points out repeatedly, many insights which started out as the hobby horses of obsessives, or arcane breakthroughs on the borders of our understanding, and which have been airily dismissed by the professionals, often end up being useful, having applications no-one dreamed of. Either they help unravel aspects of the physical universe undreamed of when they were discovered, or have been useful to human artificers. Thus the development of random number sequences seemed utterly pointless in the 19th century, but now underlies much internet security.

On a profounder note, Bellos expresses the eerie, mystical sense many mathematicians have that it seems so strange, so pregnant with meaning, that so many of these arcane numbers end up explaining aspects of the world their inventors knew nothing of. Ian Stewart has an admirably pragmatic explanation for this: he speculates that nature uses everything it can find in order to build efficient life forms. Or, to be less teleological, over the past 3 and a half billion years, every combination of useful patterns has been tried out. Given this length of time, and the incalculable variety of life forms which have evolved on this planet, it would be strange if every number system conceivable by one of those life forms – humankind – had not been tried out at one time or another.

3. My third conclusion is that, despite John Allen Paulos’s and Bellos’s insistence, I do not live in a world ever-more bombarded by maths. I don’t gamble on anything, and I don’t follow sports – the two biggest popular areas where maths is important – and the third is the twin areas of surveys and opinion polls (55% of Americans believe in alien abductions etc etc) and the daily blizzard of reports (for example, I see in today’s paper that the ‘Number of primary school children at referral units soars’).

I register their existence but they don’t impact on me for the simple reason that I don’t believe any of them. In 1992 every opinion poll said John Major would lose the general election, but he won with a thumping majority. Since then I haven’t believed any poll about anything. For example almost all the opinion polls predicted a win for Remain in the Brexit vote. Why does any sane person believe opinion polls?

And ‘new and shocking’ reports come out at the rate of a dozen a day and, on closer examination, lots of them turn out to be recycled information, or much much more mundane releases of data sets from which journalists are paid to draw the most shocking and extreme conclusions. Some may be of fleeting interest but once you really grasp that the people reporting them to you are paid to exaggerate and horrify, you soon learn to ignore them.

If you reject or ignore these areas – sport, gambling and the news (made up of rehashed opinion polls, surveys and reports) – then unless you’re in a profession which actively requires the sophisticated manipulation of figures, I’d speculate that most of the rest of us barely come into contact with numbers from one day to the next.

I think that’s the answer to Paulos and Bellos when they are in their ‘why aren’t more people mathematically numerate?’ mode. It’s because maths is difficult, and counter-intuitive, and hard to understand and follow, it is a lot of work, it does make your head ache. Even trying to solve a simple binomial equation hurt my brain.

But I think the biggest reason that ‘we’ are so innumerate is simply that – beautiful, elegant, satisfying and thought-provoking though maths may be to the professionals – maths is more or less irrelevant to most of our day to day lives, most of the time.


Innumeracy by John Allen Paulos (1988)

Our innate desire for meaning and pattern can lead us astray… (p.81)

Giving due weight to the fortuitous nature of the world is, I think, a mark of maturity and balance. (p.133)

John Allen Paulos is an American professor of mathematics who won fame beyond his academic milieu with the publication of this short (134-page) but devastating book thirty years ago, the first of a series of books popularising mathematics in a range of spheres from playing the stock market to humour.

As Paulos explains in the introduction, the world is full of humanities graduates who blow a fuse if you misuse ‘infer’ and ‘imply’, or end a sentence with a dangling participle, but are quite happy to believe and repeat the most hair-raising errors in maths, statistics and probability.

The aim of this book was:

  • to lay out examples of classic maths howlers and correct them
  • to teach readers to be more alert when maths, stats and data need to be used
  • and to provide basic rules in order to understand when innumerate journalists, politicians, tax advisors and other crooks are trying to pull the wool over your eyes, or are just plain wrong

There are five chapters:

  1. Examples and principles
  2. Probability and coincidence
  3. Pseudoscience
  4. Whence innumeracy
  5. Statistics, trade-offs and society

Many common themes emerge:

Don’t personalise, numeratise

One contention of this book is that innumerate people characteristically have a strong tendency to personalise – to be misled by their own experiences, or by the media’s focus on individuals and drama… (p.1)

Powers

The first chapter uses lots of staggering statistics to get the reader used to very big and very small numbers, and how to compute them.

1 million seconds is 11 and a half days. 1 billion seconds is 32 years.

He suggests you come up with personal examples of numbers for each power up to 12 or 13 i.e. meaningful embodiments of thousands, tens of thousands, hundreds of thousands and so on to help you remember and contextualise them in a hurry.

A snail moves at 0.005 miles an hour, Concorde at 2,000 miles per hour. Escape velocity from earth is about 7 miles per second, or 25,000 miles per hour. The mass of the Earth is 5.98 x 10²⁴ kg.

Early on he tells us to get used to the nomenclature of ‘powers’ – using 10 to the power 3 or 10³ instead of 1,000, or 10 to negative powers to express numbers below 1. (In fact, right at this early stage I found myself stumbling because one thousand means more to me than 10³ and a thousandth means more than 10⁻³ but if you keep at it, it is a trick you can acquire quite quickly.)

The multiplication principle

He introduces us to basic ideas like the multiplication principle (aka the rule of product), which states that if some choice can be made in M different ways and some subsequent choice can be made in N different ways, then there are M x N different ways these choices can be made in succession – which can be applied to combinations of multiple items of clothes, combinations of dishes on a menu, and so on.

Thus the number of results you get from rolling a die is 6. If you roll two dice, you can now get 6 x 6 = 36 possible outcomes. Three dice = 216. If you want to exclude the number you get on the first die from the second one, the number of ways of rolling two different numbers on two dice is 6 x 5, of rolling different numbers on three dice is 6 x 5 x 4, and so on.

Thus: Baskin Robbins advertises 31 different flavours of ice cream. Say you want a triple scoop cone. If you’re happy to have any combination of flavours, including where any 2 or 3 flavours are the same – that’s 31 x 31 x 31 = 29,791. But if you ask how many combinations of flavours there are, without a repetition of the same flavour in any of the cones – that is 31 x 30 x 29 = 26,970 ways of combining.
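Both ice cream counts, and the dice counts before them, come from multiplying the number of options available at each step. A short sketch using Python's standard library to confirm the arithmetic (the variable names are made up):

```python
import math
from itertools import permutations, product

flavours = 31

# Any flavour allowed for each of the three scoops: 31 x 31 x 31
with_repeats = flavours ** 3
# No flavour used twice: 31 x 30 x 29
without_repeats = math.perm(flavours, 3)
print(with_repeats, without_repeats)  # 29791 26970

# The same rule checked by brute force on two dice
faces = range(1, 7)
print(len(list(product(faces, repeat=2))))  # 36 ordered outcomes
print(len(list(permutations(faces, 2))))    # 30 outcomes with two different numbers
```

Note that both counts treat the order of the scoops as mattering, so vanilla on top of chocolate is counted separately from chocolate on top of vanilla.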

Probability

I struggled with even the basics of probability. I understand a 1 in five chance of something happening, reasonably understand a 20% chance of something happening, but struggled when probability was expressed as a decimal number e.g. 0.2 as a way of writing a 20 percent or 1 in 5 chance.

With the result that he lost me on page 16 on or about the place where he explained the following example.

Apparently a noted 17th century gambler asked the famous mathematician Pascal which is more likely to occur: obtaining at least one 6 in four rolls of a single die, or obtaining at least one 12 in twenty four rolls of a pair of dice. Here’s the solution:

Since 5/6 is the probability of not rolling a 6 on a single roll of a die, (5/6)⁴ is the probability of not rolling a 6 in four rolls of the die. Subtracting this number from 1 gives us the probability that this latter event (no 6s) doesn’t occur; in other words, of there being at least one 6 rolled in four tries: 1 – (5/6)⁴ = .52. Likewise, the probability of rolling at least one 12 in twenty-four rolls of a pair of dice is seen to be 1 – (35/36)²⁴ = .49.

a) He loses me in the second sentence which I’ve read half a dozen times and still don’t understand – it’s where he says the chances that this latter event doesn’t occur: something about the phrasing there, about the double negative, loses me completely, with the result that b) I have no idea whether .52 is more likely or less likely than .49.
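For what it's worth, the two figures can be checked in a couple of lines. The recipe in both cases is: work out the chance of the thing never happening, then subtract that from 1. The variable names below are made up.

```python
# Chance of no 6 in four rolls is (5/6)**4, so the chance of at least one 6 is:
p_six_in_four = 1 - (5 / 6) ** 4

# Chance of no double-six in 24 rolls of two dice is (35/36)**24, so:
p_double_six_in_24 = 1 - (35 / 36) ** 24

print(round(p_six_in_four, 4), round(p_double_six_in_24, 4))  # 0.5177 0.4914
```

A probability of .52 is a slightly better than evens chance and .49 a slightly worse one, so at least one 6 in four rolls is the more likely of the two events.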

He goes on to give another example: if 20% of drinks dispensed by a vending machine overflow their cups, what is the probability that exactly three of the next ten will overflow?

The probability that the first three drinks overflow and the next seven do not is (.2)³ x (.8)⁷. But there are many different ways for exactly three of the ten cups to overflow, each way having probability (.2)³ x (.8)⁷. It may be that only the last three cups overflow, or only the fourth, fifth and ninth cups, and so on. Thus, since there are altogether (10 x 9 x 8) / (3 x 2 x 1) = 120 ways for us to pick three out of the ten cups, the probability of some collection of exactly three cups overflowing is 120 x (.2)³ x (.8)⁷.

I didn’t understand the need for the (10 x 9 x 8) / (3 x 2 x 1) equation – I didn’t understand what it was doing, and so didn’t understand what it was measuring, and so didn’t understand the final equation. I didn’t really have a clue what was going on.
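The (10 x 9 x 8) / (3 x 2 x 1) part counts how many different sets of three cups can be chosen from ten. Python's `math.comb` does the same count, so the whole calculation can be sketched as below (the function name `binomial_probability` is made up):

```python
import math


def binomial_probability(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n independent tries, each with chance p."""
    ways = math.comb(n, k)  # number of ways to choose which k tries succeed
    return ways * p ** k * (1 - p) ** (n - k)


print(math.comb(10, 3))                            # 120
print(round(binomial_probability(10, 3, 0.2), 4))  # 0.2013
```

So there is roughly a 20% chance that exactly three of the next ten drinks overflow.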

In fact, by page 20, he’d done such a good job of bamboozling me with examples like this that I sadly concluded that I must be innumerate.

More than that, I appear to have ‘maths anxiety’ because I began to feel physically unwell as I read that problem paragraph again and again and again and didn’t understand it. I began to feel a tightening of my chest and a choking sensation in my throat. Rereading it now is making it feel like someone is trying to strangle me.

Maybe people don’t like maths because being forced to confront something you don’t understand, but which everyone around you is saying is easy-peasy, makes you feel ill.

2. Probability and coincidence

Having more or less given up on trying to understand Paulos’s maths demonstrations in the first twenty pages, I can at least latch on to his verbal explanations of what he’s driving at, in sentences like these:

A tendency to drastically underestimate the frequency of coincidences is a prime characteristic of innumerates, who generally accord great significance to correspondences of all sorts while attributing too little significance to quite conclusive but less flashy statistical evidence. (p.22)

It would be very unlikely for unlikely events not to occur. (p.24)

There is a strong general tendency to filter out the bad and the failed and to focus on the good and the successful. (p.29)

Belief in the… significance of coincidences is a psychological remnant of our past. It constitutes a kind of psychological illusion to which innumerate people are particularly prone. (p.82)

Slot machines light up and make a racket when people win, there is unnoticed silence for all the failures. Big winners on the lottery are widely publicised, whereas every one of the tens of millions of failures is not.

One result is ‘Golden Age’ thinking when people denigrate today’s sports or arts or political figures, by comparison with one or two super-notable figures from the vast past, Churchill or Shakespeare or Michelangelo, obviously neglecting the fact that there were millions of also-rans and losers in their time as well as ours.

The Expected value of a quantity is the average of its values weighted according to their probabilities. I understood these words but I didn’t understand any of the five examples he gave.

The likelihood of probability – In many situations, improbability is to be expected. The probability of being dealt a particular hand of 13 cards in bridge is less than 1 in 600 billion. And yet it happens every time someone is dealt a hand in bridge. The improbable can happen. In fact it happens all the time.
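The bridge figure is simply the number of ways of choosing 13 cards from a pack of 52, which can be checked in a line (a sketch; the variable name is made up):

```python
import math

# Number of distinct 13-card hands that can be dealt from a 52-card pack
possible_hands = math.comb(52, 13)
print(possible_hands)  # 635013559600, i.e. a little over 600 billion
```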

The gambler’s fallacy – The belief that, because a tossed coin has come up tails for a number of tosses in a row, it becomes steadily more likely that the next toss will be a head.

3. Pseudoscience

Paulos rips into Freudianism and Marxism for the way they can explain away any result counter to their ‘theories’. The patient gets better due to therapy: therapy works. The patient doesn’t get better during therapy, well the patient was resisting, projecting their neuroses on the therapist, any of hundreds of excuses.

But this is just warming up before he rips into a real bugbear of his, the wrong-headedness of Parapsychology, the Paranormal, Predictive dreams, Astrology, UFOs, Pseudoscience and so on.

As with predictive dreams, winning the lottery or miracle cures, many of these practices continue to flourish because it’s the handful of successes which stand out and grab our attention and not the thousands of negatives.

Probability

As Paulos steams on with examples from tossing coins, rolling dice, playing roulette, or poker, or blackjack, I realise all of them are to do with probability or conditional probability, none of which I understand.

This is why I have never gambled on anything, and can’t play poker. When he explains precisely how accumulating probabilities can help you win at blackjack in a casino, I switch off. I’ve never been to a casino. I don’t play blackjack. I have no intention of ever playing blackjack.

When he says that probability theory began with gambling problems in the seventeenth century, I think, well since I don’t gamble at all, on anything, maybe that’s why so much of this book is gibberish to me.

Medical testing and screening

Apart from gambling the two most ‘real world’ areas where probability is important appear to be medicine and risk and safety assessment. Here’s an extended example he gives of how even doctors make mistakes in the odds.

Assume there is a test for cancer which is 98% accurate i.e. if someone has cancer, the test will be positive 98 percent of the time, and if one doesn’t have it, the test will be negative 98 percent of the time. Assume further that .5 percent – one out of two hundred people – actually have cancer. Now imagine that you’ve taken the test and that your doctor sombrely informs you that you have tested positive. How depressed should you be? The surprising answer is that you should be cautiously optimistic. To find out why, let’s look at the conditional probability of your having cancer, given that you’ve tested positive.

Imagine that 10,000 tests for cancer are administered. Of these, how many are positive? On the average, 50 of these 10,000 people (.5 percent of 10,000) will have cancer, and, so, since 98 percent of them will test positive, we will have 49 positive tests. Of the 9,950 cancerless people, 2 percent of them will test positive, for a total of 199 positive tests (.02 x 9,950 = 199). Thus, of the total of 248 positive tests (199 + 49 = 248), most (199) are false positives, and so the conditional probability of having cancer given that one tests positive is only 49/248, or about 20 percent! (p.64)

I struggled to understand this explanation. I read it four or five times, controlling my sense of panic, and did, eventually, I think, follow the argument.
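For anyone in the same position, it can help to see the counting in the quoted example laid out one step at a time. What follows is a minimal Python sketch, not anything from the book: the function and variable names are illustrative, and it assumes, as the quote does, that the 98% figure applies both to people who have cancer and to people who don't.

```python
def positive_predictive_value(prevalence, sensitivity, specificity, population=10_000):
    """Return (true_positives, false_positives, P(disease | positive test))."""
    sick = prevalence * population
    healthy = population - sick
    true_positives = sensitivity * sick            # sick people who test positive
    false_positives = (1 - specificity) * healthy  # healthy people who test positive
    ppv = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, ppv

# Figures from the quoted passage: 0.5% prevalence, test 98% accurate both ways.
tp, fp, ppv = positive_predictive_value(prevalence=0.005, sensitivity=0.98, specificity=0.98)
print(f"true positives: {tp:.0f}, false positives: {fp:.0f}")
print(f"P(cancer | positive) = {ppv:.3f}")
```

The point the sketch makes visible is that the 199 false positives come from the very large healthy group, which is why they swamp the 49 true positives.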

However, worse in a way, when I think I did finally understand it, I realised I just didn’t care. It’s not just that the examples he gives are hard to follow. It’s that they’re hard to care about.

Whereas his descriptions of human psychology and cognitive errors in human thinking are crystal clear and easy to assimilate:

If we have no direct evidence or theoretical support for a story, we find that detail and vividness vary inversely with likelihood; the more vivid details there are to a story, the less likely the story is to be true. (p.84)

4. Whence innumeracy?

It came as a vast relief when Paulos stopped trying to explain probability and switched to a long chapter puzzling over why innumeracy is so widespread in society, which kicks off by criticising the poor level of teaching of maths in school and university.

This was like the kind of hand-wringing article you can read any day of the week in a newspaper or online, and so felt reassuringly familiar and easy to assimilate. I stopped feeling so panic-stricken.

This puzzling over the disappointing level of innumeracy goes on for quite a while. Eventually it ends with a digression about what appears to be a pet idea of his: the notion of introducing a safety index for activities and illnesses.

Paulos’s suggestion is that his safety index would be on a logarithmic scale, like the Richter Scale – so straightaway he has to explain what a logarithm is: The logarithm for 100 is 2 because 100 is 10², the logarithm for 1,000 is 3 because 1,000 is 10³. I’m with him so far, as he goes on to explain that the logarithm of 700 i.e. between 2 (100) and 3 (1,000) is 2.8. Since 1 in 5,300 Americans die in a car crash each year, the safety index for driving would be 3.7, the logarithm of 5,300. And so on with numerous more examples, whose relative risks or dangers he reduces to figures like 4.3 and 7.1.
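The figures in that paragraph can be reproduced in a couple of lines. This is a rough sketch rather than Paulos's own formulation: the function name `safety_index` is illustrative, and it assumes the index is simply the base-10 logarithm of N when the risk is '1 in N' per year.

```python
import math

def safety_index(one_in_n):
    """Base-10 logarithm of N, where the annual risk is '1 in N'."""
    return math.log10(one_in_n)

# 100 and 1,000 are exact powers of ten; 700 and 5,300 fall in between.
for n in (100, 1_000, 700, 5_300):
    print(f"1 in {n:>5,}: safety index {safety_index(n):.1f}")
```

Rounded to one decimal place, this gives 2.0, 3.0, 2.8 and 3.7, matching the numbers quoted above.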

I did understand his aim and the maths of this. I just thought it was bonkers:

1. What is the point of introducing a universal index which you would have to explain every time anyone wanted to use it? Either it is designed to be usable by the widest possible number of citizens; or it is a neat exercise in maths to please other mathematicians and statisticians.

2. And here’s the bigger objection – What Paulos, like most of the university-educated, white, liberal intellectuals I read in papers, magazines and books, fails to take into account is that a large proportion of the population is thick.

Up to a fifth of the adult population of the UK is functionally innumerate, which means they don’t know what a ‘25% off’ sign means on a shop window. For me an actual social catastrophe being brought about by this attitude is the introduction of Universal Credit by the Conservative government which, from top to bottom, is designed by middle-class, highly educated people who’ve all got internet accounts and countless apps on their smartphones, and who have shown a breath-taking ignorance about what life is like for the poor, sick, disabled, illiterate and innumerate people who are precisely the people the system is targeted at.

Same with Paulos’s scheme. Smoking is one of the most dangerous and stupid things which any human can do. Packs of cigarettes have for years, now, carried pictures of disgusting cancerous growths and the words SMOKING KILLS. And yet despite this, about a fifth of adults, getting on for 10 million people, still smoke. 🙂

Do you really think that introducing a system using ornate logarithms will get people to make rational assessments of the risks of common activities and habits?

Paulos then goes on to complicate the idea by suggesting that, since the media is always more interested in danger than safety, maybe it would be more effective, instead of creating a safety index, to create a danger index.

You would do this by

  1. working out the risk of an activity (i.e. number of deaths or accidents per person doing the activity)
  2. converting that into a logarithmic value (just to make sure that nobody understands it) and then
  3. subtracting the logarithmic value of the safety index from 10, in order to create a danger index
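Taken literally, those three steps amount to very little arithmetic. Here is a minimal sketch under that literal reading; the function name and its parameters are illustrative, not taken from the book, and the only input used is the driving figure of 1 death per 5,300 people quoted earlier.

```python
import math

def danger_index(deaths, people):
    """Follow the three steps: risk, then logarithm, then subtract from 10."""
    one_in_n = people / deaths      # step 1: the risk expressed as '1 in N'
    safety = math.log10(one_in_n)   # step 2: the logarithmic safety index
    return 10 - safety              # step 3: flip it into a danger index

# Driving: 1 death per 5,300 people per year (safety index of about 3.7).
print(round(danger_index(deaths=1, people=5_300), 1))
```

Under this reading a safety index of 3.7 turns into a danger index of about 6.3, so on the danger scale a bigger number means a bigger risk.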

He goes on to say that driving a car and smoking, whose safety indices are 3.7 and 2.9, would have ‘danger indices’ of 6.3 and 7.1 respectively. The trouble was that by this point I had completely ceased to understand what he was saying. I felt like I’d stepped off the edge of a tall building into thin air. I began to have that familiar choking sensation, as if someone was squeezing my chest. Maths anxiety.

Under this system being kidnapped would have a safety index of 6.7. Playing Russian roulette once a year would have a safety index of 0.8.

It is symptomatic of the uselessness of the whole idea that Paulos has to remind you what the values mean (‘Remember that the bigger the number, the smaller the risk.’ Really? You expect people to run with this idea?)

Having completed the danger index idea, Paulos returns to his extended lament on why people don’t like maths. He gives a long list of reasons why he thinks people are so innumerate, a condition which is, for him, a puzzling mystery.

For me this lament is a classic example of what you could call intellectual out-of-touchness. He is genuinely puzzled why so many of his fellow citizens are innumerate, can’t calculate simple odds and fall for all sorts of paranormal, astrology, snake-oil blether.

He proposes typically academic, university-level explanations for this phenomenon – such as that people find maths too cold and analytical and worry that it prevents them thinking about the big philosophical questions in life. He worries that maths has an image problem.

In other words, he fails to consider the much more obvious explanation that maths, probability and numeracy in general might be a combination of fanciful, irrelevant and deeply, deeply boring.

I use the word ‘fanciful’ deliberately. When he writes that the probability of drawing two aces in succession from a pack of cards is not (4/52 x 4/52) but (4/52 x 3/51) I do actually understand the distinction he’s making (having drawn one ace there are only 3 left and only 51 cards left) – I just couldn’t care less. I really couldn’t care less.
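For the record, the two calculations give different answers, which is easy to confirm with exact fractions. A small sketch, with illustrative variable names:

```python
from fractions import Fraction

# Wrong version: treats the second draw as if the first ace had been put back.
with_replacement = Fraction(4, 52) * Fraction(4, 52)
# Right version: after one ace is drawn, 3 aces remain among 51 cards.
without_replacement = Fraction(4, 52) * Fraction(3, 51)

print(with_replacement)     # 1/169
print(without_replacement)  # 1/221
```

So the chance of two aces in a row is 1 in 221, somewhat lower than the 1 in 169 the naive multiplication suggests.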

Or take this paragraph:

Several years ago Pete Rose set a National League record by hitting safely in forty-four consecutive games. If we assume for the sake of simplicity that he batted .300 (30 percent of the time he got a hit, 70 percent of the time he didn’t) and that he came to bat four times a game, the chances of his not getting a hit in any given game were, assuming independence, (.7)⁴ = .24… [at this point Paulos has to explain what ‘independence’ means in a baseball context: I couldn’t care less]… So the probability he would get at least one hit in any game was 1 - .24 = .76. Thus, the chances of him getting a hit in any given sequence of forty-four consecutive games were (.76)⁴⁴ = .0000057, a tiny probability indeed. (p.44)

I did, in fact, understand the maths and the working out in this example. I just don’t care about the problem or the result.
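The working in the quoted passage comes down to three lines of arithmetic. The sketch below reproduces it under the same simplifying assumptions the quote makes (a .300 average, four at-bats a game, every at-bat independent); the function name and parameter names are illustrative.

```python
def p_hit_in_every_game(batting_average=0.300, at_bats=4, games=44):
    """Chance of at least one hit in each of `games` games, assuming independent at-bats."""
    p_hitless_game = (1 - batting_average) ** at_bats  # (.7)^4, about .24
    p_hit_in_game = 1 - p_hitless_game                 # about .76
    return p_hit_in_game ** games                      # about (.76)^44

print(f"{p_hit_in_every_game():.7f}")
```

Without the intermediate rounding to .24 and .76 the result still comes out at roughly 0.0000057, so the quoted figure holds up.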

For me this is a – maybe the – major flaw of this book. In the blurbs on the front and back, in the introduction and all the way through the text, Paulos goes on and on about how we as a society need to be mathematically numerate because maths (and particularly probability) impinges on so many areas of our lives.

But when he tries to show this – when he gets the opportunity to show us what all these areas of our lives actually are – he completely fails.

Almost all of the examples in the book are not taken from everyday life, they are remote and abstruse problems of gambling or sports statistics.

  • which is more likely: obtaining at least one 6 in four rolls of a single die, or obtaining at least one 12 in twenty four rolls of a pair of dice?
  • if 20% of drinks dispensed by a vending machine overflow their cups, what is the probability that exactly three of the next ten will overflow?
  • Assume there is a test for cancer which is 98% accurate i.e. if someone has cancer, the test will be positive 98 percent of the time, and if one doesn’t have it, the test will be negative 98 percent of the time. Assume further that .5 percent – one out of two hundred people – actually have cancer. Now imagine that you’ve taken the test and that your doctor sombrely informs you that you have tested positive. How depressed should you be?
  • What are the odds on Pete Rose getting a hit in a sequence of forty-four games?
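For reference, the first two questions in that list have short answers. The sketch below is a standard textbook treatment rather than Paulos's own working: it assumes fair dice, independent rolls and independent drinks, and the variable names are illustrative.

```python
from math import comb

# At least one 6 in four rolls of a single die.
p_six = 1 - (5 / 6) ** 4
# At least one 12 (a double six) in twenty-four rolls of a pair of dice.
p_twelve = 1 - (35 / 36) ** 24
# Exactly three overflows in ten drinks, each overflowing with probability 0.2.
p_three_overflows = comb(10, 3) * 0.2 ** 3 * 0.8 ** 7

print(f"{p_six:.3f} {p_twelve:.3f} {p_three_overflows:.3f}")
```

The single die comes out slightly ahead (a little over a half, against a little under a half for the pair of dice), and the vending machine answer is about one in five.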

Are these the kinds of problems you are going to encounter today? Or tomorrow? Or ever?

No. The longer the book went on, the more I realised just how little a role maths plays in my everyday life. In fact more or less the only role maths plays in my life is looking at the prices in supermarkets, where I am attracted to goods which have a temporary reduction on them. But I do that because their labels are coloured red, not because I calculate the savings. That, and being aware of the time, so I know when to do household chores or be somewhere punctually. Those are the only times I used numbers today.

5. Statistics, trade-offs and society

This feeling that the abstruseness of the examples utterly contradicts the bold claims that reading the book will help us with everyday experiences was confirmed in the final chapter, which begins with the following example.

Imagine four dice, A, B, C and D, strangely numbered as follows: A has 4 on four faces and 0 on two faces; B has 3s on all six faces; C has four faces with 2 and two faces with 6; and D has 5 on three faces and 1 on three faces…

I struggled to the end of this sentence and just thought: ‘No, no more, I don’t have to make myself feel sick and unhappy any more’ – and skipped the couple of pages detailing the fascinating and unexpected results you can get from rolling such a collection of dice.
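For readers who would rather not skip it, the unexpected result can be found by brute force: compare every face of one die with every face of another and count the wins. This is a sketch of that counting, not Paulos's presentation; the helper name `p_beats` is illustrative, and the dice are assumed fair and rolled independently.

```python
from fractions import Fraction
from itertools import product

# The four dice exactly as described in the quoted sentence.
dice = {
    "A": [4, 4, 4, 4, 0, 0],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [2, 2, 2, 2, 6, 6],
    "D": [5, 5, 5, 1, 1, 1],
}

def p_beats(x, y):
    """Probability that die x shows a higher number than die y."""
    wins = sum(1 for a, b in product(dice[x], dice[y]) if a > b)
    return Fraction(wins, 36)  # 6 x 6 equally likely pairs of faces

for x, y in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]:
    print(f"{x} beats {y} with probability {p_beats(x, y)}")
```

Each die beats the next one two times out of three, and D in turn beats A, so there is no 'best' die: whichever one you pick, an opponent can choose another that is favoured against it.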

This chapter goes on to a passage about the Prisoner’s Dilemma, a well-known problem in logic, which I have read about and instantly forgotten scores of times over the years.

Paulos gives us three or four variations on the idea, including:

  • Imagine you are locked up in prison by a philanthropist with 20 other people.

Or:

  • Imagine you are locked in a dungeon by a sadist with 20 other people.

Or:

  • Imagine you are one of two drug traffickers making a quick transaction on a street corner and forced to make a quick decision.

Or:

  • Imagine you are locked in a prison cell, and another prisoner is locked in an identical cell down the corridor.

Well, I’m not any of these things, I’m never likely to be, and I am not really interested in these fanciful speculations.

Moreover, I am well into middle age, have travelled round the world, had all sorts of jobs in companies small, large and enormous – and I am not aware of having ever been in any situation which remotely resembled any variation of the Prisoner’s Dilemma I’ve ever heard of.

In other words, to me, it is another one of the endless pile of games and puzzles which logicians and mathematicians love to spend all day playing but which have absolutely no impact whatsoever on any aspect of my life.

Pretty much all of his examples conclusively prove how remote mathematical problems and probabilistic calculation are from the everyday lives you and I lead. When he asks:

How many people would there have to be in a group in order for the probability to be half that at least two people in it have the same birthday? (p.23)

Imagine a factory which produces small batteries for toys, and assume the factory is run by a sadistic engineer… (p.117)

It dawns on me that my problem might not be that I’m innumerate, so much as I’m just uninterested in trivial or frivolous mental exercises.

Someone offers you a choice of two envelopes and tells you one has twice as much money in it as the other. (p.127)

Flip a coin continuously until a tail appears for the first time. If this doesn’t happen until the twentieth (or later) flip, you win $1 billion. If the first tail occurs before the twentieth flip, you pay $100. Would you play? (p.128)

No. I’d go and read an interesting book.
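For anyone who does want answers to two of the questions quoted above, here is a rough sketch. Both functions and their parameter names are illustrative; the birthday calculation assumes 365 equally likely birthdays, and the coin game assumes a fair coin and treats 'would you play?' purely as a question of expected winnings.

```python
def smallest_group_with_shared_birthday(threshold=0.5, days=365):
    """Smallest group size where P(at least two share a birthday) >= threshold."""
    p_all_different = 1.0
    people = 0
    while 1 - p_all_different < threshold:
        people += 1
        # The newcomer must avoid the birthdays of everyone already in the group.
        p_all_different *= (days - (people - 1)) / days
    return people

def expected_winnings(prize=1_000_000_000, stake=100, flips=20):
    """Expected value of the coin game: win if the first tail is on flip 20 or later."""
    p_win = 0.5 ** (flips - 1)  # the first 19 flips must all be heads
    return p_win * prize - (1 - p_win) * stake

print(smallest_group_with_shared_birthday())
print(round(expected_winnings(), 2))
```

The group needs only 23 people, and the coin game has positive expected winnings of roughly $1,800 a play, even though you would lose $100 almost every time.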

Thoughts

If Innumeracy: Mathematical Illiteracy and Its Consequences is meant to make its readers more numerate, it failed with me.

This is for a number of reasons:

  1. crucially – because he doesn’t explain maths very well; the way he explained probability had lost me by about page 16 – in other words, if this is meant to be a primer for innumerate people it’s a fail
  2. because the longer it went on, the more convinced I became that I rarely use maths, arithmetic and probability in my day-to-day life: whole days go by when I don’t do a single sum, and so I lost all motivation to submit myself to the brain-hurting ordeal of trying to understand his examples
  3. because the structure and presentation of the book are a mess: it meanders through a fog of jokes, anecdotes and maths trivia, baseball stories and gossip about American politicians – before suddenly unleashing a fundamental aspect of probability theory on the unwary reader

I’d have preferred the book to have had a clear, didactic structure, with an introduction and chapter headings explaining just what he was going to do, an explanation, say, of how he was going to take us through some basic concepts of probability one step at a time.

And then for the concepts to have been laid out very clearly and explained very clearly, from a number of angles, giving a variety of different examples until he and we were absolutely confident we’d got it – before we moved on to the next level of complexity.

The book is nothing like this. Instead it sacrifices any attempt at logical sequencing or clarity for anecdotes about Elvis Presley or UFOs, for digressions about Biblical numerology, the silliness of astrology, the long and bewildering digression about introducing a safety index for activities (summarised above), or prolonged analyses of baseball or basketball statistics. Oh, and a steady drizzle of terrible jokes.

Which two sports have face-offs?
Ice hockey and leper boxing.

Half way through the book, Paulos tells us that he struggles to write long texts (‘I have a difficult time writing at extended length about anything’, p.88), and I think it really shows.

It certainly explains why:

  • the blizzard of problems in coin tossing and dice rolling stopped without any warning, as he switched tone completely, giving us first a long chapter about all the crazy irrational beliefs people hold, and then another chapter listing all the reasons why society is innumerate
  • the last ten pages of the book give up the attempt of trying to be a coherent narrative and disintegrate into a bunch of miscellaneous odds and ends he couldn’t find a place for in the main body of the text

Also, I found that the book was not about numeracy in the broadest sense, but mostly about probability. Again and again he reverted to examples of tossing coins and rolling dice. One enduring effect of reading this book is going to be that, the next time I read a description of someone tossing a coin or rolling a die, I’m just going to skip right over the passage, knowing that if I read it I’ll either be bored to death (if I understand it) or have an unpleasant panic attack (if I don’t).

In fact in the coda at the end of the book Paulos explicitly says it has mostly been about probability – God, I wish he’d explained that at the beginning.

Right at the very, very end he briefly lists key aspects of probability theory which he claims to have explained in the book – but he hasn’t: some of them are only briefly referred to with no explanation at all, including statistical tests and confidence intervals, cause and correlation, conditional probability, independence, the multiplication principle, the notion of expected value and of probability distribution.

These are now names I have at least read about, but they are all concepts I am nowhere near understanding, and light years away from being able to use in practical life.

Innumeracy – or illogicality?

Also there was an odd disconnect between the broadly psychological and philosophical prose explanations of what makes people so irrational, and the incredibly narrow scope of the coin-tossing, baseball-scoring examples.

What I’m driving at is that, in the long central chapter on Pseudoscience, when he stopped to explain what makes people so credulous, so gullible, he didn’t really use any mathematical examples to disprove Freudianism or astrology or so on: he had to appeal to broad principles of psychology, such as:

  • people are drawn to notable exceptions, instead of considering the entire field of entities i.e.
  • people filter out the bad and the failed and focus on the good and the successful
  • people seize hold of the first available explanation, instead of considering every single possible permutation
  • people humanise and personalise events (‘bloody weather, bloody buses’)
  • people over-value coincidences

My point is that there is a fundamental conceptual confusion in the book which is revealed in the long chapter about pseudoscience which is that his complaint is not, deep down, right at bottom, that people are innumerate; it is that people are hopelessly irrational and illogical.

Now this subject – the fundamental ways in which people are irrational and illogical – is dealt with much better, at much greater length, in a much more thorough, structured and comprehensible way in Stuart Sutherland’s great book, Irrationality, which I’ll be reviewing and summarising later this week.

Innumeracy amounts to random scratches on the surface of the vast iceberg which is the deep human inability to think logically.

Conclusion

In summary, for me at any rate, this was not a good book – badly structured, meandering in direction, unable to explain even basic concepts but packed with digressions, hobby horses and cul-de-sacs, unsure of its real purpose, stopping for a long rant against pseudosciences and an even longer lament on why maths is taught so badly – it’s a weird curate’s egg of a text.

Its one positive effect was to make me want to track down and read a good book about probability.


