Judgement Under Uncertainty: Heuristics and Biases by Amos Tversky and Daniel Kahneman

This article first appeared in Science, volume 185, in 1974. Tversky and Kahneman had been working for some time on unconscious biases in cognitive thinking and this paper summarises the findings of a number of their experiments. The paper was reprinted as an appendix in Kahneman’s 2011 book, Thinking, Fast and Slow. It is overflowing with ideas and insights about key aspects of how humans think, to be precise:

This article shows that people rely on a limited number of heuristic principles which reduce the complex tasks of assessing probabilities and predicting values to simpler judgmental operations.

The article focuses on three ‘heuristics’ which people use to assess probabilities and predict values and highlights their flaws and limitations. What is a heuristic? An intellectual short cut, a rule of thumb, a quick practical way of solving a problem.

The three heuristics discussed by the article are:

  1. Representativeness
  2. Availability
  3. Adjustment and anchoring

1. Representativeness

People make estimates and judgments of things and other people, based on their similarity to existing stereotypes, to representative types. This is the representativeness heuristic or, as it’s come to be known, the representative bias. In doing so, people tend to completely ignore statistical and probabilistic factors which ought, in more rational thinking, to carry more weight.

T&K gave experimental subjects a description of ‘Steve’, describing him as shy and timid, meek and helpful and interested in order. The subjects were then asked to guess Steve’s profession from a list which included librarian and farmer. Most subjects guessed he was a librarian on the basis of his closeness to a pre-existing stereotype. But, given that there are ten times as many farmers in the U.S. as librarians and in the absence of any definitive evidence, in terms of pure probability, subjects should have realised that Steve is much more likely to be a farmer than a librarian.

In making this mistake, the subjects let the representativeness heuristic overshadow considerations of basic probability theory.
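To see how much the base rate matters, here is a minimal sketch applying Bayes' rule to the Steve example. The ten-to-one ratio of farmers to librarians is the figure quoted above; the two likelihoods (how well the description fits each profession) are numbers I have invented purely for illustration and do not come from the paper.

```python
def posterior_librarian(prior_odds_librarian, p_desc_given_librarian, p_desc_given_farmer):
    """Return P(librarian | description) when only these two professions are considered."""
    likelihood_ratio = p_desc_given_librarian / p_desc_given_farmer
    posterior_odds = prior_odds_librarian * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Base rate from the text: ten farmers for every librarian.
prior_odds = 1 / 10

# Hypothetical likelihoods: the description fits 40% of librarians but only 10% of farmers.
p = posterior_librarian(prior_odds, 0.40, 0.10)
print(f"P(librarian | description) = {p:.2f}")
```

Under these assumed numbers the description is four times as typical of a librarian as of a farmer, yet Steve is still more likely to be a farmer, because the base rate outweighs the stereotype.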

Insensitivity to prior probability of outcomes

The prior probability or base rate frequency describes the likely occurrence of the event being assessed, the likelihood of an event occurring without any other intervention, its basic probability.

T&K told experimental subjects there were ten people in a room, nine men and one woman. Then T&K told the subjects that one of these ten people is caring and sharing, kind and nurturing, and asked the subjects who the description was of. Without any concrete evidence, the chance of it being the woman is the same as it being any of the men i.e. 1 in 10. But the representativeness heuristic overrode an understanding of base rate probability, and most of the subjects confidently said this description must be of the woman. They were overwhelmingly swayed by the description’s conformity to stereotype.

Insensitivity to sample size

People don’t understand the significant difference which sample size makes to any calculation of probability.

Imagine a town has two hospitals, one large, one small. In the large one about 45 babies are born every day, in the small one about 15 babies. Now, the ratio of boy and girl babies born anywhere is usually around 50/50, but on particular days it can vary. Over a year, which hospital do you think had more days on which 60% or more of the babies born were boys?

When students were asked this question, 21 said the large hospital, 21 said the small hospital and 53 said it would be the same at both. The correct answer is the small hospital. Why? Because smaller samples are more likely to be unrepresentative, to have ‘freakish’ aberrations from the norm. T&K conclude that:

This fundamental notion of statistics is evidently not part of people’s repertoire of intuitions.
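The hospital answer can be checked with a short binomial calculation. This is a sketch under two simplifying assumptions of mine: each birth is an independent 50/50 event, and each hospital has exactly 15 or 45 births every day.

```python
from math import comb

def prob_at_least_60_percent_boys(births_per_day):
    """P(at least 60% of the day's babies are boys), assuming independent 50/50 births."""
    # Smallest whole number of boys that reaches 60%, i.e. ceil(0.6 * n), using integers only.
    threshold = (births_per_day * 6 + 9) // 10
    favourable = sum(comb(births_per_day, k) for k in range(threshold, births_per_day + 1))
    return favourable / 2 ** births_per_day

small = prob_at_least_60_percent_boys(15)
large = prob_at_least_60_percent_boys(45)
print(f"small hospital: {small:.3f} of days, roughly {small * 365:.0f} days a year")
print(f"large hospital: {large:.3f} of days, roughly {large * 365:.0f} days a year")
```

On these assumptions the small hospital has such a day more than twice as often as the large one.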

Imagine an urn filled with balls. Two thirds are one colour, a third are another. A subject draws five balls and finds 4 are red and one is white. Another subject draws 20 balls and finds that 12 are red and 8 are white. Which subject should feel more confident that 2/3 of the balls in the urn are red, and why?

Most people think it’s the first subject who should feel more confident. Four to one feels like – and is – a bigger ratio. Big is good. But they’re wrong. The second subject should feel more confident because, although his ratio is smaller – 3 to 2 – his sample size is larger. The larger the sample size, the closer you are likely to get to an accurate picture.
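The urn comparison can be worked through exactly. The sketch below assumes there are only two possibilities (the urn is 2/3 red or 2/3 white), that both are equally likely beforehand, and that each draw is independent; the function name is my own.

```python
from fractions import Fraction

def odds_mostly_red(red_drawn, white_drawn):
    """Posterior odds that the urn is 2/3 red rather than 2/3 white, from even prior odds."""
    two_thirds, one_third = Fraction(2, 3), Fraction(1, 3)
    likelihood_if_mostly_red = two_thirds ** red_drawn * one_third ** white_drawn
    likelihood_if_mostly_white = one_third ** red_drawn * two_thirds ** white_drawn
    return likelihood_if_mostly_red / likelihood_if_mostly_white

first_subject = odds_mostly_red(4, 1)     # 5 balls drawn: 4 red, 1 white
second_subject = odds_mostly_red(12, 8)   # 20 balls drawn: 12 red, 8 white
print(first_subject, second_subject)      # prints: 8 16
```

So the first sample gives odds of 8 to 1 and the second 16 to 1. What counts is the difference between red and white balls drawn (3 versus 4), not the ratio.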

Misconception of chance

Here are three sets of results from tossing a coin six times in a row, where T stands for tails and H stands for heads. Ask a selection of people which of the three sets is the random one.

  1. TTTTTT
  2. TTTHHH
  3. THHTTH

Most people will choose set 3 because it feels random. But, of course, all three are equally likely or unlikely. Tversky and Kahneman speculate that this is because people have in mind a representation of what randomness ought to look like, and let this override their statistical understanding (if they have any) that the total randomness of a system need not be exactly replicated at every level. In other words, a random series of tossing coins might well throw up sequences which appear to have order.
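A quick enumeration makes the point concrete. It lists every possible result of six tosses of a fair coin and then groups them by the number of heads. The grouping is my own gloss on why a mixed sequence looks more typical; it is not an argument taken from the paper.

```python
from collections import Counter
from itertools import product

# All possible results of six tosses: 2 ** 6 = 64 sequences, each equally likely.
sequences = ["".join(s) for s in product("HT", repeat=6)]

# Group the 64 sequences by how many heads they contain.
by_heads = Counter(seq.count("H") for seq in sequences)

print(f"P(any one specific sequence) = 1/{len(sequences)}")
print(f"sequences with 0 heads: {by_heads[0]}, with 3 heads: {by_heads[3]}")
```

Each of the three sets above has exactly a 1 in 64 chance. But only one sequence has no heads at all, whereas twenty have three heads, so a mixed result resembles far more of the possible outcomes.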

The gambler’s fallacy is the mistaken belief that, if you toss a coin repeatedly and get nothing but heads, the probability increases that the next result will be tails, because you expect the series to ‘correct’ itself.

People who fall for this fallacy are using a representation of fairness (just as in the example above they use a representation of chaos) and letting it override what ought to be a basic knowledge of statistics, which is that each coin toss stands on its own and has its own probability i.e. 50/50 or 0.5. Just because someone tosses an increasing number of heads in a row is no reason at all for believing their next toss will be tails.
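A small simulation illustrates the independence of each toss. This is a sketch of my own, not something from the paper: it tosses a simulated fair coin a million times and looks only at the tosses which come straight after five heads in a row.

```python
import random

def heads_rate_after_streak(n_tosses=1_000_000, streak=5, seed=0):
    """Share of heads among tosses that immediately follow `streak` or more heads in a row."""
    rng = random.Random(seed)
    run = 0            # length of the current run of heads
    followers = 0      # tosses that came right after a long enough run of heads
    heads_after = 0    # how many of those tosses were heads
    for _ in range(n_tosses):
        is_heads = rng.random() < 0.5
        if run >= streak:
            followers += 1
            heads_after += is_heads
        run = run + 1 if is_heads else 0
    return heads_after / followers

rate = heads_rate_after_streak()
print(f"share of heads straight after five heads in a row: {rate:.3f}")
```

The share stays close to one half: a run of heads makes the next toss neither more nor less likely to be tails.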

(In reality we all know that sooner or later a tails is likely to appear, and the law of large numbers says that if you repeat a probabilistic event enough times the overall proportion of outcomes is likely to approach the expected average. T&K shed light on the interaction of the gambler’s fallacy and the law of large numbers by clarifying that an unusual run of results is not ‘corrected’ by the coin (which obviously has no memory or intention) – such runs are diluted by a large number of occurrences, they are dissolved in the context of larger and larger samples.)
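The dilution idea comes down to simple arithmetic. The sketch below starts from a hypothetical run of ten heads and asks what share of heads we should expect once more fair tosses are added; the starting run of ten is my own choice of example.

```python
def expected_heads_share(initial_heads, further_tosses):
    """Expected share of heads after a run of `initial_heads` heads followed by fair tosses."""
    expected_heads = initial_heads + further_tosses / 2
    return expected_heads / (initial_heads + further_tosses)

for n in (0, 10, 100, 10_000):
    share = expected_heads_share(10, n)
    print(f"{n:>6} further tosses: expected share of heads = {share:.4f}")
```

The expected surplus of ten heads never goes away. It simply matters less and less as the total number of tosses grows, which is why the share drifts towards one half without any 'correction'.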

Insensitivity to predictability

Subjects were given descriptions of two companies, one described in glowing terms, one in mediocre terms, and then asked about their future profitability. Although neither description mentioned anything about profitability, most subjects were swayed by the representativeness heuristic to predict that the positively described company would have higher profits.

Two groups of subjects were given descriptions of one practice lesson given by several student teachers. One group was asked to rate the teachers’ performances based on this one class, the other group was asked to predict the relative standing of the teachers five years in the future. The ratings of the groups agreed. Despite the wild improbability of being able to predict anything in five years’ time from one provisional piece of evidence, the subjects did just that.

The illusion of validity

People make judgments or predictions based on the degree of representativeness (the quality of the match between the selected outcome and the input) with no regard for probability or all the other factors which limit predictability. The illusion of validity is the profound mental conviction engendered when the ‘input information’ approaches representative models (stereotypes). I.e. if it matches a stereotype, people will believe it.

Misconceptions of regression

Most people a) don’t understand where ‘regression to the mean’ applies and b) fail to recognise it when they see it, preferring to give all sorts of spurious explanations. For example, a sportsman has a great season – the commentators laud him, he wins sportsman of the year – but his next season is lousy. Critics and commentators come up with all kinds of reasons to explain this performance, but the good year might just have been a freak and now he has regressed closer to his average, mean ability.
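Regression to the mean falls out of even a very crude model. In the sketch below a season's score is a player's stable skill plus luck which is redrawn every season. The model, the equal weighting of skill and luck, and the numbers are all my own assumptions for illustration.

```python
import random
from statistics import mean

rng = random.Random(1)
n_players = 10_000

# Hypothetical model: a season's score = stable skill + luck that is redrawn each season.
skill = [rng.gauss(0, 1) for _ in range(n_players)]
season1 = [s + rng.gauss(0, 1) for s in skill]
season2 = [s + rng.gauss(0, 1) for s in skill]

# Pick the top 1% of season one and see how the same players do in season two.
top = sorted(range(n_players), key=lambda i: season1[i], reverse=True)[: n_players // 100]
top_s1 = mean(season1[i] for i in top)
top_s2 = mean(season2[i] for i in top)
print(f"top 1% in season one: average {top_s1:.2f}, next season: {top_s2:.2f}")
```

The stars of season one are still above average the following year, because they really are skilled, but they fall back noticeably because their luck does not repeat. No injury, complacency or 'curse' is needed to explain it.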

2. Availability

Broadly speaking, this means going with the first thing that comes to mind. Like the two other heuristics, the availability heuristic has evolved because, in evolutionary terms, it is quick and useful. It does, however, in our complex industrial societies, lead to all kinds of biases and errors.

Biases due to the retrievability of instances

Experimenters read out a list of men and women to two groups without telling them that the list contained exactly 25 men and 25 women, then asked the groups to guess the ratio of the sexes. If the list included some famous men, the group was influenced to think there were more men, if the list included a sprinkling of famous women, the group thought there were more women than men. Why? Because the famous names carry more weight and literally influence people into thinking there are more of them.

Salience

Seeing a house on fire makes people think about the danger of burning houses. Driving past a motorway accident makes people stop and think and drive more carefully (for a while). Then it wears off.

Biases due to the effectiveness of a search set

Imagine we sample words from a random text. Will there be more words starting with r or with r in the third position? For most people it is easier to call to mind words starting with r, so they think there are more of them, but there aren’t: there are more words in the English language with r in the third position than words which start with r.

Asked to estimate which are more common, abstract words like ‘love’ or concrete words like ‘door’, most subjects guess incorrectly that abstract words are more common. This is because they are more salient – love, fear, hate – and have more power in the mind: they are more available to conscious thought.

Biases of imaginability

Say you’ve got a room of ten people. They have got to be formed into committees. How many different committees can be created of each size between 2 and 8 people? Almost all people presented with this problem estimated there were many more possible committees of 2 than of 8, which is incorrect. There are 45 possible ways to create committees of 2 and 45 ways to create committees of 8, because every choice of 2 members leaves behind a group of 8 non-members. People prioritised 2 because it was easier to quickly begin working out combinations of 2, and then extrapolate this to the whole sample. This bias is very important when it comes to estimating the risk of any action, since we are programmed to call to mind big, striking, easy-to-imagine risks and often overlook hard-to-imagine risks (which is why risk factors should be written down and worked through as logically as possible).
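The committee counts are easy to verify. The sketch below uses Python's standard `math.comb` to count the ways of choosing each size of committee from ten people.

```python
from math import comb

people = 10
for size in range(2, 9):
    print(f"committees of {size}: {comb(people, size)}")
```

The counts are symmetrical (45, 120, 210, 252, 210, 120, 45) and peak at committees of five, which is not at all what intuition suggests.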

Illusory correlation

Subjects were given written profiles of several hypothetical mental patients along with drawings the patients were supposed to have made. When asked to associate the pictures with the diagnoses, subjects came up with all kinds of spurious connections: for example, told that one patient was paranoid and suspicious, many of the subjects read ‘suspiciousness’ into one of the drawings and associated it with that patient, and so on.

But there were no connections. Both profiles and drawings were utterly spurious. But this didn’t stop all the subjects from making complex and plausible networks of connections and correlations.

Psychologists speculate that this tendency to attribute meaning is because we experience some strong correlations, especially early in life, and then project them onto every situation we encounter, regardless of factuality or probability.

It’s worth quoting T&K’s conclusion in full:

Lifelong experience has taught us that, in general, instances of large classes are recalled better and faster than instances of less frequent classes; that likely occurrences are easier to imagine than unlikely ones; and that the associative connections between events are strengthened when the events frequently co-occur. As a result, man has at his disposal a procedure (the availability heuristic) for estimating the numerosity of a class, the likelihood of an event, or the frequency of co-occurrences, by the ease with which the relevant mental operations of retrieval, construction, or association can be performed.

However, as the preceding examples have demonstrated, this valuable estimation procedure results in systematic errors.

3. Adjustment and Anchoring

In making estimates and calculations people tend to start from whatever initial value they have been given. All too often this value is wrong, yet people are reluctant to move far away from it. This is the anchor effect.

Insufficient adjustment

Groups were given estimating tasks i.e. told to estimate various fairly easy values. Before each guess the group watched the invigilator spin a roulette wheel and pick a number entirely at random. Two groups were asked to estimate the percentage of African nations in the United Nations. The group which had watched the wheel land on 10 guessed 25%, the group which had watched it land on 65 guessed 45%.

Two groups of high school students were given these sums to calculate in 5 seconds: first group 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8, second group 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1. Without time to complete the sum both groups extrapolated from the part-completed task: first group guessed 512, second group guessed 2,250. (Both were wrong: it’s 40,320).
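The running products show why the two orderings anchor people so differently. This sketch assumes, as the paper suggests, that a student only gets through the first few multiplications before having to guess.

```python
from itertools import accumulate
from operator import mul

ascending = list(accumulate(range(1, 9), mul))        # running products of 1 x 2 x ... x 8
descending = list(accumulate(range(8, 0, -1), mul))   # running products of 8 x 7 x ... x 1

print("ascending :", ascending)
print("descending:", descending)
```

After four steps the ascending group has only reached 24 while the descending group is already at 1,680. Both sequences end at 40,320, but each group adjusts upwards from a very different starting point, and neither adjusts far enough.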

Biases in the evaluation of conjunctive and disjunctive events

People tend to overestimate the probability of conjunctive events and underestimate the probability of disjunctive events. I found their explanation a little hard to follow here, but it seems to mean that when several events all need to occur in order to result in a certain outcome, we overestimate the likelihood that all of them will happen. If only one of many events needs to occur, we underestimate that probability.

Thus: subjects were given the chance to bet on one of the following events:

  • simple event: pulling a red marble from a bag containing half red marbles and half white marbles
  • conjunctive event: pulling a red marble seven times in succession (replacing the marble each time) from a bag containing 90% red and 10% whites – the point is, that this is only an event if it happens seven times in succession
  • disjunctive event: pulling a red marble at least once in seven successive goes (again replacing the marble each time) from a bag containing 10% red and 90% whites

So the simple event is a yes-no result, with 50/50 odds; the conjunctive event requires that seven things happen in succession (which comes out at slightly worse than 50/50); and the disjunctive event requires only one success in seven attempts (slightly better than 50/50). Most subjects preferred to bet on the seven-times-in-succession event rather than the simple one, and on the simple event rather than the at-least-once-in-seven one, overestimating the first and underestimating the last.
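The three probabilities can be worked out in a few lines. The sketch assumes the marble is put back after every draw, and that the bag for the disjunctive event contains 10% red marbles, which is the figure given in the original paper.

```python
simple = 0.5                     # one draw from a bag that is 50% red
conjunctive = 0.9 ** 7           # seven reds in a row from a bag that is 90% red
disjunctive = 1 - 0.9 ** 7       # at least one red in seven draws from a bag that is 10% red

print(f"simple {simple:.2f}, conjunctive {conjunctive:.2f}, disjunctive {disjunctive:.2f}")
# prints: simple 0.50, conjunctive 0.48, disjunctive 0.52
```

So the event which feels most promising (a 90% chance each time) is actually the worst bet of the three, and the one which feels least promising (a 10% chance each time) is the best.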

They then explain the real world significance of this finding. The development of a new product is a typically conjunctive event: a whole string of things must go right in order for the product to work. People’s tendency to overestimate conjunctive events leads to unwarranted optimism, which sometimes results in failure.

By contrast disjunctive structures are typically used in the calculation of risk. In a complex system, just one thing has to fail for the whole to fail. The chances of failure in each individual component might be low, but across many components they combine into a high probability that something will go wrong, somewhere.

Yet people consistently underestimate the probability of disjunctive events, thus underestimating risk.
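The same arithmetic applies to projects and complex systems. The sketch below assumes every step or component is independent of the others, and the figures (20 steps, 100 components, 95% and 1%) are hypothetical numbers of my own, chosen only to show the scale of the effect.

```python
def p_all_succeed(p_step, n_steps):
    """Conjunctive: probability that every one of n independent steps succeeds."""
    return p_step ** n_steps

def p_any_fails(p_fail, n_components):
    """Disjunctive: probability that at least one of n independent components fails."""
    return 1 - (1 - p_fail) ** n_components

# Hypothetical numbers, chosen only for illustration.
print(f"20 steps, each 95% likely to go right: {p_all_succeed(0.95, 20):.2f}")
print(f"100 components, each 1% likely to fail: {p_any_fails(0.01, 100):.2f}")
```

A plan whose every step looks safe has only about a one in three chance of going right throughout, and a system whose every part looks reliable is more likely than not to suffer a failure somewhere.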

This explains why estimates for the completion of big, complex projects always tend to be over-optimistic – think Crossrail.

Anchoring in the assessment of subjective probability distributions

This is an advanced statistical concept which they did not explain very well. I think it was to do with how you set a kind of basic value for a person’s guesses and estimates, and T&K then proceed to show that these kinds of calibrations are often wildly inaccurate.

Discussion

At the end of the summary of experiments, Tversky and Kahneman discuss their findings. This part was tricky to follow because they don’t discuss their findings’ impact on ordinary life in terms you or I might understand, but instead assess the impact of their findings on what appears to have been (back in 1974) modern decision theory.

I think the idea is that modern decision theory was based on a model of human rationality which was itself based on an idealised notion of logical thinking calculated from an assessment or ‘calibration’ of subjective decision-making.

Modern decision theory regards subjective probability as the quantified opinion of an ideal person.

I found it impossible to grasp the detail of this idea, maybe because they don’t explain it very well, assuming that the audience for this kind of specialised research paper would already be familiar with it. Anyway, Tversky and Kahneman say that their findings undermine the coherence of this model of ‘modern decision theory’, explaining why in technical detail which, again, I found hard to follow.

Instead, for the lay reader like myself, the examples they’ve assembled, and the types of cognitive and logical and probabilistic errors they describe, give precision and detail enough to support one’s intuition that people (including oneself) are profoundly, alarmingly, irrational.

Summary

In their words:

This article described three heuristics that are employed in making judgements under uncertainty: (i) representativeness, which is usually employed when people are asked to judge the probability that an object or event A belongs to class or process B; (ii) availability of instances or scenarios, which is often employed when people are asked to assess the frequency of a class or the plausibility of a particular development; and (iii) adjustment from an anchor, which is usually employed in numerical prediction when a relevant value is available.

These heuristics are highly economical and usually effective, but they lead to systematic and predictable errors. A better understanding of these heuristics and of the biases to which they lead could improve judgments and decisions in situations of uncertainty.

My thoughts

1. The most obvious thing to me, fresh from reading John Allen Paulos’s two books about innumeracy and Stuart Sutherland’s book on irrationality, is how much the examples used by Tversky and Kahneman are repeated almost verbatim in those books, and thus what a rich source of data this article was for later writers.

2. The next thought is that this is because those books, especially the Sutherland, copy the way that Tversky and Kahneman use each heuristic as the basis for a section of their text, which they then sub-divide down into component parts, or variations on the basic idea.

Reading this paper made me realise this is exactly the approach that Sutherland uses in his book, taking one ‘error’ or bias at a time, and then working through all the sub-types and examples.

3. My next thought is the way Sutherland and Paulos only use some of the examples in this paper, the ones – reasonably enough – which are most comprehensible. Thus the final section in Tversky and Kahneman’s paper – about subjective probability distributions – is not picked up in the other books because it is couched in such dense mathematical terminology as to be almost impenetrable and because the idea they are critiquing – 1970s decision making theory – is too remote from most people’s everyday concerns.

So: having already read Paulos and Sutherland, not many of the examples Tversky and Kahneman use came as a surprise, nor did the basic idea of the availability error or representative error or the anchor effect.

But what did come over as new – what I found thought provoking – was the emphasis they put throughout on the fundamental usefulness of the heuristics.

Up till now – in Paulos and Sutherland – I had only heard negative things about these cognitive errors and prejudices and biases. It was a new experience to read Tversky and Kahneman explaining that these heuristics – these mental shortcuts – although they are often prone to error – nonetheless, have evolved deep in our minds because they are fundamentally useful.

That set off a new train of thought, and made me reflect that Paulos, Sutherland and Tversky and Kahneman are all dwelling on the drawbacks and limitations of these heuristics, leaving the many situations in which they are helpful, undescribed.

Now, as Sutherland repeats again and again – we should never let ourselves be dazzled by salient and striking results (such as coincidences and extreme results), we should always look at the full set of all the data, we should make sure we consider all the negative incidents where nothing dramatic or interesting happened, in order to make a correct calculation of probabilities.

So it struck me that you could argue that all these books and articles which focus on cognitive errors are, in their own way, rather unscientific, or lack a proper sample size – because they only focus on the times when the heuristics result in errors (and, also, that these errors are themselves measured in highly unrealistic conditions, in psychology labs, using highly unrepresentative samples of university students).

What I’m saying is that for a proper assessment of the real place of these heuristics in actual life, you would have to take into account all the numberless times when they have worked – when these short-cut, rule-of-thumb guesstimates, actually produce positive and beneficial results.

It may be that for every time a psychology professor conducts a highly restricted and unrealistic psychology experiment on high school students or undergraduates which results in them making howling errors in probability or misunderstanding the law of large numbers or whatever — it may just be that on that day literally billions of ‘ordinary’ people are using the same heuristic in the kind of real world situations most of us encounter in our day-to-day lives, to make the right decisions for us, and to achieve positive outcomes.

The drawbacks of these heuristics are front and centre of Paulos and Sutherland and Tversky and Kahneman’s works – but who’s measuring the advantages?


Tips For Trying To Think Less Irrationally

Professor Stuart Sutherland divides his book Irrationality: The Enemy Within into 23 chapters, each addressing a different aspect of why human beings are so prone to irrational, illogical, biased and erroneous thinking.

Having trotted through its allotted subject, each chapter ends with a few tentative suggestions of how to address the various biases and errors described in it.

This blog post is a summary of that advice. I have omitted tips which are so tied to specific examples that they’re incomprehensible out of context, and trimmed most of them down (and expanded a few).

The Wrong Impression

  1. Never base a judgement or decision on a single case, no matter how striking.
  2. In forming an impression of a person (or object) try to break your judgement down into his (or its) separate qualities without letting any strikingly good or bad qualities influence your opinion about the remainder: especially in interviews or medical diagnoses.
  3. When exposed to a train of evidence or information, suspend judgement until the end: try to give as much weight to the last piece of evidence as the first.
  4. Try to resist the temptation to seek out only information which reinforces the decision you have already taken. Try to seek out all the relevant information needed to make a decision.

Obedience

  1. Think before obeying.
  2. Question whether an order is justified.

Conformity

  1. Think carefully before announcing a decision or commitment in front of others. Once done, these are hard to change.
  2. Ask yourself whether you are doing or saying something merely because others are doing or saying it. If you have doubts, really reflect on them and gather evidence for them.
  3. Don’t be impressed by advice on a subject from someone just because you admire them, unless they are an expert on the matter in hand.
  4. Don’t be stampeded into acting by crowds. Stand aloof.

In-groups and out-groups

  1. Don’t get carried away by group decisions. Consciously formulate arguments against the group decision.
  2. If you’re forming a team or committee, invite people with different beliefs or skill sets.
  3. Reflect on your own prejudices and the ‘types’ of people you dislike or despise.

Organisational folly

(A list of errors in large organisations, which are difficult to cure, hence there are no tips at the end of this chapter.)

Misplaced consistency

  1. Beware of over-rating the results of a choice you’ve made (because the human tendency is to slowly come to believe all your decisions have been perfect).
  2. Try not to move by small steps to an action or attitude you would initially have disapproved of.
  3. No matter how much time, effort or money you have invested in a project, cut your losses if the future looks uncertain / risky.

Misuse of Rewards and Punishments

  1. If you want someone to value a task and perform well, do not offer material rewards. Appeal to their sense of responsibility and pride.
  2. If you are a manager, adopt as participatory and egalitarian a style as possible.
  3. If you want to stop children (and anyone else) from doing something, try to persuade rather than threatening them with punishment.

Drive and Emotion

  1. Don’t take important decisions when under stress or strong emotions.
  2. Every time you subdue an impulse, it becomes easier to do so.

Ignoring the Evidence (Pearl Harbour)

  1. Search for the evidence against your hypothesis, decision, beliefs.
  2. Try to entertain hypotheses which are antagonistic to each other.
  3. Respect beliefs and ideas which conflict with your own. They might be right.

Distorting the Evidence (Battle of Arnhem)

  1. If new evidence comes in don’t distort it to support your existing actions or views. The reverse: consider carefully whether it disproves your position.
  2. Don’t trust your memory. Countless experiments prove that people remember what they need to remember to justify their actions and bolster their self-esteem.
  3. Changing your mind in light of new evidence is a sign of strength, not weakness.

Making the Wrong Connections

  1. If you want to determine whether one event is associated with another, never attempt to keep the co-occurrence of events in your head. Maintain a written tally of the four possible outcomes in a 2 x 2 box.
  2. Remember that A is only associated with B if B occurs a higher percentage of the time in the presence of A than in its absence.
  3. Pay particular attention to negative cases.
  4. In particular, do not associate things together because you expect them to be, or because they are unusual.
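The 2 x 2 tally in tips 1 and 2 can be turned into a tiny calculation. The remedy example and all four counts below are hypothetical, invented to show how a large number of striking positive cases can hide the absence of any association.

```python
def association(both, a_only, b_only, neither):
    """Compare how often B occurs with A and without A, from the four cells of a 2 x 2 tally."""
    p_b_given_a = both / (both + a_only)
    p_b_given_not_a = b_only / (b_only + neither)
    return p_b_given_a, p_b_given_not_a

# Hypothetical tally: A = "took the remedy", B = "felt better within a week".
with_a, without_a = association(both=80, a_only=20, b_only=40, neither=10)
print(f"P(B | A) = {with_a:.2f}, P(B | not A) = {without_a:.2f}")
```

Eighty people took the remedy and got better, which sounds impressive, but people who did not take it got better at exactly the same rate (80%), so in this made-up tally the remedy is not associated with recovery at all.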

Mistaking Connections in Medicine

(Focuses on doctors’ failure to use 2 x 2 tables in order to establish correct probabilities in diagnosis, so maybe the tip should be: Don’t try to calculate conditional probabilities in your head – write it down.)

Mistaking the Cause

  1. Suspect any explanation of an event in which the cause and the effect are similar to one another.
  2. Suspect all epidemiological findings unless they are supported by more reliable evidence.
  3. Consider whether an event could have causes other than the one you first think of.
  4. In allocating cause and effect, consider that they might happen in the opposite direction to that you first choose.
  5. Be sceptical of any causal relationship unless there is an underlying theory that explains it.
  6. In apportioning responsibility for an action, do not be influenced by the magnitude of its effect.
  7. Don’t hold someone responsible for an action without first considering what others would have done in their place.
  8. Don’t assume that other people are like you.

Misinterpreting the Evidence

  1. Do not judge solely by appearances. If someone looks more like an X than a Y, they may still be a Y if there are many more Ys than Xs.
  2. A statement containing two or more pieces of information is always less likely to be true than one containing only one piece of information.
  3. Do not believe a statement is true just because part of it is true.
  4. If you learn the probability of X given Y, to arrive at a true probability you must know the base rate of X.
  5. Don’t trust small samples.
  6. Beware of biased samples.

Inconsistent decisions and Bad Bets

  1. Always work out the expected value of a gamble before accepting it.
  2. Before accepting any form of gamble be clear what you want from it – high expected value, the remote possibility of winning a large sum with a small outlay, a probable but small gain, or just the excitement of gambling and damn the expense. If you seriously intend solely to make money, work out the expected value of a gamble before accepting it.
  3. Don’t be anchored by the first figure you hear; ignore it and reason from scratch.
  4. Many connected or conditional probabilities make an event more unlikely with every new addition. Conversely, the sum of numerous independent probabilities may add up to make something quite likely.
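Working out an expected value, as tip 1 recommends, takes very little effort. The raffle below is a hypothetical example of my own: a ticket costing 1 which wins 500 with a probability of 1 in 1,000.

```python
def expected_value(outcomes):
    """Sum of payoff x probability over (payoff, probability) pairs covering every outcome."""
    total_probability = sum(p for _, p in outcomes)
    assert abs(total_probability - 1) < 1e-9, "probabilities must sum to 1"
    return sum(payoff * p for payoff, p in outcomes)

# Hypothetical raffle; payoffs are net of the ticket price.
raffle = [(499, 0.001), (-1, 0.999)]
print(f"expected value per ticket: {expected_value(raffle):.2f}")
```

On these numbers each ticket loses half its price on average, which is fine if what you want is the excitement, and a bad idea if you seriously intend to make money.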

Overconfidence

  1. Distrust anyone who says they can predict the present from the past.
  2. Be wary of anyone who claims to be able to predict the future.
  3. Try to control your own over-confidence e.g.
    • wherever possible, try to write out and calculate probabilities rather than using ‘intuition’
    • always think of arguments which contradict your position and work them through

Risks

  1. People are liable to ignore risks if told to i.e. it is managers’ responsibility to assess the risks for their staff.
  2. Insidious chronic dangers may kill more people than dramatic accidents e.g. coal pollution kills more people than nuclear accidents.

False Inferences

  1. Regression to the mean: remember that whenever anything extreme happens, chances are the next thing will be a lot less extreme. This explains why second novels or albums or sports seasons are often disappointing following an award-winning first novel, album or season.
  2. If two pieces of evidence always agree, you only need one of them to make a prediction.
  3. Avoid the gambler’s fallacy i.e. the belief that a certain random event is less likely or more likely, given a previous event or a series of events e.g. that if you have tossed a long run of tails, heads must come up next. No. Each new toss is a new event, uninfluenced by previous events.

The Failure of Intuition

  1. Suspect anyone who claims to have good intuition.
  2. If you are in a profession, consider using mathematical models of decision making instead of trusting your ‘intuition’.

Utility

  1. When the importance of a decision merits the expenditure of time, use Utility Theory.
  2. Before making an important decision decide what your overall aim is, whether it be to maximise the attainment of your goals, to save yourself from loss, to make at least some improvement to your situation etc.

Causes, cures and costs

  • keep an open mind
  • reach a conclusion only after reviewing all the possible evidence
  • it is a sign of strength to change one’s mind
  • seek out evidence which disproves your beliefs
  • do not ignore or distort evidence which disproves your beliefs
  • never make decisions in a hurry or under stress
  • where the evidence points to no obvious decision, don’t take one
  • learn basic statistics and probability
  • substitute mathematical methods (cost-benefit analysis, regression analysis, utility method) for intuition and subjective judgement

Thoughts

This is all very good advice, and I’d advise anyone to read Sutherland’s book. However, I can see scope for improvement or taking it further.

The structure above reflects Sutherland’s i.e. it has arranged the field in terms of the errors people make – each chapter devoted to a type of error with various types of evidence describing experiments which show how common they are.

In a sense this is an easy approach. There exist nowadays numerous lists of cognitive errors and biases.

Arguably, it would be more helpful to try and make a book or helpsheet arranged by problems and solutions, in which – instead of beginning another paragraph ‘Imagine you toss a coin a thousand times…’ in order to demonstrate another common misunderstanding of probability theory – each chapter focused on a type of real-world situation and how to handle it.

It would be titled something like How to think more clearly about… and then devote a chapter each to meeting new people, interviews, formal meetings and so on. There would be a standalone chapter devoted just to probability theory, since this stands out to me as being utterly different from the psychological biases – and maybe another one devoted solely to gambling since this, also, amounts to a specialised area of probability.

There would be one on financial advisers and stock brokers, giving really detailed advice on what to look for before hiring one, and whether you need one at all.

There would be one solely about medical statistics i.e. explaining how to understand the risks and benefits of medical treatment, if you ever need some.

Currently, although Sutherland’s book and the tips listed above are useful, it is impossible to remember all of them. A more practical approach would be to have a book (or website) of problems or situations where you could look up the situation and be reminded of the handful of simple but effective principles you should bear in mind.


Related link

Reviews of other science books

Chemistry

Cosmology

The Environment

Genetics and life

Human evolution

Maths

Particle physics

Psychology

Alex’s Adventures In Numberland by Alex Bellos (2010)

Alexander Bellos (born in 1969) is a British writer and broadcaster. He is the author of books about Brazil and mathematics, as well as having a column in The Guardian newspaper. After adventures in Brazil (see his Wikipedia page) he returned to England in 2007 and wrote this, his first book. It spent four months in the Sunday Times bestseller list and led on to five more popular maths books.

It’s a hugely enjoyable read for three reasons:

  1. Bellos immediately establishes a candid, open, good bloke persona, sharing stories from his early job as a reporter on the Brighton Argus, telling some colourful anecdotes about his time in Brazil and then being surprisingly open about the way that, when he moved back to Britain, he had no idea what to do. The tone of the book is immediately modern, accessible and friendly.
  2. However this doesn’t mean he is verbose. The opposite. The book is packed with fascinating information. Every single paragraph, almost every sentence contains a fact or insight which makes you sit up and marvel. It is stuffed with good things.
  3. Lastly, although its central theme is mathematics, it approaches this through a wealth of information from the humanities. There is as much history and psychology and anthropology and cultural studies and philosophy as there is actual maths, and these are all subjects which the average humanities graduate can immediately relate to and assimilate.

Chapter Zero – A Head for Numbers

Alex meets Pierre Pica, a linguist who’s studied the Munduruku people of the Amazon and discovered they have little or no sense of numbers. They only have names for numbers up to five. Also, they cluster numbers together logarithmically i.e. the higher the number, the closer together they clustered them. Same thing is done by kindergarten children who only slowly learn that numbers are evenly spaced, in a linear way.

This may be because small children and the Munduruku don’t count so much as estimate using the ratios between numbers.

It may also be because above a certain number (five) Stone Age man needed to make quick estimates along the lines of, Are there more wild animals / members of the other gang, than us?

Another possibility is that distance appears to us to be logarithmic due to perspective: the first fifty yards we see in close detail, the next fifty yards not so detailed, beyond 100 yards looking smaller, and so on.

It appears that we have to be actively taught when young to overcome our logarithmic instincts, and to apply the rule that each successive whole number is an equal distance from its predecessor and successor i.e. the whole numbers lie along a straight line at regular intervals.

More proof that the logarithmic approach is the deep, hard-wired one is the way most of us revert to its perspective when considering big numbers. As John Allen Paulos laments, people make no end of fuss about discrepancies between 2 or 3 or 4 – but are often merrily oblivious to the difference between a million and a billion, let alone a trillion. For most of us these numbers are just ‘big’.

He goes on to describe experiments done on chimpanzees, monkeys and lions which appear to show that animals have the ability to estimate numbers. And then onto experiments with small babies which appear to show that as soon as they can focus on the outside world, babies can detect changes in number of objects.

And it appears that we also have a further number skill, that of guesstimating things – the journey takes 30 or 40 minutes, there were twenty or thirty people at the party, you get a hundred, maybe a hundred and fifty peas in a sack. When it comes to these figures almost all of us give rough estimates.

To summarise:

  • we are sensitive to small numbers, acutely so of 1, 2, 3, 4, less so of 5, 6, 7, 8, 9
  • left to our own devices we think logarithmically about larger numbers i.e. lose the sense of distinction between them, clump them together
  • we have a good ability to guesstimate medium size numbers – 30, 40, 100

But it was only with the invention of notation, a way of writing numbers down, that we were able to create the linear system of counting (where every number is 1 larger than its predecessor, laid out in a straight line, at regular intervals).

This cultural invention enabled human beings to transcend our vague guesstimating abilities, and laid the basis for the systematic manipulation of the world which followed.

Chapter One – The Counter Culture

The probable origins of counting lie in stock taking in the early agricultural revolution some 8,000 years ago.

We nowadays count using a number base 10 i.e. the decimal system. But other bases have their virtues, especially base 12. It has more factors i.e. is easier to divide: 12 can be divided neatly by 2, 3, 4 and 6. A quarter of 10 is 2.5 but of 12 is 3. A third of 10 is 3.333 but of 12 is 4. Striking that a version of the duodecimal system (pounds, shillings and pence) hung on in Britain till we finally went metric in the 1970s. There is even a Duodecimal Society of America which still actively campaigns for the superiority of a base 12 counting scheme.
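The divisibility argument for base 12 can be checked in a few lines. This is only a minimal sketch: the function name `proper_divisors` is made up for illustration, and it simply lists the whole numbers (other than 1 and the base itself) that divide each base exactly.

```python
def proper_divisors(base):
    """Return the divisors of `base` other than 1 and `base` itself.

    `proper_divisors` is a hypothetical helper name used for illustration.
    """
    return [d for d in range(2, base) if base % d == 0]


for base in (10, 12):
    # Base 10 divides neatly only by 2 and 5; base 12 by 2, 3, 4 and 6.
    print(base, proper_divisors(base))
```

Base 12 has twice as many such divisors as base 10, which is the sense in which it is ‘easier to divide’.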

Bellos describes a bewildering variety of other counting systems and bases. In 1716 King Charles XII of Sweden asked Emanuel Swedenborg to devise a new counting system with a base of 64. The Arara in the Amazon count in pairs, and the Renaissance author Luca Pacioli was just one of hundreds who have devised finger-based systems of counting – indeed, the widespread use of base 10 probably stems from the fact that we have ten fingers and toes.

He describes a complicated Chinese system where every part of the hand and fingers has a value which allows you to count up to nearly a billion – on one hand!

Then there is the Yupno system, which attributes a different value to parts of the body up to its highest number, 33, represented by the penis.

Diagram showing numbers attributed to parts of the body by the Yupno tribe

There’s another point to make about his whole approach which comes out if we compare him with the popular maths books by John Allen Paulos which I’ve just read.

Paulos clearly sees the need to leaven his explanations of comparative probability and Arrow’s Theorem and so on with lighter material and so his strategy is to chuck into his text things which interest him: corny jokes, anecdotes about baseball, casual random digressions which occur to him in mid-flow. But all his examples clearly 1. emanate from Paulos’s own interests and hobby horses (especially baseball) and 2. are tacked onto the subjects being discussed.

Bellos, also, has grasped that the general reader needs to be spoonfed maths via generous helpings of other, more easily digestible material. But Bellos’s choice of material arises naturally from the topic under discussion. The humour emerges naturally and easily from the subject matter instead of being tacked on in the form of bad jokes.

You feel yourself in the hands of a master storyteller who has all sorts of wonderful things to explain to you.

In the fourth millennium BC, an early counting system was created by pressing a reed into soft clay. By 2700 BC the Sumerians were using cuneiform. And they had number symbols for 1, 10, 60 and 3,600 – a mix of decimal and sexagesimal systems.

Why the Sumerians grouped their numbers in 60s has been described as one of the greatest unresolved mysteries in the history of arithmetic. (p.58)

Measuring in 60s was inherited by the Babylonians, the Egyptians and the Greeks and is why we still measure hours in 60 minutes and the divisions of a circle by 360 degrees.

I didn’t know that after the French Revolution, when the National Convention introduced the decimal system of weights and measures, it also tried to decimalise time, introducing a new system whereby every day would be divided into ten hours, each of a hundred minutes, each divided into 100 seconds. Thus there were a very neat 10 x 100 x 100 = 100,000 seconds in a day. But it failed. An hour of 60 minutes turns out to be a deeply useful division of time, intuitively measurable, and a reasonable amount of time to spend on tasks. The reform was quietly dropped after six months, although revolutionary decimal clocks still exist.

Studies consistently show that Chinese children find it easier to count than European children. This may be because of our system of notation, or the structure of number names. Instead of eleven or twelve, Chinese, Japanese and Koreans say the equivalent of ten one, ten two. 21 and 22 become two ten one and two ten two. It has been shown that this makes it a lot simpler and more intuitive to do basic addition and subtraction.

Bellos goes on to describe the various systems of abacuses which have developed in different cultures, before explaining the phenomenal popularity of abacus counting, abacus clubs, and abacus championships in Japan, which help kids develop the ability to perform anzan, using the mental image of an abacus to help its practitioners do sums at phenomenal speed.

Chapter Two – Behold!

The mystical sense of the deep meaning of numbers, from Pythagoras with his vegetarian religious cult of numbers in 6th century BC southern Italy to Jerome Carter who advises leading rap stars about the numerological significance of their names.

Euclid and the elegant and pure way he deduced mathematical theorems from a handful of basic axioms.

A description of the basic Platonic shapes leads into the nature of tessellating tiles, and the Arab pioneering of abstract design. The complex designs of the Sierpinski carpet and the Menger sponge. And then the complex and sophisticated world of origami, which has its traditionalists, its pioneers and surprising applications to various fields of advanced science, introducing us to the American guru of modern origami, Robert Lang, and the Japanese rebel, Kazuo Haga, father of Haga’s Theorem.

Chapter Three – Something About Nothing

A bombardment of information about the counting systems of ancient Hindus, Buddhists, about number symbols in Sanskrit, Hebrew, Greek and Latin. How the concept of zero was slowly evolved in India and moved to the Muslim world with the result that the symbols we use nowadays are known as the Arabic numerals.

A digression into ‘a set of arithmetical tricks known as Vedic Mathematics’ devised by a young Indian swami at the start of the twentieth century, Bharati Krishna Tirthaji, based on a series of 16 aphorisms which he found in the ancient holy texts known as the Vedas.

Shankaracharya is a commonly used title of heads of monasteries called mathas in the Advaita Vedanta tradition. Tirthaji was the Shankaracharya of the monastery at Puri. Bellos goes to visit the current Shankaracharya who explains the closeness, in fact the identity, of mathematics and Hindu spirituality.

Chapter Four – Life of Pi

An entire chapter about pi which turns out not only to be a fundamental aspect of calculating radiuses and diameters and volumes of circles and cubes, but also to have a long history of mathematicians vying with each other to work out its value to as many decimal places as possible (we currently know the value of pi to 2.7 trillion decimal places) and the surprising history of people who have set records reciting the value of pi.

Thus, in 2006, retired Japanese engineer Akira Haraguchi set a world record for reciting the value of pi to the first 100,000 decimal places from memory! It took 16 hours with five minute breaks every two hours to eat rice balls and drink some water.

There are several types or classes of numbers:

  • natural numbers – 1, 2, 3, 4, 5, 6, 7…
  • integers – all the natural numbers, but including the negative ones as well – …-3, -2, -1, 0, 1, 2, 3…
  • fractions, which are also called rational numbers
  • irrational numbers – numbers which cannot be written as fractions
  • transcendental numbers – ‘a transcendental number is an irrational number that cannot be described by an equation with a finite number of terms’

The qualities of the heptagonal 50p coin and the related qualities of the Reuleaux triangle.

Chapter Five – The x-factor

The origin of algebra (in Arab mathematicians).

Bellos makes the big historical point that for the Greeks (Pythagoras, Plato, Euclid) maths was geometric. They thought of maths as being about shapes – circles, triangles, squares and so on. These shapes had hidden properties which maths revealed, thus giving – the Pythagoreans thought – insight into the secret deeper values of the world.

It is only with the introduction of algebra in the 17th century (Bellos attributes its widespread adoption to Descartes’s Method in 1637) that it is possible to fly free of shapes into whole new worlds of abstract numbers and formulae.

Logarithms turn the difficult operation of multiplication into the simpler operation of addition. If X x Y = Z, then log X + log Y = log Z. They were invented by a Scottish laird John Napier, and publicised in a huge book of logarithmic tables published in 1614. Englishman Henry Briggs established logarithms to base 10 in 1628. In 1620 Englishman Edmund Gunter marked logarithms on a ruler. Later in the 1620s Englishman William Oughtred placed two logarithmic rulers next to each other to create the slide rule.
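To see the trick at work, here is a small sketch using base-10 logarithms from Python’s standard `math` module. The two numbers being multiplied are arbitrary values picked for illustration, not an example from the book.

```python
import math

# Two arbitrary numbers to multiply (illustrative values only).
x, y = 37.0, 52.0

# Step 1: look up the logarithms (a table of logarithms did this job in 1614).
log_sum = math.log10(x) + math.log10(y)

# Step 2: convert the summed logarithm back into an ordinary number.
product_via_logs = 10 ** log_sum

print(product_via_logs)  # agrees with x * y up to floating-point rounding
print(x * y)
```

A slide rule does the same thing mechanically: sliding one logarithmic scale along another adds the two lengths, and so multiplies the two numbers.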

Three hundred years of dominance by the slide rule was brought to a screeching halt by the launch of the first pocket calculator in 1972.

Quadratic equations are equations with an x and an x², e.g. 3x² + 2x – 4 = 0. ‘Quadratics have become so crucial to the understanding of the world, that it is no exaggeration to say that they underpin modern science’ (p.200).
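The example equation above can be solved with the standard quadratic formula, x = (−b ± √(b² − 4ac)) / 2a. The sketch below is an illustration rather than anything taken from the book; `solve_quadratic` is a made-up name and it only returns real roots.

```python
import math


def solve_quadratic(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 as a tuple.

    Hypothetical helper for illustration; assumes a is not zero and
    returns an empty tuple when there are no real roots.
    """
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return ()
    root = math.sqrt(discriminant)
    return ((-b - root) / (2 * a), (-b + root) / (2 * a))


# The example from the text: 3x² + 2x – 4 = 0
roots = solve_quadratic(3, 2, -4)
print(roots)
```

Substituting either root back into 3x² + 2x – 4 should give zero, give or take rounding error.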

Chapter Six – Playtime

Number games. The origin of Sudoku, which is Japanese for ‘the number must appear only once’. There are some 5 billion ways for numbers to be arranged in a table of nine cells so that the sum of any row or column is the same.

There have, apparently, only been four international puzzle crazes with a mathematical slant – the tangram, the Fifteen puzzle, Rubik’s cube and Sudoku – and Bellos describes the origin and nature and solutions to all four. More than 300 million cubes have been sold since Ernö Rubik came up with the idea in 1974. Bellos gives us the latest records set in the hyper-competitive sport of speedcubing: the current record for restoring a completely scrambled cube to order (i.e. all the faces of one colour) is 7.08 seconds, a record held by Erik Akkersdijk, a 19-year-old Dutch student.

A visit to the annual Gathering for Gardner, honouring Martin Gardner, one of the greatest popularisers of mathematical games and puzzles who Bellos visits. The origin of the ambigram, and the computer game Tetris.

Chapter Seven – Secrets of Succession

The joy of sequences. Prime numbers.

The fundamental theorem of arithmetic – In number theory, the fundamental theorem of arithmetic, also called the unique factorization theorem or the unique-prime-factorization theorem, states that every integer greater than 1 either is a prime number itself or can be represented as the product of prime numbers.
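A rough way to watch the theorem in action is to break numbers down by trial division. This is a minimal sketch, fine for small numbers but far too slow for large ones, and `prime_factors` is just an illustrative name.

```python
def prime_factors(n):
    """Return the prime factors of n (n > 1) in ascending order.

    Hypothetical helper using simple trial division.
    """
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        # Strip out every copy of this divisor before moving on.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left over is itself prime.
        factors.append(n)
    return factors


print(prime_factors(360))  # 2 x 2 x 2 x 3 x 3 x 5
print(prime_factors(97))   # 97 is prime, so it is its own factorisation
```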

The Goldbach conjecture – one of the oldest and best-known unsolved problems in number theory and all of mathematics. It states that, Every even integer greater than 2 can be expressed as the sum of two primes. The conjecture has been shown to hold for all integers less than 4 × 10¹⁸, but remains unproven despite considerable effort.
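Checking the conjecture for small numbers is easy, even though proving it is not. The sketch below looks for one pair of primes for a given even number; the names `is_prime` and `goldbach_pair` are invented for illustration, and of course no amount of checking like this amounts to a proof.

```python
def is_prime(n):
    """Trial-division primality test (illustrative, fine for small n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def goldbach_pair(n):
    """Return the first pair of primes (p, q), p <= q, with p + q == n.

    Intended for even n greater than 2; returns None if no pair is found.
    """
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None


print(goldbach_pair(28))   # (5, 23)
print(goldbach_pair(100))  # (3, 97)
```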

Neil Sloane’s idea of persistence – The number of steps it takes to get to a single digit by multiplying all the digits of the preceding number to obtain a second number, then multiplying all the digits of that number to get a third number, and so on until you get down to a single digit. 88 has a persistence of three.

88 → 8 x 8 = 64 → 6 x 4 = 24 → 2 x 4 = 8
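The steps above can be sketched as a short function. The name `persistence` is mine, and the code assumes a non-negative whole number written in base 10.

```python
def persistence(n):
    """Count how many digit-multiplication steps it takes to reach one digit.

    Illustrative helper; assumes n is a non-negative integer in base 10.
    """
    steps = 0
    while n >= 10:
        product = 1
        for digit in str(n):
            product *= int(digit)
        n = product
        steps += 1
    return steps


print(persistence(88))  # 88 -> 64 -> 24 -> 8, so 3
```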

John Horton Conway’s idea of the powertrain – For any number abcd its powertrain is aᵇcᵈ; in the case of numbers with an odd number of digits the final one has no power, so abcde’s powertrain is aᵇcᵈe.
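A minimal sketch of the rule, pairing digits off as base and exponent and multiplying the results together. The function name `powertrain` is illustrative, and the code leans on Python’s convention that 0 ** 0 is 1, which is an assumption about how zeros should be treated rather than something stated in the text.

```python
def powertrain(n):
    """Pair the digits of n as (base, exponent) and multiply the powers.

    Illustrative helper. A leftover final digit (odd number of digits)
    is multiplied in with no exponent. Assumes 0 ** 0 == 1, as Python does.
    """
    digits = [int(ch) for ch in str(n)]
    result = 1
    for i in range(0, len(digits) - 1, 2):
        result *= digits[i] ** digits[i + 1]
    if len(digits) % 2 == 1:
        result *= digits[-1]
    return result


print(powertrain(24))    # 2 to the power 4 = 16
print(powertrain(123))   # 1 squared, times 3 = 3
print(powertrain(2592))  # 2 to the 5th times 9 squared = 2592
```

The last line shows a curiosity: 2592 is its own powertrain.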

The Recamán sequence Start at 0. At step n, subtract n from the previous term, unless a) it would result in a negative number or b) the number is already in the sequence, in which case add n instead. The result is:

0, 1, 3, 6, 2, 7, 13, 20, 12, 21, 11….
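The rule is easier to follow in code than in words. Here is a minimal sketch, assuming the usual form of the rule: at step n you subtract n if you can, and otherwise add n. The function name `recaman` is my own.

```python
def recaman(count):
    """Return the first `count` terms of the Recamán sequence (count >= 1).

    Illustrative helper: at step n, subtract n from the previous term if
    the result is non-negative and not already present; otherwise add n.
    """
    sequence = [0]
    seen = {0}
    for n in range(1, count):
        candidate = sequence[-1] - n
        if candidate < 0 or candidate in seen:
            candidate = sequence[-1] + n
        sequence.append(candidate)
        seen.add(candidate)
    return sequence


print(recaman(11))  # [0, 1, 3, 6, 2, 7, 13, 20, 12, 21, 11]
```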

Gijswijt’s sequence a self-describing sequence where each term counts the maximum number of repeated blocks of numbers in the sequence immediately preceding that term.

1, 1, 2, 1, 1, 2, 2, 2, 3, 1, 1, 2, 1, 1, 2, 2, 2, 3, 2, 1, …
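The definition is slippery, so a brute-force sketch may help. For each new term it tries every possible block length and counts how many times that block repeats at the very end of the sequence so far, then keeps the largest count. The function name `gijswijt` is made up and the method is deliberately naive.

```python
def gijswijt(count):
    """Return the first `count` terms of Gijswijt's sequence.

    Illustrative brute-force helper: each new term is the largest k such
    that the sequence so far ends in k back-to-back copies of some block.
    """
    seq = []
    for _ in range(count):
        length = len(seq)
        best = 1
        for block in range(1, length // 2 + 1):
            tail = seq[length - block:]
            k = 1
            # Step backwards one block at a time while the copies match.
            while (k + 1) * block <= length and \
                    seq[length - (k + 1) * block: length - k * block] == tail:
                k += 1
            best = max(best, k)
        seq.append(best)
    return seq


print(gijswijt(10))  # [1, 1, 2, 1, 1, 2, 2, 2, 3, 1]
```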

Perfect number A perfect number is any number that is equal to the sum of its factors (not counting the number itself). Thus 6 – its factors (the numbers which divide into it) are 1, 2 and 3, which also add up to 6. The next perfect number is 28 because its factors – 1, 2, 4, 7, 14 – add up to 28. And so on.

Amicable numbers A number is amicable if the sum of the factors of the first number equals the second number, and if the sum of the factors of the second number equals the first. The factors of 220 are 1, 2, 4, 5, 10, 11, 20, 22, 44, 55 and 110. Added together these make 284. The factors of 284 are 1, 2, 4, 71 and 142. Added together they make 220!
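Both perfect and amicable numbers rest on the same operation, adding up the factors of a number other than the number itself, so one small sketch covers both. The function names here are invented for illustration and the method is the slowest one imaginable.

```python
def sum_of_factors(n):
    """Sum the numbers that divide n exactly, excluding n itself."""
    return sum(d for d in range(1, n) if n % d == 0)


def is_perfect(n):
    """A perfect number equals the sum of its own factors."""
    return n > 1 and sum_of_factors(n) == n


def are_amicable(a, b):
    """Two different numbers are amicable if each is the other's factor sum."""
    return a != b and sum_of_factors(a) == b and sum_of_factors(b) == a


print([n for n in range(1, 1000) if is_perfect(n)])  # [6, 28, 496]
print(sum_of_factors(220), sum_of_factors(284))      # 284 220
print(are_amicable(220, 284))                        # True
```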

Sociable numbers In 1918 Paul Poulet invented the term sociable numbers. ‘The members of aliquot cycles of length greater than 2 are often called sociable numbers. The smallest two such cycles have length 5 and 28’

Mersenne’s prime A prime number that is one less than a power of two. That is, it is a prime number of the form Mₙ = 2ⁿ − 1 for some integer n. The exponents n which give Mersenne primes are 2, 3, 5, 7, 13, 17, 19, 31, … and the resulting Mersenne primes are 3, 7, 31, 127, 8191, 131071, 524287, 2147483647, …
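The list of exponents can be reproduced with the Lucas-Lehmer test, a standard check for numbers of this form. It is not described in this review, so treat the sketch below as my own illustration; the function names are made up.

```python
def is_prime(n):
    """Trial-division primality test (illustrative, fine for small n)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for a prime exponent p.

    Lucas-Lehmer test: start from 4, repeatedly square and subtract 2
    (modulo 2**p - 1); the number is prime if the result after p - 2
    steps is 0. The exponent 2 is handled as a special case.
    """
    if p == 2:
        return True
    mersenne = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % mersenne
    return s == 0


# Only prime exponents can give Mersenne primes, so filter on those first.
exponents = [p for p in range(2, 32) if is_prime(p) and lucas_lehmer(p)]
print(exponents)                        # [2, 3, 5, 7, 13, 17, 19, 31]
print([2 ** p - 1 for p in exponents])
```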

These and every other sequence ever created by humankind are documented on The On-Line Encyclopedia of Integer Sequences (OEIS), also cited simply as Sloane’s. This is an online database of integer sequences, created and maintained by Neil Sloane while a researcher at AT&T Labs.

Chapter Eight – Gold Finger

The golden section a number found by dividing a line into two parts so that the longer part divided by the smaller part is also equal to the whole length divided by the longer part.

Phi The number is often symbolized using phi, after the 21st letter of the Greek alphabet. In an equation form:

a/b = (a+b)/a = 1.61803398874989484820 …

As with pi (the ratio of the circumference of a circle to its diameter), the digits go on and on, theoretically into infinity. Phi is usually rounded off to 1.618.

The Fibonacci sequence Each number in the sequence is the sum of the two numbers that precede it. So the sequence goes: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, and so on. The mathematical equation describing it is Xₙ₊₂ = Xₙ₊₁ + Xₙ.

It appears as the basis of seeds in flowerheads, the arrangement of leaves round a stem, the design of the nautilus shell and much more.
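The link between the Fibonacci sequence and the golden section is that the ratio of each Fibonacci number to the one before it creeps ever closer to phi. A minimal sketch, with the function name `fibonacci` chosen for illustration:

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 0, 1."""
    sequence = []
    a, b = 0, 1
    for _ in range(count):
        sequence.append(a)
        a, b = b, a + b
    return sequence


fib = fibonacci(30)
phi = (1 + 5 ** 0.5) / 2  # the golden section, roughly 1.618

print(fib[:10])           # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(fib[-1] / fib[-2])  # ratio of the last two terms
print(phi)
```

The two printed decimals agree to many places, and the agreement improves the further along the sequence you go.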

Chapter Nine – Chance Is A Fine Thing

A chapter about probability and gambling.

Impossibility has a value 0, certainty a value 1, everything else is in between. Probabilities can be expressed as fractions e.g. 1/6 chance of rolling a 6 on a die, or as percentages, 16.6%, or as decimals, 0.16…

The probability of something not happening is 1 minus the probability of that thing happening.

Probability was defined and given mathematical form in the 17th century. One contribution was the questions the Chevalier de Méré asked the mathematical prodigy Blaise Pascal. Pascal corresponded with his friend, Pierre de Fermat, and they worked out the bases of probability theory.

Expected value is what you can expect to get out of a bet. Bellos takes us on a tour of the usual suspects – rolling dice, tossing coins, and roulette (invented in France).

Payback percentage if you bet £10 at craps, you can expect – over time – to receive an average of about £9.86 back. In other words craps has a payback percentage of 98.6 percent. European roulette has a payback percentage of 97.3 percent. American roulette, 94.7 percent. In other words, gambling is a fancy way of giving your money away. A miserly slot machine has a payback percentage of 85%. The National Lottery has a payback percentage of 50%.
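The roulette figures can be reconstructed from the layout of the wheel. The sketch below assumes the standard rules, which are not spelled out in this review: a bet on a single number pays 35 to 1, a European wheel has 37 pockets and an American wheel has 38. The function name is made up.

```python
from fractions import Fraction


def payback_single_number(pockets, payout_to_one=35):
    """Expected return per £1 staked on a single roulette number.

    Illustrative helper. Assumes the standard 35-to-1 payout: a winning
    bet returns the £1 stake plus £35, and a losing bet returns nothing.
    """
    win_probability = Fraction(1, pockets)
    return win_probability * (payout_to_one + 1)


european = payback_single_number(37)  # pockets 0 to 36
american = payback_single_number(38)  # pockets 0, 00 and 1 to 36

print(round(float(european) * 100, 1))  # 97.3
print(round(float(american) * 100, 1))  # 94.7
```

The extra ‘00’ pocket on the American wheel is all it takes to move the payback from 97.3 to 94.7 percent.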

The law of large numbers The more you play a game of chance, the more likely the results will approach the statistical probability. Toss a coin three times, you might get three heads. Toss a coin a thousand times, the chances are you will get very close to the statistical probability of 50% heads.
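A quick simulation makes the point. This is a sketch using Python’s built-in pseudo-random generator as a stand-in for a fair coin; the function name and the fixed seed (used only so the run can be repeated) are my own choices.

```python
import random


def heads_fraction(tosses, seed=0):
    """Simulate `tosses` fair coin flips and return the fraction of heads.

    Illustrative helper; the seed only makes the run repeatable.
    """
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses


for tosses in (3, 10, 1_000, 100_000):
    # Small samples can stray a long way from 0.5; large ones rarely do.
    print(tosses, heads_fraction(tosses))
```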

The law of very large numbers With a large enough sample, outrageous coincidences become likely.

The gambler’s fallacy The mistaken belief that, if something happens more frequently than normal during a given period, it will happen less frequently in the future (or vice versa). In other words, that a random process becomes less random, and more predictable, the more it is repeated.

The birthday paradox The probability that, in a set of n randomly chosen people, some pair of them will have the same birthday. By the pigeonhole principle, the probability reaches 100% when the number of people reaches 367 (since there are only 366 possible birthdays, including February 29). However, 99.9% probability is reached with just 70 people, and 50% probability with 23 people. (These conclusions are based on the assumption that each day of the year (excluding February 29) is equally probable for a birthday.) In other words you only need a group of 23 people to have an evens chance that two of them share a birthday.
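The figures quoted above are easiest to reach by working out the opposite case, the chance that everybody has a different birthday, and subtracting it from 1. A minimal sketch, assuming 365 equally likely birthdays and using a made-up function name:

```python
def shared_birthday_probability(people, days=365):
    """Chance that at least two of `people` share a birthday.

    Illustrative helper; assumes `days` equally likely birthdays and
    ignores February 29.
    """
    all_different = 1.0
    for i in range(people):
        # Person i must avoid the i birthdays already taken.
        all_different *= (days - i) / days
    return 1 - all_different


for people in (22, 23, 70):
    print(people, round(shared_birthday_probability(people), 4))
```

With 22 people the probability is still a little under a half; the 23rd person tips it over.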

The drunkard’s walk

The difficulty of attaining true randomness and the human addiction to finding meaning in anything.

The distinction between playing strategy (best strategy to win a game) and betting strategy (best strategy to maximise your winnings), not always the same.

Chapter Ten – Situation Normal

Carl Friedrich Gauss, the bell curve, normal distribution aka Gaussian distribution. Normal or Gaussian distribution results in a bell curve. Bellos describes the invention and refinement of the bell curve (he explains that ‘the long tail’ results from a mathematician who envisioned a thin bell curve as looking like two kangaroos facing each other with their long tails heading off in opposite directions). And why

Regression to the mean – if the outcome of an event is determined at least in part by random factors, then an extreme event will probably be followed by one that is less extreme. And recent devastating analyses which show how startlingly random sports achievements are, from leading baseball hitters to Simon Kuper and Stefan Szymanski’s analysis of the form of the England soccer team.

Chapter Eleven – The End of the Line

Two breakthroughs which paved the way for modern i.e. 20th century, maths: the first was the invention of non-Euclidean geometry, specifically the concept of hyperbolic geometry. To picture this draw a triangle on a Pringle. It is recognisably a triangle but its angles do not add up to 180°, therefore it defies, escapes, eludes all the rules of Euclidean geometry, which were designed for flat 2D surfaces.

Bellos introduces us to Daina Taimina, a maths prof at Cornell University, who invented a way of crocheting hyperbolic surfaces. The result looks curly, like curly kale or the surface of coral.

Anyway, the breakaway from flat 2-D Euclidean space led to theories about curved geometry, either convex like a sphere, or hyperbolic like the Pringle. It was this notion of curved space which paved the way for Einstein’s breakthrough ideas in the early 20th century.

The second big breakthrough was Georg Cantor’s discovery that you can have many different types of infinity. Until Cantor the mathematical tradition from the ancient Greeks to Galileo and Newton had fought shy of infinity which threatened to disrupt so many formulae.

Cantor’s breakthrough was to stop thinking about numbers, and instead think of sets. This is demonstrated through the paradoxes of Hilbert’s Hotel. You need to buckle your safety belt to understand it.

Thoughts

This is easily the best book about maths I’ve ever read. It gives you a panoramic history of the subject which starts with innumerate cavemen and takes us to the edge of Einstein’s great discoveries. But Bellos adds to it all kinds of levels and abilities.

He is engaging and candid and funny. He is fantastically authoritative, taking us gently into forests of daunting mathematical theory without placing a foot wrong. He’s a great explainer. He knows a good story when he sees one, and how to tell it engagingly. And in every chapter there is a ‘human angle’ as he describes his own personal meetings and interviews with many of the (living) key players in the world of contemporary maths, games and puzzles.

Like the Ian Stewart book but on a vastly bigger scale, Bellos makes you feel what it is like to be a mathematician, not just interested in nature’s patterns (the basis of Stewart’s book, Nature’s Numbers) but in the beauty of mathematical theories and discoveries for their own sakes. (This comes over very strongly in chapter seven with its description of some of the weirdest and wackiest number sequences dreamed up by the human mind.) I’ve often read scientists describing the beauty of mathematical theories, but Bellos’s book really helps you develop a feel for this kind of beauty.

For me, I think three broad conclusions emerged:

1. Most mathematicians are in it for the fun. Setting yourself, and solving, mathematical puzzles is obviously extremely rewarding. Maths includes the vast territory of puzzles and games, such as the Sudoku and so on he describes in chapter six. Obviously it has all sorts of real-world application in physics, engineering and so on, but Bellos’s book really brings over that a true understanding of maths begins in puzzles, games and patterns, and often remains there for a lifetime. Like everything else maths is now highly professionalised, the property of tenured professors in universities; and yet even to this day – as throughout its history – contributions can be made by enthusiastic amateurs.

2. As he points out repeatedly, many insights which started out as the hobby horses of obsessives, or arcane breakthroughs on the borders of our understanding, and which have been airily dismissed by the professionals, often end up being useful, having applications no-one dreamed of. Either they help unravel aspects of the physical universe undreamed of when they were discovered, or have been useful to human artificers. Thus the development of random number sequences seemed utterly pointless in the 19th century, but now underlies much internet security.

On a profounder note, Bellos expresses the eerie, mystical sense many mathematicians have that it seems so strange, so pregnant with meaning, that so many of these arcane numbers end up explaining aspects of the world their inventors knew nothing of. Ian Stewart has an admirably pragmatic explanation for this: he speculates that nature uses everything it can find in order to build efficient life forms. Or, to be less teleological, over the past 3 and a half billion years, every combination of useful patterns has been tried out. Given this length of time, and the incalculable variety of life forms which have evolved on this planet, it would be strange if every number system conceivable by one of those life forms – humankind – had not been tried out at one time or another.

3. My third conclusion is that, despite John Allen Paulos’s and Bellos’s insistence, I do not live in a world ever-more bombarded by maths. I don’t gamble on anything, and I don’t follow sports – the two biggest popular areas where maths is important – and the third is the twin areas of surveys and opinion polls (55% of Americans believe in alien abductions etc etc) and the daily blizzard of reports (for example, I see in today’s paper that the ‘Number of primary school children at referral units soars’).

I register their existence but they don’t impact on me for the simple reason that I don’t believe any of them. In 1992 every opinion poll said John Major would lose the general election, but he won with a thumping majority. Since then I haven’t believed any poll about anything. For example almost all the opinion polls predicted a win for Remain in the Brexit vote. Why does any sane person believe opinion polls?

And ‘new and shocking’ reports come out at the rate of a dozen a day and, on closer examination, lots of them turn out to be recycled information, or much much more mundane releases of data sets from which journalists are paid to draw the most shocking and extreme conclusions. Some may be of fleeting interest but once you really grasp that the people reporting them to you are paid to exaggerate and horrify, you soon learn to ignore them.

If you reject or ignore these areas – sport, gambling and the news (made up of rehashed opinion polls, surveys and reports) – then unless you’re in a profession which actively requires the sophisticated manipulation of figures, I’d speculate that most of the rest of us barely come into contact with numbers from one day to the next.

I think that’s the answer to Paulos and Bellos when they are in their ‘why aren’t more people mathematically numerate?’ mode. It’s because maths is difficult, and counter-intuitive, and hard to understand and follow, it is a lot of work, it does make your head ache. Even trying to solve a simple binomial equation hurt my brain.

But I think the biggest reason that ‘we’ are so innumerate is simply that – beautiful, elegant, satisfying and thought-provoking though maths may be to the professionals – maths is more or less irrelevant to most of our day to day lives, most of the time.


Related links

Reviews of other science books

Chemistry

Cosmology

The Environment

Genetics and life

Human evolution

Maths

Particle physics

Psychology

Nature’s Numbers by Ian Stewart (1995)

Ian Stewart is a mathematician and prolific author, having written over 40 books on all aspects of maths, as well as publishing several guides to the maths used in Terry Pratchett’s Discworld books, writing half a dozen textbooks for students, and co-authoring a couple of science fiction novels.

Stewart writes in a marvellously clear style but, more importantly, he is interesting: he sees the world in an interesting way, in a mathematical way, and manages to convey the wonder and strangeness and powerful insights which seeing the world in terms of patterns and shapes, numbers and maths, gives you.

He wants to help us see the world as a mathematician sees it, full of clues and information which can lead us to deeper and deeper appreciation of the patterns and harmonies all around us. It makes for a wonderfully illuminating read.

1. The Natural Order

Thus Stewart begins the book by describing just some of nature’s multitude of patterns: the regular movements of the stars in the night sky; the sixfold symmetry of snowflakes; the stripes of tigers and zebras; the recurring patterns of sand dunes; rainbows; the spiral of a snail’s shell; why nearly all flowers have petals arranged in one of the following numbers 5, 8, 13, 21, 34, 55, 89; the regular patterns or ‘rhythms’ made by animals scuttling, walking, flying and swimming.
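The petal numbers in that list are consecutive Fibonacci numbers, where each number is the sum of the previous two. Here is a minimal Python sketch to check this; the function name fibonacci and the cut-off of 100 are my own choices, not Stewart's.

```python
def fibonacci(limit):
    """Return the Fibonacci numbers (starting 1, 1) that do not exceed limit."""
    seq = [1, 1]
    while seq[-1] + seq[-2] <= limit:
        seq.append(seq[-1] + seq[-2])
    return seq

# Petal counts quoted in the review
petal_counts = [5, 8, 13, 21, 34, 55, 89]

fib = fibonacci(100)
print(fib)
print(all(n in fib for n in petal_counts))
```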

2. What Mathematics is For

Mathematics is brilliant at helping us to solve puzzles. It is a more or less systematic way of digging out the rules and structures that lie behind some observed pattern or regularity, and then using those rules and structures to explain what’s going on. (p.16)

Having gotten our attention, Stewart trots through the history of major mathematical discoveries including Kepler discovering that the planets move not in circles but in ellipses; the discovery that the nature of acceleration is ‘not a fundamental quality, but a rate of change’, then Newton and Leibniz inventing calculus to help us work out complex rates of change, and so on.

Two of the main things that maths is for are: 1. providing the tools which let scientists understand what nature is doing; and 2. providing new theoretical questions for mathematicians to explore further. These are handy rules of thumb for distinguishing between, respectively, applied and pure mathematics.

Stewart mentions one of the oddities, paradoxes or thought-provoking things that crops up in many science books, which is the eerie way that good mathematics, mathematics well done, whatever its source and no matter how abstract its origin, eventually turns out to be useful, to be applicable to the real world, to explain some aspect of nature.

Many philosophers have wondered why. Is there a deep congruence between the human mind and the structure of the universe? Did God make the universe mathematically and implant an understanding of maths in us? Is the universe made of maths?

Stewart’s answer is simple and elegant: he thinks that nature exploits every pattern that there is, which is why we keep discovering patterns everywhere. We humans express these patterns in numbers, but nature doesn’t use numbers as such – she uses the patterns and shapes and possibilities which the numbers express or define.

Mendel noticing the numerical relationships with which characteristics of peas are expressed when they are crossbred. The double helix structure of DNA. Computer simulations of the evolution of the eye from an initial mutation creating a patch of skin cells sensitive to light, published by Daniel Nilsson and Susanne Pelger in 1994. Pattern appears wherever we look.

Resonance = the relationship between periodically moving bodies in which their cycles lock together so that they take up the same relative positions at regular intervals. The cycle time is the period of the system. The individual bodies have different periods. The moon’s rotational period is the same as its revolution around the earth, so there is a 1:1 resonance of its orbital and rotational periods.

Mathematics doesn’t just analyse, it can predict, predict how all kinds of systems will work, from the aerodynamics which keep planes flying, to the amount of fertiliser required to increase crop yield, to the complicated calculations which keep communications satellites in orbit round the earth and therefore sustain our internet and mobile phone networks.

Time lags The gap between a new mathematical idea being developed and its practical implementation can be a century or more: it was 17th century interest in the mathematics of vibrating violin strings which led, three hundred years later, to the invention of radio, radar and TV.

3. What Mathematics is About

The word ‘number’ does not have any immutable, God-given meaning. (p.42)

Numbers are the most prominent part of mathematics and everyone is taught arithmetic at school, but numbers are just one type of object that mathematics is interested in.

Stewart outlines the invention of whole numbers, and then of fractions. Some time in the Dark Ages the invention of 0. The invention of negative numbers, then of square roots. Irrational numbers. ‘Real’ numbers.

Whole numbers 1, 2, 3… are known as the natural numbers. If you include zero and the negative whole numbers, the series is known as the integers. Integers and fractions (ratios of integers) taken together are known as the rational numbers. Then there are real numbers and complex numbers. Five systems in total.

But maths is also about operations such as addition, subtraction, multiplication and division. And functions, also known as transformations, rules for transforming one mathematical object into another. Many of these processes can be thought of as things which help to create data structures.

Maths is like a landscape in which similar proofs and theories cluster together to create peaks and troughs.

4. The Constants of Change

Newton’s basic insight was that changes in nature can be described by mathematical processes. Stewart explains how detailed consideration of what happens to a cannonball fired out of a cannon helps us towards Newton’s fundamental law, that force = mass x acceleration.

Newton invented calculus to help work out solutions to moving bodies. Its two basic operations – integration and differentiation – mean that, given one element – force, mass or acceleration – you can work out the other two. Differentiation is the technique for finding rates of change; integration is the technique for ‘undoing’ the effect of differentiation in order to isolate out the initial variables.
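To make the idea of differentiation and ‘undoing’ it concrete, here is a rough numerical sketch in Python. It is my own illustration, not Stewart's: it assumes a body falling from rest under constant gravity (G = 9.81 m/s²), and the helper names derivative and integral are hypothetical.

```python
G = 9.81  # assumed constant gravitational acceleration, in m/s^2

def position(t):
    """Distance fallen after t seconds, starting from rest."""
    return 0.5 * G * t * t

def derivative(f, t, h=1e-5):
    """Approximate the rate of change of f at t (central difference)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def integral(f, a, b, steps=10_000):
    """Approximate the area under f between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width

def velocity(t):
    """Velocity is the rate of change of position."""
    return derivative(position, t)

print(velocity(2.0))                 # should be close to G * 2
print(integral(velocity, 0.0, 2.0))  # should be close to position(2.0)
```

Differentiating position gives velocity; integrating velocity takes you back to position, which is the sense in which one operation undoes the other.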

Calculating rates of change is a crucial aspect of maths, engineering, cosmology and many other areas of science.

5. From Violins to Videos

He gives a fascinating historical recap of how initial investigations into the way a violin string vibrates gave rise to formulae and equations which turned out to be useful in mapping electricity and magnetism, which turned out to be aspects of the same fundamental force, electromagnetism. It was understanding this which underpinned the invention of radio, radar, TV etc and Stewart’s account describes the contributions made by Michael Faraday, James Clerk Maxwell, Heinrich Hertz and Guglielmo Marconi.

Stewart makes the point that mathematical theory tends to start with the simple and immediate and grow ever-more complicated. This is because of a basic approach common in lots of mathematics which is that, you have to start somewhere.

6. Broken Symmetry

A symmetry of an object or system is any transformation that leaves it invariant. (p.87)

There are many types of symmetry. The most important ones are reflections, rotations and translations.

7. The Rhythm of Life

The nature of oscillation and Hopf bifurcation (if a simplified system wobbles, then so must the complex system it is derived from) leads into a discussion of how animals – specifically animals with legs – move, which turns out to be by staggered or syncopated oscillations, oscillations of muscles triggered by neural circuits in the brain.

This is a subject Stewart has written about elsewhere and is something of an expert on. Thus he tells us that the seven types of quadrupedal gait are: the trot, pace, bound, walk, rotary gallop, transverse gallop, and canter.

8. Do Dice Play God?

This chapter covers Stewart’s take on chaos theory.

Chaotic behaviour obeys deterministic laws, but is so irregular that to the untrained eye it looks pretty much random. Chaos is not complicated, patternless behaviour; it is much more subtle. Chaos is apparently complicated, apparently patternless behaviour that actually has a simple, deterministic explanation. (p.130)

19th century scientists thought that, if you knew the starting conditions, and then the rules governing any system, you could completely predict the outcomes. In the 1970s and 80s it became increasingly clear that this was wrong. It is impossible because you can never define the starting conditions with complete certainty.

Thus all real world behaviours are subject to ‘sensitivity to initial conditions’. From minuscule divergences at the starting point, cataclysmic differences may eventually emerge in mature systems.

Stewart goes on to explain the concept of ‘phase space’ developed by Henri Poincaré: this is an imaginary mathematical space that represents all possible motions in a given dynamic system. The phase space is the 3-D place in which you plot the behaviour in order to create the phase portrait. Instead of having to define a formula and worrying about identifying every number of the behaviour, the general shape can be determined.

Much use of phase portraits has shown that dynamic systems tend to have set shapes which emerge and which systems move towards. These are called attractors.

9. Drops, Dynamics and Daisies

The book ends by drawing some philosophical conclusions.

Chaos theory has all sorts of implications but the one Stewart closes on is this: the world is not chaotic; if anything, it is boringly predictable. And at the level of basic physics and maths, the laws which seem to underpin it are also schematic and simple. And yet, what we are only really beginning to appreciate is how complicated things are in the middle.

It is as if nature can only get from simple laws (like Newton’s incredibly simple law of gravity) to fairly simple outcomes (the orbit of the planets) via almost incomprehensibly complex processes.

To end, Stewart gives us three examples of the way apparently ‘simple’ phenomena in nature derive from stupefying complexity:

  • what exactly happens when a drop of water falls off a tap
  • computer modelling of the growth of fox and rabbit populations
  • why petals on flowers are arranged in numbers derived from the Fibonacci sequence

In all three cases the underlying principles seem to be resolvable into easily stated laws and functions – and in our everyday lives we see water dropping off taps or flowerheads all the time – and yet the intermediate steps between simple mathematical principles and real world embodiment turn out to be mind-bogglingly complex.

Coda: Morphomatics

Stewart ends the book with an epilogue speculating, hoping and wishing for a new kind of mathematics which incorporates chaos theory and the other elements he’s discussed – a theory and study of form, which takes everything we already know about mathematics and seeks to work out how the almost incomprehensible complexity we are discovering in nature gives rise to all the ‘simple’ patterns which we see around us. He calls it morphomatics.

Request

If you have enjoyed reading this blog post, can you leave a comment and let me know why.



Maths ideas from John Allen Paulos

There’s always enough random success to justify anything to someone who wants to believe.
(Innumeracy, p.33)

It’s easier and more natural to react emotionally than it is to deal dispassionately with statistics or, for that matter, with fractions, percentages and decimals.
(A Mathematician Reads the Newspaper p.81)

I’ve just read two of John Allen Paulos’s popular books about maths, A Mathematician Reads the Newspaper: Making Sense of the Numbers in the Headlines (1995) and Innumeracy: Mathematical Illiteracy and Its Consequences (1988).

My reviews tended to focus on the psychological, logical and cognitive errors which Paulos finds so distressingly common on modern TV and in newspapers, among politicians and commentators, and in every walk of life. I focused on these for the simple reason that I didn’t understand the way he explained most of his mathematical arguments.

I also criticised the style and presentation of the books a bit, which I found meandering, haphazard and so quite difficult to follow, especially since he was packing in so many difficult mathematical concepts.

Looking back at my reviews I realise I spent so much time complaining that I missed out promoting and explaining large chunks of the mathematical concepts he describes (sometimes at length, sometimes only in throwaway references).

This blog post is designed to give a list and definitions of the mathematical principles which John Allen Paulos describes and explains in these two books.

The concepts appear, in the list below, in the same order as they crop up in the books.

1. Innumeracy: Mathematical Illiteracy and Its Consequences (1988)

The multiplication principle If some choice can be made in M different ways and some subsequent choice can be made in N different ways, then there are M x N different ways the choices can be made in succession. If a woman has 5 blouses and 3 skirts she has 5 x 3 = 15 possible combinations. If I roll two dice, there are 6 x 6 = 36 possible combinations.

If, however, I want the second category to exclude the option which occurred in the first category, the second number is reduced by one. If I roll two dice, there are 6 x 6 = 36 possible combinations. But the number of outcomes where the number on the second die differs from the first one is 6 x 5. The number of outcomes where the faces of three dice differ is 6 x 5 x 4.
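These counts are small enough to check by brute force. A minimal Python sketch, simply listing every outcome and counting; the variable names are my own.

```python
from itertools import product

faces = range(1, 7)

# Every ordered outcome of rolling two dice
two_dice = list(product(faces, repeat=2))

# Outcomes where the second die differs from the first
two_differ = [roll for roll in two_dice if roll[0] != roll[1]]

# Outcomes where all three dice show different faces
three_differ = [roll for roll in product(faces, repeat=3) if len(set(roll)) == 3]

# 5 blouses combined with 3 skirts
outfits = list(product(range(5), range(3)))

print(len(two_dice), len(two_differ), len(three_differ), len(outfits))
```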

If two events are independent in the sense that the outcome of one event has no influence on the outcome of the other, then the probability that they will both occur is computed by multiplying the probabilities of the individual events. The probability of getting two heads in two flips of a coin is ½ x ½ = ¼ which can be written (½)². The probability of five heads in a row is (½)⁵.

The probability that an event doesn’t occur is 1 minus the probability that it will occur. If there’s a 20% chance of rain, there’s an 80% chance it won’t rain. Since a 20% chance can also be expressed as 0.2, we can say there is a 0.2 chance it will rain and a 1 – 0.2 = 0.8 chance it won’t rain.
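Both rules, multiplying independent probabilities and subtracting from 1, can be sketched in a few lines of Python. I use exact fractions to avoid rounding; the function names are my own.

```python
from fractions import Fraction

def prob_all_heads(flips):
    """Probability that a fair coin comes up heads on every one of `flips` tosses."""
    return Fraction(1, 2) ** flips

def prob_not(p):
    """Probability that an event with probability p does NOT occur."""
    return 1 - p

print(prob_all_heads(2))          # two heads in a row
print(prob_all_heads(5))          # five heads in a row
print(prob_not(Fraction(1, 5)))   # no rain, given a 20% chance of rain
```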

Binomial probability distribution arises whenever a procedure or trial may result in a ‘success’ or ‘failure’ and we are interested in the probability of obtaining R successes from N trials.

Dirichlet’s Box Principle aka the pigeonhole principle Given n boxes and m>n objects, at least one box must contain more than one object. If the postman has 21 letters to deliver to 20 addresses he knows that at least one address will get at least two letters.

Expected value The expected value of a quantity is the average of its values weighted according to their probabilities. If a quarter of the time a quantity equals 2, a third of the time it equals 6, another third of the time it equals 15, and the remaining twelfth of the time it equals 54, then its expected value is 12. (2 x ¼) + (6 x 1/3) + (15 x 1/3) + (54 x 1/12) = 12.
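The same sum, written out as a small Python sketch with exact fractions. The list simply pairs each value with its probability as given above.

```python
from fractions import Fraction

# (value, probability) pairs from the example above
outcomes = [
    (2, Fraction(1, 4)),
    (6, Fraction(1, 3)),
    (15, Fraction(1, 3)),
    (54, Fraction(1, 12)),
]

# The probabilities must cover every case, so they add up to 1
total_probability = sum(p for _, p in outcomes)

# Expected value: each value weighted by its probability
expected_value = sum(value * p for value, p in outcomes)

print(total_probability, expected_value)
```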

Conditional probability Unless the events A and B are independent, the probability of A is different from the probability of A given that B has occurred. If the event of interest is A and the event B is known or assumed to have occurred, ‘the conditional probability of A given B’, or ‘the probability of A under the condition B’, is usually written as P(A | B), or sometimes P_B(A) or P(A / B).

For example, the probability that any given person has a cough on any given day may be only 5%. But if we know that the person has a cold, then they are much more likely to have a cough. The conditional probability of someone with a cold having a cough might be 75%. So the probability of any member of the public having a cough is 5%, but the probability of any member of the public who has a cold having a cough is 75%. P(Cough) = 5%; P(Cough | Sick) = 75%

The law of large numbers is a principle of probability according to which the frequencies of events with the same likelihood of occurrence even out, given enough trials or instances. As the number of experiments increases, the actual ratio of outcomes will converge on the theoretical, or expected, ratio of outcomes.

For example, if a fair coin (where heads and tails come up equally often) is tossed 1,000,000 times, about half of the tosses will come up heads, and half will come up tails. The heads-to-tails ratio will be extremely close to 1:1. However, if the same coin is tossed only 10 times, the ratio will likely not be 1:1, and in fact might come out far different, say 3:7 or even 0:10.
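You can watch this happen with a simulated coin. This is a rough sketch in Python using its built-in pseudo-random generator as a stand-in for a fair coin; the seed and the function name heads_fraction are my own choices, and the exact figures printed will depend on the seed.

```python
import random

def heads_fraction(n_tosses, rng):
    """Toss a simulated fair coin n_tosses times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 1_000_000):
    print(n, heads_fraction(n, rng))
```

The fraction for 10 tosses can land well away from 0.5, while the fraction for a million tosses should sit very close to it.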

The gambler’s fallacy a misunderstanding of probability: the mistaken belief that because a coin has come up heads a number of times in succession, it becomes more likely to come up tails. Over a very large number of instances the law of large numbers comes into play; but not in a handful.

Regression to the mean in any series with complex phenomena that are dependent on many variables, where chance is involved, extreme outcomes tend to be followed by more moderate ones. Or: the tendency for an extreme value of a random quantity whose values cluster around an average to be followed by a value closer to the average or mean.

Poisson probability distribution measures the probability that a certain number of events occur within a certain period of time. The events need a) to be independent of each other and b) to occur at a known average rate. The Poisson distribution can be used to work out things like the number of cars that pass on a certain road in a certain time, or the number of telephone calls a call center receives per minute.
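The standard Poisson formula says the probability of exactly k events, when the average rate is λ per period, is λ^k x e^(-λ) / k!. A minimal Python sketch follows; the rate of 3 calls per minute is a number I have invented for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, rate):
    """Probability of exactly k events when the average number per period is `rate`."""
    return rate ** k * exp(-rate) / factorial(k)

calls_per_minute = 3  # hypothetical average rate, for illustration only

for k in range(7):
    print(k, round(poisson_pmf(k, calls_per_minute), 4))
```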

Bayes’ Theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For example, if cancer is related to age, then, using Bayes’ theorem, a person’s age can be used to more accurately assess the probability that they have cancer, compared to the assessment of the probability of cancer made without knowledge of the person’s age.
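Bayes’ theorem is usually written P(A | B) = P(B | A) x P(A) / P(B). Here is a minimal Python sketch that reuses the cough and cold figures from the conditional probability example above. The 4% chance of having a cold on a given day is a figure I have made up purely for illustration.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A | B) from P(B | A), P(A) and P(B)."""
    return p_b_given_a * p_a / p_b

p_cough = Fraction(5, 100)              # from the example above
p_cough_given_cold = Fraction(75, 100)  # from the example above
p_cold = Fraction(4, 100)               # ASSUMPTION: invented for illustration

# Probability that someone who is coughing has a cold
p_cold_given_cough = bayes(p_cough_given_cold, p_cold, p_cough)
print(p_cold_given_cough)
```

Under these assumed figures, knowing that someone has a cough raises the probability that they have a cold from 4% to 60%.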

Arrow’s impossibility theorem (1951) no rank-order electoral system can be designed that always satisfies these three “fairness” criteria:

  • If every voter prefers alternative X over alternative Y, then the group prefers X over Y.
  • If every voter’s preference between X and Y remains unchanged, then the group’s preference between X and Y will also remain unchanged (even if voters’ preferences between other pairs like X and Z, Y and Z, or Z and W change).
  • There is no “dictator”: no single voter possesses the power to always determine the group’s preference.

The prisoner’s dilemma (1951) Two criminals are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. The prosecutors lack sufficient evidence to convict the pair on the principal charge, but they have enough to convict both on a lesser charge. The prosecutors offer each prisoner a bargain. Each prisoner is given the opportunity either to betray the other by testifying that the other committed the crime, or to cooperate with the other by remaining silent. The offer is:

  • If A and B each betray the other, each of them serves two years in prison
  • If A betrays B but B remains silent, A will be set free and B will serve three years in prison (and vice versa)
  • If A and B both remain silent, both of them will only serve one year in prison (on the lesser charge).

Prisoner’s dilemma graphic. Source: Wikipedia
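The reason it is a dilemma is that betraying is the better choice for each prisoner whatever the other one does, even though both staying silent gives a better joint result. A minimal Python sketch using the sentences listed above; the table layout and function name are my own.

```python
# years[(my_choice, other_choice)] = (my_sentence, other_sentence), in years
years = {
    ("betray", "betray"): (2, 2),
    ("betray", "silent"): (0, 3),
    ("silent", "betray"): (3, 0),
    ("silent", "silent"): (1, 1),
}

def best_response(other_choice):
    """Return the choice that minimises my own sentence, given the other's choice."""
    return min(("betray", "silent"), key=lambda mine: years[(mine, other_choice)][0])

for other in ("betray", "silent"):
    print("If the other prisoner chooses", other, "my best reply is", best_response(other))
```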

Binomial probability Binomial means it has one of only two outcomes such as heads or tails. A binomial experiment is one that possesses the following properties:

  • The experiment consists of n repeated trials
  • Each trial results in an outcome that may be classified as a success or a failure (hence the name, binomial)
  • The probability of a success, denoted by p, remains constant from trial to trial and repeated trials are independent.

The number of successes X in n trials of a binomial experiment is called a binomial random variable. The probability distribution of the random variable X is called a binomial distribution.
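The standard formula for the binomial distribution is P(X = k) = C(n, k) x p^k x (1 - p)^(n - k), where C(n, k) is the number of ways of choosing which k of the n trials are successes. A minimal Python sketch, with coin tosses as the example (math.comb needs Python 3.8 or later):

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials, each with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Fair coin: probability of exactly 2 heads in 4 tosses, and of 5 heads in 5 tosses
print(binomial_pmf(2, 4, 0.5))
print(binomial_pmf(5, 5, 0.5))
```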

Type I and type II errors Type I error is where a true hypothesis is rejected. Type II error is where a false hypothesis is accepted.

Confidence interval Used in surveys, the confidence interval is a range of values, above and below a finding, in which the actual value is likely to fall. The confidence interval represents the accuracy or precision of an estimate.

Central limit theorem In probability theory, the central limit theorem (CLT) establishes that, in some situations, when independent random variables are added, their properly normalized sum tends toward a normal distribution (informally a “bell curve”) even if the original variables themselves are not normally distributed. OR: the sum or average of a large bunch of measurements follows a normal curve even if the individual measurements themselves do not. OR: averages and sums of non-normally distributed quantities will nevertheless themselves have a normal distribution. OR:

Under a wide variety of circumstances, averages (or sums) of even non-normally distributed quantities will nevertheless have a normal distribution (p.179)

Regression analysis There are many types of regression analysis, but at their core they all examine the influence of one or more independent variables on a dependent variable. Performing a regression helps you to judge which factors matter most, which factors can be ignored, and how these factors influence each other. In order to understand regression analysis you must comprehend the following terms:

  • Dependent Variable: This is the factor you’re trying to understand or predict.
  • Independent Variables: These are the factors that you hypothesize have an impact on your dependent variable.

Correlation is not causation a principle which cannot be repeated too often.

Gaussian distribution Gaussian distribution (also known as normal distribution) is a bell-shaped curve, and it is assumed that during any measurement values will follow a normal distribution with an equal number of measurements above and below the mean value.

The normal distribution is the most important probability distribution in statistics because it fits so many natural phenomena. For example, heights, blood pressure, measurement error, and IQ scores follow the normal distribution.

Statistical significance A result is statistically significant if it is sufficiently unlikely to have occurred by chance.

2. A Mathematician Reads the Newspaper: Making Sense of the Numbers in the Headlines

Incidence matrices In mathematics, an incidence matrix is a matrix that shows the relationship between two classes of objects. If the first class is X and the second is Y, the matrix has one row for each element of X and one column for each element of Y. The entry in row x and column y is 1 if x and y are related (called incident in this context) and 0 if they are not. Paulos creates an incidence matrix to show

Complexity horizon On the analogy of an ‘event horizon’ in physics, Paulos suggests this as the name for levels of complexity in society around us beyond which mathematics cannot go. Some things just are too complex to be understood using any mathematical tools.

Nonlinear complexity Complex systems often have nonlinear behavior, meaning they may respond in different ways to the same input depending on their state or context. In mathematics and physics, nonlinearity describes systems in which a change in the size of the input does not produce a proportional change in the size of the output.

The Banzhaf power index is a power index defined by the probability of changing an outcome of a vote where voting rights are not necessarily equally divided among the voters or shareholders. To calculate the power of a voter using the Banzhaf index, list all the winning coalitions, then count the critical voters. A critical voter is a voter who, if he changed his vote from yes to no, would cause the measure to fail. A voter’s power is measured as the fraction of all swing votes that he could cast. There are several algorithms for calculating the power index.

Vector field may be thought of as a rule f saying that ‘if an object is currently at a point x, it moves next to point f(x), then to point f(f(x)), and so on’. The rule f is non-linear if the variables involved are squared or multiplied together, and the sequence of the object’s positions is its trajectory.

Chaos theory (1960) is a branch of mathematics focusing on the behavior of dynamical systems that are highly sensitive to initial conditions.

‘Chaos’ is an interdisciplinary theory stating that within the apparent randomness of chaotic complex systems, there are underlying patterns, constant feedback loops, repetition, self-similarity, fractals, self-organization, and reliance on programming at the initial point known as sensitive dependence on initial conditions.

The butterfly effect describes how a small change in one state of a deterministic nonlinear system can result in large differences in a later state, e.g. a butterfly flapping its wings in Brazil can cause a hurricane in Texas.
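A standard toy example of this is the logistic map, a non-linear rule f(x) = r x (1 - x) applied over and over to get a trajectory. The sketch below is my own illustration in Python, not taken from Paulos: the value r = 4, the starting point 0.2 and the tiny nudge of 1e-10 are all assumptions chosen for the demonstration.

```python
def logistic_map(x, r=4.0):
    """One step of the non-linear rule f(x) = r * x * (1 - x)."""
    return r * x * (1 - x)

def trajectory(x0, steps):
    """Return the positions x0, f(x0), f(f(x0)), ... for the given number of steps."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_map(xs[-1]))
    return xs

a = trajectory(0.2, 60)
b = trajectory(0.2 + 1e-10, 60)  # almost identical starting point

# How far apart the two trajectories are at each step
gap = [abs(x - y) for x, y in zip(a, b)]
for step in (0, 10, 20, 30, 40, 50, 60):
    print(step, gap[step])
```

The two runs start out indistinguishable, but the gap grows step by step until the trajectories bear no resemblance to each other.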

Linear models are used more often not because they are more accurate but because they are easier to handle mathematically.

All mathematical systems have limits, and even chaos theory cannot predict even relatively simple nonlinear situations.

Zipf’s Law states that given a large sample of words used, the frequency of any word is inversely proportional to its rank in the frequency table. So word number n has a frequency proportional to 1/n. Thus the most frequent word will occur about twice as often as the second most frequent word, three times as often as the third most frequent word, etc. For example, in one sample of words in the English language, the most frequently occurring word, “the”, accounts for nearly 7% of all the words (69,971 out of slightly over 1 million). True to Zipf’s Law, the second-place word “of” accounts for slightly over 3.5% of words (36,411 occurrences), followed by “and” (28,852). Only about 135 words are needed to account for half of all the words in a large sample.
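The three word counts quoted above can be compared with what Zipf’s Law would predict. A minimal Python sketch; the function name zipf_prediction is my own.

```python
def zipf_prediction(top_count, rank):
    """Count predicted by Zipf's Law for the word at `rank`, given the count of the top word."""
    return top_count / rank

# Observed counts quoted above, in rank order
observed = [("the", 69_971), ("of", 36_411), ("and", 28_852)]

top_count = observed[0][1]
for rank, (word, count) in enumerate(observed, start=1):
    predicted = zipf_prediction(top_count, rank)
    print(word, count, round(predicted), round(count / predicted, 2))
```

The fit is close for “of” and looser for “and”, which is a reminder that the law is an approximation rather than an exact rule.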

Benchmark estimates Benchmark numbers are numbers against which other numbers or quantities can be estimated and compared. Benchmark numbers are usually multiples of 10 or 100.

Non standard models Almost everyone, mathematician or not, is comfortable with the standard model (N : +, ·) of arithmetic. Less familiar, even among logicians, are the non-standard models of arithmetic.

The S-curve A sigmoid function is a mathematical function having a characteristic “S”-shaped curve or sigmoid curve. Often, sigmoid function refers to the special case of the logistic function, defined by the formula:

S(x) = 1 / (1 + e^(-x))

This curve, sometimes called the logistic curve is extremely widespread: it appears to describe the growth of entities as disparate as Mozart’s symphony production, the rise of airline traffic, and the building of Gothic cathedrals (p.91)
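For anyone who wants to see the S-shape in numbers, here is a minimal Python sketch of the standard logistic function, S(x) = 1 / (1 + e^(-x)). The sample points are my own choice.

```python
from math import exp

def logistic(x):
    """Standard logistic function: climbs from near 0 to near 1 in an S-shape."""
    return 1 / (1 + exp(-x))

# Slow start, rapid growth in the middle, then levelling off
for x in (-6, -3, -1, 0, 1, 3, 6):
    print(x, round(logistic(x), 4))
```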

Differential calculus The study of rates of change, rates of rates of change, and the relations among them.

Algorithm complexity is a measure based on the length of the shortest program (algorithm) needed to generate a given sequence (p.123)

Chaitin’s theorem states that every computer, every formalisable system, and every human production is limited; there are always sequences that are too complex to be generated, outcomes too complex to be predicted, and events too dense to be compressed (p.124)

Simpson’s paradox (1951) A phenomenon in probability and statistics, in which a trend appears in several different groups of data but disappears or reverses when these groups are combined.
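A small worked example makes the reversal easier to believe. The Python sketch below uses illustrative counts of the kind often used to demonstrate the paradox (two treatments, patients split into two groups); they are not taken from Paulos, and the names are my own.

```python
from fractions import Fraction

# results[treatment][group] = (successes, patients) -- illustrative counts only
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(treatment, group):
    """Success rate of a treatment within one group."""
    successes, patients = results[treatment][group]
    return Fraction(successes, patients)

def overall_rate(treatment):
    """Success rate of a treatment with both groups combined."""
    successes = sum(s for s, _ in results[treatment].values())
    patients = sum(n for _, n in results[treatment].values())
    return Fraction(successes, patients)

for group in ("small", "large"):
    print(group, float(rate("A", group)), float(rate("B", group)))
print("combined", float(overall_rate("A")), float(overall_rate("B")))
```

Treatment A does better within each group, yet treatment B does better once the groups are combined, because A was mostly given to the harder cases.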

The amplification effect of repeatedly playing the same game, rolling the same dice or tossing the same coin.



A Mathematician Reads the Newspaper: Making Sense of the Numbers in the Headlines by John Allen Paulos (1995)

Always be smart. Seldom be certain. (p.201)

Mathematics is not primarily a matter of plugging numbers into formulas and performing rote computations. It is a way of thinking and questioning that may be unfamiliar to many of us, but is available to almost all of us. (p.3)

John Allen Paulos

John Allen Paulos is an American professor of mathematics who came to wider fame with publication of his short (130-page) primer, Innumeracy: Mathematical Illiteracy and its Consequences, published in 1988.

It was followed by Beyond Numeracy: Ruminations of a Numbers Man in 1991 and this book, A Mathematician Reads the Newspaper in 1995.

Structure

The book is made up of about 50 short chapters. He explains that each one of them will take a topic in the news in 1993 and 1994 and show how it can be analysed and understood better using mathematical tools.

The subjects of the essays are laid out under the same broad headings that you’d encounter in a newspaper, with big political stories at the front, giving way to:

  • Local, business and social issues
  • Lifestyle, spin and soft news
  • Science, medicine and the environment
  • Food, book reviews, sports and obituaries

Response

The book is disappointing in all kinds of ways.

First and foremost, he does not look at specific stories. All the headlines are invented. Each 4 or 5-page essay may or may not call in aspects of various topics in the news, but they do not look at one major news story and carefully deconstruct how it has been created and publicised in disregard of basic mathematics and probability and statistics. (This alone is highly suggestive of the possibility that, despite all his complaints to the contrary, specific newspaper stories where specific mathematical howlers are made and can be corrected are, in fact, surprisingly rare.)

The second disappointment is that, even though these essays are very short, they cannot stay focused on one idea or example for much more than a page. I hate to say it and I don’t mean to be rude, but Paulos’s text has some kind of attention deficit disorder: the essays skitter all over the place, quickly losing whatever thread they ever had in a blizzard of references to politics, baseball, pseudoscience and a steady stream of bad jokes. He is so fond of digressions, inserts, afterthoughts and tangents that it is often difficult to say what any given essay is about.

I was hoping that each essay would take a specific news story and show how journalists had misunderstood the relevant data and maths to get it wrong, and would then show the correct way to analyse and interpret it. I was hoping that the 50 or so examples would have been carefully chosen to build up for the reader an armoury of techniques of arithmetic, probability, calculus, logarithms and whatever else is necessary to immediately spot, deconstruct and correct articles with bad maths in them.

Nope. Not at all.

Lani ‘Quota Queen’ Guinier

Take the very first piece, Lani ‘Quota Queen’ Guinier. For a start he doesn’t tell us who Lani ‘Quota Queen’ Guinier is. I deduce from his introduction that she was President Clinton’s nomination for the post of assistant attorney general for civil rights. We can guess, then, that the nickname ‘quota queen’ implies she was a proponent of quotas, though whether for black people, women or what is not explained.

Why not?

Paulos introduces us to the Banzhaf power index, devised in 1965 by lawyer John F. Banzhaf.

The Banzhaf power index of a group, party or person is defined to be the number of ways in which that group, party or person can change a losing coalition into a winning coalition or vice versa. (p.10)

He gives examples of companies where three or four shareholders hold different percentages of voting rights and shows how some coalitions of shareholders will always have decisive voting rights, whereas some shareholders never will (these are called dummies), while even quite small shareholders can hold disproportionate power. For example, in a situation where three shareholders hold 45%, 45% and 10% of the shares, the 10% party can often have the decisive say. In a 45%, 45%, 8% and 2% split, the 2% shareholder is the dummy.
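The two shareholder examples can be checked with a few lines of Python. This is only a minimal sketch and not anything Paulos prints: the function name `banzhaf_counts` is my own, and I am assuming that a motion passes with a simple majority of 51% or more. It counts, for each shareholder, the winning coalitions which would become losing if that shareholder walked away. Paulos's definition also counts the swings in the other direction, which doubles every figure without changing the proportions.

```python
from itertools import combinations


def banzhaf_counts(weights, quota):
    """For each voter, count the winning coalitions in which they are critical.

    A voter is critical if the coalition wins with them and loses without them.
    """
    n = len(weights)
    counts = [0] * n
    for size in range(1, n + 1):
        for coalition in combinations(range(n), size):
            total = sum(weights[i] for i in coalition)
            if total < quota:
                continue  # a losing coalition: nobody in it is critical
            for i in coalition:
                if total - weights[i] < quota:
                    counts[i] += 1
    return counts


# Assumed rule: a motion needs at least 51% of the shares to pass.
three_way = banzhaf_counts([45, 45, 10], quota=51)
four_way = banzhaf_counts([45, 45, 8, 2], quota=51)
print(three_way)  # [2, 2, 2]
print(four_way)   # [4, 4, 4, 0]
```

Under that assumption the 10% shareholder has exactly as much power as each of the 45% shareholders, while in the four-way split the 8% shareholder matches the big two and the 2% shareholder is never critical at all, which is what makes it the dummy.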

He then moves on to consider voting systems in some American states, including: cumulative voting, systems where votes don’t count as 1 but are proportionate to population, Borda counts (where voters rank the candidates and award progressively more points to those higher up the rankings), approval voting (where voters have as many votes as they want and can vote for as many candidates as they approve of), before going on to conclude that all voting systems have their drawbacks.
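To make the Borda count concrete, here is a small sketch. The ballots are invented for illustration, and I am assuming one common scoring convention (with three candidates, 2 points for first place, 1 for second, 0 for last); Paulos may use a different one.

```python
def borda_count(ballots):
    """Score each candidate: with n candidates, first place earns n - 1
    points, second place n - 2, and so on down to 0 for last place."""
    scores = {}
    for ranking in ballots:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] = scores.get(candidate, 0) + (n - 1 - position)
    return scores


# Hypothetical electorate: three voters rank A > B > C, two rank B > C > A.
ballots = [["A", "B", "C"]] * 3 + [["B", "C", "A"]] * 2
scores = borda_count(ballots)
print(scores)  # {'A': 6, 'B': 7, 'C': 2}
```

A has the most first-place votes (three out of five) and would win a simple first-past-the-post election, but B wins the Borda count. Same ballots, different winner, which is a small instance of the point that every voting system has its drawbacks.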

The essay ends with a typical afterthought, a one-paragraph coda suggesting how the Supreme Court could end up being run by a cabal of just three judges. There are nine judges on the U.S. Supreme Court. Imagine (key word for Paulos) that a group of five judges agree to always discuss issues among themselves first, before the vote of the entire nine, and that they decide to always vote according to whatever the majority (3) of the five decide. Then imagine that a sub-group of just three of these judges go away and secretly decide that, within the group of five, they will always vote together. Thus they will dictate the outcome of every Supreme Court decision.
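The cabal argument can be checked by brute force. The sketch below is my own reading of the scenario as summarised above, not Paulos's working: the trio votes as a bloc inside the caucus of five, the caucus then votes as a bloc in the full court, and the other six judges vote however they like. The function and variable names are invented for illustration.

```python
from itertools import product


def court_outcome(trio_choice, other_two, outside_four):
    """Return the full court's decision (True = yes, False = no)."""
    # Step 1: the caucus of five. The trio votes as a bloc; the other two
    # caucus members vote sincerely. A majority of five needs 3 votes.
    caucus_votes = [trio_choice] * 3 + list(other_two)
    caucus_decision = sum(caucus_votes) >= 3
    # Step 2: the full court of nine. All five caucus members vote the
    # caucus line; the four outsiders vote sincerely. A majority needs 5.
    full_votes = [caucus_decision] * 5 + list(outside_four)
    return sum(full_votes) >= 5


# Try every combination of views held by the six judges outside the trio.
trio_always_wins = all(
    court_outcome(trio, others, outsiders) == trio
    for trio in (True, False)
    for others in product((True, False), repeat=2)
    for outsiders in product((True, False), repeat=4)
)
print(trio_always_wins)  # True
```

Whatever the other six think, the court's decision always matches the trio's, so under these assumptions three judges out of nine do control every outcome.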

So:

1. I had no idea who Lani ‘Quota Queen’ Guinier was or, more precisely, I had to do a bit of detective work to figure it out, and still wasn’t utterly sure.

2. This is a very sketchy introduction to the issue of democratic voting systems. This is a vast subject, which Paulos skates over quickly and thinly.

Thus, in these four and a bit pages you have the characteristic Paulos experience of feeling you are wandering all over the place, not quite at random, but certainly not in a carefully planned sequential way designed to explore a topic thoroughly and reach a conclusion. You are introduced to a number of interesting ideas, with some maths formulae, but not in enough detail or at sufficient length to really understand them. And because he’s not addressing any particular newspaper report or article, there are no particular misconceptions to clear up: the essay is a brief musing, a corralling of thoughts on an interesting topic.

This scattergun approach characterises the whole book.

Psychological availability and anchoring effects

The second essay is titled Psychological availability and anchoring effects. He explains what the availability error, the anchor effect and the halo effect are. If this is the first time you’ve come across these notions, they’re powerful new ideas. But I recently reread Irrationality by Stuart Sutherland which came out three years before Paulos’s book and spends over three hundred pages investigating these and all the other cognitive biases which afflict mankind in vastly more depth than Paulos, with many more examples. Next to it, Paulos’s three-minute essay seemed sketchy and superficial.

General points

Rather than take all 50 essays to pieces, here are notes on what I actually did learn. Note that almost none of it was about maths; instead it consisted of general-purpose cautions about how the news media work, and how to counter their errors of logic. In fact, all of it could have come from a media studies course without any maths at all:

  • almost all ‘news’ reinforces conventional wisdom
  • because they’re so brief, almost all headlines must rely on readers’ existing assumptions and prejudices
  • almost all news stories relate something new back to similar examples from the past, even when the comparison is inappropriate, again reinforcing conventional wisdom and failing to recognise the genuinely new
  • all economic forecasts are rubbish: this is because economics (like the weather and many other aspects of everyday life) is a non-linear system. Chaos theory shows that non-linear systems are highly sensitive to even minuscule differences in starting conditions, which has been translated into pop culture as the Butterfly Effect
  • the same goes for ‘futurologists’: the further ahead they look, the less reliable their predictions
  • the news is deeply biased by always assuming human agency is at work in any outcome: if any disaster happens anywhere the newspapers always go searching for a culprit; in the present Brexit crisis lots of news outlets are agreeing to blame Theresa May. But often things happen at random or as an accumulation of unpredictable factors. Humans are not good at acknowledging the role of chance and randomness.

There is a tendency to look primarily for culpability and conflicts of human will rather than at the dynamics of a natural process. (p.160)

  • Hence so many newspapers endlessly playing the blame game. The Grenfell Tower disaster was, first and foremost, an accident in the literal sense of ‘an unfortunate incident that happens unexpectedly and unintentionally, typically resulting in damage or injury’ – but you won’t find anybody who doesn’t fall in with the prevailing view that someone must be to blame. There is always someone to blame. We live in a Blame Society.
  • personalising beats stats, data or probability: nothing beats ‘the power of dramatic anecdote’ among the innumerate: ‘we all tend to be unduly swayed by the dramatic, the graphic, the visceral’ (p.82)
  • if you combine human beings’ tendency to personalise everything, and to look for someone to blame, you come up with Donald Trump, who dominates every day’s news
  • so much is happening all the time, in a world with more people and incidents than ever before, in which we are bombarded with more information via more media than ever before – that it would be extraordinary if all manner of extraordinary coincidences, correspondences and correlations didn’t happen all the time
  • random events can sometimes present a surprisingly ordered appearance
  • because people read meaning into absolutely everything, the huge number of coincidences and correlations are wrongly interpreted as meaningful
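The point in the list above about non-linear systems being highly sensitive to minuscule differences in starting conditions can be seen in a toy example. The sketch below uses the logistic map, a standard textbook illustration of chaos which I have chosen myself; it is not an example taken from Paulos and it is certainly not a model of the economy. The parameter r = 4.0, the starting value 0.2 and the nudge of one part in ten billion are all arbitrary assumptions.

```python
def logistic_trajectory(x0, r=4.0, steps=100):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


# Two starting points that differ by one part in ten billion.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)

# Distance between the two trajectories at each step.
gap = [abs(x - y) for x, y in zip(a, b)]
print(gap[0], max(gap))
```

The two runs begin indistinguishably close, yet within a few dozen steps the gap between them grows to the same order of magnitude as the values themselves, at which point a forecast based on one tells you nothing about the other.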

Tips and advice

I was dismayed at the poor quality of many of the little warnings which each chapter ends with. Although Paulos warns against truisms (on page 54) his book is full of them.

Local is not what it used to be, and we shouldn’t be surprised at how closely we’re linked. (p.55)

In the public realm, often the best we can do is to stand by and see how events unfold. (p.125)

Chapter three warns us that predictions about complex systems (the weather, the economy, big wars) are likely to be more reliable the simpler the system they’re predicting, and the shorter the period they cover. Later he says we should be sceptical about all long-term predictions by politicians, economists and generals.

It didn’t need a mathematician to tell us that.

A lot of it just sounds like a grumpy old man complaining about society going to the dogs:

Our increasingly integrated and regimented society undermines our sense of self… Meaningless juxtapositions and coincidences replace conventional narratives and contribute to our dissociation… (pp.110-111)

News reports in general, and celebrity coverage in particular, are becoming ever-more self-referential. (p.113)

We need look no further than the perennial appeal of pseudoscientific garbage, now being presented in increasingly mainstream forums… (p.145)

The fashion pages have always puzzled me. In my smugly ignorant view, they appear to be so full of fluff and nonsense as to make the astrology columns insightful by comparison. (p.173)

Another aspect of articles in the society pages or in the stories about political and entertainment figures is the suggestion that ‘everybody’ knows everybody else. (p.189)

Sometimes his liberal earnestness topples into self-help book touchy-feeliness.

Achieving personal integration and a sense of self is for the benefit of ourselves and those we’re close to. (p.112)

But just occasionally he does say something unexpected:

The attention span created by television isn’t short; it’s long, but very, very shallow. (p.27)

That struck me as an interesting insight but, as with all his interesting comments, no maths was involved. You or I could have come up with it from general observation.

Complexity horizon

This is the notion that human laws, conventions, events, politics and general information overlap and interact at ever-increasing speeds, eventually producing situations so complex as to appear unfathomable. Individuals, groups and societies have limits of complexity beyond which they cannot cope and can only stand back and watch. Reading this made me think of Brexit.

He doesn’t mention it, but a logical spin-off would be that every individual has a complexity quotient like an intelligence quotient or IQ. Everyone could take a test in which they are faced with situations of slowly increasing complexity – or presented with increasingly complex sets of information – to find out where their understanding breaks off – which would become their CQ.

Social history

The book was published in 1995 and refers back to stories current in the news in 1993 and 1994. The run of domestic political subjects he covers in the book’s second quarter powerfully supports my often-repeated conviction that it is surprising how little some issues have changed, how little movement there has been on them, and how they have just become a settled, steady part of the social landscape of our era.

Thus Paulos has essays on:

  • gender bias in hiring
  • homophobia
  • accusations of racism arising from lack of ethnic minorities in top jobs (the problem of race crops up numerous times: pp.59-62, p.118)
  • the decline in educational standards
  • the appallingly high incidence of gun deaths, especially in black and minority communities
  • the fight over abortion

I feel increasingly disconnected from contemporary politics, not because it is addressing new issues I don’t understand, but for the opposite reason: it seems to be banging on about the same issues which I found old and tiresome twenty-five years ago.

The one topic which stood out as having changed is AIDS. In Innumeracy and in this book he mentions the prevalence or infection rates of AIDS and is obviously responding to numerous news stories which, he takes it for granted, report it in scary and alarmist terms. Reading these repeated references to AIDS made me realise how completely and utterly it has fallen off the news radar in the past decade or so.

In the section about political correctness he makes several good anti-PC points:

  • democracy is about individuals, the notion that everyone votes according to their conscience and best judgement; as soon as you start making it about groups (Muslims, blacks, women, gays) you start undermining democracy
  • racism and sexism and homophobia are common enough already without making them the standard go-to explanations for social phenomena which often have more complex causes; continually attributing all aspects of society to just a handful of inflammatory issues, keeps the issues inflammatory
  • members of groups often vie with each other to assert their loyalty and to proclaim their commitment to the party line, and this suggests a powerful idea: that the more opinions are expressed, the more extreme these opinions will tend to become. This is a very relevant idea to our times, when the ubiquity of social media has a) brought about a wonderful spirit of harmony and consensus, or b) divided society into ever more polarised and angry groupings

Something bad is coming

I learned to fear several phrases which indicate that a long, possibly incomprehensible and frivolously hypothetical example is about to appear:

‘Imagine…’

Imagine flipping a penny one thousand times in succession and obtaining some sequence of heads and tails… (p.75)

Imagine a supercomputer, the Delphic-Cray 1A, into which has been programmed the most complete and up-to-date scientific knowledge, the initial condition of all particles, and sophisticated mathematical techniques and formulas. Assume further that… Let’s assume for argument’s sake that… (p.115)

Imagine if a computer were able to generate a random sequence S more complex than itself. (p.124)

Imagine the toast moistened, folded, and compressed into a cubical piece of white dough… (p.174)

Imagine a factory that produces, say, diet food. Let’s suppose that it is run by a sadistic nutritionist… (p.179)

‘Assume that…’

Let’s assume that each of these sequences is a billion bits long… (p.121)

Assume the earth’s oceans contain pristinely pure water… (p.141)

Assume that there are three competing healthcare proposals before the senate… (p.155)

Assume that the probability of your winning the coin flip, thereby obtaining one point, is 25 percent. (p.177)

Assume that these packages come off the assembly line in random order and are packed in boxes of thirty-six. (p.179)

Jokes and Yanks

All the examples are taken from American politics (President Clinton), sports (baseball) and wars (Vietnam, First Gulf War) and from precisely 25 years ago (on page 77, he says he is writing in March 1994), both of which reinforce the sense of disconnect and irrelevance for a British reader in 2019.

As my kids know, I love corny, bad old jokes. But not as bad as the ones the book is littered with:

And then there was the man who answered a matchmaking company’s computerised personals ad in the paper. He expressed his desire for a partner who enjoys company, is comfortable in formal wear, likes winter sports, and is very short. The company matched him with a penguin. (pp.43-44)

The moronic inferno and the liberal fallacy

The net effect of reading this book carefully is something that the average person on the street knew long ago: don’t believe anything you read in the papers.

And especially don’t believe any story in a newspaper which involves numbers, statistics, percentages, data or probabilities. It will always be wrong.

More broadly his book simply fails to take account of the fact that most people are stupid and can’t think straight, even very, very educated people. All the bankers whose collective efforts brought about the 2008 crash. All the diplomats, strategists and military authorities who supported the Iraq War. All the well-meaning liberals who supported the Arab Spring in Egypt and Libya and Syria. Everyone who voted Trump. Everyone who voted Brexit.

Most books of this genre predicate readers who are white, university-educated, liberal middle class and interested in news and current affairs, the arts etc and – in my opinion – grotesquely over-estimate both their value and their relevance to the rest of the population. Because this section of the population – the liberal, university-educated elite – is demonstrably in a minority.

Over half of Americans believe in ghosts, and a similar number believes in alien abductions. A third of Americans believe the earth is flat, and that the theory of evolution is a lie. About a fifth of British adults are functionally illiterate and innumerate. This is what Saul Bellow referred to as ‘the moronic inferno’.

On a recent Radio 4 documentary about Brexit, one contributor who worked in David Cameron’s Number Ten commented that he and colleagues went out to do focus groups around the country to ask people whether we should leave the EU and that most people didn’t know what they were talking about. Many people they spoke to had never heard of the European Union.

On page 175 he says the purpose of reading a newspaper is to stretch the mind, to help us envision distant events, different people and unusual situations, and broaden our mental landscape.

Is that really why he thinks people read newspapers? As opposed to checking the sports results, catching up with celebrity gossip, checking what’s happening in the soaps, reading interviews with movie and pop stars, looking at fashion spreads, reading about health fads and, if you’re one of the minority who bother with political news, having all your prejudices about how wicked and stupid the government, the poor, the rich or foreigners etc are confirmed, and despising everyone who disagrees with you (Guardian readers hating Daily Mail readers; Daily Mail readers hating Guardian readers; Times readers feeling smugly superior to both)?

This is a fairly entertaining, if very dated, book – although all the genuinely useful bits are generalisations about human nature which could have come from any media studies course.

But if it was intended as any kind of attempt to tackle the illogical thinking and profound innumeracy of Western societies, it is pissing in the wind. The problem is vastly bigger than this chatty, scattergun and occasionally impenetrable book can hope to scratch. On page 165 he says that a proper understanding of mathematics is vital to the creation of ‘an informed and effective citizenry’.

‘An informed and effective citizenry’?

