Science Seen Time One author Colin Gillespie helps you understand the physics of your world.

# Nineteen of Twenty

Nineteen times out of twenty. We come across these numbers almost daily in the news. But behind them there’s an old problem. In the 1800s British Prime Minister Benjamin Disraeli famously said that there are lies, damned lies and statistics. Or so American author Mark Twain claimed. Too bad for his story that the line appears nowhere in Disraeli’s recorded words. Whoever said it first was onto something important. We are often told some number is within so many percent, nineteen times out of twenty. It may be true. But often we are being misled. Here is how to tell the difference. All we need to do is take a good look at the twenty times.

One kind of nineteen out of twenty times may signal a serious survey. It requires four things. First, there must be a large population of somethings, such as registered voters in California. Second is a random sample, the voters who were called. Third is a measure based on the sample, such as the percentage of Republican-inclined voters. And finally we need a range of uncertainty, plus or minus. When we see all four things, the statement may be true. It might mean: The odds that asking all California voters would yield a number within that range of uncertainty are nineteen out of twenty (95%). But if, for example, the pollster called only landline phones (introducing an unscientific bias, as is often the case) the statement may be misleading.
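To see where the plus or minus comes from, here is a minimal sketch in Python. It assumes a simple random sample and the usual normal approximation for a proportion; the poll figures (1,000 voters, 40%) and the function name `margin_of_error` are made up for illustration and do not come from any real survey.

```python
import math

def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% interval for a sample proportion.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 is the multiplier for nineteen times out of twenty.
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

# Hypothetical poll: 1,000 randomly chosen voters, 40% lean Republican.
moe = margin_of_error(0.40, 1000)
print(f"40% plus or minus {100 * moe:.1f} points, nineteen times out of twenty")
```

With these made-up numbers the range comes out at about three points either way. Note what the formula cannot do: it measures random sampling error only, so a landline-only bias stays hidden no matter how large the sample.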

Another kind of nineteen out of twenty times seeks support for suspected links. Let’s say a scientist theorizes that fumes from welding might cause lung cancer. So she digs through health check-up data for a randomly selected sample of welders. She finds that four percent of them were diagnosed with lung cancer. Lung cancer incidence for the public at large is two percent. Welders seem to be coming down with cancer at twice the rate! What do these data mean? Based on her sample size, she has nineteen-out-of-twenty (95%) confidence that lung-cancer risk for all welders is in the range 4.0±0.5%. The odds that it’s as low as the general population’s 2% are very small. Looks like her theory might be right! Well, actually, data of this kind show only an association. On their own they never provide evidence of cause. Indeed, they could just mean more welders smoke.
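The welder example gives no sample size, but we can work backwards to a rough one. The sketch below assumes the same normal approximation for a proportion; the function names are hypothetical, and the result is a back-of-envelope figure, not a number from any study.

```python
import math

Z_95 = 1.96  # multiplier for nineteen times out of twenty

def required_sample_size(p_hat: float, half_width: float) -> int:
    """Smallest sample giving the requested 95% half-width (normal approximation)."""
    return math.ceil(Z_95**2 * p_hat * (1.0 - p_hat) / half_width**2)

def standard_errors_away(p_hat: float, half_width: float, reference: float) -> float:
    """How many standard errors separate the sample rate from a reference rate."""
    standard_error = half_width / Z_95
    return (p_hat - reference) / standard_error

# Figures from the example: 4.0% of welders, plus or minus 0.5%, versus 2% for the public.
print(required_sample_size(0.04, 0.005))
print(round(standard_errors_away(0.04, 0.005, 0.02), 2))
```

Under these assumptions she would need close to six thousand welders, and the public's 2% sits nearly eight standard errors below her estimate, which is why chance is so unlikely to explain the gap. None of this arithmetic says anything about why the rates differ.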

Then there is a third kind of twenty times, a pseudo-scientific kind we should always reject. Here’s a real-life example. Some scientists seek links between exposure to chemicals and lung cancer. Their scientific method sounds impressive:

‘Detailed job histories were obtained by interview and evaluated by an expert team of chemist-hygienists to estimate degree of exposure to approximately 300 substances for each job. Gas and arc welding fumes were among the agents evaluated. We estimated odds ratios (ORs) and 95% confidence intervals (CIs) of lung cancer using logistic regression, adjusting for smoking history and other covariates.’

They say both gas and arc welding cause high cancer rates. But not among non-smokers. Nor heavy smokers either. But light smokers have high risk and, they say, a result like theirs would not arise from random chance nineteen times out of twenty. But wait! They tried 300 substances for each job, and 3 levels (none, light and heavy) of exposure. That’s 900 chances. At a one-in-twenty false-alarm rate, the randomness that they acknowledge in their data would be expected to give positive results in about forty-five of them (900 × 0.05), even if no substance had any effect. Such are the hazards of what’s known as data mining. Their conclusion: ‘Exposure to welding fumes increases lung cancer risk among light smokers but not among heavy smokers.’ If Disraeli were still around he would have harsh words.
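The forty-five is simple arithmetic, sketched here in Python. The sketch assumes the 900 comparisons are independent and that no substance has any real effect, which is a simplification of what any actual study does; the function names are illustrative.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Average number of chance 'findings' when every tested link is absent."""
    return n_tests * alpha

def chance_of_at_least_one(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one test comes up positive by chance alone.

    Assumes the tests are independent of one another.
    """
    return 1.0 - (1.0 - alpha) ** n_tests

n_tests = 300 * 3  # 300 substances times 3 exposure levels, as counted above

print(f"expected chance findings: {expected_false_positives(n_tests):.0f}")
print(f"chance of at least one:   {chance_of_at_least_one(n_tests):.6f}")
```

With 900 tries, at least one chance finding is all but certain under these assumptions. A single positive result pulled from such a search carries far less weight than the same result from a study that set out to test one link.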

So here’s one way to separate some of the damned lies from the sound statistics. The science may be okay when the twenty times are repeats that scientists could run with fresh samples. But if the ‘science’ took a scattergun approach, trying many times until it found a link, you’d be well advised to read another study.

Sources:

Eric Vallières et al. (2012), “Exposure to welding fumes increases lung cancer risk among light smokers but not among heavy smokers: evidence from two case-control studies in Montreal”, Cancer Med., Hoboken NJ: Wiley-Blackwell, vol. 1, p. 47; http://www.ncbi.nlm.nih.gov/pubmed/23342253

Image credit: Cornelius Jabez Hughes