EuroEpi thoughts

One thing that has struck me listening to talks at the European Congress of Epidemiology is the incredible weight given to the phrase “statistically significant”. This is an old chestnut among theoreticians in the area, so my surprise perhaps indicates more about my selective contact with epidemiology to date than anything else. It is nonetheless interesting to see the work this strange concept does.

The most striking example was in an interesting talk on risk factors for colorectal cancers. A slide was displayed showing results of a case-control study. For every one of the eight or so risk factors, exposure among cases was higher than among controls. However, the speaker pointed out that only some of these differences were statistically significant.

This struck me as very strange. The level of statistical significance is more or less arbitrary – perhaps not entirely, but arbitrary in the same way as specifying a certain height as the cut-off for “short”. In this context, that means that the choice of risk factors to ignore was also, in the same sense, arbitrary. Moreover, the fact that the difference ran the same way for all the risk factors (i.e. higher exposure in cases than in controls) also seemed, to my untutored eye, the sort of unlikely coincidence one might wish to investigate further.

In a way, that is exactly what came next. One of the “insignificant” factors turned out – and I confess I did not follow how – to interact significantly with another (the two being fibre and calcium intake).

I am not sure that any of this is problematic, but it is certainly puzzling. The pattern is not unique to this talk. I have seen more than one table of variables potentially associated with an outcome presented, with the non-significant ones then being excluded. On many occasions this must surely be a good, quick way to proceed. But it seems a strange exercise if some non-significant differences are studied further anyway – though that is surely an artefact of my lack of understanding.

I am less sure that my lack of understanding is to blame for other doubts, however. Where a number of risk factors are aligned, it seems arbitrary to ignore those that fail a certain level of statistical significance. The fact of alignment is itself some evidence of a non-chance phenomenon of some kind. And, of course, the alignment might indicate something important – for example, a causal factor that has not yet been thought of. The non-significant factors could be as useful as the significant ones in detecting such a factor, by providing further means of triangulation.
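To put a rough number on that intuition: under an idealised chance model in which each case-control comparison is independently as likely to fall either way, perfect alignment of eight factors is itself improbable. The sketch below is only illustrative – the independence assumption is mine, not the speaker's, and real risk factors are often correlated.

```python
# Sign-test intuition, not a real analysis: suppose each of k case-control
# comparisons were, under chance alone, equally likely to show higher
# exposure in cases or in controls, independently of the others.
# (Both suppositions are idealisations; correlated factors would weaken this.)
def prob_all_same_direction(k: int) -> float:
    # Two ways to be perfectly aligned: all k higher in cases,
    # or all k higher in controls.
    return 2 * (0.5 ** k)

# With the eight or so factors from the talk:
print(f"{prob_all_same_direction(8):.4f}")  # about 0.0078
```

So even before any single factor reaches “significance”, the alignment alone would be surprising under pure chance – which is the sense in which it is evidence worth pursuing.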

The Myth of Translation

Next week I am part of a symposium at EuroEpi in Porto, Portugal, with the title “Achieving More Effective Translation of Epidemiologic Findings into Policy when Facts are not the Whole Story”.

My presentation is called “The Myth of Translation”, and its central thesis is, as you would guess, that talk of “translating” data into policy, discoveries into applications, and so forth is unhelpful and inaccurate. Instead, I argue that the major challenge facing epidemiological research is assuring the non-epidemiologists who might want to rely on its results that those results are stable – that is, not likely to be reversed in the near future.

I expect my claim to be provocative in two ways. First, the most obvious reasons I can think of for the popularity of the “translation” metaphor, given its clear inappropriateness (which I have not argued here, but which I argue in the presentation), are unpleasant ones: claiming scientific authority for dearly-held policy objectives, or blaming some sort of translational failing for what are actually shortcomings (or, perhaps, over-ambitious claims) in epidemiological research. This point is not, however, something I intend to emphasize; nor am I sure it is particularly important. Second, the claim that epidemiological results are reasonably regarded by non-epidemiologists as too unstable to be useful might be expected to meet some resistance at an epidemiology conference.

Given the possibility that what I have to say will be provocative, I thought I would try my central positive argument out here.

(1) It is hard to use results which one reasonably suspects might soon be found incorrect.

(2) Often, epidemiological results are such that a prospective user reasonably suspects that they will soon be found incorrect.

(3) Therefore, often, it is hard to use epidemiological results.

I think this argument is valid, or close enough for these purposes. I think that (1) does not need supporting: it is obviously true (or obviously enough for these purposes). The weight is on (2), and my argument for (2) is that from the outside, it is simply too hard to tell whether a given issue – for example, the effect of HRT on heart disease, or the effect of acetaminophen (paracetamol) on asthma – is still part of an ongoing debate, or can reasonably be regarded as settled. The problem infects even results that epidemiologists would widely regard as settled: the credibility of the evidence on the effect of smoking on lung cancer is not helped by reversals over HRT, for example, because from the outside, it is not unreasonable to wonder what the relevant difference is between the pronouncements on HRT and the pronouncements on lung cancer and smoking. There is a difference: my point is that epidemiology lacks a clear framework for saying what it is.

My claim, then, is that the main challenge facing the use of epidemiological results is not “translation” in any sense, but stability; and that devising a framework for expressing to non-epidemiologists (“users”, if you like) how stable a given result is, given best available current knowledge, is where efforts currently being directed at “translation” would be better spent.

Comments on this line of thought would be very welcome. I am happy to share the slides for my talk with anyone who might be interested.