I’ve decided to post an update every day, but the updates proper will be on Mondays, Wednesdays and Fridays, with shorter ones on the other days. This is in response to numerous requests in the comments from people who find it difficult to keep track of the threads if there are over 1,000 comments, something that happens if more than 24 hours pass since the previous update. Friday’s update, for instance, attracted 1,714 comments.
Today, I’ve decided to devote the entirety of the update to Professor Neil Ferguson and his team at Imperial College, including a guest post by “Sue Denim”, the software engineer who wrote “Code Review of Ferguson’s Model” for Lockdown Sceptics on May 6th. That article was the most talked-about post that’s appeared on this site, as well as the most viewed.
This seems like a good time to publish Sue’s latest thoughts about Professor Ferguson’s model because yesterday the Imperial College modelling team, including Neil Ferguson, published a paper in Nature, based on a new model, arguing that the lockdowns have saved the lives of approximately 3.1 million people in 11 European countries (Austria, Belgium, Denmark, France, Germany, Italy, Norway, Spain, Sweden, Switzerland and the UK). In the UK alone, the authors think the lockdown has saved 470,000 lives.
That 3.1 million figure, which they call “counterfactual deaths”, is the number of deaths they estimate would have occurred by May 4th if there had been no non-pharmaceutical interventions (NPIs) and people’s behaviour hadn’t changed one jot in response to the pandemic. But that’s a rather obvious sleight of hand. In effect, their argument involves contrasting the collective impact of the NPIs in all 11 countries, including Sweden, with a counterfactual scenario in which nothing was done at all, and saying, “Look! The lockdowns saved 3.1 million lives.”
No one, as far as I’m aware, has ever advocated that governments around the world do nothing in response to the pandemic. Rather, the argument is about whether they over-reacted. Have the full lockdowns saved more lives than less severe restrictions would have done, given the public health impact of imprisoning everyone in their homes, as well as the catastrophic economic consequences?
It’s also highly implausible to imagine people would have done nothing in response to the pandemic – just carried on as normal – in the absence of state-mandated, top-down directives. This flatly contradicts both common sense and actual mobility data from Google and other sources that shows people’s mobility falling before the lockdowns were imposed.
The authors try and get round this by including the following caveat:
The counterfactual model without interventions is illustrative only and reflects our model assumptions. We do not account for changes in behaviour; in reality even in the absence of government interventions we would expect Rt to decrease and therefore would overestimate deaths in the no-intervention model
So they know the 3.1 million number is wrong because their assumptions are wrong, but provide a specific number anyway for “illustrative only” purposes. But what is it supposed to illustrate, given that it doesn’t actually tell us how many people would have died in the absence of any NPIs?
I think I know the answer: it illustrates the ideological worldview of the scientists involved, which is that virtually the entire population in these 11 countries are sheep-like entities who must be told what to do by experts like them. Reading about a dangerous virus in the news – seeing pictures of hospitals in Italy being overwhelmed – won’t affect their behaviour in the slightest.
The epidemiologists who’ve been advising the UK Government during this crisis often protest that they are perfectly neutral scientists, and anyone who criticises them is “ideological”. But as we can see, this Imperial College model takes for granted an essentially communist worldview in which the masses must be directed by central planners.
In order to make a convincing argument for the lockdowns, the paper would have to compare the number of lives saved as a result of the severe restrictions imposed in 10 of the 11 countries with the number that would have been saved if those 10 countries had stuck with the same mitigation strategy as Sweden. That’s the relevant counterfactual, not the one they’ve conjured up, and the case for the lockdowns depends upon calculating the number of lives saved in contrast to that counterfactual and demonstrating that it’s greater than the collateral damage done by the extra measures taken. This paper only tells us how effective the lockdowns have been in contrast to an alternative scenario – the do-nothing approach – which no one is arguing for.
In other words, the paper isn’t a defence of the lockdowns imposed in Austria, Belgium, Denmark, France, Germany, Italy, Norway, Spain, Switzerland and the UK. Rather, it’s an argument for doing something rather than nothing.
Needless to say, that isn’t how it’s being presented by its authors, or how it’s being reported in the press. The BBC’s headline yesterday, predictably enough, was “Lockdowns in Europe saved millions of lives”, apparently taking the Imperial team’s claim at face value, while the Sun at least has the good sense to put that claim in inverted commas: “Lockdown ‘prevented the deaths of 470,000 Brits from coronavirus – and 3m across EU’.”
Imperial has put out a press release claiming that the lockdowns have saved 3.1 million lives in Europe. Hang on. Wasn’t that 3.1 million number supposed to be purely “illustrative”, i.e. not a meaningful estimate of how much loss of life (if any) the lockdowns have prevented? The same release includes a quote from Dr Seth Flaxman, one of the paper’s authors, bragging about how many lives have been saved because governments across Europe have followed the sagacious advice of him and his team:
Using a model based on data from the number of deaths in 11 European countries, it is clear to us that non-pharmaceutical interventions – such as lockdown and school closures, have saved about 3.1 million lives in these countries.
An “illustrative only” figure seems to have been transformed into a hard data point without a second glance.
In the BBC story, Dr Flaxman emphasises that the crisis is far from over. “Claims this is all over can be firmly rejected,” he says. “We are only at the beginning of this pandemic.” That warning is echoed by Dr Samir Bhatt, another of the paper’s authors: “There is a very real risk if mobility goes back up there could be a second wave coming reasonably soon, in the next month or two.”
This follows from their model, since they assume the only reason the rate of infection has declined in the 11 countries they’ve looked at is because it’s been effectively suppressed by NPIs, not because the number of people with natural immunity is far greater than initially thought, or because the virus is nosocomial, or seasonal. The possibility that the majority of people who’ve died from COVID-19 are unusually vulnerable – elderly people in care homes and hospitals with underlying health conditions – and that further waves of infection are unlikely to have anything like the same infection fatality rate (IFR) isn’t considered by the paper’s authors. The model assumes the IFR is and will continue to be about 1% – four times greater than the CDC estimate. It also doesn’t allow for the fact that the IFR varies according to age.
Meanwhile, another paper in Nature – this one from a team at the University of California – claims that NPIs in China, South Korea, Italy, Iran, France, and the United States have prevented 530 million people becoming infected. But, again, the relevant counterfactual is no NPIs whatsoever, rather than a more measured approach. So not a persuasive argument for lockdowns either.
The argument made in these papers for the lockdowns is unpersuasive. It’s the equivalent of justifying Rodrigo Duterte’s brutal crackdown on drug trafficking in the Philippines, in spite of the fact that it’s resulted in the deaths of over 7,000 suspects, by pointing to the number of drug deaths it’s prevented and making the relevant counterfactual the absence of any policing whatsoever rather than a less draconian approach.
There’s a saying among scientists – just because something is published in Nature doesn’t necessarily mean it’s wrong. But Imperial’s new paper takes the biscuit. Running my eye over the list of authors, I was surprised not to see Mystic Meg’s name there.
So, on to the main event: Sue Denim’s latest blog post. I’m posting it in the update today, but will move it to the right-hand menu tomorrow so it sits beneath Sue’s previous two blog posts. A quick reminder that “Sue Denim” is not the author’s real name – kinda obvious when you think about it. The writer is a senior software engineer/consultant who doesn’t want to disclose his/her identity. As he/she wrote at the beginning of his/her first post:
I have been writing software for 30 years. I worked at Google between 2006 and 2014, where I was a senior software engineer working on Maps, Gmail and account security. I spent the last five years at a US/UK firm where I designed the company’s database product, amongst other jobs and projects. I was also an independent consultant for a couple of years.
How Replicable is the Imperial College Model?
by Sue Denim
After Toby published my first and second pieces, Imperial College London (ICL) produced two responses. In this article I will study them. I’ve also written an appendix that provides some notes on the C programming language to address some common confusions observed amongst modellers, which Toby will publish tomorrow.
Attempted replication. On June 1st ICL published a press release on its website stating that Stephen Eglen, an academic at Cambridge, was able to reproduce the numbers in ICL’s influential Report 9. I was quite interested to see how that was achieved. As a reminder, Imperial College’s Report 9 modelling drove lockdown in many countries.
Unfortunately, this press release continues ICL’s rather worrying practice of making misleading statements about its work. The headline is “Codecheck confirms reproducibility of COVID-19 model results”, and the article highlights this quote:
I was able to reproduce the results… from Report 9.
This is an unambiguous statement. However, the press release quotes the report as saying: “Small variations (mostly under 5%) in the numbers were observed between Report 9 and our runs.”
This is an odd definition of “replicate” for the output of a computer program, but it doesn’t really matter because what ICL doesn’t mention is this: the very next sentence of Eglen’s report says:
I observed 3 significant differences:
1. Table A1: R0=2.2, trigger = 3000, PC_CI_HQ_SDOL70, peak beds (in thousands): 40 vs 30, a 25% decrease.
2. Table 5: on trigger = 300, off trigger = 0.75, PC_CI_HQ_SD, total deaths: 39,000 vs 43,000, a 10% increase.
3. Table 5: on trigger = 400, off trigger = 0.75, CI_HQ_SD, total deaths: 100,000 vs 110,000, a 10% increase.
In other words, he wasn’t able to replicate Report 9. There were multiple “significant differences” between what he got and what the British Government based its decisions on.
How significant? The supposedly minor difference in peak bed demand between his run and Report 9 is 10,000 beds, or roughly the size of the entire UK field hospital deployment. This supports the argument that ICL’s model is unusable for planning purposes, although that’s the entire justification for its existence.
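For readers who want to check the arithmetic behind those three differences, here is a minimal Python sketch. It assumes the first figure in each quoted pair is the baseline, which is what the quoted percentages imply; the helper `relative_change` and the labels are my own, not taken from Eglen’s report.

```python
def relative_change(baseline, other):
    """Change from baseline to other, expressed as a fraction of baseline."""
    return (other - baseline) / baseline

# Figure pairs as quoted above. Treating the first value of each pair as
# the baseline is an assumption made for this illustration.
pairs = {
    "peak beds (thousands)": (40, 30),
    "total deaths, on trigger 300": (39_000, 43_000),
    "total deaths, on trigger 400": (100_000, 110_000),
}

for label, (first, second) in pairs.items():
    change = relative_change(first, second)
    print(f"{label}: {change:+.1%}, absolute difference {abs(second - first):,}")
```

The bed figures are in thousands, so the absolute difference of 10 printed for the first row corresponds to the 10,000 beds mentioned above.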
Eglen claims this non-replication is in fact a replication by arguing:
although the absolute values do not match the initial report, the overall trends are consistent with the original report
A correctly written model will be replicable to the last decimal place. When using the same seeds and same input data the expected variance is zero, not 25%. Stephen Eglen should retract his “code check”, as it’s incorrect to claim a model is replicable when nobody can get it to generate the same outputs that other people saw.
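To make the point about seeds concrete, here is a minimal Python sketch of a toy stochastic outbreak. It is not ICL’s code and does not resemble it in any detail; the function `toy_epidemic`, its parameters and the five-day infectious period are all hypothetical, chosen only to show that a program whose randomness comes from a seeded generator returns identical output for identical inputs.

```python
import random

def toy_epidemic(seed, population=2_000, initial_infected=10, r0=2.2, days=60):
    """Toy stochastic outbreak (hypothetical; not the ICL model).

    Returns the list of daily new infections. All randomness comes from a
    private generator built from `seed`, so the output depends only on
    the arguments passed in.
    """
    rng = random.Random(seed)
    infectious_days = 5  # assumed average infectious period
    susceptible = population - initial_infected
    infected = initial_infected
    daily_new = []
    for _ in range(days):
        # Chance that a given susceptible person is infected today.
        p_infect = min(1.0, (r0 / infectious_days) * infected / population)
        new_cases = sum(1 for _ in range(susceptible) if rng.random() < p_infect)
        recoveries = sum(1 for _ in range(infected) if rng.random() < 1 / infectious_days)
        susceptible -= new_cases
        infected += new_cases - recoveries
        daily_new.append(new_cases)
    return daily_new

run_a = toy_epidemic(seed=42)
run_b = toy_epidemic(seed=42)
run_c = toy_epidemic(seed=43)

print("same seed, identical output:", run_a == run_b)
print("different seed, identical output:", run_a == run_c)
```

The first comparison is true by construction. The second will very likely be false, which is the only variation a seeded model should show. Real simulations can lose this property through things like unsynchronised threads or uninitialised memory, but those are defects in the program rather than features of the epidemic.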
Number of simulation runs. ICL have contradicted themselves about how Report 9 was generated. Their staff previously claimed that, “Many tens of thousands of runs contributed to the spread of results in report 9.” In Eglen’s report we see a very different claim. He explains some of the difference between his results and ICL’s by saying:
These results are the average of NR=10 runs, rather than just one simulation as used in Report 9
Imperial College’s internal controls are so poor they can’t give a straight accounting of how Report 9 was generated.
The point of stochasticity is to estimate confidence bounds. If incorporating random chance into your simulation changes the output only a bit, you assume random chance won’t affect real world outcomes much either and this increases your confidence. Report 9 is notable for not providing any confidence bounds whatsoever. All numbers are given as precise predictions in different scenarios, with no discussion of uncertainty beyond a few possible values of R0. None of the graphs render uncertainty bounds either (unlike e.g. the University of Washington model). The lack of bounds would certainly be explained if the simulation was run only once.
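As an illustration of how repeated runs turn into confidence bounds, here is a minimal Python sketch. Everything in it is an assumption made for the example: the toy model `toy_total_deaths`, the population size, the attack rate, the 1% fatality rate and the choice of 200 seeds. It shows the general technique of running a seeded model many times and reporting an empirical interval, not how Report 9 or any other model was actually produced.

```python
import random
import statistics

def toy_total_deaths(seed, population=5_000, attack_rate=0.6, ifr=0.01):
    """Hypothetical toy model: each person is infected with probability
    attack_rate, and each infection is fatal with probability ifr."""
    rng = random.Random(seed)
    deaths = 0
    for _ in range(population):
        if rng.random() < attack_rate and rng.random() < ifr:
            deaths += 1
    return deaths

def percentile_interval(values, lower=0.025, upper=0.975):
    """Empirical interval: the sorted values at the given quantile positions."""
    ordered = sorted(values)
    lo_idx = int(lower * (len(ordered) - 1))
    hi_idx = int(upper * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

# One run per seed; recording the seeds keeps every run reproducible.
results = [toy_total_deaths(seed) for seed in range(200)]
low, high = percentile_interval(results)

print(f"single run (seed 0): {results[0]}")
print(f"mean of 200 runs: {statistics.mean(results):.1f}")
print(f"empirical 95% interval: ({low}, {high})")
```

A single run yields only the first printed number, with no indication of how far chance alone could move it. The interval only becomes available once the model has been run many times.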
People working on the ICL model have argued the huge variety of bug reports they received don’t matter, because they just run it repeatedly and average the outputs. This argument is nonsense as discussed repeatedly, but if they didn’t actually run it multiple times at all then the argument falls apart on its own terms.
Models vs experiments. The belief that you can just average out model bugs appears to be based on a deep confusion between simulations and reality. A shockingly large number of academics seem to believe that running a program is the same thing as running an experiment, and thus any unexplained variance in output should just be recorded and treated as cosmic uncertainty. However, models aren’t experiments; they are predictions generated by entirely controllable machines. When replicating software-generated predictions, the goal is not to explore the natural world, but to ensure that the program can be correctly tested, and to stop model authors simply cherry-picking outputs to fit their pre-conceived beliefs. As we shall see, that is a vital requirement.
Does replication matter? It does. You don’t have to take my word for it: ask Richard Horton, editor of the Lancet, who in 2015 stated:
The case against science is straightforward: much of the scientific literature, perhaps half, may simply be untrue. Afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance, science has taken a turn towards darkness. As one participant put it, “poor methods get results”.
Alternatively ask Professor Neil Ferguson, who is a signatory to this open letter to the Lancet requesting retraction of the “hydroxychloroquine is dangerous” paper because of the unreliability of the data it’s based on, supplied by an American health analytics company called Surgisphere. The letter justifies the demand for retraction by saying:
The authors have not adhered to standard practices in the machine learning and statistics community. They have not released their code or data.
ICL should give the authors the benefit of the doubt – maybe Surgisphere just need a couple of months to release their code. They are peer-reviewed experts, after all. And statistics isn’t a sub-field of epidemiology, so according to Imperial College spokespeople that means Ferguson isn’t qualified to criticise it anyway.
Initial response and the British Computer Society. Via its opinion writers, the Daily Telegraph picked up on my analysis. ICL gave them this statement:
A spokesperson for the Imperial College COVID-19 Response Team responded to criticism of its code by saying the Government “has never relied on a single disease model to inform decision-making”.
“Within the Imperial research team we use several models of differing levels of complexity, all of which produce consistent results. We are working with a number of legitimate academic groups and technology companies to develop, test and further document the simulation code referred to. However, we reject the partisan reviews of a few clearly ideologically motivated commentators.”
The first statement, that the Government “has never relied on a single disease model to inform decision-making”, is typically misleading. In the SAGE publication from March 9th addressing lockdowns, the British Government was given the conclusions of the SPI-M SAGE subgroup in tables 1 and 2. On page 8, that document states the tables and assumptions are sourced to a single paper from ICL which has never been published, but from the title and content it seems clear that it was an earlier draft of Report 9. There is no evidence of modelling from any other institution contributing to this report, i.e. it doesn’t appear to be true that the Government has “never” relied on a single model – that’s exactly what it was fed by its own advisory panel.
The second statement, rejecting “the partisan reviews of a few clearly ideologically motivated commentators”, is merely unfortunate. By ideologically motivated commentators they must have meant the vast array of professional software engineers who posted their reactions on Twitter, on GitHub and on this site. The beliefs of the vast majority in the software industry were summarised by the British Computer Society (BCS), a body that represents people working in computer science in the UK. The BCS stated:
Computer code used to model the spread of diseases including coronavirus “must meet professional standards” … “the quality of the software implementations of scientific models appear to rely too much on the individual coding practices of the scientists who develop them”
Is Imperial College going to argue that the BCS is partisan and ideologically motivated?
On motivations. It’s especially unfortunate when academics defend themselves by claiming their critics – all of them, apparently – are ideological. Observing that coding standards are much higher in the private sector than in the academy isn’t even controversial, let alone ideological, as shown by the numerous responses from academics agreeing with this point, and stressing that they can’t be expected to produce code up to commercial standards. (They “need more funding”, obviously.)
But in recent days people have observed that “for months, health experts told people to stay home. Now, many are encouraging the public to join mass protests.” The world has watched as over 1,200 American epidemiologists, academics and other public health officials published an open letter which said: “[A]s public health advocates, we do not condemn these gatherings as risky for COVID-19 transmission …. this should not be confused with a permissive stance on all gatherings, particularly protests against stay at home orders.”
According to “the science” the danger posed by this virus depends on the ideological views of whoever is protesting. This is clearly nonsense and explains why Imperial College administrators were so quick to accuse others of political bias: they see it everywhere because academia is riven with it.
To rebuild trust in public science will require a firm policy response. As nobody rational will trust the claims of academic epidemiologists again any time soon, as the UK’s public finances are now seriously damaged by furlough and recession, and as professional modelling firms are attempting to develop reliable epidemic models themselves anyway, it’s unclear why this field should continue to receive taxpayer funding. The modellers with better standards can, and should, advise the Government in future.
Professor Sunetra Gupta Pooh-Poohs Imperial’s “We Saved the Planet” Baloney
Sunetra Gupta, Professor of Theoretical Epidemiology at Oxford and a long-standing sceptic, was interviewed by Sarah Montague on Radio 4’s World at One today and asked whether she agreed that the lockdown has saved hundreds of thousands of lives. She demolished the lockdown case so I asked a regular contributor to this site to make a transcript.
Sarah Montague: Coronavirus is in retreat across the country. So said England’s Health Secretary yesterday after reporting 55 deaths, the lowest death toll since the weekend before the lockdown. Is the virus in retreat because of the lockdown and does it mean that it was all worth it? Imperial College said this week that the lockdown saved millions of lives in Europe. But there is another view that the price of the lockdown was and will yet be felt in different ways. Sunetra Gupta is Professor of Theoretical Epidemiology at the University of Oxford and I asked her whether she would have argued for herd immunity all along.
Sunetra Gupta: Yes I think I would have said that but on the proviso that we put as much money as possible and to make up for what hasn’t happened over the last 30 years to support the vulnerable sections of the population. So I would have said yes, I think at that point we had enough information to know that there were certain sectors of the population who were particularly vulnerable and that we needed to protect them. And the word protect of course carries with it all sorts of implications, but essentially it seemed to me that there was a real gap in the resources available to achieve that. Let’s now try and divert as many resources as possible to protect the vulnerable population and to reduce their risk. And the way to reduce the risk to the vulnerable population, as we have done unwittingly in some cases with the pathogens that do kill elderly people and others who are vulnerable, is by having enough immunity ourselves such that the risk posed to the vulnerable population is low.
Sarah Montague: So what the idea [is] that the rest of the population carry on as normal, try possibly to get the virus so that they can be the people who represent the herd immunity?
SG: I mean, that’s how we’ve traditionally dealt with the pathogens that do at the moment kill the elderly and vulnerable. I mean it’s a terrible thing that that happens, but it happens, but I guess we’ve made that decision that we need to balance out that problem against the problem of completely shutting down the economy or compromising our social interactions to the point of farce, let’s face it.
SM: So, what? Has the lockdown been a farce?
SG: We’re trying to wriggle out of this situation in a way that is I think quite farcical, we’ve come up with rules that are quite arbitrary to my mind.
SM: The idea would be to change the strategies so that it is the older and frail who should be staying indoors?
SG: First of all I think we need to go out there and make a proper map of what their risk is, and the risk to the elderly and frail is not just contingent on how elderly and frail they are but how immune the rest of the population is. So we need to go out and test to the best of our abilities, knowing now that some people are not going to register positive on these tests simply because they happen to be entirely resistant to the disease. We need some very clever statisticians and people who are disinterested in promoting any kind of sense of what they think is going on to make a proper clear, best assessment of what the risks are to the vulnerable in every part of the country – which is, talking about the UK, there is a huge variation in who’s been exposed, given the locality; there’s enormous heterogeneity and homogenising this data just to fit certain precepts or some preconceptions is not helpful. What we need to do is go out there, look at who’s been exposed in different regions, look at who is vulnerable and come up with a strategy, put money – public money – into supporting the people who are vulnerable, given the risks that they face. Though in London I think the epidemic has more or less run its course from what I can see and, you know, perhaps we can have different strategies, but it’s very likely from the data that it hasn’t spread out from London, so we need to make sensible decisions about how to protect people.
SM: So should we relax about the R number, lift the lockdown quickly and not be fazed by the idea of a second wave?
SG: I think the R number is impossible… There will be another resurgence of this, like any other respiratory pathogen in the winter, and we need to prepare for that.
SM: We hear that there is some regret expressed in Sweden at their high death toll. Would that not have happened here if we didn’t have the lockdown?
SG: I think it’s unfortunate that people are focusing on that point. I think that what Tegnell said that they could have done better to protect the care homes and that is indeed what we should have done… we should have protected if we could, and far be it from me to say how that would have been possible, but to protect people in care homes – I think we’re agreed on that and I think it’s unfortunate that people are jumping on that to say that they should have locked down earlier. What I don’t understand about lockdown, is what is the exit strategy from it anyway?
SM: Would you just lift it as quickly as possible?
SG: Yes. Right now, yes, absolutely.
SM: Do you think the disease arose earlier in China than has been suggested?
SG: Absolutely, yes.
SM: When do you think it appeared?
SG: I wouldn’t want to put a number on it but I think that in any normal system by the time you detect deaths from a disease it’s been around for at least a month.
SM: So, what are we talking – October rather than November?
SG: Yes, something like that – October or November.
For those who want to listen to it, it’s here. Starts at the 20m 25s point.
And on to the round-up of all the stories I’ve noticed, or which have been brought to my attention, in the last 24 hours:
- ‘WHO Says Transmission by Asymptomatic Covid Patients “Very Rare”’ – The National Review reports on the WHO’s latest bit of bonkers guidance
- ‘Government to row back on pledge to have all primary children back to school before the summer’ – What fresh hell is this?
- ‘False negatives, testing capacity and pheasants’ – The team at BBC Radio 4’s More or Less train their forensic analytical skills on the reliability of the Government’s swab test
- ‘Over 95% of UK “COVID-19” deaths had “pre-existing condition”’ – Off-Guardian points out this means the majority of us are not at risk
- ‘A Perfect Storm for the “Woke” Revolution’ – Rory Hamilton worries that the combination of the pandemic and the BLM protests has injected a dangerous accelerant into the culture war
- ‘Zoos could shut for good with animals having to be put down, MPs warn’ – Another unanticipated effect of the lockdowns flagged up by the Telegraph
- ‘The eight reasons building a contact-tracing app is so difficult’ – Shouldn’t they have thought about these difficulties before announcing it?
- ‘Satellite images of packed Wuhan hospitals suggest coronavirus outbreak began earlier than thought’ – Interesting story in the Telegraph
- ‘Businesses Struggle to Open After Being Hit a Third Time’ – Depressing Wall St Journal report pointing out that retail businesses were first hit by the coronavirus pandemic, then by the economic downturn, and now some are reeling after being vandalized and looted
Shameless Begging Bit
Thanks as always to those of you who made a donation in the last 24 hours to pay for the upkeep of this site. If you feel like donating, however small the amount, please click here. And if you want to flag up any stories or links I should include in future updates, email me here.
For those of you who haven’t yet subscribed to mine and James Delingpole’s weekly London Calling podcast, here’s a link to the latest episode, recorded yesterday. We share our dismay at the events of last weekend – with the journalists who applauded the mob that tore down Edward Colston’s statue in Bristol, and the police who stood by and let them do it. What’s next? Hadrian’s Wall? It’s a symbol of colonialism, after all. I suggest to James that we should head north and start dismantling it ourselves as a parody of the “Rhodes Must Fall” nonsense, but he worries about the “wildlings” that might pour through the gap. You can listen to it here.