by Sue Denim
[Please note: a follow-up analysis is now available here.] Imperial finally released a derivative of Ferguson’s code. I figured I’d do a review of it and send you some of the things I noticed. I don’t know your background so apologies if some of this is pitched at the wrong level.
My background. I have been writing software for 30 years. I worked at Google between 2006 and 2014, where I was a senior software engineer working on Maps, Gmail and account security. I spent the last five years at a US/UK firm where I designed the company’s database product, amongst other jobs and projects. I was also an independent consultant for a couple of years. Obviously I’m giving only my own professional opinion and not speaking for my current employer.
The code. It isn’t the code Ferguson ran to produce his famous Report 9. What’s been released on GitHub is a heavily modified derivative of it, after having been upgraded for over a month by a team from Microsoft and others. This revised codebase is split into multiple files for legibility and written in C++, whereas the original program was “a single 15,000 line file that had been worked on for a decade” (this is considered extremely poor practice). A request for the original code was made 8 days ago but ignored, and it will probably take some kind of legal compulsion to make them release it. Clearly, Imperial are too embarrassed by the state of it ever to release it of their own free will, which is unacceptable given that it was paid for by the taxpayer and belongs to them.
The model. What it’s doing is best described as “SimCity without the graphics”. It attempts to simulate households, schools, offices, people and their movements, etc. I won’t go further into the underlying assumptions, since that’s well explored elsewhere.
Non-deterministic outputs. Due to bugs, the code can produce very different results given identical inputs. They routinely act as if this is unimportant.
This problem makes the code unusable for scientific purposes, given that a key part of the scientific method is the ability to replicate results. Without replication, the findings might not be real at all – as the field of psychology has been finding out to its cost. Even if their original code was released, it’s apparent that the same numbers as in Report 9 might not come out of it.
Non-deterministic outputs may need some explanation, as they’re not something anyone had previously floated as a possibility.
The documentation says:
The model is stochastic. Multiple runs with different seeds should be undertaken to see average behaviour.
“Stochastic” is just a scientific-sounding word for “random”. That’s not a problem if the randomness is intentional pseudo-randomness, i.e. the randomness is derived from a starting “seed” which is iterated to produce the random numbers. Such randomness is often used in Monte Carlo techniques. It’s safe because the seed can be recorded and the same (pseudo-)random numbers produced from it in future. Any kid who’s played Minecraft is familiar with pseudo-randomness because Minecraft gives you the seeds it uses to generate the random worlds, so by sharing seeds you can share worlds.
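To make this concrete, here is a minimal C++ sketch – nothing to do with Imperial’s code – showing why a recorded seed makes “random” output perfectly repeatable:

```cpp
#include <iostream>
#include <random>

int main() {
    const unsigned seed = 12345;                 // record the seed...
    std::mt19937 gen(seed);                      // ...and the generator it drives
    std::uniform_real_distribution<double> dist(0.0, 1.0);

    // With the same seed, compiler and standard library, every run prints
    // exactly the same five numbers, so any result built on them can be
    // replicated later.
    for (int i = 0; i < 5; ++i)
        std::cout << dist(gen) << '\n';
    return 0;
}
```

Run it twice and you get the same five numbers. That is all “replicable stochastic model” means – and it is precisely the property the Imperial code turns out not to have.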
Clearly, the documentation wants us to think that, given a starting seed, the model will always produce the same results.
Investigation reveals the truth: the code produces critically different results, even for identical starting seeds and parameters.
I’ll illustrate with a few bugs. In issue 116 a UK “red team” at Edinburgh University reports that they tried to use a mode that stores data tables in a more efficient format for faster loading, and discovered – to their surprise – that the resulting predictions varied by around 80,000 deaths after 80 days.

That mode doesn’t change anything about the world being simulated, so this was obviously a bug.
The Imperial team’s response is that it doesn’t matter: they are “aware of some small non-determinisms”, but “this has historically been considered acceptable because of the general stochastic nature of the model”. Note the phrasing here: Imperial know their code has such bugs, but act as if it’s some inherent randomness of the universe, rather than a result of amateur coding. Apparently, in epidemiology, a difference of 80,000 deaths is “a small non-determinism”.
Imperial advised Edinburgh that the problem goes away if you run the model in single-threaded mode, like they do. This means they suggest using only a single CPU core rather than the many cores that any video game would successfully use. For a simulation of a country, using only a single CPU core is obviously a dire problem – as far from supercomputing as you can get. Nonetheless, that’s how Imperial use the code: they know it breaks when they try to run it faster. It’s clear from reading the code that in 2014 Imperial tried to make the code use multiple CPUs to speed it up, but never made it work reliably. This sort of programming is known to be difficult and usually requires senior, experienced engineers to get good results. Results that randomly change from run to run are a common consequence of thread-safety bugs. More colloquially, these are known as “Heisenbugs”.
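For non-programmers, here is a deliberately simplified sketch of how a thread-safety bug of this kind behaves. This is an illustration of the general failure mode, not code taken from the model:

```cpp
#include <iostream>
#include <thread>
#include <vector>

int main() {
    long long total = 0;                         // shared state, deliberately unprotected

    auto work = [&total] {
        for (int i = 0; i < 1000000; ++i)
            ++total;                             // data race: updates from different
                                                 // threads overwrite each other
    };                                           // (formally undefined behaviour,
                                                 //  which is exactly the point)
    std::vector<std::thread> pool;
    for (int t = 0; t < 8; ++t)
        pool.emplace_back(work);
    for (auto& th : pool)
        th.join();

    // The "right" answer is 8,000,000, but an unsynchronised build typically
    // prints a smaller number, and a different one on every run. Guarding
    // `total` with std::atomic<long long> or a mutex restores determinism.
    std::cout << total << '\n';
    return 0;
}
```

Identical inputs, different outputs on every run: that is what a threading bug looks like from the outside, and why “run it single-threaded” is a workaround rather than a fix.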
But Edinburgh came back and reported that – even in single-threaded mode – they still see the problem. So Imperial’s understanding of the issue is wrong. Finally, Imperial admit there’s a bug by referencing a code change they’ve made that fixes it. The explanation given is “It looks like historically the second pair of seeds had been used at this point, to make the runs identical regardless of how the network was made, but that this had been changed when seed-resetting was implemented”. In other words, in the process of changing the model they made it non-replicable and never noticed.
Why didn’t they notice? Because their code is so deeply riddled with similar bugs and they struggled so much to fix them that they got into the habit of simply averaging the results of multiple runs to cover it up… and eventually this behaviour became normalised within the team.
In issue #30, someone reports that the model produces different outputs depending on what kind of computer it’s run on (regardless of the number of CPUs). Again, the explanation is that although this new problem “will just add to the issues” … “This isn’t a problem running the model in full as it is stochastic anyway”.
Although the academic on those threads isn’t Neil Ferguson, he is well aware that the code is filled with bugs that create random results. In change #107 he authored he comments: “It includes fixes to InitModel to ensure deterministic runs with holidays enabled”. In change #158 he describes the change only as “A lot of small changes, some critical to determinacy”.
Imperial are trying to have their cake and eat it. Reports of random results are dismissed with responses like “that’s not a problem, just run it a lot of times and take the average”, but at the same time, they’re fixing such bugs when they find them. They know their code can’t withstand scrutiny, so they hid it until professionals had a chance to fix it, but the damage from over a decade of amateur hobby programming is so extensive that even Microsoft were unable to make it run right.
No tests. In the discussion of the fix for the first bug, Imperial state the code used to be deterministic in that place but they broke it without noticing when changing the code.
Regressions like that are common when working on a complex piece of software, which is why industrial software-engineering teams write automated regression tests. These are programs that run the program with varying inputs and then check the outputs are what’s expected. Every proposed change is run against every test and if any tests fail, the change may not be made.
The Imperial code doesn’t seem to have working regression tests. They tried, but the extent of the random behaviour in their code left them defeated. On 4th April they said: “However, we haven’t had the time to work out a scalable and maintainable way of running the regression test in a way that allows a small amount of variation, but doesn’t let the figures drift over time.”
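For contrast, the most basic regression check a seeded simulation can have is “same seed in, identical numbers out”. Here is a minimal sketch of such a test, with a hypothetical RunModel function standing in for the real simulation (it is not Imperial’s entry point):

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical stand-in for the model: imagine the whole simulation is
// driven from a single recorded seed and returns its headline outputs.
std::vector<double> RunModel(unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> out;
    for (int i = 0; i < 100; ++i)
        out.push_back(dist(gen));
    return out;
}

int main() {
    // Identical seed and parameters must give bit-identical output. A test
    // this simple, run on every proposed change, would have caught the
    // regressions discussed above; no "small amount of variation" is needed
    // for properly deterministic code.
    assert(RunModel(42) == RunModel(42));
    return 0;
}
```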
Beyond the apparently unsalvageable nature of this specific codebase, testing model predictions faces a fundamental problem, in that the authors don’t know what the “correct” answer is until long after the fact, and by then the code has changed again anyway, thus changing the set of bugs in it. So it’s unclear what regression tests really mean for models like this – even if they had some that worked.
Undocumented equations. Much of the code consists of formulas for which no purpose is given. John Carmack (a legendary video-game programmer) surmised that some of the code might have been automatically translated from FORTRAN some years ago.
For example, on line 510 of SetupModel.cpp there is a loop over all the “places” the simulation knows about. This code appears to be trying to calculate R0 for “places”. Hotels are excluded during this pass, without explanation.
This bit of code highlights an issue Caswell Bligh has discussed in your site’s comments: R0 isn’t a real characteristic of the virus. R0 is both an input to and an output of these models, and is routinely adjusted for different environments and situations. A model that consumes its own outputs as inputs is a problem well known to the private sector – it can lead to rapid divergence and incorrect predictions. There’s a discussion of this problem in section 2.2 of the Google paper, “Machine learning: the high interest credit card of technical debt”.
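A toy illustration of that feedback problem – the “model” and the numbers here are made up purely to show the mechanism, and have nothing to do with the Imperial code:

```cpp
#include <cstdio>

int main() {
    double r_estimate = 2.5;              // initial guess fed into the "model"
    const double gain = 1.2;              // toy model that amplifies its own input

    for (int iteration = 1; iteration <= 10; ++iteration) {
        r_estimate = gain * r_estimate;   // the output becomes the next input
        std::printf("iteration %d: R estimate = %.2f\n", iteration, r_estimate);
    }
    // After 10 iterations the estimate has grown by a factor of about six:
    // a system that feeds its own outputs back in with an effective gain
    // above 1 runs away from reality instead of settling on the true value.
    return 0;
}
```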
Continuing development. Despite being aware of the severe problems in their code that they “haven’t had time” to fix, the Imperial team continue to add new features; for instance, the model attempts to simulate the impact of digital contact tracing apps.
Adding new features to a codebase with this many quality problems will just compound them and make them worse. If I saw this in a company I was consulting for I’d immediately advise them to halt new feature development until thorough regression testing was in place and code quality had been improved.
Conclusions. All papers based on this code should be retracted immediately. Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one.
On a personal level, I’d go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people, and the results speak for themselves.
My identity. Sue Denim isn’t a real person (read it out). I’ve chosen to remain anonymous partly because of the intense fighting that surrounds lockdown, but there’s also a deeper reason. This situation has come about due to rampant credentialism and I’m tired of it. As the widespread dismay among programmers demonstrates, if anyone in SAGE or the Government had shown the code to a working software engineer they happened to know, alarm bells would have been rung immediately. Instead, the Government is dominated by academics who apparently felt unable to question anything done by a fellow professor. Meanwhile, average citizens like myself are told we should never question “expertise”. Although I’ve proven my Google employment to Toby, this mentality is damaging and needs to end: please, evaluate the claims I’ve made for yourself, or ask a programmer you know and trust to evaluate them for you.
Devastating. Heads must roll for this, and fundamental changes be made to the way government relates to academics and the standards expected of researchers. Imperial College should be ashamed of themselves.
The UK government should be just as ashamed for taking their advice.
And anyone in the media who repeated their nonsense.
But the paper never explicitly recommended full lockdown. School closures, yes. Case isolation and social distancing, yep. But it doesn’t say anything about not going to work, not exercising frequently or travelling. Nor does it say anything about well people remaining in their homes and only being allowed to leave them with a “reasonable excuse”…
Ferguson is on video telling us – not suggesting – that millions of people will die if we don’t implement China-style lockdowns.
Closing schools and going to work ???
A little Home Alone never hurt anyone🤣
Bullshit! A friend’s brother just killed himself because of it…
So says the demented government
Tell that to the over 18,000 additional people who died of heart and circulatory disease in April. https://www.cdc.gov/nchs/nvss/vsrr/covid19/excess_deaths.htm
I brought that up at the time, and was shouted down by the usual suspects. Grandparents would look after the children. But aren’t they the very ones at risk? Children are the least likely to get the virus, but they would certainly carry it straight to their grandparents.
I’ve written several pieces on the subject of the virus circus.
Now I’m just wanting some people to take responsibility for the hell this house arrest has caused.
Not explicitly, true maybe, but when does any government need more than implication to force its sickening, power-hungry will upon the general public?
Sure it does. It’s what the paper refers to as “suppression,” rather than “mitigation.”
This is a silly question, but which paper do you mean?
Ah, hindsight….
The problem is the nature of government and politics. Politics is a systematic way of transferring the consequences of inadequate or even reckless decision-making to others without the consent or often even the knowledge of those others. Politics and science are inherently antithetical. Science is about discovering the truth, no matter how inconvenient or unwelcome it may be to particular interested parties. Politics is about accomplishing the goal of interested parties and hiding any truth that would tend to impede that goal. The problem is not that “government has been doing it wrong;” the problem is that government has been doing it.
This article explains how such software should be written. (After the domain experts have reasoned out a correct model and had it verified by open peer review, and if possible by formal methods).
“They Write the Right Stuff” by Charles Fishman, December 1996
https://www.fastcompany.com/28121/they-write-right-stuff
After all, only 7 lives depended directly on the Space Shuttle software. The Imperial College program seems likely to have cost many thousands of extra deaths, and to have seriously damaged the economies and societies of scores of countries, affecting possibly billions of lives.
So why should the resources invested in the two efforts have been so vastly different?
I agree totally. The underfunding of important programs like this feeds into the quality of the resultant model. Concerning Sue Denim’s point, it doesn’t mean that it should be privatised and the work transferred to the insurance sector. As a sector they have a large vested interest in a more biased model, at least more than the average fame-hungry epidemiologist researcher. The whole purpose of scientific research is to push the boundaries of understanding, so politicians should be analytical enough to understand the limitations of research. It is akin to going to war with a prototype F-35: reckless.
You don’t need a lot of funds to review code; they could actually open-source it and the community would destroy it for them.
You’ve misread; that is not my point. I agree that the code should be reviewed and open source. However, it’s more that the investment of time and resources should have been made prior to COVID, not as an after-the-fact effort.
The solution to incompetence and fraud is not to give more money to incompetent frauds.
Code review is not about “destroying” the code, it’s about improving it: not least, the knowledge it’s going to be reviewed improves the code as it’s written …
The term “destroy” in the context of code reviews is used often, and means destroying the credibility of the code – exposing its flaws and failures. So, as you say, it is a positive thing.
A key lesson is that governments should equip themselves with the capacity to critically appraise the risk of bias in scientists’ work. What strikes me is that professors in epidemiology and public health believe that such models are worth presenting to policy makers. The WHO, in its guideline for non-pharmaceutical interventions against influenza, grades mathematical models as a very low level of evidence.
But so many virologists were prepared to stand up against Imperial College and were silenced. That’s the real issue to be dealt with. Scientists who have worked in the field of respiratory diseases were not asked to talk to Government or go on advisory committees. It was the pseudo-scientists who were paid by Pfizer and Bill Gates whose advice was sought. Men and women who already had an agenda and an investment in vaccines etc.
I agree with the sentiment, but this is not science, and it’s only important because government officials were led to believe that it was science.
The entire notion of a ‘social science’ is the biggest intellectual fraud in human history and is only made possible by academics who exploited the hard earned credibility of the physical sciences to elevate the status of their own fields.
It’s nothing to do with funding. In fact the funding that Bill Gates has ploughed into Imperial College means he can call the shots.
However, it has everything to do with Governments being in bed with Bankers and Globalists. A real patriotic Government would look to the safety of the country, its people and its economy. It would also look at the figures for deaths from flu viruses over the last 50 years against real diseases like Ebola. Then they would talk to top virologists across the board. Nothing as sensible as this ever happens because our Government is run by Globalists.
That’s the other side of the horse. You need more granularity.
But they won’t. Everyone involved in this now has skin in the game to ensure NOTHING happens and the lockdown carries on as if it’s the only thing keeping the entire country from dying.
Well, this is exactly why there is a growing movement in academia at grassroots level to campaign for groups to use proper software practices (version control, automated testing and so on).
No. The issue with this analysis is that it attempts to discredit the Imperial code. It does not say that lockdown should not have taken place. It does not propose an alternative model that says a different course of action should be followed. It is reasonable to state that lockdown was the right approach given the available data and models – this article does nothing to objectively show that a different course of action would have led to different results. Taking an approach of risk mitigation, i.e. lockdown, is the sensible approach given the output (including variances due to the code and any bugs). Interested if there is any view to substantiate a different path.
There was no available reliable data: ‘From Jan 15 to March 3, 2020, seven versions of the case definition for COVID-19 were issued by the National Health Commission in China. We estimated that when the case definitions were changed, the proportion of infections being detected as cases increased by 7·1 times (95% credible interval [CrI] 4·8–10·9) from version 1 to 2, 2·8 times (1·9–4·2) from version 2 to 4, and 4·2 times (2·6–7·3) from version 4 to 5. If the fifth version of the case definition had been applied throughout the outbreak with sufficient testing capacity, we estimated that by Feb 20, 2020, there would have been 232 000 (95% CrI 161 000–359 000) confirmed cases in China as opposed to the 55 508 confirmed cases reported.’ https://www.thelancet.com/journals/lanpub/article/PIIS2468-2667(20)30089-X/fulltext So the use of a model, any model, in preference to, say, canvassing the best advice of a panel of epidemiologists with many years of experience of coronaviruses was, at best, ill judged. ‘Sunlight will cut the virus ability to grow in half so the half-life will be 2.5 minutes and in the dark it’s about 13m to 20m. Sunlight is really good at killing viruses. That’s why I believe that Australia and the southern hemisphere…
If I remember correctly with SARS, they reduced the effort early-on in Toronto, perhaps from business pressure, and the thing came back. One of the areas needing some thought is dispersal via sewage treatment.
He/she would need a model of his/her own, and a much better one, to analyse results and compute whether this or another course of action was best. Do you suppose there is a better model that is, for some reason, not being used? It is of course not certain that a faulty model would produce the correct answer – even a stopped watch is right twice a day – but it is quite likely.
Sorry, missed a negative there. Not produce
The analysis doesn’t just “attempt[] to discredit the Imperial code.” It does so successfully.
And we now know that the Imperial model’s projections do not match the real outcomes.
Vast social and economic changes have been forced on the populace as a result of bad modeling and unreliable data.
It is emphatically not incumbent on critics of the models, the data gathering, or the lockdown regime to put forward their own models or data, let alone some alternative set of response measures.
“It does so successfully.” Not necessarily. The article presupposes that the code should stand up to the sort of tests commercial software engineers use when creating distributable software for general consumption. This is not the point of Ferguson’s code. Statistical models tend to generate their results by being run thousands of times using different starting values. This produces a set of vectors of output values which, when plotted on a linear chart, show a convergence around a given set of values. It is the converged values which are used to predict the distribution they are trying to model, and so provided that Imperial College knew about these flaws (which they say they do – that they are fixing them may be a PR exercise) then it shouldn’t really matter. I find it pretty incomprehensible that the government isn’t being more alert to the criticisms of Ferguson’s model – particularly given how wrong it has been in the past – but I am led to believe that this is more due to its assumptions rather than any particular issue with the code. I have not heard of anyone else implementing the mathematical model in a different program and getting different results…
The article’s author bases his criticism on the presence of bugs and randomness. All software has bugs; the question is “does the bug materially reduce its utility?” I have not seen the code so this criticism is about the author’s assumptions. In an epidemiological model, randomness is a feature, not a bug. The disease follows vectors probabilistically, not deterministically. This isn’t an email program or a database application; if the model always returned the same output for the same input that would be a bug. Prescribing deterministic behavior may prevent discovery of non-linear disease effects.
If the nondeterministic effects result in prediction variances the same order of magnitude as the predictions themselves, there is a fundamental problem that simply cannot be hand-waved out of.
Indeterminism is an essential feature of stochastic modelling, but the outputs of successive model runs ought to converge to form a roughly similar picture if they are to be useful. If they are wildly divergent as a result of the way the program was written, which is the case here, then there is most certainly an issue which needs to be corrected.
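To put some numbers on that, here is a toy stand-in (emphatically not the Imperial model) showing what acceptable seed-to-seed spread looks like:

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

// Toy stand-in for one model run driven by one seed: the "prediction" is a
// single draw, standing in for legitimate Monte Carlo noise.
double RunModel(unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> prediction(50000.0, 500.0);
    return prediction(gen);
}

int main() {
    std::vector<double> results;
    for (unsigned seed = 1; seed <= 100; ++seed)
        results.push_back(RunModel(seed));

    double mean = 0.0;
    for (double r : results) mean += r;
    mean /= results.size();

    double variance = 0.0;
    for (double r : results) variance += (r - mean) * (r - mean);
    const double sd = std::sqrt(variance / results.size());

    // A usable stochastic model shows a spread (sd) across seeds that is
    // small relative to the mean; run-to-run differences of the same order
    // as the prediction itself point to bugs, not "stochastic behaviour".
    std::printf("mean = %.0f  sd = %.0f\n", mean, sd);
    return 0;
}
```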
If the model is flawed at its core then its settling on a particular converged set of values after thousands of runs lends no more credence to its accuracy than a single run. Given the real-world data, the model seems flawed at its core.
“Statistical models tend to generate their results by being run thousands of times using different starting values.”
Yes, but the idea of having different starting values is that when you run a model twice with the same starting values, it is supposed to give the same result each time.
The Imperial model cannot do that, which means it is not and cannot be correct.
There are several possible problem areas. Subtraction/multiplication/division of small floating point values is a beginner mistake, and yet the code quality sounds so bad that I bet there are some of those as well.
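A minimal illustration of the kind of cancellation error I mean (not taken from the Imperial code):

```cpp
#include <cstdio>

int main() {
    const double a = 1.0000001;
    const double b = 1.0000000;

    // The true difference is 1e-7, but subtracting two nearly equal doubles
    // throws away roughly half of the significant digits ("catastrophic
    // cancellation"), and dividing by the noisy result magnifies the error.
    const double diff = a - b;
    std::printf("diff     = %.17g (exact answer is 1e-07)\n", diff);
    std::printf("1 / diff = %.17g (exact answer is 1e+07)\n", 1.0 / diff);
    return 0;
}
```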
“It is reasonable to state that lockdown was the right approach given available data and models”
What evidence are you thinking of that demonstrates that lockdown worked in the past and therefore makes it a good policy for this pandemic? From my understanding this is the first such blanket lockdown. We know that self-isolation works for individuals but are you extrapolating from individuals to the whole population?
Your thinking reminds me of people that think that because washing your hands prevents the spread of disease, it must therefore be good to keep your baby in a clean environment: makes sense, logically it all hangs together. Unfortunately it is also a bad assumption because immunity does not work that way. Babies exposed to more germs are generally healthier in the long term: doesn’t make sense, but there you go. That is the advantage of science – real science involves reproducible outcomes and experimentation that is sometimes surprising; it does not use heavily flawed models skewed by bad assumptions.
‘Taking an approach of risk mitigation, I.e. lockdown, is the sensible approach’. . . . Lockdown attempts to mitigate one risk–spread of infection–while introducing numerous others, none of which are modeled. The world is far more complex than mathematical-modeling infectious disease epidemiologists seem to realise. This was not a sensible approach, which is why it is an approach that has never been taken for any pandemic in the history of the world prior to this.
JHaywood, that isn’t an “issue”; it’s merely a single point of fact. That the code is a mess and produces muddled results is only one piece of the puzzle. Another very important point is that even the best modeling done with the best code is only as good as the data entered into it. The “data” used to create this model were largely untested assumptions.
It’s fair enough to state that for the first couple of weeks, that’s the best we had. But sound thinkers would have realized the assumptions were assumptions and observed and collected data to TEST them and adjust as necessary. It took weeks to get anyone to REALLY look at most of them, and one by one, we’re seeing the assumptions proven false. There’s no excuse for having not sought answers to these questions — which “mere laypeople” were raising as early as January — much sooner.
Quite honestly Imperial College should never have been used. Ferguson should have been sacked for his previous disasters and the codes they use should have been scrapped years ago. Everyone is barking up the wrong tree. The Swedish Doctor who advised his country to mistrust the code and stay open understood that Ferguson was modelling to a certain outcome which would create wealth for Big Pharma. He understood that Pfizer is under a prosecution for $billions regarding the disastrous vaccine programme in India.
Our government must have known this too. Oxford University said the modelling was way off at the very beginning of this debacle. Yet the Government cosied up to Bill Gates, GAVI and the WHO. TRUMP called all these people out but Boris gave them £millions more of our taxes. These are where the questions should be focussed.
Yes there is a different path, which is to treat COVID-19 as a normal disease just like any other illness. Sweden has shown how the rest of the world should have coped with this issue. There was unnecessary hype created all over the world just for a simple cold and cough. I mean seriously, who says we are more advanced than before? We are still living in the stone age, being driven by fear rather than science or logic. There is no scientific basis for calling SARS-CoV-2 a deadly virus: there are no peer-reviewed reports to date which claim that this virus is indeed capable of inflicting serious damage on otherwise healthy people. It is just getting tagged as the cause of death, even when the actual causes are the co-morbidities and co-infections.
The past two months have shown that no matter how much pride we take in us being scientifically advanced, in the end when put to a real testing situation we still suffer pathetically as before.
Imperial College coding and Ferguson have been discredited under swine flu, foot and mouth and bird flu. Each time there has been an outcry as to how wrong the code has been and how many animals unnecessarily destroyed and farms gone into liquidation. Such short memories some people have.
Vital to factor in Britain’s endemic corruption before seeking head-roll redress. There is none.
I speak from experience. Case study: https://spoilpartygames.co.uk/?page_id=4454
It isn’t devastating at all.
Heads need to roll not only at Imperial but at the government as well. This is total incompetence, and who is going to accept responsibility for the sheer destruction of the economy and those who have been made redundant? Why were parliament and the government not aware of the previous modelling problems associated with this same professor, re the mad cow disease, when 6m cattle were slaughtered for no reason at all? Why was his history not checked?
It’s not incompetence, it’s greed. You cannot tell me that 650 MPs don’t have the intelligence to ask sensible, searching questions. Not one asked any questions until Desmond Swayne and Charles Walker were interviewed by alternative media and shown to be highly ignorant of the facts. These two then started to question. Not one other MP did.
This is competence at the highest, most treacherous level, and once again Tony Blair and his Globalists are behind it. Follow the money. Who got very much richer? Who got to call the shots around the world? Who destroyed the careers of the real scientists……
“all models are wrong, some are useful” – if govt hadn’t acted on this it would have been far, far worse, so does it matter? At least they did the right thing as a result… Very easy to pick fault with no better solution..
I don’t think Box had models with MAPE approaching 1000% in mind when he surmised that some are useful. You could obtain more accurate predictions by asking a few random people on the street than by relying on the output of Ferguson’s models.
Can you provide the evidence that supports your statement?
No, it’s what should have happened under Peer review, but this is belatedly being applied.
Some public-spirited large company such as Google or Microsoft (don’t laugh!) should offer to modularise the model so it’s more easily (a) maintained, (b) re-used, (c) tested and (d) updated.
The first thought that springs to my mind is that, irrespective of the coding, hundreds of thousands have died, world wide, from a single cause attributed to this virus.
That, surely, is fairly potent evidence that a virus, that also came within measurable distance of killing the English Prime Minister, and HAS killed countless numbers in this country alone, has been accurately identified as a virus with lethal properties?
Professor ‘Fergason’s coding might have been out, but the virus is, potentially, and actually, a killer, and highly infectious.
Surely that justifies government strategy?
Hundreds of thousands die each year from the influenza virus. Since when has shutting down entire societies and economies been the approach to the flu?
“Attribution” does not equal “causation” in the same way that “anecdotes” do not equal “evidence”.
“Professor ‘Fergason’s [sic] coding might have been out,……”, it was so far out [WRONG] as to be laughable. It is NOT FIT FOR PURPOSE!
Remember that many governments have DEEMED that if you test positive for SARS-CoV-2 then you are counted in the figures regardless of comorbidities.
The US CDC updated their Covid deaths 9/12/20. They had the decency to break it down by co-morbidity and even reported ICD-10 codes. Turns out, that out of 174,470 total deaths they included almost 20,000 deaths from dementia, 6000+ deaths from Alzheimer’s, almost 6000 deaths from suicide/unintentional injury including vehicle accidents, over 8000 deaths from cancer, and the single biggest category of deaths is from “all other conditions”, over 85,000 deaths. Their ICD-10 codes show they report from maternal death during childbirth, child death from childbirth including congenital deformities, metabolic diseases, psychiatric disorders, dermatitis and non-cancerous skin disorders, and a host of others.
All these deaths, by the way, are still from a population on average 80 years of age, and overwhelmingly from nursing homes/care facilities. For background, the US sees ~250,000 deaths every month from all causes. Still alarmed?
source: https://www.cdc.gov/nchs/nvss/vsrr/covid_weekly/index.htm
And I forgot to include, because I don’t believe this is widespread knowledge, the CDC does not require a positive laboratory test in order to count a person as a positive Covid case. Their case definition as of early April included probable cases, whereby you can report even a single symptom to a doctor (who for months were assessing patients remotely by videoconferencing) to satisfy the clinical component, which together with the epidemiological component will be enough to report a “positive case” to public health. The epidemiological component is satisfied by: exposure to positive cases, exposure to untested but symptomatic individuals who themselves had positive exposure, travel/residence in an area with Covid outbreak, or even simply being a member of a risk group. The CDC also says if Covid is listed as a factor on a death certificate, that alone satisfies the vital records component and will add both a positive death and a positive new case to the public record.
Since August the definition of Covid includes a third category, “suspect cases” with regards to antigen/antibody testing, a positive test is considered “supportive evidence” by CDC.
source: https://wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-covid-19/case-definition/2020/08/05/
Agreed. Expert opinion is only as valuable as the reasoning which produces it. What matters for decision makers is the logic and assumptions which underlie the expert’s conclusion. The advice that follows a conclusion also needs to be examined for logical flaws. The cult of the expert has allowed the development of extremely sloppy thinking in both the expert’s field and the decision maker’s field.
Both the advice and conclusions drawn from that advice must be examined for logical flaws.
Thank you so much for this! This code should’ve been available from the outset.
Amateur Hour all round!
The code should have been made available to all other Profs & top Coders & Data Scientists & Bio-Statisticians to PEER Review BEFORE the UK and USA governments made their decisions. Imperial should be sued for such amateur work.
Guy at carnival: Here, drink this
Some ol’bloke : What is it?
Guy at carnival: Never mind, it will fix what’s ailing ya
Some ol’bloke : What’s it cost?
Guy at carnival: It doesn’t matter, it’s a deal at twice the price
Some ol’bloke : What’s in it?
Guy at carnival: Shhhhh, just take 3 swigs
Some ol’bloke : It tastes horrible
Guy at carnival: Ya, but it will help you
Some ol’bloke : …if you say so
Guy at carnival: I know hey, but you feel better already
But “This code” isn’t what Ferguson was running. The code on github has been munged by other authors in attempt to make it less horrifying. We must remember that what he ran was much worse than what we can see, which is bad enough.
This code is IMPROVED (and cleaned, a lot, by professional software engineers) version of code which was run by Ferguson & Co. It’s still a steaming pile of crap. Ferguson refuses to release the original code, if you haven’t noticed. One is left to wonder why.
This is an outstanding investigation. Many thanks for doing it – and to Toby for providing a place to publish it.
So this is ‘the science’ that the Government thinks it is following!
*the Government reminds us*
Says who ?
This isn’t a piece of poor software for a computer game; it is, apparently, the useless software that has shut down the entire western economy. Not only will it have wasted staggeringly vast sums of money but every day we are hearing of the lives that will be lost as a result.
We are today learning of 1.4 million avoidable deaths from TB but that is nothing compared to the UN’s own forecast of “famine on a biblical scale”. Does one think that the odious, inept, morally bankrupt hypocrite, Ferguson, will feel any shame, sorrow or remorse if, heaven forbid, the news in a couple of months’ time is dominated by the deaths of hundreds of thousands of children from starvation in the 3rd World, or will his hubris protect him?
I don’t understand why governments are still going for this ridiculous policy and NGOs all pretend it is Covid 19 that will cause this devastation RATHER than our reaction to it.
It’s the same with the myriad of climate change campaigners. It’s their climate change *policies* that are dangerous, not climate change itself (whatever ‘climate change’ means!).
Simple – they are afraid to say that they have made a mistake. And, people who follow this are afraid, as per The Emperor’s New Clothes, to admit that they are being used as gullible fools.
Imperial and the Professor should start to worry about claims for losses incurred as a result of decisions taken based on such a poor effort. Could we know, please, what this has cost, over how many years, and how much of the Professor’s career has been achieved on the back of it?
Remember that Ferguson has a track record of failure:
In 2002 he predicted 50,000 people would die of BSE. Actual number: 178 (national CJD research and surveillance team).
In 2005 he predicted 200 million people would die of avian flu H5N1. Actual number according to the WHO: 78.
In 2009 he predicted that swine flu H1N1 would kill 65,000 people. Actual number: 457.
In 2020 he predicted 500,000 Britons would die from Covid-19.
Still employed by the government. Maybe 5th time lucky?
Actually he didn’t. The model said if no action was taken up to 500,000 people could die. Please weigh in objectively to support or challenge the theory above.
Maybe but he’ll have to step up his game.
The figure of 500,000 deaths was based on the government’s ‘do nothing, business as usual to achieve herd immunity’ strategy then in effect. Ferguson predicted 250,000 deaths if the government acted as it has done since.
Yeah… way more people died of BSE than 178…
Source?
Do you mean just in the UK? Because swine flu killed way more than 457 in the parts of the world where they didn’t vaccinate.
Ferguson should be retired and his team disbanded. As a former software professional I am horrified at the state of the code explained here. But then, the University of East Anglia code for modelling climate change was just as bad. Academics and programming don’t go together.
At the very least the Government should have commissioned a Red team vs Blue team debate between Ferguson and Oxford plus other interested parties, with full disclosure of source code and inputs.
I support the idea of letting the Insurance industry do the modelling. They are the experts in this field.
The software is irrelevant: a convenient peg to hang a global action on, for reasons I cannot divine at present but which will become clearer.
Ferguson and Oxford are the same team. If you look at the authors of the Ferguson papers you’ll find Oxford names there. If you look at the authors of papers from John Edmunds group you’ll find people who hold posts at Imperial. These groups are not independent.
I read that Ferguson has a house in Oxford.
There was a RANGE from the MODEL, not a PREDICTION. From a 2002 report by the Guardian (https://www.theguardian.com/education/2002/jan/09/research.highereducation)
“The Imperial College team predicted that the future number of deaths from Creutzfeldt-Jakob disease (vCJD) due to exposure to BSE in beef was likely to lie between 50 and 50,000.
In the “worst case” scenario of a growing sheep epidemic, the range of future numbers of death increased to between 110 and 150,000. Other more optimistic scenarios had little impact on the figures.
The latest figures from the Department of Health, dated January 7, show that a total of 113 definite and probable cases of vCJD have been recorded since the disease first emerged in 1995. Nine of these victims are still alive.”
““The Imperial College team predicted that the future number of deaths from Creutzfeldt-Jakob disease (vCJD) due to exposure to BSE in beef was likely to lie between 50 and 50,000…..” That’s three orders of magnitude for the margin of error!! What other science would accept such a wide margin of error?
“The latest figures from the Department of Health, dated January 7, show that a total of 113 definite and probable cases of vCJD have been recorded since the disease first emerged in 1995. Nine of these victims are still alive.”” So strictly speaking the Imperial College was correct, thankfully the reality was within their lowest estimate.
Pathetic review. You should go through the logic of what is coded and not write superficial criticisms which imply you know nothing of what you critique.
I couldn’t disagree more. The issue isn’t the virology, or the immunology, or even the behaviour of whatever disease is being examined / simulated. It is the programming discipline applied to the modelling effort. I doubt the author has the domain-specific expertise to comment on the immunological (etc) assumptions embedded in the program. What the author does have is the programming expertise to identify that the model could not produce useful output, no matter how accurate the virology / immunology assumptions, because the software that translated those assumptions into predictions of infections and case loads was so poorly written.
If only the code could actually be understood. It’s so bad you can’t even be certain of what exactly it’s doing.
Pretty sure the only point of the article was to bring to light the fact that the “model” is flawed and Ferguson has a track record of being VERY wrong on mortality rate predictions based upon flawed models. Solution: stop it. This time around it almost took down an entire country’s economy because of elitists’ overreaction and overreach. Just stop it.
“‘almost’ took down an entire country’s economy”
They haven’t stopped the Lockdown yet; plenty of time yet to destroy small businesses.
I’m afraid Ferguson is a very small part of the plan, and merely doing what he was hired for by KillBill.
It’s inappropriately UK-centric to speak of “the useless software that has shut down the entire western economy”. All governments have scientific advisors, there’s lots of modelling going on in many countries, and much of this influenced the lockdown decisions all over the world. If I remember it correctly, when Italy started its lockdown, Imperial hadn’t yet made their recommendation, and many if not most countries have not relied on Imperial. The software may be garbage, but the belief that there wouldn’t be strong scientific arguments for a lockdown without that piece is nonsense as well.
Thank goodness I’ve encountered a small injection of level-headedness here. The original critique is limited, appropriately, to the flawed coding and reliance on its outputs to inform UK policy; it draws none of the sweeping conclusions that others here seem to think are implied – perhaps owing to their own biases. (I came to this site thinking it was named for skeptics who are in lockdown, before I realised it was for skeptics *of* lockdown – so, yeah, plenty of motivated reasoning and politically-charged statements masquerading as incontrovertible truths, but hey, I’m just an actual skeptic… ) So anyway, I’m glad that Lewian has pointed out, because somehow it needed to be, that the world is bigger than the UK and that science (note: not code or software or politics or a dude called Neil) does not operate in a vacuum. One needn’t input even a single data point into a single model in order to undertake risk mitigation strategies if you (and by you, I mean the relevant scientific minds, not actually you) have even a comparatively rudimentary understanding of an infectious agent such as Covid-19. It’s simple cause and effect, extrapolated. Want to be a skeptic in the…
It seems to have been the primary determinant here in the US, too (and, I think, in Canada). Or at least that’s what they’re telling us.
I think you are all missing the point. The use of the model was to impress upon the U.S. President the severity of the outbreak. He needed more than “the best available science from the most experienced scientists and researchers in virology, infectious disease, public health and epidemiology” to take it seriously. Clearly, they had done their own modeling and assessment. This thread is what happens when you lose yourself in the code and aren’t looking at the big picture.
Why any of this isn’t obvious to our politicians says a lot about our politicians, but your summary also shows that it is ENGINEERS and not academics who should be generating the input to policy making. It is only engineers who have the discipline to make things work, properly and reliably.
For decades I have opined that our society was exposed to the risk inherent in being a technologically dependent culture governed by the technically illiterate. QED?
“The Chinese Government Is Dominated by Scientists and Engineers”
https://gineersnow.com/leadership/chinese-government-dominated-scientists-engineers
They are also communists. Which is another way of saying “psychopathic liars”.
No, scientists can write perfectly good code if they have the incentive to do so. Heck, most of the really important math codebases have been written by scientists. But the problem is, most scientists have an incentive to publish quickly, but not to ensure that their methods follow good engineering practice, even when it should be mandated. This has bitten the climatologists in the butt with the so-called “climategate”. Congressional enquiries showed that their integrity was intact and that their methods were sound and followed standard scientific practice. But they lacked transparency, and therefore it was recommended that they should from now on make public all their numerical code and all their data. This has become widespread practice in climatology. Unfortunately, that still isn’t the case in other branches of science. It should be.
A good point, but should you not add two other categories to the statement? First, civil servants; unlike the politicians, these are employed to use their expertise in advising politicians. They tend to be recruited by other civil servants, rather than the politicians.
The second group is journalists. I have seen no mention of this kind of criticism aired publicly by journalists. Indeed, this touches on another of my gripes; in the almost never-ending press conferences, current affairs programmes and interviews, the same old questions are asked over and over again, to be answered by the same generalised statements, while the more interesting and detailed matters are omitted, or, in a tiny number of occasions, interrupted or run out of time.
This kind of thing frequently happens with academic research. I’m a statistician and I hate working with academics for exactly this sort of reason.
the global warming models are secret too (mostly) and probably the same kind of mess as this code
Perhaps, if enough people come to understand how badly this has been managed, they will start to ask the same questions of the climate scientists and demand to see their models published.
It could be the start of some clearer reasoning on the whole subject, before we spend the trillions that are being demanded to avert or mitigate events that may never happen.
These so-called climate scientists were asked to provide the data, but they came back and said they had lost the data when they moved offices.
Michael Mann pointedly refused to share his modelling code for climate change when he was sued for libel in a Canadian court. He ended up losing, which will cost him millions. Now why would an academic rather lose millions of dollars than show their working?
Lets hope this “workings not required” doesn’t get picked up by schoolkids taking their exams 🙂
Tried to find something about this on the BBC news site. Found this:
https://www.bbc.com/news/uk-politics-52553229
At the end of the article, there is “analysis” from a BBC health correspondent.
With such pitiful performance from the national broadcaster, I think Ferguson and his team will face no consequences.
LOL what a load of crap, it’s the other way around: it’s Mann who sued.
“In 2011 the Frontier Centre for Public Policy think tank interviewed Tim Ball and published his allegations about Mann and the CRU email controversy. Mann promptly sued for defamation[61] against Ball, the Frontier Centre and its interviewer.[62] In June 2019 the Frontier Centre apologized for publishing, on its website and in letters, “untrue and disparaging accusations which impugned the character of Dr. Mann”. It said that Mann had “graciously accepted our apology and retraction”.[63] This did not settle Mann’s claims against Ball, who remained a defendant.[64] On March 21, 2019, Ball applied to the court to dismiss the action for delay; this request was granted at a hearing on August 22, 2019, and court costs were awarded to Ball. The actual defamation claims were not judged, but instead the case was dismissed due to delay, for which Mann and his legal team were held responsible”
Yes, Mann brought the case; on the other hand, it’s also correct that the case was dismissed when he didn’t produce his code, 9 years after the case started. The step that caused the eventual dismissal of the case was that Mann applied for an adjournment, and the defendants agreed on the condition that he supplied his code. Mann didn’t do that by the deadline specified, and the case was then dismissed for delay. Mann did say he would appeal.
The take-home point is that even though Dr. Mann sued for defamation, he incongruously refused to provide evidence that the supposed defamation was actually false, something he could easily have done.
If I were publicly defamed as a liar, I would wish for my name to be cleared immediately, and the falsehood shown definitively to be untrue. Dr. Mann stonewalled for more than nine years, refusing to provide the evidence which supposedly should have cleared his good name, which suggests that he was using the legal process as a weapon, rather than trying to purge a slur on his character.
It was worse than that. Dr. Mann brought the libel lawsuit against Dr. Timothy Ball, a retiree. Dr. Ball made a truth defence, which is acceptable in Canadian common law, and requested that the plaintiff, Dr. Mann, provide the code and data on which he based his conclusions. Dr. Mann stalled for a decade until, at Dr. Ball’s request to expedite the case due to his age and ill health, the judge threw out the suit.
Tl;dr: Dr. Mann sued for libel, but refused to provide evidence that the supposed libel was, in fact, false. It appears he was hoping that Dr. Ball would run out of money and fold.
Not really, they aren’t secret. But they are indeed garbage. For example you may download the code for GISS GCM ModelE from here: https://www.giss.nasa.gov/tools/modelE/
No. Quite the opposite. This has bitten the climatologists in the butt with the so-called “climategate”. Congressional enquiries showed that their integrity was intact and that their methods were sound and followed standard scientific practice. But they lacked transparency, and therefore it was recommended that they should from now on make public all their numerical code and all their data. This has become widespread practice in climatology.
In fact there is a guide of practice for climatologists:
https://library.wmo.int/doc_num.php?explnum_id=5541
The so-called ‘hockey-team’ were not cleared by the series of inquiries following the release of the ‘climategate’ emails. In fact, the inquiries seemed designed to avoid the serious issues raised by the email dump.
https://www.rossmckitrick.com/uploads/4/8/0/8/4808045/climategate.10yearsafter.pdf
It raises the questions (a) what other academic models that have driven public policy have such bad quality?, and (b) do the climate models suffer in the same way, also making them untrustworthy?
Similar skeptical attention should be paid to the credibility automatically granted to economic model projections – even for decades ahead. Economic estimates are routinely treated as facts by the biggest U.S. newspaper and TV networks, particularly if the estimates are (1) from the Federal Reserve or Congressional Budget Office, and (2) useful as a lobbying tool to some politically-influential interest group.
Academics are paid peanuts in the UK. It’s not the US with their 6 figure salaries. You need to teach 8+ hours, do your administrivia, and perhaps you’ll squeeze a couple of hours in for research at the end (or beginning) of a very long day. Nothing like Google, with its 500K salaries, and its code reviews. Sure, non-determinism sucks, but if the orders of magnitude of results fit expectations from other models, it’s good enough to compete with other papers in the field. Want to change that? Fund intelligent people in academia the way you fund lawyers and bankers. Oh, and managers in private industry will change results if it suits them, so “privatise it” is bollocks.
The problem does not lie with non determinism in the model, but with wild divergence of output.
Just wonderful and sadly utterly devastating. As an IT bod myself and an early-days sceptic, this was such a pleasure to read. Well done.
Thanks for doing the analysis. Totally agree that leaving this kind of job to amateur academics is completely nonsensical. I like your suggestion of using the insurance industry and if I were PM I would take that up immediately.
Scientists provide the science, insurers provide insurance. I would never go to an academic for insurance. There is an obvious conflict of interest with relying on an insurance company. It has a fiduciary responsibility to shareholders and policy making should be entirely separate from the commercial interests of providing health insurance. The purpose of academia, besides providing education, is to pursue R&D in a non-commercial environment where all IP and research products (i.e. papers and codes) are disclosed to the public. Unfortunately, the insurance industry does not work to the same open standard. The industry is plagued by grotesque profiteering and opaque modeling practices – there are few universal standards for modeling. Try getting an insurance company to fully disclose details of its mortality models and provide beautifully curated source code for everyone to reproduce the decisions made by insurance companies when reviewing claims. You will not find one insurance company’s code in the public domain that is representative of production. My experience has been that the insurance industry is on the whole exactly the opposite of what you are proposing – no transparency and is clearly designed to profit on the misfortunes of others. Granted academia has its…
I am not a big fan of the insurance business, but to be objective:
- Actuarial models in the insurance industry are used to determine insurance pricing, not to settle claims. Claims are based on evidence.
- The insurance business is not designed to profit on the misfortunes of others; a perfect insurance business model outcome would be that there were NO misfortunes. One must also remember that the overwhelming desired outcome of purchasers of insurance is that it not be required to make claims.
- Academic science has not fallen victim to capitalism, it has fallen victim to bureaucracy and conformity; if you do not conform to espouse expected and required outcomes you are labeled as a pariah, demonised and excluded. Evidence contradicting official policy is suppressed, falsified, or rationalised away. But see Thomas Kuhn’s ‘The Structure of Scientific Revolutions’ which touched on the herd mentality of structured organizations and eventual paradigm shifts. In the example of these pandemic modelling disasters, the paradigm shift would be to exclude modelling as an influence on government policy, and the manias that can result.
- And finally, in science there is usually no accountability, liability, or consequences except temporary. In this most recent marriage of political…
Look at SetupModel.cpp from line 2060 – pages of nested conditionals and loops with nary a comment. Nightmare!
The best part is that there's all this commented-out code. Was it commented out by accident? Was there a reason for it being there to begin with? Who knows, it's a mystery.
I haven't had time to read the whole article and stopped at the portion where the results can't be replicated. That right there is a huuuuuuge red flag and makes the "models" useless. I'll come back tonight to finish reading. I have to ask: is this the same with the University of Washington IHME models? Why do I have a sneaking suspicion that it is.
The IHME 'model' is much worse – it's just a simple exercise in curve fitting, with little or no actual modelling happening at all. I have collected screenshots of its predictions (for the US, UK, Italy, Spain, Sweden) every few days over the last few weeks so I could track them against reality, and it is completely useless. But, according to what I've read, the US government trusts it! Until a few days ago, its curves didn't even look plausible – for countries on a downward trend (e.g. Italy and Spain), they showed the numbers falling off a cliff and going down to almost zero within days, and for countries still on an upward trend (e.g. the UK and Sweden) they were very pessimistic. However, the figures for the US were strangely optimistic – maybe that's why the White House liked them. They seem to have changed their model in the last few days – the curves look more plausible now. However, plausible-looking curves mean nothing – any one of us could take the existing data (up to today) and 'extrapolate' a curve into the future. So plausibility means nothing – it's just making stuff up based on pseudo-science.… Read more »
Ditto. The IHME predictions are completely silly.
They leap at them for fear of the MSM accusing them of not doing anything.
I had hoped Donald Trump would be a stronger leader than that and would have insisted on any model being independently and repeatedly verified before making any decision.
The other factor that seems entirely missing from the models is the ability of existing medicines, even off-label ones, to treat the virus. There have been many trials of hydroxychloroquine with zinc sulphate (and some also with azithromycin) that have demonstrated great success. It constantly dismays me that this is ignored, and here in the UK patients are just given paracetamol, as if they have a headache!!
I offer a critical review of past and present IHME death projections here: https://www.cato.org/blog/six-models-project-drop-covid-19-deaths-states-open
Could these popularity contest winners perhaps just be idiots? Occam’s razor applies.
“It’s because they hate not knowing what’s going to happen, so they are willing to believe anyone with academic credentials who claims to have a crystal ball.”
The problem with this one is that Neil Ferguson and Imperial College have been consistently wrong.
I'm a guy who's been working in the biz for 40+ years. Just a grunt, but paid pretty well for being a grunt. The "can't be replicated" is insane.
The only time "can't be replicated" is excusable is when real time is involved. If you can't say "Ready, set, go" with the same set of data and assumptions plugged in, you have some serious issues going on.
"But we have to multi-thread… on multiple CPU cores or we won't get results fast enough." OK, then you got bogus results.
This is scary stuff. I've been a professional developer and researcher in the finance sector for 12 years. My background is a Physics PhD. I have seen this sort of single-file code structure a lot and it is a minefield for bugs. This can be mitigated to some extent by regression tests, but that is only as good as the number of test scenarios that have been written. Randomness cannot just be dismissed like this. It is difficult to nail down non-determinism, but it can be done, and it requires the developer to adopt some standard practices to lock down the computation path. It sounds like the team have lost control of their codebase and have their heads in the sand. I wouldn't invest money in a fund that was so shoddily run. The fact that the future of the country depends on such code is a scandal.
'Software volatility' is the expression, Robin, and it is always bad.
I have not looked at Neil Ferguson's model and I'm not interested in doing so. Ferguson has not influenced my thinking in any way and I have reached my own conclusions, on my own. I made my own calculation at the end of January, estimating the likely mortality rate of this virus. I'm not going to tell you the number, but suffice to say that I decided to fly to a different country, stock up on food and lock myself up so I have no contact with anybody, right at the beginning of February, when nobody else was stocking up yet, nobody else was locking themselves up, and people thought it was all a bit strange. When I flew to my isolation location, I wore a mask, and everyone thought that was a bit strange. Draw your own conclusions. I've read this review. Firstly, I'll stress this again: I'm not going to defend Ferguson's model. I have not seen it. I don't know what it's like. I don't know if it's any good. I don't share Ferguson's politics, even less so those of his girlfriend. His estimate of the number that would likely die if we took no public… Read more »
I read the author's discussion of the single-thread/multi-thread issue not so much as a criticism but as a rebuttal to possible counter-arguments. I agree it probably should have been left out (or relegated to a footnote), but the rest of the author's arguments stand independently of the multi-thread issues.
I disagree with your framing of the author’s other criticisms as amounting to criticism of stochastic models. It does not appear the author has an issue with stochastic models, but rather with models where it is impossible to determine whether the variation in outputs is a product of intended pseudo-randomness or whether the variation is a product of unintended variability in the underlying process.
According to GitHub, the reproducibility bugs mentioned have been corrected by either the Microsoft team or John Carmack, and the software is now fully reproducible. They surely checked what results the software gave before and after the correction, and they must have found that they were the same.
The question is, have the bugs led to incorrect simulations? I can't say, but realistically it's very unlikely. As scientists, Neil Ferguson and his team are trained to spot errors like that, and the fact that they commented on these bugs is evidence enough that they knew the code was buggy.
Is it poor software practice? Absolutely.
Should scientists systematically open-source their code and data? I think so, and I deplore the fact that it's still not standard practice (except in climatology).
Are the simulations flawed and is it bad science? You certainly cannot conclude anything even close to that from such a shallow code review.
dr_t, I am also a software engineer with over 35 years of experience, so I understand what you are saying about 30-year-old code. However, if the software is not fit for purpose because it is riddled with bugs, then it should not be used for making policy decisions. And frankly, I don't care how old the code is: if it is poorly written and documented, it should be thrown out and rewritten, otherwise it is useless. As a side note, I currently work on a code base that is pure C and close to 30 years old. It is properly composed of manageable-sized units and reasonably organized. It also has up-to-date function specifications and decent regression tests. When this was written, these were probably cutting-edge ideas, but they clearly weren't unknown. Since then we've upgraded to current compilers, source code repositories, and critical peer review of all changes. So there really is no excuse for using software models that are so deficient. The problem is that these academics are ignorant of professional standards in software development and frankly don't care. I've worked with a few over the course of my career and… Read more »
I agree 100%. I wrote C/C++ code for years and this single-file atrocity reminds me of student code.
The fact it wasn't refactored in 30 years is a sin, plain and simple.
That's human nature. I work as an S.E. in financial services. No real degree. Been doing it for 40 years, it pays well, and I can probably work into my 70s if I want. Just got a little project to make an Access database (MDB file) via a small program for a vendor that our clients love and trust. What the ??????? Microsoft never cancelled it but hasn't promoted it in at least 15 years. I also get projects based on COBOL specs.
That tells me that people are kicking the can down the road because "it still runs, it'll be fine". And they hope they'll be retired when it's not fine.
Moreover, this was likely the 'code' used for his swine flu predictions – which performed magnificently 😉
I was coding on a large multi-language and multi-machine project 40 years ago. This was before Jackson Structured Programming, but we were still required to document, to modularise, and to perform regression testing as well as testing for new functionality. These were not new ideas when this model was originally created.
The point of key importance is that code must be useful to the user. This is normally ensured by managers providing feedback from the business and specifying user requirements in better detail as the product develops. And this stage was, of course, missing here.
Instead we had the politicians deferring to the ‘scientists’, who were trying out a predictive model untested against real life. That seems to have worked out about as well as if you had sacked the sales team of a company and let the IT manager run sales simulations on his own according to a theory which had been developed by his mates…
> untested against real life.
And _untestable_? There is no mention in the review of how many parameter values need to be fixed to produce a run. With more than 6-10, I cannot imagine a search for best-fit parameters [to past data] resulting in values that are stable over time.
All I know is that my son is the same as Ferguson: a physics PhD. But he is now a commercial machine-learning data scientist. However, he has spent five years out of academia learning the additional software skills required, passing all the AWS certs, etc. Ferguson didn't.
Yes, I was coding 30 years ago and we wrote modular, commented code using SCCS for version control.
And I know a juggler who can juggle 7 balls while rubbing his belly. He is a juggler, you may be a software developer and Ferguson is an epidemiological modeler. How good are your epidemiological modelling and ball juggling skills?
Working as an Analyst/Programmer together with a Metallurgist and a Production Engineer, I designed and programmed a Production Scheduling system, derived from their expertise.
This was some 35 years ago. Documentation of the system was provided in the terminology of the experts, with links to the documentation of the code – and vice versa.
So, no, I would not claim to have been able to juggle with their 7 balls, but, equally, they could not juggle with mine.
How wrong you will be proved to be. Testing is already indicating that huge numbers of the global population have already caught it. The virus has been in Europe since December at the latest, and as more information comes to light, that date will likely be moved significantly backwards. If the R0 is to be believed, the natural peak would have been hit, with or without lockdown, in March or April. That is what we have seen.
This virus will be proven to be less deadly than a bad strain of influenza, with or without a vaccinated population. Total deaths have only peaked post-lockdown. That is not a coincidence.
@Robbo Why is it not a coincidence? I am not sure what to think about this virus: you say it will proven to be like a bad strain of influenza, but I work in a hospital and our clinical staff are saying they have never seen anything like it in terms of number of deaths.
The empty hospitals full of TikTok stars?
I would not be surprised at a large number of initial deaths with a new disease when the medical staff have no protocol for dealing with it. In fact, I understand that their treatment was sub-optimal and could have made things worse.
When we have a treatment for it we will see how dangerous it is compared to flu. Which can certainly kill you if not treated properly…
https://vk.ovg.ox.ac.uk/vk/influenza-flu
“In the UK it is estimated that an average of 600 people a year die from complications of flu. In some years it is estimated that this can rise to over 10,000 deaths (see for example this UK study from 2013, which estimated over 13,000 deaths resulting from flu in 2008-09).”
This thing has already killed 30,000 in NHS hospitals and probably another 15,000 who died at home and in care homes – 45,000 in total. The numbers are only this low because of the draconian lockdown measures.
This is in the space of the first 2 months, and we are nowhere near the saturation point yet. Those countries in the EU which have conducted randomized antibody testing trials have determined that 2%-3% of their populations have been infected to-date.
The Spanish flu killed 220,000 in the UK over a period of 3 years between 1918 and 1920.
We may not know exactly how dangerous this thing is, but we already know that it is nothing like the flu and a heck of a lot more deadly.
How many of those who have died actually died FROM Covid-19, how are those deaths written on the death certificates, and how is it that those who die of a disease other than Covid-19 are also counted as Covid-19 deaths when they were merely infected with it? As we know there are asymptomatic carriers, so there MUST be deaths where the person had Covid but it was not a factor, yet it was still included on the death certificate. The number of deaths attributed to Covid-19 has been over-inflated. Never mind that the test is for a general coronavirus and not specific to Covid-19.
Dr T, do you have references to the randomized antibody studies that show 2-3% spread? Some of the studies I’ve seen for the EU indicate higher (e.g., the Gangelt study).
How many of these clinical staff were working during the 1957 pandemic? Probably …. none. It was worse on an absolute and per-capita basis than what we’re seeing now.
The 1957 flu pandemic killed 70,000 in the USA over the space of more than a year. SARS-2 has killed nearly 80,000 in the USA in the first 2 months. I cannot find reliable numbers for deaths in the 1957 pandemic in the UK, but all the partial numbers I can find are a lot lower than the current number of SARS-2 deaths in the UK in the first 2 months (30,000 in the NHS + 15,000 at home and in care homes = 45,000). As with all epidemics, growth in the absence of mitigation measures is exponential until saturation is reached (we are very far from that point), so most of the deaths occur at the peak, and what you see even a week before the peak is a drop in the ocean. I think you need to check the facts before making such claims.
The US didn’t have anywhere near 80K deaths in the first two months. Where are you getting these numbers? And what date are you placing the first US COVID-19 death?
The CDC website shows a total of just under 49K as of May 11:
https://www.cdc.gov/nchs/nvss/vsrr/covid19/index.htm
Brilliant comment. This model assumes first infections at least two months too late. The unsuppressed peak was supposed to be mid-May (the 'terrifying' graph), so what we have seen in April is likely the real peak and lockdown has had no impact on the virus. Lockdown will have killed far more people. The elderly see no point in living in lockdown. There are anecdotal reports that people in care homes have simply stopped eating.
Nope. Base rate. Look outside your own wee land.
Spain is a better example.
Just model Spain with a simple statistical model and you see the lockdown impact.
It’s easy, you can do it in an afternoon.
> If the R0 is to be believed, the natural peak would have been hit, with or without lockdown, in March or April. That is what we have seen.
That’s what we’ve seen WITH lockdown. We haven’t tried a no-lockdown scenario, so we don’t know in practice when that would have peaked.
> This virus will be proven to be less deadly than a bad strain of influenza
Flu kills around 30,000/year in the US, mostly over a five-month period. Covid-19 has killed 70,000 in about six weeks, despite the lockdown.
@Frank, "That's what we've seen WITH lockdown. We haven't tried a no-lockdown scenario, so we don't know in practice when that would have peaked". Incorrect. Peak deaths in NHS hospitals in England were 874 on 08/04. A week earlier, on 01/04, there were 607 deaths. Crude Rt = 874/607 = 1.4. On average, a patient dying on 08/04 would have been infected c. 17 days earlier, on 22/03. So, by 22/03 (before the full lockdown), Rt was (only) approx 1.4. OK, so that doesn't tell us too much, but if we repeat the calculation and go back a further week to 15/03, Rt was approx 2.3. Another week back, to 08/03, and it was approximately 4.0. Propagating forward a week from 22/03, Rt then fell to 0.8 on 29/03. So you can see that Rt fell from 4.0 to 1.4 over the two weeks preceding the full lockdown and then from 1.4 to 0.8 over the following week, pretty much following the same trend regardless. So, using the data, we can see that we could have predicted the peak before the lockdown occurred, simply using the trend of Rt. In my hypothesis, this was a consequence of limited social distancing… Read more »
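For anyone who wants to check that arithmetic, here is a minimal sketch in C++ using only the two figures quoted above (874 deaths on 08/04, 607 on 01/04) and the 17-day infection-to-death lag assumed in the comment; it is a back-of-the-envelope check, not the official way Rt is estimated.

```cpp
#include <cstdio>

// Crude week-on-week ratio of daily hospital deaths, attributed to
// infections ~17 days earlier, as in the comment above. This is a
// back-of-the-envelope check, not an official Rt estimator.
int main() {
    const double deaths_08_04 = 874.0;        // figure quoted in the comment
    const double deaths_01_04 = 607.0;        // figure quoted in the comment
    const int infection_to_death_lag = 17;    // lag assumed in the comment (days)

    const double crude_rt = deaths_08_04 / deaths_01_04;
    std::printf("Crude Rt ~ %.2f, attributable to infections ~%d days before 08/04 (~22/03)\n",
                crude_rt, infection_to_death_lag);
    return 0;
}
```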
Peak excess all-cause mortality was last week – yes, the last week in April. Don't just look at reported COVID-19 hospital deaths, and don't just focus on one model.
How do you know that? ONS stats have only just been published for w/e 24th April and they were down a bit on the week before?
Epidemic curves are flat or down in so many countries with such different mitigation policies that it's hard to say this policy or that made a big difference, aside from two – banning all international travel by ship or airplane and stopping mass-transit commuting. No U.S. state could or did do either, but island states like New Zealand could and did both. In the U.S., state policies range from doing everything (except banning travel and transit) to doing almost nothing (9 low-density Republican states, like Utah and the Dakotas). But again, Rt is at or below 1 in almost all U.S. states, meaning the curve is flat or down. Policymakers hope to take credit for something that happened regardless of their harsh or gentle "mitigation" efforts, but it looks like something else – such as more sunshine and humidity, or the virus just weakening for unknown reasons (as SARS-1 did in the U.S. by May). https://rt.live/
I started distancing myself before the end of January when I was abroad on holiday in Tenerife with no known cases. But someone coughing next to me? I reacted. Also kept away from vulnerable people on my return home. Surely others behaved likewise long before lockdown?
I also did the same at the end of January / early February.
Frank, the peak Flu season is December through February, which is about the same amount of time that we’ve officially been recording deaths in the U.S. from the SARS-CoV-2 pathogen (February through April). Likewise, regarding a lockdown vs. no lockdown scenario comparison, that is also offset by the vaccine vs. no vaccine aspect of these two pathogens.
Please keep in mind that we’ve had numerous Flu seasons where between 60,000 to more than 100,000 Americans have passed away due to it, all despite a solid vaccination program.
“Flu season deaths top 80,000 last year, CDC says”
By Susan Scutti, CNN
Updated 1645 GMT (0045 HKT) September 27, 2018
https://edition.cnn.com/2018/09/26/health/flu-deaths-2017–2018-cdc-bn/index.html
Right, but you're comparing apples to oranges. Compare Covid-19 to other pandemics, like 1918, 1957, or 1968.
Maybe it is not "despite" but "because of"?
If you start the lockdown as late as March, then you ensure that infection and death rates are going to be higher, because of the high viral dose and the fragile immune systems that come with lockdown.
There are plenty of countries without lockdown to compare against. So it is not an unverifiable hypothesis.
Yes, but the manner in which they count COVID-19 deaths is flawed. Even with co-morbidities they ascribe the death to COVID, and in cases where they do not test but there were COVID-like symptoms, they ascribe it to COVID, according to the CDC.
Most governments are busily fudging the numbers up, to ex-post “justify” the extreme and massively damaging actions they imposed on communities and to gain financial benefit (e.g. states and hospitals which get larger payouts for Wuhan virus treatment than for treatment for other diseases).
As with “global warming”, the politicians, bureaucrats and academics are circling the wagons together to protect their interlinked interests.
You are confusing deaths ASSOCIATED with Covid with EXCESS deaths resulting from flu. If all those who died of pneumonia, cancer, heart disease were routinely tested for flu we’d find that hundreds of thousands die WITH flu every year, though not as a direct result of it.
“The virus has been in Europe since December at the latest” https://www.sciencedirect.com/science/article/pii/S1567134820301829?via%3Dihub
Oh yes. The model is all rather irrelevant now as we catch up on burying the dead.
In point of fact, a ten-line logistic model does as good a job.
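To make the "ten-line logistic model" idea concrete, here is a minimal sketch in C++; the carrying capacity, growth rate and inflection day are purely illustrative placeholders, not values fitted to any real data.

```cpp
#include <cmath>
#include <cstdio>

// Sketch of a bare-bones logistic curve for cumulative deaths:
// D(t) = K / (1 + exp(-r * (t - t0))). All three parameters below are
// hypothetical placeholders; in practice they would be fitted to data.
int main() {
    const double K  = 40000.0;  // illustrative final epidemic size
    const double r  = 0.15;     // illustrative growth rate per day
    const double t0 = 45.0;     // illustrative day of the inflection point

    for (int t = 0; t <= 120; t += 10) {
        const double cumulative = K / (1.0 + std::exp(-r * (t - t0)));
        std::printf("day %3d: cumulative deaths ~ %.0f\n", t, cumulative);
    }
    return 0;
}
```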
Still, academic coding is usually a disaster. I went back to grad school in my late thirties after twenty years of software development. I should have brought a bigger stick.
“I should have brought a bigger stick”.
A PART, maybe? “Professor Attitude Realignment Tool”
SARS-CoV-1 (SARS) and SARS-CoV-2 (Covid-19) both bind to the same receptor/enzyme (ACE2), causing increased angiotensin II (as it is no longer converted to angiotensin 1-7 due to the reduction in ACE2 receptors/enzyme) and a cascade of pathological effects from that, causing pneumonia, ARDS, hypoxia, local immune response, cytokine storm, inflammation and blood clots. SARS has a mortality rate of 10%, so why would Covid-19 be on a par with flu and not higher, given it has the same/similar pathology as SARS?
I have not seen the model, nor do I intend to. One thing that rang alarm bells with me was the statement that R0 was an input into its own calculation, making it a feedback system. These types of dynamical systems are known to exhibit truly chaotic behaviour. Even when not operating in those chaotic regions, the numerical methods must be chosen carefully so that they do not themselves introduce artificial, method-induced pseudo-non-deterministic behaviour (small differences in the initial conditions, or bugs such as the use of uninitialised variables).
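To illustrate the general point about feedback and chaos (this is a textbook demonstration, nothing to do with the Imperial code itself): the logistic map is a one-line feedback system, and in its chaotic regime two starting points differing by one part in a billion diverge completely within a few dozen iterations.

```cpp
#include <cstdio>

// Classic sensitivity-to-initial-conditions demo with the logistic map
// x_{n+1} = a * x_n * (1 - x_n). A generic illustration of chaotic
// feedback, not code taken from any epidemic model.
int main() {
    const double a = 3.9;        // parameter value in the chaotic regime
    double x1 = 0.200000000;     // initial condition
    double x2 = 0.200000001;     // the same, perturbed by 1e-9

    for (int n = 1; n <= 50; ++n) {
        x1 = a * x1 * (1.0 - x1);
        x2 = a * x2 * (1.0 - x2);
        if (n % 10 == 0)
            std::printf("n=%2d  x1=%.6f  x2=%.6f  diff=%+.6f\n", n, x1, x2, x1 - x2);
    }
    return 0;
}
```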
The modellers argument would be that life is chaotic, and introducing a virus to two separate but identical towns could indeed result in very different outcomes.
Which makes me wonder about the validity of modelling chaotic systems at all…
I think the practice could make sense. The input R0 might describe how communicable the disease is without countermeasures, while the output R0 is the resulting communicability with the countermeasures being modelled. Nowhere does the article actually say the output is used as the following run's input, and while I agree that would be illogical and give huge swings in outputs (perhaps even blowing up to infinity!), there's no sign that's being done. Is one of the top five critiques we can make of this code really that, if it were used in a manner it's not being used in, its output would go crazy?
The value of your comprehensive reply was completely invalidated when you declined to provide your own calculations!
Yeah, you've written millions of lines of code in dozens of languages, but you didn't read the review carefully. There's a difference between randomness you introduce, which you can reproduce with the correct seed, and bugs which give you random results. You can't just say "oh, it's stochastic"; no, it's bug-ridden. They don't understand the behaviour of their own model.
Saying it's only crappy because it's 30 years old is nonsense. You can't then use your crappy, bug-ridden code to influence policies which have shut the economy down.
Unix is 50 years old. And IBM mainframe operating systems even older. And CICS…
Software on which the world runs every second of the year.
Oh for heaven’s sake. Have you read the Linux kernel? It only even begins to work because people live and breathe it. It wouldn’t pass structured programming 101. Linux specifically discourages comments.
And how many systems are running Unix (as opposed to Linux) nowadays?
The review is a code review, not a review of the mathematical model, so I don't see that one would expect it to present the substance of the model in any detail. "There are EU countries which have conducted tests of large random, unbiased samples of their population to estimate what percentage of their population has had the virus. The number – in the case of those countries – comes out at 2%-3%. If the same is true of the UK, then 30,000 deaths would translate to 1 million deaths if the virus infected everybody." Antibody tests indicate those people who have been sufficiently susceptible to the virus for their innate immune systems and existing T-cells to be unable to defeat the SARS-COV-2 virus, resulting in their slower-responding adaptive immune systems generating antibodies against the virus. But there are potentially a much larger number of people whose innate immune systems and/or existing T-cells are able to defeat this virus, and have done so in many cases, without generating a detectable quantity of SARS-COV-2-specific antibodies. That seems the most likely explanation for why the epidemic is waning in Sweden, indicating a reproduction number below 1, contrary to even… Read more »
It seems to me that most of your comments are excuses for practices that were poor at the time, let alone now. Most of them simply reinforce the view that the code should have been ditched and rewritten top to bottom years ago as being no longer fit for purpose, if it ever was. Opportunities or signals to do so: move from single to multi-thread machines; publication of new/revised libraries with different flavours; discovery of absence of comments (!); discovery that same input does not yield same output (when it’s intended to); etc Incidentally, “… [no] reason to think these bugs existed in the original code or that they were material.” which is precisely why we need to see the *actual* code that produced the key reports leading to the trashing of our economy and the lockdown with its consequential deaths. Personally, I don’t think programmers necessarily criticise old code so long as it does what it claims to do. They may not like or understand the style but they can accept that it works. But here’s the thing: if it doesn’t do what it claims, then the gloves are off and they will come gunning not only for the… Read more »
Ferguson said his code was written 13 years ago, not 30. Even so, 30 years ago undocumented code was still bad practice even if that’s how some programmers worked. Unless Ferguson can provide evidence that his original code underwent stringent testing then there’s little reason to trust it. But if it was tested properly the question still remains whether the model it implements is a reliable reflection of what would happen in reality.
Question what his past predictions for BSE, swine flu and avian flu were compared to reality.
Hint: his predictions were worse than asking Mystic Meg.
The code was written 13 years ago, not 30.
"It was a different time" is no basis for a defence, and your comments are a defence. They either thought their code worked or they didn't. This shows that they didn't. That's all that matters. As for your fear, that's yours to deal with. Sounds like you've got issues to me.
Sorry, but this is an absurd criticism. We have all seen old legacy code that needs refactoring and modernization. Anything that is mission critical for a business, in medicine, in aviation, etc., will often have far more testing and scrutiny applied to it than the actual act of writing the code because either huge amounts of money are at stake, or even more importantly, lives are at stake. For this kind of modeling to be taken seriously, a serious effort should have been made to EARN credibility.
There is simply no excuse for Ferguson, his team, and Imperial College for peddling such garbage. I COMPLETELY agree with the author here that “all academic epidemiology be defunded.” There are far superior organizations that can do this work. And even better, those organizations will generate predictions that will be questioned by others because they are not hiding behind the faux credibility of academia.
dr_t,
Linux is nearly 30 years old. What’s your point again?
And Linux – although legally unencumbered – is essentially a Unix-like operating system. And Unix dates back to 1970.
I started work as a trainee programmer for a commercial company in 1971. The first thing I learned was ‘comment everywhere, document everything’.
And I doubt whether there is much/any original code there now.
Given the lead-in remarks here, I wonder if this commenter is just trolling us.
Given the fantastical view of software development 30 years ago, I wonder if he really knows that much about software development? Comment-free code? 15,000-line single-source files? GMAB! Kernighan and Plauger were complaining about standard Pascal's lack of separate compilation 40 years ago when they rewrote "Software Tools" as "Software Tools in Pascal", stating that while it might be better for teaching, that lack made it worse than even Fortran for large-scale programming projects.
I have a PhD in biochemistry and currently do academic research in systems biology. I have about 20 years of coding experience. This kind of approach to statistical analysis is very familiar. I concur with dr_t. The stochasticity is a feature, not a bug; it is used to empirically estimate uncertainty (i.e. error bars). The model *should* be run many times, and taking the mean and variance of the outputs is exactly the correct approach. Highlighting the difference between two individual runs of a stochastic model is only outdone in incorrectness by highlighting a single run. You're effectively criticizing the failure to correctly implement a reproducibility guarantee that wasn't important in the first place. Based on your description, it sounds like the same instance of an RNG is shared between multiple threads. Your RNG then becomes conditioned on the load of the cores, because any alteration in the order in which the RNG is called from individual threads changes the values used. If it's accurate that the problem persists in a single-threaded environment, then it could be the result of a single call to a well-intentioned RNG that used a default seed like date/time. The consequence is only that parameter values… Read more »
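For readers who haven't hit this failure mode before, here is a minimal sketch of the problem being described and of the usual fix of giving each thread its own deterministically seeded generator. It assumes nothing about how the Imperial code is actually structured; the seeding scheme and names are illustrative only.

```cpp
#include <cstdio>
#include <random>
#include <thread>
#include <vector>

// Sketch of the failure mode described above. If many threads pull numbers
// from one shared generator, the values each thread sees depend on the OS
// scheduler, so results differ between runs even with a fixed seed. Giving
// each thread its own generator, seeded from the base seed plus a fixed
// thread index, restores run-to-run determinism. Illustrative only.

double per_thread_sum(unsigned base_seed, unsigned thread_id, int draws) {
    std::mt19937 rng(base_seed + thread_id);   // deterministic per-thread stream
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < draws; ++i) sum += u(rng);
    return sum;
}

int main() {
    const unsigned base_seed = 42;
    const int num_threads = 4;
    const int draws = 1000;
    std::vector<double> sums(num_threads);
    std::vector<std::thread> pool;
    for (int t = 0; t < num_threads; ++t)
        pool.emplace_back([&, t] { sums[t] = per_thread_sum(base_seed, t, draws); });
    for (auto& th : pool) th.join();
    for (int t = 0; t < num_threads; ++t)
        std::printf("thread %d sum = %.4f (identical on every run)\n", t, sums[t]);
    return 0;
}
```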
The other thing to notice is that the difference between the two runs seems to be (almost) entirely a question of “onset”. That is, the curves are shifted in time. You’d expect a model to be far more influenced by randomness “at the start” (where individual random choices can have a big effect), and so you shouldn’t be reading very much into the onset behaviour anyhow (c.f. nearly all the charts published show “deaths since 20 deaths” or similar, because the behaviour since the *first* death has a lot of random variation). If this is what’s actually happening (and it certainly looks like it to me) the people making the critique are being fairly disingenuous not to point it out. To be clear: I don’t think the non-reproducibility (in a single thread environment) is good, and it’s a definite PITA in an academic environment, but I’m doubting it makes any substantial difference to the results. “80,000 deaths difference” looks to be massively overstating things, when more accurate would be “the peak death rate comes a week later” (with the final number of deaths the same). And even if 80,000 was accurate, it’s only a 20% difference. There are lots of… Read more »
While we can debate the reviewer's understanding of the stochasticity used in this model, there doesn't appear to be much debate about the quality of the program/model itself. Put another way, it does not matter if the correct ideas were used in the attempt to create a model if the execution was so poor that the results cannot be trusted. As an academic, I would expect you to be appalled that the program wasn't peer reviewed. I can only hope that your omission here does not represent a tacit understanding that such practice is customary. But I suspect such hope is misplaced. All of the modern standards (modularization, documentation, code review, unit and regression testing, etc.) are standards because they are necessary to create a trustworthy and reliable program. This is standard practice in the private sector because when their programs don't work, the business fails. Another difference here is that when that business fails, the program either dies with it or is reconstituted in a corrected form by another business. In an academic setting, it's far more likely that the failure will be blamed on insufficient funding, or that more research is required, or some other excuse that escapes blame… Read more »
I'm not going to defend coding practices as such in the academy. Just realize that modularization, documentation, code review, etc. become much more burdensome when the objective of the code is a moving target. This is how it is in a basic research environment, where the how is, by definition, not known a priori. How do you plan the programming when the solution is unspecified until the very end? The solution itself is what the research scientist is after; the implementation is just a means to that end. The code is going to carry the legacy of every bad idea and dead end that was pursued during the project. This will always be a point of friction, because once the solution is found it always looks straightforward and obvious in retrospect. A professional coder can always come in after all that toil and failure and turn their nose up at all the individual suboptimal choices scattered throughout. This happens constantly; a researcher develops a novel approach that solves 99% of the unknowns and then a hotshot software engineer comes in and complains that there's still 1% left and that if s/he had written the program (now conveniently armed with all the… Read more »
Look, if you want your opinions to have merit, then carry the burden. That's what the rest of us have to do. Moreover, it's very, very likely that much of the code could be modularized for reuse and that the tweaking could be done systematically in a subset of modules.
What you're describing is akin to an actual scientist puttering around in a lab and then telling the world they have found the solution, while at the same time telling the world it's too complicated to explain or document along the way, so just trust the results. Just another reason why this process fails the basic principles of the scientific method.
Well, current agile development practices do this continuous “problem discovery” all the time, but with sustained code quality at every commit (or at the very least at every pull request).
Well, you are clearly fresh off the boat. Academic source code is uniformly shit. It is very rarely provided, and never "peer reviewed". Peer review isn't paid; it's an extra "voluntary activity" done in one's free time. Do you seriously think scientists have so much money that they'll spend weeks peer reviewing each other's 15K-line files looking for bugs?
That’s why the open source approach is valuable.
It is simply a) wrong and b) stupid to pretend that every call to an RNG is an instance of a statistically independent and uncorrelated variable.
It is wrong because it is untrue, and it is stupid because it makes it a nightmare to maintain reproducibility of results in an evolving project.
If you want to see a serious engineering treatment of RNGs and noise in integration problems, look to computer graphics, where the difference between white and blue noise is crucial for instance, and the difference between theory and practice can be huge due to quantization and sampling effects.
I was programming point of sale and some financial software about 40 years ago so I agree with your point that it was very different – a few K of RAM and a few years later a massive 10 megabyte hard drive!
However, stochastic still equals random, and we can't do what we've done on the basis of random information.
Good luck with hiding from a coronavirus! It was right across the UK weeks before lockdown and will, in my view, be asymptomatic in between 30 and 60% of the population. My guess is as good as any guesswork produced by predictive, stochastic models!
“I made my own calculation at the end of January, estimating the likely mortality rate of this virus. I’m not going to tell you the number”.
So, in other words, you are just like Ferguson: you made a prediction, which might have been reasonable at the time, but you won't show your workings (and you won't even tell us the prediction), and now you're going to stick with it no matter what. That's terrible science.
The latest meta analysis of Sero studies:
https://docs.google.com/spreadsheets/d/1zC3kW1sMu0sjnT_vP1sh4zL0tF6fIHbA6fcG5RQdqSc/
shows an overall IFR in the region of 0.2%, higher in major population centres. For people under 65 with no underlying health conditions it’s more like 0.02%. Research from the well-respected Drosten in Germany suggests perhaps 1/3 of people have natural immunity anyway:
https://www.medrxiv.org/content/10.1101/2020.04.17.20061440v1
Did you factor this in?
If your estimate is different to this, it’s looking increasingly likely that your estimate was wrong. Have you back-casted your estimate, perhaps using Sweden or Belarus as references?
Well said, Dr_t!!! Exactly my sentiments – from someone who started FORTRAN modelling 50 years ago and has continued through today.
I would describe this as a simplistic and superficial critique – not really adding anything material to the discussion.
For those who don't agree with a stochastic modelling approach, tell me where you would get "typical lockdown behavioural patterns" from for a truly probabilistic model. Nonsense!!!
Go back to the drawing board and come up with some useful and materially significant comments.
30 years ago I was developing the Mach operating system (the thing that runs Apple computers today). Written in C, I can assure you that it was multi-threaded, modularized, structured and documented. Multi-CPU computers were already commonplace, if not on the desktop. The dining philosophers problem dates from 1965, and every computer scientist should have come across it at university over the last 50 years. Multithreading has been available to coders since at least the days of Java (1995), if not before (it doesn't require a CPU with more than one core, just language and/or OS support).
I went to university in 1988, and one of the first-year modules was concurrent programming. We used a language called Concurrent Euclid (a Pascal clone with threading), possibly because threads weren't well supported, or were awkward to use and understand, in other languages. Multi-threaded programming in mainstream systems has been around for a long while.
Indeed and I remember Modula 2, another Pascal derivative, supported threads. Concurrent programming is pretty old hat really.
Yes, and I too wrote multi-threaded software in the 1980s, including a thread scheduler I wrote in 8086 assembler for the IBM PC, and used Mach on my NeXT and Logitech Modula-2 on my 80286 PC clone (though I’m pretty sure that version only implemented co-routines, not real concurrency). But I think you may be missing the wood for the trees. Firstly, who in 1990 had a CPU capable of executing multiple threads in hardware simultaneously? Not on PCs. Not even on workstations like NeXT, Sun, Apollo, etc. I lost track of what the minis could do by that stage, but hardly anyone was still using them. More likely, you literally had to have access to a mainframe – an even smaller set of users. Outside computer science nerds and academia, multi-threaded programming was not in general use. There is no benefit to an end user like an epidemiological modeler using commodity hardware in using multi-threaded code if it’s going to run on hardware capable of only executing a single thread at one time. His objective is not to show off his computer science knowledge and skills but to get the results of his simulations. Therefore, it makes absolute sense… Read more »
Your stupid “””model””” clearly failed to take into account asymptomatic cases (between 60 and 80%). Maybe you ought to look at Iceland since they’ve done testing on 100% of their population, albeit still using low-specificity tests. Say, how come during the same time period in the US, 10% of the population contracted influenza but only 0.3% contracted COVID-19? I thought COVID-19’s R0 was many times higher than the influenza viruses..? Pro tip: infections are WIDELY underestimated, meaning CFR is widely overestimated.
Iceland – where the total population is about the size of Oakland, CA, USA. A country isolated most of the time, and more so in the winter. A country that is basically one racial group. A country without a thriving economy of world travel and imports and exports on a grand scale.
Sure, why not. If you are going to slice and dice based on massive disparities in population and area, I’ll use US states Oregon and Arkansas. Same # of deaths per million as Iceland. I would imagine both those states get more economic “Action” than Iceland.
Completely agree.
It should also be noted that this ‘bug’ has been fixed – https://github.com/mrc-ide/covid-sim/pull/121
The very fact that the model code was USED now correctly lays it open to review and criticism, the same as if it were written yesterday, particularly as it has a direct effect on the wellbeing of millions NOW. If it's not fit for purpose, it doesn't matter how old or new it is.
Ferguson wrote this on his Twitter account a few months back: “I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics.”
So it is more like 13 years old – not 30 years old.
“30 years ago, there was no multi-threading, so it was reasonable to write programs on the assumption that they were going to run on a single-threaded CPU. ”
Well, yes. I am involved in a big upgrade of academic software to multithreading for the same reason. But we are extensively testing and validating this before even considering using it. It sounds like Ferguson's group did this, found differences that indicated the single-threaded code had wrong behaviour, and then ignored them. So the problem is not lack of multi-threading, it's lack of good testing and of responsible behaviour (not using code you know is dangerously wrong)?
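For what it's worth, the kind of check described here can be a one-screen regression test: run the model twice with the same seed and fail loudly if the outputs differ. `run_simulation` below is a hypothetical stand-in for a model run, not a function from the real codebase.

```cpp
#include <cassert>
#include <cstdio>
#include <random>
#include <vector>

// Hypothetical stand-in for a seeded model run (NOT from the real codebase):
// given a seed, produce a 90-day series of simulated daily counts.
std::vector<int> run_simulation(unsigned seed, int days) {
    std::mt19937 rng(seed);
    std::poisson_distribution<int> daily(10);
    std::vector<int> series(days);
    for (int d = 0; d < days; ++d) series[d] = daily(rng);
    return series;
}

int main() {
    // Determinism regression test: identical seeds must give identical output.
    const unsigned seed = 2020;
    const auto a = run_simulation(seed, 90);
    const auto b = run_simulation(seed, 90);
    assert(a == b && "identical seeds must reproduce identical results");
    std::printf("determinism check passed over %zu days\n", a.size());
    return 0;
}
```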
Very interesting. I know nothing about the coding aspects, but have long harboured suspicions about Professor Ferguson and his work. The discrepancies between his projections and what is actually observed (and he has modelled many epidemics) are beyond surreal! He was the shadowy figure, incidentally, advising the Govt. on foot and mouth in 2001, research which was described as 'seriously flawed', and which decimated the farming industry via a quite disproportionate and unnecessary cull of animals.
I agree with the author that theoretical biologists should not be giving advice to the Govt. on these incredibly important issues at all! Let alone treated as ‘experts’ whose advice must be followed unquestioningly. I don’t know what the Govt. was thinking of. All this needs to come out in a review later, and, in my view, Ferguson needs to shoulder a large part of the blame if his advice is found to have done criminal damage to our country and our economy. This whole business has been handled very badly, not just by the UK but everyone, with the honourable exception of Sweden.
Thanks for your words of wisdom (I truly think they are). Nevertheless, for me (if true) the main point of the critique is: same input -> different output, under ceteris paribus conditions. Best regards and luck in your lockdown.
None of what you say excuses the use to which this farrago of nonsense has been put.
I’m not sure that the code we can see deserves much detailed analysis, since it is NOT what Ferguson ran. It has been munged by theoretically expert programmers and yet it STILL has horrific problems.
I don’t know how you code, but I’ll stand by my software from 40 years ago, because I’m not an idiot and never was. Now … where did I put that Tektronix 4014 tape?
“I’ve been writing software for 41 years.”
” I have written millions of lines of code in dozens of different programming languages. ”
Pffft! 41 years is 21.5 million minutes. There is no way you have written that much code, much less in dozens of languages.
You may have some valid points but I’m not going to take the chance.
First of all, you are calling me a liar which I do not appreciate. Secondly, your analysis is nonsense and any competent software developer will know that. I’ve never counted how many lines of source code I can write per minute, as it is a pointless metric, though it will be quite a few. But I do know that writing 1000 lines of (tested and debugged) code in a day is a slow day with plenty of time to spare for other things. In order not to speak purely in the abstract, there was an event (a coding contest) which took place in the 1980s where I was given a problem specification and had 24 hours (during which time I also had to do things like eat and sleep) to design a programming language and implement an interpreter for it. I still have the source code and it is 2547 lines long. Yes, it is spread over 18 modular source files, the longest two of which contain 451 lines (the parser) and 335 lines (the lexer), respectively. No, it is not multi-threaded. It is probably all re-entrant and so thread-safe. There are 365 days in a year. Programming is not… Read more »
1000 LOC per day? Tested and debugged? Blimey!!
In my field, economics, 61-year-olds like me face the problem that the tools are different from what they were 30 years ago, but we old guys can’t use that as an excuse. To get published, you have to use up-to-date statistical techniques. It’s hard to teach an old dog new tricks, so most of us stop publishing.
Your point that 30 years ago programs didn't have to cope with multiple cores sounds legit, but the post above seems to be saying that's not the main problem, and that the results still wouldn't reproduce even if run slowly on one core.
The biggest problem, though, is not making the code public. I’m amazed at how in so many fields it’s considered okay to keep your data and code secret. That’s totally unscholarly, and makes the results uncheckable.
I think their code is available and what third parties such as the University of Edinburgh and Sue Denim (!) have found when scrutinizing it is that it’s pretty poor. Following the science sounds sensible but when they are employing such poor models it’s not sound after all.
I have a lot of experience with simulations and stochastic models. But I’m an engineer, not an academic. In the field, if you cannot explain every bit of randomness in your model, you do not understand it. This has nothing to do with “modern” code or not, because 30 years ago, the requirements of responsible engineering were exactly the same as they are today. If a company builds 20 bridges, and 1 of them falls down, we don’t call that a 95% success rate, we call that irresponsible and unacceptable failure. The multi-threading is particularly important, because of the random seed. It doesn’t matter if you generate the same sequence of random numbers every time, if the order they are used in is non-deterministic. You’re effectively randomly swapping certain pairs of numbers in your sequence, every run. This also makes it harder to improve and refactor the code. If you have only a single random generator, then each call to random() depends on all of the ones that came before. If you instead use an independent generator for each unique aspect of the model, now it doesn’t matter which order you process them in, the randomization does not cross component… Read more »
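A minimal sketch of the per-component generator idea described above, assuming nothing about the real model's structure: each component draws from its own stream derived from the master seed and a fixed stream id, so the order in which components are processed can no longer reorder anyone's draws. The component names and stream ids are made up for illustration.

```cpp
#include <cstdint>
#include <cstdio>
#include <random>

// Per-component RNG streams: each component gets its own generator derived
// from the master seed plus a fixed stream id, so processing order cannot
// shuffle the random numbers between components. Names/ids are illustrative.
std::mt19937_64 make_stream(std::uint64_t master_seed, std::uint64_t stream_id) {
    std::seed_seq seq{master_seed, stream_id};
    return std::mt19937_64(seq);
}

int main() {
    const std::uint64_t master_seed = 12345;
    auto household_rng = make_stream(master_seed, 1);  // e.g. household transmission
    auto school_rng    = make_stream(master_seed, 2);  // e.g. school transmission
    auto travel_rng    = make_stream(master_seed, 3);  // e.g. travel / commuting

    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::printf("household draw: %.6f\n", u(household_rng));
    std::printf("school draw:    %.6f\n", u(school_rng));
    std::printf("travel draw:    %.6f\n", u(travel_rng));
    return 0;
}
```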
A good defence of Neil Ferguson being 'shy' of releasing 30-year-old code, but with all due respect (I mean that), he should have indicated as much; now he should release the original code with a disclaimer that it was written as you describe above.
"Are you saying you have a model which can predict, with certainty, how many dead people there will be 2 weeks from now?" No one is criticizing it on that basis. The issue is that, if you generalize the program as a function f(x), where x is some random seed value, every other random value in the program should be a function of x, so that f(x) = y on every run. There's no reason that it shouldn't run this way. If the author's contention is that, calling each run f_t(x), we get f_0(x) = a, f_1(x) = b, f_2(x) = c, but that a, b and c converge on some number, that may well be the case, but there's a very big difficulty in determining what it is that they converge upon. Now, if you were writing something to simulate an empirical situation, where you were able to check your algorithm against the real world, and you found that the average of a, b, c did in fact converge on the real-world observations, sure, that's a sort of validity, but it's still needlessly bad form, though of course I am well aware that the best program is one that solves the problem and costs as… Read more »
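To put that convergence point in runnable form, here is a toy sketch. The "model" below is just a placeholder that draws a noisy number around a fixed value, with hypothetical figures, and says nothing about the real code; the point is only the difference between a single seeded run and the mean over many recorded seeds.

```cpp
#include <cmath>
#include <cstdio>
#include <random>

// Toy stand-in for f(x): a seeded "model" that returns a noisy outcome
// around a fixed value. The numbers are hypothetical placeholders; the
// point is only to show single-run spread vs. the mean over many seeds.
double toy_model_outcome(unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> outcome(250000.0, 40000.0);
    return outcome(rng);
}

int main() {
    const int runs = 100;
    double sum = 0.0, sum_sq = 0.0;
    for (int i = 0; i < runs; ++i) {
        const double y = toy_model_outcome(1000 + i);  // distinct, recorded seeds
        sum += y;
        sum_sq += y * y;
    }
    const double mean = sum / runs;
    const double stddev = std::sqrt(sum_sq / runs - mean * mean);
    std::printf("single run (seed 1000): %.0f\n", toy_model_outcome(1000));
    std::printf("mean over %d seeded runs: %.0f (std dev %.0f)\n", runs, mean, stddev);
    return 0;
}
```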
I am a machine learning engineer (a PhD) and I confess I haven't read the paper at all and am not interested in doing so either (and I don't have the biology or virus knowledge at all). As much as anyone else, I would love the restrictions to be eased or lifted, but I try to be unbiased in giving comments on these things. Since I haven't read the paper, take my own words with a grain of salt! When it comes to modelling, it is normal to use stochastic models (which is a very good idea for such a case, in my mind, in comparison to a deterministic model), and getting different results is probably because they have forgotten to pass the seed parameter to one of the random functions, not an actual bug (an educated guess). Again, when it comes to modelling and when you are working on data, you can spend your whole life writing unit tests, but it becomes so hard (and grows exponentially) so quickly that it is not possible to cover all the different cases and is not worth the time to write tests. Google has given up writing tests… Read more »
This is an utterly bizarre take.
The REASON that coding standards have changed is precisely because of the problems that are inherent in monoliths. You don't get to say, "we shouldn't hold his outdated code up to modern standards; it's not fair to criticise it as if it was written today". You instead have to say, "this moron is using a 30-year-old code base and totally outdated, obsolete, and rightfully abandoned coding practices."
“On a personal level I’d actually go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people and the results speak for themselves.”
Perhaps even more significantly, they pay a price when they get it wrong, a check on overreaching idiocy that appears completely lacking in these “advisory” academic roles in government.
See also https://www.youtube.com/watch?v=Dn_XEDPIeU8&t=593s Nassim Nicholas Taleb on having Skin In the Game.
On Monday I got so angry that I created a change.org petition on this very subject.
https://www.change.org/p/never-again-the-uk-s-response-to-covid-19
It sounds like something an undergrad would knock together, but this team is supposed to be the cream of their profession.
If this is the best the best can do then to ‘suggest that all academic epidemiology be defunded’ sounds like a good plan to me. But, sadly, this is shutting the stable door after the horse has bolted.
and exceedingly well funded (by Gates and others). No excuses at all for old or poor code.
None whatsoever.
“Gates”, “poor code”! Now where have I seen that before?
We should not assume that the cream of the academic crop knows how to develop industrial strength software, or at least well-written code. Proper software development techniques are usually NOT taught in academia.
Thank you. Are the mainstream media capable of covering this? That is what frightens me.
Who is going to be the first to point out that the reason sick people weren't getting hospital beds is that the models were telling us to expect thousands more sick people than there were? How many people died because of this?
And what about all this new normal talk? All these assumptions life will change for ever built on fantastic predictions which are being falsified by Swedish and Dutch data?
This diktat that we can’t set free young people who are not threatened by the virus because the model says hundreds of thousands would die? All nonsense.
This is the greatest academic scandal in our history.
‘Are the mainstream media capable of covering this?’ Let me think………. ‘No’.
They are certainly capable, but is it in their interests to? Not until the wave turns and is racing back towards them to swamp their current rhetoric. Then they’ll go into self-preservation mode, and make you believe they were asking this all along.
Slightly off topic but I would suggest that some of the climate science work suffers from similar problems and at a comparable scale. Dr Mann’s flawed hockey stick comes to mind; my understanding is that the analysis code has never been released.
I am science-trained, but a HW guy, not SW. I place most of my trust in measurements, especially ones that can be reproduced by others.
“I would suggest that some of the climate science work suffers from similar problems”
The infamous "Harry_Read_Me" file contained in the original Climategate release springs to mind. As I recall, it was a similar tale of a technician desperately trying to make sense of the terrible software and coding being used by the "climate scientists" – one of whom had to ask for help using Excel…
He is currently in court alleging defamation but needs to provide disclosure (his 'code') and is getting cold feet, so the proceedings drag on.
MUCH more politics in Climate Change! You are simply not allowed to question the basic assumptions..
Er… “much more politics” than the model that has been used to shut down most of the world?
…the assumptions – built into MODELS!!
What, that CO2 absorbs infrared ?
Any virus has an inherent R0 for a constant set of conditions (input R0). It also has an effective average R0 in the population under the given social conditions (output R0). This explains why R0 appears as both an input and an output.
I would call it
R0 inherent (input)
R0 effective (output)
How do you arrive at the R0 that you feed in?
R0 is a number that is calculated from other model parameters (contact rates, migration rates, recovery rates, death rates, etc.). It has no value to be fed in. The model parameters are fed in, and R0 is some function of these parameter values. Also, too much emphasis is placed on this mystical “R0”, as if an entire epidemic is controlled by one number. This is plainly ridiculous. Mathematical models of epidemics are just simplified representations of reality. One can fit them to data, once one has data, and the fit may be impressive. But as a predictive tool, in the absence of much data, they may be useless. Having experience of mathematical modelling of epidemics, and knowing their limitations, it is bewildering how countries around the world have imposed all these silly lockdown measures, seemingly because of one computer program by someone who isn’t even a mathematician or programmer. It seems to me that politicians, afraid of appearing ignorant when their academic “professor” buddy told them everyone was gonna die, and being pressured by an increasingly hysterical media reporting every individual case of coronavirus, decided lockdown was a good idea. Of course, as time will tell, it was never a…
Perhaps I should re-phrase it as “How do you arrive at the parameters you feed in?”. Ferguson speaks of trying different R0 values over a specified range, so presumably his model does have some notion of R0 as an input. However, it could be that he simply sweeps the model with different transmission parameters and observes which one effectively produces the R0 that he wants.
Either way, he is starting with a range of values of R0 that has been obtained from somewhere. Maybe it’s just that “everyone knows that SARS-Cov-2’s R0 is about 2.5”. But where did that come from?
I think it comes from fitting a ‘model’ (maybe just an exponential formula) to real data (typically at the start of the epidemic – although that’s an assumption in itself) and adjusting its R0 for best fit. As others have observed, if the early data all comes from hospitals, and is affected by arbitrary factors like availability of tests, choice of subjects etc., then that R0 is already very ‘wrong’.
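To make the ‘fit an exponential to the early data’ point above concrete, here is a minimal sketch in C++ (not Imperial’s method – the case counts and the five-day generation time below are invented for illustration) of estimating a growth rate from early case counts and converting it to an R0:

```cpp
// Minimal sketch (not Imperial's method): estimate an early-epidemic growth
// rate by fitting log(cases) = a + r*t to the first days of case counts, then
// convert r to an R0 under an assumed mean generation time Tg.
// The case counts and Tg below are invented for illustration.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    // Hypothetical early daily case counts (roughly exponential growth).
    std::vector<double> cases = {10, 12, 15, 18, 22, 27, 33, 40, 49, 60};

    // Ordinary least squares on (t, log cases) gives the growth rate r.
    double n = static_cast<double>(cases.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (size_t t = 0; t < cases.size(); ++t) {
        double x = static_cast<double>(t), y = std::log(cases[t]);
        sx += x; sy += y; sxx += x * x; sxy += x * y;
    }
    double r = (n * sxy - sx * sy) / (n * sxx - sx * sx);  // per-day growth rate

    // Two common back-of-envelope conversions from growth rate to R0.
    double Tg = 5.0;  // assumed mean generation time in days (illustrative)
    std::printf("growth rate r ~= %.3f per day\n", r);
    std::printf("R0 ~= 1 + r*Tg  = %.2f\n", 1.0 + r * Tg);
    std::printf("R0 ~= exp(r*Tg) = %.2f\n", std::exp(r * Tg));
    return 0;
}
```

Any R0 obtained this way inherits all the problems of the early data: it is only as good as the case counts and the assumed generation time.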
Your last paragraph, Mr Cabbage, is spot on. We seem to have employed and accepted flawed evidence, which inevitably leads to the wrong conclusion. A scenario such as this is too important to indulge in such activities.
That’s not a conclusion, that’s a recommendation.
Agree that a proper epidemiology model that is robust and peer reviewed is required and should be a good outcome from this pandemic.
As someone who has worked in the areas of Software Maintenance, Legacy Systems, and Software Testing, and has taught Computer Science to MSc level, I have to say I am appalled. A Computer Science student could do much better than this. Why is Prof Ferguson still being employed by the once prestigious Imperial College?
“Why is Prof Ferguson still being employed by the once prestigious Imperial College?”
About that….
https://www.bbc.com/news/amp/uk-politics-52553229
They could write better code, yeah. But they wouldn’t understand the epidemiology bit. Brogrammers…
But he does. It’s called working in a team.
Interesting – I downloaded what purported to be the Imperial Model software from github and it was in Python, full of hard-coded numbers seemingly pulled from the ether. I didn’t see any C++ in there.
I think the original code was written in C++ (Ferguson said C) but I read somewhere that it was ported to R and Python recently – presumably that was the work done by Microsoft.
Although there is R in this, that’s for analysis and display. The one being discussed here is in C++. There is *another* model from Imperial College (the one for “Report 13”) that’s essentially the implementation of an analytical model, and that uses Stan, Python (to set it up) and R (for analysis and display). That’s not the one described here, which is elsewhere on github.
I found that version too. Similar coding style but I could not get it to work – lots of NULL pointer accesses!
Oh good, it will be fine if Microsoft have done it.
Yes, my reaction precisely. Thank goodness no Microsoft products have ever had any bugs.
MS have done some good work in software quality improvement.
The .cpp files are in the src directory: https://github.com/mrc-ide/covid-sim/tree/master/src
It is stunning how awful this all is. The word ‘criminal’ comes to mind. Thank you so much for this assessment.
Now do the same with the climate change models.
“Clearly the documentation wants us to think that given a starting seed, the model will always produce the same results.”
No! That’s not how stochastic simulations work! Or indeed the real world! In biological systems we literally *expect* a range of outcomes given the same input. You run the model repeatedly, and then report an average and the 95th percentiles of the results.
You absolutely, 100%, Do NOT want a model that gives the same results given the same inputs.
You might be software engineer, but you’re no biologist.
Explain more please: is this entire critique invalid?
Try reading more carefully. She said that, given the same starting seed, you get the same results. That’s exactly how a stochastic simulation is supposed to work. If you don’t, you’ve introduced a bug – sorry, I mean ‘non-determinism’.
That’s not what the review says. There was a brief period when there was a bug, but outside that period neither the original code nor the corrected code exhibited such behaviour in the execution environment the program was written for: a single-threaded, single-CPU computer. Name one piece of software under active development with a regular release schedule where every intermediate release is bug-free. Trying to use monolithic 15,000-line, 30-year-old (apparently) C code which is the result of auto-translation from Fortran on a multi-threaded, multi-core computer, or trying to get it to work in such a setup, is a fool’s errand. Use the code on a single-threaded CPU, for which it was designed, and judge it on its performance in such an environment. To be used in a multi-threaded setup, the software would have to be rewritten from scratch using the proper tools for parallel computing (hint: it isn’t tools using shared memory and mutexes). I’m sorry, but I really am not persuaded by these criticisms, which appear to be very superficial. What is the substance of the model? Is it wrong? Why is it wrong? Is…
Absolutely spot on dr_t. You clearly know your business. I’m a bit shocked by the critique as well. As a stochastic modelling expert who has written many a ‘rat’s nest’, it is obvious to me that the seed bug which she makes a meal of is not an issue at all for this particular code, as it depends on an ensemble of results. Of course, it’s nice to fix it to have reproducibility of individual runs, as the lack of it may confuse novice users, but from the perspective of the end result, it changes nothing.
By the way, my rat’s nest doesn’t stay that way. I work with a team of great developers who are pretty mediocre modellers. We work together to produce something that can be consumed by a fairly large body of non-expert users, but if its usage stayed with a handful of experts, we could save the expense of the refactoring and the glamorous user interface.
Absolutely agree with dr_t and earthflattener! You are both spot on with your criticism of the author of this misleading and erroneous report. There seem to be two camps forming here – the “IT geeks” focussed on the purity of code and the true “modellers” who are interested in concepts and theories, which is why they became modellers in the first place. It’s like learning that a water molecule is H2O – one oxygen with two hydrogens around it – a truly simple “model” of a water molecule. Is this technically correct? Of course NOT! Is it adequate to explain what’s happening without too many – significant – detrimental effects? Of course YES! There’s no mention of neutrinos or other particles, but it doesn’t invalidate the basic model – H2O – and how it’s applied. Get a grip, all you geeks! You just cannot see the wood for the trees – the author of the critical analysis should get some experience in “modelling” before writing critical commentaries. So here’s an exercise for all IT geeks. Do your personal budget for the next 3 years – forecast your income and expenditure. Let’s see if you can figure out your travel…
Isn’t the model generated from code, and if the code is wrong, isn’t the model wrong?
First, it’s a strawman argument to suggest that professional software developers expect some kind of code purity. Second, when you refer to professionals as “IT geeks”, you are attempting to undermine their professional credibility without addressing the merits of their concerns. It’s just banal rhetoric.
Expecting well organized and documented code is not an expectation of purity. It’s a best practice so that when bugs are discovered, they will be far easier to track down when the code is orderly and the programmer’s intentions are documented. Every professional “IT geek” understands this.
Look, we all write proof of concept routines when we’re experimenting with different ideas. Novice programmers tend to get so wrapped up with their project that they don’t take the time to rewrite their doodling into something more orderly and reusable. Experienced programmers learn from those mistakes.
Lastly, in case you haven’t noticed, the world runs on quality software. We literally trust our lives to it whenever we fly on a plane, for example. I can’t say the same for much of what the geeks in academia generate.
The point is, you want some way to know whether the code has bugs — whether the model is doing what it was written to do. If it’s poorly documented and untested, and doesn’t reproduce its own results (to some level of consistency), you can’t really tell.
That’s why this is a problem. Most well-written stochastic systems use pseudorandom numbers, which look random but are fixed by the ‘seed’ given to the random number generator. With the same seed, they give the same results.
This one didn’t, which is a sign that something broken is going on. With C++ that can be a lot of things. One of the most obvious is using an uninitialized variable: e.g. you are summing numbers but forget to set the accumulator to zero at the beginning. Often it will happen to be zero, but sometimes it won’t be. That introduces a bug, and non-determinism, and means your results generally can’t be trusted.
There are actually a lot of good static analysis tools for C++ — I’d love to see them applied to this code base.
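To illustrate the uninitialized-variable failure mode just described, here is an invented C++ fragment (not taken from covid-sim); tools such as -Wuninitialized or clang-tidy would normally flag the buggy version:

```cpp
// Invented fragment (not from covid-sim) showing the failure mode described
// above: an uninitialized accumulator makes the result depend on whatever
// garbage happened to be in memory, so identical inputs can give different
// outputs between runs or builds.
#include <cstdio>

double sum_buggy(const double* xs, int n) {
    double total;                  // BUG: never initialized (undefined behaviour)
    for (int i = 0; i < n; ++i) total += xs[i];
    return total;
}

double sum_fixed(const double* xs, int n) {
    double total = 0.0;            // FIX: explicit initialization
    for (int i = 0; i < n; ++i) total += xs[i];
    return total;
}

int main() {
    const double xs[] = {1.0, 2.0, 3.0};
    std::printf("buggy: %f   fixed: %f\n", sum_buggy(xs, 3), sum_fixed(xs, 3));
    return 0;
}
```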
The only concept that really matters is results.
Have Dr Ferguson’s results been valid in the past?
Were his predictions for BSE deaths, avian flu deaths and swine flu deaths using this modelling software borne out as roughly correct when looking back at the actuality?
The answer is that Mystic Meg would have done a better job than his software.
I have some tea leaves that did a better job. Should I put them forward for a knighthood?
“How well he does this job is the relevant question, not his coding skills. Especially not his coding skills 30 years ago judged from a modern day perspective.”
Respectfully, if one’s brilliant mathematical modelling skills are encoded in ways that undermine the ability to produce coherent, consistent and applicable results based on that model’s logic and assumptions, then of what use is that brilliance? A crap translation of the Iliad or Shakespeare destroys its power… that’s what crap software instantiations do to ‘complex’ mathematical models… I would also add the obvious: if the Imperial model had proved even remotely accurate or consistent, it would not be undergoing this level of (literal) disbelief and scrutiny… I’m sorry, but both your criticism and critique are unfair and inaccurate.
Here’s another analogy: someone can come up with a brilliant model that would generate highly accurate results – but if he keeps making sign errors in his algebra, the results will be garbage. In fact, if he only goes over his algebra once, he probably *will* make sign errors (I always do); experienced modellers know they have to check their work. Many an amazing result turns out on second look to be an algebra error, and disappears.
Another person who doesn’t read carefully enough. I was responding to the person who claimed, incorrectly, that this is not how stochastic simulation works.
Furthermore, he doesn’t need an army of professional programmers working for him, but he does need people with professional programming skills who can adhere to standard practices. This guy’s model has been the motivation for shutting down the entire UK economy.
Who cares what you’ve seen engineers hack together? The complaint is not about aesthetics, it’s about correctness, reproducibility, transparency. If your model stays in your research group then who cares? But if it’s used for something this important you don’t get to say ‘I’d like to see a programmer do MY job’.
You cannot separate the model in this case from its implementation. Besides all that, you’re still wrong. If you read the issue tracker, they thought the issue would be resolved by running it on a single CPU, but it wasn’t. That’s why the reviewer pointed out that they don’t understand how their own code is behaving.
If you are running under Windows, then I question whether ANYONE understands what their code is doing in detail. Even with ‘Hello World!”…
But your “Hello World” program still runs the same way every time you run it. If you change your program and it stops working then you know the last change broke it – it isn’t a problem with Windows.
In the case of this model, given the same set of inputs and the same random number seed, it should output the same results every time the program is run. But the Ferguson model generates different results, indicating that the internals of the model are broken. They should fix this problem before using the model to get results.
In addition, they should have used data from a past epidemic as a reference to test the logic of the model. If the inputs are close to real values, the output should be comparable with a known epidemic case, and this would verify the logic.
The next step is to try to model the current epidemic.
Just creating a buggy model and then feeding in guesses will only give stupid answers as we have seen.
dr_t,
I am a programmer, mathematician, and mathematical biologist. Regardless of whether Neil Ferguson’s program bears any relation to reality or not, I just want you to know that your tone in your messages is very rude and is totally unacceptable. Your arrogance completely undermines your credibility. Put your face mask on and stop talking.
I am ever so sorry that I have caused you offence. I very much hope you have not melted as a result.
The proof is in the reality pudding as they say.
Here are the results of Professor Ferguson’s previous modelling efforts.
Bird Flu “200m globally” – Actual 282
Swine flu “65,000 UK” – Actual 457
Mad Cow “50-50,000 UK” – Actual 177
You do protest a bit much.
They actually say, the proof of the pudding is in the eating ….
“I really don’t care about Ferguson’s coding style. He is not a professional programmer, he is an epidemiological mathematical modeler. How well he does this job is the relevant question, not his coding skills. ”
This is a strange defence. If the code doesn’t implement the model correctly, then his coding skills – more specifically his software engineering skills – are highly relevant. If the code hasn’t been validated through rigorous testing and contains bugs, then it’s worse than having no model at all.
I’m pretty sure his software engineering skills are non-existent, because he’s not a software engineer. I do understand that’s a difficult concept to grasp.
His approach to programming is the reason why the concept of Software Engineering was introduced.
There is no excuse for untidy, unstructured code that is difficult for others (and, after the passage of time, yourself) to follow.
Just describe what it is you are going to do, in a logical sequence of steps, and then code accordingly.
There, Software Engineering in a nutshell.
Just because spaghetti code exists doesn’t mean it’s the norm in a professional development environment. The bottom line is that if the recommendations from a computer program are going to be used to make decisions that significantly affect the daily lives of millions of people, the friggen program absolutely needs to be as solid as possible, which includes frequent code review, proper documentation, and in-depth testing. Then, it needs to be shared for peer review.
“in a professional development environment”. And I am sure when the epidemic struck, he had 6 months to get his team of 100 professional software engineers to refactor that code he had lying in his drawer so it was brought to modern day standards, write automated test rigs, run regression tests and submit the code for certification before using it in anger. And while his minions did all this work, he sat at home in his pyjamas, blogging and sniping at people at the coal face trying to deal with the outbreak.
@dr_t How silly. He and his team should have been doing that all along. That’s the responsible process. The irresponsible, and apparently expected, process is to take the spaghetti approach and rely on the rest of the academic ‘experts’ to defend them if questions arose. Fail.
Precisely, lying in his desk drawer for years. No attempt to bring it up to the standards of today (or 13/30 years ago) it would appear. Just hobby coding, then?
Documentation, Automated Test Rigs, Regression Tests, Certified Code during those years would have been icing on the cake.
I disagree. A stochastic model is simply a deterministic model that has inputs that are generated randomly. For example, let’s say I run a stochastic simulation of a random walk. I build a deterministic model that says if my input number is less than 0.5 then take a step left, else take a step right. I then use a random number generator that gives me a number between 0 and 1. If I generate 5 numbers and they are all less than 0.5, then the person should have taken 5 steps left. If my output says they are anywhere other than 5 steps left, then my model is broken, and running it multiple times and averaging doesn’t fix the issue. For example, let’s say my model actually says take a step left if the number is less than 0.5 AND if Neil Ferguson is horny. Unless Neil is horny every second, my output will be wrong. If Neil is horny only 1/2 of the time, then the random walk will be too far right. Averaging the outputs will not fix that error. And I work in insurance modeling. The comment about insurance modeling is a bit…
I don’t think you understood the bug she was saying was a disaster (the only one she really made a deal of… though in my long comment above, I deal with all her points). The error is not systematic. It is simply that saved state incorrectly codes a seed, so that, using your random walk example, if you ran it once you obviously started with a seed that generates your vector of random numbers. If you now run it again with the same seed, you will get the same answer. Lose that seed, and running it again gives a different answer, but a perfectly legitimate random walk. Since what you are interested in is the average of a large number of walks, the result is the same whether there was a bug in the seed saving or not… because in both cases what you need (and get) is a large number of independent runs. The seed-saving issue does not compromise the independence of the runs… so no problem!
Correct. If you need randomness, you run such a model by providing a controlled range of random *inputs*. You don’t build randomness *into* the code *unless* (i) you report the seeds used for each run, and (ii) you can override them as inputs for testing, validation, QA and reproducibility.
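As a minimal sketch of that discipline (illustrative only, not Imperial’s code), the random walk discussed above can be written so that the seed is an explicit, reported input and the same seed always reproduces the same walk:

```cpp
// Minimal sketch (illustrative only, not Imperial's code) of the seeding
// discipline described above: the seed is an explicit input, it is reported
// alongside the result, and the same seed always reproduces the same walk.
#include <cstdio>
#include <random>

// One random walk of `steps` moves: step left if the draw is < 0.5, else right.
int random_walk(unsigned seed, int steps) {
    std::mt19937 rng(seed);                                // seeded PRNG
    std::uniform_real_distribution<double> draw(0.0, 1.0);
    int position = 0;
    for (int i = 0; i < steps; ++i)
        position += (draw(rng) < 0.5) ? -1 : +1;
    return position;
}

int main() {
    unsigned seed = 12345;  // record this (or accept it as input) for reproducibility
    std::printf("seed %u -> %d\n", seed, random_walk(seed, 1000));
    std::printf("seed %u -> %d  (same seed, identical walk)\n",
                seed, random_walk(seed, 1000));
    std::printf("seed %u -> %d  (different seed, different but valid walk)\n",
                seed + 1, random_walk(seed + 1, 1000));
    return 0;
}
```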
Neal,
Bob’s response to your comment is right. Your comment shows that you do not understand how programming works. Also, given exactly the same inputs, I am afraid that you would expect exactly the same outputs, even in biology.
In biological systems, there are so many variables that one can never know all the “input” values. If you measure only a few things in an experiment and get “random” results, this doesn’t prove that life is inherently random. It just shows that you haven’t measured everything.
At a quantum level, there may be an inherent randomness in some processes, but not for computer programs, and not on the macroscale of life.
Your simulation has to produce identical results with an identical seed, otherwise it would be impossible to test for correct output. You can use random numbers as the seeds, and run many simulations to model reality, and then see what you get. But if two runs using the same seed produce different output, that’s not a good sign.
They don’t. There was a problem with saving the seed, so when you thought you were using the same seed, in fact you were not. For the rest, running many runs gives the same answer whether you successfully saved state between runs or not.
You are confusing your model with code.
Computer code is idempotent and deterministic. That is, for a given set of inputs it will produce the same results. If this were not so, then computers would be pretty useless. Mathematically this makes sense, as a computer processes binary data with a limited range of mathematical and boolean operators and branch routines.
Now, you want your model to be non-deterministic. You do that by introducing some randomness, but it has to be under your control, not via some bug in the code, a race hazard, or some timing event in the thread scheduler. You want to be able to test the model under controlled circumstances, and this clearly wasn’t possible with the code Ferguson wrote.
Getting randomness in computer systems is actually pretty hard and an area of study in itself.
The critique of Ferguson’s code appears to be valid. I don’t think you read it properly, or you didn’t understand what was said. You may be a biologist, but you ain’t no computer scientist.
If what you’re saying were true, hardware random number generators, such as those based on thermal noise or quantum phenomena, would be pretty useless. I have not read the code, but earthflattener apparently has. The non-determinism (which – we are told – was not present in the original code, and was fixed shortly after being discovered in the refactored code, and since the refactored code is new, was thus very short lived, and was presumably not present in the code Ferguson actually used, which is his original code) was due to some garbling of the seed used to restart the random sequence, which just means that a different pseudo-random sequence was used in his stochastic model. This affects reproducibility but is immaterial to the correctness of the stochastic model. You would get the same effect if you used a hardware random number generator, which is superior as it gives a more truly random sequence. If your criticism were correct, that implementation would be incorrect too. Yes, Ferguson is no computer scientist. This is precisely the point – which you seem to be missing. If someone presents a substantive analysis of his model which shows it is substantively wrong, or substantively…
I am a lay person who does not understand computer modelling….but for such huge decisions to be made without adequate peer review of the data is shocking.
Code should always be unit tested by someone other than the developer. Rule number one.
Disagree. Unit testing is the job of the developer. All other forms of post-unit testing are carried out by testers/reviewers.
Many thanks indeed for putting in the time to review the code and to write your informative review.
Thanks for this article – I wrote C code solidly for 5 years – and still do bits and bobs. It does not surprise me one bit – because I knew this was a scam more or less from the “get go” – due to the investigation I did into the “Swine Flu” affair back in 2009. It’s not about public health – it’s about control – and selling pharmaceuticals (tamiflu and vaccines in 2009 and vaccines now). See my report at https://cvpandemicinvestigation.com/ if interested.
So the problem is one of computational mathematics, rather than of software development, which most programmers do. Thankfully I have some learning in computational mathematics. Unfortunately, the vast, vast majority of computational mathematicians are programming amateurs. Maths is the end goal, and programming is just a means to get there. As a result, their software is crap. Ergo, most of the critique is spot on. HOWEVER… the excuses Imperial gave are valid. Stochastic mathematics does not require exact and reproducible results. Its aim is to simulate trends and patterns. Therefore the key thing to critique is the model. The author has failed to do this, so the entire article is bluster. The author’s background is in database programming, not mathematics, so I wouldn’t be surprised if the subject was outside her area of expertise.
While stochastic mathematics may not require reproducible results, software which simulates such models does (in order to prove that the software works as expected and that bugs have not been introduced). This is why, in modelling programs, the stochastic model is given a seed to initiate its randomness.
During development we can use the same seeds repeatedly to ensure we haven’t ‘broken’ the model by introducing bugs which cause unexpected outputs.
For production use we use as many seeds as desired to repeatedly run the model (introducing the required level of randomness) before averaging results.
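A small sketch of those two workflows, using an invented model_run() stub purely so the example is self-contained (illustrative only, not Imperial’s code):

```cpp
// Small sketch (illustrative only) of the two workflows described above.
// model_run() is an invented stand-in for one full simulation.
#include <cassert>
#include <cstdio>
#include <random>

double model_run(unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> outcome(100.0, 15.0);  // stand-in output
    return outcome(rng);
}

int main() {
    // Development: the same seed must reproduce the same output exactly;
    // if this check ever fails, a newly introduced bug has changed behaviour.
    assert(model_run(42) == model_run(42));

    // Production: run an ensemble over many seeds and report the average.
    const unsigned runs = 1000;
    double sum = 0.0;
    for (unsigned seed = 0; seed < runs; ++seed) sum += model_run(seed);
    std::printf("ensemble mean over %u seeds: %.2f\n", runs, sum / runs);
    return 0;
}
```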
This should be a national scandal, but the media will probably make little, if anything, of it. It is a world of staggering absurdity in which the govt has been consulting Ferguson on pandemic issues, given his poor track record of predictions and inadequate software engineering practices.