Code Review of Ferguson’s Model

6 May 2020. Updated 10 May 2020.

by Sue Denim (not the author’s real name)

[Please note: a follow-up analysis is now available here.]

Imperial finally released a derivative of Ferguson’s code. I figured I’d do a review of it and send you some of the things I noticed. I don’t know your background so apologies if some of this is pitched at the wrong level.

My background. I have been writing software for 30 years. I worked at Google between 2006 and 2014, where I was a senior software engineer working on Maps, Gmail and account security. I spent the last five years at a US/UK firm where I designed the company’s database product, amongst other jobs and projects. I was also an independent consultant for a couple of years. Obviously I’m giving only my own professional opinion and not speaking for my current employer.

The code. It isn’t the code Ferguson ran to produce his famous Report 9. What’s been released on GitHub is a heavily modified derivative of it, after having been upgraded for over a month by a team from Microsoft and others. This revised codebase is split into multiple files for legibility and written in C++, whereas the original program was “a single 15,000 line file that had been worked on for a decade” (this is considered extremely poor practice). A request for the original code was made 8 days ago but ignored, and it will probably take some kind of legal compulsion to make them release it.  Clearly, Imperial are too embarrassed by the state of it ever to release it of their own free will, which is unacceptable given that it was paid for by the taxpayer and belongs to them.

The model.  What it’s doing is best described as “SimCity without the graphics”. It attempts to simulate households, schools, offices, people and their movements, etc. I won’t go further into the underlying assumptions, since that’s well explored elsewhere.

Non-deterministic outputs. Due to bugs, the code can produce very different results given identical inputs. They routinely act as if this is unimportant.

This problem makes the code unusable for scientific purposes, given that a key part of the scientific method is the ability to replicate results. Without replication, the findings might not be real at all – as the field of psychology has been finding out to its cost. Even if their original code was released, it’s apparent that the same numbers as in Report 9 might not come out of it.

Non-deterministic outputs may take some explanation, as it’s not something anyone had previously floated as a possibility.

The documentation says:

The model is stochastic. Multiple runs with different seeds should be undertaken to see average behaviour.

“Stochastic” is just a scientific-sounding word for “random”. That’s not a problem if the randomness is intentional pseudo-randomness, i.e. the randomness is derived from a starting “seed” which is iterated to produce the random numbers. Such randomness is often used in Monte Carlo techniques. It’s safe because the seed can be recorded and the same (pseudo-)random numbers produced from it in future. Any kid who’s played Minecraft is familiar with pseudo-randomness because Minecraft gives you the seeds it uses to generate the random worlds, so by sharing seeds you can share worlds.
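To make the distinction concrete, here is a minimal C++ sketch of seeded pseudo-randomness (illustrative only, not taken from the Imperial code): two generators given the same recorded seed produce identical streams, so a “stochastic” run remains exactly reproducible.

```cpp
#include <cstdint>
#include <iostream>
#include <random>

int main() {
    const std::uint32_t seed = 42;  // record this alongside the results

    std::mt19937 run_a(seed);       // the original "run"
    std::mt19937 run_b(seed);       // a later attempt to replicate it

    std::uniform_real_distribution<double> dist_a(0.0, 1.0);
    std::uniform_real_distribution<double> dist_b(0.0, 1.0);

    // Because both generators start from the same seed, every draw matches,
    // so any simulation built on top of them can be reproduced exactly.
    for (int i = 0; i < 5; ++i) {
        double a = dist_a(run_a);
        double b = dist_b(run_b);
        std::cout << a << (a == b ? " == " : " != ") << b << "\n";
    }
    return 0;
}
```

Run it twice and the printed draws are identical both times; that is all “replicable stochastic model” needs to mean.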

Clearly, the documentation wants us to think that, given a starting seed, the model will always produce the same results.

Investigation reveals the truth: the code produces critically different results, even for identical starting seeds and parameters.

I’ll illustrate with a few bugs. In issue 116 a UK “red team” at Edinburgh University reports that they tried to use a mode that stores data tables in a more efficient format for faster loading, and discovered – to their surprise – that the resulting predictions varied by around 80,000 deaths after 80 days.

That mode doesn’t change anything about the world being simulated, so this was obviously a bug.

The Imperial team’s response is that it doesn’t matter: they are “aware of some small non-determinisms”, but “this has historically been considered acceptable because of the general stochastic nature of the model”. Note the phrasing here: Imperial know their code has such bugs, but act as if it’s some inherent randomness of the universe, rather than a result of amateur coding. Apparently, in epidemiology, a difference of 80,000 deaths is “a small non-determinism”.

Imperial advised Edinburgh that the problem goes away if you run the model in single-threaded mode, like they do. This means they suggest using only a single CPU core rather than the many cores that any video game would successfully use. For a simulation of a country, using only a single CPU core is obviously a dire problem – as far from supercomputing as you can get. Nonetheless, that’s how Imperial use the code: they know it breaks when they try to run it faster. It’s clear from reading the code that in 2014 Imperial tried to make the code use multiple CPUs to speed it up, but never made it work reliably. This sort of programming is known to be difficult and usually requires senior, experienced engineers to get good results. Results that randomly change from run to run are a common consequence of thread-safety bugs. More colloquially, these are known as “Heisenbugs”.
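For readers who haven’t met thread-safety bugs before, here is a minimal, self-contained C++ sketch (illustrative only, nothing to do with the Imperial codebase) of how a data race makes a program’s answer change from run to run: several threads update a shared counter without synchronisation, so updates are lost depending on how the operating system happens to schedule them.

```cpp
#include <iostream>
#include <thread>
#include <vector>

int main() {
    long total = 0;  // shared and unsynchronised: this is the bug

    auto work = [&total] {
        for (int i = 0; i < 100000; ++i)
            total += 1;  // unprotected read-modify-write: a data race
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < 8; ++t) threads.emplace_back(work);
    for (auto& th : threads) th.join();

    // The correct answer is 800000, but lost updates mean the printed value
    // varies between runs even though the inputs never change.
    std::cout << "total = " << total << " (expected 800000)\n";
    return 0;
}
```

The fix is to protect the shared state (a mutex or atomic) or keep each thread’s work independent; papering over it by averaging many runs is not a fix.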

But Edinburgh came back and reported that – even in single-threaded mode – they still see the problem. So Imperial’s understanding of the issue is wrong.  Finally, Imperial admit there’s a bug by referencing a code change they’ve made that fixes it. The explanation given is “It looks like historically the second pair of seeds had been used at this point, to make the runs identical regardless of how the network was made, but that this had been changed when seed-resetting was implemented”. In other words, in the process of changing the model they made it non-replicable and never noticed.
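My reading of that explanation, sketched below as a guess at the intended design rather than the project’s actual code: one pair of seeds drives construction of the contact network, and the generator is then re-seeded from a second pair before the epidemic itself is simulated, so the epidemic results do not depend on how (or whether) the network was rebuilt.

```cpp
#include <cstdint>
#include <iostream>
#include <random>

// Hypothetical stand-ins for the two phases of a run.
double build_network(std::mt19937& rng) {
    std::uniform_int_distribution<int> d(1, 10);
    double checksum = 0;
    int draws = d(rng);                      // consumes a variable number of draws
    for (int i = 0; i < draws; ++i) checksum += d(rng);
    return checksum;
}

double run_epidemic(std::mt19937& rng) {
    std::uniform_real_distribution<double> d(0.0, 1.0);
    double total = 0;
    for (int i = 0; i < 1000; ++i) total += d(rng);
    return total;
}

int main() {
    const std::uint32_t setup_seed = 1234;   // "first pair": network construction
    const std::uint32_t run_seed   = 5678;   // "second pair": the epidemic itself

    std::mt19937 rng(setup_seed);
    build_network(rng);     // may consume a different number of draws each time

    rng.seed(run_seed);     // re-seed here, so the epidemic phase sees the same
                            // random stream regardless of how the network was made
    std::cout << "epidemic score = " << run_epidemic(rng) << "\n";
    return 0;
}
```

Drop the re-seed and the epidemic’s random stream shifts whenever the network-building code changes – which is exactly the class of silent non-replicability being described.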

Why didn’t they notice? Because their code is so deeply riddled with similar bugs and they struggled so much to fix them that they got into the habit of simply averaging the results of multiple runs to cover it up… and eventually this behaviour became normalised within the team.

In issue #30, someone reports that the model produces different outputs depending on what kind of computer it’s run on (regardless of the number of CPUs). Again, the explanation is that although this new problem “will just add to the issues” …  “This isn’t a problem running the model in full as it is stochastic anyway”.

Although the academic on those threads isn’t Neil Ferguson, Ferguson himself is well aware that the code is filled with bugs that create random results. In change #107, which he authored, he comments: “It includes fixes to InitModel to ensure deterministic runs with holidays enabled”. In change #158 he describes the change only as “A lot of small changes, some critical to determinacy”.

Imperial are trying to have their cake and eat it.  Reports of random results are dismissed with responses like “that’s not a problem, just run it a lot of times and take the average”, but at the same time, they’re fixing such bugs when they find them. They know their code can’t withstand scrutiny, so they hid it until professionals had a chance to fix it, but the damage from over a decade of amateur hobby programming is so extensive that even Microsoft were unable to make it run right.

No tests. In the discussion of the fix for the first bug, Imperial state that the code used to be deterministic at that point, but that they broke it without noticing when making other changes.

Regressions like that are common when working on a complex piece of software, which is why industrial software-engineering teams write automated regression tests. These are programs that run the program with varying inputs and then check the outputs are what’s expected. Every proposed change is run against every test and if any tests fail, the change may not be made.
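As an illustration of how small such a safety net can be – a sketch assuming a deterministic, seeded model, not the project’s actual test suite – a regression test just re-runs the model on fixed inputs and fails loudly if the output no longer matches a recorded baseline:

```cpp
#include <cstdint>
#include <fstream>
#include <iostream>
#include <random>

// Hypothetical stand-in for running the model with a fixed seed and inputs.
long run_model(std::uint32_t seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> d(0, 9);
    long total = 0;
    for (int i = 0; i < 100000; ++i) total += d(rng);
    return total;
}

int main() {
    const long output = run_model(42);
    const char* baseline_file = "baseline.txt";

    long baseline = 0;
    if (std::ifstream in{baseline_file}; in >> baseline) {
        // A baseline was recorded by an earlier run: any mismatch means some
        // change has altered the results, i.e. a regression.
        if (output != baseline) {
            std::cerr << "REGRESSION: got " << output
                      << ", expected " << baseline << "\n";
            return 1;
        }
        std::cout << "regression test passed\n";
    } else {
        // First run: record the baseline for all future comparisons.
        std::ofstream(baseline_file) << output << "\n";
        std::cout << "baseline recorded: " << output << "\n";
    }
    return 0;
}
```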

The Imperial code doesn’t seem to have working regression tests. They tried, but the extent of the random behaviour in their code left them defeated. On 4th April they said:  “However, we haven’t had the time to work out a scalable and maintainable way of running the regression test in a way that allows a small amount of variation, but doesn’t let the figures drift over time.”
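For what it’s worth, the problem they describe has a standard shape: compare every run against a frozen reference value with an explicit tolerance, rather than against the previous run (comparing against the previous run is what lets figures drift). A minimal sketch with made-up numbers:

```cpp
#include <cmath>
#include <cstdlib>
#include <iostream>

int main() {
    // Reference value frozen when the test was created, plus an explicit
    // tolerance for acceptable run-to-run variation (both figures made up).
    const double reference = 510000.0;  // e.g. projected deaths in one scenario
    const double tolerance = 0.01;      // allow 1% variation, no more

    // Stand-in for this run's output; in a real test it comes from the model.
    const double output = 512300.0;

    // Every run is judged against the same frozen reference, so small
    // stochastic wobble is tolerated but figures cannot quietly drift over time.
    const double relative_error = std::fabs(output - reference) / reference;
    if (relative_error > tolerance) {
        std::cerr << "FAIL: output " << output << " deviates "
                  << relative_error * 100 << "% from reference " << reference << "\n";
        return EXIT_FAILURE;
    }
    std::cout << "within tolerance (" << relative_error * 100 << "%)\n";
    return EXIT_SUCCESS;
}
```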

Beyond the apparently unsalvageable nature of this specific codebase, testing model predictions faces a fundamental problem, in that the authors don’t know what the “correct” answer is until long after the fact, and by then the code has changed again anyway, thus changing the set of bugs in it. So it’s unclear what regression tests really mean for models like this – even if they had some that worked.

Undocumented equations. Much of the code consists of formulas for which no purpose is given. John Carmack (a legendary video-game programmer) surmised that some of the code might have been automatically translated from FORTRAN some years ago.

For example, on line 510 of SetupModel.cpp there is a loop over all the “places”  the simulation knows about. This code appears to be trying to calculate R0 for “places”. Hotels are excluded during this pass, without explanation.

This bit of code highlights an issue Caswell Bligh has discussed in your site’s comments: R0 isn’t a real characteristic of the virus. R0 is both an input to and an output of these models, and is routinely adjusted for different environments and situations. A model that consumes its own outputs as inputs is a problem well known to the private sector – it can lead to rapid divergence and incorrect predictions. There’s a discussion of this problem in section 2.2 of the Google paper, “Machine learning: the high interest credit card of technical debt”.
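To see why a model consuming its own output is dangerous, here is a toy C++ illustration of mine (not the Imperial model): each cycle’s “calibrated” R is taken from the previous cycle’s output, and a small systematic bias that would be harmless in one run compounds into divergence.

```cpp
#include <iostream>

int main() {
    // Toy feedback loop: the R fed into the next run is taken from the
    // previous run's output. A small systematic bias (+2% here) that would
    // be negligible in a single run compounds across repeated cycles.
    double r = 2.5;             // initial estimate supplied as input
    const double bias = 1.02;   // the model slightly inflates its own input

    for (int cycle = 1; cycle <= 20; ++cycle) {
        const double output_r = r * bias;  // "output" derived from the input...
        r = output_r;                      // ...which becomes the next input
        std::cout << "cycle " << cycle << ": R = " << r << "\n";
    }
    // After 20 cycles R has grown by roughly 50% purely from the feedback,
    // with no change at all in the underlying epidemic.
    return 0;
}
```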

Continuing development. Despite being aware of the severe problems in their code that they “haven’t had time” to fix, the Imperial team continue to add new features; for instance, the model attempts to simulate the impact of digital contact tracing apps.

Adding new features to a codebase with this many quality problems will just compound them and make them worse. If I saw this in a company I was consulting for I’d immediately advise them to halt new feature development until thorough regression testing was in place and code quality had been improved.

Conclusions. All papers based on this code should be retracted immediately. Imperial’s modelling efforts should be reset with a new team that isn’t under Professor Ferguson, and which has a commitment to replicable results with published code from day one. 

On a personal level, I’d go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people, and the results speak for themselves.

My identity. Sue Denim isn’t a real person (read it out). I’ve chosen to remain anonymous partly because of the intense fighting that surrounds lockdown, but there’s also a deeper reason. This situation has come about due to rampant credentialism and I’m tired of it. As the widespread dismay among programmers demonstrates, if anyone in SAGE or the Government had shown the code to a working software engineer they happened to know, alarm bells would have been rung immediately. Instead, the Government is dominated by academics who apparently felt unable to question anything done by a fellow professor. Meanwhile, average citizens like myself are told we should never question “expertise”. Although I’ve proven my Google employment to Toby, this mentality is damaging and needs to end: please, evaluate the claims I’ve made for yourself, or ask a programmer you know and trust to evaluate them for you.

612 Comments
Will Jones
29 days ago

Devastating. Heads must roll for this, and fundamental changes be made to the way government relates to academics and the standards expected of researchers. Imperial College should be ashamed of themselves.

Lms2
29 days ago
Reply to  Will Jones

The UK government should be just as ashamed for taking their advice.
And anyone in the media who repeated their nonsense.

Bob Hope
27 days ago
Reply to  Lms2

But the paper never explicitly recommended full lockdown. School closures, yes. Case isolation and social distancing, yep. But it doesn’t say anything about not going to work, not exercising frequently or travelling. Nor does it say anything about well people remaining in their homes and only being allowed to leave them with a “reasonable excuse”…

fdsfxv
27 days ago
Reply to  Bob Hope

Ferguson is on video telling us – not suggesting – that millions of people will die if we don’t implement China-style lockdowns.

Everette
27 days ago
Reply to  Bob Hope

Closing schools and going to work ???

DEEBEE
26 days ago
Reply to  Everette

A little Home Alone never hurt anyone🤣

Mbkmbk
23 days ago
Reply to  DEEBEE

Bullshit! A friend’s brother just killed himself because of it…

Leo Pierson
21 days ago
Reply to  DEEBEE

So says the demented government

MJBlair
26 days ago
Reply to  Everette

I brought that up at the time, and was shouted down by the usual suspects. Grandparents would look after the children. But aren’t they the very ones at risk? Children are the least likely to get the virus, but they would certainly carry it straight to their grandparents.
I’ve written several pieces on the subject of the virus circus.
Now I’m just wanting some people to take responsibility for the hell this house arrest has caused.

Leo Pierson
21 days ago
Reply to  Bob Hope

Not explicitly, true maybe, but when does any government need more than implication to force its sickening, power-hungry will upon the general public?

Rachel
21 days ago
Reply to  Bob Hope

Sure it does. It’s what the paper refers to as “suppression,” rather than “mitigation.”

Sensible Sam
16 days ago
Reply to  Lms2

Ah, hindsight….

Robert
29 days ago
Reply to  Will Jones

The problem is the nature of government and politics. Politics is a systematic way of transferring the consequences of inadequate or even reckless decision-making to others without the consent or often even the knowledge of those others. Politics and science are inherently antithetical. Science is about discovering the truth, no matter how inconvenient or unwelcome it may be to particular interested parties. Politics is about accomplishing the goal of interested parties and hiding any truth that would tend to impede that goal. The problem is not that “government has been doing it wrong;” the problem is that government has been doing it.

Tom Welsh
28 days ago
Reply to  Will Jones

This article explains how such software should be written. (After the domain experts have reasoned out a correct model and had it verified by open peer review, and if possible by formal methods).

“They Write the Right Stuff” by Charles Fishman, December 1996

https://www.fastcompany.com/28121/they-write-right-stuff

Tom Welsh
28 days ago
Reply to  Tom Welsh

After all, only 7 lives depended directly on the Space Shuttle software. The Imperial College program seems likely to have cost many thousands of extra deaths, and to have seriously damaged the economies and societies of scores of countries, affecting possibly billions of lives.

So why should the resources invested in the two efforts have been so vastly different?

Helen
27 days ago
Reply to  Tom Welsh

I agree totally. The underfunding of important programs like this feeds into the quality of the resultant model. Concerning Sue Denim’s point, it doesn’t mean that the work should be privatised and transferred to the insurance sector. As a sector they have a large vested interest in a more biased model, at least more than the average fame-hungry epidemiology researcher. The whole purpose of scientific research is to push the boundaries of understanding, so politicians should be analytical enough to understand the limitations of research. It is akin to using a prototype F-35 to go to war: reckless.

afssdfsd
27 days ago
Reply to  Helen

You don’t need a lot of funds to review code, they could actually open source it and the community would destroy it for them

Helen
27 days ago
Reply to  afssdfsd

You’ve misread me; that is not my point. I agree that the code should be reviewed and open source. However, my point is that the investment of time and resources should have been made before COVID, not as an after-the-fact effort.

Ex-Oligarch
27 days ago
Reply to  Helen

The solution to incompetence and fraud is not to give more money to incompetent frauds.

Mike Whittaker
24 days ago
Reply to  afssdfsd

Code review is not about “destroying” the code, it’s about improving it: not least, the knowledge it’s going to be reviewed improves the code as it’s written …

Jonathan Gilmore
22 days ago
Reply to  Mike Whittaker

The term “destroy” is often used in the context of code reviews, and means destroying the credibility of the code – exposing its flaws and failures. So, as you say, it is a positive thing.

Mathieu
26 days ago
Reply to  Helen

A key lesson is that government should equip themselves with capacity to critically appraise risk of bias in scientists’ work. What strikes me is that professors in epidemiology and public health believe that such models are worth presenting to policy makers. WHO in its guideline for non pharmaceutical interventions against influenza grades mathematical models as very low level of evidence.

LorenzoValla
19 days ago
Reply to  Helen

I agree with the sentiment, but this is not science, and it’s only important because government officials were led to believe that it was science.

The entire notion of a ‘social science’ is the biggest intellectual fraud in human history and is only made possible by academics who exploited the hard earned credibility of the physical sciences to elevate the status of their own fields.

Boccko
17 days ago
Reply to  Tom Welsh

That’s the other side of the horse. You need more granularity.

mailman
28 days ago
Reply to  Will Jones

But they won’t. Everyone involved in this now has skin in the game to ensure NOTHING happens and the lockdown carries on as if it’s the only thing keeping the entire country from dying.

Mark
28 days ago
Reply to  Will Jones

Well, this is exactly why there is a growing movement in academia at grassroots level to campaign for groups to use proper software practices (version control, automated testing and so on).

Barrie Singleton
28 days ago
Reply to  Will Jones

Vital to factor in Britain’s endemic corruption before seeking head-roll redress. There is none.
I speak from experience. Case study: https://spoilpartygames.co.uk/?page_id=4454

thelastnameleft
28 days ago
Reply to  Will Jones

It isn’t devastating at all.

JHaywood
28 days ago
Reply to  Will Jones

No. The issue with this analysis is that it attempts to discredit the Imperial code. It does not say that lockdown should not have taken place. It does not propose an alternative model that says a different course of action should be followed. It is reasonable to state that lockdown was the right approach given available data and models – this article does nothing to objectively state a different course of action would have led to different results. Taking an approach of risk mitigation, I.e. lockdown, is the sensible approach given the output (including variances due to the core and any bugs). Interested if there is any view to substantiate a different path.

Tim Bidie
27 days ago
Reply to  JHaywood

There was no available reliable data:

‘From Jan 15 to March 3, 2020, seven versions of the case definition for COVID-19 were issued by the National Health Commission in China. We estimated that when the case definitions were changed, the proportion of infections being detected as cases increased by 7·1 times (95% credible interval [CrI] 4·8–10·9) from version 1 to 2, 2·8 times (1·9–4·2) from version 2 to 4, and 4·2 times (2·6–7·3) from version 4 to 5. If the fifth version of the case definition had been applied throughout the outbreak with sufficient testing capacity, we estimated that by Feb 20, 2020, there would have been 232 000 (95% CrI 161 000–359 000) confirmed cases in China as opposed to the 55 508 confirmed cases reported.’

https://www.thelancet.com/journals/lanpub/article/PIIS2468-2667(20)30089-X/fulltext

So the use of a model, any model, in preference to, say, canvassing the best advice of a panel of epidemiologists with many years of experience of coronaviruses was, at best, ill judged.

‘Sunlight will cut the virus ability to grow in half so the half-life will be 2.5 minutes and in the dark it’s about 13m to 20m. Sunlight is really good at killing viruses. That’s why I believe that Australia and the southern hemisphere will not see any great infections rates because they have lots of sunlight and they are in the middle of summer. And Wuhan and Beijing is still cold which is why there’s high infection rates.’

‘With SARS, in 6 months the virus was all gone and it pretty much never came back. SARS pretty much found a sweet spot of the perfect environment to develop and hasn’t come back. So no pharmaceutical company will spend millions and millions to develop a vaccine for something which may never come back. It’s Hollywood to think that vaccines will save the world. The social conditions are what will control the virus – the cleaning of hands, isolating sick people etc…’

https://www.fwdeveryone.com/t/puzmZFQGRTiiquwLa6tT-g/conference-call-coronavirus-expert

Professor John Nicholls, Coronavirus expert, University of Hong Kong

Deaths from Covid 19 in Hong Kong? Four, exactly…….

Edo McGowan
27 days ago
Reply to  Tim Bidie

If I remember correctly with SARS, they reduced the effort early-on in Toronto, perhaps from business pressure, and the thing came back. One of the areas needing some thought is dispersal via sewage treatment.

P. O'Nym
27 days ago
Reply to  JHaywood

He/she would need a model of his/her own, and a much better one, to analyse results and compute whether this or another course of action was best. Do you suppose there is a better model that is, for some reason, not being used? It is of course not certain that a faulty model would produce the correct answer – even a stopped watch is right twice a day – but it is quite likely.

P. O'Nym
27 days ago
Reply to  P. O'Nym

Sorry, missed a negative there. Not produce

Ex-Oligarch
27 days ago
Reply to  JHaywood

The analysis doesn’t just “attempt[] to discredit the Imperial code.” It does so successfully.

And we now know that the Imperial model’s projections do not match the real outcomes.

Vast social and economic changes have been forced on the populace as a result of bad modeling and unreliable data.

It is emphatically not incumbent on critics of the models, the data gathering, or the lockdown regime to put forward their own models or data, let alone some alternative set of response measures.

James Hamilton
26 days ago
Reply to  Ex-Oligarch

“It does so successfully.”

Not necessarily. The article presupposes that the code should stand up to the sort of tests commercial software engineers use when creating distributable software for general consumption. This is not the point of Ferguson’s code.

Statistical models tend to generate their results by being run thousands of times using different starting values. This produces a set of vectors of output values which, when plotted on a linear chart, show a convergence around a given set of values. It is the converged values which are used to predict the distribution they are trying to model, and so provided that Imperial College knew about these flaws (which they say they do – that they are fixing them may be a PR exercise) then it shouldn’t really matter.

I find it pretty incomprehensible that the government isn’t being more alert to the criticisms of Ferguson’s model – particularly given how wrong it has been in the past – but I am led to believe that this is more due to its assumptions than any particular issue with the code. I have not heard of anyone else implementing the mathematical model in a different program and getting different results. That would be genuine evidence of a serious problem with the simulation.

Richard
26 days ago
Reply to  James Hamilton

The article’s author bases his criticism on the presence of bugs and randomness. All software has bugs; the question is “does the bug materially reduce its utility?” I have not seen the code, so this criticism is about the author’s assumptions. In an epidemiological model, randomness is a feature, not a bug. The disease follows vectors probabilistically, not deterministically. This isn’t an email program or a database application; if the model always returned the same output for the same input, that would be a bug. Prescribing deterministic behavior may prevent discovery of non-linear disease effects.

Rakib Hassan
25 days ago
Reply to  Richard

If the nondeterministic effects result in prediction variances the same order of magnitude as the predictions themselves, there is a fundamental problem that simply cannot be hand-waved out of.

Throgmorton
18 days ago
Reply to  Richard

Indeterminism is an essential feature of stochastic modelling, but the outputs of successive model runs ought to converge to form a roughly similar picture if they are to be useful. If they are wildly divergent as a result of the way the program was written, which is the case here, then there is most certainly an issue which needs to be corrected.

Daniel
25 days ago
Reply to  James Hamilton

If the model is flawed at its core, then its settling on a particular converged set of values after thousands of runs lends no more credence to its accuracy than a single run. Given the real-world data, the model seems flawed at its core.

Mark
24 days ago
Reply to  James Hamilton

“Statistical models tend to generate their results by being run thousands of times using different starting values.”

Yes, but the idea of having different starting values is that when you run a model twice with the same starting values, it is supposed to give the same result each time.

The Imperial model cannot do that, which means it is not and cannot be correct.

There are several possible problem areas. Subtraction/multiplication/division of small floating point values is a beginner mistake, and yet the code quality sounds so bad that I bet there are some of those as well.

aspnaz
26 days ago
Reply to  JHaywood

“It is reasonable to state that lockdown was the right approach given available data and models”

What evidence are you thinking of that demonstrates that lockdown worked in the past and therefore makes it a good policy for this pandemic? From my understanding this is the first such blanket lockdown. We know that self-isolation works for individuals, but are you extrapolating from individuals to the whole population?

Your thinking reminds me of people who think that because washing your hands prevents the spread of disease, it must therefore be good to keep your baby in a clean environment: makes sense, logically it all hangs together. Unfortunately it is also a bad assumption, because immunity does not work that way. Babies exposed to more germs are generally healthier in the long term: doesn’t make sense, but there you go. That is the advantage of science – real science involves reproducible outcomes and experimentation that is sometimes surprising; it does not use heavily flawed models skewed by bad assumptions.

Henry Clayton
22 days ago
Reply to  JHaywood

‘Taking an approach of risk mitigation, I.e. lockdown, is the sensible approach’. . . . Lockdown attempts to mitigate one risk–spread of infection–while introducing numerous others, none of which are modeled. The world is far more complex than mathematical-modeling infectious disease epidemiologists seem to realise. This was not a sensible approach, which is why it is an approach that has never been taken for any pandemic in the history of the world prior to this.

Rachel
21 days ago
Reply to  JHaywood

JHaywood, that isn’t an “issue”; it’s merely a single point of fact. That the code is a mess and produces muddled results is only one piece of the puzzle. Another very important point is that even the best modeling done with the best code is only as good as the data entered into it. The “data” used to create this model were largely untested assumptions.

It’s fair enough to state that for the first couple of weeks, that’s the best we had. But sound thinkers would have realized the assumptions were assumptions and observed and collected data to TEST them and adjust as necessary. It took weeks to get anyone to REALLY look at most of them, and one by one, we’re seeing the assumptions proven false. There’s no excuse for having not sought answers to these questions — which “mere laypeople” were raising as early as January — much sooner.

Aayush Gupta
19 hours ago
Reply to  JHaywood

Yes there is a different path, which is to treat COVID-19 as a normal disease just like any other illness. Sweden has shown how the rest of the world should have coped with this issue. There was unnecessary hype created all over the world just for a simple cold and cough. I mean seriously, who says we are more advanced than before? We are still living in the stone age, being driven by fear rather than science or logic. Science refuses to call SARS-CoV-2 a deadly virus; there are no peer-reviewed reports to date which claim that this virus is indeed capable of inflicting serious damage on otherwise healthy people. It is just getting tagged as the cause of death, even when the actual causes are the co-morbidities and co-infections.
The past two months have shown that no matter how much pride we take in being scientifically advanced, in the end, when put to a real testing situation, we still suffer as pathetically as before.

tresmegistus
27 days ago
Reply to  Will Jones

Heads need to roll not only at Imperial but at the government as well. This is total incompetence, and who is going to accept responsibility for the sheer destruction of the economy and for those who have been made redundant? Why were parliament and the government not aware of the previous modelling problems associated with this same professor re the mad cow disease, when 6m cattle were slaughtered for no reason at all? Why was his history not checked?

JudgeP
25 days ago
Reply to  Will Jones

“all models are wrong, some are useful” – if govt hadn’t acted on this it would have been far, far worse, so does it matter? At least they did the right thing as a result… Very easy to pick fault with no better solution..

Mike Whittaker
24 days ago
Reply to  Will Jones

No, it’s what should have happened under Peer review, but this is belatedly being applied.

Some public-spirited large company such as Google or Microsoft (don’t laugh !), should offer to modularise the model so it’s more easily a. Maintained b. Re-used c. Tested d. Updated

Nick Townsend
18 days ago
Reply to  Will Jones

The first thought that springs to my mind is that, irrespective of the coding, hundreds of thousands have died, world wide, from a single cause attributed to this virus.

That, surely, is fairly potent evidence that a virus, that also came within measurable distance of killing the English Prime Minister, and HAS killed countless numbers in this country alone, has been accurately identified as a virus with lethal properties?

Professor Ferguson’s coding might have been out, but the virus is, potentially and actually, a killer, and highly infectious.

Surely that justifies government strategy?

Kieran
11 days ago
Reply to  Will Jones

Agreed. Expert opinion is only as valuable as the reasoning which produces it. What matters for decision makers are the logic and assumptions which underlie the expert’s conclusion. The advice that follows a conclusion also needs to be examined for logical flaws. The cult of the expert has allowed the development of extremely sloppy thinking, both in the expert’s field and in the decision maker’s field.

Mimi
29 days ago

Thank you so much for this! This code should’ve been available from the outset.

Sean Flanagan
29 days ago
Reply to  Mimi

Amateur Hour all round!
The code should have been made available to all other Profs & top Coders & Data Scientists & Bio-Statisticians to PEER Review BEFORE the UK and USA Gvts made their decisions. Imperial should be sued for such amateur work.

rickk
28 days ago
Reply to  Sean Flanagan

Guy at carnival: Here, drink this
Some ol’bloke : What is it?
Guy at carnival: Never mind, it will fix what’s ailing ya
Some ol’bloke : What’s it cost?
Guy at carnival: It doesn’t matter, it’s a deal at twice the price
Some ol’bloke : What’s in it?
Guy at carnival: Shhhhh, just take 3 swigs
Some ol’bloke : It tastes horrible
Guy at carnival: Ya, but it will help you
Some ol’bloke : …if you say so
Guy at carnival: I know hey, but you feel better already

Russ Nelson
28 days ago
Reply to  Mimi

But “this code” isn’t what Ferguson was running. The code on GitHub has been munged by other authors in an attempt to make it less horrifying. We must remember that what he ran was much worse than what we can see, which is bad enough.

Vadim Antonov
27 days ago
Reply to  Russ Nelson

This code is an IMPROVED (and cleaned up, a lot, by professional software engineers) version of the code which was run by Ferguson & Co. It’s still a steaming pile of crap. Ferguson refuses to release the original code, if you haven’t noticed. One is left to wonder why.

Caswell Bligh
29 days ago

This is an outstanding investigation. Many thanks for doing it – and to Toby for providing a place to publish it.

lesg
29 days ago

So this is ‘the science’ that the Government thinks it is following!

lesg
29 days ago
Reply to  lesg

*the Government reminds us*

Mike Whittaker
20 days ago
Reply to  lesg

Says who ?

ChrisH29
29 days ago

This isn’t a piece of poor software for a computer game; it is, apparently, the useless software that has shut down the entire western economy. Not only will it have wasted staggeringly vast sums of money, but every day we are hearing of the lives that will be lost as a result.
We are today learning of 1.4 million avoidable deaths from TB but that is nothing compared to the UN’s own forecast of “famine on a biblical scale”. Does one think that the odious, inept, morally bankrupt hypocrite, Ferguson will feel any shame, sorrow or remorse if, heaven forbid, the news in a couple of months time is dominated by the deaths of hundreds of thousands of children from starvation in the 3rd World or will his hubris protect him?

speedy
29 days ago
Reply to  ChrisH29

I don’t understand why governments are still going for this ridiculous policy and NGOs all pretend it is Covid 19 that will cause this devastation RATHER than our reaction to it.

Ilma
27 days ago
Reply to  speedy

It’s the same with the myriad of climate change campaigners. It’s their climate change *policies* that are dangerous, not climate change itself (whatever ‘climate change’ means!).

Joseph A-Smith
26 days ago
Reply to  speedy

Simple – they are afraid to say that they have made a mistake. And, people who follow this are afraid, as per The Emperor’s New Clothes, to admit that they are being used as gullible fools.

EppingBlogger
28 days ago
Reply to  ChrisH29

Imperial and the Professor should start to worry about claims for losses incurred as a result of decisions taken based on such a poor effort. Could we know, please, what this has cost, over how many years, and how much of the Professor’s career has been achieved on the back of it?

Andy
28 days ago
Reply to  EppingBlogger

Remember that Ferguson has a track record of failure:

In 2002 he predicted 50,000 people would die of BSE. Actual number: 178 (national CJD research and surveillance team).
In 2005 he predicted 200 million people would die of avian flu H5N1. Actual number according to the WHO: 78
In 2009 he predicted that swine flu H1N1 would kill 65,000 people. Actual number 457.
In 2020 he predicted 500,000 Britons would die from Covid-19.

Still employed by the government. Maybe 5th time lucky?

itsspideyman
28 days ago
Reply to  Andy

Maybe but he’ll have to step up his game.

William Gruff
28 days ago
Reply to  Andy

The figure of 500,000 deaths was based on the government’s ‘do nothing, business as usual to achieve herd immunity’ strategy then in effect. Ferguson predicted 250,000 deaths if the government acted as it has done since.

whatever
27 days ago
Reply to  Andy

Yeah… way more people died of BSE than 178…

Vadim Antonov
27 days ago
Reply to  whatever

Source?

JHaywood
28 days ago
Reply to  Andy

Actually he didn’t. The model said if no action was taken up to 500,000 people could die. Please weigh in objectively to support or challenge the theory above.

Vee
27 days ago
Reply to  Andy

Do you mean just in the UK? Because swine flu killed way more than 457 in the parts of the world where they didn’t vaccinate.

Juan Luna
27 days ago
Reply to  Andy

Ferguson should be retired and his team disbanded. As a former software professional I am horrified at the state of the code explained here. But then, the University of East Anglia code for modelling climate change was just as bad. Academics and programming don’t go together.

At the very least the Government should have commissioned a Red team vs Blue team debate between Ferguson and Oxford plus other interested parties, with full disclosure of source code and inputs.

I support the idea of letting the Insurance industry do the modelling. They are the experts in this field.

jont
27 days ago
Reply to  Juan Luna

The software is irrelevant: a convenient peg to hang a global action on, for reasons I cannot divine at present but which will become clearer.

Ruby
26 days ago
Reply to  Juan Luna

Ferguson and Oxford are the same team. If you look at the authors of the Ferguson papers you’ll find Oxford names there. If you look at the authors of papers from John Edmunds group you’ll find people who hold posts at Imperial. These groups are not independent.

Martin A
26 days ago
Reply to  Ruby

I read that Ferguson has a house in Oxford.

Em Comments
19 days ago
Reply to  Andy

There was a RANGE from the MODEL, not a PREDICTION. From a 2002 report by the Guardian (https://www.theguardian.com/education/2002/jan/09/research.highereducation)
“The Imperial College team predicted that the future number of deaths from Creutzfeldt-Jakob disease (vCJD) due to exposure to BSE in beef was likely to lie between 50 and 50,000.

In the “worst case” scenario of a growing sheep epidemic, the range of future numbers of death increased to between 110 and 150,000. Other more optimistic scenarios had little impact on the figures.

The latest figures from the Department of Health, dated January 7, show that a total of 113 definite and probable cases of vCJD have been recorded since the disease first emerged in 1995. Nine of these victims are still alive.”

nick
28 days ago
Reply to  ChrisH29

Pathetic review. You should go through the logic of what is coded and not write superficial criticisms which imply you know nothing of what you critique.

james
28 days ago
Reply to  nick

If only the code could actually be understood. It’s so bad you can’t even be certain of what exactly it’s doing.

Doug
28 days ago
Reply to  nick

Pretty sure the only point of the article was to bring light to the fact that the “model” is flawed and Ferguson has a track record of being VERY wrong on mortality rate predictions based upon flawed models. Solution, stop it. This time around it almost took down an entire country’s economy because of elitist’s overreaction and overreach. Just stop it.

silent one
27 days ago
Reply to  Doug

“‘Almost’ took down an entire country’s economy”?
They haven’t stopped the lockdown yet – plenty of time yet to destroy small businesses.

Dean Cardno
28 days ago
Reply to  nick

I couldn’t disagree more. The issue isn’t the virology, or the immunology, or even the behaviour of whatever disease is being examined / simulated. It is the programming discipline applied to the modelling effort. I doubt the author has the domain-specific expertise to comment on the immunological (etc) assumptions embedded in the program. What the author does have is the programming expertise to identify that the model could not produce useful output, no matter how accurate the virology / immunology assumptions, because the software that translated those assumptions into predictions of infections and case loads was so poorly written.

Ben Grove
28 days ago
Reply to  ChrisH29

I’m afraid Ferguson is a very small part of the plan, and merely doing what he was hired for by KillBill.

Lewian
27 days ago
Reply to  ChrisH29

It’s inappropriately UK-centric to speak of “the useless software that has shut down the entire western economy”. All governments have scientific advisors, there’s lots of modelling going on in many countries, and much of this influenced the lockdown decisions all over the world. If I remember it correctly, when Italy started its lockdown, Imperial hadn’t yet made their recommendation, and many if not most countries have not relied on Imperial. The software may be garbage, but the belief that there wouldn’t be strong scientific arguments for a lockdown without that piece is nonsense as well.

It's Science, Yo
25 days ago
Reply to  Lewian

Thank goodness I’ve encountered a small injection of level-headedness here. The original critique is limited, appropriately, to the flawed coding and reliance on its outputs to inform UK policy; it draws none of the sweeping conclusions that others here seem to think are implied – perhaps owing to their own biases. (I came to this site thinking it was named for skeptics who are in lockdown, before I realised it was for skeptics *of* lockdown – so, yeah, plenty of motivated reasoning and politically-charged statements masquerading as incontrovertible truths, but hey, I’m just an actual skeptic… )

So anyway, I’m glad that Lewian has pointed out, because somehow it needed to be, that the world is bigger than the UK and that science (note: not code or software or politics or a dude called Neil) does not operate in a vacuum. One needn’t input even a single data point into a single model in order to undertake risk mitigation strategies if you (and by you, I mean the relevant scientific minds, not actually you) have even a comparatively rudimentary understanding of an infectious agent such as Covid-19. It’s simple cause and effect, extrapolated.

Want to be a skeptic in the classical tradition? Listen to the best available science from the most experienced scientists and researchers in virology, infectious disease, public health and epidemiology. Rely on their collective expertise to inform your own positions, because their baseline knowledge and understanding of the variables at play is granular and complex and anchored in decades of science and scientific research and informed by the fluid facts on the ground.

Rachel
21 days ago
Reply to  Lewian

It seems to have been the primary determinant here in the US, too (and, I think, in Canada). Or at least that’s what they’re telling us.

Marc
21 days ago
Reply to  Rachel

I think you are all missing the point. The use of the model was to impress upon the U.S. President the severity of the outbreak. He needed more than “the best available science from the most experienced scientists and researchers in virology, infectious disease, public health and epidemiology” to take it seriously. Clearly, they had done their own modeling and assessment. This thread is what happens when you lose yourself in the code and aren’t looking at the big picture.

Simon Conway-Smith
29 days ago

Why any of this isn’t obvious to our politicians says a lot about our politicians, but your summary also shows that it is ENGINEERS, and not academics, who should be generating the input to policy making. It is only engineers who have the discipline to make things work, properly and reliably.

Basileus
29 days ago

For decades I have opined that our society was exposed to the risk inherent in being a technologically dependent culture governed by the technically illiterate. QED?

Tom Welsh
28 days ago

“The Chinese Government Is Dominated by Scientists and Engineers”

https://gineersnow.com/leadership/chinese-government-dominated-scientists-engineers

Vadim Antonov
27 days ago
Reply to  Tom Welsh

They are also communists. Which is another way of saying “psychopathic liars”.

el muchacho
28 days ago

No, scientists can write perfectly good code if they have the incentive to do so. Heck, most of the really important math codebases have been written by scientists. But the problem is, most scientists have the incentive to publish quickly, not to ensure that their methods follow good engineering practice, even when that should be mandated. This has bitten the climatologists in the butt with the so-called “climategate”. Congressional enquiries showed that their integrity was intact and that their methods were sound and followed standard scientific practice. But they lacked transparency, and therefore it was recommended that they should from now on make public all their numerical code and all their data. This has become widespread practice in climatology. Unfortunately, that still isn’t the case in other branches of science. It should be.

Guest
27 days ago

A good point, but should you not add two other categories to the statement? First, civil servants; unlike the politicians, these are employed to use their expertise in advising politicians. They tend to be recruited by other civil servants, rather than by the politicians.
The second group is journalists. I have seen no mention of this kind of criticism aired publicly by journalists. Indeed, this touches on another of my gripes; in the almost never-ending press conferences, current affairs programmes and interviews, the same old questions are asked over and over again, to be answered by the same generalised statements, while the more interesting and detailed matters are omitted, or, in a tiny number of occasions, interrupted or run out of time.

Chris Martin
29 days ago

This kind of thing frequently happens with academic research. I’m a statistician and I hate working with academics for exactly this sort of reason.

skeptik
29 days ago
Reply to  Chris Martin

the global warming models are secret too (mostly) and probably the same kind of mess as this code

ANNRQ
28 days ago
Reply to  skeptik

Perhaps, if enough people come to understand how badly this has been managed, they will start to ask the same questions of the climate scientists and demand to see their models published.

It could be the start of some clearer reasoning on the whole subject, before we spend the trillions that are being demanded to avert or mitigate events that may never happen.

Debster 1
27 days ago
Reply to  ANNRQ

These so-called climate scientists were asked to provide the data, but they came back and said they lost the data when they moved offices.

Andy
28 days ago
Reply to  skeptik

Michael Mann pointedly refused to share his modelling code for climate change when he was sued for libel in a Canadian court. He ended up losing, which will cost him millions. Now why would an academic rather lose millions of dollars than show their workings?

Let’s hope this “workings not required” attitude doesn’t get picked up by schoolkids taking their exams 🙂

Charly
28 days ago
Reply to  Andy

Tried to find something about this on the BBC news site. Found this:

https://www.bbc.com/news/uk-politics-52553229

At the end of the article, there is “analysis” from a BBC health correspondent.

With such pitiful performance from the national broadcaster, I think Ferguson and his team will face no consequences.

el muchacho
28 days ago
Reply to  Andy

LOL what a load of crap, it’s the other way around: it’s Mann who sued.

“In 2011 the Frontier Centre for Public Policy think tank interviewed Tim Ball and published his allegations about Mann and the CRU email controversy. Mann promptly sued for defamation[61] against Ball, the Frontier Centre and its interviewer.[62] In June 2019 the Frontier Centre apologized for publishing, on its website and in letters, “untrue and disparaging accusations which impugned the character of Dr. Mann”. It said that Mann had “graciously accepted our apology and retraction”.[63] This did not settle Mann’s claims against Ball, who remained a defendant.[64] On March 21, 2019, Ball applied to the court to dismiss the action for delay; this request was granted at a hearing on August 22, 2019, and court costs were awarded to Ball. The actual defamation claims were not judged, but instead the case was dismissed due to delay, for which Mann and his legal team were held responsible”

Another
27 days ago
Reply to  el muchacho

Yes, Mann brought the case; on the other hand, it’s also correct that the case was dismissed when he didn’t produce his code, 9 years after the case started. The step that caused the eventual dismissal of the case was that Mann applied for an adjournment, and the defendants agreed on the condition that he supplied his code. Mann didn’t do that by the deadline specified, and the case was then dismissed for delay. Mann did say he would appeal.

Throgmorton
18 days ago
Reply to  Another

The take-home point is that even though Dr. Mann sued for defamation, he incongruously refused to provide evidence that the supposed defamation was actually false, something he could easily have done.

If I were publicly defamed as a liar, I would wish for my name to be cleared immediately, and the falsehood shown definitively to be untrue. Dr. Mann stonewalled for more than nine years, refusing to provide the evidence which supposedly should have cleared his good name, which suggests that he was using the legal process as a weapon, rather than trying to purge a slur on his character.

Throgmorton
18 days ago
Reply to  Andy

It was worse than that. Dr. Mann brought the libel lawsuit against Dr. Timothy Ball, a retiree. Dr. Ball made a truth defence, which is acceptable in Canadian common law, and requested that the plaintiff, Dr. Mann, provide the code and data on which he based his conclusions. Dr. Mann stalled for a decade until, at Dr. Ball’s request to expedite the case due to his age and ill health, the judge threw out the suit.

Tl;dr Dr. Mann sued for libel, but refused to provide evidence that the supposed libel was, in fact, false. It appears he was hoping that Dr. Ball would run out of money and fold.

Adrian
28 days ago
Reply to  skeptik

Not really, they aren’t. But they are indeed garbage. For example you may download the code for GISS GCM ModelE from here: https://www.giss.nasa.gov/tools/modelE/

el muchacho
28 days ago
Reply to  skeptik

No. Quite the opposite. This has bitten the climatologists in the butt with the so-called “climategate”. Congressional enquiries showed that their integrity was intact and that their methods were sound and followed standard scientific practice. But they lacked transparency, and therefore it was recommended that they should from now on make public all their numerical code and all their data. This has become widespread practice in climatology.
In fact there is a guide of practice for climatologists:
https://library.wmo.int/doc_num.php?explnum_id=5541

Throgmorton
18 days ago
Reply to  el muchacho

The so-called ‘hockey-team’ were not cleared by the series of inquiries following the release of the ‘climategate’ emails. In fact, the inquiries seemed designed to avoid the serious issues raised by the email dump.

https://www.rossmckitrick.com/uploads/4/8/0/8/4808045/climategate.10yearsafter.pdf

Simon Conway-Smith
28 days ago
Reply to  Chris Martin

It raises the questions: (a) what other academic models that have driven public policy are of such bad quality, and (b) do the climate models suffer in the same way, also making them untrustworthy?

AlanReynolds
28 days ago

Similar skeptical attention should be paid to the credibility automatically granted to economic model projections – even for decades ahead. Economic estimates are routinely treated as facts by the biggest U.S. newspaper and TV networks, particularly if the estimates are (1) from the Federal Reserve or Congressional Budget Office, and (2) useful as a lobbying tool to some politically-influential interest group.

whatever
27 days ago
Reply to  Chris Martin

Academics are paid peanuts in the UK. It’s not the US with their 6-figure salaries. You need to teach 8+ hours, do your administrivia, and perhaps you’ll squeeze a couple of hours in for research at the end (or beginning) of a very long day. Nothing like Google, with its 500K salaries and its code reviews. Sure, non-determinism sucks, but if the orders of magnitude of the results fit expectations from other models, it’s good enough to compete with other papers in the field. Want to change that? Fund intelligent people in academia the way you fund lawyers and bankers. Oh, and managers in private industry will change results if it suits them, so “privatise it” is bollocks.

Throgmorton
18 days ago
Reply to  whatever

The problem does not lie with non-determinism in the model, but with wild divergence of output.

Jeremy Crawford
29 days ago

Just wonderful and sadly utterly devastating. As an IT bod myself and an early-days skeptic, this was such a pleasure to read. Well done.

Mike Haseler
29 days ago

Thanks for doing the analysis. Totally agree that leaving this kind of job to amateur academics is completely nonsensical. I like your suggestion of using the insurance industry, and if I were PM I would take that up immediately.

Matthew Dixon
27 days ago
Reply to  Mike Haseler

Scientists provide the science, insurers provide insurance. I would never go to an academic for insurance. There is an obvious conflict of interest in relying on an insurance company. It has a fiduciary responsibility to shareholders, and policy making should be entirely separate from the commercial interests of providing health insurance. The purpose of academia, besides providing education, is to pursue R&D in a non-commercial environment where all IP and research products (i.e. papers and code) are disclosed to the public. Unfortunately, the insurance industry does not work to the same open standard. The industry is plagued by grotesque profiteering and opaque modeling practices – there are few universal standards for modeling. Try getting an insurance company to fully disclose details of its mortality models and provide beautifully curated source code for everyone to reproduce the decisions made by insurance companies when reviewing claims. You will not find one insurance company’s code in the public domain that is representative of production. My experience has been that the insurance industry is on the whole exactly the opposite of what you are proposing – no transparency, and clearly designed to profit on the misfortunes of others. Granted, academia has its flaws and has fallen victim to the jaws of capitalism, but it operates first and foremost in the interests of widening the public body of knowledge. You earn a voice by publishing scientific papers in peer-reviewed journals, and in some domains the results have to be scientifically reproducible and are quickly discredited if they aren’t. You also can’t separate academics from industry practitioners, as many move back and forth between industry and academia, and you’ll find that in the insurance industry too. Ironically, much of the modelling and math used in the insurance industry is developed by “amateurish academics”.

Robert Borland
Robert Borland
27 days ago
Reply to  Matthew Dixon

I am not a big fan of the insurance business, but to be objective:
-Actuarial models in the insurance industry are used to determine insurance pricing, not to settle claims. Claims are based on evidence.
-The insurance business is not designed to profit on the misfortunes of others; a perfect insurance business model outcome would be that there were NO misfortunes. One must also remember that the overwhelming desired outcome of purchasers of insurance is that it not be required to make claims.
-Academic science has not fallen victim to capitalism, it has fallen victim to bureaucracy and conformity; if you do not espouse the expected and required outcomes you are labelled a pariah, demonised and excluded. Evidence contradicting official policy is suppressed, falsified, or rationalised away.
But see Thomas Kuhn’s ‘The Structure of Scientific Revolutions’, which touched on the herd mentality of structured organizations and eventual paradigm shifts. In the example of these pandemic modelling disasters, the paradigm shift would be to exclude modelling, and the manias it can produce, as an influence on government policy.
-And finally, in science there is usually no accountability, liability, or consequences except temporary ones. In this most recent marriage of political power and ‘modelling’ catastrophe, the solution has been to just come up with yet another model and to rationalise whatever policy was implemented as having been necessary; politicians will rarely if ever admit error of a policy course no matter what the cost, whether lives or money.

Andy Riley
Andy Riley
29 days ago

Look at SetupModel.cpp from line 2060 – pages of nested conditionals and loops with nary a comment. Nightmare!

James
James
28 days ago
Reply to  Andy Riley

The best part is there’s all this commented-out code. Was it commented out by accident? Was there a reason for it being there to begin with? Who knows, it’s a mystery.

Alicat2441
Alicat2441
29 days ago

Haven’t time to read the article and stopped at the portion where the data can’t be replicated. That right there is a huuuuuuge red flag and makes the “models” useless. I’ll come back tonight to finish reading. I have to ask: is this the same with the University of Washington IHME models? Why do I have a sneaking suspicion that it is.

Laurence_R
Laurence_R
29 days ago
Reply to  Alicat2441

The IHME ‘model’ is much worse – it’s just a simple exercise in curve fitting, with little or no actual modelling happening at all. I have collected screenshots of its predictions (for the US, UK, Italy, Spain, Sweden) every few days over the last few weeks, so I could track them against reality, and it is completely useless. But, according to what I’ve read, the US government trusts it!

Until a few days ago, its curves didn’t even look plausible – for countries on a downward trend (e.g. Italy and Spain), they showed the numbers falling off a cliff and going down to almost zero within days, and for countries still on an upward trend (e.g. the UK and Sweden) they were very pessimistic. However, the figures for the US were strangely optimistic – maybe that’s why the White House liked them.
They seem to have changed their model in the last few days – the curves look more plausible now. However, plausible looking curves mean nothing – any one of us could take the existing data (up to today) and ‘extrapolate’ a curve into the future. So plausibility means nothing – it’s just making stuff up based on pseudo-science. In the UK, we’re not supposed to dissent, because that implies that we don’t want to ‘save lives’ or ‘protect the NHS’, so the pessimistic model wins. In the US, it’s different, depending on people’s politics, so I’m not going to try to analyse that.

So why do governments leap at these pseudo-models with their useless (but plausible-looking) predictions? It’s because they hate not knowing what’s going to happen, so they are willing to believe anyone with academic credentials who claims to have a crystal ball. And, if there are competing crystal balls from different academics, the government will simply pick the one that matches its philosophy best, and claim that it is ‘following the science’.

Patrick McCormack
Patrick McCormack
28 days ago
Reply to  Laurence_R

Ditto. The IHME predictions are completely silly.

Simon Conway-Smith
Simon Conway-Smith
28 days ago
Reply to  Laurence_R

They leap at them for fear of the MSM accusing them of not doing anything.

I had hoped Donald Trump would be a stronger leader than that, and would have insisted on any model being independently and repeatedly verified before making any decision.

The other factor that seems entirely missing from the models is the ability of existing medicines, even off-label ones, to treat the virus, and there have been many trials of hydroxychloroquine with zinc sulphate (& some also with azithromycin) that have demonstrated great success. It constantly dismays me that this is ignored, and here in the UK, patients are just given paracetamol, as if they have a headache!!

AlanReynolds
28 days ago
Reply to  Laurence_R

I offer a critical review of past and present IHME death projections here: https://www.cato.org/blog/six-models-project-drop-covid-19-deaths-states-open

Desmond
Desmond
28 days ago
Reply to  Laurence_R

Could these popularity contest winners perhaps just be idiots? Occam’s razor applies.

silent one
silent one
27 days ago
Reply to  Laurence_R

“It’s because they hate not knowing what’s going to happen, so they are willing to believe anyone with academic credentials who claims to have a crystal ball.”
Problem with this one is that Neil Ferguson and Imperial College have been consistently wrong.

G H
G H
26 days ago
Reply to  Alicat2441

I’m a guy working in the biz for 40+ years. Just a grunt, but paid pretty well for being a grunt. The “can’t be replicated” is insane.

The only time “can’t be replicated” is acceptable is when real time is involved. If you can’t say “Ready, set, go” with the same set of data and assumptions plugged in, you have some serious issues going on.

“But we have to multi-thread… on multiple CPU cores or we won’t get results fast enough.” OK, then you got bogus results.

Robin66
Robin66
29 days ago

This is scary stuff. I’ve been a professional developer and researcher in the finance sector for 12 years. My background is a Physics PhD. I have seen this sort of single-file code structure a lot and it is a minefield for bugs. This can be mitigated to some extent by regression tests, but it’s only as good as the number of test scenarios that have been written. Randomness cannot just be dismissed like this. It is difficult to nail down non-determinism, but it can be done, and it requires the developer to adopt some standard practices to lock down the computation path. It sounds like the team have lost control of their codebase and have their heads in the sand. I wouldn’t invest money in a fund that was so shoddily run. The fact that the future of the country depends on such code is a scandal.
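
For readers wondering what “locking down the computation path” looks like in practice, here is a minimal sketch (not code from the Imperial repository; the toy run_model function is invented purely for illustration) of a regression test that pins the seed and requires identical results:

```cpp
// Minimal sketch: a regression test that pins the RNG seed and checks that a
// repeated run gives the identical answer. The "model" here is a toy stand-in.
#include <cassert>
#include <cstdint>
#include <random>

// Stand-in for a single simulation run: consumes the seed, returns a result.
std::uint64_t run_model(std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::uint64_t acc = 0;
    for (int i = 0; i < 1000; ++i) acc += rng() % 100;  // toy computation
    return acc;
}

int main() {
    const std::uint64_t seed = 42;
    const std::uint64_t first = run_model(seed);
    // Deterministic code must reproduce the same result for the same seed;
    // in a real regression suite this would be compared against a stored reference.
    assert(run_model(seed) == first);
    return 0;
}
```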

Basileus
Basileus
29 days ago
Reply to  Robin66

‘Software volatility’ is the expression, Robin, and it is always bad.

dr_t
dr_t
29 days ago

I have not looked at Neil Ferguson’s model and I’m not interested in doing so. Ferguson has not influenced my thinking in any way and I have reached my own conclusions, on my own. I made my own calculation at the end of January, estimating the likely mortality rate of this virus. I’m not going to tell you the number, but suffice to say that I decided to fly to a different country, stock up on food, and lock myself up so I have no contact with anybody, right at the beginning of February, when nobody else was stocking up yet, nobody else was locking themselves up, and people thought it was all a bit strange. When I flew to my isolation location, I wore a mask, and everyone thought it was a bit strange. Draw your own conclusions.

I’ve read this review.

Firstly, I’ll stress this again, I’m not going to defend Ferguson’s model. I have not seen it. I don’t know what it’s like. I don’t know if it’s any good.

I don’t share Ferguson’s politics, even less so those of his girlfriend.

His estimate of the number that would likely die if we took no public health measures IMO is not an over-estimate. There are EU countries which have conducted tests of large random, unbiased samples of their population to estimate what percentage of their population has had the virus. The number – in the case of those countries – comes out at 2%-3%. If the same is true of the UK, then 30,000 deaths would translate to 1 million deaths if the virus infected everybody (30,000 ÷ 3% ≈ 1,000,000). Of course, we don’t know if the same is true of the UK.

But now I am going to criticize this criticism of Ferguson’s model, because it deserves criticism.

I’ve been writing software for 41 years. Including modeling and simulation software. I wrote my first stochastic Monte Carlo simulator 37 years ago. I have written millions of lines of code in dozens of different programming languages. I have designed many mathematical models, including stochastic ones.

Ferguson’s code is 30 years old. This review criticizes it as though it was written today, but many of these criticisms are simply not valid when applied to code that’s 30 years old. It was normal to write code that way 30 years ago. Monolithic code was much more common, especially for programs that were not meant to produce reusable components. Disk space, RAM and CPU speeds were not amenable to code being structured to the same extent it is today. Yes, structured programming was known, yes, software libraries were used, but programs like simulation software generally consisted of at most a handful of different source files.

30 years ago, there was no multi-threading, so it was reasonable to write programs on the assumption that they were going to run on a single-threaded CPU. With few exceptions, like people working on Transputers, nobody had access to a multi-threaded computer. I can’t say what is making his code not thread-safe, but not being thread-safe does not necessarily imply bad coding style, or bad code. There are many functions even in the standard C library which are not thread-safe, and some that come in two flavours – thread-safe and not thread-safe. The thread-safe version normally has more overhead and is less efficient. Today, this may make no difference, but 30 years ago, that mattered. A lot. Writing code which was not thread-safe, if you were optimizing for speed, may have made perfect sense.
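
To illustrate the “two flavours” point: the C library’s rand() keeps hidden global state, while POSIX adds rand_r(), which takes the state from the caller. The toy functions below show the same pattern; they are purely illustrative and not taken from any real codebase:

```cpp
// Sketch of the two-flavours pattern: hidden static state (not thread-safe)
// versus a reentrant version where the caller owns the state.
#include <cstdio>

// Not thread-safe: hidden static state shared by every caller.
int next_id_global() {
    static int state = 0;
    return ++state;          // a data race if two threads call this at once
}

// Reentrant flavour: the caller owns the state, so each thread can keep its own.
int next_id_r(int* state) {
    return ++(*state);
}

int main() {
    int my_state = 0;
    std::printf("%d %d\n", next_id_global(), next_id_r(&my_state));
    return 0;
}
```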

While not documenting your programs was not great practice even back then, it was also very common, especially for programs which were initially designed for a very specific application, and were not meant to be reused in other projects or libraries. There is nothing particularly unusual about this.

It’s perfectly normal not to want to disclose 30-year-old code because, as has been proven by this very review, people will look at it and criticize it as if it were modern code.

So Ferguson evidently rewrote his program to be more consistent with modern coding standards before releasing it. And probably introduced a couple of bugs in the process. Given the fact that the original code was undocumented, old, and that he was under time pressure to produce it in a hurry, it would have been strange if this didn’t introduce some bugs. This does not, per se, invalidate the model. Your review does not give any reason to think these bugs existed in the original code or that they were material.

The review criticizes the code because the model used is stochastic. Which means random, the review goes on to explain. Random – surely this must be bad! But stochastic models and Monte Carlo simulation are absolutely standard techniques. They are used by financial institutions, they were used 30 years ago for multi-dimensional numerical integration, they are used everywhere. The very nature of the system being modeled is fundamentally and intrinsically stochastic. Are you saying you have a model which can predict, with certainty, how many dead people there will be 2 weeks from now? No, of course you don’t. This depends on so many variables, most of which are random, and so they have to be modeled as being random. From the way you describe the model (SimCity-like), it sounds like it models individual actors, so ipso facto it has to be stochastic. How else do you model the actions of many independent individual human actors?
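
For readers unfamiliar with the technique, here is a minimal, hypothetical Monte Carlo sketch (estimating pi, nothing to do with epidemiology) showing how a fixed, recorded seed makes a stochastic computation exactly repeatable:

```cpp
// Seeded Monte Carlo estimate of pi: with the seed fixed, every run of this
// program prints exactly the same estimate. Purely illustrative.
#include <cstdio>
#include <random>

int main() {
    std::mt19937 rng(12345);                       // fixed, recorded seed
    std::uniform_real_distribution<double> u(0.0, 1.0);
    const int n = 1'000'000;
    int inside = 0;
    for (int i = 0; i < n; ++i) {
        double x = u(rng), y = u(rng);
        if (x * x + y * y <= 1.0) ++inside;        // point landed inside the quarter circle
    }
    std::printf("pi ~= %.6f\n", 4.0 * inside / n); // identical on every run
    return 0;
}
```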

I don’t know the author or anything about her background. But it doesn’t sound to me like she was writing software or making mathematical models 30 years ago, or she wouldn’t be making many of the statements she is making.

Reviewing Ferguson’s model in depth is certainly something that someone ought to do. But a serious review would understand what the (stochastic) model does, explain what it does, and assess the model on its merits. I have no idea whether the model would survive such a review well or be torn to shreds by it. But this review just scratches the surface, and criticizes Ferguson’s software in very superficial ways, largely completely unwarranted. It does not even present the substance of the model.

MFP
MFP
29 days ago
Reply to  dr_t

I read the author’s discussion of the single-thread/multi-thread issue not so much as a criticism but as a rebuttal to possible counter-arguments. I agree it probably should have been left out (or relegated to a footnote), but the rest of the author’s arguments stand independently of the multi-thread issues.

I disagree with your framing of the author’s other criticisms as amounting to criticism of stochastic models. It does not appear the author has an issue with stochastic models, but rather with models where it is impossible to determine whether the variation in outputs is a product of intended pseudo-randomness or whether the variation is a product of unintended variability in the underlying process.

el muchacho
el muchacho
27 days ago
Reply to  MFP

According to GitHub, the reproducibility bugs mentioned have been corrected by either the Microsoft team or John Carmack, and the software is now fully reproducible. They surely checked what result the software gave before and after the correction, and they must have found it was the same.
The question is, have the bugs led to incorrect simulations? I can’t say, but realistically it’s very unlikely. As scientists, Neil Ferguson and his team are trained to see errors like that, and the fact that they commented on these bugs is evidence enough that they knew they were buggy.

Is it poor software practice? Absolutely.

Should scientists systematically open source their code and data? I think so, and I deplore the fact that it’s still not standard practice (except in climatology).

Are the simulations flawed and is it bad science? You certainly cannot conclude anything even close to that from such a shallow code review.

Paul Penrose
Paul Penrose
29 days ago
Reply to  dr_t

dr_t,
I am also a software engineer, with over 35 years of experience, so I understand what you are saying as far as 30-year-old code goes. However, if the software is not fit for purpose because it is riddled with bugs, then it should not be used for making policy decisions. And frankly I don’t care how old the code is: if it is poorly written and documented, then it should be thrown out and rewritten, otherwise it is useless.

As a side note, I currently work on a code base that is pure C and close to 30 years old. It is properly composed of manageable-sized units and reasonably organized. It also has up-to-date function specifications and decent regression tests. When this was written, these were probably cutting-edge ideas, but they clearly weren’t unknown. Since then we’ve upgraded to using current-tech compilers, source code repositories, and critical peer review of all changes.

So there really is no excuse for using software models that are so deficient. The problem is these academics are ignorant of professional standards in software development and frankly don’t care. I’ve worked with a few over the course of my career and that has been my experience every time.

skeptik
skeptik
29 days ago
Reply to  Paul Penrose

I agree 100%. I wrote C/C++ code for years and this single-file atrocity reminds me of student code.

Neil
Neil
29 days ago
Reply to  Paul Penrose

The fact it wasn’t refactored in 30 years is a sin, plain and simple.

G H
G H
26 days ago
Reply to  Neil

That’s human nature. I work as an S.E. in financial services. No real degree. Been doing it for 40 years, pays well, can probably work into my 70s if I want. Just got a little project to make an Access database (MDB file) via a small program for a vendor that our clients love and trust. What the ??????? Microsoft never cancelled it but hasn’t promoted it in at least 15 years. I also get projects based on COBOL specs.

That tells me that people are kicking the can down the road because “It still runs. It’ll be fine”. And, they hope they are retired when it’s not fine.

rickk
rickk
28 days ago
Reply to  Paul Penrose

Moreover, this was likely the ‘code’ used for his swine flu predictions – which performed magnificently 😉

dodgy geezer
dodgy geezer
28 days ago
Reply to  Paul Penrose

I was coding on a large multi-language and multi-machine project 40 years ago. This was before Jackson Structured Programming, but we were still required to document, to modularise, and to perform regression testing as well as testing for new functionality. These were not new ideas when this model was originally created.

The point of key importance is that code must be useful to the user. This is normally ensured by managers providing feedback from the business and specifying user requirements in better detail as the product develops. And this stage was, of course, missing here.

Instead we had the politicians deferring to the ‘scientists’, who were trying out a predictive model untested against real life. That seems to have worked out about as well as if you had sacked the sales team of a company and let the IT manager run sales simulations on his own according to a theory which had been developed by his mates…

physicist137
physicist137
28 days ago
Reply to  dodgy geezer

> untested against real life.
And _untestable_? There is no mention in the review of how many parameter values need to be fixed to produce a run. With more than 6-10, I cannot imagine a search for best-fit parameters [to past data] resulting in stable values over time.

steve brown
steve brown
28 days ago
Reply to  Paul Penrose

All I know is that my son is the same as Ferguson: physics PhD, BUT he is now a commercial machine learning data scientist. However, he has spent five years out of academia learning the additional software skills required, passing all the AWS certs etc. Ferguson didn’t.

Another
Another
27 days ago
Reply to  Paul Penrose

Yes, I was coding 30 years ago and we wrote modular, commented code using SCCS for version control.

dr_t
dr_t
27 days ago
Reply to  Another

And I know a juggler who can juggle 7 balls while rubbing his belly. He is a juggler, you may be a software developer, and Ferguson is an epidemiological modeler. How good are your epidemiological modelling and ball-juggling skills?

Fred Streeter
Fred Streeter
25 days ago
Reply to  dr_t

Working as an Analyst/Programmer together with a Metallurgist and a Production Engineer, I designed and programmed a Production Scheduling system, derived from their expertise.

This was some 35 years ago. Documentation of the system was provided in the terminology of the experts, with links to the documentation of the code – and vice versa.

So, no, I would not claim to have been able to juggle with their 7 balls, but, equally, they could not juggle with mine.

Robbo
Robbo
29 days ago
Reply to  dr_t

How wrong you will be proved to be. Testing is already indicating that huge numbers of the global population have already caught it. The virus has been in Europe since December at the latest, and as more information comes to light, that date will likely be moved significantly backwards. If the R0 is to be believed, the natural peak would have been hit, with or without lockdown, in March or April. That is what we have seen.
This virus will be proven to be less deadly than a bad strain of influenza, with or without a vaccinated population. Total deaths have only peaked post-lockdown. That is not a coincidence.

Jacqueline
Jacqueline
29 days ago
Reply to  Robbo

@Robbo Why is it not a coincidence? I am not sure what to think about this virus: you say it will be proven to be like a bad strain of influenza, but I work in a hospital and our clinical staff are saying they have never seen anything like it in terms of number of deaths.

Neil
Neil
29 days ago
Reply to  Jacqueline

The empty hospitals full of TikTok stars?

dodgy geezer
dodgy geezer
28 days ago
Reply to  Jacqueline

I would not be surprised at a large number of initial deaths with a new disease when the medical staff have no protocol for dealing with it. In fact, I understand that their treatment was sub-optimal and could have made things worse.

When we have a treatment for it we will see how dangerous it is compared to flu. Which can certainly kill you if not treated properly…

dr_t
dr_t
27 days ago
Reply to  dodgy geezer

https://vk.ovg.ox.ac.uk/vk/influenza-flu

“In the UK it is estimated that an average of 600 people a year die from complications of flu. In some years it is estimated that this can rise to over 10,000 deaths (see for example this UK study from 2013, which estimated over 13,000 deaths resulting from flu in 2008-09).”

This thing has already killed 30,000 in NHS hospitals and probably another 15,000 who died at home and in care homes – 45,000 in total. The numbers are only this low because of the draconian lockdown measures.

This is in the space of the first 2 months, and we are nowhere near the saturation point yet. Those countries in the EU which have conducted randomized antibody testing trials have determined that 2%-3% of their populations have been infected to date.

The Spanish flu killed 220,000 in the UK over a period of 3 years between 1918 and 1920.

We may not know exactly how dangerous this thing is, but we already know that it is nothing like the flu and a heck of a lot more deadly.

silent one
silent one
27 days ago
Reply to  dr_t

How many of those deaths were FROM Covid-19, how are they written on the death certificates, and why are those who die of a disease other than Covid-19 also counted as Covid-19 deaths when they were merely infected with it? As we know there are asymptomatic carriers, so there MUST be deaths where the person had Covid-19 but it was not a factor in the death, yet it was still included on the death certificate. The number of deaths attributed to Covid-19 has been over-inflated. Never mind that the test is for a general coronavirus and not specific to Covid-19.

Richard
Richard
26 days ago
Reply to  dr_t

Dr T, do you have references to the randomized antibody studies that show 2-3% spread? Some of the studies I’ve seen for the EU indicate higher (e.g., the Gangelt study).

Russ Nelson
28 days ago
Reply to  Jacqueline

How many of these clinical staff were working during the 1957 pandemic? Probably …. none. It was worse on an absolute and per-capita basis than what we’re seeing now.

dr_t
dr_t
27 days ago
Reply to  Russ Nelson

The 1957 flu pandemic killed 70,000 in the USA in the space of more than a year. SARS-2 has killed nearly 80,000 in the USA in the first 2 months. I cannot find reliable numbers for the number of deaths in the 1957 pandemic in the UK, but all the partial numbers I can find are a lot lower than the current number of SARS-2 deaths in the UK in the first 2 months (30,000 in the NHS + 15,000 at home and in care homes = 45,000). As with all epidemics, the growth in the absence of mitigation measures is exponential until saturation is reached (we are very far from that point), so most of the deaths occur at the peak, and what you see even a week before the peak is a drop in the ocean. I think you need to check the facts before making such claims.

Galt1138
Galt1138
24 days ago
Reply to  dr_t

The US didn’t have anywhere near 80K deaths in the first two months. Where are you getting these numbers? And what date are you placing the first US COVID-19 death?
The CDC website shows a total of just under 49K as of May 11:
https://www.cdc.gov/nchs/nvss/vsrr/covid19/index.htm

Bumble
Bumble
29 days ago
Reply to  Robbo

Brilliant comment. This model assumes first infections at least two months too late. The unsuppressed peak was supposed to be mid-May (the ‘terrifying’ graph), so what we have seen in April is likely the real peak, and lockdown has had no impact on the virus. Lockdown will have killed far more people. The elderly see no point in living in lockdown. There are anecdotal reports that people in care homes have just stopped eating.

Patrick McCormack
Patrick McCormack
28 days ago
Reply to  Bumble

Nope. Base rate. Look outside your own wee land.
Spain is a better example.
Just model Spain with a simple statistical model and you see the lockdown impact.
It’s easy, you can do it in an afternoon.

Frank
Frank
28 days ago
Reply to  Robbo

> If the R0 is to be believed, the natural peak would have been hit, with or without lockdown, in March or April. That is what we have seen.

That’s what we’ve seen WITH lockdown. We haven’t tried a no-lockdown scenario, so we don’t know in practice when that would have peaked.

> This virus will be proven to be less deadly than a bad strain of influenza

Flu kills around 30,000/year in the US, mostly over a five-month period. Covid-19 has killed 70,000 in about six weeks, despite the lockdown.

SteveB
SteveB
28 days ago
Reply to  Frank

@Frank,

“That’s what we’ve seen WITH lockdown. We haven’t tried a no-lockdown scenario, so we don’t know in practice when that would have peaked”.

Incorrect.

Peak deaths in NHS hospitals in England were 874 on 08/04. A week earlier, on 01/04, there were 607 deaths. Crude Rt = 874/607 = 1.4. On average, a patient dying on 08/04 would have been infected c. 17 days earlier on 22/03. So, by 22/03 (before the full lockdown), Rt was (only) approx 1.4.

Ok, so that doesn’t tell us too much, but if we repeat the calculation and go back a further week to 15/03, Rt was approx 2.3. Another week back to 08/03 and it was approximately 4.0.

Propagating forward a week from 22/03, Rt then fell to 0.8 on 29/03.

So you can see that Rt fell from 4.0 to 1.4 over the two weeks preceding the full lockdown and then from 1.4 to 0.8 over the following week, pretty much following the same trend regardless.

So, using the data we can see that we could have predicted the peak before the lockdown occurred, simply using the trend of Rt.

In my hypothesis, this was a consequence of limited social distancing (but not full lockdown) and the virus beginning to burn itself out naturally, with very large numbers of asymptomatic infections and a degree of prior immunity.
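
SteveB’s back-of-envelope arithmetic can be reproduced mechanically. The sketch below uses only the two figures he quotes (607 deaths on 01/04, 874 on 08/04); it is illustrative, not any official calculation, and extending the vector with earlier weekly figures would give the rest of his sequence:

```cpp
// Crude R_t as the ratio of hospital deaths one week apart, shifted ~17 days
// back to the implied infection date, as described in the comment above.
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    // Daily NHS England hospital deaths, sampled at weekly intervals.
    std::vector<double> weekly_deaths = {607.0, 874.0};   // 01/04, 08/04
    for (std::size_t i = 1; i < weekly_deaths.size(); ++i) {
        double crude_rt = weekly_deaths[i] / weekly_deaths[i - 1];
        std::printf("crude Rt ~= %.1f (infections ~17 days before the later date)\n",
                    crude_rt);
    }
    return 0;   // prints: crude Rt ~= 1.4
}
```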

djaustin
djaustin
28 days ago
Reply to  SteveB

Peak excess all-cause mortality was last week – yes, the last week in April. Don’t just look at reported COVID-19 hospital deaths, and don’t just focus on one model.

SteveB
SteveB
28 days ago
Reply to  djaustin

How do you know that? ONS stats have only just been published for w/e 24th April, and they were down a bit on the week before.

AlanReynolds
28 days ago
Reply to  SteveB

Epidemic curves are flat or down in so many countries with such different mitigation policies that it’s hard to say this policy or that made a big difference, aside from two – banning all international travel by ship or airplane, and stopping mass transit commuting. No U.S. state could or did do either, but island states like New Zealand could and did both. In the U.S., state policies differ from doing everything (except banning travel and transit) to doing almost nothing (9 low-density Republican states, like Utah and the Dakotas). But again, Rt is at or below 1 in almost all U.S. states, meaning the curve is flat or down. Policymakers hope to take credit for something that happened regardless of their harsh or gentle “mitigation” efforts, but it looks like something else – such as more sunshine and humidity, or the virus just weakening for unknown reasons (as SARS-1 did in the U.S. by May). https://rt.live/

Isabel Page
Isabel Page
28 days ago
Reply to  SteveB

I started distancing myself before the end of January when I was abroad on holiday in Tenerife with no known cases. But someone coughing next to me? I reacted. Also kept away from vulnerable people on my return home. Surely others behaved likewise long before lockdown?

dr_t
dr_t
27 days ago
Reply to  Isabel Page

I also did the same at the end of January / early February.

JR
JR
28 days ago
Reply to  Frank

Frank, the peak Flu season is December through February, which is about the same amount of time that we’ve officially been recording deaths in the U.S. from the SARS-CoV-2 pathogen (February through April). Likewise, regarding a lockdown vs. no lockdown scenario comparison, that is also offset by the vaccine vs. no vaccine aspect of these two pathogens.

Please keep in mind that we’ve had numerous flu seasons where anywhere from 60,000 to more than 100,000 Americans have passed away due to it, all despite a solid vaccination program.

Tom Welsh
Tom Welsh
28 days ago
Reply to  Frank

“Flu season deaths top 80,000 last year, CDC says”
By Susan Scutti, CNN
Updated 1645 GMT (0045 HKT) September 27, 2018

https://edition.cnn.com/2018/09/26/health/flu-deaths-2017–2018-cdc-bn/index.html

Epictetus
Epictetus
28 days ago
Reply to  Frank

Yes, but the manner in which they count COVID-19 deaths is flawed. Even with co-morbidities they ascribe the death to COVID, and in cases where they do not test but there were COVID-like symptoms, they ascribe it to COVID, according to the CDC.

Bazza McKenzie
Bazza McKenzie
27 days ago
Reply to  Epictetus

Most governments are busily fudging the numbers up, to ex-post “justify” the extreme and massively damaging actions they imposed on communities and to gain financial benefit (e.g. states and hospitals which get larger payouts for Wuhan virus treatment than for treatment for other diseases).

As with “global warming”, the politicians, bureaucrats and academics are circling the wagons together to protect their interlinked interests.

Peter B
Peter B
28 days ago
Reply to  Frank

You are confusing deaths ASSOCIATED with Covid with EXCESS deaths resulting from flu. If all those who died of pneumonia, cancer, heart disease were routinely tested for flu we’d find that hundreds of thousands die WITH flu every year, though not as a direct result of it.

Russ Nelson
28 days ago
Reply to  Frank

Right, but you’re comparing apples to oranges. Compare Covid-19 to other pandemics, like 1918, 1957, or 1968.

Chebyshev
Chebyshev
28 days ago
Reply to  Frank

Maybe it is not “despite” but “because of”?

If you start the lockdown as late as March, then you ensure that infection and death rates are going to be higher, because of the high viral dosage and the fragile immune systems that come from lockdown.

There are plenty of countries without lockdown to compare against. So it is not an unverifiable hypothesis.

David Blackall
David Blackall
28 days ago
Reply to  Robbo

“The virus has been in Europe since December at the latest” https://www.sciencedirect.com/science/article/pii/S1567134820301829?via%3Dihub

Patrick McCormack
Patrick McCormack
28 days ago
Reply to  Robbo

Oh yes. The model is all rather irrelevant now as we catch up on burying the dead.
In point of fact, a ten-line logistic model does as good a job (a sketch follows below).
Still, academic coding is usually a disaster. I went back to grad school in my late thirties after twenty years of software development. I should have brought a bigger stick.
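
For what it’s worth, a “ten line logistic model” of the kind Patrick mentions might look roughly like this; the parameter values are placeholders to be fitted to observed data, not estimates of anything:

```cpp
// Cumulative deaths assumed to follow D(t) = K / (1 + exp(-r*(t - t0))).
// K, r and t0 are hypothetical placeholders, to be fitted to data.
#include <cmath>
#include <cstdio>

int main() {
    const double K  = 1.0;    // final cumulative total, as a fraction (hypothetical)
    const double r  = 0.2;    // growth rate per day (hypothetical)
    const double t0 = 30.0;   // day of the peak daily rate (hypothetical)
    for (int t = 0; t <= 60; t += 10)
        std::printf("day %2d: %.3f of final total\n",
                    t, K / (1.0 + std::exp(-r * (t - t0))));
    return 0;
}
```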

Tom Welsh
Tom Welsh
28 days ago

“I should have brought a bigger stick”.

A PART, maybe? “Professor Attitude Realignment Tool”

Jon
Jon
25 days ago
Reply to  Robbo

SARS-CoV-1 (SARS) and SARS-CoV-2 (Covid-19) both bind to the same receptor/enzyme (ACE2), causing increased angiotensin II (as it is no longer converted to angiotensin 1-7 due to the reduction in ACE2 receptors/enzyme) and a cascade of pathological effects from that, causing: pneumonia, ARDS, hypoxia, local immune response, cytokine storm, inflammation, blood clots. SARS has a mortality rate of 10%, so why would Covid-19 be on a par with flu and not higher, given it has the same/similar pathology as SARS?

Mark
Mark
29 days ago
Reply to  dr_t

I have not seen the model, nor do I intend to. One thing that rang alarm bells with me was the statement that R0 was an input into its calculation, making it a feedback system. These types of dynamical systems are known to exhibit truly chaotic behaviour. Even when not operating in those chaotic regions, the numerical methods must be chosen carefully so that they themselves do not introduce artificial, method-induced pseudo-non-deterministic behaviour (small differences in the initial conditions, or bugs such as use of uninitialised variables).
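
Mark’s point about sensitivity to initial conditions is easy to demonstrate with the textbook logistic map (unrelated to Ferguson’s code): two starting values differing by one part in a billion end up completely different after a few dozen iterations.

```cpp
// The logistic map x_{n+1} = r*x_n*(1-x_n) at r = 4 is chaotic: a 1e-9
// difference in the starting value grows roughly as 2^n, so the two
// trajectories are no longer close after ~50 steps.
#include <cstdio>

int main() {
    const double r = 4.0;
    double a = 0.300000000, b = 0.300000001;   // almost identical initial conditions
    for (int n = 1; n <= 50; ++n) {
        a = r * a * (1.0 - a);
        b = r * b * (1.0 - b);
    }
    std::printf("after 50 steps: %.6f vs %.6f\n", a, b);  // diverged completely
    return 0;
}
```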

dodgy geezer
dodgy geezer
28 days ago
Reply to  Mark

The modellers’ argument would be that life is chaotic, and introducing a virus to two separate but identical towns could indeed result in very different outcomes.

Which makes me wonder about the validity of modelling chaotic systems at all…

Frank
Frank
28 days ago
Reply to  Mark

I think the practice could make sense. The input R0 might describe how communicable the disease is without countermeasures, while the output R0 is the resulting communicability with the countermeasures being modelled. Nowhere in the article does it actually say the output is used as the following run’s input, and while I agree that’d be illogical and give huge swings in outputs (e.g., perhaps diverging to infinity!), there’s no sign that’s being done. Is one of the top five critiques we can make of this code really that, if it were used in a manner it’s not being used, its output would go crazy?

Jayne
Jayne
29 days ago
Reply to  dr_t

The value of your comprehensive reply was completely invalidated when you declined to provide your own calculations!

Bob
Bob
29 days ago
Reply to  dr_t

Yeah, you’ve written millions of lines of code in dozens of languages, but you didn’t read the review carefully. There’s a difference between randomness you introduce, which you can reproduce with the correct seed, and bugs which give you random results. You can’t just say, ‘oh, it’s stochastic’ – no, it’s bug-ridden. They don’t understand the behaviour of their own model.

Saying it’s fine for it to be crappy because it’s 30 years old is nonsense. You can’t then use your crappy, bug-ridden code to influence policies which have shut the economy down.

Tom Welsh
Tom Welsh
28 days ago
Reply to  Bob

Unix is 50 years old. And IBM mainframe operating systems even older. And CICS…

Software on which the world runs every second of the year.

David collier
David collier
28 days ago
Reply to  Tom Welsh

Oh for heaven’s sake. Have you read the Linux kernel? It only even begins to work because people live and breathe it. It wouldn’t pass structured programming 101. Linux specifically discourages comments.

Mike Whittaker
Mike Whittaker
16 days ago
Reply to  Tom Welsh

And how many systems are running Unix (as opposed to Linux) nowadays?

Nic Lewis
29 days ago
Reply to  dr_t

The review is a code review, not a review of the mathematical model, so I don’t see that one would expect it to present the substance of the model in any detail.

” There are EU countries which have conducted tests of large random, unbiased, samples of their population to estimate what percentage of their population has had the virus. The number – in case of those countries – comes out at 2%-3%. If the same is true of the UK, then 30,000 deaths would translate to 1 million deaths if the virus infected everybody. ”

Antibody tests identify those people who have been sufficiently susceptible to the virus for their innate immune systems and existing T-cells to be unable to defeat the SARS-CoV-2 virus, resulting in their slower-responding adaptive immune systems generating antibodies against the virus. But there are potentially a much larger number of people whose innate immune systems and/or existing T-cells are able to defeat this virus, and have done so in many cases, without generating a detectable quantity of SARS-CoV-2-specific antibodies. That seems the most likely explanation for why the epidemic is waning in Sweden, indicating a reproduction number below 1, contrary to even the 2.5%-probability lower bound of 1.5 for the reproduction number there estimated by the Imperial College team using their model (https://www.imperial.ac.uk/mrc-global-infectious-disease-analysis/covid-19/report-13-europe-npi-impact/).

duncanpt
duncanpt
29 days ago
Reply to  dr_t

It seems to me that most of your comments are excuses for practices that were poor at the time, let alone now. Most of them simply reinforce the view that the code should have been ditched and rewritten top to bottom years ago as being no longer fit for purpose, if it ever was. Opportunities or signals to do so: the move from single- to multi-threaded machines; the publication of new/revised libraries with different flavours; the discovery of the absence of comments (!); the discovery that the same input does not yield the same output (when it’s intended to); etc.

Incidentally, “… [no] reason to think these bugs existed in the original code or that they were material” – which is precisely why we need to see the *actual* code that produced the key reports leading to the trashing of our economy and the lockdown with its consequential deaths.

Personally, I don’t think programmers necessarily criticise old code so long as it does what it claims to do. They may not like or understand the style but they can accept that it works. But here’s the thing: if it doesn’t do what it claims, then the gloves are off and they will come gunning not only for the errors but the evident development mistakes that led to and compounded them.

Dene Bebbington
Dene Bebbington
29 days ago
Reply to  dr_t

Ferguson said his code was written 13 years ago, not 30. Even so, 30 years ago undocumented code was still bad practice even if that’s how some programmers worked. Unless Ferguson can provide evidence that his original code underwent stringent testing then there’s little reason to trust it. But if it was tested properly the question still remains whether the model it implements is a reliable reflection of what would happen in reality.

Andy
Andy
28 days ago

Question what his past predictions for BSE, swine flu and avian flu were, compared to reality.

Hint: his predictions were worse than asking Mystic Meg.

forsyth
forsyth
29 days ago
Reply to  dr_t

The code was written 13 years ago, not 30.

Neil
Neil
29 days ago
Reply to  dr_t

“It was a different time” is no basis for a defence, and your comments are a defence. They either thought their code worked or they didn’t. This shows that they didn’t. That’s all that matters. As for your fear, that’s yours to deal with. Sounds like you’ve got issues to me.

LorenzoValla
LorenzoValla
29 days ago
Reply to  dr_t

Sorry, but this is an absurd criticism. We have all seen old legacy code that needs refactoring and modernization. Anything that is mission critical for a business, in medicine, in aviation, etc., will often have far more testing and scrutiny applied to it than the actual act of writing the code because either huge amounts of money are at stake, or even more importantly, lives are at stake. For this kind of modeling to be taken seriously, a serious effort should have been made to EARN credibility.

There is simply no excuse for Ferguson, his team, and Imperial College for peddling such garbage. I COMPLETELY agree with the author here that “all academic epidemiology be defunded.” There are far superior organizations that can do this work. And even better, those organizations will generate predictions that will be questioned by others because they are not hiding behind the faux credibility of academia.

SteveB
SteveB
29 days ago
Reply to  dr_t

dr_t,

Linux is nearly 30 years old. What’s your point again?

Tom Welsh
Tom Welsh
28 days ago
Reply to  SteveB

And Linux – although legally unencumbered – is essentially a Unix-like operating system. And Unix dates back to 1970.

TerryB
TerryB
28 days ago
Reply to  Tom Welsh

I started work as a trainee programmer for a commercial company in 1971. The first thing I learned was ‘comment everywhere, document everything’.

Mike Whittaker
Mike Whittaker
16 days ago
Reply to  SteveB

And I doubt whether there is much/any original code there now.

Kirk Parker
Kirk Parker
29 days ago
Reply to  dr_t

Given the lead-in remarks here, I wonder if this commenter is just trolling us.

Given the fantastical view of software development 30 years ago, I wonder if he really knows that much about software development? Comment-free code? 15,000-line single-source files? GMAB! Kernighan and Plauger were complaining about standard Pascal’s lack of separate compilation 40 years ago when they rewrote “Software Tools” as “Software Tools in Pascal”, stating that while it might be better for teaching, that lack made it worse than even Fortran for large-scale programming projects.

another_d
another_d
29 days ago
Reply to  dr_t

I have a PhD in biochemistry and currently do academic research in systems biology. I have about 20 years coding experience. This kind of approach to statistical analysis is very familiar. I concur with dr_t.

The stochasticity is a feature not a bug; it is used to empirically estimate uncertainty (i.e. error bars). The model *should* be run many times, and taking the mean/average and variance of the outputs is exactly the correct approach. Highlighting the difference between two individual runs of a stochastic model is only outdone in incorrectness by highlighting a single run.

You’re effectively criticizing the failure to correctly implement a run guarantee that wasn’t important in the first place. Based on your description it sounds like the same instance of an RNG is shared between multiple threads. Your RNG then becomes conditioned on the load of the cores, because any alteration in the order in which the RNG is called from individual threads changes the values used. If it’s accurate that the problem persists in a single-threaded environment, then it could be the result of a single call to any well-intentioned RNG that used a default seed like date/time. The consequence is only that parameter values are conditional on one random sequence rather than another random sequence. It’s irrelevant in practice.

Whether, as commenter MFP puts it, “the variation in output is the product of ‘intended’ pseudo-randomness or the product of unintended variability in the underlying process” is irrelevant. Variability *is* randomness. So intended versus unintended randomness is a meaningless distinction. Non-randomness masquerading as randomness is the only important consideration, and such a mistake results in *less* variation in the results, not more.
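
The failure mode another_d describes, one seeded generator shared between threads, can be sketched as follows (illustrative only, not the model’s code): which thread receives which draw depends on OS scheduling, so per-thread results vary between runs even though the seed is fixed.

```cpp
// Two threads draw from one shared, seeded generator. The mutex prevents a
// data race, but the interleaving of draws still depends on scheduling, so
// the per-thread sums differ from run to run despite the fixed seed.
#include <cstdio>
#include <mutex>
#include <random>
#include <thread>

std::mt19937 shared_rng(12345);   // one fixed seed, one shared generator
std::mutex rng_mutex;

unsigned long draw() {
    std::lock_guard<std::mutex> lock(rng_mutex);
    return shared_rng();
}

void worker(int id, unsigned long* sum) {
    unsigned long s = 0;
    for (int i = 0; i < 100000; ++i) s += draw();
    *sum = s;
    std::printf("thread %d sum %lu\n", id, s);   // varies run to run
}

int main() {
    unsigned long s1 = 0, s2 = 0;
    std::thread t1(worker, 1, &s1), t2(worker, 2, &s2);
    t1.join(); t2.join();
    return 0;
}
```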

Dave
Dave
28 days ago
Reply to  another_d

The other thing to notice is that the difference between the two runs seems to be (almost) entirely a question of “onset”. That is, the curves are shifted in time.

You’d expect a model to be far more influenced by randomness “at the start” (where individual random choices can have a big effect), and so you shouldn’t be reading very much into the onset behaviour anyhow (cf. nearly all the charts published show “deaths since 20 deaths” or similar, because the behaviour since the *first* death has a lot of random variation). If this is what’s actually happening (and it certainly looks like it to me), the people making the critique are being fairly disingenuous not to point it out.

To be clear: I don’t think the non-reproducibility (in a single thread environment) is good, and it’s a definite PITA in an academic environment, but I’m doubting it makes any substantial difference to the results. “80,000 deaths difference” looks to be massively overstating things, when more accurate would be “the peak death rate comes a week later” (with the final number of deaths the same).

And even if 80,000 was accurate, it’s only a 20% difference. There are lots of input variables we’d be ecstatic to know about to 20% accuracy (R0, IFR, etc.), so that level of uncertainty should be expected and allowed for anyhow.

There may be other more serious flaws in the model, and I wouldn’t be surprised if some fundamental assumptions are wrong that make a much bigger difference – we are in uncharted territory. But this particular one doesn’t look to be serious.

LorenzoValla
LorenzoValla
28 days ago
Reply to  another_d

While we can debate the reviewer’s understanding of the stochasticity used in this model, there doesn’t appear to be much debate about the quality of the program/model itself. Put another way, it does not matter if the correct ideas were used in the attempt to create a model if the execution was so poor that the results cannot be trusted.

As an academic, I would expect you to be appalled that the program wasn’t peer reviewed. I can only hope that your omission here does not represent a tacit understanding that such practice is customary. But I suspect such hope is misplaced.

All of the modern standards (modularization, documentation, code review, unit and regression testing, etc.) are standards because they are necessary to create a trustworthy and reliable program. This is standard practice in the private sector because when their programs don’t work, the business fails. Another difference here is that when that business fails, the program either dies with it or is reconstituted in a corrected form by another business. In an academic setting, it’s far more likely that the failure will be blamed on insufficient funding, or that more research is required, or some other excuse that escapes blame being correctly applied.

another_d
another_d
28 days ago
Reply to  LorenzoValla

I’m not going to defend coding practices as such in the academy. Just realize that modularization, documentation, code review, etc. become much more burdensome when the objective of the code is a moving target. This is how it is in a basic research environment where the how is, by definition, not known a priori. How do you plan the programming when the solution is unspecified until the very end? The solution itself is what the research scientist is after; the implementation is just a means to that end. The code is going to carry the legacy of every bad idea and dead end that was pursued during the project.

This will always be a point of friction because once the solution is found it always looks straightforward and obvious in retrospect. A professional coder can always come in after all that toil and failure and turn their nose up at all the individual suboptimal choices scattered throughout. This happens constantly; a researcher develops a novel approach that solves 99% of the unknowns and then a hotshot software engineer comes in and complains that there’s still 1% left and if s/he had written the program (now conveniently armed with all the theory that was the real product of the research) it would run ten times as fast and account for 99.1% of the uncertainty. Come on. It’s a well-known caricature in research environments.

Go ahead, review and rewrite the Ferguson group’s code. Will the program run better? Definitely, probably a lot better. Will it be easier to understand? Yes. Will the outputs be exactly the same? No. Will they differ to such an extent that the downstream political decisions fundamentally change? *Very, very unlikely.*

LorenzoValla
LorenzoValla
28 days ago
Reply to  another_d

Look, if you want your opinions to have merit, then carry the burden. That’s what the rest of us have to do. Moreover, it’s very, very likely that much of the code could be modularized for reuse and that the tweaking could be done systematically in a subset of modules.

What you’re describing is akin to an actual scientist puttering around in a lab and then telling the world they have found the solution, while at the same time telling the world it’s too complicated to explain or document along the way, so just trust the results. Just another reason why this process fails the basic principles of the scientific method.

Leif R
Leif R
28 days ago
Reply to  another_d

Well, current agile development practices do this continuous “problem discovery” all the time, but with sustained code quality at every commit (or at the very least at every pull request).

whatever
whatever
27 days ago
Reply to  LorenzoValla

Well, you clearly are fresh off the boat. Academic source code is uniformly shit. It is very rarely provided, and never “peer reviewed”. Peer review isn’t paid; it’s an extra “voluntary activity” done in one’s free time. You seriously think scientists have so much money that they’ll spend weeks peer reviewing each other’s 15K-line files looking for bugs?

Another
Another
27 days ago
Reply to  whatever

That’s why the open source approach is valuable.

Steven Wittens
Steven Wittens
27 days ago
Reply to  another_d

It is simply a) wrong and b) stupid to pretend that every call to an RNG is an instance of a statistically independent and uncorrelated variable.

It is wrong because it is untrue, and it is stupid because it makes it a nightmare to maintain reproducibility of results in an evolving project.

If you want to see a serious engineering treatment of RNGs and noise in integration problems, look to computer graphics, where the difference between white and blue noise is crucial for instance, and the difference between theory and practice can be huge due to quantization and sampling effects.

Peter W
Peter W
28 days ago
Reply to  dr_t

I was programming point of sale and some financial software about 40 years ago so I agree with your point that it was very different – a few K of RAM and a few years later a massive 10 megabyte hard drive!
However, stochastic still equals random, and we can’t do what we’ve done on the basis of random information.
Good luck with hiding from a coronavirus! It was right across the UK weeks before lockdown and will, in my view, be asymptomatic in between 30 and 60% of the population. My guess is as good as any guesswork produced by predictive, stochastic models!

SteveB
SteveB
28 days ago
Reply to  dr_t

“I made my own calculation at the end of January, estimating the likely mortality rate of this virus. I’m not going to tell you the number”.

So, in other words, you are just like Ferguson: You made a prediction, which might have been reasonable at the time, but you won’t show your workings (and you won’t even tell us the prediction) but now you’re going to stick with it no matter what. That’s terrible science.

The latest meta analysis of Sero studies:

https://docs.google.com/spreadsheets/d/1zC3kW1sMu0sjnT_vP1sh4zL0tF6fIHbA6fcG5RQdqSc/

shows an overall IFR in the region of 0.2%, higher in major population centres. For people under 65 with no underlying health conditions it’s more like 0.02%. Research from the well-respected Drosten in Germany suggests perhaps 1/3 of people have natural immunity anyway:

https://www.medrxiv.org/content/10.1101/2020.04.17.20061440v1

Did you factor this in?

If your estimate is different to this, it’s looking increasingly likely that your estimate was wrong. Have you back-casted your estimate, perhaps using Sweden or Belarus as references?

adp
adp
28 days ago
Reply to  dr_t

Well said, Dr_t!!! Exactly my sentiments – from someone who started FORTRAN modelling 50 years ago and has continued through today.

I would describe this as simplistic and superficial critique – not really adding anything material to the discussion.

For those who don’t agree with a stochastic modelling approach, tell me from where you have “typical lockdown behavioural patterns” for a truly probabilistic model. Nonsense!!!

Go back to the drawing board and come up with some useful and materially significant comments.

David George
David George
28 days ago
Reply to  dr_t

30 years ago I was developing the Mach operating system (the thing that runs Apple computers today). Written in C, I can assure you that it was multi-threaded, modularized, structured and documented. Multi-CPU computers were already commonplace, if not on the desktop. Dining philosophers dates from 1965, and every computer scientist should have come across that at university for the last 50 years. Multithreading has been available to coders since at least the days of Java (1995), if not before (it doesn’t require a CPU with more than one core, just language and/or OS support).

Andy
Andy
28 days ago
Reply to  David George

I went to university in 1988, and one of the 1st-year modules was concurrent programming. We used a language called Concurrent Euclid (a Pascal clone with threading), possibly because threads weren’t well supported or were awkward to use and understand in other languages. Multi-threaded programming in mainstream systems has been around for a long while.

David George
David George
28 days ago
Reply to  Andy

Indeed, and I remember Modula-2, another Pascal derivative, supported threads. Concurrent programming is pretty old hat really.

dr_t
dr_t
27 days ago
Reply to  David George

Yes, and I too wrote multi-threaded software in the 1980s, including a thread scheduler I wrote in 8086 assembler for the IBM PC, and used Mach on my NeXT and Logitech Modula-2 on my 80286 PC clone (though I’m pretty sure that version only implemented co-routines, not real concurrency).

But I think you may be missing the wood for the trees.

Firstly, who in 1990 had a CPU capable of executing multiple threads in hardware simultaneously? Not on PCs. Not even on workstations like NeXT, Sun, Apollo, etc. I lost track of what the minis could do by that stage, but hardly anyone was still using them. More likely, you literally had to have access to a mainframe – an even smaller set of users.

Outside computer science nerds and academia, multi-threaded programming was not in general use.

There is no benefit to an end user like an epidemiological modeler on commodity hardware in writing multi-threaded code if it’s going to run on hardware capable of executing only a single thread at a time. His objective is not to show off his computer science knowledge and skills but to get the results of his simulations. Therefore, it makes absolute sense to write software which works in a single-threaded environment only. I’d even go further and say that someone writing multi-threaded software that’s less efficient than the corresponding single-threaded software, just to show they can, is demonstrating a lack of ability to put his software engineering skills to proper use in practice.

At any rate, it seems that the software is 13, or perhaps 15, years old, not 30, and by 2005-2007, multi-core CPUs were just starting to become available on commodity hardware. But they were not universal and the number of cores was still relatively low, and multi-threaded programming would still not have been in common use among end users like epidemiological modelers, and was indeed far from being in universal use even among computer science nerds.

Given the complexities involved in writing correct multi-threaded code in a shared memory model, the limited benefit this would yield, and probably the amount and nature of the legacy code he had to work with, it doesn’t surprise me that Ferguson’s code remained written for a single threaded execution environment only.

And finally, let me say this again, Ferguson is not a software engineer. He is an academic, probably with very limited, if any, coding support, and his expertise lies somewhere else entirely. If someone attacked you for your lack of expertise in the field of epidemiology, I bet you wouldn’t think much of that.

I’m still waiting for a substantive analysis of his modelling work.

As for his estimates, to quote the Daily Mail (https://www.dailymail.co.uk/news/article-8294439/ROSS-CLARK-Neil-Fergusons-lockdown-predictions-dodgy.html): “Other researchers made the same prediction.”

He was not alone in forecasting 250,000-500,000 deaths if we did not take any steps to mitigate the pandemic. My own estimates were, and still are, higher (no, I’m not going to share them with you, they are for my private use only and I’m not on any government payroll, so there). You can debate whether the measures the government adopted were the correct ones (I certainly think they were not and that they have been incompetent) but I think it’s pretty clear that if they had continued to follow the herd cull policy, or if they adopt it again, the results would be disastrous.

I think some peripheral and superficial criticism of Ferguson’s programming ability is being (mis)used to try to change government policy.

Ferguson belongs to the political left. I do not. He also broke the lockdown rules, as a result of which he had to resign. He is gone.

This, and his possible lack of programming skills, does not per se mean he is a bad epidemiological modeller.

It also does not mean his predictions are wrong.

It certainly does not mean we should adopt the herd ‘immunity’ policy (I strongly prefer ‘herd cull’, because I believe it is more accurate). We should not.

Cristiano
Cristiano
28 days ago
Reply to  dr_t

Your stupid """model""" clearly failed to take into account asymptomatic cases (between 60 and 80%). Maybe you ought to look at Iceland, since they've done testing on 100% of their population, albeit still using low-specificity tests. Say, how come during the same time period in the US 10% of the population contracted influenza but only 0.3% contracted COVID-19? I thought COVID-19's R0 was many times higher than that of the influenza viruses? Pro tip: infections are WIDELY underestimated, meaning the CFR is widely overestimated.

G H
G H
26 days ago
Reply to  Cristiano

Iceland. Where their total population is about the size of Oakland CA. USA. A country isolated most of the time and more so in the winter. A country that is basically 1 racial group. A country without a thriving economy of world travel, and imports and exports on a grand scale.

Sure, why not. If you are going to slice and dice based on massive disparities in population and area, I’ll use US states Oregon and Arkansas. Same # of deaths per million as Iceland. I would imagine both those states get more economic “Action” than Iceland.

Neil F
Neil F
28 days ago
Reply to  dr_t

Completely agree.

It should also be noted that this ‘bug’ has been fixed – https://github.com/mrc-ide/covid-sim/pull/121

Ilma630
Ilma630
28 days ago
Reply to  dr_t

The very fact that the model code is being USED now rightly lays it open to review and criticism, the same as if it were written yesterday, particularly as it has a direct effect on the wellbeing of millions NOW. If it's not fit for purpose, it doesn't matter how old or new it is.

CTD
CTD
28 days ago
Reply to  dr_t

Ferguson wrote this on his Twitter account a few months back: “I wrote the code (thousands of lines of undocumented C) 13+ years ago to model flu pandemics.”

So it's more like 13 years old – not 30 years old.

Mark
Mark
28 days ago
Reply to  dr_t

“30 years ago, there was no multi-threading, so it was reasonable to write programs on the assumption that they were going to run on a single-threaded CPU. ”

Well yes. I am involved in a big upgrade of academic software to multithreading for the same reason. But we are extensively testing and validating this before even considering using it. It sounds like Ferguson's group did this, found differences that indicated the single-threaded code had wrong behaviour, and then ignored them. So the problem is not the lack of multi-threading, it's the lack of good testing and responsible behaviour (not using code you know is dangerously wrong)?
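
To make it concrete, here is a minimal sketch of the kind of check I mean, in toy C++ that has nothing to do with the actual covid-sim internals: the work is split into chunks, each chunk draws from its own seeded stream, and the threaded run has to reproduce the serial run exactly before anyone trusts it.

#include <cassert>
#include <cstdint>
#include <future>
#include <iostream>
#include <random>
#include <vector>

// Each chunk of agents gets its own generator derived from (seed, chunk id),
// so the numbers drawn do not depend on which thread runs which chunk.
double simulate_chunk(std::uint64_t seed, std::uint64_t chunk, int agents) {
    std::mt19937_64 rng(seed ^ (chunk * 0x9E3779B97F4A7C15ULL));
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double infections = 0.0;
    for (int i = 0; i < agents; ++i)
        if (u(rng) < 0.03) infections += 1.0;   // toy infection draw
    return infections;
}

double run(std::uint64_t seed, int chunks, int agents_per_chunk, bool threaded) {
    std::vector<double> partial(chunks);
    if (threaded) {
        std::vector<std::future<double>> jobs;
        for (int c = 0; c < chunks; ++c)
            jobs.push_back(std::async(std::launch::async,
                                      simulate_chunk, seed, c, agents_per_chunk));
        for (int c = 0; c < chunks; ++c)
            partial[c] = jobs[c].get();
    } else {
        for (int c = 0; c < chunks; ++c)
            partial[c] = simulate_chunk(seed, c, agents_per_chunk);
    }
    double total = 0.0;
    for (double p : partial) total += p;        // reduce in a fixed order
    return total;
}

int main() {
    double serial   = run(42, 8, 10000, false);
    double threaded = run(42, 8, 10000, true);
    assert(serial == threaded);                 // threaded run must match the serial one
    std::cout << serial << "\n";
}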

Tom
Tom
28 days ago
Reply to  dr_t

Very interesting. I know nothing about the coding aspects, but have long harboured suspicions about Professor Ferguson and his work. The discrepancies between his projections and what is actually observed (and he has modelled many epidemics) are beyond surreal! He was the shadowy figure, incidentally, advising the Govt. on foot and mouth in 2001 – research which was described as 'seriously flawed' and which decimated the farming industry via a quite disproportionate and unnecessary cull of animals.

I agree with the author that theoretical biologists should not be giving advice to the Govt. on these incredibly important issues at all! Let alone be treated as 'experts' whose advice must be followed unquestioningly. I don't know what the Govt. was thinking of. All this needs to come out in a review later, and, in my view, Ferguson needs to shoulder a large part of the blame if his advice is found to have done criminal damage to our country and our economy. This whole business has been handled very badly, not just by the UK but by everyone, with the honourable exception of Sweden.

Guillermo
Guillermo
28 days ago
Reply to  dr_t

Thanks for your words of wisdom (I truly think they are). Nevertheless, for me (if true) the main point of the critique is: same input -> different output, under ceteris paribus conditions. Best regards and luck in your lockdown.

Russ Nelson
28 days ago
Reply to  dr_t

None of what you say excuses the use to which this farrago of nonsense has been put.
I’m not sure that the code we can see deserves much detailed analysis, since it is NOT what Ferguson ran. It has been munged by theoretically expert programmers and yet it STILL has horrific problems.

I don’t know how you code, but I’ll stand by my software from 40 years ago, because I’m not an idiot and never was. Now … where did I put that Tektronix 4014 tape?

Eric B Rasmusen
Eric B Rasmusen
28 days ago
Reply to  dr_t

In my field, economics, 61-year-olds like me face the problem that the tools are different from what they were 30 years ago, but we old guys can’t use that as an excuse. To get published, you have to use up-to-date statistical techniques. It’s hard to teach an old dog new tricks, so most of us stop publishing.
Your point that 30 years ago programs didn't have to cope with multiple cores sounds legit, but the post above seems to be saying that's not the main problem: the code doesn't even work correctly when run slowly on a single core.
The biggest problem, though, is not making the code public. I’m amazed at how in so many fields it’s considered okay to keep your data and code secret. That’s totally unscholarly, and makes the results uncheckable.

ClarkeP
ClarkeP
27 days ago

I think their code is available and what third parties such as the University of Edinburgh and Sue Denim (!) have found when scrutinizing it is that it’s pretty poor. Following the science sounds sensible but when they are employing such poor models it’s not sound after all.

OKRickety
OKRickety
28 days ago
Reply to  dr_t

“I’ve been writing software for 41 years.”

” I have written millions of lines of code in dozens of different programming languages. ”

Pffft! 41 years is 21.5 million minutes. There is no way you have written that much code, much less in dozens of languages.

You may have some valid points but I’m not going to take the chance.

dr_t
dr_t
27 days ago
Reply to  OKRickety

First of all, you are calling me a liar which I do not appreciate.

Secondly, your analysis is nonsense and any competent software developer will know that.

I’ve never counted how many lines of source code I can write per minute, as it is a pointless metric, though it will be quite a few. But I do know that writing 1000 lines of (tested and debugged) code in a day is a slow day with plenty of time to spare for other things.

In order not to speak purely in the abstract, there was an event (a coding contest) which took place in the 1980s where I was given a problem specification and had 24 hours (during which time I also had to do things like eat and sleep) to design a programming language and implement an interpreter for it. I still have the source code and it is 2547 lines long. Yes, it is spread over 18 modular source files, the longest two of which contain 451 lines (the parser) and 335 lines (the lexer), respectively. No, it is not multi-threaded. It is probably all re-entrant and so thread-safe.

There are 365 days in a year. Programming is not what I now do, but I do still write at least some software every day. Writing 365,000 lines of source code per year is therefore not a lot. I have never counted the total number of lines of source code I have written, as I don't see the point, but I do know it is comfortably into the millions of lines; precisely because it is so comfortably so, I don't need to count them, and it would be very hard to do so anyway.

A few dozen programming languages is not a lot. Anyone who has to use computers regularly and professionally over an extended period of time will pick up that many and more as a matter of necessity.

I also don’t see why this matters to you so much. I merely point this out to set the context. Again you are focusing on the superficial and ignoring the substance, which are my arguments, which you obviously haven’t bothered to read. I do realize that requires an amount of mental ability, which not everyone possesses.

Mike Whittaker
Mike Whittaker
22 days ago
Reply to  dr_t

1000 LOC per day? Tested and debugged? Blimey!!

Steven Wittens
Steven Wittens
27 days ago
Reply to  dr_t

I have a lot of experience with simulations and stochastic models. But I’m an engineer, not an academic.

In the field, if you cannot explain every bit of randomness in your model, you do not understand it. This has nothing to do with “modern” code or not, because 30 years ago, the requirements of responsible engineering were exactly the same as they are today. If a company builds 20 bridges, and 1 of them falls down, we don’t call that a 95% success rate, we call that irresponsible and unacceptable failure.

The multi-threading is particularly important, because of the random seed. It doesn’t matter if you generate the same sequence of random numbers every time, if the order they are used in is non-deterministic. You’re effectively randomly swapping certain pairs of numbers in your sequence, every run.

This also makes it harder to improve and refactor the code. If you have only a single random generator, then each call to random() depends on all of the calls that came before. If you instead use an independent generator for each unique aspect of the model, it no longer matters which order you process them in: the randomization does not cross component boundaries. This can even be applied at the individual decision level, using a carefully constructed tree of seed derivations that maps onto the actual dependencies of the model.

This requires a holistic understanding of your code, and skill in architecting data flows. That's not going to happen in a 15,000-line pile of academic hell-code.
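
A minimal sketch of the idea in C++ (the seed-mixing helper and the component names are mine, purely illustrative, not anything from the covid-sim repository):

#include <cstdint>
#include <iostream>
#include <random>

// Hypothetical helper: mix a master seed with a stable component id so that
// each component gets its own reproducible stream.
std::uint64_t derive_seed(std::uint64_t master, std::uint64_t component_id) {
    std::uint64_t x = master ^ (component_id * 0x9E3779B97F4A7C15ULL);
    x ^= x >> 30; x *= 0xBF58476D1CE4E5B9ULL;   // splitmix64-style finaliser
    x ^= x >> 27; x *= 0x94D049BB133111EBULL;
    x ^= x >> 31;
    return x;
}

int main() {
    const std::uint64_t master_seed = 42;

    // One independent generator per model component (households, schools, ...).
    std::mt19937_64 households(derive_seed(master_seed, 1));
    std::mt19937_64 schools(derive_seed(master_seed, 2));

    // Whatever order the components are updated in, each stream yields the
    // same numbers for a given master seed, so ordering cannot change results.
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::cout << u(households) << " " << u(schools) << "\n";
}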

silent one
silent one
27 days ago
Reply to  dr_t

A good defence of Neil Ferguson being 'shy' about releasing 30-year-old code, but with all due respect (I mean that), he should have indicated as much. Now he should release the original code with a disclaimer that it was written as you describe above.

malf
malf
27 days ago
Reply to  dr_t

“Are you saying you have a model which can predict, with certainty, how many dead people there will be 2 weeks from now?”

No one is criticizing it on that basis. The issue is that, if you generalize the program as a function f(x), where x is the random seed, every other random value in the program should be a function of x, so that f(x) = y on every run. There's no reason it shouldn't run this way. If the author's contention is that, writing each run as f_t(x), we get f_0(x) = a, f_1(x) = b, f_2(x) = c, but that a, b and c converge on some number, that may well be the case; but then there is a very real difficulty in determining what it is that they converge upon.
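
To illustrate the contract, here is a minimal toy sketch in C++ (a stand-in, not the real model): all randomness flows from the seed, so running it twice with the same x must give the same y.

#include <cassert>
#include <cstdint>
#include <iostream>
#include <random>
#include <vector>

// Hypothetical stand-in for the model: every random draw flows from the seed.
std::vector<double> run_model(std::uint64_t seed, int days) {
    std::mt19937_64 rng(seed);
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::vector<double> deaths(days);
    double infected = 100.0;
    for (int d = 0; d < days; ++d) {
        infected *= 1.0 + 0.1 * (u(rng) - 0.5);   // toy stochastic dynamics
        deaths[d] = 0.01 * infected;
    }
    return deaths;
}

int main() {
    auto a = run_model(12345, 80);
    auto b = run_model(12345, 80);
    assert(a == b);                               // same x, same y: f(x) = y on every run
    std::cout << "identical trajectories for a fixed seed\n";
}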

Now, if you were writing something to simulate an empirical situation, where you were able to check your algorithm against the real world, and you found that the average of a, b and c did in fact converge on the real-world observations, sure, that's a sort of validity. It's still needlessly bad form, though, even if the best program is, of course, the one that solves the problem at the lowest cost.

If we apply this test to Ferguson's model, we find that his program does not, in fact, in any way approximate the data we're getting from the real world, so it's not as if the averaged runs are converging on what we're seeing on the ground. It's as though this were some sort of financial analysis program for picking stocks that says stock X is going to go up to Y, when it really only hits a tenth of Y. I wouldn't use that program to pick stocks, would you?

So it is not merely a criticism of the form of the program, nor is it that you cannot (in an absolute sense) have a useful program where f(x) = y does not hold for every run of f. But the proof of the pudding is in the eating, and what we're being fed by Ferguson and these academic modelers (it makes one wonder what the "Climate Change" code looks like!) is not top-shelf, not world-class; it's the sort of program you'd get where there is vendor lock-in to the point of tenure. I mean, can you imagine the quality of software you'd get if a programmer had _tenure_?

Shahin
Shahin
24 days ago
Reply to  dr_t

I am a machine learning engineer (with a PhD) and I confess I haven't read the paper at all, nor am I interested in doing so (and I have no biology or virology knowledge). As much as anyone else, I would love the restrictions to be eased or lifted, but I try to be unbiased in commenting on these things. Since I haven't read the paper, take my own words with a grain of salt!
When it comes to modelling, it is normal to use stochastic models (and in my mind a very good idea for a case like this, compared with a deterministic model), and getting different results is probably because they have forgotten to pass the seed parameter to one of the random functions, rather than an actual bug (an educated guess).
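
A minimal sketch of what that would look like (purely illustrative C++, not taken from the actual code): one generator is driven by the input seed, a second is accidentally left on fresh entropy, and run-to-run reproducibility is gone even though the caller passes the same seed.

#include <cstdint>
#include <iostream>
#include <random>

// Toy model: the main stream uses the input seed, but a second generator is
// (hypothetically) left unseeded, so two runs with the same seed diverge.
double simulate(std::uint64_t seed) {
    std::mt19937_64 seeded(seed);                        // correctly driven by the input seed
    std::mt19937_64 forgotten(std::random_device{}());   // oops: new entropy on every run
    std::uniform_real_distribution<double> u(0.0, 1.0);
    double x = 0.0;
    for (int i = 0; i < 1000; ++i)
        x += u(seeded) * u(forgotten);                   // any use of the unseeded stream taints the output
    return x;
}

int main() {
    // Same seed, yet the printed value will (almost certainly) differ between runs.
    std::cout << simulate(42) << "\n";
}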

Again, when it comes to modelling and working with data, you could spend your whole life writing unit tests, but the number of cases grows so quickly (exponentially) that it is not possible to cover them all, and it is not worth the time. Google has given up writing tests on machine learning models (though I'm not sure this model can be put under the same umbrella).

From a normal developer's perspective, code without tests that might give different results is not trustworthy code (as someone working at a consultancy firm, I deal with normal developers every day, so trust me on this), but through an ML expert's lens this is pretty normal. I just don't think it is fair to attack a modelling work only by the metrics described here.

I am not suggesting the model should be adopted without further investigation (and who am I to judge, with such limited knowledge), but I also think it is unfair to dismiss the model as untrustworthy by the criteria explained here.

These are just my thoughts (and personal).

TOD
TOD
20 days ago
Reply to  dr_t

This is an utterly bizarre take.

The REASON that coding standards have changed is precisely the problems that are inherent in monoliths. You don't get to say, "we shouldn't hold his outdated code up to modern standards, it's not fair to criticise it as if it were written today". You instead have to say, "this moron is using a 30-year-old code base and totally outdated, obsolete, and rightfully abandoned coding practices."

scuzzaman
scuzzaman
29 days ago

“On a personal level I’d actually go further and suggest that all academic epidemiology be defunded. This sort of work is best done by the insurance sector. Insurers employ modellers and data scientists, but also employ managers whose job is to decide whether a model is accurate enough for real world usage and professional software engineers to ensure model software is properly tested, understandable and so on. Academic efforts don’t have these people and the results speak for themselves.”

Perhaps even more significantly, they pay a price when they get it wrong, a check on overreaching idiocy that appears completely lacking in these “advisory” academic roles in government.

See also https://www.youtube.com/watch?v=Dn_XEDPIeU8&t=593s Nassim Nicholas Taleb on having Skin In the Game.

Edward Reeves
Edward Reeves
29 days ago

On Monday I got so angry that I created a change.org petition on this very subject.

https://www.change.org/p/never-again-the-uk-s-response-to-covid-19

KurtGeek
KurtGeek
29 days ago

It sounds like something an undergrad would knock together, but this team is supposed to be the cream of their profession.

If this is the best the best can do then to ‘suggest that all academic epidemiology be defunded’ sounds like a good plan to me. But, sadly, this is shutting the stable door after the horse has bolted.

AndrewF
AndrewF
29 days ago
Reply to  KurtGeek

and exceedingly well funded (by Gates and others). No excuses at all for old or poor code.

None whatsoever.