COVID-19: Updated data implies that UK modelling hugely overestimates the expected death rates from infection
By Nic Lewis
There has been much media coverage about the danger to life posed by the COVID-19 coronavirus pandemic. While it is clearly a serious threat, one should consider whether the best evidence supports the current degree of panic and hence government policy. Much of the concern in the UK resulted from a non-peer-reviewed study published by the COVID-19 Response Team from Imperial College (Ferguson et al 2020). In this article, I examine whether data from the Diamond Princess cruise ship – arguably the most useful data set available – support the fatality rate assumptions underlying the Imperial study. I find that they do not. The likely fatality rates for age groups from 60 upwards, which account for the vast bulk of projected deaths, appear to be much lower than those in the Ferguson et al. study.
Metrics for COVID-19’s fatality rate and their estimation
The fatality rate from infection (IFR), by age group, is a key parameter in determining how serious a threat the COVID-19 pandemic represents. Unfortunately, the IFR is difficult to determine. It is more practical to estimate the fatality rate for cases where the COVID-19 virus can be shown, by a standard test, to be present, whether or not there are any symptoms. This is referred to as the true case fatality rate (tCFR). The tCFR will overestimate the IFR, since a proportion of people who actually have been infected may show no viral presence when tested, either because they have already fought off and cleared an infection without any noticeable symptoms, or perhaps because they have pre-existing immunity. Nevertheless, where testing has been applied to a sample of people without regard to whether they show symptoms, the tCFR may provide a reasonable, albeit somewhat biased high, estimate of the IFR.
However, determining tCFR is not simple either, since in most cases infected people with no or mild symptoms will not be tested for COVID-19. Attempts have nevertheless been made to estimate tCFR by adjusting estimates of the CFR based on symptomatic cases only (sCFR), by adjusting for the non-random nature of testing, and also for the outcome of positive test result cases not being known for some time.
The Imperial studies
The Ferguson et al. study used estimates of the IFR from another paper from the same team, Verity et al. (2020), which had been published a few days earlier on 13 March. Very helpfully, Verity et al., unlike Ferguson et al., published the computer code and data that they used.
The Verity et al. CFR estimates were derived primarily from Chinese data, which reflected non-random testing. The authors obtained age-stratified IFR estimates (in reality, tCFR estimates) by adjusting their CFR estimates using infection prevalence data for expatriates evacuated from Wuhan, all of whom were tested for COVID-19 infection. This approach involves very large uncertainties.
An alternative approach to estimating the tCFR, as a proxy for the IFR, is to use data from a large sample of people, all of whom were tested for the presence of the virus without regard to whether they showed any symptoms, with all who tested positive subsequently being isolated and the case outcome recorded. I use that approach. While the sample of expatriates evacuated from Wuhan is too small for this purpose, occupants of the Diamond Princess cruise ship do provide a suitable such sample. Moreover, the Diamond Princess sample has the advantage that it consists mainly of people from high income countries, and those requiring hospitalisation were treated in such countries.
The Diamond Princess sample may well represent the best available evidence regarding tCFR for older age groups, who are most at risk. Verity et al (2020) did analyse data from the Diamond Princess, but did not use sCFR or tCFR estimates from them for their main CFR and IFR estimates.
The Diamond Princess death toll
When Verity et al. was prepared, the final death toll was not known. The data available only ran to 5 March 2020, at which point 7 passengers had died. The authors therefore used a fitted probability distribution for the delay from testing positive to dying to estimate that those deaths would represent 56% of the eventual death toll. They accordingly estimated the tCFR using a scaled figure of 12.5 deaths.
Here, I adopt the same death rate model and use the same data set, but brought up to date. By 21 March the number of deaths had barely changed, increasing from 7 to 8. Of those 8 deaths, 3 are reported to have been in their 70s and 4 in their 80s. I allocate the remaining person, whose age is unknown, pro rata between those two age groups. As at 21 March the Verity et al. model estimates that 96% of the eventual deaths should have occurred, so we can scale up to 100%, giving an estimated ultimate death toll of 8.34, of which 3.58 is allocated to the 70-79 age group and 4.77 to the 80+ age group.
Accordingly, the Verity et al central estimate for the Diamond Princess death toll, of 12.5 eventual deaths, is 50% too high. This necessarily means that the estimates of tCFR and sCFR they derived from it are too high by the same proportion.
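The scaling and pro rata allocation described above can be sketched in a few lines. This is an illustrative reconstruction, not the Verity et al. code: the function names are made up for the example, and the rounded 56% and 96% figures from the text are used in place of the unrounded model fractions. With the rounded 96% the results come out at roughly 8.33, 3.57 and 4.76, marginally below the 8.34, 3.58 and 4.77 quoted, which presumably reflect the unrounded fraction.

```python
def scale_to_ultimate(observed_deaths, fraction_occurred):
    """Scale deaths observed so far up to an estimated eventual total."""
    if not 0 < fraction_occurred <= 1:
        raise ValueError("fraction_occurred must be in (0, 1]")
    return observed_deaths / fraction_occurred


def allocate_pro_rata(known_by_group, unknown_count):
    """Spread deaths of unknown age across groups in proportion to known deaths."""
    total_known = sum(known_by_group.values())
    return {
        group: known + unknown_count * known / total_known
        for group, known in known_by_group.items()
    }


# Verity et al.: 7 deaths by 5 March, taken to be 56% of the eventual toll.
verity_ultimate = scale_to_ultimate(7, 0.56)

# Update to 21 March: 8 deaths (3 in their 70s, 4 in their 80s, 1 of unknown age),
# taken to be 96% of the eventual toll (rounded figure from the text).
by_age = allocate_pro_rata({"70-79": 3, "80+": 4}, unknown_count=1)
updated_by_age = {g: scale_to_ultimate(d, 0.96) for g, d in by_age.items()}
updated_ultimate = sum(updated_by_age.values())

print("Verity et al. scaled toll:", round(verity_ultimate, 2))
print("Updated toll by age group:", {g: round(d, 2) for g, d in updated_by_age.items()})
print("Updated total:", round(updated_ultimate, 2))
print("Ratio of Verity figure to updated figure:",
      round(verity_ultimate / updated_ultimate, 2))
```

The last line is the basis for the "50% too high" statement: the ratio of the two scaled death tolls is about 1.5.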
Numbers testing positive
The Diamond Princess dataset was published by the Japan National Institute of Infectious Diseases (NIID). I use the second version published on 21 February, which gives detailed data for 619 confirmed cases, updating it for subsequent test results. Verity et al. used the original 19 February version of the NIID data, which gave data for 531 confirmed cases, although they did update it for subsequent test results.
The entire set of passengers and crew, totalling 3711 individuals, was tested for COVID-19. Some 706 (19.0%) ultimately had positive test results, of whom (based on the NIID data for 619 of them) 51% were asymptomatic. The infection rate varied from 10.0% for ages under 30 years to 24.5% for ages 60+ years. The age distribution was only known for cases included in the NIID data. Verity et al. assumed that the age distribution for the overall total of 706 confirmed cases was the same as for the 531 NIID reported cases that they used. I do the same, but using the later NIID data, with 619 reported cases. On that basis, 201.9, 266.9 and 61.6 people in respectively the 60–69, 70–79 and 80+ key age groups had positive test results.
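The age-distribution assumption amounts to multiplying each age group's count in the NIID data by 706/619. A minimal sketch follows. The per-group counts used here (177, 234 and 54) are an assumption: they are back-calculated from the 201.9, 266.9 and 61.6 figures quoted above, not read from the NIID table, and the function name is hypothetical.

```python
def scale_age_counts(reported_by_age, reported_total, confirmed_total):
    """Scale age-group case counts so that they sum pro rata to the confirmed total."""
    factor = confirmed_total / reported_total
    return {group: count * factor for group, count in reported_by_age.items()}


# ASSUMPTION: counts back-calculated from the article's scaled figures,
# not taken directly from the NIID field briefing.
reported_by_age = {"60-69": 177, "70-79": 234, "80+": 54}

scaled = scale_age_counts(reported_by_age, reported_total=619, confirmed_total=706)
for group, count in scaled.items():
    print(group, round(count, 1))
```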
Recall that tCFR is the eventual death toll divided by the total numbers testing positive.
My tCFR central estimates from the Diamond Princess 70+ age groups, where all the deaths are taken to have occurred, are 2.54% overall (8.34/328.5), with a breakdown of 1.34% for ages 70-79 (3.58/266.9) and 8.04% for ages 80+ (4.77/61.6). For the 60–69 age group, there are sufficient test-positive occupants to make a crude median estimate of the tCFR, by calculating what it would need to be for there to be a 50% probability that no 60-69 year-old has died, as appears to have been the case. The tCFR thus implied is 0.34%. There were too few Diamond Princess occupants in age groups below 60 with positive test results to provide any useful information about the COVID-19 tCFR for those groups.
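The crude median estimate for the 60–69 group can be made concrete. If each of n test-positive people dies independently with probability p, the chance of no deaths is (1 − p)^n; setting that to 50% and solving gives p = 1 − 0.5^(1/n). The sketch below assumes that independent-outcomes model and uses illustrative function names; the inputs are the figures quoted in the text.

```python
def tcfr(ultimate_deaths, positives):
    """True case fatality rate: eventual deaths divided by positive tests."""
    return ultimate_deaths / positives


def median_rate_given_no_deaths(positives, prob_no_deaths=0.5):
    """Fatality rate p that solves (1 - p) ** positives == prob_no_deaths."""
    return 1.0 - prob_no_deaths ** (1.0 / positives)


rate_70_plus = tcfr(8.34, 328.5)                  # ages 70+ combined
rate_70_79 = tcfr(3.58, 266.9)                    # ages 70-79
rate_60_69 = median_rate_given_no_deaths(201.9)   # ages 60-69, no observed deaths

for label, rate in [("70+", rate_70_plus), ("70-79", rate_70_79), ("60-69", rate_60_69)]:
    print(f"{label}: {100 * rate:.2f}%")
```

Applying `tcfr(4.77, 61.6)` in the same way gives about 7.7%, which differs from the 8.04% quoted for ages 80+; the source of that difference is not clear from the text.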
Adjustments for false negatives and underlying death rates
It appears that in about 30% of symptomatic cases the standard RT-PCR test for COVID-19 infection gives a negative result when the patient is in fact infected. There is no evidence of any COVID-19 related deaths among Diamond Princess occupants who tested negative, which would be consistent with a lower viral load being associated with a lower probability both of a positive RT-PCR test result and of eventual death. The false-negative rate may be slightly lower for Diamond Princess occupants, a few of whom may have been retested or tested by a more reliable method where they had typical COVID-19 symptoms but an initially negative RT-PCR test result. However, it seems likely that the proportion of asymptomatic infected cases that are not detected by an RT-PCR test will be somewhat higher than the 30% estimated for symptomatic cases. I accordingly adjust all the tCFR ratios estimated from Diamond Princess case data down by 30% on account of false-negative test results.
The observed deaths of Diamond Princess occupants occurred over a 45 day period, during which a non-negligible percentage of old people would be expected to die from non-COVID-19 related causes. I have accordingly deducted from the adjusted tCFR ratios an allowance for non-COVID-19 deaths for 70+ age groups, based on UK age-stratified 2018 death rates, to arrive at estimates of deaths caused by COVID-19. There are arguments for the non-COVID death rates being either higher or lower than those for the UK population of the same age, but using those death statistics appears to be a reasonable first approximation.
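One reading of these two adjustments, taken together with the note to Table 1, is: reduce the raw tCFR by 30%, then subtract the expected non-COVID-19 death rate over the 45 day window, itself divided by the 0.96 scaling factor. The sketch below implements that reading. The function name is illustrative, and the 2.5% annual death rate in the example is a placeholder chosen only to show the arithmetic, not the ONS figure used in the article.

```python
def adjust_tcfr(raw_tcfr, annual_non_covid_death_rate,
                false_negative_reduction=0.30,
                period_days=45, fraction_occurred=0.96):
    """Apply the false-negative reduction, then deduct expected background deaths.

    All rates are fractions (0.0134 means 1.34%).
    """
    after_false_negatives = raw_tcfr * (1.0 - false_negative_reduction)
    background = (annual_non_covid_death_rate
                  * (period_days / 365.0)
                  / fraction_occurred)
    return after_false_negatives - background


# PLACEHOLDER annual death rate (2.5%), for illustration only.
example = adjust_tcfr(raw_tcfr=0.0134, annual_non_covid_death_rate=0.025)
print(f"Adjusted rate with placeholder inputs: {100 * example:.2f}%")
```

Setting the background rate to zero isolates the false-negative adjustment, which simply multiplies the raw tCFR by 0.7.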
Comparing the Ferguson et al. UK and Diamond Princess based fatality rate estimates
The results of the foregoing analysis are set out in Table 1. The key finding is that the estimated tCFRs for Diamond Princess 60+ age groups, which must if anything overestimate their IFRs, are far lower than the corresponding IFR estimates used by Ferguson et al. in the study adopted by the UK government. Those age groups account for the vast bulk of projected deaths. For people aged 60–69, the Ferguson et al IFR estimate is 19.4 times as high as the best tCFR estimate based on Diamond Princess data, for the 70–79 age group it is 8.3 times as high, and for the 80+ age group it is 2.1 times as high.
Table 1: True Case Fatality Rates estimated from the latest Diamond Princess data compared with Infection Fatality Rates per Ferguson et al. 2020, used by the UK government
Note: An all-causes tCFR of 0.34% (and hence 0.69 notional ultimate fatalities) is assumed for age-group 60-69 despite there being no actual fatalities in that age group (see text). Expected non-COVID-19 fatalities are based on UK 2018 death rates by age group applied to the DP positive test cases, scaled by the 45 day period over which COVID-19 deaths were recorded and divided by the same 0.96 factor used to scale up the 8 actual deaths. DP= Diamond Princess.
Based on the Diamond Princess data, the COVID-19 fatality rates by age-group assumed by Ferguson et al. appear to be far too pessimistic for all 60+ age groups, where the vast bulk of fatalities are projected to occur. It is quite possible that they are also too pessimistic for younger age groups, but unfortunately the Diamond Princess data are uninformative about death rates below age 60.
It is notable that for all the 60+ age groups the projected excess death rates caused by COVID-19, based on Diamond Princess case data, are substantially lower than the underlying non-COVID-19 annual death rate. Even assuming, very pessimistically, that there is no overlap between the two, and that the same proportion of each age group becomes infected, projected COVID-19 related deaths from an epidemic in which the vast bulk of the population became infected with COVID-19 are only 9% of expected annual non-COVID deaths for the 60–69 age group. For the 70–79 age group, the proportion is 20%, and for the 80+ age group it is 26%. Relative to the expected non-COVID deaths over two years, the approximate period during which very onerous restrictions are projected to be in force in the UK, these COVID-19 excess death proportions would each be reduced by almost half. In practice, a high proportion of people killed by COVID-19 will have serious underlying health conditions, and would be much more likely than average to die from non-COVID-19 causes.
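The comparison in the preceding paragraph reduces to a simple ratio: the share of the age group infected, times the fatality rate among those infected, divided by the non-COVID death rate over the chosen period. The sketch below is a simplified illustration under those assumptions. The 81% infection share is the Ferguson et al. figure cited in the notes; the fatality and annual death rates in the example are placeholders, not values from Table 1, and the function name is hypothetical.

```python
def covid_deaths_vs_background(fatality_rate, annual_non_covid_death_rate,
                               attack_rate=0.81, years=1.0):
    """COVID-19 deaths as a fraction of non-COVID deaths over `years`.

    Assumes no overlap between the two sets of deaths and a constant
    background death rate across the period.
    """
    return attack_rate * fatality_rate / (annual_non_covid_death_rate * years)


# PLACEHOLDER inputs for illustration only.
one_year = covid_deaths_vs_background(0.006, 0.025)
two_years = covid_deaths_vs_background(0.006, 0.025, years=2.0)
print(f"Relative to one year of non-COVID deaths:  {100 * one_year:.1f}%")
print(f"Relative to two years of non-COVID deaths: {100 * two_years:.1f}%")
```

In this simplified form the two-year proportion is exactly half the one-year proportion, whereas the text says "almost half"; the sketch does not attempt to model whatever accounts for that difference.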
Nicholas Lewis 25 March 2020
Originally posted here
Update 27 March 2020
Since writing this article it appears that two more people from the Diamond Princess have died, bringing the death toll to 10. Although, as one or two commenters have pointed out, the Worldometer website has for the last day or two been showing 10 deaths, it cites no source for that, the World Health Organization Situation Reports show no change in the number of deaths, and my original searching of the Japanese language Ministry of Health, Labour and Welfare reports showed deaths remaining at 8. However, I have now, using Japanese language keywords, found their reports for 24 to 26 March, which show an increase from 8 to 10 deaths. I have therefore updated my analysis to reflect that increase, as per the revised Table 1 below. No information as to age at death was given, so I have allocated the 2 further deaths pro rata to the 70-79 and 80+ age groups. I have not reduced the scaling up for possible future deaths to reflect the later date, as it appears that the probability distribution that Verity et al. used may not allow sufficiently for deaths occurring more than a month after testing positive for COVID-19. I have also updated the total number of positive test results from 706 to 712, in line with the latest Japanese report.
Although the tCFRs that I estimate for older age groups are now slightly higher, they are still far below the Ferguson et al. estimates. The projected excess COVID-19 deaths relative to annual UK death rates given in the final paragraph become 10%, 27% and 34% for the 60-69, 70-79 and 80+ age groups respectively. Accordingly, my original conclusions are qualitatively unchanged.
Table 1 (revised): True Case Fatality Rates estimated from the latest Diamond Princess data compared with Infection Fatality Rates per Ferguson et al. 2020, used by the UK government
 Neil M Ferguson, Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand, Imperial College COVID-19 Response Team Report 9, 16 March 2020, https://spiral.imperial.ac.uk:8443/handle/10044/1/77482
 Ferguson et al. adjusted the Verity et al. IFR estimates “to account for a non-uniform attack rate”, without giving further information about the assumed attack rates. They appear to have increased the Verity et al. IFR estimates for all 60+ age groups by approximately 19%, while making little or no changes to those for younger age groups. It is unclear whether doing so was justified.
 Verity R, Okell LC, Dorigatti I, et al. Estimates of the severity of COVID-19 disease. medRxiv 13 March 2020; https://www.medrxiv.org/content/10.1101/2020.03.09.20033357v1.
 Their sample of evacuated expatriates is 689 people, of whom only 6 tested positive for COVID-19, none of whom died.
Russell et al., “Estimating the infection and case fatality ratio for COVID-19 using age-adjusted data from the outbreak on the Diamond Princess cruise ship”, medRxiv preprint dated 9 March 2020, did use exclusively Diamond Princess data. However, the early data that they used was incomplete and their IFR (actually tCFR) estimates appear to be based on assuming 26 eventual deaths, and hence are far too high.
 Verity et al. simply noted that figures derived from the Diamond Princess data set were “consistent” with their main estimates, meaning that they fell within their very wide main estimate uncertainty ranges.
 Field Briefing: Diamond Princess COVID-19 Cases, 20 Feb Update. National Institute of Infectious Diseases, Japan https://www.niid.go.jp/niid/en/2019-ncov-e/9417-covid-dp-fe-02.html
 Using the same sources as given in Verity et al., for dates from 20 February on. Doing so yields a total of 704 positive test results, which I adjust to equal the cumulative total of 706 results stated in the final, 2 March 2020, update.
 A simple estimate based on the binomial distribution suggests that the 97.5% upper uncertainty bound is approximately double this figure.
 https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/deathregistrationssummarytablesenglandandwalesdeathsbysingleyearofagetables and https://www.ons.gov.uk/file?uri=%2fpeoplepopulationandcommunity%2fpopulationandmigration%2fpopulationestimates%2fdatasets%2fpopulationestimatesforukenglandandwalesscotlandandnorthernireland%2fmid20182019laboundaries/ukmidyearestimates20182018ladcodes.xls
 Moreover, I am unable to reproduce the Ferguson et al. estimate of 510,000 deaths in the ‘Do nothing’ case, based on their estimate of 81% of the population being infected and using their IFRs. Note that the Ferguson et al. IFR estimates assume that, as was the case for the infected Diamond Princess occupants, health systems have not been overwhelmed by COVID-19 cases.
 Based on the Ferguson et al. assumption that 81% of the population eventually becomes infected.