In this post, fourth in a series (previous posts: Part 1, Part 2, Part 3), I’ll finally talk about some substantive conclusions of the following paper:

Kissler, Tedijanto, Goldstein, Grad, and Lipsitch, Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period, Science, vol. 368, pp. 860-868, 22 May 2020 (released online 14 April 2020). The paper is also available here, with supplemental materials here.

In my previous post, I talked about how the authors estimate the reproduction numbers (*R*) over time for the four common cold coronaviruses, and how these estimates could be improved. In this post, I’ll talk about how Kissler et al. use these estimates for *R* to model immunity and cross-immunity for these viruses, and the seasonal effects on their transmission. These modelling results inform the later parts of the paper, in which they consider various scenarios for future transmission of SARS-CoV-2 (the coronavirus responsible for COVID-19), whose characteristics may perhaps resemble those of these other coronaviruses.

The conclusions that Kissler et al. draw from their model do not seem to me to be well supported. The problems start with the artifacts and noise in the proxy data and *R* estimates, which I discussed in Part 2 and Part 3. These issues with the *R* estimates induce Kissler et al. to model smoothed *R* estimates, which results in autocorrelated errors that invalidate their assessments of uncertainty. The noise in *R* estimates also leads them to limit their model to the 33 weeks of “flu season”; consequently, their model cannot possibly provide a full assessment of the degree of seasonal variation in *R*, which is one matter of vital importance. The conclusions Kissler et al. draw from their model regarding immunity and cross-immunity for the betacoronaviruses are also flawed, because they ignore the effects of aggregation over the whole US, and because their model is unrealistic and inconsistent in its treatment of immunity during a season and at the start of a season. A side effect of this unrealistic immunity model is that the partial information on seasonality that their model produces is biased.

After justifying these criticisms of Kissler et al.’s results, I’ll explore what can be learned using better incidence proxies and *R* estimates, and better models of seasonality and immunity.

The code I use (written in R) is available here, with GPLv2 licence.

## The model Kissler et al. fit, and its interpretation

I’ll start by presenting what Kissler et al. did in their paper: how they modelled *R* for the common cold coronaviruses, how they interpreted the results of fitting their model, and how they think it might relate to SARS-CoV-2 (the virus causing COVID-19). The paper focuses on the two betacoronaviruses that cause common colds, OC43 and HKU1, since SARS-CoV-2 is also a betacoronavirus.

Kissler et al. model log(*R*) for each virus, for each of the 33 weeks of each year’s “flu season”, from week 40 of a year (early October) to week 20 of the next year (mid May). Here, *R* is the smoothed estimate of *R*_{u}, as I discussed in Part 3 of this series (I’ll refer to this as just *R*, for brevity, and since I’ll consider other *R* estimates later). They use a simple linear regression model for log(*R*), fit by least squares, with the following covariates:

- Indicators for each season and virus, whose coefficients will give the values of *R* for each virus at the start of each season.
- A set of 10 spline segments, used to model the change in *R* due to seasonal effects (assumed the same for all years, and for both viruses).
- The cumulative incidence of the virus for which log(*R*) is being modelled, within the current season (using the chosen incidence proxy). This cumulative incidence is seen as a measure of the “depletion of susceptibles” due to immunity.
- The cumulative incidence of the other virus, within the current season. This is seen as allowing the model to capture cross-immunity due to infection by the other common cold betacoronavirus.
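To make the two immunity covariates concrete, here is a minimal Python sketch (not the R code used for this post) of how cumulative within-season incidence could be computed for one virus and for the other. The proxy values are made up, the function names are hypothetical, and whether the current week’s incidence is included in the sum is an assumption; here the sum runs over the weeks before the current one.

```python
from itertools import accumulate

def cumulative_incidence(weekly_incidence):
    """Cumulative incidence before each week of one season (starts at 0).

    Assumption: the current week's incidence is NOT included in its own sum.
    """
    totals = list(accumulate(weekly_incidence))
    return [0.0] + [float(x) for x in totals[:-1]]

def immunity_covariates(own, other):
    """Pair the 'same' and 'other' covariates for each week of a season."""
    return list(zip(cumulative_incidence(own), cumulative_incidence(other)))

# Hypothetical proxy values for a four-week stretch (not real data).
hku1 = [1.0, 2.0, 4.0, 3.0]
oc43 = [2.0, 2.0, 5.0, 6.0]

# Covariates for modelling log(R) of HKU1: own cumulative incidence ("same")
# and OC43 cumulative incidence ("other").
for week, (same, oth) in enumerate(immunity_covariates(hku1, oc43), start=1):
    print(week, same, oth)
```

In the regression, the coefficients on these two columns play the roles of the `HKU1_same` and `HKU1_other` terms that appear in the coefficient tables later in the post.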

Here is Fig. 1 from their paper, showing the effects of these components of the model, using the regression coefficients fitted to the data:

These plots show the multiplicative effects on *R*, derived from the additive effects on log(*R*) in the linear regression model. The gold line shows the seasonal effect. The red and blue lines are the “immunity” effects, for HKU1 and OC43 respectively. (Hence, for HKU1, red is the effect of immunity to HKU1, and the blue line is the effect of cross-immunity from infection with OC43, whereas for OC43, this is reversed, with red showing the cross-immunity effect from HKU1.) The black dot at the left is the initial *R* value, relative to the value for HKU1 in 2014-2015. The plots include uncertainty bands for these effects.

I’ve replicated these results (except for the uncertainty bands). Below are my plots analogous to those above:

The two columns on the left above have plots of the multiplicative effects of each model component, corresponding to Kissler et al.’s plots. The two right columns show plots of the additive effects on log(*R*). The plots above also include the absolute level of the initial value of *R* (green dot), the total effect from all components in the model (red line), and the *R* estimates that the model was fitted to (pink dots, some of which may be off the plot). I plot the effect of cumulative seasonal incidence of the same virus as a black line, and the “cross-immunity” effect from cumulative seasonal incidence of the other virus as a grey line.

Kissler et al. truncated their plots at week 30 of flu season (as I discussed in Part 1). My plots above go to the end of the season (week 33).

For closer examination, here is the portion of Kissler et al.’s Fig. 1 for HKU1 in the 2017-2018 season:

### Model fitted by Kissler et al.

Here is the corresponding plot from my replication:

### Replication of Kissler et al.’s model

One can see an exact correspondence, up to week 30. One can also see that truncating the plot at week 30 leaves out a significant rise in Kissler et al.’s estimate of the seasonal effect on *R* in the last three weeks.

Here are the regression coefficients from the model fit in my replication:

| Coefficient | Estimate | Std. Error | HC3.se | NW.se |
|---|---|---|---|---|
| seasonal_spline1 | 0.45088 | 0.10610 | 0.22478 | 0.16287 |
| seasonal_spline2 | 0.24535 | 0.08779 | 0.15137 | 0.23198 |
| seasonal_spline3 | 0.48011 | 0.08539 | 0.16118 | 0.14166 |
| seasonal_spline4 | 0.07054 | 0.08333 | 0.14408 | 0.16536 |
| seasonal_spline5 | 0.35121 | 0.10858 | 0.16415 | 0.15480 |
| seasonal_spline6 | 0.04285 | 0.14018 | 0.17968 | 0.20912 |
| seasonal_spline7 | 0.14935 | 0.15413 | 0.19551 | 0.19489 |
| seasonal_spline8 | 0.09484 | 0.16603 | 0.19778 | 0.25869 |
| seasonal_spline9 | 0.25369 | 0.16668 | 0.20295 | 0.20424 |
| seasonal_spline10 | 0.29357 | 0.15608 | 0.20286 | 0.21917 |
| OC43_same | -0.00187 | 0.00051 | 0.00044 | 0.00061 |
| OC43_other | -0.00076 | 0.00044 | 0.00039 | 0.00056 |
| HKU1_same | -0.00251 | 0.00044 | 0.00044 | 0.00060 |
| HKU1_other | -0.00133 | 0.00051 | 0.00048 | 0.00063 |
| OC43_season_2014 | 0.06825 | 0.05986 | 0.12802 | 0.14079 |
| OC43_season_2015 | -0.01195 | 0.06320 | 0.13376 | 0.13974 |
| OC43_season_2016 | 0.04066 | 0.06053 | 0.12648 | 0.14115 |
| OC43_season_2017 | 0.06951 | 0.06569 | 0.12982 | 0.14785 |
| OC43_season_2018 | 0.06298 | 0.06090 | 0.12586 | 0.13743 |
| HKU1_season_2014 | 0.12779 | 0.05986 | 0.17615 | 0.16235 |
| HKU1_season_2015 | 0.01084 | 0.06320 | 0.14456 | 0.15355 |
| HKU1_season_2016 | 0.15110 | 0.06053 | 0.15803 | 0.15721 |
| HKU1_season_2017 | 0.21279 | 0.06569 | 0.13662 | 0.14642 |
| HKU1_season_2018 | 0.19568 | 0.06090 | 0.15752 | 0.15977 |

Residual standard deviation for OC43: 0.093

Residual ACF: 1 0.66 0.18 -0.05 0 0.06 -0.07 -0.26 -0.36 -0.3 -0.17 -0.08

Residual standard deviation for HKU1: 0.2061

Residual ACF: 1 0.55 -0.08 -0.29 -0.11 0 -0.07 -0.18 -0.13 0.02 0.14 0.08

Here, “same” and “other” refer to the immune effects of the same or the other virus. The coefficients above match those in Table S1 in the supplementary information for the Kissler et al. paper, except they use a different (but equivalent) parameterization in which values for OC43 and seasons other than 2014 are relative to HKU1/2014.

The standard deviation of the residuals of this model for HKU1 is over twice that for OC43. This is a problem, since the model fits data for the two viruses together, assuming a common value for the residual standard deviation. The error bands in Kissler et al.’s plot are found using a procedure that is supposed to adjust for differing residual variances, whether from the residuals having higher variance for HKU1 than OC43, or from their variance varying with time. These adjusted standard errors are shown in the `HC3.se` column. This adjustment has a substantial effect on the standard errors for the spline components, but not much effect on the standard errors for the immunity and cross-immunity coefficients. (I’ll talk about the `NW.se` column later.)

The 95% confidence bands for the immunity effects in Kissler et al.’s Fig. 1 correspond approximately to plus-or-minus twice the HC3 standard errors of the coefficients as shown above. For example, the coefficient of -0.00133 for cross-immunity of HKU1 to OC43 has a standard error of 0.00048, so the 95% confidence interval for the coefficient is approximately -0.00229 to -0.00037. The cumulative incidence of OC43 at the end of the 2017-2018 season was about 160. Exponentiating to get the multiplicative cross-immunity effect gives a range of exp(-0.00229×160) to exp(-0.00037×160), which is about 0.69 to 0.94, matching the end-of-season range in Kissler et al.’s plot.
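The arithmetic in this example can be checked with a few lines of Python (a sketch, not the R code used for this post). The function name is hypothetical, and the plus-or-minus-two-standard-errors band is the approximation described above rather than an exact 95% interval.

```python
import math

coef = -0.00133       # HKU1_other estimate from the table above
hc3_se = 0.00048      # its HC3 standard error
cum_incidence = 160   # approximate end-of-season cumulative incidence of OC43

def multiplicative_band(coef, se, x, width=2.0):
    """Band for exp(coef * x), using coef +/- width * se for the coefficient."""
    lo = coef - width * se
    hi = coef + width * se
    return math.exp(lo * x), math.exp(hi * x)

low, high = multiplicative_band(coef, hc3_se, cum_incidence)
print(round(low, 2), round(high, 2))  # prints 0.69 0.94
```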

Kissler et al. interpret the results of this model as follows:

> As expected, depletion of susceptibles for each betacoronavirus strain was negatively correlated with transmissibility of that strain. Depletion of susceptibles for each strain was also negatively correlated with the R_{e} of the other strain, providing evidence of cross-immunity. Per incidence proxy unit, the effect of the cross-immunizing strain was always less than the effect of the strain itself (table S1), but the overall impact of cross-immunity on the R_{e} could still be substantial if the cross-immunizing strain had a large outbreak (e.g., HCoV-OC43 in 2014–2015 and 2016–2017). The ratio of cross-immunization to self-immunization effects was larger for HCoV-HKU1 than for HCoV-OC43, suggesting that HCoV-OC43 confers stronger cross-immunity. Seasonal forcing appears to drive the rise in transmissibility at the start of the season (late October through early December), whereas depletion of susceptibles plays a comparatively larger role in the decline in transmissibility toward the end of the season. The strain-season coefficients were fairly consistent across seasons for each strain and lacked a clear correlation with incidence in prior seasons, consistent with experimental results showing substantial waning of immunity within 1 year (15).

Later, discussing the results of using the same data to fit a SEIRS model (which I’ll discuss in a future post), they comment that

> According to the best-fit model parameters, the R_{0} for HCoV-OC43 and HCoV-HKU1 varies between 1.7 in the summer and 2.2 in the winter and peaks in the second week of January, consistent with the seasonal spline estimated from the data. Also in agreement with the findings of the regression model, the duration of immunity for both strains in the best-fit SEIRS model is ~45 weeks, and each strain induces cross-immunity against the other, although the cross-immunity that HCoV-OC43 infection induces against HCoV-HKU1 is stronger than the reverse.

The summer/winter *R*_{0} ratio of 1.7/2.2=0.77 in this SEIRS model does indeed closely match the low/high ratio of seasonal effects in the regression model, which can be seen above to be approximately 1.1/1.45=0.76. And a duration of 45 weeks for immunity is consistent with the regression model’s assumption that immunity does not decline during the 33 week flu season, but is mostly gone by the start of the next season (19 weeks later), as evidenced by the range of coefficients for *R* at the start of a season (which would be affected by the degree of lingering immunity from the previous season) being only (-0.01195,0.06951) for OC43 and (0.01084,0.21279) for HKU1 (though the width of the latter range seems not entirely negligible).

I summarize the conclusions Kissler et al. draw from this regression model as follows:

- Both betacoronaviruses induce immunity against themselves, but this immunity is mostly gone less than one year later.
- Both viruses also confer substantial cross-immunity against the other (larger for OC43 against HKU1 than the reverse), though this is not as strong as self-immunity.
- There is a seasonal effect on transmission, whose magnitude reduces *R* by a factor of about 0.76 in the off-season compared to the peak.

The model also produces an estimated curve for the seasonal effect (within flu season), but apart from pointing to its rise at the start of flu season as initiating seasonal outbreaks, Kissler et al. perhaps do not attach great significance to the detailed shape of this estimated curve, since the error bands they show around it are quite large. However, they do use this model as evidence regarding the overall magnitude of the seasonal effect, which, to the extent it may be indicative of seasonality for COVID-19, is of great practical significance.

## Are these conclusions from the model valid?

Mostly no, I think. I don’t mean that the conclusions are necessarily all *false* (though some may be), just that, for several reasons, modelling the data that they use in the way that they do does not provide good evidence for their conclusions.

Some of the problems originate in the proxy they use for incidence of coronavirus infection. As I discussed in Part 2, their proxy, which I call proxyW, has artifacts due to holidays, some points affected by outliers, an issue with zero measurements, and noise that is best reduced by smoothing the incidence proxy rather than later. I proposed several proxies that seem better.

These problems with the incidence proxy influence estimation of weekly *R* values for each virus. I argued in Part 3 that modelling *R*_{t} is better than modelling *R*_{u}, because *R*_{u} depends on the future, obscuring the influences on transmissibility of the virus.

Furthermore, the handling of immunity in the regression model Kissler et al. fit would be appropriate for modelling *R*_{t} but not *R*_{u}: the cumulative seasonal incidence of each virus *up to the current time* is used as a covariate in the model, but *R*_{u} will be affected by the cumulative incidence (and consequent immunity) at later times. Because the generative interval distribution, *g*, that is used extends only 19 days (about two weeks) into the future, the effect of this modelling inaccuracy will not be huge, but it will tend to bias the estimation of immunity effects somewhat (probably towards overestimating the immunity effect).

Since *R*_{u} averages values of *R*_{t} at future times, due to the formula *R*_{u} = ∑_{t} *g*(*t*-*u*) *R*_{t} (see Part 3), it is less noisy than *R*_{t}, which may have been a motivation to use it. To further reduce noise, Kissler et al. used a three-week moving average version of log(*R*_{u}).
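As a rough illustration of the formula relating *R*_{u} to *R*_{t}, the following Python sketch (not the R code used for this post) computes *R*_{u} as a weighted average of later *R*_{t} values. The weights `g` are made up, with `g[k-1]` taken as the weight for lag *k* = *t*-*u* = 1, 2, ...; the generative interval distribution actually used is the one described in Part 3.

```python
def r_u_from_r_t(r_t, g):
    """R_u[u] = sum over lags k >= 1 of g[k-1] * R_t[u + k].

    Assumptions: g is indexed from lag 1 and sums to 1; values of u whose
    window would run past the end of r_t are omitted.
    """
    n = len(r_t) - len(g)
    return [
        sum(g[k - 1] * r_t[u + k] for k in range(1, len(g) + 1))
        for u in range(n)
    ]

# A deliberately noisy, made-up R_t series and a two-step weight vector.
r_t = [1.0, 2.0, 1.0, 2.0, 1.0]
g = [0.5, 0.5]
print(r_u_from_r_t(r_t, g))  # prints [1.5, 1.5, 1.5]
```

The alternating values in `r_t` average out completely here, which is an extreme version of the noise reduction described above. It also shows why *R*_{u} at time *u* depends on what happens after *u*.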

If we try modelling *R*_{t} with Kissler et al.’s regression model, here is what we get for HKU1 in 2017-2018:

### Model for *R*_{t} using proxyW

The artifacts and noise in the *R*_{t} estimates produce obviously ridiculous results, with strong negative cross-immunity, and a seasonal effect that has no resemblance to what we would expect.

The results of applying Kissler et al.’s model to the unsmoothed *R*_{u} are not so bad. Below is a comparison of Kissler et al.’s model for smoothed *R*_{u} values versus the same model for unsmoothed *R*_{u} values, using the 2017-2018 season as an example. Here and below, I’ll use plots for log(*R*) rather than *R* itself, since it is log(*R*) that the regression model works with.

### Model for smoothed *R*_{u} using proxyW

### Model for unsmoothed *R*_{u} using proxyW

The differences are noticeable, but not drastic. The rise in the seasonal effect at the start of the season is steeper with unsmoothed *R*_{u} values, so the whole seasonal curve is shifted up (partly compensated for by the green dot shifting down). The rise at the end of the season is even larger than before. The immunity and cross-immunity effects are slightly larger.

The results using the unsmoothed *R*_{u} values are a better guide to what this model is saying. These results are somewhat unexpected: rather than the sharp rise in the seasonal effect seen at the end of flu season, one might instead expect a drop, and the very steep rise at the beginning of the flu season is also odd, since nothing of apparent note happens then (schools open earlier). These oddities should prompt one to critically examine both the data and the assumptions behind the model. Somewhat obscuring the problems by using smoothed *R*_{u} values is not helpful.

Further insight comes from looking in detail at the estimates found using unsmoothed *R*_{u} values:

| Coefficient | Estimate | Std. Error | HC3.se | NW.se |
|---|---|---|---|---|
| seasonal_spline1 | 0.57484 | 0.18245 | 0.31483 | 0.21023 |
| seasonal_spline2 | 0.31601 | 0.15096 | 0.25117 | 0.29209 |
| seasonal_spline3 | 0.58870 | 0.14684 | 0.20252 | 0.17467 |
| seasonal_spline4 | 0.13309 | 0.14331 | 0.18339 | 0.20336 |
| seasonal_spline5 | 0.48668 | 0.18671 | 0.18977 | 0.19179 |
| seasonal_spline6 | 0.14439 | 0.24106 | 0.22037 | 0.23593 |
| seasonal_spline7 | 0.27262 | 0.26504 | 0.23446 | 0.24860 |
| seasonal_spline8 | 0.24716 | 0.28551 | 0.24667 | 0.27696 |
| seasonal_spline9 | 0.31697 | 0.28663 | 0.25691 | 0.25880 |
| seasonal_spline10 | 0.48711 | 0.26841 | 0.26182 | 0.27002 |
| OC43_same | -0.00204 | 0.00088 | 0.00063 | 0.00060 |
| OC43_other | -0.00090 | 0.00076 | 0.00056 | 0.00056 |
| HKU1_same | -0.00274 | 0.00076 | 0.00068 | 0.00066 |
| HKU1_other | -0.00145 | 0.00088 | 0.00074 | 0.00063 |
| OC43_season_2014 | -0.00951 | 0.10294 | 0.14988 | 0.16930 |
| OC43_season_2015 | -0.10040 | 0.10868 | 0.15427 | 0.16894 |
| OC43_season_2016 | -0.04292 | 0.10410 | 0.14776 | 0.16883 |
| OC43_season_2017 | -0.00409 | 0.11297 | 0.15312 | 0.17150 |
| OC43_season_2018 | -0.01522 | 0.10473 | 0.14967 | 0.16304 |
| HKU1_season_2014 | 0.04350 | 0.10294 | 0.21745 | 0.21514 |
| HKU1_season_2015 | -0.07347 | 0.10868 | 0.16365 | 0.18131 |
| HKU1_season_2016 | 0.05114 | 0.10410 | 0.20899 | 0.18428 |
| HKU1_season_2017 | 0.14614 | 0.11297 | 0.16333 | 0.17836 |
| HKU1_season_2018 | 0.11115 | 0.10473 | 0.20682 | 0.19546 |

Residual standard deviation for OC43: 0.1415

Residual ACF: 1 0.28 -0.23 -0.04 0.05 0.02 0.04 -0.15 -0.22 -0.12 -0.08 -0.03

Residual standard deviation for HKU1: 0.3622

Residual ACF: 1 0.17 -0.37 -0.17 0.03 0.08 -0.01 -0.13 -0.09 0.01 0.11 0.08

Notice that the estimate for cross immunity from OC43 to HKU1 (given by `HKU1_other` above) has increased slightly (compared to using smoothed *R*_{u} values), but its magnitude is now less than twice its standard error, and hence is not significant at the conventional 0.05 level; that is, by the usual criterion, we can’t be confident that there actually is any cross-immunity. The estimate for cross immunity in the other direction, though again larger, is also less significant (it was marginal before), because its standard error is now larger.

When properly computed, the standard errors of estimates in the model for smoothed *R*_{u} values that Kissler et al. fit are also large enough that zero cross immunity (in either direction) is plausible. The residuals for that model have significant autocorrelation at lag one (0.66 for OC43, 0.55 for HKU1), which tends to increase the standard errors. The `NW.se` column has standard errors that account for autocorrelation (and differing variances), as found by the Newey-West method (implemented in the “sandwich” R package). With these larger standard errors, the estimate for cross-immunity from HKU1 to OC43 is not significant. The estimate for cross-immunity from OC43 to HKU1 is more than twice the Newey-West standard error, but not by much. (Methods for estimating standard errors when residuals are autocorrelated are somewhat unreliable, so one should not have high confidence in marginal results.)
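The “twice the standard error” comparisons for the smoothed-*R*_{u} model can be reproduced from the first coefficient table. This is a small Python sketch (not the R code used for this post); the dictionary and function names are hypothetical, and the numbers are copied from the `Estimate` and `NW.se` columns.

```python
# (estimate, Newey-West standard error) for the cross-immunity coefficients
# of the model fitted to smoothed R_u values.
nw = {
    "OC43_other": (-0.00076, 0.00056),  # cross-immunity from HKU1 to OC43
    "HKU1_other": (-0.00133, 0.00063),  # cross-immunity from OC43 to HKU1
}

def z_ratio(estimate, se):
    """Magnitude of the estimate in units of its standard error."""
    return abs(estimate) / se

for name, (est, se) in nw.items():
    z = z_ratio(est, se)
    print(name, round(z, 2), "more than 2" if z > 2 else "less than 2")
```

The ratios come out at about 1.36 for `OC43_other` and about 2.11 for `HKU1_other`.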

The autocorrelations of the residuals are at least partly the result of fitting smoothed values for *R*_{u}: the smoothed values are averages over three weeks, with two of these weeks being common to two adjacent values, inducing autocorrelation. The estimated lag-one autocorrelations for residuals in the model of unsmoothed *R*_{u} values are much less (0.28 for OC43, 0.17 for HKU1).

Since the estimated cross-immunity effects are at most marginally significant, the conclusions Kissler et al. draw about cross-immunity do not seem justified.
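How much autocorrelation three-week averaging can induce by itself is easy to work out in an idealized case. The Python sketch below (not the R code used for this post) assumes the values being averaged have uncorrelated errors of equal variance and that the three weeks get equal weight; both are simplifying assumptions, and the function name is hypothetical.

```python
def ma_autocorrelation(weights, lag):
    """Autocorrelation at `lag` of a moving average of uncorrelated,
    equal-variance noise, for the given averaging weights."""
    num = sum(weights[i] * weights[i + lag] for i in range(len(weights) - lag))
    den = sum(w * w for w in weights)
    return num / den

three_week = [1 / 3, 1 / 3, 1 / 3]
for lag in range(4):
    print(lag, round(ma_autocorrelation(three_week, lag), 2))
```

Under these assumptions the lag-one autocorrelation is 2/3, reflecting the two shared weeks out of three, and it drops to zero by lag three. The residual autocorrelations reported above come from a fitted regression, so they need not match this idealized value.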

The estimated curve for the seasonal effect on transmission seems rather strange. Perhaps the abrupt rises at the beginning and end of the flu season are artifacts of the handling of end-points by the spline procedure used (I have not investigated this enough to be sure). Or perhaps a strange result should not be seen as surprising, since the error bands for the seasonal effect curve in Kissler et al.’s Fig. 1 are quite large, and would be even larger if the effect of autocorrelation in the residuals were accounted for. But on that view, the uncertainty in the inferred seasonal effect is so large that no useful conclusions can be drawn.

Kissler et al.’s conclusion that each virus induces self-immunity might seem justified, since the estimates for `OC43_same` and `HKU1_same` are much more than twice their standard errors, when using either the smoothed or unsmoothed values for *R*_{u}, and with any of the methods used to compute standard errors. That these viruses induce some immunity is well supported from other evidence in any case. However, caution is warranted when using this model to judge the strength of immunity, or its duration. Using phrases such as “depletion of susceptibles” and “self-immunity” in reference to covariates and regression coefficients in a model does not mean that these features of the model *actually correspond to those concepts in reality*.

The data being modelled is aggregated over the entire United States. On the NREVSS regional page, one can see the regional breakdown for positive common cold coronavirus tests in the last two years. For OC43, the peaks in the 2018-2019 season in the four regions occur on January 12 (Northeast), January 26 (Midwest), January 5 (South), and December 22 (West), a spread of five weeks. The “depletion of susceptibles” in the West may have little influence on transmission in the Northeast. There must be geographical variation on a finer scale as well. It is therefore not safe to interpret the regression coefficient on “depletion of susceptibles” as an indication of strength of immunity.
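The five-week spread follows directly from the peak dates, as this short Python check shows (a sketch, not the R code used for this post). The calendar years are inferred from the season: the December date is taken to be in 2018 and the January dates in 2019.

```python
from datetime import date

# Peak dates for OC43 in the 2018-2019 season, as listed above.
# Years are an inference from the season label, not stated in the text.
peaks = {
    "Northeast": date(2019, 1, 12),
    "Midwest": date(2019, 1, 26),
    "South": date(2019, 1, 5),
    "West": date(2018, 12, 22),
}

spread_days = (max(peaks.values()) - min(peaks.values())).days
print(spread_days, "days =", spread_days // 7, "weeks")  # prints 35 days = 5 weeks
```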

The model used by Kissler et al. assumes that “immunity” persists undiminished during flu season, but that it is largely gone by the start of the next season. This seems implausible: why would immunity that had lasted for 20 or more weeks during the flu season be almost completely gone after another 19 weeks? More plausibly, the “immunity” (i.e., the regression coefficient on “depletion of susceptibles”) actually starts to decline during the flu season, not because the immune response of infected individuals declines, but because the locus of the outbreak moves from initially infected geographical areas to other areas, whose susceptibles have not yet been depleted. If a model does not include such a decline in “immunity”, the effect the decline would have had may be produced instead by another component of the model. This may explain why the seasonal effect goes up after week 22 of the season (early March), as a substitute for the “immune” effect becoming less. If this is happening, then the magnitude of the seasonal effect in Kissler et al.’s model underestimates the true seasonal effect.

## Using better R estimates from better proxies

One way of trying to get more useful results than Kissler et al. do is to use better *R* estimates, based on better proxies for incidence of the betacoronaviruses, as I discussed in Part 2 and Part 3 of this series.

The proxyAXss-filter *R* estimates differ from the proxyW estimates by:

- Using the ratio of ILI visits to non-ILI visits as a proxy for Influenza-Like Illness rather than the (weighted) ratio of ILI visits to total visits.
- Removing artifacts due to holidays by interpolating from neighboring non-holiday weeks.
- Replacing some outliers in percent positive tests for coronaviruses by reasonable values.
- Replacing zero values for percent positive tests by reasonable non-zero values.
- Smoothing the percentage of positive tests for coronaviruses due to each virus.
- Applying a smoothing filter to the proxies after interpolating to daily values.

Modelling the unsmoothed *R*_{u} estimates from proxyAXss-filter gives the following results for 2017-2018:

### Model for unsmoothed *R*_{u} using proxyAXss-filter

The sharp rise in the seasonal effect at the beginning of the season is less than for the model of unsmoothed *R*_{u} values with proxyW (shown above). There is some autocorrelation at lag one in the residuals (0.53 for both viruses), perhaps due to the smoothing done for the proxies (though *R*_{u} itself also involves smoothing). After accounting for this autocorrelation, the 95% confidence interval for the cross-immunity effect of OC43 on HKU1 overlaps with zero, and as can be seen (grey line) above, the estimate is closer to zero than for the previous models. The estimate for cross-immunity from HKU1 to OC43 is about the same as before, and is at best marginally significant.

Since proxyAXss-filter has less noise, somewhat reasonable results are obtained when modelling *R*_{t}:

### Model for *R*_{t} using proxyAXss-filter

I described several other approaches to reducing noise in the proxies in Part 2. Here is a plot from a model using the proxyDn estimates for *R*_{t}:

### Model for *R*_{t} using proxyDn

The estimates for this model are as follows:

| Coefficient | Estimate | Std. Error | HC3.se | NW.se |
|---|---|---|---|---|
| seasonal_spline1 | 0.08769 | 0.06043 | 0.06199 | 0.03482 |
| seasonal_spline2 | 0.14846 | 0.04995 | 0.04751 | 0.04079 |
| seasonal_spline3 | 0.31573 | 0.04869 | 0.05294 | 0.05572 |
| seasonal_spline4 | 0.08249 | 0.04499 | 0.05382 | 0.06426 |
| seasonal_spline5 | 0.25449 | 0.05238 | 0.06374 | 0.08768 |
| seasonal_spline6 | -0.12442 | 0.06334 | 0.07115 | 0.11831 |
| seasonal_spline7 | -0.09949 | 0.06855 | 0.07913 | 0.13046 |
| seasonal_spline8 | -0.20727 | 0.07549 | 0.08118 | 0.13549 |
| seasonal_spline9 | -0.03293 | 0.07526 | 0.07716 | 0.12224 |
| seasonal_spline10 | -0.01487 | 0.06734 | 0.07468 | 0.13326 |
| OC43_same | -0.00123 | 0.00019 | 0.00018 | 0.00037 |
| OC43_other | -0.00047 | 0.00020 | 0.00026 | 0.00055 |
| HKU1_same | -0.00252 | 0.00020 | 0.00022 | 0.00040 |
| HKU1_other | 0.00002 | 0.00019 | 0.00018 | 0.00035 |
| OC43_season_2014 | 0.14325 | 0.03397 | 0.04386 | 0.05663 |
| OC43_season_2015 | 0.11544 | 0.03504 | 0.03894 | 0.04554 |
| OC43_season_2016 | 0.11594 | 0.03423 | 0.03661 | 0.02958 |
| OC43_season_2017 | 0.15530 | 0.03713 | 0.04109 | 0.05545 |
| OC43_season_2018 | 0.18876 | 0.03579 | 0.03744 | 0.03144 |
| HKU1_season_2014 | 0.03563 | 0.03397 | 0.04567 | 0.07521 |
| HKU1_season_2015 | 0.11766 | 0.03504 | 0.04319 | 0.05390 |
| HKU1_season_2016 | 0.02589 | 0.03423 | 0.03686 | 0.03899 |
| HKU1_season_2017 | 0.29256 | 0.03713 | 0.03947 | 0.04218 |
| HKU1_season_2018 | 0.10250 | 0.03579 | 0.03600 | 0.03676 |

Residual standard deviation for OC43: 0.09

Residual ACF: 1 0.63 0.31 0.22 0.12 0.03 -0.03 -0.12 -0.19 -0.22 -0.29 -0.34

Residual standard deviation for HKU1: 0.0922

Residual ACF: 1 0.63 0.31 0.21 0.09 -0.03 -0.09 -0.15 -0.17 -0.14 -0.14 -0.15

There is significant autocorrelation, due to the smoothing of proxies, so the Newey-West standard errors should be used.

The coefficient for cross-immunity from OC43 to HKU1 is virtually zero. The coefficient for cross-immunity from HKU1 to OC43 is only about one standard error away from zero. This confirms that a model of the kind used by Kissler et al. provides no evidence of cross-immunity between the two betacoronaviruses, when it is properly fit to cleaned-up data. There is certainly no support for the claim by Kissler et al. that cross-immunity from OC43 against HKU1 is stronger than the reverse.

That this model fails to show cross-immunity is not really surprising. Because the data is aggregated over all regions of the US, it is quite likely that many of those infected with one of the viruses are in regions where the other virus is not currently spreading. In the long term, both viruses might induce immunity in people in every region, and cross-immunity between them might be evident in that context, but this model is sensitive only to cross-immunity on a time scale of a few months.

For HKU1, the estimated coefficients of the indicators of the five seasons differ quite a bit. Moreover, the largest coefficient, 0.29256 for the 2017-2018 season, follows the season with the smallest incidence for HKU1, as seen here:

The second highest coefficient, 0.11766 for the 2015-2016 season, follows the season with the second-smallest incidence. Contrary to the conclusions of Kissler et al., it seems quite plausible that these seasonal coefficients reflect the level of continuing immunity from the previous season.

## Building a better model

To learn more from this data, we need to use a better model than Kissler et al.’s. I’ll confine myself here to models that largely use the same techniques that Kissler et al. use, that is, linear regression models for log(*R*), fit by least squares.

Getting a better idea of the seasonal effect on transmission is crucial. To do this, the model must cover the whole year, including summer months when the seasonal reduction in transmission is possibly at its strongest. The bias in estimation of the seasonal effect that can result from using an unrealistic “immunity” model also needs to be eliminated. Of course, a good model for immunity and cross-immunity is also of great interest in itself.

It is easy to extend the “flu season” to the whole year: just set the season start to week 28 (early July) and the season end to week 27 of the next year. If that is all that is done, however, the results are ridiculous, since the model then assumes that immunity abruptly jumps to zero between week 27 and week 28.

I’ve tried several variations on more realistic models of immunity. Some of the more complex models that I tried did not work because their multiple factors cannot easily be identified from the data available, at least when working in the least-squares linear regression framework.

For example, I tried to correctly model that immunity should be represented by a term in the expression for log(*R*) of the form log(1-*d*/*T*), where *d* is a proxy for the number of immune people, and *T* is the total population, scaled by the unknown scaling factor for the proxy *d*. This corresponds to a multiplicative factor of (1-*d*/*T*) for *R*, which represents the decline in the susceptible population. If *d*/*T* is close to zero, log(1-*d*/*T*) will be approximately -*d*/*T*, and the model can be fit by including *d* as a covariate in the linear regression, with the regression coefficient found corresponding to -1/*T*. This is what is implicitly done in Kissler et al.’s model.
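
To get a feel for when the linear approximation is reasonable, here is a minimal Python sketch comparing log(1-*d*/*T*) with -*d*/*T* for a few values of *d*/*T*. The function names and the chosen values are illustrative only; they are not taken from the paper or from its code.

```python
import math

def exact_term(x):
    """Exact immunity term log(1 - d/T), where x = d/T."""
    return math.log(1.0 - x)

def linear_term(x):
    """First-order approximation -d/T, where x = d/T."""
    return -x

# Illustrative values of d/T (assumed, not estimated from any data)
for x in (0.01, 0.05, 0.1, 0.3, 0.5):
    exact = exact_term(x)
    approx = linear_term(x)
    rel_err = (approx - exact) / abs(exact)
    print(f"d/T = {x:4.2f}  exact = {exact:8.4f}  "
          f"linear = {approx:8.4f}  relative error = {rel_err:6.1%}")
```

The gap between the two grows steadily as *d*/*T* moves away from zero, which is why the approximation is more questionable for long-term immunity than for short-term effects.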

But when considering long-term immunity, assuming that *d*/*T* is close to zero seems unrealistic. I therefore tried including log(1-*d*/*T*) as a covariate, expecting its regression coefficient to be close to one, and then chose *T* by a search procedure to minimize the standard deviation of the model residuals. Unfortunately, there are too many ways that parameters in this model can trade off against each other (e.g., *T* and the regression coefficient for log(1-*d*/*T*)), producing results in which these parameters don’t take on their intended roles (even if the data is modelled well in terms of residual standard deviation). I therefore abandoned this approach, and am instead hoping that even if *d*/*T* is not small, its variation may be small enough that log(1-*d*/*T*) is approximately linear in the region over which *d*/*T* varies.
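
The kind of search described here can be sketched as follows. This is only a toy version under stated assumptions: the data are synthetic (generated with a made-up "true" *T* of 1000), the model has just an intercept and the one immunity covariate, and the grid of candidate *T* values is arbitrary. None of this reproduces the actual models I fitted.

```python
import math
import random

def fit_line(x, y):
    """Least-squares fit of y ~ a + b*x; returns (a, b, residual SD)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, math.sqrt(rss / (n - 2))

# Synthetic data (hypothetical values, not the proxies from the paper)
random.seed(1)
T_TRUE = 1000.0
d = [random.uniform(0.0, 400.0) for _ in range(200)]
log_r = [0.3 + math.log(1.0 - di / T_TRUE) + random.gauss(0.0, 0.01)
         for di in d]

# Grid search: for each candidate T, refit and record the residual SD
results = []
for T in range(500, 3001, 50):
    covariate = [math.log(1.0 - di / T) for di in d]
    a, b, sd = fit_line(covariate, log_r)
    results.append((sd, T, b))

best_sd, best_T, best_b = min(results)
print(f"best T = {best_T}, slope = {best_b:.3f}, residual SD = {best_sd:.4f}")
for sd, T, b in results[::10]:
    print(f"T = {T:5d}  slope = {b:6.3f}  residual SD = {sd:.4f}")
```

The thing to look at in the printed grid is how much the fitted slope moves with *T* compared with how little the residual SD changes. That is the trade-off that makes the two parameters hard to pin down separately.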

The immunity model I tried that seems to work best assumes that there is both an “immunity” of short duration (actually due to the data being aggregated over regions where infection spreads at different times) and long-term immunity (which is supposed to represent widespread immunity in the population). Both forms of immunity are modelled using exponentially-weighted sums of the incidence proxies for each virus, but with different rates of decay in “immunity”. For short-term immunity, the incidence *k* weeks ago is given weight 0.85^*k*, which corresponds to decay of immunity by a factor of two after 4 weeks. For long-term immunity, the weight for the incidence *k* weeks ago is 0.985^*k*, which corresponds to decay of immunity by a factor of two after 46 weeks. These two exponentially weighted sums for a virus are included as covariates in the regression model for that virus, and the long-term sum is included as a covariate for the other virus, allowing cross-immunity to be modelled.
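
As an illustration of these covariates, here is a small Python sketch of an exponentially-weighted running sum, together with the half-life implied by each decay rate. The incidence values are made up, and whether the current week (*k* = 0) is included in the sum is an assumption of this sketch, not something stated above.

```python
import math

def half_life(decay):
    """Number of weeks for the weight decay**k to fall to one half."""
    return math.log(0.5) / math.log(decay)

def exp_weighted_sums(incidence, decay):
    """Running sums s[t] = sum over k >= 0 of decay**k * incidence[t - k].

    Including the current week (k = 0) is an assumption of this sketch.
    """
    sums = []
    s = 0.0
    for x in incidence:
        s = decay * s + x   # older incidence is discounted one more step
        sums.append(s)
    return sums

# Made-up weekly incidence proxy values, for illustration only
weekly_incidence = [0.0, 1.0, 3.0, 6.0, 4.0, 2.0, 1.0, 0.0, 0.0, 0.0]

short_term = exp_weighted_sums(weekly_incidence, 0.85)
long_term = exp_weighted_sums(weekly_incidence, 0.985)

print("half-lives (weeks):", half_life(0.85), half_life(0.985))
print("short-term sums:", [round(s, 2) for s in short_term])
print("long-term sums: ", [round(s, 2) for s in long_term])
```

The computed half-lives are a little over 4 weeks and a little under 46 weeks, in line with the rounded figures quoted above. The recursive update gives the same result as summing the weighted incidence values directly, but needs only one pass over the data.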

The way seasonality is modelled also needs to change, to avoid a discontinuity at the end of one full-year season and the beginning of the next. A periodic spline would be a possibility, but the `bs` function used in Kissler et al.’s R model does not have that option. I used a Fourier representation of a periodic seasonality function, including covariates of the form sin(2π*fy*) and cos(2π*fy*), where *y* is the time in years since the start of the data series, and *f* is 1, 2, 3, 4, 5, or 6. This seasonality model is common to both viruses.
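
The Fourier covariates can be sketched in a few lines of Python. The function name, and the simplification of treating a year as exactly 52 weeks, are assumptions made for this illustration; they are not taken from the code used to fit the model.

```python
import math

def fourier_covariates(years, max_freq=6):
    """Covariates sin(2*pi*f*y), cos(2*pi*f*y) for f = 1..max_freq.

    `years` is the time in years since the start of the data series.
    """
    row = []
    for f in range(1, max_freq + 1):
        row.append(math.sin(2 * math.pi * f * years))
        row.append(math.cos(2 * math.pi * f * years))
    return row

# One row of 12 covariates per week, over five years
# (52 weeks per year is a simplification for this sketch)
design = [fourier_covariates(week / 52.0) for week in range(5 * 52)]

print(len(design), "rows,", len(design[0]), "seasonal covariates per row")
```

Because every covariate has a period that divides one year, any linear combination of them takes the same value at time *y* and at time *y*+1, so the fitted seasonal effect cannot jump between the end of one year and the start of the next.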

Finally, I included a spline to model long-term trends (e.g., change in population density), which is also common to both viruses, and an overall offset for each virus (the same for all years). The long-term trend spline was limited to five degrees of freedom so that it would not be able to take over the modelling of seasonality or immunity effects.

Here are the results of fitting this model using proxy data proxyEn-daily, which is the proxy with the most smoothing / least noise (see Part 3), for 2017-2018 (start of July to end of June):

### Full-year model for *R*_t using proxyEn-daily

The dotted vertical lines show the start and end of “flu season”, to which the previous models were restricted. The green line is the long-term trend; the dark green dot on the left is the overall offset for that virus. The solid black line is the short-term immunity effect; the dashed black line is the long-term immunity effect. The dashed gray line is the long-term cross-immunity effect. As before, the orange line is the seasonal effect, the red line is the sum of all effects, and the pink dots are the estimates for *R*_t that the red line is fitted to.

The estimated seasonal effect on log(*R*) has a low-to-high range of 0.44, which corresponds to *R* being 1.55 times higher in mid-November than in mid-June. The shape of the seasonal effect curve during the “flu season” is quite similar to the shape found using the model restricted to flu season, with proxyDn estimates for *R*_t (shown above), though the range is slightly greater, and the rise at the end of flu season is smaller.

For both viruses, the plot shows both short-term and long-term self-immunity effects. The plot shows some cross-immunity from HKU1 to OC43, but little indication of cross-immunity from OC43 to HKU1.

Here are the plots for all years, in both additive and multiplicative form:

Here are the estimates for the regression coefficients, and their standard errors:

| Term | Estimate | Std. Error | HC3.se | NW.se |
|---|---|---|---|---|
| `trend_spline1` | -0.07411 | 0.04928 | 0.05926 | 0.29550 |
| `trend_spline2` | -0.06561 | 0.02890 | 0.03710 | 0.18423 |
| `trend_spline3` | -0.09946 | 0.04761 | 0.06531 | 0.33427 |
| `trend_spline4` | 0.27051 | 0.03097 | 0.04304 | 0.23808 |
| `trend_spline5` | 0.02323 | 0.05161 | 0.08320 | 0.29286 |
| `sin(1 * 2 * pi * yrs)` | 0.11735 | 0.01395 | 0.01832 | 0.07656 |
| `cos(1 * 2 * pi * yrs)` | -0.16124 | 0.00625 | 0.00746 | 0.02524 |
| `sin(2 * 2 * pi * yrs)` | -0.02905 | 0.00580 | 0.00617 | 0.02275 |
| `cos(2 * 2 * pi * yrs)` | 0.02184 | 0.00476 | 0.00541 | 0.01813 |
| `sin(3 * 2 * pi * yrs)` | -0.00949 | 0.00422 | 0.00391 | 0.00681 |
| `cos(3 * 2 * pi * yrs)` | -0.01828 | 0.00430 | 0.00471 | 0.00860 |
| `sin(4 * 2 * pi * yrs)` | 0.01808 | 0.00414 | 0.00394 | 0.00785 |
| `cos(4 * 2 * pi * yrs)` | -0.02917 | 0.00419 | 0.00429 | 0.00722 |
| `sin(5 * 2 * pi * yrs)` | -0.01345 | 0.00412 | 0.00425 | 0.00560 |
| `cos(5 * 2 * pi * yrs)` | -0.00494 | 0.00417 | 0.00406 | 0.00527 |
| `sin(6 * 2 * pi * yrs)` | 0.01211 | 0.00412 | 0.00436 | 0.00509 |
| `cos(6 * 2 * pi * yrs)` | 0.00014 | 0.00416 | 0.00405 | 0.00290 |
| `OC43_same` | -0.00136 | 0.00020 | 0.00020 | 0.00071 |
| `OC43_samelt` | -0.00099 | 0.00019 | 0.00028 | 0.00097 |
| `OC43_otherlt` | -0.00098 | 0.00013 | 0.00020 | 0.00098 |
| `HKU1_same` | -0.00060 | 0.00024 | 0.00030 | 0.00104 |
| `HKU1_samelt` | -0.00208 | 0.00015 | 0.00019 | 0.00091 |
| `HKU1_otherlt` | -0.00021 | 0.00017 | 0.00022 | 0.00094 |
| `OC43_overall` | 0.52199 | 0.07282 | 0.10294 | 0.39844 |
| `HKU1_overall` | 0.37802 | 0.07037 | 0.09145 | 0.39340 |

| Derived quantity | Estimate | NW std err |
|---|---|---|
| Sum of `OC43_same` and `OC43_samelt` | -0.00235 | 0.00095 |
| Sum of `HKU1_same` and `HKU1_samelt` | -0.00268 | 0.00082 |
| Seasonality for week50 - week20 | -0.44075 | 0.10621 |

- Residual standard deviation for OC43: 0.0625. Residual ACF: 1, 0.93, 0.76, 0.54, 0.32, 0.13, -0.01, -0.1, -0.15, -0.18, -0.21, -0.23
- Residual standard deviation for HKU1: 0.0665. Residual ACF: 1, 0.89, 0.66, 0.39, 0.12, -0.08, -0.21, -0.26, -0.27, -0.24, -0.21, -0.17

There is substantial residual autocorrelation, so the Newey-West standard errors should be used.

Looking at the immunity coefficients individually, only `HKU1_samelt` has magnitude more than twice its standard error (`OC43_same` is close). This is somewhat misleading, however, since the short-term and long-term immunity terms can trade off against each other: a lower level of short-term immunity can be compensated to some extent by a higher level of long-term immunity, and vice versa. I computed the standard error for the sum of the short-term and long-term coefficients, finding that for both viruses the magnitude of this sum is greater than twice its standard error, providing clear evidence that these viruses induce immunity against themselves.

The cross-immunity coefficients are much less than twice their standard error, so this model provides little or no evidence of cross-immunity for these viruses.
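
The comparisons in the last two paragraphs can be checked directly from the reported numbers. The Python sketch below computes the ratio of each immunity coefficient to its Newey-West standard error, and also backs out the correlation between the short-term and long-term estimates that is implied by the standard error of their sum. The helper names are hypothetical, and since the inputs are rounded to five decimal places, the implied correlations are only rough.

```python
# (estimate, Newey-West standard error), as reported in the coefficient table
immunity = {
    "OC43_same":    (-0.00136, 0.00071),
    "OC43_samelt":  (-0.00099, 0.00097),
    "OC43_otherlt": (-0.00098, 0.00098),
    "HKU1_same":    (-0.00060, 0.00104),
    "HKU1_samelt":  (-0.00208, 0.00091),
    "HKU1_otherlt": (-0.00021, 0.00094),
}

# Reported sums of the short-term and long-term self-immunity coefficients
sums = {"OC43": (-0.00235, 0.00095), "HKU1": (-0.00268, 0.00082)}

def ratio(est, se):
    """Magnitude of an estimate relative to its standard error."""
    return abs(est) / se

def implied_correlation(se_a, se_b, se_sum):
    """Correlation implied by Var(a+b) = Var(a) + Var(b) + 2*Cov(a,b)."""
    return (se_sum ** 2 - se_a ** 2 - se_b ** 2) / (2 * se_a * se_b)

for name, (est, se) in immunity.items():
    print(f"{name:13s} |estimate| / NW se = {ratio(est, se):.2f}")

for virus, (est, se_sum) in sums.items():
    rho = implied_correlation(immunity[f"{virus}_same"][1],
                              immunity[f"{virus}_samelt"][1], se_sum)
    print(f"{virus}: |sum| / NW se = {ratio(est, se_sum):.2f}, "
          f"implied correlation of short and long terms = {rho:.2f}")
```

For both viruses the standard error of the sum is smaller than it would be if the two estimates were uncorrelated, so the implied correlation is negative. That is what one would expect if the short-term and long-term terms trade off against each other.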

As a robustness check, here is the same model fitted to the proxyAXss-filter data, which is less smoothed, with the minimal “fixups” needed to get reasonable results:

The curve for the seasonal effect is similar to the fit using proxyEn-daily. The long-term immunity and cross-immunity effects look considerably different. Despite this, the red curve, giving the fitted values for log(*R*), is very similar. The changes in the immunity curves are mostly compensated for by changes in the offset for each virus (dark green dots) and in the long-term trend line (green); the variation due to immunity and cross-immunity is similar.

As another robustness check, here are the results of applying the same model to the two alphacoronaviruses:

The seasonal effect for these viruses is similar to that for the betacoronaviruses. This is not a completely independent result, however, since some data is shared between the two virus groups in the construction of the proxies.

For further insight into the meaning of these estimates and standard errors, we can look at the incidence of these viruses, according to proxyEn:

For OC43, substantial peaks of incidence occur every year, but large peaks for HKU1 occur in only two of the five years. Because of this, the long-term exponentially-weighted incidence sum is more variable for HKU1 than for OC43, which means that more information is available about the immunity effects for HKU1 than for OC43.

It’s also important to keep in mind that there are only five years of data. Just from looking at the incidence plots above, one might think that the highest peak for HKU1, in 2017-2018, is a consequence of the almost non-existent peak in 2016-2017, which would result in fewer immune people the following year. However, the large 2017-2018 peak might also be caused in part by a general upward trend, which might also be discerned in the incidence for OC43. The possible trade-offs between these effects increase the uncertainty in inferring immunity and cross-immunity.

Beyond the formal indication of uncertainty captured by the standard errors, there is unquantified uncertainty stemming from the methods used to construct the proxies. These uncertainties could be captured formally only by a different approach, in which the process that produces the proxy data is modelled in terms of latent processes, including the transmission dynamics of the viruses. There is also uncertainty stemming from my informal model selection procedure, which may be “overfitting” the data, or incorporating my preconceptions.

Taking these uncertainties into account, my conclusions from this modelling exercise are as follows:

- Both betacoronaviruses induce immunity against themselves. This immunity probably persists to a significant extent for at least one year. There is likely some apparent short-term immunity that is really a reflection of the fact that the data being modelled is aggregated over the entire United States.
- There may be some cross-immunity between the two betacoronaviruses, but this is not clearly established from this data.
- There is good evidence that the season affects transmissibility of these viruses. The 95% confidence interval for the ratio of the minimum to the maximum seasonal effect is (0.52, 0.80). (This is found as exp(-0.44075±2*0.10621) using the estimate and standard error above.) One should note that this is an average seasonal effect over the entire United States; it is presumably greater in some regions and less in others.
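
The arithmetic behind the confidence interval in the last point, and behind the factor of 1.55 quoted earlier for the seasonal range, is simple enough to spell out. This Python sketch uses only the estimate and Newey-West standard error reported above; the variable names are just for illustration.

```python
import math

# Seasonal effect on log(R): reported difference and its Newey-West std err
est = -0.44075
se = 0.10621

# Approximate 95% interval on the log scale, then exponentiate
low = math.exp(est - 2 * se)
high = math.exp(est + 2 * se)
print(f"ratio of minimum to maximum seasonal effect: "
      f"{math.exp(est):.2f}, interval ({low:.2f}, {high:.2f})")

# The same range expressed as maximum over minimum
print(f"R is about {math.exp(0.44):.2f} times higher at the seasonal peak")
```

Using ±2 standard errors on the log scale and then exponentiating keeps the interval for the ratio positive, at the cost of making it asymmetric around the point estimate.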

## Future posts

With this post, I’ve completed my discussion of the first part of the paper. Somewhat strangely, the results from this part, whose validity I question above, are not *directly* used in the rest of the paper. They instead just provide a sanity check for the results of fitting a different model to the same proxy data on incidence, and serve as one input to the possible scenarios for the future that are considered. The second model that Kissler et al. fit is of the SEIRS (Susceptible-Exposed-Infected-Recovered-Susceptible) type. After being used to model the common cold betacoronaviruses, this SEIRS model is extended to include SARS-CoV-2, with various hypothesized characteristics. Simulations with these extended models are the basis for Kissler et al.’s policy discussion.

I’ll critique this SEIRS model for the common cold betacoronaviruses in a future post. But my next post will describe how the full-year regression models for *R*_t described above can easily be converted to dynamical models, which can be used to simulate from a distribution of hypothetical alternative histories for the 2015 to 2019 period. The degree to which these simulations resemble reality is another indication of model quality.