2 SIRC Epidemiological model

Hakkel Tamás - 2018.06.16.

The SIRC (Susceptible-Infected-Recovered-Carrier) epidemiological model has four state variables:

  • $S$ denotes the number of susceptible,
  • $I$ the number of infected,
  • $R$ the number of recovered, and
  • $C$ the number of carrier individuals.

The model expresses that

  • susceptible individuals might become infected by either infected or carrier individuals,
  • infected individuals are infectious for a while, then they become either recovered or carriers, and
  • carriers have no symptoms, though they can still infect others.

[Figure: SIRC.png (compartment diagram of the model)]

$$ \begin{cases} \frac{dS}{dt} = \nu - (\beta I + \epsilon \beta C)S - \mu S \\ \frac{dI}{dt} = (\beta I + \epsilon \beta C)S - \gamma I - \mu I \\ \frac{dC}{dt} = \gamma q I - \Gamma C - \mu C \\ \frac{dR}{dt} = \gamma (1 - q)I + \Gamma C - \mu R \end{cases} $$

Simulation

Initial values

$N=1000 $
$S=900 $
$I=10 $
$R=C=0 $

Parameters

$\mu = 0.015 $
$\nu = 14 $
$\beta = 0.0002 $
$\epsilon = 1.2 $
$\gamma = 0.05 $
$q = 0.8 $
$\Gamma = 0.03 $

Time scale

$t \in [0, 200]$

Simulation without noise

In [7]:
plot(y,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))
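The simulation cell itself is not included in this export. A minimal sketch of how the trajectories in `y` could be produced (forward Euler with step 0.1; the function and variable names here are my assumptions, not the notebook's):

```python
import numpy as np

# parameters and initial values taken from the report
mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

def sirc_rhs(S, I, C, R):
    """Right-hand side of the SIRC ODE system."""
    inf = (beta * I + eps * beta * C) * S      # new infections caused by I and C
    dS = nu - inf - mu * S
    dI = inf - gamma * I - mu * I
    dC = gamma * q * I - Gam * C - mu * C
    dR = gamma * (1 - q) * I + Gam * C - mu * R
    return np.array([dS, dI, dC, dR])

dt, steps = 0.1, 2000                          # covers t in [0, 200]
y = np.empty((steps + 1, 4))
y[0] = [900, 10, 0, 0]                         # columns: S, I, C, R
for k in range(steps):
    y[k + 1] = y[k] + dt * sirc_rhs(*y[k])     # forward Euler step
print(y.shape)
```

A higher-order solver (e.g. `scipy.integrate.odeint`) would serve equally well; Euler just keeps the sketch dependency-free.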

Simple additive Gaussian white noise

Noise added just after simulation
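The noise cell is omitted from the export as well. Assuming `y` holds the simulated trajectories and reading $N(0, 3)$ with 3 as the standard deviation, the noisy signal is presumably a single independent draw per sample:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.zeros((201, 4))            # stand-in for the simulated trajectories
# one independent N(0, 3) draw per time point and per compartment
y_noisy_simple = y + rng.normal(0.0, 3.0, size=y.shape)
print(y_noisy_simple.shape)
```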

In [10]:
plot(y_noisy_simple,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))

Additive autoregressive Gaussian white noise

$$ \begin{cases} \frac{dS}{dt} = \nu - (\beta I + \epsilon \beta C)S - \mu S + \varepsilon_1 \\ \frac{dI}{dt} = (\beta I + \epsilon \beta C)S - \gamma I - \mu I + \varepsilon_2 \\ \frac{dC}{dt} = \gamma q I - \Gamma C - \mu C + \varepsilon_3 \\ \frac{dR}{dt} = \gamma (1 - q)I + \Gamma C - \mu R + \varepsilon_4 \end{cases} $$

$$\varepsilon_k \sim N(0,3)$$

In [9]:
plot(y_noisy_additive,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))

Multiplicative autoregressive Gaussian white noise

$$ \begin{cases} \frac{dS}{dt} = \nu \varepsilon_1 - (\beta I \varepsilon_2 + \epsilon \beta C \varepsilon_3)S - \mu S \varepsilon_4 \\ \frac{dI}{dt} = (\beta I \varepsilon_2 + \epsilon \beta C \varepsilon_3)S - \gamma I \varepsilon_5 - \mu I \varepsilon_6 \\ \frac{dC}{dt} = \gamma q I \varepsilon_5 - \Gamma C \varepsilon_7 - \mu C \varepsilon_8 \\ \frac{dR}{dt} = \gamma (1 - q)I \varepsilon_5 + \Gamma C \varepsilon_7 - \mu R \varepsilon_9 \end{cases} $$

$$\varepsilon_k \sim N(1,0.5)$$
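A sketch of how such a trajectory could be generated, stepping the noisy system forward with Euler steps of size 1 and reading $N(1, 0.5)$ with 0.5 as the standard deviation (all names here are my assumptions):

```python
import numpy as np

mu, nu, beta, eps_c, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03
rng = np.random.default_rng(1)

state = np.array([900.0, 10.0, 0.0, 0.0])      # S, I, C, R
traj = [state.copy()]
for _ in range(200):
    S, I, C, R = state
    e = rng.normal(1.0, 0.5, size=9)           # eps_1 .. eps_9 ~ N(1, 0.5)
    inf = (beta * I * e[1] + eps_c * beta * C * e[2]) * S
    dS = nu * e[0] - inf - mu * S * e[3]
    dI = inf - gamma * I * e[4] - mu * I * e[5]
    dC = gamma * q * I * e[4] - Gam * C * e[6] - mu * C * e[7]
    dR = gamma * (1 - q) * I * e[4] + Gam * C * e[6] - mu * R * e[8]
    state = state + np.array([dS, dI, dC, dR])
    traj.append(state.copy())
y_noisy_multiplicative = np.array(traj)
print(y_noisy_multiplicative.shape)
```

Note the shared factors: the same $\varepsilon_5$ multiplies $\gamma I$ wherever it appears, and likewise $\varepsilon_7$ for $\Gamma C$, matching the equations above.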

In [8]:
plot(y_noisy_multiplicative,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))

Selection of noise type

There are multiple options for adding noise to the simulation; I chose the multiplicative white noise because it is the most realistic: the magnitude of the fluctuations scales with the terms they perturb.

Case Study

  1. First, I estimate the $\nu$, $\gamma$ and $\mu$ parameters with the least squares method,
  2. then I estimate $\beta$ and $\epsilon$ with the instrumental variables method,
  3. and finally I examine the $\Gamma$ and $q$ parameters through Bayesian estimation.

1. Least squares method

Ordinary Least Squares (OLS)

Linearize equations

First we want to estimate the following parameters: $\nu, \mu, \gamma$

Let's rewrite our equations in discrete time: $$ \begin{cases} S(k) = S(k-1) + \nu - (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \mu S(k-1) \\ I(k) = I(k-1) + (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \gamma I(k-1) - \mu I(k-1) \\ C(k) = C(k-1) + \gamma q I(k-1) - \Gamma C(k-1) - \mu C(k-1) \\ R(k) = R(k-1) + \gamma (1 - q)I(k-1) + \Gamma C(k-1) - \mu R(k-1) \end{cases} $$

Linearization version #1

We can add up the first two equations, and get the following equation: $$S(k) + I(k) = S(k-1) + I(k-1) + \nu - \gamma I(k-1) - \mu (S(k-1) + I(k-1))$$

Let us introduce some auxiliary variables: $$SI_{diff}(k) = S(k) - S(k-1) + I(k) - I(k-1)$$ $$SI_{sum}(k-1) = S(k-1) + I(k-1)$$

That way, we have got a linear equation with the desired parameters: $$SI_{diff}(k) = \nu - \gamma I(k-1) - \mu (SI_{sum}(k-1))$$
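As a noise-free sanity check of this linearization (the code and names below are mine, not the notebook's), OLS on the regressor $[1,\ -I(k-1),\ -SI_{sum}(k-1)]$ recovers $\nu$, $\gamma$ and $\mu$ exactly:

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free discrete-time trajectories (the recursion from the previous section)
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
S, I = np.array(S), np.array(I)

# SI_diff(k) = nu - gamma * I(k-1) - mu * SI_sum(k-1)
SI_diff = (S[1:] - S[:-1]) + (I[1:] - I[:-1])
SI_sum = S[:-1] + I[:-1]
X = np.column_stack([np.ones_like(SI_sum), -I[:-1], -SI_sum])
theta, *_ = np.linalg.lstsq(X, SI_diff, rcond=None)
print(theta)    # ≈ [nu, gamma, mu] = [14, 0.05, 0.015]
```

On the noise-free recursion the fit is exact, which suggests that a poor $\mu$ estimate on noisy data is driven by the noise rather than by the linearization itself.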

In [12]:
table((('ν',nu,theta[0,0]),('$\gamma$',gamma,theta[1,0]),('$\mu$',mu,theta[2,0])))
           Original    Estimated
$\nu$      14.00000    14.79500
$\gamma$    0.05000     0.05300
$\mu$       0.01500    -0.06760

Though the $\nu$ and $\gamma$ estimates are close to the true values, the estimation of $\mu$ was unsuccessful: the estimate is even negative.

Linearization version #2

To overcome that problem, I try another linearization option: I add up all four equations, so that only $\nu$ and $\mu$ remain as parameters. $$\begin{aligned} S(k) + I(k) + R(k) + C(k) = {} & S(k-1) + I(k-1) + R(k-1) + C(k-1) \\ & + \nu - \mu (S(k-1) + I(k-1) + R(k-1) + C(k-1))\end{aligned}$$

Auxiliary variables are the following in that case: $$SI_{diff}(k) = S(k) - S(k-1) + I(k) - I(k-1) + R(k) - R(k-1) + C(k) - C(k-1)$$

$$SI_{sum}(k-1) = S(k-1) + I(k-1) + R(k-1) + C(k-1)$$

And the linearized equation: $$SI_{diff}(k) = \nu - \mu (SI_{sum}(k-1))$$

In [13]:
table((('ν',nu,theta[0,0]),('$\mu$',mu,theta[1,0])))
           Original    Estimated
$\nu$      14.00000    32.82862
$\mu$       0.01500    -0.03368

Unfortunately, this modification did not yield the desired result; it even made the estimates worse. This might be an effect of the autoregressive noise, which is added to each group separately: by summing the groups, I may have amplified it.

Recursive Least Squares

Advantages:

  • online estimation
  • capable of handling changes in parameters
  • computationally less intensive

In order to make convergence faster without losing much accuracy, I decided to increase the forgetting factor ($\lambda$) asymptotically to 1.
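A minimal sketch of the recursive update with a forgetting factor raised asymptotically to 1, run on the noise-free recursion (all names are mine):

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free discrete-time trajectories
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
S, I = np.array(S), np.array(I)

# RLS on SI_diff(k) = nu - gamma * I(k-1) - mu * SI_sum(k-1)
theta = np.zeros(3)                 # estimates of [nu, gamma, mu]
P = 1e6 * np.eye(3)                 # large initial covariance = vague prior
for k in range(1, len(S)):
    x = np.array([1.0, -I[k - 1], -(S[k - 1] + I[k - 1])])
    yk = (S[k] - S[k - 1]) + (I[k] - I[k - 1])
    lam = 1.0 - 0.5 * 0.9 ** k      # forgetting factor rising toward 1
    K = P @ x / (lam + x @ P @ x)   # gain
    theta = theta + K * (yk - x @ theta)
    P = (P - np.outer(K, x) @ P) / lam
print(theta)                        # converges toward [nu, gamma, mu]
```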

In [35]:
table((('ν',nu,theta[0,-1]),('$\gamma$',gamma,theta[1,-1]),('$\mu$',mu,theta[2,-1])))
           Original    Estimated
$\nu$      14.00000    15.14639
$\gamma$    0.05000     0.05422
$\mu$       0.01500    -0.06918
In [36]:
double_ax_plot(Lambda,theta)

Although the estimation of $\mu$ was unsuccessful again, the estimates of $\nu$ and $\gamma$ are fairly accurate, so I will use them in the later estimations.

2. Instrumental Variables (IV)

Now let us examine the number of people getting infected, and estimate $\beta$, $\epsilon$ and $\mu$ parameters.

I picked the equation describing $I$ to estimate the parameters above.

Let's write it in discrete-time form, and also add an error term: $$I(k) - I(k-1) = (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \gamma I(k-1) - \mu I(k-1) + \varepsilon(k)$$

Introducing auxiliary variables $$\beta_2 = \epsilon \beta$$

$$\beta_3 = 1 - \gamma - \mu$$

we get the following form: $$I(k) = \beta_3 I(k-1) + \beta I(k-1)S(k-1) + \beta_2 C(k-1) S(k-1) + \varepsilon(k)$$

This is an ARMAX process, so $\varepsilon$ is not independent of $I$, $S$ and $C$; the OLS estimate is therefore biased, which motivates the instrumental variables method.

In [21]:
print('Covariance between I and the error:\n', np.cov(I_k_1, e))
print('\nCovariance between I*S and the error:\n', np.cov(IS, e))
print('\nCovariance between C*S and the error:\n', np.cov(CS, e))
Covariance between I and the error:
 [[9856.146  -65.996]
 [ -65.996   24.187]]

Covariance between I*S and the error:
 [[1.501e+09 1.233e+03]
 [1.233e+03 2.419e+01]]

Covariance between C*S and the error:
 [[ 7.697e+07 -2.700e+03]
 [-2.700e+03  2.419e+01]]

1st step: LS estimate of theta

In [22]:
table((('β$_3 = 1-\gamma-\mu$',1-gamma-mu,theta_ls[0,0]),
       ('β',beta,theta_ls[1,0]),
       ('β$_2 = \epsilon$ β',epsilon*beta,theta_ls[2,0])),width=400)
                              Original    Estimated
$\beta_3 = 1-\gamma-\mu$      0.93500     0.92037
$\beta$                       0.00020     0.00021
$\beta_2 = \epsilon \beta$    0.00024     0.00036

2nd step: construction of the first instruments

In [23]:
print('Covariance between IV and the error:\n', np.cov(z, e))
values = np.concatenate((reshape(I_k),z,reshape(e)),axis=0)
plot(values,total=False,
     labels=(('r','$I(k)$'),('b','First instrumental variable'),('k','Noise/Error')))
Covariance between IV and the error:
 [[7954.64   -55.364]
 [ -55.364   24.187]]

The covariance matrix and the plot show that the first instrumental variable is largely independent of the noise.

The regressor for the new instrumental variables is: $$\xi^{(1)}_N = \begin{bmatrix}z^{(1)}(0) & z^{(1)}(1) & z^{(1)}(2) & \cdots & z^{(1)}(n-1) \\ u_1(0) & u_1(1) & u_1(2) & \cdots & u_1(n-1) \\ u_2(0) & u_2(1) & u_2(2) & \cdots & u_2(n-1) \end{bmatrix}$$

Then we calculate the first IV estimation for the parameters: $$\hat{\theta}_N^{(2)} = \hat{\theta}_N^{IV} = (\xi^{(1)}_N X^T)^{-1} (\xi^{(1)}_N Y^T)$$
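On noise-free data the two steps can be sketched as follows (LS and IV then coincide and both recover the true parameters; the instruments are generated by simulating the model with the first-stage LS estimate, and all names are mine):

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free trajectories of the discrete-time model (the "measurements")
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
S, I, C = np.array(S), np.array(I), np.array(C)

# regression I(k) = b3*I(k-1) + b*I(k-1)S(k-1) + b2*C(k-1)S(k-1)
X = np.column_stack([I[:-1], I[:-1] * S[:-1], C[:-1] * S[:-1]])
Y = I[1:]

# 1st step: ordinary least squares
theta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)

# 2nd step: instruments z = model output simulated with the LS estimate
z = [I[0]]
for k in range(200):
    z.append(theta_ls[0] * z[-1] + theta_ls[1] * z[-1] * S[k]
             + theta_ls[2] * C[k] * S[k])
z = np.array(z)
xi = np.vstack([z[:-1], z[:-1] * S[:-1], C[:-1] * S[:-1]])   # 3 x n instruments
theta_iv = np.linalg.solve(xi @ X, xi @ Y)
print(theta_iv)   # ≈ [1 - gamma - mu, beta, eps * beta]
```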

In [27]:
table((('β$_3 = 1-\gamma-\mu$',1-gamma-mu,theta_iv_2[0,0]),
       ('β',beta,theta_iv_2[1,0]),
       ('β$_2 = \epsilon$ β',epsilon*beta,theta_iv_2[2,0])),width=400)
                              Original    Estimated
$\beta_3 = 1-\gamma-\mu$      0.93500     0.91949
$\beta$                       0.00020     0.00021
$\beta_2 = \epsilon \beta$    0.00024     0.00036

The result of the IV method is almost the same as that of the LS method. Nevertheless, these results are reasonably accurate, so I can use them in the further tasks.

3. Bayesian estimation

Now I would like to estimate $\Gamma$ and $q$ parameters. To do that I will use the third equation from the equation system: $$\frac{dC}{dt} = \gamma q I − \Gamma C − \mu C$$

And in discrete-time form: $$C(k) - C(k-1) = \gamma q I(k-1) - \Gamma C(k-1) - \mu C(k-1)$$

I already have a reasonable estimate for $\mu$, so I exclude it from the investigation: $$C(k) - C(k-1) + \mu C(k-1) = \gamma q I(k-1) - \Gamma C(k-1)$$

Introducing auxiliary variables: $$\theta = \begin{bmatrix}\gamma q \\ \Gamma\end{bmatrix}$$

$$y(k) = C(k) - C(k-1) + \mu C(k-1)$$

$$u_1(k) = \gamma I(k)$$

$$u_2(k) = C(k)$$

Arbitrary values for $\theta_{initial}$: $\begin{bmatrix} 0.3 \\ 0.3 \end{bmatrix}$

$\sigma_e = 0.3$

$\Sigma = \begin{bmatrix}\sigma_e & 0 \\ 0 & \sigma_e\end{bmatrix}$
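For reference, here is a noise-free grid evaluation of the posterior mode over $\theta = [\gamma q,\ \Gamma]$ (flat prior, Gaussian likelihood with the $\sigma_e$ above; the code and names are mine):

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free discrete-time trajectories
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
I, C = np.array(I), np.array(C)

# y(k) = C(k) - C(k-1) + mu*C(k-1), which should equal
# gamma*q*I(k-1) - Gamma*C(k-1) for the true parameters
y = C[1:] - C[:-1] + mu * C[:-1]

# grid over theta1 = gamma*q and theta2 = Gamma; with a flat prior the
# log-posterior is just the Gaussian log-likelihood with sigma_e = 0.3
t1 = np.linspace(0, 0.1, 101)
t2 = np.linspace(0, 0.1, 101)
sigma_e = 0.3
pred = t1[:, None, None] * I[:-1] - t2[None, :, None] * C[:-1]
log_post = -((y - pred) ** 2).sum(axis=-1) / (2 * sigma_e ** 2)
i1, i2 = np.unravel_index(np.argmax(log_post), log_post.shape)
print(t1[i1], t2[i2])   # ≈ gamma*q = 0.04 and Gamma = 0.03
```

On noise-free data the mode lands on the true $(\gamma q, \Gamma)$, which can help decide whether a failing estimate comes from the code or from the linearized model.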

In [38]:
max_Gamma, max_q = np.unravel_index(np.argmax(posterior_density, axis=None), posterior_density.shape)
table((('$q$',q,theta_1[max_q]/gamma_est),('$\gamma$',gamma,theta_2[max_Gamma])))
           Original    Estimated
$q$        0.80000     0.05263
$\gamma$   0.05000     0.00000
In [39]:
plot_3D(theta_1/gamma_est,theta_2,posterior_density)

Obviously something went wrong here, but I cannot find the reason for it, so I ran a simple LS estimation to figure out whether the code or the mathematical model is wrong.

In [32]:
Y = np.matrix(dC_muC)
X = np.matrix(np.vstack((gammaI,C)))
theta = inv(X*X.T) * (X*Y.T)
print(theta)
[[-0.022]
 [ 0.064]]

It also gave a false result, so I most likely made a mistake in the linearization.