2 SIRC Epidemiological model

Hakkel Tamás - 2018.06.16.

The SIRC (Susceptible-Infected-Recovered-Carrier) epidemiological model has four state variables:

  • $S$ denotes the number of susceptible,
  • $I$ the number of infected,
  • $R$ the number of recovered, and
  • $C$ the number of carrier individuals.

The model expresses that

  • susceptible individuals might become infected by either infected or carrier individuals,
  • infected individuals are infectious for a while, then they become either recovered or carriers, and
  • carriers have no symptoms, though they can still infect others.

[Figure: SIRC.png (compartment diagram of the model)]

$$ \begin{cases} \frac{dS}{dt} = \nu - (\beta I + \epsilon \beta C)S - \mu S \\ \frac{dI}{dt} = (\beta I + \epsilon \beta C)S - \gamma I - \mu I \\ \frac{dC}{dt} = \gamma q I - \Gamma C - \mu C \\ \frac{dR}{dt} = \gamma (1 - q)I + \Gamma C - \mu R \end{cases} $$

Simulation

Initial values

$N=1000 $
$S=900 $
$I=10 $
$R=C=0 $

Parameters

$\mu = 0.015 $
$\nu = 14 $
$\beta = 0.0002 $
$\epsilon = 1.2 $
$\gamma = 0.05 $
$q = 0.8 $
$\Gamma = 0.03 $

Time scale

$t \in [0, 200]$

Simulation without noise

In [7]:
plot(y,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))
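The simulation cell itself is not included in this export. A minimal sketch of how the trajectories in `y` could be produced (forward Euler with step 0.1; the function and variable names here are my assumptions, not the notebook's):

```python
import numpy as np

# parameters and initial values taken from the report
mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

def sirc_rhs(S, I, C, R):
    """Right-hand side of the SIRC ODE system."""
    inf = (beta * I + eps * beta * C) * S      # new infections caused by I and C
    dS = nu - inf - mu * S
    dI = inf - gamma * I - mu * I
    dC = gamma * q * I - Gam * C - mu * C
    dR = gamma * (1 - q) * I + Gam * C - mu * R
    return np.array([dS, dI, dC, dR])

dt, steps = 0.1, 2000                          # covers t in [0, 200]
y = np.empty((steps + 1, 4))
y[0] = [900, 10, 0, 0]                         # columns: S, I, C, R
for k in range(steps):
    y[k + 1] = y[k] + dt * sirc_rhs(*y[k])     # forward Euler step
print(y.shape)
```

A higher-order solver (e.g. `scipy.integrate.odeint`) would serve equally well; Euler just keeps the sketch dependency-free.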

Simple additive Gaussian white noise

Noise added just after simulation
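The noise cell is omitted from the export as well. Assuming `y` holds the simulated trajectories and reading $N(0, 3)$ with 3 as the standard deviation, the noisy signal is presumably a single independent draw per sample:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.zeros((201, 4))            # stand-in for the simulated trajectories
# one independent N(0, 3) draw per time point and per compartment
y_noisy_simple = y + rng.normal(0.0, 3.0, size=y.shape)
print(y_noisy_simple.shape)
```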

In [10]:
plot(y_noisy_simple,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))

Additive autoregressive Gaussian white noise

$$ \begin{cases} \frac{dS}{dt} = \nu - (\beta I + \epsilon \beta C)S - \mu S + \varepsilon_1 \\ \frac{dI}{dt} = (\beta I + \epsilon \beta C)S - \gamma I - \mu I + \varepsilon_2 \\ \frac{dC}{dt} = \gamma q I - \Gamma C - \mu C + \varepsilon_3 \\ \frac{dR}{dt} = \gamma (1 - q)I + \Gamma C - \mu R + \varepsilon_4 \end{cases} $$

$$\varepsilon_k \sim N(0,3)$$

In [9]:
plot(y_noisy_additive,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))

Multiplicative autoregressive Gaussian white noise

$$ \begin{cases} \frac{dS}{dt} = \nu \varepsilon_1 - (\beta I \varepsilon_2 + \epsilon \beta C \varepsilon_3)S - \mu S \varepsilon_4 \\ \frac{dI}{dt} = (\beta I \varepsilon_2 + \epsilon \beta C \varepsilon_3)S - \gamma I \varepsilon_5 - \mu I \varepsilon_6 \\ \frac{dC}{dt} = \gamma q I \varepsilon_5 - \Gamma C \varepsilon_7 - \mu C \varepsilon_8 \\ \frac{dR}{dt} = \gamma (1 - q)I \varepsilon_5 + \Gamma C \varepsilon_7 - \mu R \varepsilon_9 \end{cases} $$

$$\varepsilon_k \sim N(1,0.5)$$
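A sketch of how such a trajectory could be generated, stepping the noisy system forward with Euler steps of size 1 and reading $N(1, 0.5)$ with 0.5 as the standard deviation (all names here are my assumptions):

```python
import numpy as np

mu, nu, beta, eps_c, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03
rng = np.random.default_rng(1)

state = np.array([900.0, 10.0, 0.0, 0.0])      # S, I, C, R
traj = [state.copy()]
for _ in range(200):
    S, I, C, R = state
    e = rng.normal(1.0, 0.5, size=9)           # eps_1 .. eps_9 ~ N(1, 0.5)
    inf = (beta * I * e[1] + eps_c * beta * C * e[2]) * S
    dS = nu * e[0] - inf - mu * S * e[3]
    dI = inf - gamma * I * e[4] - mu * I * e[5]
    dC = gamma * q * I * e[4] - Gam * C * e[6] - mu * C * e[7]
    dR = gamma * (1 - q) * I * e[4] + Gam * C * e[6] - mu * R * e[8]
    state = state + np.array([dS, dI, dC, dR])
    traj.append(state.copy())
y_noisy_multiplicative = np.array(traj)
print(y_noisy_multiplicative.shape)
```

Note the shared factors: the same $\varepsilon_5$ multiplies $\gamma I$ wherever it appears, and likewise $\varepsilon_7$ for $\Gamma C$, matching the equations above.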

In [8]:
plot(y_noisy_multiplicative,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))

Selection of noise type

There are multiple options for adding noise to the simulation; I chose the multiplicative white noise because it is the most realistic: the magnitude of the fluctuations scales with the terms they perturb.

Case Study

  1. First, I estimate the $\nu$, $\gamma$ and $\mu$ parameters with the least squares method,
  2. then I estimate $\beta$ and $\epsilon$ with the instrumental variables method,
  3. and finally I examine the $\Gamma$ and $q$ parameters through Bayesian estimation.

1. Least squares method

Ordinary Least Squares (OLS)

Linearize equations

First we want to estimate the following parameters: $\nu, \mu, \gamma$

Let's rewrite our equations in discrete time: $$ \begin{cases} S(k) = S(k-1) + \nu - (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \mu S(k-1) \\ I(k) = I(k-1) + (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \gamma I(k-1) - \mu I(k-1) \\ C(k) = C(k-1) + \gamma q I(k-1) - \Gamma C(k-1) - \mu C(k-1) \\ R(k) = R(k-1) + \gamma (1 - q)I(k-1) + \Gamma C(k-1) - \mu R(k-1) \end{cases} $$

Linearization version #1

We can add up the first two equations, and get the following equation: $$S(k) + I(k) = S(k-1) + I(k-1) + \nu - \gamma I(k-1) - \mu (S(k-1) + I(k-1))$$

Let us introduce some auxiliary variables: $$SI_{diff}(k) = S(k) - S(k-1) + I(k) - I(k-1)$$ $$SI_{sum}(k-1) = S(k-1) + I(k-1)$$

That way, we have got a linear equation with the desired parameters: $$SI_{diff}(k) = \nu - \gamma I(k-1) - \mu (SI_{sum}(k-1))$$
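As a noise-free sanity check of this linearization (the code and names below are mine, not the notebook's), OLS on the regressor $[1,\ -I(k-1),\ -SI_{sum}(k-1)]$ recovers $\nu$, $\gamma$ and $\mu$ exactly:

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free discrete-time trajectories (the recursion from the previous section)
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
S, I = np.array(S), np.array(I)

# SI_diff(k) = nu - gamma * I(k-1) - mu * SI_sum(k-1)
SI_diff = (S[1:] - S[:-1]) + (I[1:] - I[:-1])
SI_sum = S[:-1] + I[:-1]
X = np.column_stack([np.ones_like(SI_sum), -I[:-1], -SI_sum])
theta, *_ = np.linalg.lstsq(X, SI_diff, rcond=None)
print(theta)    # ≈ [nu, gamma, mu] = [14, 0.05, 0.015]
```

On the noise-free recursion the fit is exact, which suggests that a poor $\mu$ estimate on noisy data is driven by the noise rather than by the linearization itself.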

In [12]:
table((('ν',nu,theta[0,0]),('$\gamma$',gamma,theta[1,0]),('$\mu$',mu,theta[2,0])))
           Original    Estimated
$\nu$      14.00000    14.79500
$\gamma$    0.05000     0.05300
$\mu$       0.01500    -0.06760

Though the $\nu$ and $\gamma$ estimates are close to the true values, the estimation of $\mu$ was unsuccessful: the estimate is even negative.

Linearization version #2

To overcome that problem, I try another linearization option: I add up all four equations, so that only $\nu$ and $\mu$ remain as parameters. $$\begin{aligned} S(k) + I(k) + R(k) + C(k) = {} & S(k-1) + I(k-1) + R(k-1) + C(k-1) \\ & + \nu - \mu (S(k-1) + I(k-1) + R(k-1) + C(k-1))\end{aligned}$$

Auxiliary variables are the following in that case: $$SI_{diff}(k) = S(k) - S(k-1) + I(k) - I(k-1) + R(k) - R(k-1) + C(k) - C(k-1)$$

$$SI_{sum}(k-1) = S(k-1) + I(k-1) + R(k-1) + C(k-1)$$

And the linearized equation: $$SI_{diff}(k) = \nu - \mu (SI_{sum}(k-1))$$

In [13]:
table((('ν',nu,theta[0,0]),('$\mu$',mu,theta[1,0])))
           Original    Estimated
$\nu$      14.00000    32.82862
$\mu$       0.01500    -0.03368

Unfortunately, this modification did not yield the desired result; it even made the estimates worse. This might be an effect of the autoregressive noise, which is added to each group separately: by summing the groups, I may have amplified it.

Recursive Least Squares

Advantages:

  • online estimation
  • capable of handling changes in parameters
  • computationally less intensive

In order to make convergence faster without losing much accuracy, I decided to increase the forgetting factor ($\lambda$) asymptotically to 1.
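A minimal sketch of the recursive update with a forgetting factor raised asymptotically to 1, run on the noise-free recursion (all names are mine):

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free discrete-time trajectories
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
S, I = np.array(S), np.array(I)

# RLS on SI_diff(k) = nu - gamma * I(k-1) - mu * SI_sum(k-1)
theta = np.zeros(3)                 # estimates of [nu, gamma, mu]
P = 1e6 * np.eye(3)                 # large initial covariance = vague prior
for k in range(1, len(S)):
    x = np.array([1.0, -I[k - 1], -(S[k - 1] + I[k - 1])])
    yk = (S[k] - S[k - 1]) + (I[k] - I[k - 1])
    lam = 1.0 - 0.5 * 0.9 ** k      # forgetting factor rising toward 1
    K = P @ x / (lam + x @ P @ x)   # gain
    theta = theta + K * (yk - x @ theta)
    P = (P - np.outer(K, x) @ P) / lam
print(theta)                        # converges toward [nu, gamma, mu]
```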

In [35]:
table((('ν',nu,theta[0,-1]),('$\gamma$',gamma,theta[1,-1]),('$\mu$',mu,theta[2,-1])))
           Original    Estimated
$\nu$      14.00000    15.14639
$\gamma$    0.05000     0.05422
$\mu$       0.01500    -0.06918
In [36]:
double_ax_plot(Lambda,theta)

Although the estimation of $\mu$ was unsuccessful again, the estimates of $\nu$ and $\gamma$ are fairly accurate, so I will use them in the later estimations.

2. Instrumental Variables (IV)

Now let us examine the number of people getting infected, and estimate $\beta$, $\epsilon$ and $\mu$ parameters.

I picked the equation describing $I$ to estimate the parameters above.

Let's write it in discrete-time form, and also add an error term: $$I(k) - I(k-1) = (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \gamma I(k-1) - \mu I(k-1) + \varepsilon(k)$$

Introducing auxiliary variables $$\beta_2 = \epsilon \beta$$

$$\beta_3 = 1 - \gamma - \mu$$

we get the following form: $$I(k) = \beta_3 I(k-1) + \beta I(k-1)S(k-1) + \beta_2 C(k-1) S(k-1) + \varepsilon(k)$$

This is an ARMAX process, so $\varepsilon$ is not independent of $I$, $S$ and $C$; the OLS estimate is therefore biased, which motivates the instrumental variables method.

In [21]:
print('Covariance between I and the error:\n', np.cov(I_k_1, e))
print('\nCovariance between I*S and the error:\n', np.cov(IS, e))
print('\nCovariance between C*S and the error:\n', np.cov(CS, e))
Covariance between I and the error:
 [[9856.146  -65.996]
 [ -65.996   24.187]]

Covariance between I*S and the error:
 [[1.501e+09 1.233e+03]
 [1.233e+03 2.419e+01]]

Covariance between C*S and the error:
 [[ 7.697e+07 -2.700e+03]
 [-2.700e+03  2.419e+01]]

1st step: LS estimate of theta

In [22]:
table((('β$_3 = 1-\gamma-\mu$',1-gamma-mu,theta_ls[0,0]),
       ('β',beta,theta_ls[1,0]),
       ('β$_2 = \epsilon$ β',epsilon*beta,theta_ls[2,0])),width=400)
                              Original    Estimated
$\beta_3 = 1-\gamma-\mu$      0.93500     0.92037
$\beta$                       0.00020     0.00021
$\beta_2 = \epsilon \beta$    0.00024     0.00036

2nd step: construction of the first instruments

In [23]:
print('Covariance between IV and the error:\n', np.cov(z, e))
values = np.concatenate((reshape(I_k),z,reshape(e)),axis=0)
plot(values,total=False,
     labels=(('r','$I(k)$'),('b','First instrumental variable'),('k','Noise/Error')))
Covariance between IV and the error:
 [[7954.64   -55.364]
 [ -55.364   24.187]]

The covariance matrix and the plot show that the first instrumental variable is largely independent of the noise.

The regressor for the new instrumental variables is: $$\xi^{(1)}_N = \begin{bmatrix}z^{(1)}(0) & z^{(1)}(1) & z^{(1)}(2) & \cdots & z^{(1)}(n-1) \\ u_1(0) & u_1(1) & u_1(2) & \cdots & u_1(n-1) \\ u_2(0) & u_2(1) & u_2(2) & \cdots & u_2(n-1) \end{bmatrix}$$

Then we calculate the first IV estimation for the parameters: $$\hat{\theta}_N^{(2)} = \hat{\theta}_N^{IV} = (\xi^{(1)}_N X^T)^{-1} (\xi^{(1)}_N Y^T)$$
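On noise-free data the two steps can be sketched as follows (LS and IV then coincide and both recover the true parameters; the instruments are generated by simulating the model with the first-stage LS estimate, and all names are mine):

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free trajectories of the discrete-time model (the "measurements")
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
S, I, C = np.array(S), np.array(I), np.array(C)

# regression I(k) = b3*I(k-1) + b*I(k-1)S(k-1) + b2*C(k-1)S(k-1)
X = np.column_stack([I[:-1], I[:-1] * S[:-1], C[:-1] * S[:-1]])
Y = I[1:]

# 1st step: ordinary least squares
theta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)

# 2nd step: instruments z = model output simulated with the LS estimate
z = [I[0]]
for k in range(200):
    z.append(theta_ls[0] * z[-1] + theta_ls[1] * z[-1] * S[k]
             + theta_ls[2] * C[k] * S[k])
z = np.array(z)
xi = np.vstack([z[:-1], z[:-1] * S[:-1], C[:-1] * S[:-1]])   # 3 x n instruments
theta_iv = np.linalg.solve(xi @ X, xi @ Y)
print(theta_iv)   # ≈ [1 - gamma - mu, beta, eps * beta]
```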

In [27]:
table((('β$_3 = 1-\gamma-\mu$',1-gamma-mu,theta_iv_2[0,0]),
       ('β',beta,theta_iv_2[1,0]),
       ('β$_2 = \epsilon$ β',epsilon*beta,theta_iv_2[2,0])),width=400)
                              Original    Estimated
$\beta_3 = 1-\gamma-\mu$      0.93500     0.91949
$\beta$                       0.00020     0.00021
$\beta_2 = \epsilon \beta$    0.00024     0.00036

The result of the IV method is almost the same as that of the LS method. Nevertheless, these results are reasonably accurate, so I can use them in the further tasks.

3. Bayesian estimation

Now I would like to estimate $\Gamma$ and $q$ parameters. To do that I will use the third equation from the equation system: $$\frac{dC}{dt} = \gamma q I − \Gamma C − \mu C$$

And in discrete-time form: $$C(k) - C(k-1) = \gamma q I(k-1) - \Gamma C(k-1) - \mu C(k-1)$$

I already have a reasonable estimate for $\mu$, so I exclude it from the investigation: $$C(k) - C(k-1) + \mu C(k-1) = \gamma q I(k-1) - \Gamma C(k-1)$$

Introducing auxiliary variables: $$\theta = \begin{bmatrix}\gamma q \\ \Gamma\end{bmatrix}$$

$$y(k) = C(k) - C(k-1) + \mu C(k-1)$$

$$u_1(k) = \gamma I(k)$$

$$u_2(k) = C(k)$$

Arbitrary values for $\theta_{initial}$: $\begin{bmatrix} 0.3 \\ 0.3 \end{bmatrix}$

$\sigma_e = 0.3$

$\Sigma = \begin{bmatrix}\sigma_e & 0 \\ 0 & \sigma_e\end{bmatrix}$
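For reference, here is a noise-free grid evaluation of the posterior mode over $\theta = [\gamma q,\ \Gamma]$ (flat prior, Gaussian likelihood with the $\sigma_e$ above; the code and names are mine):

```python
import numpy as np

mu, nu, beta, eps, gamma, q, Gam = 0.015, 14, 0.0002, 1.2, 0.05, 0.8, 0.03

# noise-free discrete-time trajectories
S, I, C, R = [900.0], [10.0], [0.0], [0.0]
for _ in range(200):
    s, i, c, r = S[-1], I[-1], C[-1], R[-1]
    inf = (beta * i + eps * beta * c) * s
    S.append(s + nu - inf - mu * s)
    I.append(i + inf - (gamma + mu) * i)
    C.append(c + gamma * q * i - (Gam + mu) * c)
    R.append(r + gamma * (1 - q) * i + Gam * c - mu * r)
I, C = np.array(I), np.array(C)

# y(k) = C(k) - C(k-1) + mu*C(k-1), which should equal
# gamma*q*I(k-1) - Gamma*C(k-1) for the true parameters
y = C[1:] - C[:-1] + mu * C[:-1]

# grid over theta1 = gamma*q and theta2 = Gamma; with a flat prior the
# log-posterior is just the Gaussian log-likelihood with sigma_e = 0.3
t1 = np.linspace(0, 0.1, 101)
t2 = np.linspace(0, 0.1, 101)
sigma_e = 0.3
pred = t1[:, None, None] * I[:-1] - t2[None, :, None] * C[:-1]
log_post = -((y - pred) ** 2).sum(axis=-1) / (2 * sigma_e ** 2)
i1, i2 = np.unravel_index(np.argmax(log_post), log_post.shape)
print(t1[i1], t2[i2])   # ≈ gamma*q = 0.04 and Gamma = 0.03
```

On noise-free data the mode lands on the true $(\gamma q, \Gamma)$, which can help decide whether a failing estimate comes from the code or from the linearized model.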

In [38]:
max_Gamma, max_q = np.unravel_index(np.argmax(posterior_density, axis=None), posterior_density.shape)
table((('$q$',q,theta_1[max_q]/gamma_est),('$\gamma$',gamma,theta_2[max_Gamma])))
           Original    Estimated
$q$        0.80000     0.05263
$\gamma$   0.05000     0.00000
In [39]:
plot_3D(theta_1/gamma_est,theta_2,posterior_density)

Obviously something went wrong here, but I cannot find the reason for it, so I ran a simple LS estimation to figure out whether the code or the mathematical model is wrong.

In [32]:
Y = np.matrix(dC_muC)
X = np.matrix(np.vstack((gammaI,C)))
theta = inv(X*X.T) * (X*Y.T)
print(theta)
[[-0.022]
 [ 0.064]]

It also gave a false result, so I most likely made a mistake in the linearization.