Hakkel Tamás - 2018.06.16.
The SIRC (Susceptible-Infected-Recovered-Carrier) epidemiological model has four state variables:
The model expresses that

$$ \begin{cases} \frac{dS}{dt} = \nu - (\beta I + \epsilon \beta C)S - \mu S \\ \frac{dI}{dt} = (\beta I + \epsilon \beta C)S - \gamma I - \mu I \\ \frac{dC}{dt} = \gamma q I − \Gamma C − \mu C \\ \frac{dR}{dt} = \gamma (1 − q)I + \Gamma C − \mu R \end{cases} $$
plot(y,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))
Noise added just after simulation
plot(y_noisy_simple,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))
$$ \begin{cases} \frac{dS}{dt} = \nu - (\beta I + \epsilon \beta C \epsilon_3)S - \mu S + \epsilon_1 \\ \frac{dI}{dt} = (\beta I \epsilon_2 + \epsilon \beta C \epsilon_3)S - \gamma I \epsilon_5 - \mu I + \epsilon_2 \\ \frac{dC}{dt} = \gamma q I \epsilon_5 − \Gamma C \epsilon_7 − \mu C + \epsilon_3 \\ \frac{dR}{dt} = \gamma (1 − q)I \epsilon_5 + \Gamma C \epsilon_7 − \mu R + \epsilon_4 \end{cases} $$
$$\varepsilon_k \sim N(0,3)$$
plot(y_noisy_additive,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))
$$ \begin{cases} \frac{dS}{dt} = \nu \varepsilon_1 - (\beta I \varepsilon_2 + \epsilon \beta C \varepsilon_3)S - \mu S \varepsilon_4 \\ \frac{dI}{dt} = (\beta I \varepsilon_2 + \epsilon \beta C \varepsilon_3)S - \gamma I \varepsilon_5 - \mu I \varepsilon_6 \\ \frac{dC}{dt} = \gamma q I \varepsilon_5 − \Gamma C \varepsilon_7 − \mu C \varepsilon_8 \\ \frac{dR}{dt} = \gamma (1 − q)I \varepsilon_5 + \Gamma C \varepsilon_7 − \mu R \varepsilon_9 \end{cases} $$
$$\varepsilon_k \sim N(1,0.5)$$
plot(y_noisy_multiplicative,(('b','Susceptible'),('r','Infected'),('g','Recovered with immunity'),('k','Carriers')))
There are multiple options to add noise to the simulation, but I have chosen to use multiplicative white noise because it is the most realistic.
First we want to estimate the following parameters: $\nu, \mu, \gamma$
Let's rewrite our equations to discrete time: $$ \begin{cases} S(k) = S(k-1) + \nu - (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \mu S(k-1) \\ I(k) = I(k-1) + (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \gamma I(k-1) - \mu I(k-1) \\ C(k) = C(k-1) + \gamma q I(k-1) − \Gamma C(k-1) − \mu C(k-1) \\ R(k) = R(k-1) \gamma (1 − q)I(k-1) + \Gamma C(k-1) − \mu R(k-1) \end{cases} $$
We can add up the first two equations, and get the following equation: $$S(k) + I(k) = S(k-1) + I(k-1) + \nu - \gamma I(k-1) - \mu (S(k-1) + I(k-1))$$
Let us introduce some auxiliary variables: $$SI_{diff}(k) = S(k) - S(k-1) + I(k) - I(k-1)$$ $$SI_{sum}(k-1) = S(k-1) + I(k-1)$$
That way, we have got a linear equation with the desired parameters: $$SI_{diff}(k) = \nu - \gamma I(k-1) - \mu (SI_{sum}(k-1))$$
table((('ν',nu,theta[0,0]),('$\gamma$',gamma,theta[1,0]),('$\mu$',mu,theta[2,0])))
| Original | Estimated | |
|---|---|---|
| ν | 14.00000 | 14.79500 |
| $\gamma$ | 0.05000 | 0.05300 |
| $\mu$ | 0.01500 | -0.06760 |
Though $\nu$ and $\gamma$ values are correct, estimation of $\mu$ was unsuccessful.
To overcome that problem, I try another linearization option: I add up all four equations, to get only $\nu$ and $\mu$ as parameters. $$\begin{aligned} S(k) + I(k) + R(k) + C(k) = {} & S(k-1) + I(k-1) + R(k-1) + C(k-1) \\ & + \nu - \mu (S(k-1) + I(k-1) + R(k-1) + C(k-1))\end{aligned}$$
Auxiliary variables are the following in that case: $$SI_{diff}(k) = S(k) - S(k-1) + I(k) - I(k-1) + R(k) - R(k-1) + C(k) - C(k-1)$$
$$SI_{sum}(k-1) = S(k-1) + I(k-1) + R(k-1) + C(k-1)$$
And the linearized equation: $$SI_{diff}(k) = \nu - \mu (SI_{sum}(k-1))$$
table((('ν',nu,theta[0,0]),('$\mu$',mu,theta[1,0])))
| Original | Estimated | |
|---|---|---|
| ν | 14.00000 | 32.82862 |
| $\mu$ | 0.01500 | -0.03368 |
Unfortunately that modification did not yielded the desired output, moreover it even made results worse. That might be the effect of the autoregressive noise added separately to groups, so when I added these groups together, I might have amplified that noise.
Advantages:
In order to make conversation faster without losing much of accuracy, I decided to increase forgetting parameter($\lambda$) asymptotically to 1.
table((('ν',nu,theta[0,-1]),('$\gamma$',gamma,theta[1,-1]),('$\mu$',mu,theta[2,-1])))
| Original | Estimated | |
|---|---|---|
| ν | 14.00000 | 15.14639 |
| $\gamma$ | 0.05000 | 0.05422 |
| $\mu$ | 0.01500 | -0.06918 |
double_ax_plot(Lambda,theta)
Although, the estimation of the parameter $\mu$ was unsuccessful, estimations for $\nu$ and $\gamma$ values are fairly accurate, so I will use them in later estimations.
Now let us examine the number of people getting infected, and estimate $\beta$, $\epsilon$ and $\mu$ parameters.
I picked the equation describing $I$ to estimate the parameters above.
Let's write it to discrete time form, and also add error term: $$I(k) - I(k-1) = (\beta I(k-1) + \epsilon \beta C(k-1))S(k-1) - \gamma I(k-1) - \mu I(k-1) + \varepsilon(k)$$
Introducing auxiliary variables $$\beta_2 = \epsilon \beta$$
$$\beta_3 = 1 - \gamma - \mu$$
we get the following form: $$I(k) = \beta_3 I(k-1) + \beta I(k-1)S(k-1) + \beta_2 C(k-1) S(k-1) + \varepsilon(k)$$
ARMAX process $\rightarrow$ $\varepsilon$ is not independent of $I$, $S$ and $C$.
print('Covariance between I and the error:\n', np.cov(I_k_1, e))
print('\nCovariance between I*S and the error:\n', np.cov(IS, e))
print('\nCovariance between C*S and the error:\n', np.cov(CS, e))
Covariance between I and the error: [[9856.146 -65.996] [ -65.996 24.187]] Covariance between I*S and the error: [[1.501e+09 1.233e+03] [1.233e+03 2.419e+01]] Covariance between C*S and the error: [[ 7.697e+07 -2.700e+03] [-2.700e+03 2.419e+01]]
table((('β$_3 = 1-\gamma-\mu$',1-gamma-mu,theta_ls[0,0]),
('β',beta,theta_ls[1,0]),
('β$_2 = \epsilon$ β',epsilon*beta,theta_ls[2,0])),width=400)
| Original | Estimated | |
|---|---|---|
| β$_3 = 1-\gamma-\mu$ | 0.93500 | 0.92037 |
| β | 0.00020 | 0.00021 |
| β$_2 = \epsilon$ β | 0.00024 | 0.00036 |
print('Covariance between IV and the error:\n', np.cov(z, e))
values = np.concatenate((reshape(I_k),z,reshape(e)),axis=0)
plot(values,total=False,
labels=(('r','$I(k)$'),('b','First instrumental variable'),('k','Noise/Error')))
Covariance between IV and the error: [[7954.64 -55.364] [ -55.364 24.187]]
The covariance matrix and the plot shows that first instrumental variable is mostly independent from noise.
The regressor for the new instrumental variables is: $$\xi^{(1)}_N = \begin{bmatrix}z^{(1)}(0) && z^{(1)}(1) && z^{(1)}(2) && ... && z^{(1)}(n-1) \\ u_1(0) && u_1(1) && u_1(2) && ... && u_1(n-1) \\ u_2(0) && u_2(1) && u_2(2) && ... && u_2(n-1) \end{bmatrix}$$
Then we calculate the first IV estimation for the parameters: $$\hat{\theta}_N^{(2)} = \hat{\theta}_N^{IV} = (\xi^{(1)}_N * X^T)^{-1} * (\xi^{(1)}_N * Y^T)$$
table((('β$_3 = 1-\gamma-\mu$',1-gamma-mu,theta_iv_2[0,0]),
('β',beta,theta_iv_2[1,0]),
('β$_2 = \epsilon$ β',epsilon*beta,theta_iv_2[2,0])),width=400)
| Original | Estimated | |
|---|---|---|
| β$_3 = 1-\gamma-\mu$ | 0.93500 | 0.91949 |
| β | 0.00020 | 0.00021 |
| β$_2 = \epsilon$ β | 0.00024 | 0.00036 |
The result of IV method is almost the same as the LS method. However, these result are reasonably accurate, so I can use them in the further tasks.
Now I would like to estimate $\Gamma$ and $q$ parameters. To do that I will use the third equation from the equation system: $$\frac{dC}{dt} = \gamma q I − \Gamma C − \mu C$$
And in dicrete time form: $$C(k) - C(k-1) = \gamma q I(k-1) − \Gamma C(k-1) − \mu C(k-1)$$
I already have a reasonable estimate for $\mu$, so I exclude it from investigation: $$C(k) - C(k-1) + \mu C(k-1) = \gamma q I(k-1) − \Gamma C(k-1)$$
Intoducing auxiliary variables: $$\theta: \begin{bmatrix}\gamma q \\ \Gamma\end{bmatrix}$$
$$y(k) = C(k) - C(k-1) + \mu C(k-1)$$
$$u_1(k) = \gamma I(k)$$
$$u_2(k) = C(k)$$
Arbitrary values for $\theta_{initial}$: $\begin{bmatrix} 0.3 \\ 0.3 \end{bmatrix}$
$\sigma_e = 0.3$
$\Sigma = \begin{bmatrix}\sigma_e && 0 \\ 0 && \sigma_e\end{bmatrix}$
max_Gamma, max_q = np.unravel_index(np.argmax(posterior_density, axis=None), posterior_density.shape)
table((('$q$',q,theta_1[max_q]/gamma_est),('$\gamma$',gamma,theta_2[max_Gamma])))
| Original | Estimated | |
|---|---|---|
| $q$ | 0.80000 | 0.05263 |
| $\gamma$ | 0.05000 | 0.00000 |
plot_3D(theta_1/gamma_est,theta_2,posterior_density)
Obviously, something went wrong here, but I cannot find the reason of it, so I run a simple LS estimation to figure out whether the code or the mathematical model is wrong.
Y = np.matrix(dC_muC)
X = np.matrix(np.vstack((gammaI,C)))
theta = inv(X*X.T) * (X*Y.T)
print(theta)
[[-0.022] [ 0.064]]
It also gave false result, so I certainly made some mistake at the linearization.