We demonstate how to use the
predict
method to obtain k-step ahead predictions /
forecasts of a model object, and the observations.
A prediction in this context is considered to be a prior estimate of the state mean and covariance i.e. an estimate obtained without incorporating information from future observations. We denote these
$$ \hat{x}_{i|j} = \mathrm{E}(x_{t_{i}} | x_{t_{j}}, y_{t_{j}}) \\ \hat{P}_{i|j} = \mathrm{V}(x_{t_{i}} | x_{t_{j}}, y_{t_{j}}) $$ where i ≥ j.
The predict
method computes the predictions depending on
its argument method
in the following way:
The Extended Kalman FIlter (method=ekf
): The
posterior values x̂i|i,
P̂i|i
are obtained from the filter. In each iteration the moment ODEs $$
\begin{align}
\frac{d\hat{x}_{i|i}}{dt} &= f(\hat{x}_{i|i},\cdots) \\
\frac{d\hat{P}_{i|i}}{dt} &=
\left[\frac{df}{dx}(\hat{x}_{i|i},\cdots)\right] \hat{P}_{i|i} +
\hat{P}_{i|i} \left[\frac{df}{dx}(\hat{x}_{i|i},\cdots)\right]^{T} +
G(\hat{x}_{i|i},\cdots)G^{T}(\hat{x}_{i|i},\cdots)
\end{align}
$$ are solved from the initial time ti until tj where the
number of steps is determined by the argument k.ahead
to
predict
, i.e. j=i+k.ahead
. The solution at
each time point ti + 1, ti + 2, …, ti + k.ahead
is the desired predictions. We note that these predictions assume that
the underlying probability density of the SDE remains approximately
Gaussian. If the prediction horizon is large then this assumption
probably doesn’t hold, and one should instead use the
simulate
method. The stochastic predictions are generated
by simulating the SDE forward in time for a user-defined number of
simulations.
The Laplace Approximation (method=laplace
): The
predict
method is not yet implemented.
We use the Ornstein-Uhlenbeck process again.
dXt = θ(μ − Xt) dt + σX dBt
Ytk = Xtk + etk, etk ∼ N(0, σY2) We first create data by simulating the process
# Simulate data using Euler Maruyama
set.seed(10)
theta=10; mu=1; sigma_x=1; sigma_y=1e-1
#
dt.sim = 1e-3
t.sim = seq(0,1,by=dt.sim)
dw = rnorm(length(t.sim)-1,sd=sqrt(dt.sim))
x = 3
for(i in 1:(length(t.sim)-1)) {
x[i+1] = x[i] + theta*(mu-x[i])*dt.sim + sigma_x*dw[i]
}
# Extract observations and add noise
dt.obs = 1e-2
t.obs = seq(0,1,by=dt.obs)
y = x[t.sim %in% t.obs] + sigma_y * rnorm(length(t.obs))
# Create data
.data = data.frame(
t = t.obs,
y = y
)
We now construct the ctsmTMB
model object
# Create model object
obj = ctsmTMB$new()
# Set name of model (and the created .cpp file)
obj$setModelname("ornstein_uhlenbeck")
# Add system equations
obj$addSystem(
dx ~ theta * (mu-x) * dt + sigma_x*dw
)
# Add observation equations
obj$addObs(
y ~ x
)
# Set observation equation variances
obj$setVariance(
y ~ sigma_y^2
)
# Specify algebraic relations
obj$setAlgebraics(
theta ~ exp(logtheta),
sigma_x ~ exp(logsigma_x),
sigma_y ~ exp(logsigma_y)
)
# Specify parameter initial values and lower/upper bounds in estimation
obj$setParameter(
logtheta = log(c(initial = 5, lower = 0, upper = 20)),
mu = c( initial = 0, lower = -10, upper = 10),
logsigma_x = log(c(initial = 1e-1, lower = 1e-5, upper = 5)),
logsigma_y = log(c(initial = 1e-1, lower = 1e-5, upper = 5))
)
# Set initial state mean and covariance
obj$setInitialState(list(x[1], 1e-1*diag(1)))
We can use the predict function without having called
estimate
first, in which case the initial value of the
parameters provided during add_parameters
are used.
The output of predict
is a list of two data.frames, one
for states and one for observations. The two data.frames share the same
five initial columns that contain indices i
and
j
, associated time-points t.i
and
t.j
, and a k.ahead
They both contain indices, time-points and associated k-step ahead , state/observation values and associated standard deviations, and observations provided in the data, and predicted observation values based from the predicted state values.
## i. j. t.i t.j k.ahead x var.x
## 1 0 0 0.00 0.00 0 3.095456 0.009090909
## 2 0 1 0.00 0.01 1 2.944489 0.008320958
## 3 1 1 0.01 0.01 0 2.827817 0.004541770
## 4 1 2 0.01 0.02 1 2.689903 0.004204726
## 5 2 2 0.02 0.02 0 2.673529 0.002960090
## 6 2 3 0.02 0.03 1 2.543140 0.002773563
## i. j. t.i t.j k.ahead y y.data
## 1 0 0 0.00 0.00 0 3.095456 3.105001
## 2 0 1 0.00 0.01 1 2.944489 2.687603
## 2.1 1 1 0.01 0.01 0 2.827817 2.687603
## 3 1 2 0.01 0.02 1 2.689903 2.634588
## 3.1 2 2 0.02 0.02 0 2.673529 2.634588
## 4 2 3 0.02 0.03 1 2.543140 2.267096
The indices i.
and j.
in the table refer to
expectation and variance of the state vector $$
\mathrm{E}(x_{t_{i}} | x_{t_{j}}, y_{t_{j}}) \\
\mathrm{V}(x_{t_{i}} | x_{t_{j}}, y_{t_{j}})
$$ where the moments are given at indices i (time ti) conditioned
on the information available up until indices j (time tj). In this
sense the state/covariance predictions with i = j are posterior
estimates, and all others are prior estimates.
By default only the standard deviations of the states are returned. The entire covariance matrix elements can be returned via
The number of time steps ahead that predictions are desired can be
changed with the k.ahead
argument. The default behaviour is
to save all prediction steps from 1 to k.ahead
but if only
a selected few are desired, then the return.k.ahead
argument can be used to indicate which should be kept in the output.
pred1 = obj$predict(.data, k.ahead=2)
pred2 = obj$predict(.data, k.ahead=10, return.k.ahead=c(2,5,8))
## i. j. t.i t.j k.ahead x var.x
## 1 0 0 0.00 0.00 0 3.095456 0.009090909
## 2 0 1 0.00 0.01 1 2.944489 0.008320958
## 3 0 2 0.00 0.02 2 2.800884 0.007624277
## 4 1 1 0.01 0.01 0 2.827817 0.004541770
## 5 1 2 0.01 0.02 1 2.689903 0.004204726
## 6 1 3 0.01 0.03 2 2.558715 0.003899757
## i. j. t.i t.j k.ahead x var.x
## 3 0 2 0.00 0.02 2 2.800884 0.007624277
## 6 0 5 0.00 0.05 5 2.410743 0.005907387
## 9 0 8 0.00 0.08 8 2.074946 0.004635482
## 14 1 3 0.01 0.03 2 2.558715 0.003899757
## 17 1 6 0.01 0.06 5 2.202306 0.003148193
## 20 1 9 0.01 0.09 8 1.895543 0.002591421
The default of predict
is to use the initial parameters
supplied during setParameter
, unless the
estimate
function has been succesfully run, then
predict
will use the parameter values at the found
minimizer. You can provide other parameters using the pars
argument of the method i.e.
The default behaviour of predict
is to use the initial
state and covariance suppled when calling the
set_initial_state
method. This can be important to change
if one wishes to correctly predict on the first few observations in a
provided prediction data series. You can supply the x0
and
p0
arguments with updated state and covariance
estimates
You can choose which solver options to use via the arguments
ode.solver
and ode.timestep
.
The ode.solver
decides which ode solver algorithm to
used when solving the mean and variance ODEs of the SDE.
The ode.timestep
determines the time step-size using
when solving the ODEs, with default value being the minimum observed
time difference in the provided data time vector
(min(diff(.data$t))
). The time step-size can’t be smaller
than this. If the chosen time step-size Δstep
does not “almost” divide the observation time differences in the data
Δobs
i.e. if $$
\frac{\Delta_{obs}}{\Delta_{step}} - \bigg\lfloor
\frac{\Delta_{obs}}{\Delta_{step}} \bigg\rfloor \geq 0.02
$$ then the time step-size is reduced such that it does divide
evenly by setting
$$ \frac{\Delta_{obs}}{\Delta_{step}} = \bigg\lceil \frac{\Delta_{obs}}{\Delta_{step}} \bigg\rceil $$ In other words: if there it takes 3.561 time-steps to get from ti to ti + 1 then we take 4 time-steps instead, and calculate a reduced time-step such that this is true. If there were only 3.01 time-steps then we retain the original time-step, ignoring the small temporal discrepancy.
We provide these arguments via
We could use our model predictions against observation to compute a
model performance score e.g. RMSE (root-mean square error). We first
estimate the model parameters, such that they are automatically used
when calling predict
. We then predict up to 10-steps ahead,
and return only these values.
fit = obj$estimate(.data)
pred = obj$predict(.data, k.ahead=10)
pred.states = pred$states
pred.obs = pred$observations
# ggplot2 theme
library(ggplot2)
mytheme =
theme_minimal() +
theme(
text = element_text("Avenir Next Condensed",size=15),
legend.text = element_text(size=15),
axis.text = element_text(size=15),
strip.text = element_text(face="bold",size=15),
legend.box = "vertical",
legend.position = "top",
plot.title = element_text(hjust=0.5)
)
Let’s plot the 10-step predictions against the observations.
data = .data
ggplot() +
geom_line(aes(x=pred10$t.j,y=pred10$x,color="10-Step Predictions")) +
geom_ribbon(aes(x=pred10$t.j,ymin=pred10$x-2*sqrt(pred10$var.x),ymax=pred10$x+2*sqrt(pred10$var.x)),fill="grey",alpha=0.5) +
geom_point(aes(x=data$t,data$y,color="Observations")) +
labs(color="",x="Time",y="") +
# coord_cartesian(xlim=c(0,0.1)) +
mytheme
We notice that because these are 10-step predictions the state
uncertainty (95%) is much larger than for the filtered posterior state
estimates from
fit
i.e.
t = fit$states$mean$posterior$t
xpost = fit$states$mean$posterior$x
xpost_sd = fit$states$sd$posterior$x
ggplot() +
geom_line(aes(x=t,y=xpost,color="0-Step Predictions (Posterior State Estimates)"),lwd=1) +
geom_ribbon(aes(x=t,ymin=xpost-2*xpost_sd,ymax=xpost+2*xpost_sd),fill="grey",alpha=0.5) +
geom_point(aes(x=data$t,y=data$y,color="Observations")) +
labs(x = "Time", y = "", color="") +
mytheme
We can calculate the RMSE prediction score for each prediction step as
follows:
rmse = c()
k.ahead = 1:10
for(i in k.ahead){
xy = data.frame(
x = pred.states[pred.states$k.ahead==i,"x"],
y = pred.obs[pred.obs$k.ahead==i,"y.data"]
)
rmse[i] = sqrt(mean((xy[["x"]] - xy[["y"]])^2))
}
ggplot() +
geom_line(aes(k.ahead, rmse), color="steelblue") +
geom_point(aes(k.ahead, rmse), color="red") +
labs(
title = "Root-Mean Square Errors for Different Prediction Horizons",
x = "Prediction Steps",
y = "Root-Mean-Square Errors"
) +
mytheme