Yearly seasonalities need several years of data to learn patterns. The basic approaches are dummy variables, Fourier features (periodic) and splines (not periodic).
Below are some more complex feature designs.
See this gist for the code.
radial basis (bump)
We want to generate a Gaussian-like curve around a specific date.
The width of the bump function is controlled by
This can be repeated, e.g. for day-of-week effect.
See Ronan and Vincent Warmerdam.
asymmetric bump
Sometimes the effect before and after the seasonal variable is not the same, e.g. after Christmas there might be a strong effect before and a drop after.
Now, there can be different widths and different amplitudes before and after the event. Note, the amplitude can also be negative here.
From Juan’s blog.
eigen features
For longer seasonalities where we do not know the underlying pattern, but we want that the effect starts at zero and ends at zero but behaves like Fourier series otherwise. e.g. two week school holiday.
Suppose the seasonality lasts for
and
and multiplying by the matrix whose columns are the eigenvectors
For our feature, let
# fill out the matrices
AMA = A.dot(M).dot(A.T) / (span / 4)
w, v = np.linalg.eig(AMA)
basis = np.concatenate(
[
# the first time point should be zero
np.zeros((1, order * 2)),
# the final element of v_i is zero since the last column
# and row of AMA are zero
np.sqrt(w[: order * 2].real * v[:, : order * 2]),
]
)
Credit to Ulrich Mueller.
monotonic
Monotone functions can be modelled using I-splines, which are the integral of non-negative M-splines, with strictly positive or strictly negative coefficients. GAM packages like mgcv
or pyGAM
use a likelihood penalty for monotonic constraints.
python implementation.
extending to multivariate
The seasonalities can have the same shape for each stratum but different magnitude.
where we have
variation between years
The coefficients of these basis functions might vary slightly between years. Use a random walk prior on