ETC3550 A1
Ari Gestetner - 33966753
Question 1
Code
retail |> autoplot(Turnover, color=plot_colour) +
labs(title = "Time Series Plot of Retail Data")
Looking at the plot we can notice a somewhat exponential positive
trend.
Additionally, we can note some multiplicitive seasonal patterns.
We can also see some indications of cyclical patterns where there
appears to be a drop in the late 1980’s. Additionally, we can see what
appears to be a drop in the late 2000’s (likely around the 2008 GFC) and
another fall in the early 2020’s (likely due to Covid).
Question 2
Code
retail |> gg_season(Turnover, labels = "left")
By plotting the data with the gg_season
function we can note some
features of the data.
Specifically, we are able to notice that as the years increment, so does
the data increase (indicating a positive trend). Furthermore, we can
notice how there is an increase in sales by December and January,
however, there is then a drop in sales during the month of Febuary.
Additionally, we can note that there are some years with a significant
drop in sales, however, the plot is too clutered to investigate specific
years. Therefore, we will split the data into
Code
retail |>
filter(Month < yearmonth("Jan 2000")) |>
gg_season(Turnover) +
labs(title = "Seasonal plots from before January 2000")
Code
retail |>
filter(Month >= yearmonth("Jan 2000")) |>
gg_season(Turnover) +
labs(title = "Seasonal plots from after January 2000")
In the first plot we can see a significant drop in the late 1980’s
(Green line) around the end of the year. Furthermore, in the second plot
we can see another significant dip in 2020 (Purple line) which is also
around the time of the covid outbreak.
Nevertheless, there doesn’t appear to be any significant drops in the
late 2000’s despite what it looked like in the autoplot
plot.
Ultimatly, these drops and recoveries are part of the cyclical behaviour
of this data.
Question 3
Code
retail |>
gg_subseries(Turnover, color=plot_colour) +
labs(title = "Subseries plot for retail data")
Taking a look at the gg_subseries
plot we can note some features which
were hidden in the other 2 plots.
First, we can notice how there is a dip in the late 2000’s over every
month which is possibly the reason it wasn’t so obvious in the
gg_season
plot.
Furthermore, we are also able to notice the how the drop due to covid is
extreme druing the months of March to June. However, after June the drop
seems minimal compared to the earlier months.
Nevertheless, we can also see some common features with the other 2
plots, mainly the seasonal pattern with the increase in December and
January and the slight dip in Febuary. As well, we can see the strong
trend in the data which exists in every month.
Question 4
To find a good
Code
lambda <- 0
retail |> autoplot(box_cox(Turnover, lambda), color = plot_colour) +
labs(title = glue::glue("Retail Turnover with Box Cox with lambda = {round(lambda, 2)}"), y = glue::glue("box_cox(Turnover, lambda = {round(lambda, 2)})"))
Code
lambda <- 1
retail |> autoplot(box_cox(Turnover, lambda), color = plot_colour) +
labs(title = glue::glue("Retail Turnover with Box Cox with lambda = {round(lambda, 2)}"), y = glue::glue("box_cox(Turnover, lambda = {round(lambda, 2)})"))
Firstly, we are able to see that the variation clearly looks
multiplicative with the identity transformation. Furthermore, We can see
here that the
Now we will use the Guerrero feature and see if it beats the log
function.
Code
# Attempt guerro first
lambda <- retail |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
retail |> autoplot(box_cox(Turnover, lambda), color = plot_colour) +
labs(title = glue::glue("Retail Turnover with Box Cox with lambda = {round(lambda, 2)} using Guerrero"), y = glue::glue("box_cox(Turnover, lambda = {round(lambda, 2)})"))
We can see here that the Guerrero feature doesn’t improve on
Question 5
Code
lambda = 0
retail |>
model(STL(box_cox(Turnover, lambda) ~ season(window = 9) +
trend(window = 7), robust = TRUE)) |>
components() |>
autoplot(color=plot_colour) + labs(title = "STL Decomposition: US retail employment")
While some seasonality remains in the remainder, we can see some regular
features of the data, such as positive trend and some cyclical
patterns.
There is one particular feature which stands out in the STL
Decomposition, that is how the seasonality changes as the years
progress. As in in up untill around 2010, we can see that the
December/January increase in sales was much more significant than since
2010. This can be seen by how the peaks seem to be “fatter” in the later
years, whereas, they are a lot narrower in the earlier years. This may
be due to internet shopping which helps consumers make easy purchases at
all times of the year.