Introduction
Background
Time series
Methodology
Short Answer
If you are in a hurry, I predicted that unemployment on Greece is expected to further reduce. It will range between 10% - 13% in February, 2023.
Prerequisites
Import Libraries
For this analysis we will need standard libraries for importing and processing my data, such as readr (Wickham, Hester, & Bryan, 2024) and dplyr (Wickham, François, Henry, Müller, & Vaughan, 2023). The kableExtra (Zhu, 2024) package was used to print the results in table format, while the flextable (Gohel & Skintzos, 2024) package was used to print the results of the Dickey-Fuller and KPSS tests.
Then, due to the nature of the data (time series) it was deemed necessary to use relevant libraries such as lubridate(Spinu, Grolemund, & Wickham, 2024), tseries(Trapletti & Hornik, 2024) & forecast(R. Hyndman et al., 2025) packages.
Finally, the ggplot2 (R-ggplot2?) package was used to create some visualizations, as well as an auxiliary package, ggtext (R-ggtext?), for further formatting those.
Show the code
acf_plot_function = function(data, min = -1, max = 1){
ts_data <- data
N <- length(ts_data)
# Compute confidence limit
conf_limit <- qnorm(0.975) / sqrt(N)
# Use forecast::Acf (biased estimator, recommended for ARIMA)
acf_obj <- Acf(ts_data, plot = FALSE, lag.max = 20)
# Extract lags and values
# Note: Acf() returns acf_obj$acf as a matrix; acf_obj$lag as a matrix
lags_acf <- as.numeric(acf_obj$lag[-1]) # skip lag 0
acf_values <- as.numeric(acf_obj$acf[-1]) # skip lag 0
# Build the highcharter ACF plot
acf_chart <- highchart() %>%
hc_chart(type = "column") %>%
hc_title(text = "Autocorrelation Plot") %>%
hc_xAxis(categories = lags_acf, title = list(text = "Lag")) %>%
hc_yAxis(min = min, max = max,
title = list(text = "ACF"),
plotLines = list(
list(value = 0, color = "#000000", width = 3),
list(value = conf_limit, color = "#FF0000", dashStyle = "ShortDash", width = 1),
list(value = -conf_limit, color = "#FF0000", dashStyle = "ShortDash", width = 1)
)
) %>%
hc_legend(enabled = FALSE) %>%
hc_add_series(name = "ACF", data = round(acf_values, digits = 4))
# Show chart
acf_chart
}
pacf_plot_function <- function(data, min = -1, max = 1) {
ts_data <- data
N <- length(ts_data)
# Compute confidence limit (same formula as for ACF)
conf_limit <- qnorm(0.975) / sqrt(N)
# Compute PACF using forecast::Pacf (biased estimator, recommended for ARIMA)
pacf_obj <- forecast::Pacf(ts_data, plot = FALSE, lag.max = 20)
# Extract lags and PACF values
lags_pacf <- as.numeric(pacf_obj$lag) # already excludes lag 0
pacf_values <- as.numeric(pacf_obj$acf)
# Build the highcharter PACF plot
pacf_chart <- highcharter::highchart() %>%
highcharter::hc_chart(type = "column") %>%
highcharter::hc_title(text = "Partial Autocorrelation Plot") %>%
highcharter::hc_xAxis(categories = lags_pacf, title = list(text = "Lag")) %>%
highcharter::hc_yAxis(
min = min, max = max,
title = list(text = "PACF"),
plotLines = list(
list(value = 0, color = "#000000", width = 3),
list(value = conf_limit, color = "#FF0000", dashStyle = "ShortDash", width = 1),
list(value = -conf_limit, color = "#FF0000", dashStyle = "ShortDash", width = 1)
)
) %>%
highcharter::hc_legend(enabled = FALSE) %>%
highcharter::hc_add_series(name = "PACF", data = round(pacf_values, digits = 4))
# Show chart
pacf_chart
}
Import dataset
After loading the libraries I am able to use the commands of the readr package to import my data. My data is in .csv format, so I’ll use the read_csv()
command (Wickham et al., 2024) to import them.
Additionally, I choose not to include EA-19 values (as I investigate Greece’s unemployment).
Preview Dataset
Show the code
head(unemployment, 10) %>%
kbl(.,
align = 'c',
booktabs = T,
centering = T,
valign = T) %>%
kable_styling()
LOCATION | TIME | Value |
---|---|---|
GRC | 1998-04 | 10.9 |
GRC | 1998-05 | 11.0 |
GRC | 1998-06 | 10.9 |
GRC | 1998-07 | 11.0 |
GRC | 1998-08 | 11.2 |
GRC | 1998-09 | 11.1 |
GRC | 1998-10 | 11.1 |
GRC | 1998-11 | 11.4 |
GRC | 1998-12 | 11.4 |
GRC | 1999-01 | 11.4 |
Dataset Structure
Our dataset is consisted by 3 variables (columns). More specifically, concerning my variables, are as follows :
Variable | Property | Description |
---|---|---|
LOCATION |
qualitative (nominal) |
Specific country’s statistics |
TIME |
qualitative (ordinal) |
Month of the reported data |
Value |
quantitative (continuous) |
Unemployment at the given Time and Country |
Thus, my sample has 3 variables, of which 2 are qualitative and 1 is quantitative property.
Time Series Preprocessing
The TIME
variable needs to be a Date variable which is not fulfilled on our case.
Show the code
sapply(unemployment, class) %>%
kbl() %>% kable_styling(full_width = F, position = "center")
x | |
---|---|
LOCATION | character |
TIME | character |
Value | numeric |
So, above I see that I have dates in a format “YYYY-MM” (Year - Month) and they are considered as characters. With the help of lubridate
package I will convert my time series on a Date format.
Show the code
unemployment$TIME <- lubridate::ym(unemployment$TIME)
And let’s check again :
Show the code
sapply(unemployment, class) %>% kbl() %>% kable_styling(full_width = F, position = "center")
x | |
---|---|
LOCATION | character |
TIME | Date |
Value | numeric |
And now I got the Date format. I am able to continue my analysis.
Missing Values
Good news! On this dataset there are 0 missing values, in total.
In case the dataset had missing values, I would first look at which variables those were.In a second phase, it might be necessary to replace the missing values.
Descriptive Statistics
The unemployment data for Greece refer to the period from April 1998 το August 2022. Regarding Europe we have data from January 2000 to August 2022.
The last 20 years have seen particularly large changes mainly in Greece, due to the domestic economic crisis. There is a noticeable change between September 2010 (10.1%) and September 2013 (28.1 %), in terms of unemployment in Greece. For the record, below you can find a table with the months that presented the highest unemployment in Greece. As you will notice the five highest values were observed in 2013.
Show the code
TIME | Value | |
---|---|---|
289 | 2013-11-01 | 27.9 |
290 | 2013-04-01 | 28.0 |
291 | 2013-05-01 | 28.0 |
292 | 2013-07-01 | 28.1 |
293 | 2013-09-01 | 28.1 |
Accordingly, the highest unemployment rates observed at the European level are the following:
Show the code
TIME | Value | |
---|---|---|
268 | 2013-06-01 | 11.6 |
269 | 2013-01-01 | 11.7 |
270 | 2013-02-01 | 11.7 |
271 | 2013-03-01 | 11.7 |
272 | 2013-04-01 | 11.7 |
The lowest unemployment ever recorded in Greece was in the period of 2008:
Show the code
TIME | Value |
---|---|
2008-05-01 | 7.4 |
2008-06-01 | 7.6 |
2008-07-01 | 7.6 |
2008-10-01 | 7.6 |
2008-01-01 | 7.7 |
On the other hand, at the moment in Europe we are at the lowest levels of the last 20 years, with unemployment hovering around 6%.
Show the code
TIME | Value |
---|---|
2022-07-01 | 6.0 |
2022-08-01 | 6.0 |
2022-04-01 | 6.1 |
2022-05-01 | 6.1 |
2022-06-01 | 6.1 |
Finally, it is evident that in the case of Greece the changes are more intense than in Europe.
All of the above can be summarized by the following graph:
Show the code
`summarise()` has grouped output by 'LOCATION'. You can override using the
`.groups` argument.
Show the code
highchart() %>%
hc_chart(type = "line") %>%
hc_title(text = "Unemployment Rates") %>%
hc_xAxis(title = list(text = "Year")) %>%
hc_yAxis(title = list(text = "Value")) %>%
hc_tooltip(shared = TRUE, valueDecimals = 2, valueSuffix = " %") %>%
hc_add_series(
data = g,
type = "line",
hcaes(x = Year, y = mean_unemployment, group = LOCATION)
) %>%
hc_plotOptions(
series = list(
dataLabels = list(
enabled = TRUE,
formatter = JS("function() {
if (this.point.index === this.series.data.length - 1) {
return this.y.toFixed(2) + ' %';
}
return null;
}")
)
)
)
Examine trend and seasonality
In time series analysis, it’s important to distinguish where the variability in the final form of the series originates from. Time series consist of three main components: trend, seasonality, and randomness. The trend refers to whether the series shows a specific direction (upward/downward). Seasonality refers to recurring patterns at regular intervals.
y_t = S_t + T_t + E_t Όπου:
- y_t δηλώνουν τα δεδομένα που έχουμε διαθέσιμα,
- S_t
- T_t
- E_t
Similarly, the multiplicative model is defined:
y_t = S_t \cdot T_t \cdot E_t
where the components that make up the time series are multiplied instead of added.
Examine stationarity
Definition of stationarity
An important concept in studying time series is stationarity. A time series is called stationary (“Applied Time Series Analysis,” n.d.) if:
- E(X_t):\text{constant}
- Var(X_t): \text{constant}
- Cov(X_t, X_s): \text{constant}
Examine Stationarity Graphically
It is apparent that there is a big variation on the values of unemployment. This time series is not stationary and the differencing is justified.
Show the code
highchart() %>%
hc_chart(type = "line") %>%
hc_title(text = "Unemployment Rates") %>%
hc_xAxis(title = list(text = "Year")) %>%
hc_yAxis(title = list(text = "Value")) %>%
hc_tooltip(shared = TRUE, valueDecimals = 2, valueSuffix = " %") %>%
hc_add_series(
data = g,
type = "line",
hcaes(x = Year, y = mean_unemployment, group = LOCATION)
) %>%
hc_plotOptions(
series = list(
dataLabels = list(
enabled = TRUE,
formatter = JS("function() {
if (this.point.index === this.series.data.length - 1) {
return this.y.toFixed(2) + ' %';
}
return null;
}")
)
)
)
Here we can see a big improvement in comparison with original data. I have some concerns about points close to 150 (mildly upwards trend) and 250 (outlier).
Show the code
grc_unemployment = unemployment %>% dplyr::filter(LOCATION == "GRC")
grc_unemployment_diff1 <- diff(grc_unemployment$Value, differences = 1)
highchart() %>%
hc_chart(type = "line") %>%
hc_title(text = "Unemployment Rates") %>%
hc_xAxis(title = list(text = "Year")) %>%
hc_yAxis(title = list(text = "Value")) %>%
hc_tooltip(shared = TRUE, valueDecimals = 2, valueSuffix = " %") %>%
hc_add_series(
data = grc_unemployment_diff1
)
Given the concerns of above, I made also a second difference plot. It seems to solve the problem on points close to 150.
Show the code
grc_unemployment_diff2<- diff(grc_unemployment$Value, differences = 2)
highchart() %>%
hc_chart(type = "line") %>%
hc_title(text = "Unemployment Rates") %>%
hc_xAxis(title = list(text = "Year")) %>%
hc_yAxis(title = list(text = "Value")) %>%
hc_tooltip(shared = TRUE, valueDecimals = 2, valueSuffix = " %") %>%
hc_add_series(
data = grc_unemployment_diff2
)
Examine Stationarity with Statistical tests
The graphical interpretation of stationarity can be beneficial for a quick assessment on topic of stationarity. However it can be considered a subjective metric, which leads on a non consistent decision (someone may consider the second figure as stationary and some others not.
Thankfully, there are some statistical tests which can help us on our decisions. Some commonly used are :
- Augmented Dickey-Fuller (ADF) test
- Phillips- Perron (PP) test
- Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
- Zivot-Andrews (ZA) test
But an other problem arises. We are not in a fancy IDE with menus and checkboxes. We use R and its packages. There are many packages that have already write functions to study stationarity with the aforementioned tests. Some of them are:
- tseries
- urca
- CADFtest
We should carefully evaluate our choices, as these packages advertise similar functionality but differ significantly in the scope of their functions and, more importantly, in the level of flexibility they offer for handling time series characteristics. In short, the tseries
package is the one I’m most familiar with, but it is also the most limited in terms of configurable parameters. Although many tutorials use this package, I found that I couldn’t set the number of lags for tests or specify characteristics like trends when applicable. On the other hand, the urca
package addresses these limitations by allowing users to specify both the trend component and the number of lags in the respective tests. One minor drawback of urca, however, is that it only provides critical values, not p-values.
Summary
Show the code
summary_stationarity_results <- data.frame(
"Type" = c("levels", "Diff(GRC)", "Diff2(GRC)"),
"ADF test" = c("Non Stationary", "Stationary", "Stationary"),
"PP test" = c("Non Stationary", "Stationary", "Stationary"),
"KPSS test" =c("Non Stationary", "Non Stationary", "Stationary")
)
summary_stationarity_results %>% kbl() %>% kable_styling()
Type | ADF.test | PP.test | KPSS.test |
---|---|---|---|
levels | Non Stationary | Non Stationary | Non Stationary |
Diff(GRC) | Stationary | Stationary | Non Stationary |
Diff2(GRC) | Stationary | Stationary | Stationary |
ADF test
One of the most
H_0 : \text{The time series has a unit root} \\ H_1 : \text{The time series has not a unit root} Selecting the lags
Loading required package: MASS
Attaching package: 'MASS'
The following object is masked from 'package:dplyr':
select
Loading required package: strucchange
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
Loading required package: sandwich
Loading required package: urca
Loading required package: lmtest
Attaching package: 'lmtest'
The following object is masked _by_ '.GlobalEnv':
unemployment
The following object is masked from 'package:highcharter':
unemployment
Show the code
G = VARselect(grc_unemployment$Value, lag.max = 20, type = "const")
Show the code
get_adf_results = function(variable){
# Ensure variable is numeric (e.g., a time series or numeric vector)
if (!is.numeric(variable)) {
stop("Input variable must be numeric.")
}
# Perform ADF tests
adf_orig = adf.test(variable)
adf_diff1 = adf.test(diff(variable, differences = 1))
adf_diff2 = adf.test(diff(variable, differences = 2))
# Extract results
adf_table = data.frame(
"Transformation" = c("Level", "1st Difference", "2nd Difference"),
"Dickey-Fuller" = c(
adf_orig$statistic[["Dickey-Fuller"]],
adf_diff1$statistic[["Dickey-Fuller"]],
adf_diff2$statistic[["Dickey-Fuller"]]
),
"Lag Order" = c(
adf_orig$parameter[["Lag order"]],
adf_diff1$parameter[["Lag order"]],
adf_diff2$parameter[["Lag order"]]
),
"p-value" = c(
adf_orig$p.value,
adf_diff1$p.value,
adf_diff2$p.value
)
) %>%
kbl(digits = 5) %>%
kable_styling(full_width = FALSE)
return(adf_table)
}
Show the code
get_adf_results(grc_unemployment$Value)
Transformation | Dickey.Fuller | Lag.Order | p.value |
---|---|---|---|
Level | -0.93123 | 6 | 0.94846 |
1st Difference | -3.50738 | 6 | 0.04240 |
2nd Difference | -9.48682 | 6 | 0.01000 |
Συνεπώς, είναι προφανές από τα αποτελέσματα του στατιστικού ελέγχου Dickey Fuller ότι η χρονοσειρά μου δεν είναι στάσιμη. Θα πρέπει να εφαρμόσω τον έλεγχο Dickey-Fuller στις δοαφορές.
PP test
H_0 : \text{The time series has a unit root} \\ H_1 : \text{The time series has not a unit root}
Show the code
get_pp_results = function(variable){
# Ensure variable is numeric (e.g., a time series or numeric vector)
if (!is.numeric(variable)) {
stop("Input variable must be numeric.")
}
# Perform ADF tests
pp_orig = pp.test(variable)
pp_diff1 = pp.test(diff(variable, differences = 1))
pp_diff2 = pp.test(diff(variable, differences = 2))
# Extract results
pp_table = data.frame(
"Transformation" = c("Level", "1st Difference", "2nd Difference"),
"Phillips-Perron" = c(
pp_orig$statistic[["Dickey-Fuller Z(alpha)"]],
pp_diff1$statistic[["Dickey-Fuller Z(alpha)"]],
pp_diff2$statistic[["Dickey-Fuller Z(alpha)"]]
),
"Lag Order" = c(
pp_orig[["parameter"]][["Truncation lag parameter"]],
pp_diff1[["parameter"]][["Truncation lag parameter"]],
pp_diff2[["parameter"]][["Truncation lag parameter"]]
),
"p-value" = c(
pp_orig$p.value,
pp_diff1$p.value,
pp_diff2$p.value
)
) %>%
kbl(digits = 5) %>%
kable_styling(full_width = FALSE)
return(pp_table)
}
Show the code
get_pp_results(grc_unemployment$Value)
Transformation | Phillips.Perron | Lag.Order | p.value |
---|---|---|---|
Level | 0.19401 | 5 | 0.99 |
1st Difference | -279.81456 | 5 | 0.01 |
2nd Difference | -337.13053 | 5 | 0.01 |
KPSS test
H_0 : \text{The time series has not a unit root} \\ H_1 : \text{Alternatively}
Show the code
get_kpss_results = function(variable){
# Ensure variable is numeric (e.g., a time series or numeric vector)
if (!is.numeric(variable)) {
stop("Input variable must be numeric.")
}
# Perform ADF tests
kpss_orig = kpss.test(variable)
kpss_diff1 = kpss.test(diff(variable, differences = 1))
kpss_diff2 = kpss.test(diff(variable, differences = 2))
# Extract results
pp_table = data.frame(
"Transformation" = c("Level", "1st Difference", "2nd Difference"),
"KPSS" = c(
kpss_orig[["statistic"]][["KPSS Level"]],
kpss_diff1[["statistic"]][["KPSS Level"]],
kpss_diff2[["statistic"]][["KPSS Level"]]
),
"Lag Order" = c(
kpss_orig[["parameter"]][["Truncation lag parameter"]],
kpss_diff1[["parameter"]][["Truncation lag parameter"]],
kpss_diff2[["parameter"]][["Truncation lag parameter"]]
),
"p-value" = c(
kpss_orig$p.value,
kpss_diff1$p.value,
kpss_diff2$p.value
)
) %>%
kbl(digits = 5) %>%
kable_styling(full_width = FALSE)
return(pp_table)
}
Show the code
get_kpss_results(grc_unemployment$Value)
Transformation | KPSS | Lag.Order | p.value |
---|---|---|---|
Level | 2.53719 | 5 | 0.01000 |
1st Difference | 0.59510 | 5 | 0.02308 |
2nd Difference | 0.01426 | 5 | 0.10000 |
Identify Model
Show the code
acf_plot_function(unemployment$Value, min=-0.2, max=1)
Show the code
pacf_plot_function(unemployment$Value, min=-0.2, max=1)
Show the code
acf_plot_function(grc_unemployment_diff1, min=-0.1, max = 0.6)
Show the code
pacf_plot_function(grc_unemployment_diff1, min=-0.2, max=0.5)
Show the code
acf_plot_function(grc_unemployment_diff2, min=-0.5, max = 0.5)
Show the code
pacf_plot_function(grc_unemployment_diff2, min=-0.5, max = 0.5)
Build Time Series Model
Show the code
auto_model <- auto.arima(grc_unemployment$Value, trace = T,seasonal = TRUE)
Fitting models using approximations to speed things up...
ARIMA(2,2,2) : 302.1399
ARIMA(0,2,0) : 473.9345
ARIMA(1,2,0) : 405.2316
ARIMA(0,2,1) : 297.5442
ARIMA(1,2,1) : 301.0253
ARIMA(0,2,2) : 299.5392
ARIMA(1,2,2) : 302.154
Now re-fitting the best model(s) without approximations...
ARIMA(0,2,1) : 296.71
Best model: ARIMA(0,2,1)
Show the code
auto_model
Series: grc_unemployment$Value
ARIMA(0,2,1)
Coefficients:
ma1
-0.9075
s.e. 0.0229
sigma^2 = 0.1597: log likelihood = -146.33
AIC=296.67 AICc=296.71 BIC=304.02
Call:
arima(x = grc_unemployment$Value, order = c(9, 2, 1))
Coefficients:
ar1 ar2 ar3 ar4 ar5 ar6 ar7 ar8
-0.7879 -0.7083 -0.6715 -0.6944 -0.5922 -0.3838 -0.3160 -0.4457
s.e. 0.1713 0.1539 0.1493 0.1434 0.1379 0.1172 0.0883 0.0718
ar9 ma1
-0.2704 -0.1456
s.e. 0.0690 0.1767
sigma^2 estimated as 0.1388: log likelihood = -126.92, aic = 275.83
Compare Models
After building some ARIMA models, I should decide which is the best one to make my estimations. One metric to evaluate those models is AIC (Akaike Information Criterion). The lower the value of AIC, the better my model.
Show the code
accuracy_table <- data.frame(
"Name of Model" = c("Auto Model", "ARIMA Candidate #1", "ARIMA Candidate #2", "ARIMA Candidate #3"),
Model = c("ARIMA(0,2,1)", "ARIMA(0,1,2)", "ARIMA(1,1,2)", "ARIMA(9,2,1)"),
AIC =c(auto_model$aic, arimaModel_1$aic, arimaModel_2$aic, arimaModel_3$aic)
)
accuracy_table %>% kbl() %>% kable_styling()
Name.of.Model | Model | AIC |
---|---|---|
Auto Model | ARIMA(0,2,1) | 296.6684 |
ARIMA Candidate #1 | ARIMA(0,1,2) | 319.6669 |
ARIMA Candidate #2 | ARIMA(1,1,2) | 296.3705 |
ARIMA Candidate #3 | ARIMA(9,2,1) | 275.8314 |
So, the best model is the Auto Model (ARIMA(9,2,1)), which has the lowest AIC value.
Checking best models
Forecast Future Unemployment
Previously, I identify which is the best model. Now, I will use this model in order to predict unemployment for the next 6 months. It should be recalled that the last available value was from August of 2022 (12.2%). Therefore, I will make a prediction for unemployment in Greece until February 2023.
ARIMA (0,2,1) forecasts
Show the code
s = forecast(auto_model,20)
month = seq(max(grc_unemployment$TIME) %m+% months(1), # start at next month
by = "1 month",
length.out = 20)
highchart() %>%
hc_title(text = "Prediction of Unemployment with ARIMA(0,2,1)") %>%
hc_xAxis(type = "datetime") %>%
hc_yAxis(title = list(text="Unemployment (%)")) %>%
hc_add_series(name="Historic Data",
data = list_parse2(data.frame(x = datetime_to_timestamp(grc_unemployment$TIME), y = grc_unemployment$Value)),
type = "line"
) %>%
hc_add_series(name = "Forecast",
data = list_parse2(data.frame(
x = datetime_to_timestamp(month),
y = round(s$mean, digits=2)
)),
type = "line",
color = "#d62728") %>%
hc_add_series(name = "Confidence Interval",
data = list_parse2(data.frame(
x = datetime_to_timestamp(month),
low = round(s$lower[, "80%"], digits=2),
high = round(s$upper[, "80%"], digits=2)
)),
type = "arearange",
color = hex_to_rgba("#d62728", 0.2),
linkedTo = ":previous")
Warning: `unite_()` was deprecated in tidyr 1.2.0.
ℹ Please use `unite()` instead.
ℹ The deprecated feature was likely used in the highcharter package.
Please report the issue at <https://github.com/jbkunst/highcharter/issues>.
ARIMA (9,2,1) forecasts
Show the code
s1 = forecast(arimaModel_3,20)
month = seq(max(grc_unemployment$TIME) %m+% months(1), # start at next month
by = "1 month",
length.out = 20)
highchart() %>%
hc_title(text = "Prediction of Unemployment with ARIMA(0,2,1)") %>%
hc_xAxis(type = "datetime") %>%
hc_yAxis(title = list(text="Unemployment (%)")) %>%
hc_add_series(name="Historic Data",
data = list_parse2(data.frame(x = datetime_to_timestamp(grc_unemployment$TIME), y = grc_unemployment$Value)),
type = "line"
) %>%
hc_add_series(name = "Forecast",
data = list_parse2(data.frame(
x = datetime_to_timestamp(month),
y = round(s1$mean, digits=2)
)),
type = "line",
color = "#d62728") %>%
hc_add_series(name = "Confidence Interval",
data = list_parse2(data.frame(
x = datetime_to_timestamp(month),
low = round(s1$lower[, "80%"], digits=2),
high = round(s1$upper[, "80%"], digits=2)
)),
type = "arearange",
color = hex_to_rgba("#d62728", 0.2),
linkedTo = ":previous")
Results
Given the diagram as well as the forecast table (of the best performing model, candidate #3), I conclude that a reduction in unemployment in Greece is expected in the next period of time (in the next six months). More specifically, Greece’s unemployment in February 2023 will range between 10% (9.4%) and 13% (14%) with an 80% (95%) probability (based on ARIMA(9,2,1) model).
Acknowledgements
Image by Rosy / Bad Homburg / Germany from Pixabay
References
Citation
@online{2022,
author = {, stesiam},
title = {Forecasting {Unemployment} in {Greece}},
date = {2022-10-22},
url = {https://stesiam.com/posts/forecasting-greek-unemployment/},
langid = {en}
}