For TAs:
1. The differencing and ADF test have already been done in the EDA section. Here, I directly check the ACF and PACF plots of the stationary series.
2. I combined the ARIMA and SARIMA sections. By checking the ACF and PACF plots, I can determine whether a seasonal component exists, select appropriate orders of terms, and choose the optimal model through model diagnostics.
Here, we will apply the ARIMA or SARIMA model to our 10 time series datasets in order to make forecasts. Below is a brief introduction to both models, including the parameters p, d, q, P, D, and Q.
ARIMA Model:
ARIMA (AutoRegressive Integrated Moving Average) is used for non-seasonal time series data that exhibit patterns over time. The parameters are:
- p: The order of the autoregressive (AR) term.
- d: The degree of differencing to make the series stationary.
- q: The order of the moving average (MA) term.
SARIMA Model:
SARIMA (Seasonal ARIMA) is an extension of the ARIMA model that deals with seasonality in time series data. The additional parameters for seasonal components are:
- P: The order of the seasonal autoregressive (AR) term.
- D: The degree of seasonal differencing.
- Q: The order of the seasonal moving average (MA) term.
- s: The number of periods in each season.
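As a minimal sketch of where these parameters enter a model call, the chunk below fits an illustrative SARIMA(1,1,1)x(0,1,1)[12] with base R's stats::arima() (our own fits later use the forecast package's Arima(), which takes the same order and seasonal arguments); the orders and the built-in AirPassengers series are for illustration only, not one of our datasets:

```r
# order = c(p, d, q); seasonal = list(order = c(P, D, Q), period = s)
fit <- arima(AirPassengers,
             order    = c(1, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
coef(fit)   # ar1, ma1, and sma1 estimates
AIC(fit)
```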
Data and Model Selection:
In the EDA phase, we performed differencing and seasonal differencing to transform the original series into stationary series. The stationarity of the differenced series was validated using the Augmented Dickey-Fuller (ADF) test. Therefore, we can now decide whether to apply the ARIMA or SARIMA model and select the appropriate values for d or D.
U.S. Dollar Index: First-order differencing, ARIMA model, d=1.
Trade Balance: Both seasonal and ordinary differencing, SARIMA model, d=1, D=1, s=4.
Global Commodity Price: First-order differencing, ARIMA model, d=1.
House Price: First-order differencing, ARIMA model, d=1.
International Visitors: Both seasonal and ordinary differencing, SARIMA model, d=1, D=1, s=12.
After performing the necessary ordinary or seasonal differencing, all series are stationary. We will now present the ACF and PACF plots of these differenced series to determine the optimal values for p (AR order), q (MA order), P (seasonal AR order), and Q (seasonal MA order).
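The differencing steps described above can be reproduced with base R's diff(); the logged AirPassengers series below is only an illustration of the mechanics, not one of our datasets:

```r
x    <- log(AirPassengers)        # example monthly series (s = 12)
d1   <- diff(x)                   # ordinary first difference (d = 1)
d12  <- diff(x, lag = 12)         # seasonal difference (D = 1)
both <- diff(diff(x, lag = 12))   # D = 1 followed by d = 1
length(x) - length(both)          # s + 1 = 13 observations are lost
```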
ACF and PACF Plots
The following displays the ACF and PACF plots for the stationary series after differencing.
library(gridExtra)
acf <- ggAcf(diff(dxy_ts), 50) + ggtitle("ACF Plot for USD Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(dxy_ts), 50) + ggtitle("PACF Plot for USD Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(diff(balance_ts, lag = 4))) + ggtitle("ACF Plot for Trade Balance") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(diff(balance_ts, lag = 4))) + ggtitle("PACF Plot for Trade Balance") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(gdp_ts)) + ggtitle("ACF Plot for GDP") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(gdp_ts)) + ggtitle("PACF Plot for GDP") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(unem_ts)) + ggtitle("ACF Plot for Unemployment Rate") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(unem_ts)) + ggtitle("PACF Plot for Unemployment Rate") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(diff(cpi_ts, lag = 12))) + ggtitle("ACF Plot for CPI") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(diff(cpi_ts, lag = 12))) + ggtitle("PACF Plot for CPI") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(sp5_ts), 50) + ggtitle("ACF Plot for S&P 500 Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(sp5_ts), 50) + ggtitle("PACF Plot for S&P 500 Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(xau_ts), 50) + ggtitle("ACF Plot for Gold Price") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(xau_ts), 50) + ggtitle("PACF Plot for Gold Price") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(gsci_ts), 50) + ggtitle("ACF Plot for S&P GSCI Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(gsci_ts), 50) + ggtitle("PACF Plot for S&P GSCI Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(house_ts)) + ggtitle("ACF Plot for House Price Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(house_ts)) + ggtitle("PACF Plot for House Price Index") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
acf <- ggAcf(diff(diff(visitors_ts, lag = 12))) + ggtitle("ACF Plot for Number of International Visitors") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
pacf <- ggPacf(diff(diff(visitors_ts, lag = 12))) + ggtitle("PACF Plot for Number of International Visitors") + theme_bw() +
  geom_segment(lineend = "butt", color = "#5a3196") + geom_hline(yintercept = 0, color = "#5a3196")
grid.arrange(acf, pacf, nrow = 2)
The ACF plot helps determine the q parameter by identifying the number of significant lags in the moving average (MA) component. The PACF plot helps determine the p parameter by identifying the number of significant lags in the autoregressive (AR) component. For seasonal models, ACF and PACF can also be used to determine the seasonal parameters Q and P.
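As a sanity check on this reading of the plots, a simulated AR(1) series (an illustration, not one of our datasets) shows the expected pattern: the lag-1 partial autocorrelation is significant while later PACF lags mostly fall inside the approximate 95% band:

```r
set.seed(42)
x <- arima.sim(model = list(ar = 0.7), n = 2000)   # simulated AR(1) process
p <- pacf(x, plot = FALSE)$acf
band <- 1.96 / sqrt(length(x))   # approximate 95% significance band
p[1] > band                      # lag-1 partial autocorrelation is significant
```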
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
constant 1e-04 1e-04 0.8257 0.409
sigma^2 estimated as 2.26922e-05 on 5034 degrees of freedom
AIC = -7.854818 AICc = -7.854818 BIC = -7.852226
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
ar1 0.0145 0.0141 1.0260 0.3049
constant 0.0001 0.0001 0.8118 0.4169
sigma^2 estimated as 2.268745e-05 on 5033 degrees of freedom
AIC = -7.85463 AICc = -7.854629 BIC = -7.850743
For both ARIMA(0,1,0) and ARIMA(1,1,0), the model diagnostics results are very similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, suggesting that the residuals are nearly stationary with a constant mean and finite variance over time.
The Autocorrelation Function (ACF) of the residuals shows little remaining autocorrelation.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
The Ljung-Box Test p-values are below the 0.05 significance level for lags greater than 10, implying that some autocorrelation remains in the residuals.
Since the AR(1) term in the ARIMA(1,1,0) model is not significant at the 10% level, and the ARIMA(0,1,0) model has lower AIC, AICc, and BIC values, we choose ARIMA(0,1,0) as the optimal model. Note that the intercept of the ARIMA(0,1,0) model is also insignificant.
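The Ljung-Box checks quoted throughout these diagnostics can be reproduced with base R's Box.test(); the AR(1) fit to the built-in lh series below is only a sketch of the mechanics (fitdf discounts the degrees of freedom used by the p + q estimated coefficients):

```r
fit <- arima(lh, order = c(1, 0, 0))   # illustrative AR(1) fit
res <- residuals(fit)
# p-values above 0.05 indicate no significant remaining autocorrelation
Box.test(res, lag = 10, type = "Ljung-Box", fitdf = 1)$p.value
```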
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
ma1 0.3844 0.1179 3.2618 0.0017
sma1 -1.0000 0.1739 -5.7491 0.0000
sigma^2 estimated as 186230777 on 72 degrees of freedom
AIC = 22.12471 AICc = 22.127 BIC = 22.21812
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
For both SARIMA(0,1,1)x(0,1,1)[4] and SARIMA(1,0,1)x(2,1,0)[4], the model diagnostics results are very similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, suggesting that the residuals are nearly stationary with a constant mean and finite variance over time.
The Autocorrelation Function (ACF) of the residuals shows no significant autocorrelation at any lag.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
The Ljung-Box Test p-values are all above the 0.05 significance level, implying that no autocorrelation remains in the residuals and that the model fits well.
Coefficient Significance: All model coefficients are significant.
Therefore, we decide to use SARIMA(0,1,1)x(0,1,1)[4] as the optimal model, since it has lower AIC, AICc, and BIC.
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
constant 212.9192 38.3573 5.5509 0
sigma^2 estimated as 114760 on 77 degrees of freedom
AIC = 14.53976 AICc = 14.54043 BIC = 14.60019
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
ma1 -0.921 0.0442 -20.8594 0
sigma^2 estimated as 113730.5 on 76 degrees of freedom
AIC = 14.5559 AICc = 14.55659 BIC = 14.61677
For both ARIMA(0,1,0) and ARIMA(0,2,1), the model diagnostics results are very similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, but there is a significant oscillation in 2020, indicating the need for a more advanced model to account for special events.
The Autocorrelation Function (ACF) of the residuals shows no significant autocorrelation at any lag.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
The Ljung-Box Test p-values are all above the 0.05 significance level, implying that no autocorrelation remains in the residuals and that the model fits well.
Since the MA(1) term in the ARIMA(0,2,1) model is significant at the 5% level, we decide to use ARIMA(0,2,1) as the optimal model.
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
constant -0.0015 0.006 -0.246 0.8059
sigma^2 estimated as 0.008057314 on 226 degrees of freedom
AIC = -1.965677 AICc = -1.965598 BIC = -1.935501
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
ar1 -0.5485 0.2146 -2.5556 0.0113
ma1 0.6962 0.1832 3.7993 0.0002
constant -0.0014 0.0064 -0.2196 0.8264
sigma^2 estimated as 0.007811542 on 224 degrees of freedom
AIC = -1.978774 AICc = -1.9783 BIC = -1.918423
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
ma1 0.1550 0.0722 2.1459 0.0329
constant -0.0014 0.0068 -0.2091 0.8345
sigma^2 estimated as 0.00789841 on 225 degrees of freedom
AIC = -1.976678 AICc = -1.976442 BIC = -1.931414
The model diagnostics results for all three models are similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, but there is a significant oscillation in 2020, indicating the need for a more advanced model to account for special events.
The Autocorrelation Function (ACF) of the residuals shows no significant autocorrelation at any lag.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
However, the Ljung-Box test results differ. For the ARIMA(0,1,0) model, the Ljung-Box Test p-values are above the 0.05 significance level only after lag 2, while for the other two models all p-values are above the threshold. This implies that no autocorrelation remains in the residuals of the latter two models, which therefore fit well.
Coefficient Significance: The AR and MA coefficients are significant, although the constant terms are not.
Since both the AR(1) and MA(1) terms in the ARIMA(1,1,1) model are significant, we choose this model as the optimal one.
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
The model diagnostics results for both models are similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, suggesting that the residuals are nearly stationary with a constant mean and finite variance over time.
The Autocorrelation Function (ACF) of the residuals shows little remaining autocorrelation.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
However, the Ljung-Box test results and coefficient significance are different. For the SARIMA(1,1,2)x(0,1,1)[12] model, the Ljung-Box Test p-values are all above the 0.05 significance level, and all coefficients are significant. For the SARIMA(2,2,3)x(1,0,0)[12] model, the Ljung-Box Test p-values are above the 0.05 significance level only after lag 20, and the majority of the coefficients are not significant.
Therefore, we choose the SARIMA(1,1,2)x(0,1,1)[12] model as the optimal one.
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
The model diagnostics results for both models are very similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, but the magnitude of the residuals after 2020 is noticeably larger, indicating the need for a more advanced model to account for special events.
The Autocorrelation Function (ACF) of the residuals shows little remaining autocorrelation.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
The Ljung-Box test results for both models are unsatisfactory, as the p-values fall below the 0.05 significance level. This indicates that autocorrelation remains in the residuals and that the models need improvement.
Coefficient Significance: All model coefficients are significant.
Since the ARIMA(4,1,4) model has lower AIC, AICc and BIC, we choose it as the optimal model.
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
constant 0.0018 7e-04 2.4406 0.0148
sigma^2 estimated as 0.0005449249 on 1038 degrees of freedom
AIC = -4.673136 AICc = -4.673132 BIC = -4.663615
This model is chosen by both manual search and auto.arima(). The diagnostic results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, suggesting that the residuals are nearly stationary with a constant mean and finite variance over time.
The Autocorrelation Function (ACF) of the residuals shows little remaining autocorrelation.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
The Ljung-Box Test p-values are all above the 0.05 significance level, implying that no autocorrelation remains in the residuals and that the model fits well.
Therefore, the ARIMA(0,1,0) model is the optimal model.
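The manual search mentioned above can be sketched as a small AIC grid over candidate orders using base R's arima(); auto.arima() in the forecast package automates this with smarter stepwise rules. The DAX log-returns series here is a stand-in for illustration, not one of our datasets:

```r
x <- diff(log(EuStockMarkets[, "DAX"]))   # illustrative stationary series
best <- NULL
for (p in 0:2) for (q in 0:2) {
  # skip candidates that fail to converge
  fit <- tryCatch(arima(x, order = c(p, 0, q)), error = function(e) NULL)
  if (!is.null(fit) && (is.null(best) || AIC(fit) < AIC(best))) best <- fit
}
best$arma[1:2]   # (p, q) of the AIC-minimising candidate
```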
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
constant 1e-04 3e-04 0.4152 0.678
sigma^2 estimated as 0.0002048122 on 2534 degrees of freedom
AIC = -5.653962 AICc = -5.653961 BIC = -5.649356
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
sar1 0.0073 0.0205 0.3552 0.7225
constant 0.0001 0.0003 0.4089 0.6826
sigma^2 estimated as 0.000204801 on 2533 degrees of freedom
AIC = -5.653223 AICc = -5.653221 BIC = -5.646314
The model diagnostics results for both models are similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, but the magnitude of the residuals around 2020 is noticeably larger, indicating the need for a more advanced model to account for special events.
The Autocorrelation Function (ACF) of the residuals shows little remaining autocorrelation.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
However, the Ljung-Box test results are different. For the ARIMA(0,1,0) model, most of the Ljung-Box Test p-values are above the 0.05 significance level. For the SARIMA(0,1,0)x(1,0,0)[252] model, only the p-values after lag 100 are above the threshold, implying that autocorrelation remains in the residuals of the latter model. Moreover, its seasonal AR(1) term is not significant.
Therefore, we choose the ARIMA(0,1,0) model as the optimal one.
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
The model diagnostics results for all three models are similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, but there is a significant oscillation in 2020, indicating the need for a more advanced model to account for special events.
The Autocorrelation Function (ACF) of the residuals shows no significant autocorrelation at any lag.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
However, the Ljung-Box test results are different. For the ARIMA(3,1,0) model, the Ljung-Box Test p-values are above the 0.05 significance level only after lag 6, while for the other two models all p-values are above the threshold. This implies that no autocorrelation remains in the residuals of the latter two models, which therefore fit well.
Coefficient significance: For the ARIMA(3,1,0) model, all three coefficients are significant. For the ARIMA(3,1,2) model, only the coefficient of the MA1 term is insignificant, while the other four are significant. For the ARIMA(2,2,2) model, only the coefficient of the AR2 term is significant, and the other three are insignificant.
Therefore, we choose the ARIMA(3,1,2) model as the optimal one.
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
Coefficients:
Estimate SE t.value p.value
ma1 0.2515 0.0667 3.7693 2e-04
sma1 -1.0000 0.0919 -10.8779 0e+00
sigma^2 estimated as 0.03367981 on 225 degrees of freedom
AIC = -0.3681259 AICc = -0.3678899 BIC = -0.3228623
start_line <- grep("Coefficients", model_output)  # Locate where coefficient details start
end_line <- length(model_output)                  # Last line of output
cat(model_output[start_line:end_line], sep = "\n")
The model diagnostics results for both models are similar. The results are as follows:
The Residual Plot shows nearly consistent fluctuation around zero, but there is a significant oscillation in 2020, indicating the need for a more advanced model to account for special events.
The Autocorrelation Function (ACF) of the residuals shows little remaining autocorrelation.
The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.
However, the Ljung-Box test results are different. For the SARIMA(0,1,1)x(0,1,1)[12] model, the Ljung-Box Test p-values are all above the 0.05 significance level. For the SARIMA(2,0,2)x(0,0,2)[12] model, all p-values are below the threshold.
Therefore, we choose the SARIMA(0,1,1)x(0,1,1)[12] model as the optimal model, where all coefficients are significant.
The equation of the best model for each time series:
U.S. Dollar Index:
ARIMA(0,1,0):
\[
(1 - B)x_t = w_t
\]
where \(x_t\) is the original time series and \(w_t\) is the Gaussian white noise process.
fit <- Arima(dxy_ts, order = c(0,1,0), include.drift = TRUE)
forecast_result <- forecast(fit, h = 252)
autoplot(forecast_result) +
  labs(title = "ARIMA(0,1,0) Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(balance_ts, order = c(0,1,1), seasonal = list(order = c(0,1,1), period = 4))
forecast_result <- forecast(fit, h = 8)
autoplot(forecast_result) +
  labs(title = "SARIMA(0,1,1)x(0,1,1)[4] Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(gdp_ts, order = c(0,2,1))
forecast_result <- forecast(fit, h = 12)
autoplot(forecast_result) +
  labs(title = "ARIMA(0,2,1) Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(unem_ts, order = c(1,1,1))
forecast_result <- forecast(fit, h = 12)
autoplot(forecast_result) +
  labs(title = "ARIMA(1,1,1) Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(cpi_ts, order = c(1,1,2), seasonal = list(order = c(0,1,1), period = 12))
forecast_result <- forecast(fit, h = 36)
autoplot(forecast_result) +
  labs(title = "SARIMA(1,1,2)x(0,1,1)[12] Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(sp5_ts, order = c(4,1,4), include.drift = TRUE)
forecast_result <- forecast(fit, h = 252)
autoplot(forecast_result) +
  labs(title = "ARIMA(4,1,4) Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(xau_ts, order = c(0,1,0), include.drift = TRUE)
forecast_result <- forecast(fit, h = 52)
autoplot(forecast_result) +
  labs(title = "ARIMA(0,1,0) Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(gsci_ts, order = c(0,1,0), include.drift = TRUE)
forecast_result <- forecast(fit, h = 252)
autoplot(forecast_result) +
  labs(title = "ARIMA(0,1,0) Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(house_ts, order = c(3,1,2))
forecast_result <- forecast(fit, h = 12)
autoplot(forecast_result) +
  labs(title = "ARIMA(3,1,2) Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
fit <- Arima(visitors_ts, order = c(0,1,1), seasonal = list(order = c(0,1,1), period = 12))
forecast_result <- forecast(fit, h = 36)
autoplot(forecast_result) +
  labs(title = "SARIMA(0,1,1)x(0,1,1)[12] Forecast", x = "Time", y = "Predicted Values") +
  theme_minimal()
The forecast plots for all series show that our optimal models perform well. The prediction intervals, shown as the shaded blue areas, represent the range within which each series is expected to move; their widening as the forecast horizon extends reflects increasing uncertainty.
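This widening can be checked directly: under an ARIMA(0,1,0) (random-walk) fit, the h-step forecast standard error grows like the square root of h. A base-R sketch with the built-in Nile series (an illustration, not one of our datasets):

```r
fit  <- arima(Nile, order = c(0, 1, 0))   # random-walk model
pred <- predict(fit, n.ahead = 10)
all(diff(pred$se) > 0)   # standard errors rise monotonically with horizon
```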
Comparison with Benchmark Methods
We first write a helper function that prints the accuracy metrics and plots the ARIMA forecast against the Mean, Naïve, SNaïve, and Drift benchmark methods.
plot_forecasts <- function(forecast_result, ts, h, fit) {
  print(accuracy(forecast_result))
  # Plot the forecasts using Mean, Naïve, SNaïve, Drift Methods, and the ARIMA Fit
  autoplot(ts) +
    autolayer(meanf(ts, h = h), series = "Mean", PI = FALSE) +
    autolayer(naive(ts, h = h), series = "Naïve", PI = FALSE) +
    autolayer(snaive(ts, h = h), series = "SNaïve", PI = FALSE) +
    autolayer(rwf(ts, drift = TRUE, h = h), series = "Drift", PI = FALSE) +
    autolayer(forecast(fit, h = h), series = "Fit", PI = FALSE) +
    xlab("Date") + ylab("Predicted Values") +
    guides(colour = guide_legend(title = "Forecast Methods")) +
    theme_minimal()
}
fit <- Arima(dxy_ts, order = c(0,1,0), include.drift = TRUE)
forecast_result <- forecast(fit, h = 252)
plot_forecasts(forecast_result, dxy_ts, 252, fit)
ME RMSE MAE MPE MAPE
Training set 8.733295e-07 0.004763563 0.003531088 -5.079e-05 0.07875213
MASE ACF1
Training set 0.05939571 0.01503592
fit <- Arima(balance_ts, order = c(0,1,1), seasonal = list(order = c(0,1,1), period = 4))
forecast_result <- forecast(fit, h = 8)
plot_forecasts(forecast_result, balance_ts, 8, fit)
ME RMSE MAE MPE MAPE MASE
Training set -759.8004 13207.78 9733.733 0.4284101 6.512908 0.4244237
ACF1
Training set -0.02815605
fit <- Arima(gdp_ts, order = c(0,2,1))
forecast_result <- forecast(fit, h = 12)
plot_forecasts(forecast_result, gdp_ts, 12, fit)
ME RMSE MAE MPE MAPE MASE ACF1
Training set 32.67852 332.9493 147.9156 0.1147268 0.756791 0.1612916 -0.1421698
fit <- Arima(unem_ts, order = c(1,1,1))
forecast_result <- forecast(fit, h = 12)
plot_forecasts(forecast_result, unem_ts, 12, fit)
ME RMSE MAE MPE MAPE MASE
Training set -0.001293724 0.08819836 0.03253087 -0.1672727 1.841263 0.1628627
ACF1
Training set 0.0003963382
fit <- Arima(cpi_ts, order = c(1,1,2), seasonal = list(order = c(0,1,1), period = 12))
forecast_result <- forecast(fit, h = 36)
plot_forecasts(forecast_result, cpi_ts, 36, fit)
ME RMSE MAE MPE MAPE MASE
Training set 0.01658077 0.6646226 0.4823797 0.00423208 0.2025314 0.07696065
ACF1
Training set 0.02124335
fit <- Arima(sp5_ts, order = c(4,1,4), include.drift = TRUE)
forecast_result <- forecast(fit, h = 252)
plot_forecasts(forecast_result, sp5_ts, 252, fit)
ME RMSE MAE MPE MAPE MASE
Training set -0.02174175 27.98292 17.39177 -0.02943003 0.7786464 0.05055928
ACF1
Training set -0.01040333
fit <- Arima(xau_ts, order = c(0,1,0), include.drift = TRUE)
forecast_result <- forecast(fit, h = 52)
plot_forecasts(forecast_result, xau_ts, 52, fit)
ME RMSE MAE MPE MAPE MASE
Training set 5.803826e-06 0.02333315 0.01745149 0.0001973932 0.2473152 0.122605
ACF1
Training set 0.003597321
fit <- Arima(gsci_ts, order = c(0,1,0), include.drift = TRUE)
forecast_result <- forecast(fit, h = 252)
plot_forecasts(forecast_result, gsci_ts, 252, fit)
ME RMSE MAE MPE MAPE
Training set 2.379987e-06 0.01430894 0.01014429 -0.000263987 0.1665067
MASE ACF1
Training set 0.05461529 -0.001120247
fit <- Arima(house_ts, order = c(3,1,2))
forecast_result <- forecast(fit, h = 12)
plot_forecasts(forecast_result, house_ts, 12, fit)
ME RMSE MAE MPE MAPE MASE
Training set 0.520081 4.429123 2.728648 0.1197514 0.6335488 0.1106083
ACF1
Training set -0.01834026
fit <- Arima(visitors_ts, order = c(0,1,1), seasonal = list(order = c(0,1,1), period = 12))
forecast_result <- forecast(fit, h = 36)
plot_forecasts(forecast_result, visitors_ts, 36, fit)
ME RMSE MAE MPE MAPE MASE
Training set -0.0005064716 0.178516 0.06267839 -0.01008101 0.4338407 0.2188027
ACF1
Training set -0.00867727
Our models generally outperform the benchmark methods, although the seasonal naïve method sometimes captures seasonality better. Comparing our forecasts with those of the benchmark methods shows that our models offer more precise and reliable predictions, and the accuracy metrics (ME, RMSE, MAE, MPE, MAPE, MASE, ACF1) support this conclusion.
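The accuracy tables above are computed on the training set; a stricter comparison holds out the end of a series and scores each method out of sample. A base-R sketch on the built-in AirPassengers series (the series and the airline-style orders are illustrative, not our data or final models):

```r
x     <- AirPassengers
train <- window(x, end = c(1959, 12))          # all but the last year
test  <- window(x, start = c(1960, 1))         # 12-month holdout
naive_fc <- rep(tail(train, 1), length(test))  # naive (last-value) benchmark
fit <- arima(train, order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
sarima_fc <- predict(fit, n.ahead = length(test))$pred
rmse <- function(e) sqrt(mean(e^2))
c(naive = rmse(test - naive_fc), sarima = rmse(test - sarima_fc))
```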