Multivariate TS Models

Multivariate Analysis

Building on the data and the earlier univariate analysis, this section turns to multivariate analysis. Our data science questions include:

What factors interact with the US Dollar Index?
What exogenous variables affect the US Dollar Index?
What factors are influenced by the US Dollar Index?

Before constructing multivariate time series models, we review the literature (see the Introduction) and analyze, from an economic perspective, the mechanisms that link the variables.

Interacting Variables

Trade Balance

Trade Deficit → Dollar Depreciation: Higher imports than exports lead to a trade deficit, requiring capital inflows but increasing dollar supply, causing depreciation.
Dollar Depreciation → Trade Deficit Reduction: A weaker dollar boosts US exports and reduces imports, helping balance the deficit.
The expansion of the trade deficit leads to the depreciation of the US dollar.

A trade deficit means that the US imports more than it exports. Under a simplified Balance of Payments Identity (Current Account + Financial Account = 0, ignoring the small capital account and the statistical discrepancy), a trade deficit, which usually implies a negative current account, must be matched by a positive financial account, that is, by net capital inflows. Typically, the US issues government bonds and attracts foreign investment to finance the trade deficit. However, paying for imports increases the supply of dollars in the foreign exchange market, which puts downward pressure on the dollar. Foreign investors may also become concerned about the US fiscal situation, which reduces demand for the dollar and deepens the depreciation.

The depreciation of the dollar leads to a reduction in the trade deficit.

When the dollar depreciates, US goods and services become relatively cheaper in international markets, which may increase US exports. At the same time, imported goods become relatively expensive, leading to a reduction in US imports. Therefore, the depreciation of the dollar generally reduces the trade deficit.

Global Commodity Market

A stronger dollar leads to a decrease in global commodity prices.

Since the dollar is the global reserve and trading currency, commodities (such as oil, gold, copper, aluminum, etc.) are typically priced in dollars. When the dollar appreciates, the cost for holders of other currencies to buy dollar-priced commodities increases, which leads to a reduction in demand and lowers commodity prices. Additionally, investors may shift funds from physical assets like commodities into dollar-denominated assets, further driving down commodity prices.
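To make the pricing channel concrete, here is a small arithmetic sketch in R. The oil price and the exchange rates are made-up illustrative numbers, not data from this project.

```r
# Hypothetical numbers for illustration only (not taken from the project data)
price_usd <- 80                 # price of one barrel of oil, in US dollars
usd_per_eur_before <- 1.10      # 1 euro buys 1.10 dollars
usd_per_eur_after  <- 1.00      # dollar appreciates: 1 euro now buys 1.00 dollars

# Cost of the same barrel for a buyer who holds euros
cost_eur_before <- price_usd / usd_per_eur_before
cost_eur_after  <- price_usd / usd_per_eur_after

# Percentage change in the euro cost while the dollar price is unchanged
pct_change <- 100 * (cost_eur_after / cost_eur_before - 1)

round(c(before = cost_eur_before, after = cost_eur_after, pct_change = pct_change), 2)
```

With the dollar price held fixed, the euro cost rises from about 72.73 to 80.00, a 10% increase, which is the demand-reducing effect described above.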
Changes in commodity prices also influence the US Dollar Index.

A rise in commodity prices, especially oil and metal prices, often puts upward pressure on global inflation. This changes expectations for monetary policy and, in turn, affects the Dollar Index. Moreover, commodity price fluctuations often reflect changes in the global economy. For example, a sharp increase in oil prices may signal supply disruptions or supply-demand imbalances and can weigh on global economic growth. These factors could lead to higher demand for the dollar as a safe-haven currency, thus pushing up the Dollar Index.

Stock Market

The risk and return associated with stocks are relatively high, and the dollar can serve as a safe-haven asset. Therefore, there is a competitive relationship between the two. Generally speaking, the stock market and the Dollar Index have an inverse relationship. When global investors have a higher risk appetite or when the stock market is strong, capital tends to flow into the stock market. This reduces demand for the dollar and leads to a depreciation of the dollar. On the other hand, when the dollar appreciates, investors may prefer to move funds into low-risk assets like bonds. This leads to capital outflows from the stock market and puts downward pressure on it, especially on overvalued stocks. However, a stronger dollar may also reflect a robust economy, which could push stock prices higher.

Gold Market

The dollar and gold also tend to show inverse fluctuations due to their competitive relationship.
A stronger dollar leads to a decrease in gold prices.

First, gold prices are usually denominated in dollars. When the dollar appreciates, the cost of gold in other currencies increases, reducing demand and lowering gold prices. Second, a stronger dollar may result from the Federal Reserve raising interest rates. In that case, investors tend to move into higher-yielding assets (such as dollar-denominated bonds) and reduce holdings of non-interest-bearing assets like gold, leading to a decline in gold prices.

An increase in gold prices leads to the depreciation of the dollar.

When global economic uncertainty rises or confidence in the dollar falls, investors may stop treating the dollar as the preferred safe-haven asset and turn to gold instead. Buying gold as a hedge against inflation or currency devaluation lowers demand for the dollar and puts further downward pressure on it.

GDP

Economic growth drives dollar appreciation.

When a country’s GDP grows strongly, it typically reflects good economic performance and attracts foreign investors’ attention. This capital inflow (e.g., foreign direct investment, securities investment) increases demand for the country’s currency, which in our case is the US dollar, leading to dollar appreciation.

Unemployment Rate

A higher unemployment rate leads to dollar depreciation.
- Mechanism 1
  
  The unemployment rate mainly affects the economy, which in turn influences the Dollar Index. There is an inverse relationship between the unemployment rate and GDP growth. A higher unemployment rate typically indicates an economic recession or depression, where businesses reduce hiring, which is often accompanied by a decrease in consumption and investment. This reduces the ability to attract foreign investment into the US, leading to reduced demand for dollar-denominated assets and a depreciation of the dollar.
- Mechanism 2
  
  When the unemployment rate is high, the Federal Reserve may adopt loose monetary policies (e.g., lowering interest rates or quantitative easing) to stimulate the economy and reduce unemployment. Such loose monetary policies typically lead to dollar depreciation because lower interest rates reduce the relative attractiveness of the dollar, making it less likely to attract foreign capital inflows, thereby decreasing demand for the dollar.

CPI

An increase in the CPI drives dollar appreciation.

When the CPI rises, it indicates inflation. In this case, the Federal Reserve may adopt a tightening monetary policy, such as raising interest rates, to control inflation. Higher interest rates make the dollar more attractive because they can attract foreign capital inflows, increasing demand for the dollar and thus driving up its value.

Expectations are also a factor. If inflation expectations rise, markets may anticipate that the Federal Reserve will take more tightening measures, which could lead to dollar appreciation.

Exogenous Variables

Real Estate Market

A stronger dollar typically indicates a relatively strong US economy, in which interest rates may rise. Higher mortgage rates increase borrowing costs, which can suppress demand for housing and lead to a decline in housing prices.

However, a stronger dollar can also boost confidence among homebuyers and global investors. This leads to increased capital inflows into US real estate, pushing housing prices up.

The net effect on housing prices is therefore ambiguous and depends on which channel dominates.

Tourism

When the dollar appreciates, foreign tourists need to exchange more of their local currency for dollars, which increases their travel costs. Some tourists may opt for other travel destinations, leading to a decrease in the number of tourists visiting the US. Therefore, US tourism may decline with a stronger dollar.

Code

library(tidyverse)
library(ggplot2)
library(forecast)
library(astsa) 
library(xts)
library(zoo)
library(tseries)
library(fpp2)
library(fma)
library(lubridate)
library(TSstudio)
library(quantmod)
library(tidyquant)
library(plotly)
library(readr)
library(dplyr)
library(kableExtra)
library(knitr)
library(patchwork)
library(vars)

# Load data
invisible(getSymbols("DX-Y.NYB", src = "yahoo", from = "2005-01-01", to = "2024-12-31"))
dxy <- data.frame(Date = index(`DX-Y.NYB`), 
                       Open = `DX-Y.NYB`[, "DX-Y.NYB.Open"], 
                       High = `DX-Y.NYB`[, "DX-Y.NYB.High"], 
                       Low = `DX-Y.NYB`[, "DX-Y.NYB.Low"], 
                       Close = `DX-Y.NYB`[, "DX-Y.NYB.Close"])
colnames(dxy) <- c("Date", "Open", "High", "Low", "Close")
dxy <- na.omit(dxy)

bea <- read.csv("./data/bea.csv")
bea$time <- as.Date(bea$time)

gdp <- read.csv("./data/gdp.csv")
gdp$time <- as.Date(gdp$time)
gdp$total <- gdp$consumption + gdp$investment + gdp$net_export + gdp$government

data_unem <- read.csv("./data/unem.csv", header=TRUE)
data_unem$time <- as.Date(data_unem$time)

data_cpi <- read.csv("./data/cpi.csv", header=TRUE)
data_cpi$time <- as.Date(data_cpi$time)

invisible(getSymbols("^GSPC", src = "yahoo", from = "2005-01-01", to = "2024-12-31"))
invisible(getSymbols("BTC-USD", src = "yahoo", from = "2015-01-01", to = "2024-12-31"))

xau <- read.csv("./data/xau.csv")
xau$Date <- as.Date(xau$Date)

gsci <- read.csv("./data/gsci.csv")[2:2518,]
gsci$Date <- as.Date(gsci$Date)

house <- read.csv("./data/house.csv", header=TRUE)
house$time <- as.Date(house$time)

visitors <- read.csv("./data/visitors.csv", header=TRUE)
visitors$time <- as.Date(visitors$time)

egg <- read.csv("./data/eggs_price.csv")
egg$Date <- as.Date(egg$Date)

ir <- read.csv("./data/interest_rate.csv")
ir$Date <- as.Date(ir$Date)

oil <- read.csv("./data/oil_price.csv")
oil$Date <- as.Date(oil$Date)

set.seed(123)

# ARIMA function: grid search over (p, 1, q) orders, collecting AIC, BIC, and AICc
ARIMA.c = function(p1, p2, q1, q2, data) {
  d = 1
  i = 1
  temp = data.frame()
  ls = matrix(rep(NA, 6 * 100), nrow = 100)
  
  for (p in p1:p2) {
    for (q in q1:q2) {
      if (p + d + q <= 9) {
        
        model <- tryCatch({
          Arima(data, order = c(p, d, q), include.drift = TRUE)
        }, error = function(e) {
          return(NULL)
        })
        
        if (!is.null(model)) {
          ls[i, ] = c(p, d, q, model$aic, model$bic, model$aicc)
          i = i + 1
        }
      }
    }
  }
  
  temp = as.data.frame(ls)
  names(temp) = c("p", "d", "q", "AIC", "BIC", "AICc")
  temp = na.omit(temp)
  return(temp)
}

# SARIMA function
SARIMA.c = function(p1, p2, q1, q2, P1, P2, Q1, Q2, s, data) {
  d = 1
  D = 1
  i = 1
  temp = data.frame()
  ls = matrix(rep(NA, 9 * 100), nrow = 100)
  
  for (p in p1:p2) {
    for (q in q1:q2) {
      for (P in P1:P2) {
        for (Q in Q1:Q2) {
          if (p + d + q + P + D + Q <= 9) {
            
            model <- tryCatch({
              Arima(data, order = c(p, d, q), seasonal = list(order = c(P,D,Q), period = s))
            }, error = function(e) {
              return(NULL)
            })
            
            if (!is.null(model)) {
              ls[i, ] = c(p, d, q, P, D, Q, model$aic, model$bic, model$aicc)
              i = i + 1
            }
          }
        }
      }
    }
  }
  
  temp = as.data.frame(ls)
  names(temp) = c("p", "d", "q", "P", "D", "Q", "AIC", "BIC", "AICc")
  temp = na.omit(temp)
  return(temp)
}

highlight_output = function(output, type="ARIMA") {
    highlight_row <- c(which.min(output$AIC), which.min(output$BIC), which.min(output$AICc))
    knitr::kable(output, align = 'c', caption = paste("Comparison of", type, "Models")) %>%
    kable_styling(full_width = FALSE, position = "center") %>%
    row_spec(highlight_row, bold = TRUE, background = "#FFFF99")  # Highlight row in yellow
}

# Define a function to fit SARIMA and handle errors
fit_sarima <- function(xtrain, p, d, q, P, D, Q, s) {
  fit <- tryCatch({
    Arima(xtrain, order = c(p, d, q), seasonal = list(order = c(P, D, Q), period = s),
          include.drift = FALSE, lambda = 0, method = "ML")
  }, error = function(e) {
    return(NULL)  # Return NULL if an error occurs
  })
  return(fit)  # Return the fitted model (or NULL)
}
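The helper functions above share one pattern: loop over candidate orders, fit each model inside `tryCatch`, and collect information criteria for comparison. The following is a minimal, self-contained sketch of that pattern on simulated data. It swaps in base R's `arima()` (which has no drift term) for `forecast::Arima`, and the simulated series and the object name `grid` are assumptions made only for illustration.

```r
set.seed(123)

# Simulated ARIMA(1,1,0) series, used only to exercise the grid search
y <- arima.sim(model = list(order = c(1, 1, 0), ar = 0.6), n = 200)

# Candidate (p, q) pairs with d fixed at 1, as in ARIMA.c
grid <- expand.grid(p = 0:2, q = 0:2)
grid$AIC <- NA_real_
grid$BIC <- NA_real_

for (i in seq_len(nrow(grid))) {
  # A failed fit returns NULL and leaves NA in that row
  fit_i <- tryCatch(
    arima(y, order = c(grid$p[i], 1, grid$q[i]), method = "ML"),
    error = function(e) NULL
  )
  if (!is.null(fit_i)) {
    grid$AIC[i] <- AIC(fit_i)
    grid$BIC[i] <- BIC(fit_i)
  }
}

# Drop failed fits, then pick the rows that minimise each criterion
grid <- na.omit(grid)
best_aic <- grid[which.min(grid$AIC), ]
best_bic <- grid[which.min(grid$BIC), ]
best_aic
best_bic
```

AIC and BIC need not agree on the best order; BIC penalises extra parameters more heavily and tends to pick the smaller model.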

VAR: USD ~ Trade Deficit + Global Commodity Price + CPI

Here, we analyze the interactions among the US Dollar Index, the trade deficit, global commodity prices, and CPI, using log-transformed quarterly data starting in 2015 Q1.
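For reference, a VAR(p) with a constant and a linear trend, which is what `type = "both"` requests in the `vars` package, can be written as follows. The symbols $c$, $\delta$, $A_i$, and $u_t$ are notation chosen here for illustration and do not come from the original text.

```latex
\[
y_t = c + \delta\, t + \sum_{i=1}^{p} A_i\, y_{t-i} + u_t,
\qquad
y_t =
\begin{pmatrix}
\log \mathrm{USD}_t \\
\log \mathrm{Deficit}_t \\
\log \mathrm{CommodityPrice}_t \\
\log \mathrm{CPI}_t
\end{pmatrix}
\]
```

Each $A_i$ is a $4 \times 4$ coefficient matrix, so every additional lag adds 4 coefficients to each of the 4 equations.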

Code

dxy_m <- dxy %>%
  mutate(month = floor_date(Date, "month")) %>% 
  group_by(month) %>%
  summarise(Close = mean(Close, na.rm = TRUE))
dxy_q <- dxy %>%
  mutate(quarter = floor_date(Date, "quarter")) %>% 
  group_by(quarter) %>%
  summarise(Close = mean(Close, na.rm = TRUE))
gsci_q <- gsci %>%
  mutate(quarter = floor_date(Date, "quarter")) %>%
  group_by(quarter) %>%
  summarise(gsci = mean(Price, na.rm = TRUE))
cpi_q <- data_cpi %>%
  mutate(quarter = floor_date(time, "quarter")) %>%
  group_by(quarter) %>%
  summarise(cpi = mean(cpi, na.rm = TRUE))

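# Positional alignment: rows 41:80 of the quarterly series are taken to be
# 2015 Q1 to 2024 Q4 (assuming each starts in 2005 Q1), matching the span of gsci_q.
df1 <- data.frame(Date = gsci_q$quarter, 
                  USD = log(dxy_q$Close)[41:80], 
                  Deficit = log(abs(bea$balance))[41:80],  # log of the deficit's magnitude
                  CommodityPrice = log(gsci_q$gsci),
                  CPI = log(cpi_q$cpi)[41:80]
                  )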

plot_usd <- plot_ly(df1, x = ~Date, y = ~USD, type = 'scatter', mode = 'lines', name = 'U.S. Dollar')
plot_deficit <- plot_ly(df1, x = ~Date, y = ~Deficit, type = 'scatter', mode = 'lines', name = 'Trade Deficit') 
plot_gsci <- plot_ly(df1, x = ~Date, y = ~CommodityPrice, type = 'scatter', mode = 'lines', name = 'Commodity Price') 
plot_cpi <- plot_ly(df1, x = ~Date, y = ~CPI, type = 'scatter', mode = 'lines', name = 'CPI') 

subplot(plot_usd, plot_deficit, plot_gsci, plot_cpi, nrows = 4, shareX = TRUE) %>%
  layout(title = "Trend of U.S. Dollar and Related Variables", showlegend = FALSE,
    xaxis = list(title = 'Date'),
    yaxis = list(title = 'U.S. Dollar'),
    yaxis2 = list(title = 'Trade Deficit'),
    yaxis3 = list(title = 'Commodity Price'),
    yaxis4 = list(title = 'CPI'))
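The data frame above lines the series up by position (`[41:80]`), which would silently misalign the columns if any source file gained or lost a row. Below is a minimal sketch of aligning on the quarter key instead. The toy data frames `usd_q` and `gsci_demo` and their values are hypothetical stand-ins, not the project's objects.

```r
# Hypothetical toy inputs: a longer quarterly series and a shorter one
usd_q <- data.frame(
  quarter = seq(as.Date("2014-01-01"), by = "quarter", length.out = 8),
  usd     = c(80, 81, 83, 88, 96, 95, 96, 98)
)
gsci_demo <- data.frame(
  quarter = seq(as.Date("2015-01-01"), by = "quarter", length.out = 4),
  gsci    = c(400, 430, 380, 330)
)

# Inner join on the date key: only quarters present in both inputs survive
aligned <- merge(usd_q, gsci_demo, by = "quarter")
aligned <- aligned[order(aligned$quarter), ]

# Log-transform after alignment, mirroring the construction of df1
aligned$usd  <- log(aligned$usd)
aligned$gsci <- log(aligned$gsci)
aligned
```

Joining on the key makes the overlap explicit: the result keeps the four 2015 quarters regardless of where each input series starts.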

Code

ts_df1 <- ts(df1[,-1], start = c(2015,1), frequency = 4)
VARselect(ts_df1, lag.max=6, type="both")

$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     6      6      1      2 

$criteria
                   1             2             3             4             5
AIC(n) -2.787866e+01 -2.851355e+01 -2.862475e+01 -2.831302e+01 -2.893424e+01
HQ(n)  -2.751123e+01 -2.790116e+01 -2.776740e+01 -2.721072e+01 -2.758698e+01
SC(n)  -2.680123e+01 -2.671783e+01 -2.611074e+01 -2.508073e+01 -2.498366e+01
FPE(n)  7.923879e-13  4.444468e-13  4.556769e-13  8.165450e-13  7.276663e-13
                   6
AIC(n) -2.997350e+01
HQ(n)  -2.838128e+01
SC(n)  -2.530463e+01
FPE(n)  6.699754e-13

AIC and HQ select 6 lags, SC selects 1 lag, and FPE selects 2 lags, so the candidate lag lengths are 1, 2, and 6.
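One practical consideration when weighing these candidates is how many parameters each equation must estimate from only 40 quarterly observations. The arithmetic below is a sketch of that count for a 4-variable VAR with a constant and a trend. The helper name `var_df` is made up for illustration.

```r
# Residual degrees of freedom per equation in a VAR with constant and trend.
# n_obs: total observations, k: number of endogenous variables, p: lag order.
var_df <- function(n_obs, k, p) {
  n_used   <- n_obs - p        # the first p observations are lost to lagging
  n_params <- k * p + 2        # k * p lag coefficients + const + trend
  n_used - n_params
}

df_by_lag <- sapply(c(`p=1` = 1, `p=2` = 2, `p=6` = 6), var_df, n_obs = 40, k = 4)
df_by_lag
```

This gives 33, 28, and 8 residual degrees of freedom, the same values reported in the VAR summaries in this section. With only 8 degrees of freedom, the 6-lag model leaves little information for estimating each coefficient.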

Code

summary(fit <- VAR(ts_df1, p=1, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, Deficit, CommodityPrice, CPI 
Deterministic variables: both 
Sample size: 39 
Log Likelihood: 342.132 
Roots of the characteristic polynomial:
0.8721 0.8113 0.8113 0.3509
Call:
VAR(y = ts_df1, p = 1, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + const + trend 

                   Estimate Std. Error t value Pr(>|t|)    
USD.l1             0.667614   0.124027   5.383 5.96e-06 ***
Deficit.l1         0.004326   0.026047   0.166   0.8691    
CommodityPrice.l1  0.069535   0.029550   2.353   0.0247 *  
CPI.l1             0.248191   0.232994   1.065   0.2945    
const             -0.289282   0.914592  -0.316   0.7538    
trend             -0.002214   0.001532  -1.445   0.1578    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.02053 on 33 degrees of freedom
Multiple R-Squared: 0.8444, Adjusted R-squared: 0.8208 
F-statistic: 35.81 on 5 and 33 DF,  p-value: 2.07e-12 


Estimation results for equation Deficit: 
======================================== 
Deficit = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + const + trend 

                   Estimate Std. Error t value Pr(>|t|)  
USD.l1            -1.498555   0.747045  -2.006   0.0531 .
Deficit.l1         0.421178   0.156890   2.685   0.0113 *
CommodityPrice.l1  0.030080   0.177986   0.169   0.8668  
CPI.l1             1.097922   1.403382   0.782   0.4396  
const              7.397317   5.508813   1.343   0.1885  
trend              0.005873   0.009226   0.637   0.5288  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.1237 on 33 degrees of freedom
Multiple R-Squared: 0.8249, Adjusted R-squared: 0.7984 
F-statistic: 31.09 on 5 and 33 DF,  p-value: 1.403e-11 


Estimation results for equation CommodityPrice: 
=============================================== 
CommodityPrice = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + const + trend 

                   Estimate Std. Error t value Pr(>|t|)    
USD.l1            -1.150435   0.516991  -2.225    0.033 *  
Deficit.l1         0.171882   0.108576   1.583    0.123    
CommodityPrice.l1  0.703631   0.123175   5.712 2.25e-06 ***
CPI.l1             0.810695   0.971207   0.835    0.410    
const              0.570008   3.812359   0.150    0.882    
trend             -0.002896   0.006385  -0.454    0.653    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.08559 on 33 degrees of freedom
Multiple R-Squared: 0.8813, Adjusted R-squared: 0.8633 
F-statistic: 48.98 on 5 and 33 DF,  p-value: 2.532e-14 


Estimation results for equation CPI: 
==================================== 
CPI = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + const + trend 

                    Estimate Std. Error t value Pr(>|t|)    
USD.l1            -0.0531202  0.0384176  -1.383   0.1760    
Deficit.l1        -0.0003969  0.0080682  -0.049   0.9611    
CommodityPrice.l1  0.0180458  0.0091531   1.972   0.0571 .  
CPI.l1             0.9359941  0.0721704  12.969 1.67e-14 ***
const              0.4906491  0.2832965   1.732   0.0926 .  
trend              0.0005609  0.0004745   1.182   0.2456    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.00636 on 33 degrees of freedom
Multiple R-Squared: 0.9962, Adjusted R-squared: 0.9957 
F-statistic:  1741 on 5 and 33 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
                      USD    Deficit CommodityPrice        CPI
USD             4.217e-04 -0.0001257     -0.0003646 -1.961e-05
Deficit        -1.257e-04  0.0152973      0.0011320  2.837e-04
CommodityPrice -3.646e-04  0.0011320      0.0073263  4.279e-04
CPI            -1.961e-05  0.0002837      0.0004279  4.046e-05

Correlation matrix of residuals:
                   USD Deficit CommodityPrice     CPI
USD             1.0000 -0.0495        -0.2075 -0.1501
Deficit        -0.0495  1.0000         0.1069  0.3606
CommodityPrice -0.2075  0.1069         1.0000  0.7860
CPI            -0.1501  0.3606         0.7860  1.0000
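The "Roots of the characteristic polynomial" line in the summary is the stability check: a VAR is stable when every root has modulus below 1. As a sketch of where those numbers come from, the lag-1 coefficients printed above can be placed in a matrix and its eigenvalue moduli computed in base R. The matrix is transcribed by hand from the rounded output, so the result should match the reported roots only approximately.

```r
# Lag-1 coefficient matrix transcribed from the VAR(1) summary above.
# Row = equation, column = lagged regressor,
# both ordered USD, Deficit, CommodityPrice, CPI.
var_names <- c("USD", "Deficit", "CommodityPrice", "CPI")
A1 <- matrix(c(
   0.667614,   0.004326,  0.069535,  0.248191,
  -1.498555,   0.421178,  0.030080,  1.097922,
  -1.150435,   0.171882,  0.703631,  0.810695,
  -0.0531202, -0.0003969, 0.0180458, 0.9359941
), nrow = 4, byrow = TRUE, dimnames = list(var_names, var_names))

# For a VAR(1) the companion matrix is A1 itself. Stability requires all
# eigenvalues to lie strictly inside the unit circle.
root_moduli <- Mod(eigen(A1)$values)
round(root_moduli, 4)
all(root_moduli < 1)
```

The constant and trend terms do not enter this calculation; only the lag coefficient matrices determine the roots.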

Code

summary(fit <- VAR(ts_df1, p=2, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, Deficit, CommodityPrice, CPI 
Deterministic variables: both 
Sample size: 38 
Log Likelihood: 359.881 
Roots of the characteristic polynomial:
0.8411 0.7931 0.7931 0.7212 0.7212 0.5164 0.5164 0.04401
Call:
VAR(y = ts_df1, p = 2, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + const + trend 

                   Estimate Std. Error t value Pr(>|t|)    
USD.l1             0.860943   0.176984   4.865 4.02e-05 ***
Deficit.l1        -0.034702   0.031732  -1.094   0.2835    
CommodityPrice.l1 -0.049823   0.069952  -0.712   0.4822    
CPI.l1             1.999967   0.992855   2.014   0.0537 .  
USD.l2            -0.374902   0.180288  -2.079   0.0468 *  
Deficit.l2         0.034468   0.033392   1.032   0.3108    
CommodityPrice.l2  0.034555   0.050941   0.678   0.5031    
CPI.l2            -1.375498   0.904893  -1.520   0.1397    
const             -0.959290   1.024171  -0.937   0.3569    
trend             -0.003810   0.001726  -2.208   0.0356 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.02 on 28 degrees of freedom
Multiple R-Squared: 0.8741, Adjusted R-squared: 0.8337 
F-statistic: 21.61 on 9 and 28 DF,  p-value: 2.388e-10 


Estimation results for equation Deficit: 
======================================== 
Deficit = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + const + trend 

                    Estimate Std. Error t value Pr(>|t|)    
USD.l1             -0.634374   0.852994  -0.744 0.463252    
Deficit.l1          0.353624   0.152935   2.312 0.028333 *  
CommodityPrice.l1  -0.698797   0.337141  -2.073 0.047518 *  
CPI.l1             17.998184   4.785184   3.761 0.000794 ***
USD.l2             -0.639547   0.868918  -0.736 0.467836    
Deficit.l2         -0.189120   0.160935  -1.175 0.249838    
CommodityPrice.l2   0.376337   0.245519   1.533 0.136542    
CPI.l2            -16.702649   4.361243  -3.830 0.000662 ***
const              10.258999   4.936119   2.078 0.046958 *  
trend               0.010970   0.008318   1.319 0.197905    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.09637 on 28 degrees of freedom
Multiple R-Squared: 0.9071, Adjusted R-squared: 0.8773 
F-statistic: 30.39 on 9 and 28 DF,  p-value: 3.822e-12 


Estimation results for equation CommodityPrice: 
=============================================== 
CommodityPrice = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + const + trend 

                   Estimate Std. Error t value Pr(>|t|)  
USD.l1            -1.030868   0.765116  -1.347   0.1887  
Deficit.l1         0.109221   0.137179   0.796   0.4326  
CommodityPrice.l1  0.642615   0.302408   2.125   0.0425 *
CPI.l1             1.223680   4.292199   0.285   0.7777  
USD.l2            -0.074839   0.779399  -0.096   0.9242  
Deficit.l2         0.192069   0.144355   1.331   0.1941  
CommodityPrice.l2 -0.080255   0.220224  -0.364   0.7183  
CPI.l2            -0.244303   3.911934  -0.062   0.9506  
const             -1.235445   4.427584  -0.279   0.7823  
trend             -0.004332   0.007461  -0.581   0.5662  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.08644 on 28 degrees of freedom
Multiple R-Squared: 0.8971, Adjusted R-squared: 0.864 
F-statistic: 27.13 on 9 and 28 DF,  p-value: 1.546e-11 


Estimation results for equation CPI: 
==================================== 
CPI = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + const + trend 

                    Estimate Std. Error t value Pr(>|t|)    
USD.l1            -0.0345719  0.0532750  -0.649   0.5217    
Deficit.l1        -0.0123101  0.0095518  -1.289   0.2080    
CommodityPrice.l1 -0.0072755  0.0210566  -0.346   0.7323    
CPI.l1             1.3978230  0.2988658   4.677  6.7e-05 ***
USD.l2            -0.0155966  0.0542696  -0.287   0.7759    
Deficit.l2         0.0241018  0.0100515   2.398   0.0234 *  
CommodityPrice.l2 -0.0009416  0.0153342  -0.061   0.9515    
CPI.l2            -0.4129302  0.2723880  -1.516   0.1407    
const              0.2221791  0.3082927   0.721   0.4771    
trend              0.0002623  0.0005195   0.505   0.6175    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.006019 on 28 degrees of freedom
Multiple R-Squared: 0.997,  Adjusted R-squared: 0.9961 
F-statistic:  1040 on 9 and 28 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
                      USD    Deficit CommodityPrice        CPI
USD             3.998e-04 -0.0005604     -0.0004343 -3.436e-05
Deficit        -5.604e-04  0.0092876      0.0020977  2.978e-04
CommodityPrice -4.343e-04  0.0020977      0.0074725  4.054e-04
CPI            -3.436e-05  0.0002978      0.0004054  3.623e-05

Correlation matrix of residuals:
                   USD Deficit CommodityPrice     CPI
USD             1.0000 -0.2908        -0.2512 -0.2855
Deficit        -0.2908  1.0000         0.2518  0.5134
CommodityPrice -0.2512  0.2518         1.0000  0.7791
CPI            -0.2855  0.5134         0.7791  1.0000

Code

summary(fit <- VAR(ts_df1, p=6, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, Deficit, CommodityPrice, CPI 
Deterministic variables: both 
Sample size: 34 
Log Likelihood: 420.574 
Roots of the characteristic polynomial:
0.9774 0.9774 0.9612 0.9612 0.9551 0.9537 0.9537 0.9456 0.9456 0.914 0.8946 0.8946 0.8791 0.8791 0.874 0.874 0.8692 0.8692 0.8455 0.8455 0.8179 0.8179 0.4034 0.04809
Call:
VAR(y = ts_df1, p = 6, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + USD.l3 + Deficit.l3 + CommodityPrice.l3 + CPI.l3 + USD.l4 + Deficit.l4 + CommodityPrice.l4 + CPI.l4 + USD.l5 + Deficit.l5 + CommodityPrice.l5 + CPI.l5 + USD.l6 + Deficit.l6 + CommodityPrice.l6 + CPI.l6 + const + trend 

                   Estimate Std. Error t value Pr(>|t|)  
USD.l1             0.188522   0.278061   0.678   0.5169  
Deficit.l1        -0.114187   0.063291  -1.804   0.1089  
CommodityPrice.l1 -0.088524   0.097296  -0.910   0.3895  
CPI.l1             3.503156   1.427231   2.455   0.0397 *
USD.l2             0.063782   0.272373   0.234   0.8207  
Deficit.l2        -0.023496   0.063565  -0.370   0.7212  
CommodityPrice.l2  0.131369   0.096624   1.360   0.2110  
CPI.l2            -1.958538   2.251024  -0.870   0.4096  
USD.l3            -0.447982   0.274821  -1.630   0.1417  
Deficit.l3        -0.138874   0.067914  -2.045   0.0751 .
CommodityPrice.l3 -0.110289   0.095874  -1.150   0.2832  
CPI.l3             0.835259   1.921125   0.435   0.6752  
USD.l4            -0.251535   0.275764  -0.912   0.3884  
Deficit.l4         0.002074   0.066215   0.031   0.9758  
CommodityPrice.l4 -0.171482   0.095911  -1.788   0.1116  
CPI.l4             0.707427   1.943709   0.364   0.7253  
USD.l5            -0.320172   0.227445  -1.408   0.1969  
Deficit.l5        -0.007782   0.056297  -0.138   0.8935  
CommodityPrice.l5  0.084407   0.081619   1.034   0.3313  
CPI.l5            -1.363133   1.779101  -0.766   0.4656  
USD.l6            -0.459738   0.277596  -1.656   0.1363  
Deficit.l6         0.083523   0.053758   1.554   0.1589  
CommodityPrice.l6  0.059224   0.061525   0.963   0.3639  
CPI.l6             0.465138   1.282731   0.363   0.7263  
const              1.118637   1.802790   0.621   0.5522  
trend             -0.008372   0.002630  -3.184   0.0129 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01492 on 8 degrees of freedom
Multiple R-Squared: 0.9795, Adjusted R-squared: 0.9156 
F-statistic: 15.33 on 25 and 8 DF,  p-value: 0.0002289 


Estimation results for equation Deficit: 
======================================== 
Deficit = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + USD.l3 + Deficit.l3 + CommodityPrice.l3 + CPI.l3 + USD.l4 + Deficit.l4 + CommodityPrice.l4 + CPI.l4 + USD.l5 + Deficit.l5 + CommodityPrice.l5 + CPI.l5 + USD.l6 + Deficit.l6 + CommodityPrice.l6 + CPI.l6 + const + trend 

                    Estimate Std. Error t value Pr(>|t|)  
USD.l1             -0.669424   1.194355  -0.560   0.5905  
Deficit.l1          0.469621   0.271852   1.727   0.1223  
CommodityPrice.l1  -0.750854   0.417916  -1.797   0.1101  
CPI.l1             10.635835   6.130392   1.735   0.1210  
USD.l2             -1.644347   1.169923  -1.406   0.1975  
Deficit.l2         -0.507760   0.273033  -1.860   0.1000 .
CommodityPrice.l2  -0.263142   0.415029  -0.634   0.5438  
CPI.l2             -3.749152   9.668833  -0.388   0.7083  
USD.l3             -0.794137   1.180438  -0.673   0.5201  
Deficit.l3          0.380241   0.291713   1.303   0.2287  
CommodityPrice.l3  -0.314116   0.411807  -0.763   0.4675  
CPI.l3              9.897744   8.251813   1.199   0.2647  
USD.l4              0.635921   1.184490   0.537   0.6060  
Deficit.l4         -0.118016   0.284415  -0.415   0.6891  
CommodityPrice.l4   0.557226   0.411966   1.353   0.2132  
CPI.l4             -7.712312   8.348820  -0.924   0.3826  
USD.l5             -1.962701   0.976947  -2.009   0.0794 .
Deficit.l5         -0.203957   0.241814  -0.843   0.4235  
CommodityPrice.l5  -0.786869   0.350578  -2.244   0.0550 .
CPI.l5              8.431582   7.641778   1.103   0.3020  
USD.l6              0.903502   1.192359   0.758   0.4703  
Deficit.l6         -0.219157   0.230905  -0.949   0.3703  
CommodityPrice.l6   0.233288   0.264270   0.883   0.4031  
CPI.l6            -11.240777   5.509717  -2.040   0.0757 .
const               3.608153   7.743532   0.466   0.6537  
trend              -0.006603   0.011295  -0.585   0.5749  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.06409 on 8 degrees of freedom
Multiple R-Squared: 0.986,  Adjusted R-squared: 0.9422 
F-statistic: 22.51 on 25 and 8 DF,  p-value: 5.357e-05 


Estimation results for equation CommodityPrice: 
=============================================== 
CommodityPrice = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + USD.l3 + Deficit.l3 + CommodityPrice.l3 + CPI.l3 + USD.l4 + Deficit.l4 + CommodityPrice.l4 + CPI.l4 + USD.l5 + Deficit.l5 + CommodityPrice.l5 + CPI.l5 + USD.l6 + Deficit.l6 + CommodityPrice.l6 + CPI.l6 + const + trend 

                   Estimate Std. Error t value Pr(>|t|)
USD.l1             -0.02834    1.99586  -0.014    0.989
Deficit.l1          0.51395    0.45429   1.131    0.291
CommodityPrice.l1   0.22994    0.69837   0.329    0.750
CPI.l1              0.49998   10.24435   0.049    0.962
USD.l2             -1.49073    1.95503  -0.763    0.468
Deficit.l2          0.34165    0.45626   0.749    0.475
CommodityPrice.l2  -0.11959    0.69355  -0.172    0.867
CPI.l2             -0.06070   16.15735  -0.004    0.997
USD.l3              0.30408    1.97260   0.154    0.881
Deficit.l3          0.29492    0.48748   0.605    0.562
CommodityPrice.l3   0.28301    0.68816   0.411    0.692
CPI.l3              1.89453   13.78941   0.137    0.894
USD.l4              1.40984    1.97937   0.712    0.497
Deficit.l4          0.12161    0.47528   0.256    0.805
CommodityPrice.l4  -0.01759    0.68843  -0.026    0.980
CPI.l4              2.31895   13.95151   0.166    0.872
USD.l5             -0.42458    1.63255  -0.260    0.801
Deficit.l5         -0.09938    0.40409  -0.246    0.812
CommodityPrice.l5   0.26122    0.58584   0.446    0.667
CPI.l5             -6.75397   12.76999  -0.529    0.611
USD.l6              1.41084    1.99252   0.708    0.499
Deficit.l6         -0.15247    0.38586  -0.395    0.703
CommodityPrice.l6  -0.12111    0.44162  -0.274    0.791
CPI.l6              2.57117    9.20715   0.279    0.787
const             -16.81991   12.94003  -1.300    0.230
trend              -0.02132    0.01887  -1.130    0.291


Residual standard error: 0.1071 on 8 degrees of freedom
Multiple R-Squared: 0.9443, Adjusted R-squared: 0.7703 
F-statistic: 5.426 on 25 and 8 DF,  p-value: 0.009051 


Estimation results for equation CPI: 
==================================== 
CPI = USD.l1 + Deficit.l1 + CommodityPrice.l1 + CPI.l1 + USD.l2 + Deficit.l2 + CommodityPrice.l2 + CPI.l2 + USD.l3 + Deficit.l3 + CommodityPrice.l3 + CPI.l3 + USD.l4 + Deficit.l4 + CommodityPrice.l4 + CPI.l4 + USD.l5 + Deficit.l5 + CommodityPrice.l5 + CPI.l5 + USD.l6 + Deficit.l6 + CommodityPrice.l6 + CPI.l6 + const + trend 

                    Estimate Std. Error t value Pr(>|t|)
USD.l1             0.0447305  0.1207314   0.370    0.721
Deficit.l1         0.0058585  0.0274802   0.213    0.837
CommodityPrice.l1 -0.0045955  0.0422451  -0.109    0.916
CPI.l1             0.7886579  0.6196909   1.273    0.239
USD.l2            -0.1248322  0.1182617  -1.056    0.322
Deficit.l2         0.0182774  0.0275995   0.662    0.526
CommodityPrice.l2 -0.0156250  0.0419533  -0.372    0.719
CPI.l2            -0.1946339  0.9773744  -0.199    0.847
USD.l3            -0.0129803  0.1193246  -0.109    0.916
Deficit.l3         0.0165136  0.0294879   0.560    0.591
CommodityPrice.l3 -0.0166860  0.0416275  -0.401    0.699
CPI.l3             0.8414897  0.8341348   1.009    0.343
USD.l4             0.0505925  0.1197342   0.423    0.684
Deficit.l4         0.0201891  0.0287501   0.702    0.502
CommodityPrice.l4 -0.0125725  0.0416436  -0.302    0.770
CPI.l4             0.0383578  0.8439408   0.045    0.965
USD.l5            -0.0194012  0.0987547  -0.196    0.849
Deficit.l5         0.0053613  0.0244438   0.219    0.832
CommodityPrice.l5  0.0282938  0.0354382   0.798    0.448
CPI.l5            -0.4281220  0.7724694  -0.554    0.595
USD.l6             0.0370442  0.1205297   0.307    0.766
Deficit.l6        -0.0127183  0.0233411  -0.545    0.601
CommodityPrice.l6 -0.0277421  0.0267137  -1.038    0.329
CPI.l6             0.0367840  0.5569500   0.066    0.949
const             -0.6637172  0.7827552  -0.848    0.421
trend             -0.0007543  0.0011417  -0.661    0.527


Residual standard error: 0.006479 on 8 degrees of freedom
Multiple R-Squared: 0.9988, Adjusted R-squared: 0.9951 
F-statistic: 267.1 on 25 and 8 DF,  p-value: 3.206e-09 



Covariance matrix of residuals:
                      USD   Deficit CommodityPrice       CPI
USD             2.227e-04 5.909e-05     -4.855e-05 9.372e-06
Deficit         5.909e-05 4.108e-03      2.520e-03 1.162e-04
CommodityPrice -4.855e-05 2.520e-03      1.147e-02 6.200e-04
CPI             9.372e-06 1.162e-04      6.200e-04 4.198e-05

Correlation matrix of residuals:
                    USD Deficit CommodityPrice     CPI
USD             1.00000 0.06178       -0.03038 0.09694
Deficit         0.06178 1.00000        0.36701 0.27992
CommodityPrice -0.03038 0.36701        1.00000 0.89346
CPI             0.09694 0.27992        0.89346 1.00000

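The models with p=1 and p=2 look good. In the VAR(6) model, however, very few coefficients are significant: each equation estimates 26 parameters (24 lag coefficients plus a constant and a trend) with only 8 residual degrees of freedom, so VAR(6) is probably overparameterized and not the best choice.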
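The degrees-of-freedom argument can be checked with simple arithmetic. The sketch below assumes four endogenous variables, a constant and a trend (type = "both"), and a total of 40 observations; that total is not stated in the output and is inferred from the 8 residual degrees of freedom reported for p = 6. The helper name `var_resid_df` is hypothetical.

```r
# Residual degrees of freedom per equation in a VAR(p) with K variables
# and n_det deterministic terms (constant + trend = 2).
var_resid_df <- function(n_eff, K, p, n_det = 2) {
  n_params <- K * p + n_det   # coefficients estimated in each equation
  n_eff - n_params
}

K <- 4          # USD, Deficit, CommodityPrice, CPI
n_total <- 40   # assumption: inferred from 8 residual df at p = 6

# The effective sample shrinks by p because the first p rows are lost to lags
resid_df <- sapply(c(1, 2, 6), function(p) var_resid_df(n_total - p, K, p))
names(resid_df) <- paste0("p=", c(1, 2, 6))
print(resid_df)
```

Under these assumptions the lower-order models leave far more degrees of freedom per equation than VAR(6), which is consistent with the large standard errors in the VAR(6) output.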

Code

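# Fit a VAR(p) on the training window and return s-step-ahead point forecasts.
# ts: multivariate training series; year: start time of the forecast window;
# p: lag order; s: number of steps ahead
fun.var <- function(ts, year, p, s){
  fit <- VAR(ts, p=p, type='both')
  fcast <- predict(fit, n.ahead = s)
  
  # Column 1 of each forecast matrix holds the point forecast
  f1<-fcast$fcst$USD
  f2<-fcast$fcst$Deficit
  f3<-fcast$fcst$CommodityPrice
  f4<-fcast$fcst$CPI
  ff<-data.frame(f1[,1],f2[,1],f3[,1],f4[,1])
  # Use the frequency of the training series rather than the horizon s
  ff<-ts(ff,start=year,frequency = frequency(ts))
  return(ff)
}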

data <- ts_df1
n=nrow(data)
n_var = ncol(data)
h <- 4  # h: Forecast horizon
# k: size of the initial training set
# Calculate k as about 1/3 of the data, rounded down to a multiple of h (4 quarters)
k <- floor(n / 3 / h) * h
num_iter <- (n - k) / h  # Number of rolling iterations

# Initialize matrices of absolute forecast errors (the rmse* names are kept for consistency)
rmse1 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(1)
rmse2 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(2)

# Define rolling start time
st <- tsp(data)[1] + (k - 1) / h 

# Walk-Forward Validation Loop
for (i in 1:num_iter) {
  xtrain <- window(data, end = st + i - 1)
  xtest <- window(data, start = st + (i - 1) + 1/h, end = st + i)  # Test set: the next h quarters
  test_start_year <- st + (i-1) + 1/h  # Start time of the forecast window, i.e. of xtest

  ######## VAR(1) ############
  ff1 <- fun.var(xtrain, test_start_year, p=1, s=h)
  ######## VAR(2) ############
  ff2 <- fun.var(xtrain, test_start_year, p=2, s=h)
  
  ##### collecting errors ######
  a = h*i-h+1
  b= h*i
  rmse1[c(a:b),]  <-abs(ff1-xtest)
  rmse2[c(a:b),]  <-abs(ff2-xtest)
}

rmse_combined <- as.data.frame(rbind(rmse1, rmse2))
colnames(rmse_combined) = c("USD","Deficit","CommodityPrice","CPI")
rmse_combined$Model <- c(rep("VAR(1)", n-k),rep("VAR(2)", n-k))
rmse_combined$Date <- time(data)[(k+1):n]

# Plot the absolute CV error for USD, one line per model
ggplot(data = rmse_combined, aes(x = Date, y = USD, color = Model)) + 
  geom_line() +
  labs(
    title = "CV Error for USD",
    x = "Date",
    y = "Error",
    color = "Model"
  ) +
  theme_minimal()

Based on the cross-validation errors for USD, the VAR(1) model performs slightly better than VAR(2), so we select it.
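Reading the comparison off a line plot can be hard when the curves overlap, so it may help to collapse the per-quarter errors into one number per model. The sketch below uses a small placeholder data frame, `cv_err`, with made-up absolute errors; these are illustrative values, not results from this analysis. The same two `aggregate()` calls could be applied to a data frame with `Model` and `USD` columns.

```r
# Placeholder absolute errors (illustrative values only, not results of this analysis)
cv_err <- data.frame(
  Model = rep(c("VAR(1)", "VAR(2)"), each = 4),
  USD   = c(0.01, 0.02, 0.03, 0.02,
            0.02, 0.02, 0.04, 0.04)
)

# Mean absolute error and root mean squared error per model
mae  <- aggregate(USD ~ Model, data = cv_err, FUN = mean)
rmse <- aggregate(USD ~ Model, data = cv_err,
                  FUN = function(e) sqrt(mean(e^2)))
names(mae)[2]  <- "MAE"
names(rmse)[2] <- "RMSE"

cv_summary <- merge(mae, rmse, by = "Model")
print(cv_summary)
```

The model with the lower MAE and RMSE would be preferred; if the two measures disagree, a few large errors are probably driving the RMSE.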

The lag-coefficient matrix of the fitted VAR(1) model is shown below (the constant and trend terms are omitted):

\[ \begin{aligned} \left[\begin{array}{l} \widehat{\text{USD}}_t \\ \widehat{\text{Deficit}}_t \\ \widehat{\text{CommodityPrice}}_t \\ \widehat{\text{CPI}}_t \end{array}\right] &= \left[\begin{array}{cccc} 0.667614 & -0.004326 & -0.069535 & 0.248191 \\ -1.498555 & 0.421178 & 0.030080 & -1.097922 \\ -1.150435 & 0.171882 & 0.703631 & -0.810695 \\ 0.0531202 & -0.0003969 & -0.0180458 & 0.9359941 \end{array}\right] \left[\begin{array}{l} \text{USD}_{t-1} \\ \text{Deficit}_{t-1} \\ \text{CommodityPrice}_{t-1} \\ \text{CPI}_{t-1} \end{array}\right] \end{aligned} \]

Code

# Fit a VAR(1) model including both a constant and trend
fit <- VAR(data, p = 1, type = "both")
forecast(fit, h=h) %>%
  autoplot() + 
  scale_x_continuous(breaks = 2015:2027) +
  xlab("Year") +
  theme_minimal()

The forecast looks very good: the VAR(1) model has effectively captured the patterns of all four variables.

VAR: USD ~ Interest Rate + CPI + Unemployment Rate + GDP

Let’s now analyze how macroeconomic variables, such as interest rate, CPI, unemployment rate, and GDP, interact with the US Dollar Index.

Code

# Aggregate the interest rate to quarterly means
ir_q <- ir %>%
  mutate(quarter = floor_date(Date, "quarter")) %>% 
  group_by(quarter) %>%
  summarise(IR = mean(IR, na.rm = TRUE))
# Aggregate the unemployment rate to quarterly means
unem_q <- data_unem %>%
  mutate(quarter = floor_date(time, "quarter")) %>%
  group_by(quarter) %>%
  summarise(unem = mean(unem, na.rm = TRUE))

# Combine the quarterly series on a log scale
# (assumes all series cover the same quarters in the same order)
df2 <- data.frame(Date = dxy_q$quarter, 
                  USD = log(dxy_q$Close), 
                  InterestRate = log(ir_q$IR),
                  CPI = log(cpi_q$cpi),
                  UnemploymentRate = log(unem_q$unem),
                  GDP = log(gdp$total)
                  )

plot_usd <- plot_ly(df2, x = ~Date, y = ~USD, type = 'scatter', mode = 'lines', name = 'U.S. Dollar')
plot_ir <- plot_ly(df2, x = ~Date, y = ~InterestRate, type = 'scatter', mode = 'lines', name = 'Interest Rate') 
plot_cpi <- plot_ly(df2, x = ~Date, y = ~CPI, type = 'scatter', mode = 'lines', name = 'CPI') 
plot_unem <- plot_ly(df2, x = ~Date, y = ~UnemploymentRate, type = 'scatter', mode = 'lines', name = 'Unemployment Rate') 
plot_gdp <- plot_ly(df2, x = ~Date, y = ~GDP, type = 'scatter', mode = 'lines', name = 'GDP') 

subplot(plot_usd, plot_ir, plot_cpi, plot_unem, plot_gdp, nrows = 5, shareX = TRUE) %>%
  layout(title = "Trend of U.S. Dollar and Related Variables", showlegend = FALSE,
    xaxis = list(title = 'Date'),
    yaxis = list(title = 'U.S. Dollar'),
    yaxis2 = list(title = 'Interest Rate'),
    yaxis3 = list(title = 'CPI'),
    yaxis4 = list(title = 'Unemployment Rate'),
    yaxis5 = list(title = 'GDP'))

The USD index and the interest rate follow very similar trends.

Code

ts_df2 <- ts(df2[,-1], start = c(2005,1), frequency = 4)
VARselect(ts_df2, lag.max=6, type="both")

$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     3      2      1      3 

$criteria
                   1             2             3             4             5
AIC(n) -3.344277e+01 -3.397270e+01 -3.420058e+01 -3.417668e+01 -3.396877e+01
HQ(n)  -3.300805e+01 -3.322747e+01 -3.314483e+01 -3.281042e+01 -3.229200e+01
SC(n)  -3.235301e+01 -3.210454e+01 -3.155402e+01 -3.075172e+01 -2.976541e+01
FPE(n)  3.000693e-15  1.786960e-15  1.462198e-15  1.575617e-15  2.109013e-15
                   6
AIC(n) -3.389629e+01
HQ(n)  -3.190900e+01
SC(n)  -2.891453e+01
FPE(n)  2.577987e-15

The criteria disagree: AIC and FPE select 3 lags, HQ selects 2, and SC selects 1. We therefore consider lag lengths of 1, 2, and 3.
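Each criterion picks the lag with the smallest value in its row of the criteria table. As a minimal sketch, the selection can be reproduced in base R from the values printed above; the matrix name `crit` is only illustrative.

```r
# Criteria values copied from the VARselect() output (rows: criteria, columns: lag order)
crit <- rbind(
  AIC = c(-33.44277, -33.97270, -34.20058, -34.17668, -33.96877, -33.89629),
  HQ  = c(-33.00805, -33.22747, -33.14483, -32.81042, -32.29200, -31.90900),
  SC  = c(-32.35301, -32.10454, -31.55402, -30.75172, -29.76541, -28.91453),
  FPE = c(3.000693e-15, 1.786960e-15, 1.462198e-15,
          1.575617e-15, 2.109013e-15, 2.577987e-15)
)
colnames(crit) <- 1:6

# For every criterion, the selected lag is the column holding the minimum
selected <- apply(crit, 1, which.min)
print(selected)
```

SC penalizes extra parameters most heavily and AIC least, which is why SC tends to pick the shortest lag here and AIC the longest.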

Code

summary(fit <- VAR(ts_df2, p=1, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, InterestRate, CPI, UnemploymentRate, GDP 
Deterministic variables: both 
Sample size: 79 
Log Likelihood: 797.035 
Roots of the characteristic polynomial:
1.022 0.8726 0.8726 0.7888 0.7888
Call:
VAR(y = ts_df2, p = 1, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1               0.8602541  0.0646797  13.300  < 2e-16 ***
InterestRate.l1     -0.0121483  0.0040616  -2.991 0.003805 ** 
CPI.l1               0.4560117  0.3075579   1.483 0.142522    
UnemploymentRate.l1 -0.0894047  0.0253856  -3.522 0.000748 ***
GDP.l1              -0.2034453  0.2634512  -0.772 0.442505    
const                0.2930808  1.3118642   0.223 0.823851    
trend               -0.0004485  0.0012972  -0.346 0.730553    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.02816 on 72 degrees of freedom
Multiple R-Squared: 0.9355, Adjusted R-squared: 0.9301 
F-statistic:   174 on 6 and 72 DF,  p-value: < 2.2e-16 


Estimation results for equation InterestRate: 
============================================= 
InterestRate = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + const + trend 

                     Estimate Std. Error t value Pr(>|t|)    
USD.l1                2.94889    1.09160   2.701   0.0086 ** 
InterestRate.l1       0.75927    0.06855  11.077   <2e-16 ***
CPI.l1                1.47241    5.19066   0.284   0.7775    
UnemploymentRate.l1  -0.35278    0.42843  -0.823   0.4130    
GDP.l1                3.51256    4.44627   0.790   0.4321    
const               -53.16104   22.14034  -2.401   0.0189 *  
trend                -0.05229    0.02189  -2.388   0.0195 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.4753 on 72 degrees of freedom
Multiple R-Squared: 0.9224, Adjusted R-squared: 0.9159 
F-statistic: 142.7 on 6 and 72 DF,  p-value: < 2.2e-16 


Estimation results for equation CPI: 
==================================== 
CPI = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1              -0.0097524  0.0167734  -0.581 0.562773    
InterestRate.l1     -0.0015307  0.0010533  -1.453 0.150488    
CPI.l1               0.6599743  0.0797593   8.275 4.71e-12 ***
UnemploymentRate.l1  0.0148023  0.0065833   2.248 0.027607 *  
GDP.l1               0.3207071  0.0683211   4.694 1.24e-05 ***
const               -1.2135174  0.3402070  -3.567 0.000647 ***
trend               -0.0011576  0.0003364  -3.441 0.000968 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.007303 on 72 degrees of freedom
Multiple R-Squared: 0.997,  Adjusted R-squared: 0.9967 
F-statistic:  3976 on 6 and 72 DF,  p-value: < 2.2e-16 


Estimation results for equation UnemploymentRate: 
================================================= 
UnemploymentRate = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + const + trend 

                     Estimate Std. Error t value Pr(>|t|)    
USD.l1              -0.324474   0.351276  -0.924   0.3587    
InterestRate.l1      0.019355   0.022058   0.877   0.3832    
CPI.l1               2.673199   1.670348   1.600   0.1139    
UnemploymentRate.l1  0.704855   0.137869   5.112 2.52e-06 ***
GDP.l1              -2.635176   1.430804  -1.842   0.0696 .  
const               12.753590   7.124738   1.790   0.0777 .  
trend                0.010574   0.007045   1.501   0.1378    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.1529 on 72 degrees of freedom
Multiple R-Squared: 0.8143, Adjusted R-squared: 0.7989 
F-statistic: 52.64 on 6 and 72 DF,  p-value: < 2.2e-16 


Estimation results for equation GDP: 
==================================== 
GDP = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1               0.0481664  0.0347894   1.385  0.17048    
InterestRate.l1      0.0003173  0.0021846   0.145  0.88493    
CPI.l1              -0.2530517  0.1654266  -1.530  0.13047    
UnemploymentRate.l1  0.0390789  0.0136542   2.862  0.00551 ** 
GDP.l1               1.2398195  0.1417029   8.749 6.12e-13 ***
const               -1.2061893  0.7056144  -1.709  0.09168 .  
trend               -0.0008130  0.0006977  -1.165  0.24777    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01515 on 72 degrees of freedom
Multiple R-Squared: 0.9961, Adjusted R-squared: 0.9958 
F-statistic:  3058 on 6 and 72 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
                        USD InterestRate        CPI UnemploymentRate        GDP
USD               7.931e-04   -4.666e-05 -8.036e-05       -2.729e-05 -2.910e-05
InterestRate     -4.666e-05    2.259e-01  1.529e-03       -5.819e-02  5.515e-03
CPI              -8.036e-05    1.529e-03  5.334e-05       -2.209e-04  4.633e-05
UnemploymentRate -2.729e-05   -5.819e-02 -2.209e-04        2.339e-02 -2.023e-03
GDP              -2.910e-05    5.515e-03  4.633e-05       -2.023e-03  2.295e-04

Correlation matrix of residuals:
                       USD InterestRate     CPI UnemploymentRate      GDP
USD               1.000000    -0.003486 -0.3907        -0.006335 -0.06823
InterestRate     -0.003486     1.000000  0.4405        -0.800489  0.76601
CPI              -0.390712     0.440520  1.0000        -0.197793  0.41881
UnemploymentRate -0.006335    -0.800489 -0.1978         1.000000 -0.87316
GDP              -0.068225     0.766006  0.4188        -0.873164  1.00000
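The summary above also reports the roots of the characteristic polynomial, the largest of which is 1.022. For a VAR(1), the companion matrix is just the lag-1 coefficient matrix, so these moduli can be approximately reproduced from the printed coefficients. The sketch below is a base-R illustration under that assumption, with coefficients copied from the output, so small rounding differences are expected.

```r
# Lag-1 coefficient matrix copied from the VAR(1) summary
# (rows: equations; columns: USD, InterestRate, CPI, UnemploymentRate, GDP at lag 1)
A <- matrix(c(
   0.8602541, -0.0121483,  0.4560117, -0.0894047, -0.2034453,
   2.94889,    0.75927,    1.47241,   -0.35278,    3.51256,
  -0.0097524, -0.0015307,  0.6599743,  0.0148023,  0.3207071,
  -0.324474,   0.019355,   2.673199,   0.704855,  -2.635176,
   0.0481664,  0.0003173, -0.2530517,  0.0390789,  1.2398195
), nrow = 5, byrow = TRUE)

# Moduli of the eigenvalues, largest first
moduli <- sort(Mod(eigen(A)$values), decreasing = TRUE)
print(round(moduli, 4))

# TRUE if at least one root lies on or outside the unit circle
print(any(moduli >= 1))
```

A modulus at or above 1 suggests the estimated system is not stable, which is common when trending log-level series are modeled directly. This is worth keeping in mind when interpreting longer-horizon forecasts from this model.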

Code

summary(fit <- VAR(ts_df2, p=2, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, InterestRate, CPI, UnemploymentRate, GDP 
Deterministic variables: both 
Sample size: 78 
Log Likelihood: 832.809 
Roots of the characteristic polynomial:
1.018 0.9183 0.8712 0.8712 0.6637 0.6637 0.5249 0.5249 0.3718 0.116
Call:
VAR(y = ts_df2, p = 2, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + const + trend 

                     Estimate Std. Error t value Pr(>|t|)    
USD.l1               1.243612   0.117284  10.603 6.82e-16 ***
InterestRate.l1     -0.001780   0.012031  -0.148    0.883    
CPI.l1               0.493171   0.588521   0.838    0.405    
UnemploymentRate.l1 -0.006097   0.051373  -0.119    0.906    
GDP.l1               0.481614   0.464835   1.036    0.304    
USD.l2              -0.499303   0.112295  -4.446 3.43e-05 ***
InterestRate.l2     -0.003618   0.011278  -0.321    0.749    
CPI.l2              -0.171779   0.483045  -0.356    0.723    
UnemploymentRate.l2 -0.069032   0.053140  -1.299    0.198    
GDP.l2              -0.734841   0.491734  -1.494    0.140    
const                1.940365   1.324816   1.465    0.148    
trend                0.001211   0.001320   0.917    0.362    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.02485 on 66 degrees of freedom
Multiple R-Squared: 0.9539, Adjusted R-squared: 0.9462 
F-statistic: 124.1 on 11 and 66 DF,  p-value: < 2.2e-16 


Estimation results for equation InterestRate: 
============================================= 
InterestRate = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + const + trend 

                     Estimate Std. Error t value Pr(>|t|)    
USD.l1                0.77182    2.12085   0.364  0.71708    
InterestRate.l1       1.10863    0.21755   5.096 3.13e-06 ***
CPI.l1                1.15195   10.64221   0.108  0.91413    
UnemploymentRate.l1   2.09945    0.92897   2.260  0.02713 *  
GDP.l1               19.94260    8.40561   2.373  0.02059 *  
USD.l2                1.47756    2.03063   0.728  0.46941    
InterestRate.l2      -0.41436    0.20394  -2.032  0.04621 *  
CPI.l2               -1.25989    8.73490  -0.144  0.88575    
UnemploymentRate.l2  -2.71377    0.96094  -2.824  0.00626 ** 
GDP.l2              -15.93254    8.89202  -1.792  0.07775 .  
const               -46.09798   23.95662  -1.924  0.05864 .  
trend                -0.05011    0.02388  -2.099  0.03969 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.4494 on 66 degrees of freedom
Multiple R-Squared: 0.9356, Adjusted R-squared: 0.9249 
F-statistic: 87.15 on 11 and 66 DF,  p-value: < 2.2e-16 


Estimation results for equation CPI: 
==================================== 
CPI = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1              -0.0547189  0.0317267  -1.725   0.0893 .  
InterestRate.l1      0.0009829  0.0032544   0.302   0.7636    
CPI.l1               0.8251027  0.1592011   5.183 2.25e-06 ***
UnemploymentRate.l1  0.0185356  0.0138969   1.334   0.1869    
GDP.l1               0.1500489  0.1257429   1.193   0.2370    
USD.l2               0.0454964  0.0303770   1.498   0.1390    
InterestRate.l2     -0.0033796  0.0030509  -1.108   0.2720    
CPI.l2              -0.2960236  0.1306689  -2.265   0.0268 *  
UnemploymentRate.l2  0.0023137  0.0143750   0.161   0.8726    
GDP.l2               0.2888473  0.1330193   2.171   0.0335 *  
const               -1.6513563  0.3583769  -4.608 1.91e-05 ***
trend               -0.0015728  0.0003572  -4.403 4.00e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.006723 on 66 degrees of freedom
Multiple R-Squared: 0.9976, Adjusted R-squared: 0.9972 
F-statistic:  2468 on 11 and 66 DF,  p-value: < 2.2e-16 


Estimation results for equation UnemploymentRate: 
================================================= 
UnemploymentRate = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + const + trend 

                     Estimate Std. Error t value Pr(>|t|)    
USD.l1               0.036548   0.695835   0.053 0.958270    
InterestRate.l1     -0.008517   0.071376  -0.119 0.905384    
CPI.l1               2.649964   3.491625   0.759 0.450585    
UnemploymentRate.l1  0.006195   0.304789   0.020 0.983844    
GDP.l1              -9.563499   2.757814  -3.468 0.000929 ***
USD.l2              -0.253234   0.666233  -0.380 0.705092    
InterestRate.l2      0.052567   0.066912   0.786 0.434904    
CPI.l2              -1.158596   2.865851  -0.404 0.687317    
UnemploymentRate.l2  0.915813   0.315276   2.905 0.004995 ** 
GDP.l2               8.094701   2.917400   2.775 0.007181 ** 
const                7.156199   7.859980   0.910 0.365893    
trend                0.007159   0.007834   0.914 0.364124    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.1475 on 66 degrees of freedom
Multiple R-Squared: 0.8417, Adjusted R-squared: 0.8154 
F-statistic: 31.91 on 11 and 66 DF,  p-value: < 2.2e-16 


Estimation results for equation GDP: 
==================================== 
GDP = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1               0.0097499  0.0674215   0.145 0.885459    
InterestRate.l1      0.0023914  0.0069158   0.346 0.730602    
CPI.l1              -0.0806670  0.3383141  -0.238 0.812280    
UnemploymentRate.l1  0.1062055  0.0295319   3.596 0.000617 ***
GDP.l1               1.6667546  0.2672129   6.238 3.58e-08 ***
USD.l2               0.0157525  0.0645533   0.244 0.807970    
InterestRate.l2     -0.0052575  0.0064833  -0.811 0.420319    
CPI.l2              -0.2792651  0.2776810  -1.006 0.318231    
UnemploymentRate.l2 -0.0795905  0.0305480  -2.605 0.011329 *  
GDP.l2              -0.3403793  0.2826757  -1.204 0.232839    
const               -1.3408370  0.7615772  -1.761 0.082938 .  
trend               -0.0011383  0.0007591  -1.500 0.138497    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01429 on 66 degrees of freedom
Multiple R-Squared: 0.9967, Adjusted R-squared: 0.9962 
F-statistic:  1818 on 11 and 66 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
                        USD InterestRate        CPI UnemploymentRate        GDP
USD               6.177e-04   -0.0004103 -6.842e-05        0.0001653 -4.382e-05
InterestRate     -4.103e-04    0.2019778  1.376e-03       -0.0520121  4.928e-03
CPI              -6.842e-05    0.0013760  4.520e-05       -0.0002339  3.572e-05
UnemploymentRate  1.653e-04   -0.0520121 -2.339e-04        0.0217418 -1.914e-03
GDP              -4.382e-05    0.0049275  3.572e-05       -0.0019136  2.041e-04

Correlation matrix of residuals:
                      USD InterestRate     CPI UnemploymentRate     GDP
USD               1.00000     -0.03673 -0.4095          0.04512 -0.1234
InterestRate     -0.03673      1.00000  0.4554         -0.78488  0.7674
CPI              -0.40949      0.45542  1.0000         -0.23599  0.3719
UnemploymentRate  0.04512     -0.78488 -0.2360          1.00000 -0.9084
GDP              -0.12340      0.76742  0.3719         -0.90837  1.0000

Code

summary(fit <- VAR(ts_df2, p=3, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, InterestRate, CPI, UnemploymentRate, GDP 
Deterministic variables: both 
Sample size: 77 
Log Likelihood: 852.23 
Roots of the characteristic polynomial:
1.018 0.9244 0.917 0.917 0.7694 0.7694 0.6696 0.6696 0.5777 0.5777 0.5679 0.5679 0.5252 0.5252 0.479
Call:
VAR(y = ts_df2, p = 3, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + USD.l3 + InterestRate.l3 + CPI.l3 + UnemploymentRate.l3 + GDP.l3 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1               1.3041007  0.1346595   9.684 7.09e-14 ***
InterestRate.l1     -0.0001496  0.0120338  -0.012  0.99012    
CPI.l1               0.5322798  0.5861887   0.908  0.36749    
UnemploymentRate.l1  0.0252092  0.0545618   0.462  0.64573    
GDP.l1               0.6647715  0.5314519   1.251  0.21584    
USD.l2              -0.6246021  0.1957419  -3.191  0.00226 ** 
InterestRate.l2     -0.0333759  0.0172772  -1.932  0.05811 .  
CPI.l2               0.6049569  0.7416823   0.816  0.41792    
UnemploymentRate.l2 -0.1299901  0.0752514  -1.727  0.08924 .  
GDP.l2              -0.5487509  0.6584922  -0.833  0.40796    
USD.l3               0.1774575  0.1265508   1.402  0.16599    
InterestRate.l3      0.0241139  0.0112423   2.145  0.03602 *  
CPI.l3              -0.1384671  0.5144835  -0.269  0.78875    
UnemploymentRate.l3 -0.0170603  0.0552331  -0.309  0.75848    
GDP.l3              -0.8993484  0.5469347  -1.644  0.10534    
const                2.9549197  1.5195664   1.945  0.05652 .  
trend                0.0020854  0.0014830   1.406  0.16484    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.02368 on 60 degrees of freedom
Multiple R-Squared: 0.9619, Adjusted R-squared: 0.9518 
F-statistic: 94.76 on 16 and 60 DF,  p-value: < 2.2e-16 


Estimation results for equation InterestRate: 
============================================= 
InterestRate = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + USD.l3 + InterestRate.l3 + CPI.l3 + UnemploymentRate.l3 + GDP.l3 + const + trend 

                     Estimate Std. Error t value Pr(>|t|)    
USD.l1                1.70238    2.56584   0.663  0.50957    
InterestRate.l1       1.00951    0.22930   4.403 4.47e-05 ***
CPI.l1                4.57534   11.16939   0.410  0.68353    
UnemploymentRate.l1   3.08821    1.03964   2.970  0.00427 ** 
GDP.l1               32.83225   10.12643   3.242  0.00194 ** 
USD.l2               -0.06425    3.72972  -0.017  0.98631    
InterestRate.l2      -0.25232    0.32920  -0.766  0.44640    
CPI.l2               -8.08837   14.13221  -0.572  0.56923    
UnemploymentRate.l2  -4.00970    1.43386  -2.796  0.00693 ** 
GDP.l2              -24.83896   12.54708  -1.980  0.05233 .  
USD.l3                1.29772    2.41133   0.538  0.59245    
InterestRate.l3      -0.06865    0.21421  -0.320  0.74974    
CPI.l3               11.59664    9.80310   1.183  0.24149    
UnemploymentRate.l3  -0.09004    1.05243  -0.086  0.93211    
GDP.l3              -11.05328   10.42144  -1.061  0.29311    
const               -24.99209   28.95422  -0.863  0.39149    
trend                -0.03107    0.02826  -1.099  0.27599    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.4511 on 60 degrees of freedom
Multiple R-Squared:  0.94,  Adjusted R-squared: 0.924 
F-statistic: 58.78 on 16 and 60 DF,  p-value: < 2.2e-16 


Estimation results for equation CPI: 
==================================== 
CPI = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + USD.l3 + InterestRate.l3 + CPI.l3 + UnemploymentRate.l3 + GDP.l3 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1              -0.0456288  0.0352101  -1.296 0.199973    
InterestRate.l1      0.0001286  0.0031466   0.041 0.967535    
CPI.l1               0.8533293  0.1532737   5.567 6.40e-07 ***
UnemploymentRate.l1  0.0226905  0.0142666   1.590 0.116985    
GDP.l1               0.2555096  0.1389614   1.839 0.070908 .  
USD.l2               0.0003292  0.0511816   0.006 0.994889    
InterestRate.l2      0.0038095  0.0045176   0.843 0.402435    
CPI.l2              -0.7793480  0.1939314  -4.019 0.000166 ***
UnemploymentRate.l2 -0.0022688  0.0196764  -0.115 0.908588    
GDP.l2               0.1146839  0.1721793   0.666 0.507918    
USD.l3               0.0158616  0.0330899   0.479 0.633434    
InterestRate.l3     -0.0062358  0.0029396  -2.121 0.038037 *  
CPI.l3               0.4018145  0.1345246   2.987 0.004077 ** 
UnemploymentRate.l3  0.0037730  0.0144421   0.261 0.794795    
GDP.l3               0.1312249  0.1430098   0.918 0.362505    
const               -1.8760382  0.3973288  -4.722 1.45e-05 ***
trend               -0.0017959  0.0003878  -4.631 2.00e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.006191 on 60 degrees of freedom
Multiple R-Squared: 0.9981, Adjusted R-squared: 0.9975 
F-statistic:  1933 on 16 and 60 DF,  p-value: < 2.2e-16 


Estimation results for equation UnemploymentRate: 
================================================= 
UnemploymentRate = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + USD.l3 + InterestRate.l3 + CPI.l3 + UnemploymentRate.l3 + GDP.l3 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1               -0.158293   0.830263  -0.191 0.849441    
InterestRate.l1       0.042915   0.074196   0.578 0.565161    
CPI.l1                1.945295   3.614232   0.538 0.592408    
UnemploymentRate.l1  -0.272156   0.336409  -0.809 0.421709    
GDP.l1              -13.215106   3.276745  -4.033 0.000158 ***
USD.l2                0.258764   1.206875   0.214 0.830956    
InterestRate.l2      -0.050690   0.106525  -0.476 0.635910    
CPI.l2               -0.695936   4.572951  -0.152 0.879551    
UnemploymentRate.l2   1.019072   0.463974   2.196 0.031939 *  
GDP.l2                8.778332   4.060030   2.162 0.034606 *  
USD.l3               -0.540352   0.780268  -0.693 0.491282    
InterestRate.l3       0.065911   0.069316   0.951 0.345481    
CPI.l3               -1.920987   3.172123  -0.606 0.547075    
UnemploymentRate.l3   0.318939   0.340548   0.937 0.352746    
GDP.l3                4.901007   3.372206   1.453 0.151336    
const                 1.089946   9.369110   0.116 0.907776    
trend                 0.002167   0.009144   0.237 0.813485    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.146 on 60 degrees of freedom
Multiple R-Squared: 0.8588, Adjusted R-squared: 0.8212 
F-statistic: 22.81 on 16 and 60 DF,  p-value: < 2.2e-16 


Estimation results for equation GDP: 
==================================== 
GDP = USD.l1 + InterestRate.l1 + CPI.l1 + UnemploymentRate.l1 + GDP.l1 + USD.l2 + InterestRate.l2 + CPI.l2 + UnemploymentRate.l2 + GDP.l2 + USD.l3 + InterestRate.l3 + CPI.l3 + UnemploymentRate.l3 + GDP.l3 + const + trend 

                      Estimate Std. Error t value Pr(>|t|)    
USD.l1               0.0366198  0.0788740   0.464 0.644126    
InterestRate.l1     -0.0031791  0.0070486  -0.451 0.653591    
CPI.l1              -0.0831206  0.3433476  -0.242 0.809537    
UnemploymentRate.l1  0.1304955  0.0319584   4.083 0.000133 ***
GDP.l1               2.0371172  0.3112867   6.544 1.49e-08 ***
USD.l2              -0.0528169  0.1146516  -0.461 0.646698    
InterestRate.l2      0.0072805  0.0101198   0.719 0.474664    
CPI.l2              -0.1515052  0.4344247  -0.349 0.728499    
UnemploymentRate.l2 -0.1006042  0.0440769  -2.282 0.026021 *  
GDP.l2              -0.6154750  0.3856978  -1.596 0.115800    
USD.l3               0.0722026  0.0741245   0.974 0.333931    
InterestRate.l3     -0.0077263  0.0065850  -1.173 0.245301    
CPI.l3               0.0271473  0.3013478   0.090 0.928519    
UnemploymentRate.l3 -0.0085163  0.0323516  -0.263 0.793265    
GDP.l3              -0.2352782  0.3203554  -0.734 0.465548    
const               -0.9517904  0.8900539  -1.069 0.289190    
trend               -0.0007743  0.0008687  -0.891 0.376271    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01387 on 60 degrees of freedom
Multiple R-Squared: 0.9971, Adjusted R-squared: 0.9963 
F-statistic:  1288 on 16 and 60 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
                        USD InterestRate        CPI UnemploymentRate        GDP
USD               5.606e-04   -0.0007734 -5.581e-05        0.0003454 -6.159e-05
InterestRate     -7.734e-04    0.2035333  1.360e-03       -0.0514387  4.824e-03
CPI              -5.581e-05    0.0013605  3.833e-05       -0.0002594  3.925e-05
UnemploymentRate  3.454e-04   -0.0514387 -2.594e-04        0.0213112 -1.832e-03
GDP              -6.159e-05    0.0048239  3.925e-05       -0.0018322  1.923e-04

Correlation matrix of residuals:
                      USD InterestRate     CPI UnemploymentRate     GDP
USD               1.00000     -0.07241 -0.3807          0.09992 -0.1876
InterestRate     -0.07241      1.00000  0.4871         -0.78103  0.7710
CPI              -0.38072      0.48710  1.0000         -0.28699  0.4571
UnemploymentRate  0.09992     -0.78103 -0.2870          1.00000 -0.9050
GDP              -0.18757      0.77100  0.4571         -0.90501  1.0000

The models with p=1 and p=2 look good. However, only a few coefficients are significant in the model with p=3. Next, we perform cross-validation to determine the best model.
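Before running the cross-validation, it can help to see which windows the walk-forward scheme produces. The sketch below applies the same formulas for `k`, `num_iter`, and `st` to a placeholder quarterly series; the length of 80 observations is an assumption inferred from the reported sample size of 79 at p = 1, and the series values are dummies. As in the validation code, `h` serves as both the forecast horizon and the series frequency.

```r
# Placeholder quarterly series: 80 observations starting in 2005 Q1
# (assumption: length inferred from the sample size of 79 at p = 1)
y <- ts(seq_len(80), start = c(2005, 1), frequency = 4)

n <- length(y)
h <- 4                          # forecast horizon: 4 quarters
k <- floor(n / 3 / h) * h       # initial training size, a multiple of h
num_iter <- (n - k) / h         # number of folds
st <- tsp(y)[1] + (k - 1) / h   # time of the last initial training point

# Expanding training window, followed by a test window of h quarters
folds <- do.call(rbind, lapply(seq_len(num_iter), function(i) {
  train <- window(y, end = st + i - 1)
  test  <- window(y, start = st + (i - 1) + 1 / h, end = st + i)
  data.frame(fold = i,
             train_end  = tsp(train)[2],
             test_start = tsp(test)[1],
             test_end   = tsp(test)[2],
             n_train    = length(train),
             n_test     = length(test))
}))
print(folds)
```

Under these assumptions the first fold trains on 2005 Q1 to 2010 Q4 and tests on 2011, and each later fold adds one more year to the training window.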

Code

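# Fit a VAR(p) on the training window and return s-step-ahead point forecasts.
# ts: multivariate training series; year: start time of the forecast window;
# p: lag order; s: number of steps ahead
fun.var.2 <- function(ts, year, p, s){
  fit <- VAR(ts, p=p, type='both')
  fcast <- predict(fit, n.ahead = s)
  
  # Column 1 of each forecast matrix holds the point forecast
  f1<-fcast$fcst$USD
  f2<-fcast$fcst$InterestRate
  f3<-fcast$fcst$CPI
  f4<-fcast$fcst$UnemploymentRate
  f5<-fcast$fcst$GDP
  ff<-data.frame(f1[,1],f2[,1],f3[,1],f4[,1],f5[,1])
  # Use the frequency of the training series rather than the horizon s
  ff<-ts(ff,start=year,frequency = frequency(ts))
  return(ff)
}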
data <- ts_df2
n=nrow(data)
n_var = ncol(data)
h <- 4  # h: Forecast horizon
# k: Initial training set
# Calculate k as 1/3rd of the data, rounded down to the nearest multiple of h
k <- floor(n / 3 / h) * h
num_iter <- floor((n - k) / h)  # Number of complete rolling iterations

# Initialize matrices for the absolute forecast errors (one row per test point)
rmse1 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(1)
rmse2 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(2)
rmse3 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(3)

# Define rolling start time
st <- tsp(data)[1] + (k - 1) / h 

# Walk-Forward Validation Loop
for (i in 1:num_iter) {
  xtrain <- window(data, end = st + i - 1)
  xtest <- window(data, start = st + (i - 1) + 1/h, end = st + i)  # Test set for the next h periods (one year)
  test_start_year <- st + (i-1) + 1/h  # Start time of the forecast window, i.e. xtest

  ######## VAR(1) ############
  ff1 <- fun.var.2(xtrain, test_start_year, p=1, s=h)
  ######## VAR(2) ############
  ff2 <- fun.var.2(xtrain, test_start_year, p=2, s=h)
  ######## VAR(3) ############
  ff3 <- fun.var.2(xtrain, test_start_year, p=3, s=h)
  
  ##### collecting errors ######
  a = h*i-h+1
  b= h*i
  rmse1[c(a:b),]  <-abs(ff1-xtest)
  rmse2[c(a:b),]  <-abs(ff2-xtest)
  rmse3[c(a:b),]  <-abs(ff3-xtest)
}

rmse_combined <- as.data.frame(rbind(rmse1, rmse2, rmse3))
colnames(rmse_combined) = c("USD","InterestRate","CPI","UnemploymentRate","GDP")
rmse_combined$Model <- c(rep("VAR(1)", n-k),rep("VAR(2)", n-k),rep("VAR(3)", n-k))
rmse_combined$Date <- time(data)[(k+1):n]

# Plot the absolute CV errors for USD, colored by model
ggplot(data = rmse_combined, aes(x = Date, y = USD, color = Model)) + 
  geom_line() +
  labs(
    title = "CV Error for USD",
    x = "Date",
    y = "Error",
    color = "Model"
  ) +
  theme_minimal()

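The VAR(1) model performs much better in cross-validation, so we select it. Its estimated lag-1 coefficient matrix is shown below; the constant and trend terms are omitted.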

\[ \begin{aligned} \left[\begin{array}{l} \hat{USD_t} \\ \hat{InterestRate_t} \\ \hat{CPI_t} \\ \hat{UnemploymentRate_t} \\ \hat{GDP_t} \end{array}\right] &= \left[\begin{array}{ccccc} 0.8602541 & 0.0121483 & 0.4560117 & -0.0894047 & 0.2034453 \\ 2.94889 & 0.75927 & 1.47241 & -0.35278 & 3.51256 \\ 0.0097524 & -0.0015307 & 0.6599743 & 0.0148023 & 0.3207071\\ -0.324474 & 0.019355 & 2.673199 & 0.704855 & -2.635176 \\ 0.0481664 & 0.0003173 & -0.2530517 & 0.0390789 & 1.2398195 \end{array}\right] \left[\begin{array}{l} \text{USD}_{t-1} \\ \text{InterestRate}_{t-1} \\ \text{CPI}_{t-1} \\ \text{UnemploymentRate}_{t-1} \\ \text{GDP}_{t-1} \end{array}\right] \end{aligned} \]

Code

# Fit a VAR(1) model including both a constant and trend
fit <- VAR(data, p = 1, type = "both")

forecast(fit, h=h) %>%
  autoplot() + 
  xlab("Year") +
  theme_minimal()

The forecast looks reasonable: the VAR(1) model appears to capture the overall pattern of all five variables.

VAR: USD ~ Stock Price + Bitcoin Price

Now, let’s analyze how the US Dollar Index interacts with other financial assets, such as the S&P 500 Index and Bitcoin.

Code

# USD and S&P500 available on trading days
# Bitcoin prices (2015-2024) available daily

# Extract trading days
dxy_15 <- dxy[2523:dim(dxy)[1],]
sp5_15 <- GSPC[2518:dim(GSPC)[1],]
trading_days <- as.Date(intersect(as.Date(rownames(dxy_15)),as.Date(index(sp5_15))))
dxy_15 <- filter(dxy_15, Date %in% trading_days)
sp5_15 <- sp5_15[trading_days]
btc_15 <- `BTC-USD`[trading_days]

df3 <- data.frame(Date = dxy_15$Date, 
                  USD = log(dxy_15$Close), 
                  SP500 = log(sp5_15$GSPC.Close),
                  Bitcoin = log(btc_15[,"BTC-USD.Close"])
                  )
colnames(df3) <- c("Date", "USD", "SP500", "Bitcoin")

plot_usd <- plot_ly(df3, x = ~Date, y = ~USD, type = 'scatter', mode = 'lines', name = 'U.S. Dollar')
plot_sp5 <- plot_ly(df3, x = ~Date, y = ~SP500, type = 'scatter', mode = 'lines', name = 'S&P500 Index') 
plot_btc <- plot_ly(df3, x = ~Date, y = ~Bitcoin, type = 'scatter', mode = 'lines', name = 'Bitcoin') 

subplot(plot_usd, plot_sp5, plot_btc, nrows = 3, shareX = TRUE) %>%
  layout(title = "Trend of U.S. Dollar and Related Variables", showlegend = FALSE,
    xaxis = list(title = 'Date'),
    yaxis = list(title = 'U.S. Dollar'),
    yaxis2 = list(title = 'S&P500 Index'),
    yaxis3 = list(title = 'Bitcoin'))

Code

ts_df3 <- ts(df3[,-1], start = c(2015,1), frequency = 252)
VARselect(ts_df3, lag.max=6, type="both")

$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     5      2      2      5 

$criteria
                   1             2             3             4             5
AIC(n) -2.617662e+01 -2.620759e+01 -2.621035e+01 -2.620885e+01 -2.621068e+01
HQ(n)  -2.616396e+01 -2.618734e+01 -2.618251e+01 -2.617342e+01 -2.616766e+01
SC(n)  -2.614175e+01 -2.615181e+01 -2.613365e+01 -2.611123e+01 -2.609215e+01
FPE(n)  4.281929e-12  4.151340e-12  4.139908e-12  4.146114e-12  4.138518e-12
                   6
AIC(n) -2.620763e+01
HQ(n)  -2.615701e+01
SC(n)  -2.606817e+01
FPE(n)  4.151193e-12

AIC and FPE select a lag length of 5, while HQ and SC select 2, so we fit both candidates.
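The selection row can be reproduced from the criteria table by taking the lag with the smallest value for each criterion. A small base-R sketch follows, with the values copied from the output above; the object names `criteria` and `selected` are introduced here for illustration.

```r
# Criteria values copied from the VARselect() output above (lags 1 to 6)
criteria <- rbind(
  AIC = c(-26.17662, -26.20759, -26.21035, -26.20885, -26.21068, -26.20763),
  HQ  = c(-26.16396, -26.18734, -26.18251, -26.17342, -26.16766, -26.15701),
  SC  = c(-26.14175, -26.15181, -26.13365, -26.11123, -26.09215, -26.06817),
  FPE = c(4.281929e-12, 4.151340e-12, 4.139908e-12,
          4.146114e-12, 4.138518e-12, 4.151193e-12)
)
colnames(criteria) <- 1:6

# For every criterion, the preferred lag is the column with the minimum value
selected <- apply(criteria, 1, which.min)
print(selected)
```

The differences between neighbouring lags are small, which is why the criteria that penalise extra parameters more heavily (HQ and SC) settle on the shorter lag.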

Code

summary(fit <- VAR(ts_df3, p=2, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, SP500, Bitcoin 
Deterministic variables: both 
Sample size: 2511 
Log Likelihood: 22238.469 
Roots of the characteristic polynomial:
0.998 0.9882 0.9882 0.1622 0.04563 0.009754
Call:
VAR(y = ts_df3, p = 2, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + SP500.l1 + Bitcoin.l1 + USD.l2 + SP500.l2 + Bitcoin.l2 + const + trend 

             Estimate Std. Error t value Pr(>|t|)    
USD.l1      1.004e+00  2.000e-02  50.213  < 2e-16 ***
SP500.l1   -4.421e-02  7.942e-03  -5.566 2.88e-08 ***
Bitcoin.l1  3.686e-04  2.039e-03   0.181   0.8566    
USD.l2     -1.404e-02  2.003e-02  -0.701   0.4834    
SP500.l2    4.576e-02  7.943e-03   5.762 9.36e-09 ***
Bitcoin.l2 -8.659e-04  2.038e-03  -0.425   0.6710    
const       3.618e-02  1.998e-02   1.811   0.0703 .  
trend       8.613e-07  7.377e-07   1.168   0.2431    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.004332 on 2503 degrees of freedom
Multiple R-Squared: 0.9924, Adjusted R-squared: 0.9923 
F-statistic: 4.641e+04 on 7 and 2503 DF,  p-value: < 2.2e-16 


Estimation results for equation SP500: 
====================================== 
SP500 = USD.l1 + SP500.l1 + Bitcoin.l1 + USD.l2 + SP500.l2 + Bitcoin.l2 + const + trend 

             Estimate Std. Error t value Pr(>|t|)    
USD.l1     -1.135e-01  5.142e-02  -2.207  0.02738 *  
SP500.l1    8.537e-01  2.042e-02  41.806  < 2e-16 ***
Bitcoin.l1  8.256e-05  5.242e-03   0.016  0.98744    
USD.l2      1.012e-01  5.150e-02   1.965  0.04948 *  
SP500.l2    1.336e-01  2.042e-02   6.541 7.39e-11 ***
Bitcoin.l2  4.339e-05  5.241e-03   0.008  0.99339    
const       1.507e-01  5.138e-02   2.933  0.00338 ** 
trend       5.742e-06  1.897e-06   3.027  0.00249 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01114 on 2503 degrees of freedom
Multiple R-Squared: 0.9988, Adjusted R-squared: 0.9988 
F-statistic: 2.938e+05 on 7 and 2503 DF,  p-value: < 2.2e-16 


Estimation results for equation Bitcoin: 
======================================== 
Bitcoin = USD.l1 + SP500.l1 + Bitcoin.l1 + USD.l2 + SP500.l2 + Bitcoin.l2 + const + trend 

             Estimate Std. Error t value Pr(>|t|)    
USD.l1      1.870e-01  2.011e-01   0.930  0.35253    
SP500.l1   -7.479e-02  7.987e-02  -0.936  0.34918    
Bitcoin.l1  9.902e-01  2.050e-02  48.294  < 2e-16 ***
USD.l2     -2.622e-01  2.014e-01  -1.302  0.19309    
SP500.l2    5.666e-02  7.988e-02   0.709  0.47816    
Bitcoin.l2  5.311e-03  2.050e-02   0.259  0.79557    
const       5.083e-01  2.009e-01   2.529  0.01149 *  
trend       1.990e-05  7.419e-06   2.683  0.00735 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.04357 on 2503 degrees of freedom
Multiple R-Squared: 0.9994, Adjusted R-squared: 0.9994 
F-statistic: 5.993e+05 on 7 and 2503 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
               USD      SP500    Bitcoin
USD      1.877e-05 -6.093e-06 -1.548e-05
SP500   -6.093e-06  1.241e-04  1.100e-04
Bitcoin -1.548e-05  1.100e-04  1.898e-03

Correlation matrix of residuals:
             USD   SP500  Bitcoin
USD      1.00000 -0.1263 -0.08204
SP500   -0.12627  1.0000  0.22677
Bitcoin -0.08204  0.2268  1.00000

Code

summary(fit <- VAR(ts_df3, p=5, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, SP500, Bitcoin 
Deterministic variables: both 
Sample size: 2508 
Log Likelihood: 22243.537 
Roots of the characteristic polynomial:
0.9979 0.9884 0.9884 0.5469 0.5469 0.4885 0.4885 0.4613 0.4613 0.4553 0.4553 0.4274 0.4118 0.4118 0.2657
Call:
VAR(y = ts_df3, p = 5, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + SP500.l1 + Bitcoin.l1 + USD.l2 + SP500.l2 + Bitcoin.l2 + USD.l3 + SP500.l3 + Bitcoin.l3 + USD.l4 + SP500.l4 + Bitcoin.l4 + USD.l5 + SP500.l5 + Bitcoin.l5 + const + trend 

             Estimate Std. Error t value Pr(>|t|)    
USD.l1      1.004e+00  2.019e-02  49.692  < 2e-16 ***
SP500.l1   -4.489e-02  8.068e-03  -5.564 2.92e-08 ***
Bitcoin.l1  5.992e-04  2.046e-03   0.293   0.7696    
USD.l2     -3.504e-02  2.858e-02  -1.226   0.2203    
SP500.l2    4.381e-02  1.064e-02   4.117 3.96e-05 ***
Bitcoin.l2  1.722e-03  2.879e-03   0.598   0.5497    
USD.l3     -2.433e-03  2.858e-02  -0.085   0.9322    
SP500.l3   -6.585e-03  1.073e-02  -0.613   0.5396    
Bitcoin.l3 -3.375e-03  2.879e-03  -1.172   0.2412    
USD.l4      5.154e-02  2.848e-02   1.810   0.0705 .  
SP500.l4    7.833e-03  1.063e-02   0.737   0.4612    
Bitcoin.l4 -4.209e-03  2.878e-03  -1.462   0.1438    
USD.l5     -2.700e-02  2.009e-02  -1.344   0.1792    
SP500.l5    1.531e-03  8.087e-03   0.189   0.8499    
Bitcoin.l5  4.790e-03  2.044e-03   2.344   0.0192 *  
const       3.277e-02  2.024e-02   1.619   0.1056    
trend       7.349e-07  7.459e-07   0.985   0.3246    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.004329 on 2491 degrees of freedom
Multiple R-Squared: 0.9924, Adjusted R-squared: 0.9923 
F-statistic: 2.03e+04 on 16 and 2491 DF,  p-value: < 2.2e-16 


Estimation results for equation SP500: 
====================================== 
SP500 = USD.l1 + SP500.l1 + Bitcoin.l1 + USD.l2 + SP500.l2 + Bitcoin.l2 + USD.l3 + SP500.l3 + Bitcoin.l3 + USD.l4 + SP500.l4 + Bitcoin.l4 + USD.l5 + SP500.l5 + Bitcoin.l5 + const + trend 

             Estimate Std. Error t value Pr(>|t|)    
USD.l1     -9.620e-02  5.164e-02  -1.863 0.062625 .  
SP500.l1    8.629e-01  2.063e-02  41.822  < 2e-16 ***
Bitcoin.l1 -5.043e-05  5.231e-03  -0.010 0.992309    
USD.l2     -8.612e-04  7.310e-02  -0.012 0.990600    
SP500.l2    1.746e-01  2.721e-02   6.417 1.65e-10 ***
Bitcoin.l2  1.359e-02  7.362e-03   1.846 0.064947 .  
USD.l3      1.829e-01  7.310e-02   2.502 0.012396 *  
SP500.l3   -5.604e-02  2.745e-02  -2.042 0.041299 *  
Bitcoin.l3 -1.069e-02  7.362e-03  -1.452 0.146554    
USD.l4     -1.860e-01  7.283e-02  -2.553 0.010730 *  
SP500.l4   -6.211e-02  2.718e-02  -2.285 0.022372 *  
Bitcoin.l4  2.599e-03  7.361e-03   0.353 0.724016    
USD.l5      8.840e-02  5.139e-02   1.720 0.085498 .  
SP500.l5    6.827e-02  2.068e-02   3.301 0.000977 ***
Bitcoin.l5 -5.350e-03  5.226e-03  -1.024 0.306082    
const       1.453e-01  5.176e-02   2.808 0.005029 ** 
trend       5.658e-06  1.907e-06   2.966 0.003043 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01107 on 2491 degrees of freedom
Multiple R-Squared: 0.9988, Adjusted R-squared: 0.9988 
F-statistic: 1.299e+05 on 16 and 2491 DF,  p-value: < 2.2e-16 


Estimation results for equation Bitcoin: 
======================================== 
Bitcoin = USD.l1 + SP500.l1 + Bitcoin.l1 + USD.l2 + SP500.l2 + Bitcoin.l2 + USD.l3 + SP500.l3 + Bitcoin.l3 + USD.l4 + SP500.l4 + Bitcoin.l4 + USD.l5 + SP500.l5 + Bitcoin.l5 + const + trend 

             Estimate Std. Error t value Pr(>|t|)    
USD.l1      2.032e-01  2.032e-01   1.000  0.31745    
SP500.l1   -6.824e-02  8.119e-02  -0.840  0.40071    
Bitcoin.l1  9.906e-01  2.058e-02  48.124  < 2e-16 ***
USD.l2     -3.502e-01  2.876e-01  -1.218  0.22348    
SP500.l2    1.201e-01  1.071e-01   1.122  0.26210    
Bitcoin.l2  9.846e-03  2.897e-02   0.340  0.73396    
USD.l3      3.971e-01  2.876e-01   1.381  0.16754    
SP500.l3   -9.210e-02  1.080e-01  -0.853  0.39392    
Bitcoin.l3  4.471e-02  2.897e-02   1.544  0.12278    
USD.l4     -3.625e-01  2.866e-01  -1.265  0.20603    
SP500.l4   -6.490e-02  1.069e-01  -0.607  0.54395    
Bitcoin.l4 -2.697e-02  2.896e-02  -0.931  0.35186    
USD.l5      3.185e-02  2.022e-01   0.158  0.87484    
SP500.l5    8.691e-02  8.137e-02   1.068  0.28558    
Bitcoin.l5 -2.301e-02  2.056e-02  -1.119  0.26328    
const       5.353e-01  2.037e-01   2.628  0.00864 ** 
trend       2.106e-05  7.505e-06   2.806  0.00505 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.04355 on 2491 degrees of freedom
Multiple R-Squared: 0.9994, Adjusted R-squared: 0.9994 
F-statistic: 2.613e+05 on 16 and 2491 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
               USD      SP500    Bitcoin
USD      1.874e-05 -6.191e-06 -1.519e-05
SP500   -6.191e-06  1.225e-04  1.085e-04
Bitcoin -1.519e-05  1.085e-04  1.897e-03

Correlation matrix of residuals:
             USD   SP500  Bitcoin
USD      1.00000 -0.1292 -0.08059
SP500   -0.12921  1.0000  0.22495
Bitcoin -0.08059  0.2250  1.00000

The results look similar. Next, we will perform cross-validation to determine the best model.

Code

fun.var.3 <- function(ts, year, p, s){
  fit <- VAR(ts, p=p, type='both')
  fcast <- predict(fit, n.ahead = s)
  
  f1<-fcast$fcst$USD
  f2<-fcast$fcst$SP500
  f3<-fcast$fcst$Bitcoin
  ff<-data.frame(f1[,1],f2[,1],f3[,1])
  ff<-ts(ff,start=c(year,1),frequency = s)
  return(ff)
}

data <- ts_df3
n=nrow(data)
n_var = ncol(data)
h <- 252  # h: Forecast horizon
# k: Initial training set
# Calculate k as 1/3rd of the data, rounded down to the nearest multiple of h
k <- floor(n / 3 / h) * h
num_iter <- floor((n - k) / h)  # Number of complete rolling iterations

# Initialize matrices for the absolute forecast errors (one row per test point)
rmse1 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(2)
rmse2 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(5)

# Define rolling start time
st <- tsp(data)[1] + (k - 1) / h 

# Walk-Forward Validation Loop
for (i in 1:num_iter) {
  xtrain <- window(data, end = st + i - 1)
  xtest <- window(data, start = st + (i - 1) + 1/h, end = st + i)  # Test set for the next h trading days (about one year)
  test_start_year <- st + (i-1) + 1/h  # Start time of the forecast window, i.e. xtest

  ######## VAR(2) ############
  ff1 <- fun.var.3(xtrain, test_start_year, p=2, s=h)
  ######## VAR(5) ############
  ff2 <- fun.var.3(xtrain, test_start_year, p=5, s=h)
  
  ##### collecting errors ######
  a = h*i-h+1
  b= h*i
  rmse1[c(a:b),]  <-abs(ff1-xtest)
  rmse2[c(a:b),]  <-abs(ff2-xtest)
}

rmse_combined <- as.data.frame(rbind(rmse1, rmse2))
colnames(rmse_combined) = c("USD","S&P500","Bitcoin")
rmse_combined$Model <- c(rep("VAR(2)", n-k),rep("VAR(5)", n-k))
rmse_combined$Date <- time(data)[(k+1):n]

# Plot the absolute CV errors for USD, colored by model
ggplot(data = rmse_combined, aes(x = Date, y = USD, color = Model)) + 
  geom_line() +
  labs(
    title = "CV Error for USD",
    x = "Date",
    y = "Error",
    color = "Model"
  ) +
  theme_minimal()

The cross-validation results are very similar. Considering the principle of parsimony, we select VAR(2) as the best model.

\[ \begin{aligned} \left[\begin{array}{l} \hat{USD_t} \\ \hat{SP500_t} \\ \hat{Bitcoin_t} \end{array}\right] &= \left[\begin{array}{ccc} 1.004 & -0.04421 & 0.0003686 \\ -0.1135 & 0.8537 & 0.00008256 \\ 0.1870 & -0.07479 & 0.9902 \end{array}\right] \left[\begin{array}{l} \text{USD}_{t-1} \\ \text{SP500}_{t-1} \\ \text{Bitcoin}_{t-1} \\ \end{array}\right] \\ &+ \left[\begin{array}{ccc} -0.01404 & 0.04576 & -0.0008659 \\ 0.1012 & 0.1336 & 0.00004339 \\ -0.2622 & 0.05666 & 0.005311 \end{array}\right] \left[\begin{array}{l} \text{USD}_{t-2} \\ \text{SP500}_{t-2} \\ \text{Bitcoin}_{t-2} \\ \end{array}\right] \end{aligned} \]
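To make the fitted equation concrete, the following base-R sketch computes a one-step-ahead forecast by hand from the coefficients in the VAR(2) summary above. Because the model is fitted with type = "both", the sketch also adds the constant and trend terms from that summary. The helper `var2_one_step()`, the input log-levels, and the time index are hypothetical; the trend regressor is assumed to count observations from the start of the sample.

```r
# Coefficients copied from the VAR(2) summary above (rows: USD, SP500, Bitcoin)
vars_names <- c("USD", "SP500", "Bitcoin")
A1 <- matrix(c( 1.004,  -0.04421, 0.0003686,
               -0.1135,  0.8537,  0.00008256,
                0.1870, -0.07479, 0.9902),
             nrow = 3, byrow = TRUE, dimnames = list(vars_names, vars_names))
A2 <- matrix(c(-0.01404, 0.04576, -0.0008659,
                0.1012,  0.1336,   0.00004339,
               -0.2622,  0.05666,  0.005311),
             nrow = 3, byrow = TRUE, dimnames = list(vars_names, vars_names))
const <- c(0.03618, 0.1507, 0.5083)
trend <- c(8.613e-07, 5.742e-06, 1.990e-05)

# One-step-ahead forecast: A1 y_{t-1} + A2 y_{t-2} + const + trend * t
var2_one_step <- function(y_lag1, y_lag2, t_index) {
  drop(A1 %*% y_lag1 + A2 %*% y_lag2) + const + trend * t_index
}

# Hypothetical log-levels for the two most recent days (illustration only)
y_lag1 <- c(USD = 4.60, SP500 = 8.50, Bitcoin = 10.80)
y_lag2 <- c(USD = 4.61, SP500 = 8.49, Bitcoin = 10.75)
y_hat <- var2_one_step(y_lag1, y_lag2, t_index = 2500)  # assumed observation index
print(round(y_hat, 4))
```

Multi-step forecasts follow by feeding each forecast back in as the new lag, which is what `predict()` does internally for the fitted model.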

Code

# Fit a VAR(2) model including both a constant and trend
fit <- VAR(data, p = 2, type = "both")

forecast(fit, h=h) %>%
  autoplot() + 
  scale_x_continuous(breaks = 2015:2026) +
  xlab("Year") +
  theme_minimal()

VAR: USD ~ Oil Price + Gold Price

Code

xau_m <- xau %>%
  mutate(month = floor_date(Date, "month")) %>% 
  group_by(month) %>%
  summarise(Price = mean(Price, na.rm = TRUE))

df4 <- data.frame(Date = dxy_m$month, 
                  USD = log(dxy_m$Close), 
                  OilPrice = log(oil$Oil),
                  GoldPrice = log(xau_m$Price)
                  )

plot_usd <- plot_ly(df4, x = ~Date, y = ~USD, type = 'scatter', mode = 'lines', name = 'U.S. Dollar')
plot_oil <- plot_ly(df4, x = ~Date, y = ~OilPrice, type = 'scatter', mode = 'lines', name = 'Oil Price') 
plot_gold <- plot_ly(df4, x = ~Date, y = ~GoldPrice, type = 'scatter', mode = 'lines', name = 'Gold Price') 

subplot(plot_usd, plot_oil, plot_gold, nrows = 3, shareX = TRUE) %>%
  layout(title = "Trend of U.S. Dollar and Related Variables", showlegend = FALSE,
    xaxis = list(title = 'Date'),
    yaxis = list(title = 'U.S. Dollar'),
    yaxis2 = list(title = 'Crude Oil Price'),
    yaxis3 = list(title = 'Gold Price'))

We can see that the USD and oil prices tend to move in opposite directions.

Code

ts_df4 <- ts(df4[,-1], start = c(2005,1), frequency = 12)
VARselect(ts_df4, lag.max=6, type="both")

$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     2      2      2      2 

$criteria
                   1             2             3             4             5
AIC(n) -1.937674e+01 -1.952636e+01 -1.948332e+01 -1.946905e+01 -1.944988e+01
HQ(n)  -1.928743e+01 -1.938347e+01 -1.928684e+01 -1.921899e+01 -1.914624e+01
SC(n)  -1.915525e+01 -1.917197e+01 -1.899603e+01 -1.884887e+01 -1.869680e+01
FPE(n)  3.844119e-09  3.310129e-09  3.456157e-09  3.506587e-09  3.575685e-09
                   6
AIC(n) -1.938645e+01
HQ(n)  -1.902922e+01
SC(n)  -1.850047e+01
FPE(n)  3.811676e-09

All four criteria select a lag length of 2. We nevertheless compare VAR(2) with the more parsimonious VAR(1) using the model summaries and cross-validation.

Code

summary(fit <- VAR(ts_df4, p=1, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, OilPrice, GoldPrice 
Deterministic variables: both 
Sample size: 239 
Log Likelihood: 1314.286 
Roots of the characteristic polynomial:
0.9684 0.9331 0.9331
Call:
VAR(y = ts_df4, p = 1, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + OilPrice.l1 + GoldPrice.l1 + const + trend 

               Estimate Std. Error t value Pr(>|t|)    
USD.l1        9.410e-01  2.727e-02  34.502  < 2e-16 ***
OilPrice.l1   7.723e-03  4.265e-03   1.811  0.07145 .  
GoldPrice.l1 -1.745e-02  7.716e-03  -2.262  0.02464 *  
const         3.367e-01  1.620e-01   2.079  0.03874 *  
trend         1.734e-04  6.664e-05   2.602  0.00986 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01713 on 234 degrees of freedom
Multiple R-Squared: 0.9746, Adjusted R-squared: 0.9742 
F-statistic:  2246 on 4 and 234 DF,  p-value: < 2.2e-16 


Estimation results for equation OilPrice: 
========================================= 
OilPrice = USD.l1 + OilPrice.l1 + GoldPrice.l1 + const + trend 

               Estimate Std. Error t value Pr(>|t|)    
USD.l1       -2.541e-01  1.666e-01  -1.525    0.129    
OilPrice.l1   9.008e-01  2.606e-02  34.567   <2e-16 ***
GoldPrice.l1  2.492e-02  4.715e-02   0.529    0.598    
const         1.374e+00  9.897e-01   1.388    0.166    
trend         8.798e-05  4.072e-04   0.216    0.829    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.1046 on 234 degrees of freedom
Multiple R-Squared: 0.8942, Adjusted R-squared: 0.8924 
F-statistic: 494.6 on 4 and 234 DF,  p-value: < 2.2e-16 


Estimation results for equation GoldPrice: 
========================================== 
GoldPrice = USD.l1 + OilPrice.l1 + GoldPrice.l1 + const + trend 

               Estimate Std. Error t value Pr(>|t|)    
USD.l1        4.209e-02  6.091e-02   0.691    0.490    
OilPrice.l1  -9.052e-03  9.526e-03  -0.950    0.343    
GoldPrice.l1  9.914e-01  1.723e-02  57.530   <2e-16 ***
const        -7.635e-02  3.618e-01  -0.211    0.833    
trend        -4.755e-05  1.488e-04  -0.319    0.750    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.03825 on 234 degrees of freedom
Multiple R-Squared: 0.9913, Adjusted R-squared: 0.9911 
F-statistic:  6642 on 4 and 234 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
                 USD   OilPrice  GoldPrice
USD        0.0002933 -0.0004969 -0.0002689
OilPrice  -0.0004969  0.0109508  0.0001376
GoldPrice -0.0002689  0.0001376  0.0014632

Correlation matrix of residuals:
              USD OilPrice GoldPrice
USD        1.0000 -0.27724  -0.41049
OilPrice  -0.2772  1.00000   0.03439
GoldPrice -0.4105  0.03439   1.00000

Code

summary(fit <- VAR(ts_df4, p=2, type="both"))


VAR Estimation Results:
========================= 
Endogenous variables: USD, OilPrice, GoldPrice 
Deterministic variables: both 
Sample size: 238 
Log Likelihood: 1334.167 
Roots of the characteristic polynomial:
0.9707 0.8674 0.8674 0.4025 0.2636 0.1065
Call:
VAR(y = ts_df4, p = 2, type = "both")


Estimation results for equation USD: 
==================================== 
USD = USD.l1 + OilPrice.l1 + GoldPrice.l1 + USD.l2 + OilPrice.l2 + GoldPrice.l2 + const + trend 

               Estimate Std. Error t value Pr(>|t|)    
USD.l1        1.2027754  0.0714199  16.841  < 2e-16 ***
OilPrice.l1  -0.0113234  0.0106528  -1.063  0.28892    
GoldPrice.l1 -0.0131396  0.0306965  -0.428  0.66901    
USD.l2       -0.2850616  0.0717925  -3.971 9.59e-05 ***
OilPrice.l2   0.0171081  0.0105528   1.621  0.10635    
GoldPrice.l2 -0.0064260  0.0308520  -0.208  0.83519    
const         0.4600009  0.1584409   2.903  0.00405 ** 
trend         0.0002079  0.0000649   3.204  0.00155 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.01631 on 230 degrees of freedom
Multiple R-Squared: 0.9773, Adjusted R-squared: 0.9766 
F-statistic:  1417 on 7 and 230 DF,  p-value: < 2.2e-16 


Estimation results for equation OilPrice: 
========================================= 
OilPrice = USD.l1 + OilPrice.l1 + GoldPrice.l1 + USD.l2 + OilPrice.l2 + GoldPrice.l2 + const + trend 

               Estimate Std. Error t value Pr(>|t|)    
USD.l1       -3.476e-01  4.359e-01  -0.797    0.426    
OilPrice.l1   1.201e+00  6.502e-02  18.468  < 2e-16 ***
GoldPrice.l1  1.588e-01  1.873e-01   0.848    0.398    
USD.l2        1.539e-01  4.382e-01   0.351    0.726    
OilPrice.l2  -3.120e-01  6.441e-02  -4.844 2.34e-06 ***
GoldPrice.l2 -1.185e-01  1.883e-01  -0.629    0.530    
const         1.060e+00  9.670e-01   1.096    0.274    
trend        -4.941e-05  3.961e-04  -0.125    0.901    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.09955 on 230 degrees of freedom
Multiple R-Squared: 0.9054, Adjusted R-squared: 0.9025 
F-statistic: 314.5 on 7 and 230 DF,  p-value: < 2.2e-16 


Estimation results for equation GoldPrice: 
========================================== 
GoldPrice = USD.l1 + OilPrice.l1 + GoldPrice.l1 + USD.l2 + OilPrice.l2 + GoldPrice.l2 + const + trend 

               Estimate Std. Error t value Pr(>|t|)    
USD.l1       -6.861e-02  1.678e-01  -0.409    0.683    
OilPrice.l1  -2.366e-02  2.503e-02  -0.945    0.346    
GoldPrice.l1  1.071e+00  7.212e-02  14.850   <2e-16 ***
USD.l2        1.107e-01  1.687e-01   0.656    0.512    
OilPrice.l2   1.827e-02  2.479e-02   0.737    0.462    
GoldPrice.l2 -8.309e-02  7.249e-02  -1.146    0.253    
const        -6.986e-02  3.723e-01  -0.188    0.851    
trend        -2.538e-05  1.525e-04  -0.166    0.868    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 0.03832 on 230 degrees of freedom
Multiple R-Squared: 0.9911, Adjusted R-squared: 0.9909 
F-statistic:  3670 on 7 and 230 DF,  p-value: < 2.2e-16 



Covariance matrix of residuals:
                 USD   OilPrice  GoldPrice
USD        0.0002660 -0.0003718 -0.0002632
OilPrice  -0.0003718  0.0099101  0.0001536
GoldPrice -0.0002632  0.0001536  0.0014687

Correlation matrix of residuals:
              USD OilPrice GoldPrice
USD        1.0000 -0.22898  -0.42112
OilPrice  -0.2290  1.00000   0.04025
GoldPrice -0.4211  0.04025   1.00000

Code

fun.var.4 <- function(ts, year, p, s){
  fit <- VAR(ts, p=p, type='both')
  fcast <- predict(fit, n.ahead = s)
  
  f1<-fcast$fcst$USD
  f2<-fcast$fcst$OilPrice
  f3<-fcast$fcst$GoldPrice
  ff<-data.frame(f1[,1],f2[,1],f3[,1])
  ff<-ts(ff,start=c(year,1),frequency = s)
  return(ff)
}

data <- ts_df4
n=nrow(data)
n_var = ncol(data)
h <- 12  # h: Forecast horizon
# k: Initial training set
# Calculate k as 1/3rd of the data, rounded down to the nearest multiple of 12
k <- floor(n / 3 / h) * h
num_iter <- (n - k) / h  # Number of rolling iterations

# Initialize matrices for the absolute forecast errors (one row per test point)
rmse1 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(1)
rmse2 <- matrix(NA, nrow = n-k, ncol = n_var)  # Absolute errors for VAR(2)

# Define rolling start time
st <- tsp(data)[1] + (k - 1) / h 

# Walk-Forward Validation Loop
for (i in 1:num_iter) {
  xtrain <- window(data, end = st + i - 1)
  xtest <- window(data, start = st + (i - 1) + 1/h, end = st + i)  # Test set for the next 12 months
  test_start_year <- st + (i-1) + 1/h  # Start time of the forecast window, i.e. xtest

  ######## VAR(1) ############
  ff1 <- fun.var.4(xtrain, test_start_year, p=1, s=h)
  ######## VAR(2) ############
  ff2 <- fun.var.4(xtrain, test_start_year, p=2, s=h)
  
  ##### collecting errors ######
  a = h*i-h+1
  b= h*i
  rmse1[c(a:b),]  <-abs(ff1-xtest)
  rmse2[c(a:b),]  <-abs(ff2-xtest)
}

rmse_combined <- as.data.frame(rbind(rmse1, rmse2))
colnames(rmse_combined) = c("USD","Oil Price","Gold Price")
rmse_combined$Model <- c(rep("VAR(1)", n-k),rep("VAR(2)", n-k))
rmse_combined$Date <- time(data)[(k+1):n]

# Plot the absolute CV errors for USD, colored by model
ggplot(data = rmse_combined, aes(x = Date, y = USD, color = Model)) + 
  geom_line() +
  labs(
    title = "CV Error for USD",
    x = "Date",
    y = "Error",
    color = "Model"
  ) +
  theme_minimal()

The cross-validation result of VAR(1) is not significantly better than that of VAR(2). Therefore, we proceed with VAR(2) as suggested by VARselect(), since it has a lower AIC.
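Error curves that lie close together are hard to rank by eye, so a numeric summary per model can make the comparison less subjective. The sketch below shows one way to condense a matrix of forecast errors (rows are test points, columns are variables) into a mean absolute error and a root mean squared error per variable. `summarise_errors()` is a hypothetical helper and the error values are made up for illustration. Note that the matrices named `rmse1` and `rmse2` in the code above store absolute errors.

```r
# Hypothetical helper: summarise a matrix of forecast errors
# (rows = test points, columns = variables) into MAE and RMSE per variable.
summarise_errors <- function(err) {
  rbind(
    MAE  = colMeans(abs(err), na.rm = TRUE),
    RMSE = sqrt(colMeans(err^2, na.rm = TRUE))
  )
}

# Made-up errors for two variables, for illustration only;
# the NA mimics a test point that was never filled in.
err_toy <- cbind(USD = c(0.1, -0.3, 0.2, NA), GDP = c(1, -1, 1, -1))
cv_summary <- summarise_errors(err_toy)
print(round(cv_summary, 4))
```

Applying such a helper to each model's error matrix would give one row of MAE and RMSE values per model, which could then be compared directly.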

\[ \begin{aligned} \left[\begin{array}{l} \hat{USD_t} \\ \hat{OilPrice_t} \\ \hat{GoldPrice_t} \end{array}\right] &= \left[\begin{array}{ccc} 1.2027754 & -0.0113234 & -0.0131396 \\ -0.3476 & 1.201 & 0.1588 \\ -0.06861 & -0.02366 & 1.071 \end{array}\right] \left[\begin{array}{l} \text{USD}_{t-1} \\ \text{OilPrice}_{t-1} \\ \text{GoldPrice}_{t-1} \\ \end{array}\right] \\ &+ \left[\begin{array}{ccc} -0.2850616 & 0.0171081 & -0.0064260 \\ 0.1539 & -0.3120 & -0.1185 \\ 0.1107 & 0.01827 & -0.08309 \end{array}\right] \left[\begin{array}{l} \text{USD}_{t-2} \\ \text{OilPrice}_{t-2} \\ \text{GoldPrice}_{t-2} \\ \end{array}\right] \end{aligned} \]

Code

# Fit a VAR(2) model including both a constant and trend
fit <- VAR(data, p = 2, type = "both")

forecast(fit, h=h) %>%
  autoplot() + 
  xlab("Year") +
  theme_minimal()

SARIMAX: USD ~ Egg Price

In this analysis, we will explore how the domestic market, specifically egg prices, affects the USD.

Code

df5 <- data.frame(Date = dxy_m$month, dxy = log(dxy_m$Close), egg = log(egg$Price))

plot_usd <- plot_ly(df5, x = ~Date, y = ~dxy, type = 'scatter', mode = 'lines', name = 'U.S. Dollar')
plot_egg <- plot_ly(df5, x = ~Date, y = ~egg, type = 'scatter', mode = 'lines', name = 'Egg Price') 

subplot(plot_usd, plot_egg, nrows = 2, shareX = TRUE) %>%
  layout(title = "Trend of U.S. Dollar and Egg Price", showlegend = FALSE,
    xaxis = list(title = 'Date'),
    yaxis = list(title = 'U.S. Dollar'),
    yaxis2 = list(title = 'Egg Price'))

The USD and egg prices follow very similar trends. We therefore fit an ARIMAX model with the USD as the dependent variable and egg prices as the exogenous variable.

Code

ts_df5 <- ts(df5, start = c(2005, 1), frequency = 12)
auto.arima(ts_df5[,"dxy"], xreg = ts_df5[,"egg"])

Series: ts_df5[, "dxy"] 
Regression with ARIMA(1,1,2)(2,0,1)[12] errors 

Coefficients:
          ar1     ma1     ma2    sar1     sar2    sma1    xreg
      -0.9684  1.2983  0.3595  -0.704  -0.1496  0.5654  0.0019
s.e.   0.0264  0.0678  0.0635   0.449   0.0765  0.4505  0.0125

sigma^2 = 0.0002711:  log likelihood = 645.57
AIC=-1275.13   AICc=-1274.5   BIC=-1247.32
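For reference, the regression with ARIMA(1,1,2)(2,0,1)[12] errors reported above can be written out explicitly. This is a sketch that assumes R's sign convention (AR terms enter with a minus sign, MA terms with a plus sign). The notation is introduced here: \(y_t\) is the log USD index, \(x_t\) the log egg price, \(B\) the backshift operator, \(\eta_t\) the regression error, and \(\varepsilon_t\) white noise.

\[ \begin{aligned} y_t &= 0.0019\, x_t + \eta_t, \\ (1 + 0.9684B)(1 + 0.704B^{12} + 0.1496B^{24})(1 - B)\,\eta_t &= (1 + 1.2983B + 0.3595B^{2})(1 + 0.5654B^{12})\,\varepsilon_t \end{aligned} \]

Note that the estimated coefficient on egg prices is small relative to its standard error (0.0019 versus 0.0125).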

Code

# Fit a linear model
lm_fit.5 <- lm(dxy ~ egg, data = df5)
res.5 <- ts(residuals(lm_fit.5), start = c(2005, 1), frequency = 12)

# ACF and PACF plots of the residuals
ggtsdisplay(diff(res.5), main = "Differenced residuals") # p=0:1, d=1, q=0:1, P=0:3, D=1, Q=0:1

Code

# Manual search
output=SARIMA.c(p1=0,p2=1,q1=0,q2=1,P1=0,P2=3,Q1=0,Q2=1,s=12,data=res.5)
highlight_output(output)

Comparison of ARIMA Models
p	d	q	P	D	Q	AIC	BIC	AICc
0	1	0	0	1	0	-955.7432	-952.3182	-955.7254
0	1	0	0	1	1	-1097.7315	-1090.8816	-1097.6779
0	1	0	1	1	0	-1028.9996	-1022.1497	-1028.9460
0	1	0	1	1	1	-1097.9083	-1087.6334	-1097.8007
0	1	0	2	1	0	-1068.5904	-1058.3155	-1068.4828
0	1	0	2	1	1	-1095.9367	-1082.2369	-1095.7565
0	1	0	3	1	0	-1071.0916	-1057.3918	-1070.9114
0	1	0	3	1	1	-1095.7146	-1078.5899	-1095.4431
0	1	1	0	1	0	-972.4749	-965.6250	-972.4213
0	1	1	0	1	1	-1108.1003	-1097.8254	-1107.9927
0	1	1	1	1	0	-1045.4959	-1035.2210	-1045.3882
0	1	1	1	1	1	-1107.3199	-1093.6201	-1107.1398
0	1	1	2	1	0	-1076.4356	-1062.7358	-1076.2554
0	1	1	2	1	1	-1105.3775	-1088.2528	-1105.1060
0	1	1	3	1	0	-1078.2845	-1061.1598	-1078.0130
0	1	1	3	1	1	-1104.3276	-1083.7779	-1103.9458
1	1	0	0	1	0	-975.9562	-969.1063	-975.9027
1	1	0	0	1	1	-1110.0948	-1099.8199	-1109.9872
1	1	0	1	1	0	-1047.1080	-1036.8331	-1047.0003
1	1	0	1	1	1	-1109.1529	-1095.4531	-1108.9727
1	1	0	2	1	0	-1077.3774	-1063.6776	-1077.1972
1	1	0	2	1	1	-1107.1979	-1090.0731	-1106.9264
1	1	0	3	1	0	-1079.5166	-1062.3919	-1079.2451
1	1	0	3	1	1	-1105.6125	-1085.0628	-1105.2306
1	1	1	0	1	0	-974.5777	-964.3029	-974.4701
1	1	1	0	1	1	-1108.5119	-1094.8121	-1108.3317
1	1	1	1	1	0	-1045.1624	-1031.4626	-1044.9822
1	1	1	1	1	1	-1107.5445	-1090.4198	-1107.2730
1	1	1	2	1	0	-1075.5337	-1058.4089	-1075.2622
1	1	1	2	1	1	-1105.5719	-1085.0222	-1105.1901
1	1	1	3	1	0	-1077.8525	-1057.3028	-1077.4707
1	1	1	3	1	1	-1104.2687	-1080.2940	-1103.7573

Code

# Model Diagnostics
model_output <- capture.output(sarima(res.5, 1,1,0,0,1,1,12))

Code

start_line <- grep("Coefficients", model_output)
end_line <- length(model_output)
cat(model_output[start_line:end_line], sep = "\n")

Coefficients: 
     Estimate     SE  t.value p.value
ar1    0.2471 0.0641   3.8530   2e-04
sma1  -1.0000 0.0612 -16.3362   0e+00

sigma^2 estimated as 0.0003659517 on 225 degrees of freedom 
 
AIC = -4.890285  AICc = -4.890049  BIC = -4.845022

The best model for the residuals of the linear model is ARIMA(1,1,0)(0,1,1)[12], which has the lowest AIC, BIC, and AICc among the candidates.

The Residual Plot of the ARIMA model shows nearly consistent fluctuation around zero, suggesting that the residuals are nearly stationary with a constant mean and finite variance over time.

The Autocorrelation Function (ACF) of the residuals shows mostly independence.

The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.

The Ljung-Box Test p-values are mostly above the 0.05 significance level, so we cannot reject the null hypothesis of no autocorrelation; little autocorrelation is left in the residuals.

All coefficients are significant at the 5% significance level.
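As a reminder of how to read the Ljung-Box p-values, the following base-R sketch applies `Box.test()` to two simulated series: white noise, where the null hypothesis of no autocorrelation holds, and an AR(1) process, where it clearly does not. The series, seed, and lag are illustrative assumptions, not part of the analysis above.

```r
set.seed(123)

# Two simulated series: no autocorrelation vs. strong autocorrelation
white_noise <- rnorm(200)
ar1_series  <- as.numeric(arima.sim(model = list(ar = 0.8), n = 200))

# Ljung-Box test; H0: no autocorrelation up to the given lag
lb_white <- Box.test(white_noise, lag = 12, type = "Ljung-Box")
lb_ar1   <- Box.test(ar1_series,  lag = 12, type = "Ljung-Box")

cat(sprintf("White noise p-value: %.3f\nAR(1) p-value: %.3g\n",
            lb_white$p.value, lb_ar1$p.value))
```

A small p-value, as for the AR(1) series, signals remaining autocorrelation; a large one is consistent with uncorrelated residuals. When the test is applied to residuals of a fitted ARMA model, the `fitdf` argument of `Box.test()` can be set to the number of estimated ARMA coefficients to adjust the degrees of freedom.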

Code

# Define parameters
data <- ts_df5
n <- nrow(data)  # Total observations
h <- 12  # h: Forecast horizon
# k: Initial training set
# Calculate k as 1/3rd of the data, rounded down to the nearest multiple of 12
k <- floor(n / 3 / h) * h
num_iter <- floor((n - k) / h)  # Number of rolling iterations (whole folds only)

# Initialize matrices of squared forecast errors (rows: folds, columns: horizons)
rmse1 <- matrix(NA, nrow = num_iter, ncol = h)  # Squared errors for Model 1
rmse2 <- matrix(NA, nrow = num_iter, ncol = h)  # Squared errors for Model 2

# Define rolling start time
st <- tsp(data)[1] + (k - 1) / h 

# Walk-Forward Validation Loop
for (i in 1:num_iter) {
  xtrain <- window(data, end = st + i - 1)
  xtest <- window(data, start = st + (i - 1) + 1/h, end = st + i)  # Test set for the next 12 months

  # Fit auto.arima() model
  model.1 <- Arima(xtrain[,"dxy"], order = c(1,1,2), 
                  seasonal = list(order = c(2,0,1), period = 12),
                  xreg = xtrain[, "egg"],
                  optim.control = list(maxit = 10000))   
  f.1 <-   forecast(model.1, xreg = xtest[, "egg"], h = h)
  rmse1[i,] <-  (f.1$mean - xtest[,"dxy"])^2

  ###### Fit manual model
  model.2 <- Arima(xtrain[, "dxy"], order = c(1,1,0),
                  seasonal = list(order = c(0,1,1), period = 12),
                  xreg = xtrain[, "egg"],
                  optim.control = list(maxit = 10000))
  f.2 <- forecast(model.2, xreg = xtest[, "egg"], h = h)
  rmse2[i,] <-  (f.2$mean - xtest[,"dxy"])^2
}   

# Compute RMSE across all iterations
rmse1_avg <- sqrt(colMeans(rmse1, na.rm = TRUE))
rmse2_avg <- sqrt(colMeans(rmse2, na.rm = TRUE))

# Create a DataFrame for better visualization
error_table <- data.frame(
    Horizon = 1:h,
    RMSE_Model1 = rmse1_avg,
    RMSE_Model2 = rmse2_avg
)

# **Improved RMSE Plot using ggplot2**
ggplot(error_table, aes(x = Horizon)) +
  geom_line(aes(y = RMSE_Model1, color = "Regression with ARIMA(1,1,2)(2,0,1)[12] errors"), size = 1) +
  geom_line(aes(y = RMSE_Model2, color = "Regression with ARIMA(1,1,0)(0,1,1)[12] errors"), size = 1) +
  labs(title = "RMSE Comparison for 12-Step Forecasts",
       x = "Forecast Horizon (Months Ahead)",
       y = "Root Mean Squared Error (RMSE)") +
  scale_color_manual(name = "Models", values = c("red", "blue")) +
  theme_minimal()

The RMSE comparison favors the regression with ARIMA(1,1,2)(2,0,1)[12] errors, so we select it as the best model.
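To make the walk-forward scheme concrete, the sketch below reproduces only the index arithmetic of the loop: which observations fall into each training and test window. It assumes n = 240 monthly observations (an assumption for illustration; the actual length of `ts_df5` is not shown here), and `make_split()` is a hypothetical helper, not part of any package.

```r
# Sketch of the rolling-origin index arithmetic (base R only).
# n = 240 monthly observations is an assumption for illustration.
n <- 240
h <- 12                           # forecast horizon = one seasonal cycle
k <- floor(n / 3 / h) * h         # initial training size: 72
num_iter <- floor((n - k) / h)    # number of folds: 14

# make_split() is a hypothetical helper: fold i trains on observations
# 1..(k + (i - 1) * h) and tests on the following h observations.
make_split <- function(i) {
  train_end <- k + (i - 1) * h
  list(train = 1:train_end, test = (train_end + 1):(train_end + h))
}
splits <- lapply(seq_len(num_iter), make_split)

range(splits[[1]]$test)           # 73 84
range(splits[[num_iter]]$test)    # 229 240
```

The training window expands by one year per fold while the test window always covers the next 12 months, so each column of the squared-error matrices corresponds to a fixed forecast horizon.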

Fit and summarize the best model:

Code

best_model <- Arima(data[,"dxy"], order = c(1,1,2), 
                seasonal = list(order = c(2,0,1), period = 12),
                xreg = data[, "egg"],
                optim.control = list(maxit = 10000))   
summary(best_model)

Series: data[, "dxy"] 
Regression with ARIMA(1,1,2)(2,0,1)[12] errors 

Coefficients:
          ar1     ma1     ma2    sar1     sar2    sma1    xreg
      -0.9684  1.2983  0.3595  -0.704  -0.1496  0.5654  0.0019
s.e.   0.0264  0.0678  0.0635   0.449   0.0765  0.4505  0.0125

sigma^2 = 0.0002711:  log likelihood = 645.57
AIC=-1275.13   AICc=-1274.5   BIC=-1247.32

Training set error measures:
                       ME       RMSE        MAE        MPE    MAPE      MASE
Training set 0.0009160148 0.01618699 0.01272872 0.01965128 0.28394 0.2172217
                    ACF1
Training set -0.01524275

Model Equation

\[ \begin{gathered} y_t=\beta x_t+n_t \\ \Phi_P\left(B^s\right) \varphi(B) \nabla_s^D \nabla^d n_t=\delta+\Theta_Q\left(B^s\right) \theta(B) w_t \end{gathered} \]

With the parameters from the model: \[ \begin{gathered} y_t=\beta x_t+n_t \\ y_t=0.0019 \text { Egg Price }+n_t \end{gathered} \]

With \(\operatorname{ARIMA}(1,1,2)(2,0,1)[12]\) errors: \[ \begin{gathered} \Phi_P\left(B^{12}\right) \varphi(B) \nabla_{12}^{D=0} \nabla^{d=1} n_t=\Theta_Q\left(B^{12}\right) \theta(B) w_t \\ (1-\Phi_1 B^{12} - \Phi_2 B^{24})\left(1-\varphi_1 B\right)(1-B) n_t=\left(1+\Theta_1 B^{12}\right)\left(1+\theta_1 B+\theta_2 B^2\right) w_t \\ (1+0.704 B^{12} +0.1496 B^{24})\left(1+0.9684 B\right)(1-B) n_t=\left(1+0.5654 B^{12}\right)\left(1+1.2983 B+0.3595 B^2\right) w_t \end{gathered} \]
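As a quick sanity check on the fitted equation, one can verify that each polynomial has its roots outside the unit circle (stationarity for the AR parts, invertibility for the MA parts). This is a minimal sketch using base R's `polyroot()`, with coefficients copied from the `summary(best_model)` output; it is an illustrative check, not part of the original analysis.

```r
# Sketch: moduli of the roots of each fitted polynomial.
# polyroot() takes coefficients in increasing order of power.
ar_poly  <- c(1, 0.9684)            # 1 - phi_1 z, with phi_1 = -0.9684
ma_poly  <- c(1, 1.2983, 0.3595)    # 1 + theta_1 z + theta_2 z^2
sar_poly <- c(1, 0.704, 0.1496)     # 1 - Phi_1 z - Phi_2 z^2, with z = B^12
sma_poly <- c(1, 0.5654)            # 1 + Theta_1 z, with z = B^12

root_moduli <- lapply(
  list(AR = ar_poly, MA = ma_poly, SAR = sar_poly, SMA = sma_poly),
  function(p) Mod(polyroot(p))
)
print(root_moduli)

# TRUE for a component when all of its roots lie outside the unit circle
outside <- sapply(root_moduli, function(m) all(m > 1))
print(outside)
```

All four components pass, although the non-seasonal AR root (modulus about 1.03) sits close to the unit circle, which is worth keeping in mind when interpreting the forecasts.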

Code

arima_egg <- auto.arima(ts_df5[,"egg"])
f_egg <- forecast(arima_egg, h = h)
f.5 <-   forecast(best_model, xreg = f_egg$mean, h = h)
autoplot(f.5) +
  labs(title = "12-Month Forecast of Log-Transformed U.S. Dollar",
       x = "Time", y = "Log-transformed USD") +
  theme_minimal()

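The forecast looks reasonable and points to a slight upward trend in the USD. Note, however, that the egg price coefficient (0.0019, s.e. 0.0125) is not statistically distinguishable from zero, so the SARIMAX fit is driven mainly by the ARIMA error structure rather than by the exogenous variable.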

ARIMAX: USD ~ House Price + International Visitors

Here, we will examine how the real estate and tourism markets affect the USD.

Code

visitors_q <- visitors %>%
  mutate(quarter = floor_date(time, "quarter")) %>% 
  group_by(quarter) %>%
  summarise(count = mean(count, na.rm = TRUE))
df6 <- data.frame(Date = dxy_q$quarter, 
                  dxy = log(dxy_q$Close), 
                  house = log(house$index),
                  visitors = log(visitors_q$count))
plot_usd <- plot_ly(df6, x = ~Date, y = ~dxy, type = 'scatter', mode = 'lines', name = 'U.S. Dollar')
plot_house <- plot_ly(df6, x = ~Date, y = ~house, type = 'scatter', mode = 'lines', name = 'House Price') 
plot_visitors <- plot_ly(df6, x = ~Date, y = ~visitors, type = 'scatter', mode = 'lines', name = 'Visitors') 

subplot(plot_usd, plot_house, plot_visitors, nrows = 3, shareX = TRUE) %>%
  layout(title = "Trend of U.S. Dollar and Related Variables ", showlegend = FALSE,
    xaxis = list(title = 'Date'),
    yaxis = list(title = 'U.S. Dollar'),
    yaxis2 = list(title = 'House Price'),
    yaxis3 = list(title = "Num of Intl Visitors"))

Both the USD and house prices show an upward trend over the sample. We therefore fit an ARIMAX model with the USD as the dependent variable and house prices and international visitors as the exogenous variables.

Code

ts_df6 <- ts(df6, start = c(2005, 1), frequency = 4)
auto.arima(ts_df6[,"dxy"], xreg = ts_df6[,c("house", "visitors")])

Series: ts_df6[, "dxy"] 
Regression with ARIMA(0,1,1) errors 

Coefficients:
         ma1   house  visitors
      0.4995  0.3585   -0.0118
s.e.  0.0878  0.2114    0.0071

sigma^2 = 0.0007382:  log likelihood = 174.14
AIC=-340.28   AICc=-339.74   BIC=-330.8

Code

# Fit a linear model
lm_fit.6 <- lm(dxy ~ house + visitors, data = df6)
res.6 <- ts(residuals(lm_fit.6), start = c(2005, 1), frequency = 4)

# ACF and PACF plots of the residuals
ggtsdisplay(diff(res.6), main = "Differenced Residuals") # p=0:2, d=1, q=0:3

Code

# Manual search
output=ARIMA.c(p1=0,p2=2,q1=0,q2=3,data=res.6)
highlight_output(output)

Comparison of ARIMA Models
p	d	q	AIC	BIC	AICc
0	1	0	-315.2008	-310.4620	-315.0430
0	1	1	-325.5822	-318.4739	-325.2622
0	1	2	-323.6752	-314.1974	-323.1346
0	1	3	-326.5282	-314.6809	-325.7062
1	1	0	-321.2992	-314.1908	-320.9792
1	1	1	-323.6205	-314.1428	-323.0800
1	1	2	-328.3456	-316.4983	-327.5237
1	1	3	-327.2125	-312.9958	-326.0458
2	1	0	-327.2298	-317.7520	-326.6893
2	1	1	-327.7637	-315.9165	-326.9418
2	1	2	-326.5323	-312.3157	-325.3657
2	1	3	-324.5698	-307.9837	-322.9924

Code

# Model Diagnostics
model_output <- capture.output(sarima(res.6, 0,1,1))

Code

start_line <- grep("Coefficients", model_output)
end_line <- length(model_output)
cat(model_output[start_line:end_line], sep = "\n")

Coefficients: 
         Estimate     SE t.value p.value
ma1        0.4289 0.0995  4.3111  0.0000
constant  -0.0001 0.0047 -0.0299  0.9763

sigma^2 estimated as 0.0008781503 on 77 degrees of freedom 
 
AIC = -4.121294  AICc = -4.119295  BIC = -4.031315

Code

model_output <- capture.output(sarima(res.6, 1,1,2))

Code

start_line <- grep("Coefficients", model_output)
end_line <- length(model_output)
cat(model_output[start_line:end_line], sep = "\n")

Coefficients: 
         Estimate     SE t.value p.value
ar1        0.7855 0.0786  9.9877  0.0000
ma1       -0.5163 0.1041 -4.9618  0.0000
ma2       -0.4837 0.0977 -4.9491  0.0000
constant   0.0005 0.0008  0.5959  0.5531

sigma^2 estimated as 0.0007832046 on 75 degrees of freedom 
 
AIC = -4.156273  AICc = -4.149431  BIC = -4.006308

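The two candidates for the residuals of the linear model are ARIMA(1,1,2), which has the lowest AIC and AICc, and ARIMA(0,1,1), which has the lowest BIC and matches the auto.arima() selection. Both are carried forward to cross-validation.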
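Because the three criteria differ only in their penalty terms, AICc and BIC can be recovered from AIC once the parameter count and sample size are known. The sketch below does this for the ARIMA(0,1,1) row of the comparison table. The values k = 3 (ma1, constant, sigma^2) and n = 79 (observations after differencing) are inferred assumptions, not stated in the text.

```r
# Sketch: recover AICc and BIC from AIC for the ARIMA(0,1,1) table row.
# Assumptions (inferred): k = 3 estimated parameters, n = 79 observations.
aic <- -325.5822
k <- 3
n <- 79

aicc <- aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
bic  <- aic + k * (log(n) - 2)                # replaces the 2k penalty with k*log(n)

round(c(AIC = aic, AICc = aicc, BIC = bic), 4)
# matches the table row: -325.5822 -325.2622 -318.4739
```

BIC penalizes each extra parameter more heavily than AIC whenever log(n) > 2, which is why it prefers the smaller ARIMA(0,1,1) while AIC and AICc prefer ARIMA(1,1,2).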

For both models:

The Residual Plot of the ARIMA model shows nearly consistent fluctuation around zero, suggesting that the residuals are nearly stationary with a constant mean and finite variance over time.

The Autocorrelation Function (ACF) of the residuals shows no significant autocorrelation.

The Q-Q Plot indicates that the residuals follow a near-normal distribution, with minor deviations at the tails, which is typical in time series data.

The Ljung-Box Test p-values are all above the 0.05 significance level, implying that no autocorrelations are left in the residuals.

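All AR and MA coefficients are significant at the 5% significance level. The constant is not significant in either model (p = 0.9763 for ARIMA(0,1,1) and p = 0.5531 for ARIMA(1,1,2)).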

Code

# Define parameters
data <- ts_df6
n <- nrow(data)  # Total observations
h <- 4  # h: Forecast horizon
# k: Initial training set
# Calculate k as 1/3rd of the data, rounded down to the nearest multiple of 4
k <- floor(n / 3 / h) * h
num_iter <- floor((n - k) / h)  # Number of rolling iterations (whole folds only)

# Initialize matrices of squared forecast errors (rows: folds, columns: horizons)
rmse1 <- matrix(NA, nrow = num_iter, ncol = h)  # Squared errors for Model 1
rmse2 <- matrix(NA, nrow = num_iter, ncol = h)  # Squared errors for Model 2

# Define rolling start time
st <- tsp(data)[1] + (k - 1) / h 

# Walk-Forward Validation Loop
for (i in 1:num_iter) {
  xtrain <- window(data, end = st + i - 1)
  xtest <- window(data, start = st + (i - 1) + 1/h, end = st + i)  # Test set for the next 4 quarters

  # Fit auto.arima() model
  model.1 <- Arima(xtrain[,"dxy"], order = c(0,1,1), 
                  xreg = xtrain[, c("house", "visitors")],
                  optim.control = list(maxit = 10000))   
  f.1 <-   forecast(model.1, xreg = xtest[, c("house", "visitors")], h = h)
  rmse1[i,] <-  (f.1$mean - xtest[,"dxy"])^2

  ###### Fit manual model
  model.2 <- Arima(xtrain[, "dxy"], order = c(1,1,2),
                  xreg = xtrain[, c("house", "visitors")],
                  optim.control = list(maxit = 10000))
  f.2 <- forecast(model.2, xreg = xtest[, c("house", "visitors")], h = h)
  rmse2[i,] <-  (f.2$mean - xtest[,"dxy"])^2
}   

# Compute RMSE across all iterations
rmse1_avg <- sqrt(colMeans(rmse1, na.rm = TRUE))
rmse2_avg <- sqrt(colMeans(rmse2, na.rm = TRUE))

# Create a DataFrame for better visualization
error_table <- data.frame(
    Horizon = 1:h,
    RMSE_Model1 = rmse1_avg,
    RMSE_Model2 = rmse2_avg
)

# **Improved RMSE Plot using ggplot2**
ggplot(error_table, aes(x = Horizon)) +
  geom_line(aes(y = RMSE_Model1, color = "Regression with ARIMA(0,1,1) errors"), size = 1) +
  geom_line(aes(y = RMSE_Model2, color = "Regression with ARIMA(1,1,2) errors"), size = 1) +
  labs(title = "RMSE Comparison for 4-Step Forecasts",
       x = "Forecast Horizon (Quarters Ahead)",
       y = "Root Mean Squared Error (RMSE)") +
  scale_color_manual(name = "Models", values = c("red", "blue")) +
  theme_minimal()

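The RMSE comparison favors the regression with ARIMA(0,1,1) errors, which is also the specification chosen by auto.arima() and by BIC.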

Fit and summarize the best model:

Code

best_model <- Arima(data[,"dxy"], order = c(0,1,1), 
                xreg = data[, c("house", "visitors")],
                optim.control = list(maxit = 10000))   
summary(best_model)

Series: data[, "dxy"] 
Regression with ARIMA(0,1,1) errors 

Coefficients:
         ma1   house  visitors
      0.4995  0.3585   -0.0118
s.e.  0.0878  0.2114    0.0071

sigma^2 = 0.0007382:  log likelihood = 174.14
AIC=-340.28   AICc=-339.74   BIC=-330.8

Training set error measures:
                       ME       RMSE        MAE          MPE      MAPE
Training set 2.440354e-05 0.02648176 0.02082905 0.0001847146 0.4645378
                  MASE       ACF1
Training set 0.3713074 0.03777751
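The `summary()` output reports standard errors but no p-values. A rough way to gauge significance is an approximate Wald z-test built from the printed estimates and standard errors. The sketch below assumes the normal approximation is adequate for this sample size (an assumption, not something the output confirms).

```r
# Sketch: approximate Wald z-tests from the printed estimates and
# standard errors. The normal approximation is an assumption.
est <- c(ma1 = 0.4995, house = 0.3585, visitors = -0.0118)
se  <- c(ma1 = 0.0878, house = 0.2114, visitors = 0.0071)

z <- est / se
p <- 2 * pnorm(-abs(z))          # two-sided p-values

round(data.frame(estimate = est, se = se, z = z, p.value = p), 4)
```

Under this approximation only ma1 clears the 5% threshold; both regressors have |z| below 1.96.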

Model Equation

\[ \begin{gathered} y_t=\beta x_t+n_t \\ \varphi(B) \nabla^d n_t=\delta+ \theta(B) w_t \end{gathered} \]

With the parameters from the model: \[ \begin{gathered} y_t=\beta x_t+n_t \\ y_t=0.3585 \text { House Price } -0.0118 \text { International Visitors } +n_t \end{gathered} \]

With \(\operatorname{ARIMA}(0,1,1)\) errors: \[ \begin{gathered} \varphi(B) \nabla^{d=1} n_t= \theta(B) w_t \\ (1-B) n_t=\left(1+\theta_1 B\right) w_t \\ (1-B) n_t=\left(1+0.4995 B\right) w_t \end{gathered} \]
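Applying the difference operator to both sides of the regression and substituting the error equation gives an equivalent form in first differences. Here \(x_{1,t}\) and \(x_{2,t}\) are shorthand introduced for this derivation (log house price and log international visitors, respectively):

\[ \begin{gathered} (1-B) y_t=\beta_1(1-B) x_{1, t}+\beta_2(1-B) x_{2, t}+(1-B) n_t \\ \nabla y_t=0.3585 \nabla x_{1, t}-0.0118 \nabla x_{2, t}+w_t+0.4995 w_{t-1} \end{gathered} \]

Read this way, the coefficients relate quarterly changes in the regressors to quarterly changes in the log dollar index, with an MA(1) disturbance.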

Code

arima_house <- auto.arima(ts_df6[,"house"])
f_house <- forecast(arima_house, h = h)
arima_visitors <- auto.arima(ts_df6[,"visitors"])
f_visitors <- forecast(arima_visitors, h = h)
xreg <- cbind(f_house$mean, f_visitors$mean)
colnames(xreg) <- c("house", "visitors")
f.6 <-   forecast(best_model, xreg = xreg, h = h)
autoplot(f.6) +
  labs(title = "4-Quarter Forecast of Log-Transformed U.S. Dollar",
       x = "Time", y = "Log-transformed USD") +
  theme_minimal()

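The forecast looks reasonable and points to an upward trend in the USD. The house price coefficient (0.3585, s.e. 0.2114) and the international visitors coefficient (-0.0118, s.e. 0.0071) are each less than two standard errors from zero, so the contribution of the exogenous variables in this ARIMAX model should be read as suggestive rather than conclusive.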

Conclusion

🔹 First, to represent international trade, I built a VAR(1) model including the trade deficit, commodity prices, and the CPI.

The parameter matrix shows that a widening trade deficit leads to dollar depreciation, which aligns with economic intuition — a negative trade balance puts downward pressure on the currency. Commodity prices and the USD are also negatively correlated, likely reflecting that rising global commodity prices increase import costs. CPI, however, is positively related to the dollar, possibly due to expectations of tighter monetary policy in response to inflation.

🔹 Second, for macroeconomic fundamentals, I modeled the USD alongside interest rates, CPI, unemployment rate, and GDP.

The model suggests the dollar strengthens when interest rates, inflation, and GDP increase — which is consistent with capital inflows attracted by higher returns and a stronger economy. Unemployment, by contrast, is negatively related to the USD, reflecting that economic slack weakens confidence in the currency.

🔹 Third, I explored financial market dynamics using a VAR(2) model with S&P 500 and Bitcoin prices.

At lag one, the dollar and Bitcoin appear positively correlated — this may indicate that both are being used as alternative safe-haven assets. The S&P 500, however, is negatively related to the dollar, supporting the idea that when investors are optimistic about risky assets, the dollar tends to weaken.

🔹 Fourth, I examined the global commodity market with oil and gold prices in a VAR(2) framework.

As expected, gold is negatively related to the USD, consistent with its role as a traditional hedge when the dollar weakens. Oil also shows a negative relationship — rising oil prices may signal global inflation or geopolitical risks that weigh on the dollar. This finding also aligns with the earlier observation that the USD is negatively related to the global commodity price index.

🔹 Fifth, I introduced a SARIMAX model using egg prices as a proxy for domestic food inflation and supply-side shocks. This model predicted a slight upward trend in the USD.

🔹 Finally, to capture the real economy, I applied an ARIMAX(0,1,1) model using U.S. house prices and the number of international visitors.

The house price coefficient was positive, which may reflect stronger domestic demand and investment sentiment. The coefficient on international visitors was negative, perhaps because a stronger dollar reduces the attractiveness of the U.S. as a travel destination. Neither coefficient is significant at the 5% level, so both relationships are tentative. This model also produced an upward trend in the USD forecast.

Overall, the models suggest that the USD is influenced by a complex interplay of trade balances, macroeconomic fundamentals, financial market dynamics, commodity prices, and domestic economic indicators. The forecasts indicate a generally upward trend in the USD, driven by strong fundamentals and positive sentiment in the U.S. economy.