Michael W. McCracken and Joseph T. McGillicuddy, "An Empirical 
Investigation of Direct and Iterated Multistep Conditional Forecasts",
Journal of Applied Econometrics, Vol. 34, No. 2, 2019, pp. 181-204.

OVERVIEW

This readme file details the data and MATLAB program files used to
produce the empirical results in the paper's Table 1, Tables 3-9, and 
Figure 1. All files are zipped in the file mm-files.zip, which should
be unzipped in the directory where the programs are to be run.

The main scripts must be run from the top-level directory, one level above
the subfolders listed in the folder map below. (The data file is in a
subfolder of this directory, as the provided programs require.)
Instructions on how to run the main MATLAB scripts to produce the desired
results, and a description of those results, are included at the end of
this file.


CONTACT INFORMATION

michael.w.mccracken [AT] stls.frb.org
(314) 444-8594


FOLDER MAP

Subfolders:
    Data - contains data files used 
    FcnsFcast - contains MATLAB functions used for the forecasting
        exercises
    FcnsLeSage - contains MATLAB functions from the LeSage Toolbox used
        to analyze output
    FcnsOrg - contains MATLAB functions used to organize the output
    FinalOutput - where final output files are written
    RawOutput - where .mat files from the forecasting exercises are written
    Matlab-scripts - contains the main MATLAB scripts.

Main MATLAB scripts:
    bivar_exercises.m - runs the bivariate forecasting exercises
    trivar_exercises.m - runs the trivariate forecasting exercises, 
        including the bootstraps
    largevar_exercises.m - runs the larger system forecasting exercises
    org_bivar.m - analyzes/organizes the bivariate results
    org_trivar.m - analyzes/organizes the trivariate results
    org_trivar_bs.m - analyzes/organizes the bootstrapped trivariate results
    org_largevar.m - analyzes/organizes the larger system results
    figure1.m - creates Figure 1 from the paper after performing the 
        relevant exercises
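
As a quick pre-flight check of the layout described above, something like
the following could be run from the top-level directory. It is only a
sketch and not part of the replication package; the variable names
(subfolders, scripts, missingFolders, missingScripts) are our own.

```matlab
% Pre-flight check (illustrative): confirm the expected subfolders exist
% and the main scripts have been copied to the current directory.
subfolders = {'Data', 'FcnsFcast', 'FcnsLeSage', 'FcnsOrg', ...
              'FinalOutput', 'RawOutput'};
scripts = {'bivar_exercises.m', 'trivar_exercises.m', ...
           'largevar_exercises.m', 'org_bivar.m', 'org_trivar.m', ...
           'org_trivar_bs.m', 'org_largevar.m', 'figure1.m'};

missingFolders = {};
for k = 1:numel(subfolders)
    if exist(fullfile(pwd, subfolders{k}), 'dir') ~= 7   % 7 means folder
        missingFolders{end+1} = subfolders{k};
    end
end

missingScripts = {};
for k = 1:numel(scripts)
    if exist(fullfile(pwd, scripts{k}), 'file') ~= 2     % 2 means file
        missingScripts{end+1} = scripts{k};
    end
end

fprintf('Missing subfolders: %d, missing main scripts: %d\n', ...
        numel(missingFolders), numel(missingScripts));
```

If either count is nonzero, the cell arrays list what still needs to be
unzipped or copied before the main scripts are run.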


DATA FILES

Below is a description of the data files used. All data files are stored in
the subfolder 'Data'.

2017-06.csv 
This is the main data file. It is the June 2017 vintage of FRED-MD. For 
more details, see 
https://research.stlouisfed.org/econ/mccracken/fred-databases/.

groups.xlsx (and group.csv)
This file identifies the group to which each series in 2017-06.csv belongs.
Each of the 128 series in 2017-06.csv is identified as belonging to one of
five groups used in Marcellino, Stock, and Watson (2006):
    1) income, output, sales, and capacity utilization
    2) employment and unemployment
    3) construction, inventories, and orders
    4) interest rates and asset prices
    5) nominal prices, wages, and money.


MAIN MATLAB SCRIPTS

All main MATLAB scripts are saved in the Matlab-scripts directory, but
they must be copied to a directory one level higher and run from
there. Instructions on how to run them in order to obtain the desired
results are given in the HOW TO RUN section of this readme file.

bivar_exercises.m 
This script runs the bivariate pseudo-out-of-sample forecasting exercises 
and saves the results as a .mat file in the subfolder 'RawOutput'. It 
relies on functions in the folder 'FcnsFcast'.

trivar_exercises.m
This script runs the trivariate pseudo-out-of-sample forecasting exercises 
and saves the results as a .mat file in the subfolder 'RawOutput'. It is 
also responsible for performing the forecasting exercises on bootstrapped 
samples. It relies on functions in the folder 'FcnsFcast'.

largevar_exercises.m
This script runs the pseudo-out-of-sample forecasting exercises for the
larger systems and saves the results as a .mat file in the subfolder 
'RawOutput'. It relies on functions in the folder 'FcnsFcast'.

org_bivar.m
This script analyzes the forecasts from the bivariate exercises and
organizes the results into Tables 1, 3, and 6 from the paper. It writes 
these tables to Tables.xlsx in the subfolder 'FinalOutput'. The script 
relies on functions in the folders 'FcnsOrg' and 'FcnsLeSage'.

org_trivar.m
This script analyzes the forecasts from the trivariate exercises and 
organizes the results into Tables 4, 5, and 7 from the paper. It writes 
these tables to Tables.xlsx in the subfolder 'FinalOutput'. The script 
relies on functions in the folders 'FcnsOrg' and 'FcnsLeSage'.

org_trivar_bs.m
This script analyzes the forecasts from the bootstrapped trivariate 
exercises and organizes the results into Table 9 from the paper. It writes 
this table to Tables.xlsx in the subfolder 'FinalOutput'. The script 
relies on functions in the folders 'FcnsOrg' and 'FcnsLeSage'.

org_largevar.m
This script analyzes the forecasts from the larger system exercises and 
organizes the results into Table 8 from the paper. This table is saved to 
Tables.xlsx in the subfolder 'FinalOutput'. The script relies on functions 
in the folders 'FcnsOrg' and 'FcnsLeSage'.

figure1.m
This script creates Figure 1 from the paper, which is saved as Figure1.eps
in the folder 'FinalOutput'. It relies on functions in the 
folders 'FcnsFcast' and 'FcnsOrg'.


MATLAB FUNCTIONS

These functions are called by the main scripts. They are saved in one of 
the three folders with names beginning with 'Fcns'.

FUNCTIONS CALLED TO RUN FORECASTING EXERCISES

These functions are saved in 'FcnsFcast'.

accumulate_h.m
This function accumulates the first or second differences of a set of
series across the given forecast horizon based on the order of integration.
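
The kind of accumulation described here can be sketched as follows. This
is a minimal illustration under our own assumptions, not the code in
accumulate_h.m: the series y, the forecast origin t, and the horizon h are
made up, and the I(2) case follows the convention in Marcellino, Stock,
and Watson (2006).

```matlab
% Accumulating differences over a horizon h from forecast origin t.
y = [1; 3; 6; 10; 15; 21; 28];   % illustrative level series
t = 3;                            % forecast origin (row index)
h = 3;                            % forecast horizon

dy  = diff(y);       % dy(k)  = y(k+1) - y(k)
d2y = diff(y, 2);    % d2y(k) = dy(k+1) - dy(k)

% I(1): summing the next h first differences gives y(t+h) - y(t).
acc1 = sum(dy(t : t+h-1));

% I(2): double-summing the next h second differences gives
% y(t+h) - y(t) - h*(y(t) - y(t-1)).
acc2 = sum(cumsum(d2y(t-1 : t+h-2)));

fprintf('I(1) accumulation: %g, I(2) accumulation: %g\n', acc1, acc2);
```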

bivar_fcast.m
This function performs the bivariate pseudo-out-of-sample forecasting 
exercises for all desired bivariate systems from a given group pairing. 

cond_ARDL_DMS.m
This function estimates a horizon-specific autoregressive distributed lag
(ARDL) model via OLS and uses it to produce a conditional point forecast.

cond_VAR_IMS.m
This function estimates a VAR via OLS and uses an iterative multistep
(IMS) approach to construct conditional point forecasts h periods ahead.
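
To illustrate the difference between the direct and iterated approaches
used by the two functions above, here is a simplified sketch. It produces
unconditional forecasts from a VAR(1) only, so it omits the conditioning
step that cond_ARDL_DMS.m and cond_VAR_IMS.m perform. The simulated data,
the coefficient matrix A, and all variable names are assumptions made for
the example.

```matlab
% Iterated vs. direct h-step forecasts (unconditional, VAR(1), OLS).
T = 60;  h = 4;
A = [0.5 0.1; 0.2 0.3];                   % assumed VAR(1) coefficients
e = 0.1 * [sin(1:T)', cos(2*(1:T))'];     % deterministic stand-in shocks
Y = zeros(T, 2);
for t = 2:T
    Y(t, :) = Y(t-1, :) * A' + e(t, :);
end

% Iterated: estimate the one-step VAR, then apply it h times.
X = [ones(T-1, 1), Y(1:T-1, :)];
B = X \ Y(2:T, :);                        % OLS, 3 x 2 coefficient matrix
f = Y(T, :);
for j = 1:h
    f = [1, f] * B;
end
fIMS = f;

% Direct: regress y(t+h) on y(t) and forecast in a single step.
Xh = [ones(T-h, 1), Y(1:T-h, :)];
Bh = Xh \ Y(1+h:T, :);
fDMS = [1, Y(T, :)] * Bh;

disp([fIMS; fDMS]);   % row 1: iterated, row 2: direct
```

The two rows generally differ in finite samples because the direct
regression is estimated separately for each horizon.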

eom.m
This function takes a set of dates and returns the date at the end of the
month of each date.
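
One way to compute end-of-month dates with base MATLAB functions is
sketched below. It assumes dates are stored as serial date numbers, which
may or may not match what eom.m expects.

```matlab
% Map each date to the last day of its month.
dates = [datenum(2017, 6, 15); datenum(2016, 2, 1); datenum(2015, 12, 31)];
[yr, mo] = datevec(dates);                  % year and month of each date
eomDates = datenum(yr, mo, eomday(yr, mo)); % eomday gives the last day number
disp(datestr(eomDates, 'yyyy-mm-dd'));
```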

find_bad.m
This function identifies problematic series from a given dataset. A
series is problematic if at least one of the following pertains to it:
   a) it is missing any values between first and last observations
   b) it starts after the maximum start date allowed
   c) it ends before the minimum end date allowed.
It writes the identifiers of these series to bad_series.txt in the 
'FinalOutput' folder.
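
The three criteria above can be sketched as follows. The data matrix X,
the NaN coding of missing values, and the use of row indices in place of
dates (maxStart, minEnd) are assumptions for illustration, not the
interface of find_bad.m.

```matlab
% Flag series that have internal gaps, start too late, or end too early.
X = [ 1   NaN  NaN  1  ;
      2   NaN  5    2  ;
      NaN 1    6    3  ;
      4   2    7    NaN;
      5   3    8    NaN];     % rows are periods, columns are series
maxStart = 2;                  % latest allowed first observation (row)
minEnd   = 4;                  % earliest allowed last observation (row)

nSeries = size(X, 2);
isBad = false(1, nSeries);
for k = 1:nSeries
    obs = find(~isnan(X(:, k)));
    if isempty(obs)
        isBad(k) = true;       % no data at all
        continue
    end
    first = obs(1);
    last  = obs(end);
    hasGap = any(isnan(X(first:last, k)));                   % criterion a
    isBad(k) = hasGap || first > maxStart || last < minEnd;  % a, b, or c
end
disp(isBad);
```

In this example series 1 has an internal gap, series 2 starts too late,
and series 4 ends too early, so only series 3 is retained.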

largevar_fcast.m
This function performs pseudo-out-of-sample forecasting exercises for a
given system. 

RMBbootstrap.m
This function performs a residual-based moving block bootstrap as outlined 
in Section 5 of Bruggemann, Jentsch, and Trenkler (2014).
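
The general recipe of a residual-based moving block bootstrap can be
sketched as below: draw overlapping blocks of residuals with replacement,
center them by position within the block, and rebuild the series
recursively. This is our reading of the procedure, not the code in
RMBbootstrap.m. The residuals u, the block length ell, and the VAR(1)
matrix A are stand-ins chosen for the example.

```matlab
% Residual-based moving block bootstrap (illustrative sketch).
T = 40;  ell = 5;                          % residual sample size, block length
u = 0.1 * [sin(1:T)', cos(2*(1:T))'];      % stand-in for VAR residuals (T x 2)

nBlocks = ceil(T / ell);
starts = randi(T - ell + 1, nBlocks, 1);   % block start points, with replacement
idx = bsxfun(@plus, starts, 0:ell-1)';     % ell x nBlocks matrix of row indices
idx = idx(:);                              % lay the blocks end to end
idx = idx(1:T);                            % discard the excess beyond T
uStar = u(idx, :);

% Position-specific centering: subtract the mean of the residuals that can
% occupy position s within a block.
for s = 1:ell
    posMean = mean(u(s : s + T - ell, :), 1);
    rows = s:ell:T;
    uStar(rows, :) = bsxfun(@minus, uStar(rows, :), posMean);
end

% Rebuild a bootstrap sample recursively from an assumed VAR(1).
A = [0.5 0.1; 0.2 0.3];
Ystar = zeros(T + 1, 2);
for t = 1:T
    Ystar(t + 1, :) = Ystar(t, :) * A' + uStar(t, :);
end
```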

transform.m
This function performs stationarity-inducing transformations based on the
given transformation number. An aggregation of first or second differences
is also returned based on the order of integration and given horizon.
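
For orientation, the sketch below applies the standard FRED-MD
transformation codes to a single series. We assume, without having
confirmed it against transform.m, that the "transformation number" follows
this FRED-MD convention; the series x and the variable names are
illustrative.

```matlab
% FRED-MD transformation codes applied to one series (illustrative).
x = [100; 102; 105; 103; 108];
tcode = 5;                         % 5 = first difference of logs

switch tcode
    case 1
        z = x;                                   % no transformation
    case 2
        z = [NaN; diff(x)];                      % first difference
    case 3
        z = [NaN; NaN; diff(x, 2)];              % second difference
    case 4
        z = log(x);                              % log level
    case 5
        z = [NaN; diff(log(x))];                 % log first difference
    case 6
        z = [NaN; NaN; diff(log(x), 2)];         % log second difference
    case 7
        g = x(2:end) ./ x(1:end-1) - 1;          % growth rate
        z = [NaN; NaN; diff(g)];                 % change in growth rate
    otherwise
        error('Unknown transformation code %d', tcode);
end
disp(z');
```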

trivar_fcast.m
This function performs the pseudo-out-of-sample forecasting exercises for a 
given set of trivariate systems. It will bootstrap the sample first if 
desired. 

undoDiff.m
This function undoes differencing of a set of series based on order of
integration of each series. 
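
Undoing differencing amounts to cumulating the differences from known
initial values. The following sketch shows the idea for one series
differenced once and twice; it is not the code in undoDiff.m, and the
variable names are our own.

```matlab
% Recover levels from first and second differences.
y = [2; 5; 4; 8; 13];              % illustrative level series

% Order of integration 1: need the first level.
dy = diff(y);
yFromD1 = [y(1); y(1) + cumsum(dy)];

% Order of integration 2: need the first level and the first difference.
d2y = diff(y, 2);
dyRebuilt = [dy(1); dy(1) + cumsum(d2y)];
yFromD2 = [y(1); y(1) + cumsum(dyRebuilt)];

disp([y, yFromD1, yFromD2]);       % all three columns agree
```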


FUNCTIONS CALLED TO ANALYZE RESULTS AND ORGANIZE OUTPUT

These functions are saved in 'FcnsOrg'.

adjust_sample_split.m
This function adjusts the sample split by removing undesired out-of-sample 
periods.

efficiency_test.m
This function calculates and returns the t-statistics associated with bias
and efficiency tests.

eom.m
This function takes a set of dates and returns the date at the end of the
month of each date.

get_stats.m
This function computes various statistics from the MSEs of two models.
These statistics are as follows:
   1) Ratios of MSEs from model 2 over MSEs from model 1 
   2) Ratios of MSEs from model 1 over MSEs from model 1 with first lag
      structure 
   3) Ratios of MSEs from model 2 over MSEs from model 1 with first lag
      structure 
   4) t-statistics for testing the null hypothesis that MSE1 - MSE2 = 0,
      where MSE1 is the MSE from model 1 and MSE2 is the MSE from model 2,
      using a Newey-West HAC estimator with lag length ceil(1.5*h) 
It also creates dummies identifying which t-statistics are significant in
favor of each model.
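
A relative MSE and the corresponding t-statistic of the kind listed above
can be sketched as follows. The forecast errors e1 and e2 are made up, and
the Newey-West variance is coded directly with Bartlett weights rather
than through nwest.m, so the scaling may differ slightly from what
get_stats.m produces.

```matlab
% MSE ratio and a Newey-West t-statistic for equal MSE (illustrative).
h = 4;                                 % forecast horizon
P = 50;                                % number of out-of-sample forecasts
e1 = 0.5 * sin((1:P)');                % stand-in forecast errors, model 1
e2 = 0.4 * cos(0.7 * (1:P)');          % stand-in forecast errors, model 2

mse1 = mean(e1.^2);
mse2 = mean(e2.^2);
ratio = mse2 / mse1;                   % MSE of model 2 over MSE of model 1

d = e1.^2 - e2.^2;                     % loss differential; mean is MSE1 - MSE2
L = ceil(1.5 * h);                     % lag length
dc = d - mean(d);
lrv = (dc' * dc) / P;                  % lag-0 autocovariance
for j = 1:L
    gam = (dc(1+j:P)' * dc(1:P-j)) / P;        % lag-j autocovariance
    lrv = lrv + 2 * (1 - j / (L + 1)) * gam;   % Bartlett weight
end
tstat = mean(d) / sqrt(lrv / P);

fprintf('MSE ratio = %.3f, t-statistic = %.3f\n', ratio, tstat);
```

With this sign convention, a positive t-statistic points toward model 2
having the lower MSE.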

getCritVal.m
This function obtains the critical values for a two-sided test at a given
significance level based on the distribution of a given set of test 
statistics obtained via bootstrapping.
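
One common way to obtain two-sided critical values from a bootstrap
distribution is to take equal-tailed percentiles, as sketched below.
Whether getCritVal.m uses equal-tailed percentiles or another rule is an
assumption here; the bootstrap statistics tBoot are a stand-in.

```matlab
% Equal-tailed bootstrap critical values at significance level alpha.
tBoot = sort(((1:999)' - 500) / 100);  % stand-in bootstrap t-statistics
alpha = 0.10;
B = numel(tBoot);

lowerCV = tBoot(max(1, floor(alpha / 2 * B)));
upperCV = tBoot(min(B, ceil((1 - alpha / 2) * B)));

fprintf('Reject if t < %.2f or t > %.2f\n', lowerCV, upperCV);
```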

getT5_bivar.m
This function organizes relative MSEs from the bivariate exercises into
tables of the form of Table 5 from Marcellino, Stock, and Watson (2006).

getT5_largevar.m
This function organizes relative MSEs from the larger system exercises
into tables of the form of Table 5 from Marcellino, Stock, and Watson
(2006), but just with the mean and median relative MSEs, not the other
percentiles.

getT5_trivar.m
This function organizes relative MSEs from the trivariate exercises into
tables of the form of Table 5 from Marcellino, Stock, and Watson (2006).

getT5_tstat_bivar.m
This function organizes t-statistics from the bivariate exercises into
tables of the form of Table 5 from Marcellino, Stock, and Watson (2006),
but with additional rows including the number of models where the null is 
rejected in favor of each method and the total number of models 
considered.

getT5_tstat_trivar.m
This function organizes t-statistics from the trivariate exercises into 
tables of the form of Table 5 from Marcellino, Stock, and Watson (2006), 
but with additional rows including the number of models where the null is 
rejected in favor of each method and the total number of models 
considered.

getT6_trivar.m
This function organizes relative MSEs from the trivariate exercises into
tables of the form of Table 6 from Marcellino, Stock, and Watson (2006).

MSEtest.m
This function returns the t-statistics associated with the MSE-test.


FUNCTIONS FROM THE LESAGE TOOLBOX

These functions are from the Econometrics Toolbox by James P. LeSage
(see https://www.spatial-econometrics.com/). They are called to analyze the
data, and they are saved in 'FcnsLeSage'.

nwest.m
This function computes a least-squares regression with Newey-West
heteroskedasticity- and autocorrelation-consistent standard errors.


HOW TO RUN

Below are instructions on how to run the main programs to generate the 
desired results.

----------

How to Generate Tables 1, 3, and 6

1) Run bivar_exercises.m with the variable great_mod set to 0. 
2) Run bivar_exercises.m with the variable great_mod set to 1.
3) Run org_bivar.m.

----------

How to Generate Tables 4, 5, and 7

1) Run trivar_exercises.m with the variable run_bootstrap set to 0 and the 
   variable great_mod set to 0.
2) Run trivar_exercises.m with the variable run_bootstrap set to 0 and the 
   variable great_mod set to 1.
3) Run org_trivar.m.

----------

How to Generate Table 8

1) Run largevar_exercises.m with the variable great_mod set to 0.
2) Run largevar_exercises.m with the variable great_mod set to 1.
3) Run org_largevar.m.

----------

How to Generate Table 9

Note that the steps below outline the process for breaking the bootstrapped
samples into six blocks and generating the results for each block one at a 
time. The number six was chosen to match the computing resources available
to us. Changing the number of blocks (determined by variable ncomp) should 
not alter the end results. 

Also note that we ran these exercises in parallel using MATLAB's parfor-loop.
Each iteration in a parfor-loop has a unique, independent set of random
numbers, and subsequent runs of the parfor-loop generate different numbers.
Hence, the bootstrapped results we obtained will not be directly
reproducible.

1) Run trivar_exercises.m with the following variable specifications:
      a) run_bootstrap=1
      b) great_mod=0
      c) msw_sample=1
      d) ncomp=6
      e) comp=1.

2) Repeat Step 1 five more times, keeping the same specification except for
   comp, which should be incremented by one each time. (I.e., run Step 1 for 
   comp=2,...,6.)

3) Run trivar_exercises.m with the following variable specifications:
      a) run_bootstrap=1
      b) great_mod=1
      c) msw_sample=0
      d) ncomp=6
      e) comp=1.

4) Repeat Step 3 five more times, keeping the same specification except for
   comp, which should be incremented by one each time. (I.e., run Step 3 for 
   comp=2,...,6.)

5) Run org_trivar_bs.m.
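
Steps 1 through 4 amount to twelve runs of trivar_exercises.m. The sketch
below only enumerates those settings as a checklist; it does not call the
script, since how the variables are set inside trivar_exercises.m is not
described here. The matrix runs and the helper vectors are our own names.

```matlab
% Enumerate the twelve bootstrap runs needed before org_trivar_bs.m.
ncomp = 6;
great_mod_vals  = [0 1];     % Steps 1-2, then Steps 3-4
msw_sample_vals = [1 0];     % paired with great_mod as in the steps above

runs = zeros(2 * ncomp, 3);  % columns: great_mod, msw_sample, comp
r = 0;
for s = 1:2
    for comp = 1:ncomp
        r = r + 1;
        runs(r, :) = [great_mod_vals(s), msw_sample_vals(s), comp];
        fprintf(['run %2d: run_bootstrap=1, great_mod=%d, ', ...
                 'msw_sample=%d, ncomp=%d, comp=%d\n'], ...
                r, great_mod_vals(s), msw_sample_vals(s), ncomp, comp);
    end
end
```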

----------

How to Generate Figure 1

1) Run figure1.m.


FINAL OUTPUT

Final output files are saved in the folder 'FinalOutput'.

bad_series.txt
A text file containing the series from FRED-MD removed from the bivariate 
exercises along with the reasons for removal.

Figure1.eps
Figure 1 in .eps format. Produced by figure1.m.

Tables.xlsx
All tables are written to this Excel file.