Michael W. McCracken and Joseph T. McGillicuddy, "An Empirical Investigation of Direct and Iterated Multistep Conditional Forecasts", Journal of Applied Econometrics, Vol. 34, No. 2, 2019, pp. 181-204.

OVERVIEW

This readme file details the data and MATLAB program files used to produce the empirical results in the paper's Table 1, Tables 3-9, and Figure 1. All files are zipped in the file mm-files.zip, which should be unzipped in the directory where the programs are to be run. The main programs should be run in a directory one level higher than the others. (The data file is in a subfolder within this directory, which is what the provided programs require.) Instructions on how to run the main MATLAB scripts to produce the desired results and a description of said results are included at the end of this file.

CONTACT INFORMATION

michael.w.mccracken [AT] stls.frb.org
(314)444-8594

FOLDER MAP

Subfolders:
Data - contains data files used
FcnsFcast - contains MATLAB functions used for the forecasting exercises
FcnsLeSage - contains MATLAB functions from the LeSage Toolbox used to analyze output
FcnsOrg - contains MATLAB functions used to organize the output
FinalOutput - where final output files are written
RawOutput - where .mat files from the forecasting exercises are written
Matlab-scripts - contains the main MATLAB scripts

Main MATLAB scripts:
bivar_exercises.m - runs the bivariate forecasting exercises
trivar_exercises.m - runs the trivariate forecasting exercises, including the bootstraps
largevar_exercises.m - runs the larger system forecasting exercises
org_bivar.m - analyzes/organizes the bivariate results
org_trivar.m - analyzes/organizes the trivariate results
org_trivar_bs.m - analyzes/organizes the bootstrapped trivariate results
org_largevar.m - analyzes/organizes the larger system results
figure1.m - creates Figure 1 from the paper after performing the relevant exercises

DATA FILES

Below is a description of the data files used.
All data files are stored in the subfolder 'Data'.

2017-06.csv
This is the main data file. It is the June 2017 vintage of FRED-MD. For more details, see https://research.stlouisfed.org/econ/mccracken/fred-databases/.

groups.xlsx (and group.csv)
This file identifies the group to which each series in 2017-06.csv belongs. Each of the 128 series in 2017-06.csv is identified as belonging to one of five groups used in Marcellino, Stock, and Watson (2006):
1) income, output, sales, and capacity utilization
2) employment and unemployment
3) construction, inventories, and orders
4) interest rates and asset prices
5) nominal prices, wages, and money

MAIN MATLAB SCRIPTS

All main MATLAB scripts are saved in the Matlab-scripts directory, but they must be copied to a directory one level higher and run from there. Instructions on how to run them in order to obtain the desired results are given in a later section of this readme file.

bivar_exercises.m
This script runs the bivariate pseudo-out-of-sample forecasting exercises and saves the results as a .mat file in the subfolder 'RawOutput'. It relies on functions in the folder 'FcnsFcast'.

trivar_exercises.m
This script runs the trivariate pseudo-out-of-sample forecasting exercises and saves the results as a .mat file in the subfolder 'RawOutput'. It is also responsible for performing the forecasting exercises on bootstrapped samples. It relies on functions in the folder 'FcnsFcast'.

largevar_exercises.m
This script runs the pseudo-out-of-sample forecasting exercises for the larger systems and saves the results as a .mat file in the subfolder 'RawOutput'. It relies on functions in the folder 'FcnsFcast'.

org_bivar.m
This script analyzes the forecasts from the bivariate exercises and organizes the results into Tables 1, 3, and 6 from the paper. It writes these tables to Tables.xlsx in the subfolder 'FinalOutput'. The script relies on functions in the folders 'FcnsOrg' and 'FcnsLeSage'.

org_trivar.m
This script analyzes the forecasts from the trivariate exercises and organizes the results into Tables 4, 5, and 7 from the paper. It writes these tables to Tables.xlsx in the subfolder 'FinalOutput'. The script relies on functions in the folders 'FcnsOrg' and 'FcnsLeSage'.

org_trivar_bs.m
This script analyzes the forecasts from the bootstrapped trivariate exercises and organizes the results into Table 9 from the paper. It writes this table to Tables.xlsx in the subfolder 'FinalOutput'. The script relies on functions in the folders 'FcnsOrg' and 'FcnsLeSage'.

org_largevar.m
This script analyzes the forecasts from the larger system exercises and organizes the results into Table 8 from the paper. This table is saved to Tables.xlsx in the subfolder 'FinalOutput'. The script relies on functions in the folders 'FcnsOrg' and 'FcnsLeSage'.

figure1.m
This script creates Figure 1 from the paper, which is saved as Figure1.eps in the folder 'FinalOutput'. It relies on functions in the folders 'FcnsFcast' and 'FcnsOrg'.

MATLAB FUNCTIONS

These functions are called by the main scripts. They are saved in one of the three folders with names beginning with 'Fcns'.

FUNCTIONS CALLED TO RUN FORECASTING EXERCISES

These functions are saved in 'FcnsFcast'.

accumulate_h.m
This function accumulates the first or second differences of a set of series across the given forecast horizon based on the order of integration.

bivar_fcast.m
This function performs the bivariate pseudo-out-of-sample forecasting exercises for all desired bivariate systems from a given group pairing.

cond_ARDL_DMS.m
This function estimates a horizon-specific autoregressive distributed lag (ARDL) model via OLS and uses it to produce a conditional point forecast.

cond_VAR_IMS.m
This function estimates a VAR via OLS and uses an iterative multistep (IMS) approach to construct conditional point forecasts h periods ahead.

eom.m
This function takes a set of dates and returns the date at the end of the month of each date.

find_bad.m
This function identifies problematic series from a given dataset. A series is problematic if at least one of the following pertains to it:
a) it is missing any values between first and last observations
b) it starts after the maximum start date allowed
c) it ends before the minimum end date allowed
It writes the identifiers of these series to bad_series.txt in the 'FinalOutput' folder.

largevar_fcast.m
This function performs pseudo-out-of-sample forecasting exercises for a given system.

RMBbootstrap.m
This function performs a residual-based moving block bootstrap as outlined in Section 5 of Bruggemann, Jentsch, and Trenkler (2014).

transform.m
This function performs stationarity-inducing transformations based on the given transformation number. An aggregation of first or second differences is also returned based on the order of integration and given horizon.

trivar_fcast.m
This function performs the pseudo-out-of-sample forecasting exercises for a given set of trivariate systems. It will bootstrap the sample first if desired.

undoDiff.m
This function undoes differencing of a set of series based on the order of integration of each series.

FUNCTIONS CALLED TO ANALYZE RESULTS AND ORGANIZE OUTPUT

These functions are saved in 'FcnsOrg'.

adjust_sample_split.m
This function adjusts the sample split by removing undesired out-of-sample periods.

efficiency_test.m
This function calculates and returns the t-statistics associated with bias and efficiency tests.

eom.m
This function takes a set of dates and returns the date at the end of the month of each date.

get_stats.m
This function computes various statistics from the MSEs of two models. These statistics are as follows.
1) Ratios of MSEs from model 2 over MSEs from model 1
2) Ratios of MSEs from model 1 over MSEs from model 1 with first lag structure
3) Ratios of MSEs from model 2 over MSEs from model 1 with first lag structure
4) t-statistics from testing the null that MSE1-MSE2 equals 0, where MSE1 is the MSE from model 1 and MSE2 is the MSE from model 2, using a Newey-West HAC estimator with a lag length set to ceil(1.5*h)
It also creates dummies identifying which t-statistics are significant in favor of each model.

getCritVal.m
This function obtains the critical values for a two-sided test at a given significance level based on the distribution of a given set of test statistics obtained via bootstrapping.

getT5_bivar.m
This function organizes relative MSEs from the bivariate exercises into tables of the form of Table 5 from Marcellino, Stock, and Watson (2006).

getT5_largevar.m
This function organizes relative MSEs from the larger system exercises into tables of the form of Table 5 from Marcellino, Stock, and Watson (2006), but with only the mean and median relative MSEs, not the other percentiles.

getT5_trivar.m
This function organizes relative MSEs from the trivariate exercises into tables of the form of Table 5 from Marcellino, Stock, and Watson (2006).

getT5_tstat_bivar.m
This function organizes t-statistics from the bivariate exercises into tables of the form of Table 5 from Marcellino, Stock, and Watson (2006), but with additional rows including the number of models where the null is rejected in favor of each method and the total number of models considered.

getT5_tstat_trivar.m
This function organizes t-statistics from the trivariate exercises into tables of the form of Table 5 from Marcellino, Stock, and Watson (2006), but with additional rows including the number of models where the null is rejected in favor of each method and the total number of models considered.

getT6_trivar.m
This function organizes relative MSEs from the trivariate exercises into tables of the form of Table 6 from Marcellino, Stock, and Watson (2006).

MSEtest.m
This function returns the t-statistics associated with the MSE-test.

FUNCTIONS FROM THE LESAGE TOOLBOX

These functions are from the Econometrics Toolbox by James P. LeSage (see https://www.spatial-econometrics.com/). They are called to analyze the data, and they are saved in 'FcnsLeSage'.

nwest.m
This function computes a Newey-West adjusted heteroscedastic-serial consistent Least-squares Regression.

HOW TO RUN

Below are instructions on how to run the main programs to generate the desired results.

----------
How to Generate Tables 1, 3, and 6

1) Run bivar_exercises.m with the variable great_mod set to 0.
2) Run bivar_exercises.m with the variable great_mod set to 1.
3) Run org_bivar.m.

----------
How to Generate Tables 4, 5, and 7

1) Run trivar_exercises.m with the variable run_bootstrap set to 0 and the variable great_mod set to 0.
2) Run trivar_exercises.m with the variable run_bootstrap set to 0 and the variable great_mod set to 1.
3) Run org_trivar.m.

----------
How to Generate Table 8

1) Run largevar_exercises.m with the variable great_mod set to 0.
2) Run largevar_exercises.m with the variable great_mod set to 1.
3) Run org_largevar.m.

----------
How to Generate Table 9

Note that the steps below outline the process for breaking the bootstrapped samples into six blocks and generating the results for each block one at a time. The number six was chosen to match the computing resources available to us. Changing the number of blocks (determined by variable ncomp) should not alter the end results.

Also note that we ran these exercises in parallel using MATLAB's parfor-loop. Each iteration in a parfor-loop has a unique, independent set of random numbers, and subsequent runs of the parfor-loop generate different numbers. Hence, the bootstrapped results we obtained will not be directly reproducible.
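
To illustrate the block scheme described above, the following MATLAB sketch shows one way a set of bootstrap samples could be divided into ncomp blocks, with comp selecting a single block. The total number of samples (nboot) and the splitting rule are assumptions made for illustration only; they are not taken from trivar_exercises.m, which may divide the samples differently.

```matlab
% Hypothetical illustration; nboot and the splitting rule are assumptions,
% not values or logic taken from trivar_exercises.m.
nboot = 600;   % assumed total number of bootstrap samples
ncomp = 6;     % number of blocks, as in the steps below
covered = [];
for comp = 1:ncomp
    % First and last sample index assigned to block number comp.
    first = floor((comp - 1) * nboot / ncomp) + 1;
    last  = floor(comp * nboot / ncomp);
    fprintf('comp = %d handles samples %d to %d\n', comp, first, last);
    covered = [covered, first:last]; %#ok<AGROW>
end
% Every sample index appears exactly once across the blocks.
assert(isequal(covered, 1:nboot));
```

Under a split of this kind, each sample index is handled exactly once whatever value ncomp takes, so the number of blocks affects only how the work is scheduled.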

1) Run trivar_exercises.m with the following variable specifications:
   a) run_bootstrap=1
   b) great_mod=0
   c) msw_sample=1
   d) ncomp=6
   e) comp=1
2) Repeat Step 1 five more times, keeping the same specification except for comp, which should be incremented by one each time. (I.e., run Step 1 for comp=2,...,6.)
3) Run trivar_exercises.m with the following variable specifications:
   a) run_bootstrap=1
   b) great_mod=1
   c) msw_sample=0
   d) ncomp=6
   e) comp=1
4) Repeat Step 3 five more times, keeping the same specification except for comp, which should be incremented by one each time. (I.e., run Step 3 for comp=2,...,6.)
5) Run org_trivar_bs.m.

----------
How to Generate Figure 1

1) Run figure1.m.

FINAL OUTPUT

Final output files are saved in the folder 'FinalOutput'.

bad_series.txt
A text file containing the series from FRED-MD removed from the bivariate exercises along with the reasons for removal.

Figure1.eps
Figure 1 in .eps format. Produced by figure1.m.

Tables.xlsx
All tables are written to this Excel file.
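
As a convenience, the short MATLAB sketch below checks whether the three final output files listed above are present after the scripts have been run. It is not part of the replication package. It assumes the current directory is the one from which the main scripts are run, so that 'FinalOutput' is a subfolder of it.

```matlab
% Hypothetical post-run check; not part of the replication package.
% Assumes the current directory contains the 'FinalOutput' subfolder.
outDir   = 'FinalOutput';
expected = {'bad_series.txt', 'Figure1.eps', 'Tables.xlsx'};
present  = false(size(expected));
for k = 1:numel(expected)
    % exist(...,'file') returns 2 when the name refers to a file.
    present(k) = exist(fullfile(outDir, expected{k}), 'file') == 2;
end
for k = 1:numel(expected)
    if present(k)
        status = 'found';
    else
        status = 'MISSING';
    end
    fprintf('%-16s %s\n', expected{k}, status);
end
```

A file reported as MISSING suggests that the script responsible for it has not yet been run; for example, Figure1.eps is produced by figure1.m.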