# Readme for Andrew C. Chang & Trace J. Levinson "Raiders of the Lost High-Frequency Forecasts: New Data and Evidence on the Efficiency of the Fed's Forecasting", Journal of Applied Econometrics, forthcoming.
# Last update: August 22, 2022
# Pre-analysis plan link: 
# http://dx.doi.org/10.17605/OSF.IO/DE3PE


## Directory and File List

data_dictionary.txt - explanation of high-frequency forecast data (or see the codebook at the end of this readme.md file).

### /ado/plus

Contains packages for Stata.  The master_public.do file will reset ado paths to this folder.

Contains the packages and associated help files for: eret2, estadd, estout, estpost, eststo, esttab, ivreg2, ranktest

### /Code/public

asymmetricloss.do - reestimates PAP specifications using Quad-Quad loss

figures_forecastplots.do - graphs figures in paper

master_public.do - runs everything

regs_final_public.do - summary stats, regression estimation, outputs .tex files of tables to /results/latex/regs_final/public

regs_final_public_allobs.do - regressions keeping all revisions, including those < 1bps

regs_final_public_clusteredstderr.do - robustness checks using standard errors clustered by FOMC and Newey-West standard errors

regs_final_public_fixed_event - Nordhaus-type fixed-event autocorrelation tests

SPFandHFcompare.do - compares forecasts of Survey of Professional Forecasters and high-frequency forecasts

trendvsgrowth.do - tests for forecast smoothing

UnivariatemodelandHFcompareGDP.do and UnivariatemodelandHFcompareInflation.do - compare RMSEs of high-frequency forecasts to those of AR(1) and AR(4) models


### /Data/public

hf_final_allobservations_public.dta and .csv - complete high-frequency dataset.  Approximately 4,600 observations.

hf_final_tbonly_public.dta and .csv - Greenbook/Tealbook dataset, used for the Mincer & Zarnowitz regressions in section 3.1 that use only Greenbooks (section 2, equation 1 of the pre-analysis plan).  Approximately 700 observations.

### /Data/Real-time data

pconx_first_second_third.xlsx - Real-time core PCE inflation from the Real-Time Data Set for Macroeconomists hosted by the Federal Reserve Bank of Philadelphia.  Approximately 100 observations.

routput_first_second_third.xlsx - Real-time GDP from the Real-Time Data Set for Macroeconomists hosted by the Federal Reserve Bank of Philadelphia.  Approximately 200 observations.

### /Data/SPF

[Individual/Mean/Median]_[COREPCE/RGDP]_[Level/Growth] - Survey of Professional Forecasters data from FRB Philadelphia, representing individual-level/mean/median forecasts for inflation/GDP.  The number of observations varies by release: about 70 for mean core PCE, about 200 for mean GDP, about 2,500 for individual core PCE, and about 8,600 for individual GDP.

SPFDates.csv and .txt - SPF release dates, one per quarter of data.

### /Data - Modified/HF

Empty folder for intermediate data manipulation

### /Data - Modified/SPF

Empty folder for intermediate data manipulation


### /Figures

Empty folder for the output of figures_forecastplots.do


### /Results
#### /logs/final
Empty folder for regression log output.

#### /latex/regs_final/public 
Empty folder for latex output files from regs_final_public.do (PAP regressions), with a subfolder for summary stats

#### /latex/nonpap_regs

Five empty subfolders (allobs, asymmetricloss, clustered_std, fixed_even, and revisions) for the associated results output


## Codebook for hf_final_allobservations_public and hf_final_tbonly_public

year: year (yyyy) forecast was made.

date: date (ddmmmyyyy) forecast was made.

dateqtr: quarter of date (yyyyq#).

obsdate: quarter that is being forecasted (dateqtr + horizon), string type.

projqtr: quarter that is being forecasted (dateqtr + horizon), float type.

fore_dist: the number of calendar days between the current forecast for a particular macroeconomic variable-horizon and the previous such forecast within a given FOMC cycle.  This variable equals zero when, for a given macroeconomic variable-horizon, there was no other such forecast in that FOMC cycle (this situation implies we only had the Greenbook for that variable-horizon for that FOMC cycle).  Used in weighting regressions that use high-frequency data.
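The fore_dist definition can be illustrated with a small Python sketch. It is not the authors' code: the helper names and the `fore_dist_check` field are hypothetical, the example dates are made up, and the rows are assumed to belong to a single macroeconomic variable. The codebook does not say what fore_dist is for the first of several forecasts in a cycle, so the sketch leaves that case as `None` by assumption.

```python
from collections import defaultdict
from datetime import datetime


def parse_date(text):
    """Parse a ddmmmyyyy string such as '15mar2005' (English month abbreviations assumed)."""
    return datetime.strptime(text, "%d%b%Y").date()


def add_fore_dist(rows):
    """Return copies of rows with a hypothetical 'fore_dist_check' field.

    rows: dicts with 'date' and 'fomc' (ddmmmyyyy strings) and 'horizon',
    all for one macroeconomic variable.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["fomc"], row["horizon"])].append(dict(row))

    out = []
    for group in groups.values():
        group.sort(key=lambda r: parse_date(r["date"]))
        for i, row in enumerate(group):
            if len(group) == 1:
                # Only one forecast for this horizon in this FOMC cycle.
                row["fore_dist_check"] = 0
            elif i == 0:
                # Assumption: not defined by the codebook, left missing here.
                row["fore_dist_check"] = None
            else:
                gap = parse_date(row["date"]) - parse_date(group[i - 1]["date"])
                row["fore_dist_check"] = gap.days
            out.append(row)
    return out


example = [
    {"date": "01mar2005", "fomc": "22mar2005", "horizon": 0},
    {"date": "15mar2005", "fomc": "22mar2005", "horizon": 0},
    {"date": "15mar2005", "fomc": "22mar2005", "horizon": 1},
]
checked = add_fore_dist(example)
```

In this made-up example, the second horizon-0 forecast is 14 calendar days after the first, and the lone horizon-1 forecast gets zero.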

fomc: the date of the first day of the next regularly scheduled FOMC meeting (ddmmmyyyy).

variable: the macroeconomic variable being forecasted.

source: the document type we used to record the forecast.  We also record briefing tables and texts as pre-FOMC (the in-person forecast update just prior to a FOMC meeting) or bi-weekly (all other non-pre-FOMC briefings).

horizon: forecast horizon in quarters: -1 = 1-quarter backcast, 0 = current quarter, etc.  Forecasts are indexed by horizon based on the next regularly scheduled FOMC date, not based on calendar quarters.

rgdp_fore: real GDP growth forecasts, p.p., a.r.

pcepilfe_fore: core PCE inflation forecasts, p.p., a.r.

rgdp_3rd: BEA 3rd release of real GDP growth for the quarter being forecasted, p.p., a.r.

pcepilfe_3rd: BEA 3rd release of core PCE inflation for the quarter being forecasted, p.p., a.r.

fomcqtr: the quarter of the upcoming regularly scheduled FOMC meeting (yyyyq#).

prevfomc: date of previous regularly scheduled FOMC (ddmmmyyyy).

from_end: the number of forecasts for a particular variable-horizon-FOMC until the last forecast for that variable-horizon-FOMC (i.e., = 0 for the last forecast available before a FOMC).
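As a rough illustration of from_end, the Python sketch below counts backwards from the last forecast within each FOMC-horizon group. The function name and the `from_end_check` field are hypothetical, the dates are made up, and the rows are assumed to belong to a single macroeconomic variable with at most one forecast per date in each group.

```python
from collections import defaultdict
from datetime import datetime


def add_from_end(rows):
    """Return copies of rows with a hypothetical 'from_end_check' field.

    rows: dicts with 'date' and 'fomc' (ddmmmyyyy strings) and 'horizon',
    all for one macroeconomic variable.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["fomc"], row["horizon"])].append(dict(row))

    out = []
    for group in groups.values():
        group.sort(key=lambda r: datetime.strptime(r["date"], "%d%b%Y"))
        last = len(group) - 1
        for i, row in enumerate(group):
            # 0 for the last forecast before the FOMC, 1 for the one before it, ...
            row["from_end_check"] = last - i
            out.append(row)
    return out


cycle = [
    {"date": "15mar2005", "fomc": "22mar2005", "horizon": 0},
    {"date": "01mar2005", "fomc": "22mar2005", "horizon": 0},
    {"date": "08mar2005", "fomc": "22mar2005", "horizon": 0},
]
ranked = add_from_end(cycle)
```

The output is sorted by date within each group, so the three rows above come back with values 2, 1, and 0.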

weeks: number of weeks until the first day of the next regularly scheduled FOMC meeting, rounded down.

weeks01: indicator for whether it is less than 14 days until the first day of the next regularly scheduled FOMC meeting (i.e., I{weeks == 0 | weeks == 1}).
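The relationship between weeks and weeks01 can be checked with a short Python sketch. The function name is hypothetical and the dates are made up; the sketch assumes the forecast date is not after the first day of the FOMC meeting.

```python
from datetime import datetime


def weeks_to_fomc(date_text, fomc_text):
    """Return (weeks, weeks01) for ddmmmyyyy strings such as '15mar2005'.

    Assumes the forecast date is on or before the FOMC date.
    """
    date = datetime.strptime(date_text, "%d%b%Y").date()
    fomc = datetime.strptime(fomc_text, "%d%b%Y").date()
    days = (fomc - date).days
    weeks = days // 7                  # whole weeks, rounded down
    weeks01 = int(weeks in (0, 1))     # 1 when fewer than 14 days remain
    return weeks, weeks01


thirteen_days = weeks_to_fomc("09mar2005", "22mar2005")
fourteen_days = weeks_to_fomc("08mar2005", "22mar2005")
```

With 13 days to go the pair is (1, 1); at exactly 14 days it becomes (2, 0), which matches the "less than 14 days" wording.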

rgdp_rev: revision (first difference) to real GDP growth forecast based on previous forecast for real GDP growth at the indicated horizon within a FOMC cycle.  rgdp_rev == missing for the first forecast of real GDP growth for the indicated horizon for that FOMC cycle.  rgdp_rev includes consecutive observations where the forecast does not revise.  The regression file (regs_final_public.do) excludes consecutive observations where the revision is less than one basis point to weed out thin trading issues.

pcepilfe_rev: same as rgdp_rev for core PCE inflation.
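A minimal Python sketch of the revision variables and the one-basis-point filter follows. It is an illustration under stated assumptions, not the logic of regs_final_public.do: the function names are hypothetical, the forecast values are made up, one basis point is taken to be 0.01 percentage points, and the filter is assumed to apply to the absolute value of the revision.

```python
from datetime import datetime

ONE_BP = 0.01  # assumption: one basis point expressed in percentage points


def revisions(forecasts):
    """First differences of forecasts within one variable-horizon-FOMC cycle.

    forecasts: list of (date, value) pairs, date as a ddmmmyyyy string.
    The first forecast in the cycle has no revision (None).
    """
    ordered = sorted(forecasts, key=lambda f: datetime.strptime(f[0], "%d%b%Y"))
    return [
        None if i == 0 else ordered[i][1] - ordered[i - 1][1]
        for i in range(len(ordered))
    ]


def keep_for_regression(revs, threshold=ONE_BP):
    """Drop missing revisions and (by assumption) those with |revision| < threshold."""
    return [r for r in revs if r is not None and abs(r) >= threshold]


cycle = [("01mar2005", 2.5), ("08mar2005", 2.5), ("15mar2005", 2.7)]
revs = revisions(cycle)
kept = keep_for_regression(revs)
```

Here the first revision is missing, the second is zero (the forecast did not move), and only the third, roughly 0.2 percentage points, survives the filter.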

rgdp_err: real GDP growth forecast errors, 3rd release - forecast.

pcepilfe_err: core PCE inflation forecast errors, 3rd release - forecast.
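The sign convention for the error variables can be sketched in Python as follows. The function name is hypothetical, the numbers are made up, and returning `None` when either input is missing is an assumption rather than something the codebook states.

```python
def forecast_error(third_release, forecast):
    """Forecast error as defined in the codebook: 3rd release minus forecast.

    Returns None if either input is missing (an assumption of this sketch).
    """
    if third_release is None or forecast is None:
        return None
    return third_release - forecast


# Made-up values: a 2.5 p.p. forecast and a 3.1 p.p. third release.
err = forecast_error(3.1, 2.5)
```

Under this convention a positive error means the realized third release came in above the forecast.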

zscore: rolling 71-day average of Bloomberg forecast errors, standardized by the rolling 2-year standard deviation of Bloomberg forecast errors, NOT weighted by S&P 500 futures returns (news(tau) without r_i_tau in the summation formula).

zscorexreturn: news(tau), S&P 500 return weighted rolling standardized sum of Bloomberg forecast errors.



