/*
This master program kicks off all of the other programs.

The user must specify the raw input dataset name and the paths for the do-files 
	(also the location of the Matlab m-files), the output (where results from the 
	paper will be placed), and the intermediate data files. 

If this code is being used for replication on the synthetic data, then there are 
	four implicates of the synthetic data. Because this code will ultimately be 
	run on the internal data for validation, which is only one file, the code 
	is built to run on only one file even when using the synthetic data. If an individual 
	wants to obtain accurate statistics and inference on the synthetic data, 
	they will need to run the analysis on all synthetic implicates and then 
	combine the results using the combination formulae outlined in "The 
	Creation and Use of SIPP Synthetic Beta v7.0" (2018) by Benedetto, Stanley, and Totty.

dataname is the name of the original SSB/GSF data file
datapath is the path to the original SSB/GSF data
mydatapath is a user directory for saving intermediate data files
outputpath is a user directory for saving the output from the paper
dopath is the directory where Stata do-files and Matlab m-files are located
*/
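
/*
A minimal sketch of the multi-implicate workflow described above. It is not part 
	of the original program. The file names ssb_synthetic1 to ssb_synthetic4 and 
	the per-implicate subfolders are hypothetical placeholders, and the sketch 
	assumes that dataname, mydatapath, and outputpath are set inside the loop 
	instead of in section 1 below. Combining the four sets of results with the 
	formulae from the reference above is a separate step that is not shown.

	forvalues m = 1/4 {
		global dataname   "ssb_synthetic`m'"
		global mydatapath "intermediate/implicate`m'"
		global outputpath "output/implicate`m'"
		do "${dopath}/p1_DataPrep.do"
		do "${dopath}/p2_SampleSelection.do"
		do "${dopath}/p3_Sumstats_OLS.do"
		do "${dopath}/p4_ExportMatlab.do"
		do "${dopath}/p5_FM_master.do"
		do "${dopath}/p6_heterogeneity.do"
		do "${dopath}/p7_figures.do"
	}

Keeping one intermediate folder and one output folder per implicate prevents a 
	later run from overwriting the results of an earlier one.
*/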


* clear memory first: set maxvar is only allowed when no data are in memory
clear all
set more off
set matsize 11000
set maxvar 32767

*-------------------------------------------------------------------------------*
***** 1. DECLARE DATA AND PATH SETTINGS *****
*-------------------------------------------------------------------------------*
global dataname 
global datapath ""
global mydatapath ""
global outputpath ""
global dopath ""
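
* Optional check, added as a sketch and not part of the original program: 
* stop early with a clear message if any of the five globals declared above is 
* still empty. The message text and the return code 198 are illustrative choices.
foreach g in dataname datapath mydatapath outputpath dopath {
	if `"${`g'}"' == "" {
		display as error "global `g' is not set; fill it in above before running"
		exit 198
	}
}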



*-------------------------------------------------------------------------------*
***** 2. RUN FILES FOR DATA PREP, SAMPLE SELECTION, AND ANALYSIS *****
*-------------------------------------------------------------------------------*
* NOTE: steps p1-p4 are commented out below; remove the leading * from each do line to run them.
* Paths are quoted so that directories containing spaces are handled correctly.

* select variables of interest, reshape data files
*do "${dopath}/p1_DataPrep.do"

* prepare sample
*do "${dopath}/p2_SampleSelection.do"

* estimate summary stats and OLS results (Table 1, Table 3, Table D1, and Table D4)
*do "${dopath}/p3_Sumstats_OLS.do"

* export variables to .xls, to load into Matlab for factor model estimation
*do "${dopath}/p4_ExportMatlab.do"

* run the Matlab factor model estimation programs (Table 4, Table 5, Table D2, Table D3, and CD test statistics)
do "${dopath}/p5_FM_master.do"

* calculate the mean, sd, and percentiles of the individual returns (Table 6 and Table 7)
do "${dopath}/p6_heterogeneity.do"

* construct the bias decomposition graph and kernel density of returns graph (Figure 1 and Figure 2)
do "${dopath}/p7_figures.do"



