Gerald Carlino and Thorsten Drautzburg, "The Role of Startups for Local Labor Markets", Journal of Applied Econometrics, Vol. 35, No. 6, 2020, pp. 751-775.

All data and program files are stored in cd-files.zip. Since there are both binary files and ASCII files, Unix/Linux users should *not* use "unzip -a".

Raw data description:

The data are contained in VAR_DATA.csv and in VAR_DATA.mat. The dataset covers 38 years and 354 MSAs. In VAR_DATA.mat, each data series is a 38 x 354 matrix. In VAR_DATA.csv, each series is a (38*354) x 1 vector.

The series are:
- Year (vYear in .mat): Year.
- MSA_FIPS (vMSA in .mat): MSA FIPS code.
- dlog_pop: Population growth: We compute the log growth rate of Census population. Log growth rates are additive over time, so they can be summed to compute level changes, from which we can back out the change in the employment level.
- dlog_wage: Wage growth: We compute the log growth rate of the average wage rate in the County Business Patterns.
- v_migrant_rate_exm: Net migration rate: We define the net migration rate as the difference between inflows and outflows of IRS exemptions, divided by the population level in the prior period.
- vfirm_entry_rate: Firm entry rate: We define the firm entry rate as the change in the number of firms aged 0, divided by the average of the number of firms of any age in the current and prior year.
- vfirm_exit_rate: Firm exit rate: We define the firm exit rate as the change in the number of firms aged 1 that exit, divided by the average of the number of firms of any age in the current and prior year.
- vfirm_exit_rate_all: Overall firm exit rate: We define the overall firm exit rate as the change in the number of firms of any age that exit, divided by the average of the number of firms of any age in the current and prior year.
- vjob_creation_rate_births: Job creation rate: We define the job creation rate as the change in job creation by firms aged 0, divided by the average of overall private employment in the current and prior year.
- vlog_age0_sz: Startup average size: We compute the average size of a startup as the log of the ratio of startup employment in an MSA to the number of startups in that MSA.
- vlog_emp_pop: Employment-to-population ratio: We use overall employment from the County Business Patterns. This measure agrees closely with BDS employment. We use Census population to compute the employment-to-population ratio. It enters the analysis in logs.
- vpop: Employment-population ratio (log)
- vZit_Bartik: Overall labor demand shock proxy, varying base year weights. Source: CBP.
- vZit_Bartik1974: Overall labor demand shock proxy, constant base year weights. Source: CBP.
- vZit_Bartik_firm: Barriers to entry shock proxy, varying base year weights. Source: CBP and BDS.
- vZit_Bartik_firm1974: Barriers to entry shock proxy, constant base year weights. Source: CBP and BDS.
- vZit_Bartik_jc: Startup productivity shock proxy, varying base year weights. Source: CBP and BDS.
- vZit_Bartik_jc1974: Startup productivity shock proxy, constant base year weights. Source: CBP and BDS.

Omitted data:

1) The house price data in the paper are provided by CoreLogic Solutions, see https://www.corelogic.com/insights-download/home-price-index.aspx.
   o The data are provided at the level of Core Based Statistical Area (CBSA)/Metro areas.
   o We follow that definition, except for the following cities, where we used the main division instead: Boston, Chicago, Dallas, Detroit, Los Angeles, Miami, NYC, Philadelphia, San Francisco, Seattle, Washington DC.
   o The data are monthly. Given that our main data are based on mid-March payroll, we use the data from March in every year.
   o We use the house prices for the tier "Single Family Combined".
2) TFP data used in the Appendix are described in Appendix E.
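For readers who work from VAR_DATA.csv outside Matlab, the following Python sketch shows one way to arrange a stacked CSV series as the years x MSAs layout used in VAR_DATA.mat, and how summing log growth rates gives a log level change. It keys on the Year and MSA_FIPS columns, so it does not rely on the stacking order of the CSV, which is not stated here. The function names (to_panel, cumulative_log_change), the sample years, MSA codes, and values are illustrative placeholders rather than part of the replication files, and the sketch assumes the CSV header row uses the series names listed above.

```python
import csv
import io
import math


def to_panel(rows, column):
    """Arrange one stacked series as a years x MSAs grid (list of lists)."""
    rows = list(rows)
    # Key on Year and MSA_FIPS rather than on row position.
    years = sorted({int(float(r["Year"])) for r in rows})
    msas = sorted({int(float(r["MSA_FIPS"])) for r in rows})
    year_pos = {y: i for i, y in enumerate(years)}
    msa_pos = {m: j for j, m in enumerate(msas)}
    grid = [[math.nan] * len(msas) for _ in years]
    for r in rows:
        i = year_pos[int(float(r["Year"]))]
        j = msa_pos[int(float(r["MSA_FIPS"]))]
        text = r[column].strip()
        grid[i][j] = float(text) if text else math.nan  # blank cell -> NaN
    return years, msas, grid


def cumulative_log_change(grid, j):
    """Sum the log growth rates of MSA column j over all years."""
    return sum(row[j] for row in grid)


# Made-up sample standing in for VAR_DATA.csv (2 years x 2 MSAs).
SAMPLE = """Year,MSA_FIPS,dlog_pop
1980,10180,0.01
1981,10180,0.02
1980,10420,0.03
1981,10420,0.00
"""

rows = csv.DictReader(io.StringIO(SAMPLE))
years, msas, grid = to_panel(rows, "dlog_pop")
total = cumulative_log_change(grid, 0)  # log population change, first MSA
level_ratio = math.exp(total)           # implied end/start population ratio
```

With the actual file, the sample string would be replaced by an open file handle on VAR_DATA.csv; given the dimensions stated above, the resulting grid should then have 38 rows and 354 columns.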
Replication files -- General notes:
- The replication files use Matlab for the main results and Stata for the comparison of historical shocks with external data.
- The underlying house price data in the article are proprietary and omitted. Since house prices were included only in the "periphery", the main results here are unaffected and directly comparable to those in the article.
- The code is located in the "Code" folder. The data are located in the "Data" folder. The code saves results in the "Graphs" and "Tables" folders.
- A TeX file assembling all the results is contained in the subfolder "TeX". It may require minor adjustments in file names.

Data assembly:
- Run MakeData_JAE.m to construct the data from raw series. Alternatively, the assembled data are provided as a Matlab MAT-file (VAR_DATA.mat) and as a CSV file (VAR_DATA.csv).

Figure 1:
- Run MotivatingScatter.m

Figure 2:
- Run BartikPlots.m

Figures 3 and 4, and Tables 1, 2, and 3:
- Run SPVAR_split_jae.m with VAR_SET=13 and WHICH_IV=1.

Figure 5:
- Run SPVAR_split_jae.m with VAR_SET=130 and WHICH_IV=5.

Figures 6 and 7:
- Run SPVAR_split_jae.m with VAR_SET=13 and WHICH_IV=1 (as for Figures 3 and 4).
- Then run my_Counterfactual.m
- Last, run VC_Shocks.do in Stata.
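
The run order above can also be written down as a small lookup table, which may help when scripting or checking a full replication. The Python sketch below only transcribes the scripts and settings listed in these notes; the names RUN_PLAN, tool_for, and describe are hypothetical helpers, and the sketch makes no assumption about how VAR_SET and WHICH_IV are set inside SPVAR_split_jae.m, which these notes do not specify.

```python
# Ordered (script, settings) steps for each output, transcribed from the notes.
RUN_PLAN = {
    "Figure 1": [("MotivatingScatter.m", {})],
    "Figure 2": [("BartikPlots.m", {})],
    "Figures 3-4, Tables 1-3": [
        ("SPVAR_split_jae.m", {"VAR_SET": 13, "WHICH_IV": 1}),
    ],
    "Figure 5": [("SPVAR_split_jae.m", {"VAR_SET": 130, "WHICH_IV": 5})],
    "Figures 6-7": [
        ("SPVAR_split_jae.m", {"VAR_SET": 13, "WHICH_IV": 1}),
        ("my_Counterfactual.m", {}),
        ("VC_Shocks.do", {}),
    ],
}


def tool_for(script):
    """Pick the program by file extension: .do is Stata, otherwise Matlab."""
    return "Stata" if script.endswith(".do") else "Matlab"


def describe(target):
    """Return the numbered steps for one output as printable strings."""
    lines = []
    for step, (script, settings) in enumerate(RUN_PLAN[target], start=1):
        opts = ", ".join(f"{name}={value}" for name, value in settings.items())
        suffix = f" with {opts}" if opts else ""
        lines.append(f"{step}. [{tool_for(script)}] {script}{suffix}")
    return lines


for line in describe("Figures 6-7"):
    print(line)
```

The list order matters for Figures 6 and 7, since the counterfactual and the Stata comparison are run after the SPVAR_split_jae.m step.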