Esteban M. Aucejo and Jonathan James, "Catching Up to Girls: Understanding
the Gender Imbalance in Educational Attainment Within Race", Journal of
Applied Econometrics, Vol. 34, No. 4, 2019, pp. 502-525.

All files are in the zip file readme.aj.txt. This zip file contains both
ASCII and binary files. The ASCII files are all in DOS format.

The programs required to reproduce the analysis are contained in the
following folders within readme.aj.txt:

-- nlsy97
-- progs
-- stata

(I) Creating the NLSY 1997 Data

The folder `nlsy97' contains the main set of programs that clean the raw
NLSY 1997 data. The main program to run is a SAS program called
`create data.sas'. This program creates the underlying data for the
analysis: four .csv files and one .dta (Stata) file. A description of
these output files follows:

-- meas.csv contains the measurements used for the factor model
-- panel.csv contains the history file of education decisions
-- pop.csv contains gender, race and population weights for each
   individual in the sample
-- SumData.csv contains summary data as of age 24
-- RFdata.dta contains the variables used in the reduced form analysis

(II) Reduced form analysis in Stata

The reduced form analysis is conducted in Stata. The do file
`Tables_1_3_4.do' in the `stata' folder creates the summary statistics
and runs the reduced form analysis.

(III) Programs for the main analysis

The main analysis is done in MATLAB. There are three main steps:
(a) create MATLAB data sets from the .csv files, (b) estimate the factor
model and take factor draws, and (c) create the output tables. The
programs for all of these steps can be found in the `progs' folder.

(a) Create MATLAB data sets

To create the data necessary to run the MATLAB code, go to /progs/data
and run the files `read data.m' and `read summary data.m'.
These programs create the .mat files `Gender Gap Data.mat' and
`summary data age 24.mat'.

(b) Estimate the model and take factor draws

The program `estimate_model.m' in the folder `/progs/full model'
estimates the factor model on the main sample and on 50 bootstrap
samples. The estimates are stored in
`/progs/results/factor model estimates'.

After the model has been estimated, run the program `factor_draws.m' in
the folder `progs/full model' to create simulated factor draws for each
of the bootstrap samples. The factor draws are stored in the folder
`progs/results/simulated factors'.

NOTE: the files included here contain the simulated draws for the main
sample only. `factor_draws.m' must be run to create the simulated draws
for the bootstrap samples.

Also in this same folder, the program `reduced form.m' creates the
results for the reduced form schooling model for the 50 bootstrap
samples.

(c) Creating the output

To create the output for the tables in the paper, first run the file
`create_output.m' in the folder `progs/tables and figures'. Then, in the
folder `progs/latex output', run the file corresponding to the desired
table or figure.
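
Because each step depends on files written by the previous one, it can
help to confirm that the intermediate files exist before moving on. The
following is a minimal sketch in Python and is not part of the
replication package. The file names are taken from this readme. The
function names, the step labels, and the assumption that all outputs of
a step sit in a single folder chosen by the user are illustrative only,
since this readme does not state where the files are written.

```python
from pathlib import Path

# Output file names as listed in this readme. The folders they are written
# to are NOT stated in the readme, so the caller supplies them (assumption).
STEP_OUTPUTS = {
    "I": ["meas.csv", "panel.csv", "pop.csv", "SumData.csv", "RFdata.dta"],
    "III(a)": ["Gender Gap Data.mat", "summary data age 24.mat"],
}


def missing_outputs(folder, names):
    """Return the entries of `names` that are not regular files in `folder`."""
    folder = Path(folder)
    return [name for name in names if not (folder / name).is_file()]


def report(folders):
    """Build one status line per step.

    `folders` maps a step label from STEP_OUTPUTS to the directory that
    should hold that step's outputs; steps without an entry are skipped.
    """
    lines = []
    for step, names in STEP_OUTPUTS.items():
        if step not in folders:
            continue
        missing = missing_outputs(folders[step], names)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        lines.append(f"step {step}: {status}")
    return lines
```

For example, report({"I": "nlsy97", "III(a)": "progs/data"}) would check
both steps, where the two folder names are guesses that should be
replaced with the actual output locations on your machine.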