Jeremiah Richey and Alicia Rosburg, "Decomposing Economic Mobility Transition Matrices", Journal of Applied Econometrics, Vol. 33, No. 1, 2018, pp. 91-108.

All files are ASCII files in DOS format and are zipped in the file rr-files.zip. Unix/Linux users should use "unzip -a". There are two directories, Code and Data.

Data

The data used in the paper combine publicly available NLSY data with restricted NLSY geocode data. The researcher must obtain access to the NLSY geocode data (via the BLS) to run the merge code. The geocode data are used to link to a state level cost of living index obtained from Richard Fording's website -- https://rcfording.wordpress.com/datasets/ -- derived via his paper with Berry in 2000 ``An annual cost of living index for the American states'' in The Journal of Politics. Using this index, our merge code (merge.decompose.matrix.data_JoAE.R) adjusts incomes according to costs of living.

In addition, the merge code links to several instruments for education that are not used in the final version of the paper; the instruments were used in an earlier (unpublished) version but dropped due to weak instrument concerns. These instruments include local wages, unemployment rates, and information on local colleges (taken from Kling's 2001 Journal of Business and Economic Statistics paper ``Interpreting Instrumental Variables Estimates of the Returns to Schooling'').

The merge code calls several data sets and produces one final data set used for our analysis. The data sets called include:

1. nlsy.79.matrix.final.csv - cleaned version of publicly available NLSY data used in the analysis.
   Variables: id (respondent ID from NLSY); parent.income (parental income); income (respondent income); exp (full time equivalent experience); grade (years of education of respondent); afqt (standardized AFQT score of respondent); age (age of respondent at time of income measure); rotter (Rotter score of respondent); esteem (esteem score of respondent); perlin (Pearlin score of respondent).
   Observations: 1345

2. nlsy.79.states_orig.csv - state of residence variables for 1979-2012 from restricted geocode data (must be obtained directly by researcher).

3. nlsy.cola.csv - cost of living index.
   Variables: fips (state level fips ID number); year (year of cost measure); cola (raw cost level); adj.cola (normalized cola).
   Observations: 251

4. nlsy.79.82.state.fips.csv - fips codes and state variables from 1979-1982 as well as at age 14 from restricted geocode data (must be obtained directly by researcher).

5. college.data.kling.clean.csv - data on local colleges.
   Variables: sfips (state level fips code); cfips (county level fips code); fips (combined state+county fips ID code); pub2 (indicator for presence of 2 year public college); pub4 (indicator for presence of 4 year public college); min.tuit.pub (minimum tuition of all public schools in county); min.tuit.pub4 (minimum tuition of 4 year public school in county).
   Observations: 3147

6. nlsy.79.birthyear.17.csv - birth year taken from NLSY and year respondent turned 17, used to link to the appropriate instruments.
   Variables: id (NLSY respondent ID number); birth_year (year of respondent's birth); year17 (year respondent turned 17).
   Observations: 12686

7. state.unemp.76.81.csv - state level unemployment rates from the U.S. Bureau of Labor Statistics.
   Variables: sfips (state level fips ID code); unemp76 through unemp81 (unemployment rate in each year from 1976 to 1981, one variable per year).
   Observations: 52

8. county.wages.74.81.add.csv - county level wages (1974-1981) from the U.S. Bureau of Economic Analysis.
   Variables: fips (county+state full fips ID code); cwage.74 through cwage.81 (county level yearly average wage in each year from 1974 to 1981, one variable per year).
   Observations: 3148

The final data set produced by the merge code and used in our analysis is ``nlsy.79.ind.merged1.csv'', with a final observation count of 1321 (24 observations are dropped because they cannot be matched to the instruments).

Code

Our results are based on a set of procedures (``procedures.decompose.matrix_JoAE.R'') and a code file that calls the procedures and produces the final results, including bootstrap standard errors (``call.decompose.matrix_JoAE.R''). The beginning of the ``call'' code sources the procedure file.

The procedure code provided here is based on the procedures code from Christoph Rothe related to his 2016 Journal of Business and Economic Statistics paper ``Decomposing the Composition Effect'' and available on his website -- www.christophrothe.net. Our decomposition is based on his decomposition method. Our code alters Rothe's to fit the transition matrix setting and also uses a different copula estimation strategy (we use a maximum pseudolikelihood approach while he uses a minimum distance approach).
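
As a rough illustration of what a maximum pseudolikelihood copula fit involves, the base R sketch below estimates the correlation parameter of a bivariate Gaussian copula from rank-based pseudo-observations. It is not taken from procedures.decompose.matrix_JoAE.R: the Gaussian family, the two-variable setting, the simulated data, and every function name are assumptions made only for this example.

```r
# Sketch only: maximum pseudolikelihood fit of a bivariate Gaussian copula.
# All names below are hypothetical and do not appear in the archive's code.

# Rank-based pseudo-observations, rescaled to lie strictly inside (0, 1).
pseudo_obs <- function(x) rank(x, ties.method = "average") / (length(x) + 1)

# Log-likelihood of the bivariate Gaussian copula at normal scores (z1, z2).
gauss_copula_loglik <- function(rho, z1, z2) {
  sum(-0.5 * log(1 - rho^2) -
        (rho^2 * (z1^2 + z2^2) - 2 * rho * z1 * z2) / (2 * (1 - rho^2)))
}

# Step 1: replace each margin by its ranks; step 2: maximize over rho.
fit_gauss_copula_mpl <- function(x, y) {
  z1 <- qnorm(pseudo_obs(x))
  z2 <- qnorm(pseudo_obs(y))
  opt <- optimize(gauss_copula_loglik, interval = c(-0.99, 0.99),
                  z1 = z1, z2 = z2, maximum = TRUE)
  opt$maximum
}

# Simulated data whose underlying normal scores have correlation 0.5.
# The monotone transforms change the margins but not the copula.
set.seed(1)
n <- 2000
common <- rnorm(n)
x <- exp(common + rnorm(n))
y <- (common + rnorm(n))^3
rho_hat <- fit_gauss_copula_mpl(x, y)
```

Because only the ranks of each variable enter the likelihood, the margins never need to be modeled parametrically. A minimum distance approach would instead choose the parameter that brings the fitted copula closest to the empirical copula under some distance measure.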
The ``call'' code calls the procedure and performs a bootstrap analysis to obtain standard errors for estimates. The procedure code produces several sets of results stored in a list, while the call code creates tables including SEs (tab1-tab8). These results are:

1. Index decomposition results (see note below)
2. Decomposition results for children from the top quartile of parental income homes
3. Decomposition results for children from the bottom quartile of parental income homes
4. Simulated empirical transition matrix
5. Aggregate decomposition counterfactual transition matrix
6. Simulated empirical and aggregate decomposition counterfactual indices (see note below)
7. Aggregate decomposition composition effect for the full matrix
8. Aggregate decomposition structure effect for the full matrix

NOTE ON INDICES: The code produces four indices - M1, M2, M3, M4. However, the M1 and M2 from the code are not the same as the M1 and M2 in the paper (due to different indices reported in earlier drafts). The correspondence is:

M1 in the code: an index related to the trace of the matrix (not reported in the paper).
M2 in the code: an index related to the second eigenvalue of the unsymmetrized matrix (not reported in the paper).
M3 in the code: the Bartholomew index, which is the M1 in the paper.
M4 in the code: the index related to the second eigenvalue from a symmetrized matrix, which is the M2 in the paper.
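
To make the index labels concrete, the base R sketch below computes one common textbook version of each of the four index types from a transition matrix. It is not the code in procedures.decompose.matrix_JoAE.R. The exact formulas and normalizations, the symmetrization (P + t(P)) / 2, the uniform origin weights, and the function name mobility_indices are all assumptions, and the paper's definitions may differ.

```r
# Sketch only: textbook mobility indices for a k x k row-stochastic transition
# matrix P (rows = origin class, columns = destination class). The labels
# M1-M4 follow the code-side naming described in the note above; the formulas
# themselves are assumed, not taken from the archive.
mobility_indices <- function(P, p = rep(1 / nrow(P), nrow(P))) {
  k <- nrow(P)
  stopifnot(ncol(P) == k, all(abs(rowSums(P) - 1) < 1e-8))

  # Second-largest eigenvalue modulus of a square matrix.
  second_modulus <- function(A) {
    sort(Mod(eigen(A, only.values = TRUE)$values), decreasing = TRUE)[2]
  }

  # |i - j|: number of classes moved between origin i and destination j.
  steps <- abs(outer(seq_len(k), seq_len(k), "-"))

  c(
    M1 = (k - sum(diag(P))) / (k - 1),        # trace-based index
    M2 = 1 - second_modulus(P),               # second eigenvalue, unsymmetrized
    M3 = sum(p * rowSums(P * steps)),         # Bartholomew (paper's M1)
    M4 = 1 - second_modulus((P + t(P)) / 2)   # second eigenvalue, symmetrized (paper's M2)
  )
}

# Two benchmark quartile matrices: nobody moves, and origin is irrelevant.
P_immobile <- diag(4)
P_mobile <- matrix(0.25, nrow = 4, ncol = 4)
idx_immobile <- mobility_indices(P_immobile)
idx_mobile <- mobility_indices(P_mobile)
```

Under these assumed definitions every index equals zero for the identity matrix (no mobility) and is larger for the matrix in which a child's quartile does not depend on the parent's quartile, so a higher value reads as more mobility for all four.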