Ricardo Mora, "A Nonparametric Decomposition of the Mexican American Average Wage Gap", Journal of Applied Econometrics, Vol. 23, No. 4, 2008, pp. 463-485.

There are three zip files. The file contains . The file contains . The file contains a large number of Stata .do files. Note that and the .do files are ASCII files in DOS format.

The data come from extracts of the Merged Outgoing Rotation Groups (MORG) file of the Current Population Survey (CPS), prepared by the NBER for the 1994-2002 period. The data used in the paper are obtained from the original morg files in STATA format by running and , two batch files for STATA included in . The result is , a dataset in STATA format.

The data used in the paper can also be read directly from , a plain (ASCII) text file included in . Each line is an observation. Variables are stored in columns in fixed format. Missing values (wages for non-participants) are represented by ".". The data set contains 75,949 observations and 27 variables.

Here is a brief description of the variables (see also the notes to Table 2 in the paper). The first variable corresponds to the first column in , and so on.

periodo   Month of interview; 1: Jan 1994.
state     State (74: Texas, 85: New Mexico, 86: Arizona, 93: California)
year      Year of interview
salarios  Log hourly wages
age       Years of age
age2      age*age
exp       Years of potential experience
exp2      exp*exp
agexed    age*educa
expxed    exp*educa
educa     Years of education
voca      Dummy for vocational studies
etnia     Ethnic status (0: non-Hispanic, 1: mexame)
veterano  Veteran status
marital   Marital status
calif     state=93
arizo     state=86
texas     state=74
newme     state=85
wagesmpl  Dummy for participation in work-for-pay market
doccphis  Percentage of Hispanic population in occupation
dindphis  Percentage of Hispanic population in industry
hisppor   Percentage of Hispanics in local area
educcon   Spouse's years of education if marital=1
agecon    Spouse's age if marital=1
agepad    Father's age
agemad    Mother's age

(The variable "Parents" can be obtained as marital=0 and (agepad>0 and/or agemad>0).)

In addition, the following STATA batch files are included in to facilitate the replication of the paper.

  gives the summary statistics from Table 2;
  obtains results from the parametric participation models and prepares the data sets for the participation models with tree structures;
  computes, for the tree participation models, the likelihoods, betas, inverse Mills ratios, the residuals, and tests for normality (the program calls , and );
  computes the marginal effects for the participation models;
  , , and prepare the wage equations data (line 70 calls an external program to transfer the dataset from STATA format into GAUSS format);
  estimates the parametric wage equations in STATA;
  takes the nonparametric wage structure and computes marginal effects;
  presents decompositions from wage equations results.

The estimation of the tree structures was performed with a GAUSS macro available from the author upon request.

Ricardo Mora
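
For readers who do not use STATA, the text-file layout and the "Parents" rule described above can be sketched in Python. This is only an illustrative sketch, not part of the replication files. It assumes that the fixed-format columns are separated by whitespace, which this readme does not state; if they are not, the fields would have to be sliced by column position instead. The helper names (parse_line, read_observations, has_parents) and the example line are made up for illustration, and the path of the text file must be supplied by the user.

```python
import math

# Variable names in column order, as listed in this readme (27 variables).
VARIABLES = [
    "periodo", "state", "year", "salarios", "age", "age2", "exp", "exp2",
    "agexed", "expxed", "educa", "voca", "etnia", "veterano", "marital",
    "calif", "arizo", "texas", "newme", "wagesmpl", "doccphis", "dindphis",
    "hisppor", "educcon", "agecon", "agepad", "agemad",
]


def parse_line(line):
    """Parse one observation into a dict of floats.

    ASSUMPTION: fields are whitespace-separated. A field equal to "."
    is a missing value and is stored as NaN.
    """
    fields = line.split()
    if len(fields) != len(VARIABLES):
        raise ValueError(
            f"expected {len(VARIABLES)} fields, found {len(fields)}"
        )
    return {
        name: (math.nan if raw == "." else float(raw))
        for name, raw in zip(VARIABLES, fields)
    }


def read_observations(path):
    """Read every non-empty line of the text file at `path` (user-supplied)."""
    with open(path, encoding="ascii") as handle:
        return [parse_line(line) for line in handle if line.strip()]


def has_parents(obs):
    """'Parents' rule from the readme: marital=0 and (agepad>0 and/or agemad>0)."""
    return obs["marital"] == 0 and (obs["agepad"] > 0 or obs["agemad"] > 0)


# Made-up observation, for illustration only (not taken from the data set):
# a non-participant (missing wage) who is not married and has a father's age recorded.
example_values = {name: "0" for name in VARIABLES}
example_values.update({
    "periodo": "1", "state": "93", "year": "1994", "salarios": ".",
    "age": "25", "age2": "625", "calif": "1", "agepad": "55",
})
example_line = " ".join(example_values[name] for name in VARIABLES)

obs = parse_line(example_line)
print(math.isnan(obs["salarios"]), has_parents(obs))
```

Missing wages are stored as NaN rather than dropped, so that the participation dummy (wagesmpl) and the wage variable (salarios) stay aligned observation by observation.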