Andreas Pollak, "Unemployment, Human Capital Depreciation, and Unemployment Insurance Policy", Journal of Applied Econometrics, Vol. 28, No. 5, 2013, pp. 840-863.

This document explains how to replicate the estimation and simulation results in the paper by describing how to use the various scripts and programs that produced them. If you have any further questions, please contact the author:

Andreas Pollak
Department of Economics
University of Saskatchewan
9 Campus Drive
Saskatoon, SK S7N 0P9
Canada
e-mail: a.pollak@usask.ca

All programs and data are contained in the file pollak-files.zip. Since some of the files are binary, users of non-Windows operating systems should take care when unzipping. The file pollak-files.zip contains a large number of files in many directories; all of them are listed (Unix style) in the file pollak-contents.txt, which is zipped in pollak-contents.zip.

The German micro dataset is the GSOEP, 1991 to 2001 (waves "H" to "R"), together with the corresponding CNEF. The US data are from the PSID, 1991 to 1997. While the GSOEP dataset is available free of charge to both domestic and international researchers, prospective users must sign a data distribution contract with the DIW Berlin, which supplies the data. Once users have access to the GSOEP dataset, they can request the corresponding CNEF data for a nominal fee from Cornell University. The PSID is available for free from the University of Michigan; users are required to register before accessing the data. The PSID component of the CNEF can be downloaded from the CNEF webpage at Cornell.

GSOEP contact:
The German Socio-Economic Panel
DIW Berlin
10108 Berlin
Germany
web: http://www.diw.de/en/soep

PSID contact:
Institute for Social Research
University of Michigan
P.O. Box 1248
Ann Arbor, MI 48106-1248
USA
web: http://psidonline.isr.umich.edu/default.aspx

CNEF contact:
SOEP/CNEF Project Assistant
Cornell University
Department of Policy Analysis and Management
MVR Hall
Ithaca, New York 14853-4401
USA
web: http://www.human.cornell.edu/pam/research/centers-programs/german-panel/cnef.cfm

The rest of this document is structured as follows. Part A describes how the data were selected and combined, and discusses the stage-one estimation as well as the creation of the moment vector and the weighting matrix for the stage-two MSM estimation. Part B documents the simulation-based estimation step and the policy experiments discussed in the paper. Note that the files accompanying these instructions are organized in directories in accordance with the structure of this document.

PLEASE NOTE: If you are only interested in replicating the results documented in part B (MSM estimation and simulations), you do not need access to the micro datasets, as moment vectors and weighting matrices are included with the source code of the simulation programs.

A. Microdata
============

A.1 Data
--------

To generate the data files that will be used in all later stages, combine the relevant waves of the CNEF (US: 1991 to 1997 into "us_equiv_p.*"; Germany: 1991 to 1997 into "de_equiv_p.*" and 2000 to 2001 into "de_equiv2_p.*") and keep only household heads. For Germany, remove residents of the eastern federal states and Berlin.
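The sample-selection step above can be sketched in plain Python. This is purely illustrative: the actual construction is done by the SPSS scripts shipped with the files, and all field names below (head, state, wave) are hypothetical stand-ins for the CNEF variables.

```python
# Illustrative sketch of the A.1 sample selection (German case).
# Field names "head", "state", "wave" are hypothetical.

EASTERN_STATES = {"Berlin", "Brandenburg", "Mecklenburg-Vorpommern",
                  "Sachsen", "Sachsen-Anhalt", "Thueringen"}

def select_german_sample(records):
    """Keep household heads outside the eastern states and Berlin,
    pooling the CNEF waves 1991 to 1997 (de_equiv_p.*)."""
    return [r for r in records
            if r["head"]                          # household heads only
            and r["state"] not in EASTERN_STATES  # drop East + Berlin
            and 1991 <= r["wave"] <= 1997]

records = [
    {"head": True,  "state": "Bayern", "wave": 1993},
    {"head": False, "state": "Bayern", "wave": 1993},  # not a head
    {"head": True,  "state": "Berlin", "wave": 1995},  # Berlin: dropped
    {"head": True,  "state": "Hessen", "wave": 2000},  # outside 1991-1997
]
sample = select_german_sample(records)
print(len(sample))  # -> 1
```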
Then, add the following individual variables (from PSID/GSOEP):

wksu: weeks unemployed (0 <= wksu <= 52.14286)
wkse: weeks employed (0 <= wkse <= 52.14286)
wkso: weeks out of labour force (0 <= wkso <= 52.14286)
self: self-employed in current or most recent job
food: food consumption (not available for Germany 1991 to 1997)

Moreover, include a macro variable:

pricef: food price index

The SPSS scripts that were used to construct these data files are included, together with instructions on how to use them (readme-A1.txt) and information on all macro variables used.

A.2 Estimation
--------------

In what follows, "**" in filenames refers to the country component of the name, "de" for Germany and "us" for the US. The selection of the relevant subsample and the actual estimation of parameters and moments is done in a straightforward fashion by Stata scripts, which I briefly describe now.

Two parameters, delta_e and lambda, are estimated directly from the micro data as sample moments. This is done by the Stata scripts "**_deltae.do" and "**_lambda.do", by level of educational attainment and sex. The preference parameter phi is obtained from an OLS estimation of the log-linearized Euler equation, which estimates phi/gamma (recall that gamma=3); "**_demo2.do" does this. For the actual simulation, average demographic information by age is needed; this is obtained by "**_demo.do" and summarized in "**_demo.txt".

To generate the moment vector and the weighting matrix used in the MSM estimation, select the appropriate sample and variables (see paper) using "us_finalmente.do" and "de_finalmente_short.do". Based on the output created here, the MATLAB programs "us_finalmente.m" and "de_finalmente_short.m" create the moment vector ("us_mmean.csv" and "de_mmean_short.csv") and the weighting matrix ("us_mvar.csv" and "de_mvar_short.csv"). The wage index data were obtained from Datastream and are included in the file "**_wageindex.csv".
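The mechanics behind the mmean/mvar files can be sketched as follows: the moment vector is the vector of column means of per-observation moment contributions, and the weighting matrix is built from their sample covariance. This is a minimal Python sketch of that computation only; the actual moment definitions are those in the paper, and the data below are made up.

```python
# Minimal sketch of what the "*_finalmente*" MATLAB programs
# conceptually produce: the moment vector (column means) and the
# sample covariance of the per-observation moment contributions,
# from which the MSM weighting matrix is derived.

def moment_mean_and_cov(contrib):
    """contrib: list of per-observation moment vectors (lists)."""
    n = len(contrib)
    k = len(contrib[0])
    mean = [sum(row[j] for row in contrib) / n for j in range(k)]
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j])
                for row in contrib) / (n - 1)
            for j in range(k)] for i in range(k)]
    return mean, cov

contrib = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]  # made-up contributions
mmean, mvar = moment_mean_and_cov(contrib)
print(mmean)  # -> [2.0, 3.0]
```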
Note that for some scenarios reported in tables 2 and 3 in the article, alternative stage-one estimators or moment definitions were used. The scripts to construct these are also included.

B. Simulation
=============

I use a Java program to simulate the households' lifecycle behaviour and to do the MSM estimation. The basic version of the program is included, with some very concise documentation, in the "B" folder. For the most part, the programs used for the estimations and policy simulations use exactly these versions of the Java classes and differ only in the "main" method that is run. The exceptions to this rule are simulations that explicitly deal with changes in the institutions that are modelled (such as taxes, UI, etc.), simulations that use very different parameter values, making it necessary to adjust grids, and simulations that have specific requirements with respect to the data recorded in simulation runs. For each of the various estimation and simulation scenarios, the full program is included.

B.1 SMM
-------

The programs used in the estimation of each of the scenarios reported in tables 3 and 4 are included. Note that the estimation process is computationally fairly expensive; you should expect several thousand core-hours of computation time on current hardware (as of 2011) for some of the scenarios. Each scenario is composed of two or three programs: estimation, derivatives (the derivatives of the moment vector, used to calculate the variances of the estimators), and possibly analysis (generating simulation-based statistics such as unemployment rates, reservation wages, etc.). Brief instructions on how to run the programs are provided in "readme-B1.txt".

B.2 Experiments
---------------

The programs used in the simulations reported in sections 6.1, 6.2 and 6.3 of the article can be found in the corresponding folders. The simulations for the alternative scenario reported in the appendix are included as well.
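The criterion the estimation programs minimize is the standard MSM quadratic form: choose theta so that simulated moments match the data moments under the weighting matrix. The Python sketch below illustrates that shape with a deliberately trivial one-parameter "simulator" and a grid search; the real programs are the Java code described above, with a full lifecycle simulation in place of the toy model.

```python
# Hedged sketch of the MSM criterion: g(theta)' W g(theta), where
# g(theta) = simulated moments minus data moments.  The "simulator"
# here is a toy stand-in for the lifecycle model.

def msm_objective(theta, data_moments, weight):
    sim = [theta, theta ** 2]  # toy model-implied moments
    g = [s - d for s, d in zip(sim, data_moments)]
    # quadratic form g' W g
    return sum(g[i] * weight[i][j] * g[j]
               for i in range(len(g)) for j in range(len(g)))

data_moments = [2.0, 4.0]            # consistent with theta = 2
weight = [[1.0, 0.0], [0.0, 1.0]]    # identity in place of inv(mvar)
grid = [i / 100 for i in range(0, 401)]
theta_hat = min(grid, key=lambda t: msm_objective(t, data_moments, weight))
print(theta_hat)  # -> 2.0
```

The "derivatives" program mentioned above corresponds to the next step: numerically differentiating the moment vector with respect to theta at theta_hat, which is what the variance formulas for MSM estimators require.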
For the most part, these are straightforward modifications of the basic program. Additional information is provided in "readme-B2.txt".