Andreas Pollak, "Unemployment, Human Capital Depreciation, and Unemployment Insurance Policy", Journal of Applied Econometrics, Vol. 28, No. 5, 2013, pp. 840-863.

This document explains how to replicate the estimation and simulation results in the paper by describing how to use the various scripts and programs that produced them. If you have any further questions, please contact the author:

Andreas Pollak
Department of Economics
University of Saskatchewan
9 Campus Drive
Saskatoon, SK S7N 0P9
Canada
e-mail: a.pollak@usask.ca

All programs and data are contained in the file pollak-files.zip. Since some of the files are binary, users of non-Windows operating systems should take care when unzipping. The file pollak-files.zip contains a large number of files in many directories; all of them are listed (Unix style) in the file pollak-contents.txt, which is zipped in pollak-contents.zip.

The German micro dataset is the GSOEP, 1991 to 2001 (waves "H" to "R"), together with the corresponding CNEF. The US data are from the PSID, 1991 to 1997. While the GSOEP dataset is available free of charge to both domestic and international researchers, prospective users must sign a data distribution contract with the DIW Berlin, which supplies the data. Once users have access to the GSOEP dataset, they can request the corresponding CNEF data for a nominal fee from Cornell University. The PSID is available for free from the University of Michigan; users are required to register before accessing the data. The PSID component of the CNEF can be downloaded from the CNEF webpage at Cornell.

GSOEP contact:
The German Socio-Economic Panel
DIW Berlin
10108 Berlin
Germany
web: http://www.diw.de/en/soep

PSID contact:
Institute for Social Research
University of Michigan
P.O. Box 1248
Ann Arbor, MI 48106-1248
USA
web: http://psidonline.isr.umich.edu/default.aspx

CNEF contact:
SOEP/CNEF Project Assistant
Cornell University
Department of Policy Analysis and Management
MVR Hall
Ithaca, New York 14853-4401
USA
web: http://www.human.cornell.edu/pam/research/centers-programs/german-panel/cnef.cfm

The rest of this document is structured as follows. Part A describes how the data were selected and combined, and discusses the stage-one estimation as well as the creation of the moment vector and the weighting matrix for the stage-two MSM estimation. Part B documents the simulation-based estimation step and the policy experiments discussed in the paper. Note that the files accompanying these instructions are organized in directories in accordance with the structure of this document.

PLEASE NOTE: If you are only interested in replicating the results documented in part B (MSM estimation and simulations), you do not need access to the micro datasets, as moment vectors and weighting matrices are included with the source code of the simulation programs.

A. Microdata
============

A.1 Data
--------

To generate the data files that will be used in all later stages, combine the relevant waves of the CNEF (US: 1991 to 1997 into "us_equiv_p.*"; Germany: 1991 to 1997 into "de_equiv_p.*" and 2000 to 2001 into "de_equiv2_p.*") and keep only household heads. For Germany, remove residents of the eastern federal states and Berlin.
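The sample-selection step above can be sketched in plain Python. This is purely illustrative: the actual construction is done by the SPSS scripts shipped with the files, and all field names below (head, state, wave) are hypothetical stand-ins for the CNEF variables.

```python
# Illustrative sketch of the A.1 sample selection (German case).
# Field names "head", "state", "wave" are hypothetical.

EASTERN_STATES = {"Berlin", "Brandenburg", "Mecklenburg-Vorpommern",
                  "Sachsen", "Sachsen-Anhalt", "Thueringen"}

def select_german_sample(records):
    """Keep household heads outside the eastern states and Berlin,
    pooling the CNEF waves 1991 to 1997 (de_equiv_p.*)."""
    return [r for r in records
            if r["head"]                          # household heads only
            and r["state"] not in EASTERN_STATES  # drop East + Berlin
            and 1991 <= r["wave"] <= 1997]

records = [
    {"head": True,  "state": "Bayern", "wave": 1993},
    {"head": False, "state": "Bayern", "wave": 1993},  # not a head
    {"head": True,  "state": "Berlin", "wave": 1995},  # Berlin: dropped
    {"head": True,  "state": "Hessen", "wave": 2000},  # outside 1991-1997
]
sample = select_german_sample(records)
print(len(sample))  # -> 1
```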
Then, add the following individual variables (from PSID/GSOEP):

wksu: weeks unemployed (0 <= wksu <= 52.14286)
wkse: weeks employed (0 <= wkse <= 52.14286)
wkso: weeks out of labour force (0 <= wkso <= 52.14286)
self: self-employed in current or most recent job
food: food consumption (not available for Germany 1991 to 1997)

Moreover, include a macro variable:

pricef: food price index

The SPSS scripts that were used to construct these data files are included, together with instructions on how to use them (readme-A1.txt) and information on all macro variables used.

A.2 Estimation
--------------

In what follows, "**" in filenames refers to the country component of the name, "de" for Germany and "us" for the US. The selection of the relevant subsample and the actual estimation of parameters and moments is done in a straightforward fashion by Stata scripts, which I briefly describe now.

Two parameters, delta_e and lambda, are estimated directly from the micro data as sample moments. This is done by the Stata scripts "**_deltae.do" and "**_lambda.do", by level of educational attainment and sex. The preference parameter phi is obtained from an OLS estimation of the log-linearized Euler equation, which estimates phi/gamma (recall that gamma=3); "**_demo2.do" does this. For the actual simulation, average demographic information by age is needed; this is obtained by "**_demo.do" and summarized in "**_demo.txt".

To generate the moment vector and the weighting matrix used in the MSM estimation, select the appropriate sample and variables (see paper) using "us_finalmente.do" and "de_finalmente_short.do". Based on the output created here, the MATLAB programs "us_finalmente.m" and "de_finalmente_short.m" create the moment vector ("us_mmean.csv" and "de_mmean_short.csv") and the weighting matrix ("us_mvar.csv" and "de_mvar_short.csv"). The wage index data were obtained from Datastream and are included in the file "**_wageindex.csv".
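The mechanics behind the mmean/mvar files can be sketched as follows: the moment vector is the vector of column means of per-observation moment contributions, and the weighting matrix is built from their sample covariance. This is a minimal Python sketch of that computation only; the actual moment definitions are those in the paper, and the data below are made up.

```python
# Minimal sketch of what the "*_finalmente*" MATLAB programs
# conceptually produce: the moment vector (column means) and the
# sample covariance of the per-observation moment contributions,
# from which the MSM weighting matrix is derived.

def moment_mean_and_cov(contrib):
    """contrib: list of per-observation moment vectors (lists)."""
    n = len(contrib)
    k = len(contrib[0])
    mean = [sum(row[j] for row in contrib) / n for j in range(k)]
    cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j])
                for row in contrib) / (n - 1)
            for j in range(k)] for i in range(k)]
    return mean, cov

contrib = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]  # made-up contributions
mmean, mvar = moment_mean_and_cov(contrib)
print(mmean)  # -> [2.0, 3.0]
```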
Note that for some scenarios reported in tables 2 and 3 in the article, alternative stage-one estimators or moment definitions were used. The scripts to construct these are also included.

B. Simulation
=============

I use a Java program to simulate the households' lifecycle behaviour and to do the MSM estimation. The basic version of the program is included, with some very concise documentation, in the "B" folder. For the most part, the programs used for the estimations and policy simulations use exactly these versions of the Java classes and differ only in the "main" method that is run. The exceptions to this rule are simulations that explicitly deal with changes in the institutions that are modelled (such as taxes, UI, etc.), simulations that use very different parameter values, making it necessary to adjust grids, and simulations that have specific requirements with respect to the data recorded in simulation runs. For each of the various estimation and simulation scenarios, the full program is included.

B.1 SMM
-------

The programs used in the estimation of each of the scenarios reported in tables 3 and 4 are included. Note that the estimation process is computationally fairly expensive; you should expect several thousand core-hours of computation time on current hardware (as of 2011) for some of the scenarios. Each scenario is composed of two or three programs: estimation, derivatives (the derivatives of the moment vector, used to calculate the variances of the estimators), and possibly analysis (generating simulation-based statistics such as unemployment rates, reservation wages, etc.). Brief instructions on how to run the programs are provided in "readme-B1.txt".

B.2 Experiments
---------------

The programs used in the simulations reported in sections 6.1, 6.2 and 6.3 of the article can be found in the corresponding folders. The simulations for the alternative scenario reported in the appendix are included as well.
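The criterion the estimation programs minimize is the standard MSM quadratic form: choose theta so that simulated moments match the data moments under the weighting matrix. The Python sketch below illustrates that shape with a deliberately trivial one-parameter "simulator" and a grid search; the real programs are the Java code described above, with a full lifecycle simulation in place of the toy model.

```python
# Hedged sketch of the MSM criterion: g(theta)' W g(theta), where
# g(theta) = simulated moments minus data moments.  The "simulator"
# here is a toy stand-in for the lifecycle model.

def msm_objective(theta, data_moments, weight):
    sim = [theta, theta ** 2]  # toy model-implied moments
    g = [s - d for s, d in zip(sim, data_moments)]
    # quadratic form g' W g
    return sum(g[i] * weight[i][j] * g[j]
               for i in range(len(g)) for j in range(len(g)))

data_moments = [2.0, 4.0]            # consistent with theta = 2
weight = [[1.0, 0.0], [0.0, 1.0]]    # identity in place of inv(mvar)
grid = [i / 100 for i in range(0, 401)]
theta_hat = min(grid, key=lambda t: msm_objective(t, data_moments, weight))
print(theta_hat)  # -> 2.0
```

The "derivatives" program mentioned above corresponds to the next step: numerically differentiating the moment vector with respect to theta at theta_hat, which is what the variance formulas for MSM estimators require.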
For the most part, these are straightforward modifications of the basic program. Additional information is provided in "readme-B2.txt".