Valentin Verdier, "Average Treatment Effects for Stayers with Correlated Random Coefficient Models of Panel Data", Journal of Applied Econometrics, Vol. 35, No. 7, 2020, pp. 917-939.

The zip file vv-files.zip originally had a different name. It mainly contains empty directories.

------------------DATA PREPARATION----------------------

1. Household Survey Data

The household data used in this paper are available to other researchers, but access requires a data use agreement signed by the researchers and the Tegemeo Institute at Egerton University. The institute's data policy is found here:
https://www.tegemeo.org/images/downloads/data/Data-Policy.pdf

A request for the data used in this paper can be submitted here:
https://www.tegemeo.org/index.php/resources/data/230-request-for-data.html

To replicate the results in this paper, researchers should request the household survey data for years 1997, 2004, 2007, and 2010 with GPS coordinates included (the coordinates are needed to merge in the rainfall data).

2. Rainfall Data

Once the data files above are obtained, rainfall data for each household can be merged in using the household's GPS coordinates, GIS software, and the publicly available daily precipitation data from the Climate Prediction Center (CPC) of the National Weather Service. These data are available here:
https://www.cpc.ncep.noaa.gov/products/GIS/GIS_DATA/

The dates of the rainfall seasons for all geographical regions in Kenya are given in "data/Rainfall Periods for Tegemeo Sample Villages (by Division).pdf" (these dates apply to every year of the data).

Alternatively, this rainfall dataset has already been compiled by the Tegemeo Agricultural Policy Research and Analysis Project (TAPRA) and can be obtained by requesting the "General Data" files from the Tegemeo Institute using the request link provided above.

3. Data Processing Code

The raw data obtained from Tegemeo were placed in the folder /data/data/raw.
The folder structure was preserved (without data files) to make replication by other researchers easier. In particular, the files are divided into household surveys for years 1997, 2000, 2004, 2007, and 2010, and a folder named General Data containing, among other files, the rainfall data discussed above.

From the raw files, the panel dataset used for analysis (data/panels/SuriPanel_extended.dta) is created by the do files
(1) /data/src/data_prep.do and
(2) data/src/panel_creation.do.

------------------ANALYSIS CODE ---------------------

1. Table 1

The results in Table 1 can be replicated with the dataset SuriPanel_extended.dta created above by running the code Table1/Code/extrapolation.do.

2. Figure 2

Run Figure2/Code/graph_extrapolations.do to replicate Figure 2.

3. Footnote 31

The test in footnote 31 (overidentification test of the CRC model) can be replicated using the code in the Footnote31 folder. Obtain the results for 10,000 bootstrap draws (20 times 500) by running the shell script Code/run.sh. Then obtain the p-value by running Code/results_bootstrap_overid_CRC.do.

4. Alternative test of the simple extrapolation

In section 4.2.3 we discuss an alternative test of the validity of the extrapolation to stayers, which uses the average distance to the nearest seed seller instead of the overidentification test reported in Table 1. The code to obtain these results is found in the folder "test_distance". Obtain the results for 10,000 bootstrap draws (20 times 500) by running the shell script Code/run.sh. Then obtain the p-value by running results_testdist.do.
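
Illustration (not part of the replication package): the bootstrap tests above pool 20 batches of 500 draws into 10,000 draws before a p-value is computed. The Python sketch below shows one common way such a pooled p-value could be formed. It is not the content of run.sh or of the results_*.do files. The function names, the simulated chi-square statistics, and the "at least as large as the observed statistic" convention are all assumptions made for illustration.

```python
# Hypothetical sketch: pool bootstrap batches and compute a p-value.
# This is NOT the author's code; names and conventions are assumed.
import random


def pooled_bootstrap_pvalue(observed_stat, batches):
    """Share of pooled bootstrap statistics at least as large as observed_stat."""
    draws = [stat for batch in batches for stat in batch]
    if not draws:
        raise ValueError("no bootstrap draws supplied")
    exceed = sum(1 for stat in draws if stat >= observed_stat)
    return exceed / len(draws)


def simulate_batches(n_batches=20, draws_per_batch=500, seed=0):
    """Stand-in for saved batch results: squared standard normal draws,
    used only so the example runs without the restricted data."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) ** 2 for _ in range(draws_per_batch)]
        for _ in range(n_batches)
    ]


if __name__ == "__main__":
    batches = simulate_batches()
    print(sum(len(b) for b in batches))  # 20 batches x 500 draws = 10000
    print(pooled_bootstrap_pvalue(3.84, batches))
```

Splitting the draws into batches only changes how the work is scheduled. Under this convention the pooled p-value depends on the full set of draws, not on how they were grouped.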