Jan R. Magnus and Mary S. Morgan (eds), "The Experiment in Applied Econometrics", Journal of Applied Econometrics, Vol. 12, No. 5, 1997 (Special Issue).

All the papers in this special issue used the same data. All the data files are in DOS format. They are zipped in the file exp-data.zip. Documentation files are zipped in the file exp-doc.zip.

(A) The data set

The data set consists of 9 files:

5 files for the USA:
  BS41US.PRN
  BS50US.PRN
  BS60US.PRN
  BS72US.PRN
  TS1289US.PRN

and 4 files for the Netherlands:
  BS65NL.TXT
  BS80NL.TXT
  BS88NL.TXT
  TS4888NL.TXT

The BS-files contain budget-survey data; the TS-files contain time-series data (1912-89 for the USA and 1948-88 for the Netherlands). "US" indicates data for the USA and "NL" for the Netherlands. The names of the "NL" files have been slightly changed compared to the Experiment Information Pack, but there is no ambiguity.

The contents of the files and all other details are discussed at length in the Experiment Information Pack. A short data description is available in the Special Issue of the Journal of Applied Econometrics (October 1997), and a much longer and more detailed one in the forthcoming book version of the Experiment (about twice the size of the Special Issue):

Magnus, Jan R. and Mary S. Morgan (eds), The Experiment in Applied Econometrics, John Wiley & Sons, Chichester / New York, to appear in 1998.

The Dutch data are not usable without the Experiment Information Pack or the forthcoming book. Therefore we attach a WordPerfect file DUTCHDAT.WP6, which gives the necessary information about the Dutch data. An ASCII version of this file (DUTCHDAT.TXT) is also included.

Of the 9 files that make up the data set, 7 are identical to the ones sent to the participants. The other 2 files (BS41US.PRN and TS4888NL.TXT) are almost identical. We shall point out the differences under (B) and (C).

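
As a quick check that a copy of exp-data.zip is complete, the nine file names listed under (A) can be compared with the contents of the archive. The Python sketch below is only an illustration and is not part of the original distribution. The function name missing_data_files, the case-insensitive comparison, and the decision to ignore any directory prefix inside the archive are assumptions made for this example.

```python
import zipfile

# File names exactly as listed in section (A) of this note.
EXPECTED_FILES = [
    "BS41US.PRN", "BS50US.PRN", "BS60US.PRN", "BS72US.PRN", "TS1289US.PRN",
    "BS65NL.TXT", "BS80NL.TXT", "BS88NL.TXT", "TS4888NL.TXT",
]


def missing_data_files(archive):
    """Return the expected file names that are not found in the archive.

    `archive` is a path or file-like object accepted by zipfile.ZipFile.
    Assumption (not stated in the note): names are compared without regard
    to case, and any directory prefix inside the archive is ignored.
    """
    with zipfile.ZipFile(archive) as zf:
        present = {name.rsplit("/", 1)[-1].upper() for name in zf.namelist()}
    return [name for name in EXPECTED_FILES if name not in present]
```

Calling missing_data_files("exp-data.zip") on a complete archive should return an empty list; any names it does return are the files to look for.
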
(B) The file BS41US.PRN

The file BS41US.PRN is different from the original file sent to participants in three respects:

1) one column (named SAMPSIZE) with sample sizes is added - see comment under (D);

2) in the variable FOODCON the numbers 1708 and 1197 (household size 5 or more) were interchanged. This has been corrected;

3) Tobin made two small and one larger error in the data reported in his Table 2 (p. 119 of his paper). We corrected these three errors in the Experiment Information Pack, but we made one small (one digit) mistake ourselves. The 3rd income observation for 3-person households should be 1358, not 1357. This has now been corrected.

(C) The file TS4888NL.TXT

In our e-mail message to all participants of 1 May 1996, we wrote:

"In the Dutch data set (file TS4888.NL [old name], printed on page 54 of the Experiment Information Pack) a mysterious mistake has occurred in the first two variables 'POP' and 'NOH'. We apologize for the mistake and provide the correct numbers below."

The new file has the correct numbers.

(D) E-mail message of 19 August 1995 (concerning the 1941 US Survey data)

In our e-mail message of 19 August 1995 we wrote to all participants:

"Professor J.S. Cramer of the University of Amsterdam suggested that we should try to reconstruct the actual sample sizes for the individual cells in the 1941 survey (the variable SAMPSIZE). We have used the figures given for 'percentage reporting' in various tables to work out the implied number of households sampled for each income/family size group (from Table 18, pp. 94-101; further publication details are given in the original data description) and for all households taken together (from Table 25, pp. 127-8). The summation of these individual cell estimates is given in the 'sum' column; the entries for 'all households' appear in the 'All' column.
We are reasonably confident that these individual cell estimates are correct because they sum to 1213, very close to 1220, the sample size given elsewhere in the text for the 1941 study. However, the larger the number of households involved, the less accurate the calculations in the 'All' households column, which may vary by 1 or 2 units in the first seven categories. Our level of confidence for the individual family size cells, and for the final two categories for all households, is near 100%. The bottom row of totals has not been estimated from the tables, but is simply the sum of the columns.

Estimated Sample Sizes (SAMPSIZE) for 1941 US Budget Survey

                      Household Size
Money Income    1     2     3     4    5+    (Sum)    All
<500           59    29     3'    -     5     (96)     98
500-1000       71    54    31    12    20    (188)    188
1000-1500      40    67    33    21    19    (180)    180
1500-2000      18    58    61    23    38    (198)    198
2000-2500      11    65    39    43    25    (183)    183
2500-3000       6    25    50    37    30    (148)    148
3000-5000       -    40    42    44    38    (164)    166
5000-10000      -     6'   12     9    15     (42)     42
>10000          -     -     3*    3*    8*    (14)     17
Total         205   344   274   192   198   (1213)   1220

NB: Missing cells (-) are where no data was reported in the tables because of the smallness of the sample (almost certainly samples of 0, 1, or 2 households). Some of these missing numbers might be easily guessed, but since no data is available on these categories of households, they have been left blank.

* These are cells which Tobin did not use.

' These are the two cells which gave rise to the apparent income/consumption 'outliers' described in Haji Izan's (Journal of Econometrics, Vol. 13, 1980, pp. 391-402) discussion and her Figure 1, p. 396. These data are based on small samples, but they are also economic outliers in the sense that they are on the edge of the economic matrix."

(E) E-mail message of 29 August 1995 (answer to two questions)

In our e-mail message of 29 August 1995 we wrote to all participants:

"One of the participants commented on the 1960 US survey data.
The household income (HINC) variable is negative (-964) for the lowest income section of the 5-person family group (page 31 of the Experiment Information Pack). This number is 'correct' in the sense that it is as reported in the published tables. However, you may note that (a) the sample size is small. There are 4 families in the cell (0.4% of 984); (b) the cell seems to be dominated by some households with very large negative asset changes (see the NETASSCH variable); and (c) the ACCBAL item also suggests considerable inaccuracy in recording."
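
The arithmetic quoted in (D) and (E) can be re-checked mechanically. The Python sketch below is an illustration only and not part of the original messages. It transcribes the SAMPSIZE table from (D), treats the blank cells (-) as zero when summing (an assumption, since the note says only that they are almost certainly 0, 1, or 2 households), and recomputes the row sums, the column totals, and the two grand totals. The names TABLE, total and check_table are hypothetical.

```python
# Estimated sample sizes (SAMPSIZE) transcribed from the table in (D).
# Each row: cells for household sizes 1, 2, 3, 4, 5+ (None = blank "-"),
# followed by the printed "(Sum)" and "All" entries.
TABLE = {
    "<500":       ([59, 29, 3, None, 5], 96, 98),
    "500-1000":   ([71, 54, 31, 12, 20], 188, 188),
    "1000-1500":  ([40, 67, 33, 21, 19], 180, 180),
    "1500-2000":  ([18, 58, 61, 23, 38], 198, 198),
    "2000-2500":  ([11, 65, 39, 43, 25], 183, 183),
    "2500-3000":  ([6, 25, 50, 37, 30], 148, 148),
    "3000-5000":  ([None, 40, 42, 44, 38], 164, 166),
    "5000-10000": ([None, 6, 12, 9, 15], 42, 42),
    ">10000":     ([None, None, 3, 3, 8], 14, 17),
}
# Printed bottom row: column totals, then the "(Sum)" and "All" totals.
PRINTED_TOTALS = ([205, 344, 274, 192, 198], 1213, 1220)


def total(cells):
    """Sum a list of cells, treating blank (None) cells as zero (assumption)."""
    return sum(c for c in cells if c is not None)


def check_table():
    """Recompute all sums; return the rows where 'All' and '(Sum)' differ."""
    columns = [0] * 5
    for label, (cells, printed_sum, _all) in TABLE.items():
        assert total(cells) == printed_sum, label
        for i, c in enumerate(cells):
            columns[i] += c or 0
    col_totals, grand_sum, grand_all = PRINTED_TOTALS
    assert columns == col_totals
    assert sum(columns) == grand_sum
    assert sum(all_ for _, _, all_ in TABLE.values()) == grand_all
    return {label: all_ - s for label, (_, s, all_) in TABLE.items() if all_ != s}


gaps = check_table()

# Section (E) quotes 4 families as 0.4% of 984; round to whole families.
families_in_cell = round(0.004 * 984)
```

Under these assumptions every check passes. The dictionary gaps holds the three income rows in which 'All' exceeds '(Sum)', namely <500, 3000-5000 and >10000, by 2, 2 and 3 households respectively. Together these account for the difference between 1213 and 1220, and each of the three rows contains at least one blank cell. The last line reproduces the figure of 4 families given in (E).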