Gregorio Caetano, Josh Kinsler and Hao Teng, "Towards Causal Estimates of Children's Time Allocation on Skill Development", Journal of Applied Econometrics, Vol. 34, No. 4, 2019, pp. 588-605.

Data used in this paper are from the Panel Study of Income Dynamics (PSID) main study and the 1997, 2002 and 2007 waves of the Child Development Supplements (CDS-I, CDS-II, and CDS-III).

Five raw data files are included in the zip file ckt-data.zip. 1997TD.csv, 2002TD.csv and 2007TD.csv are the children's time diary datasets from Wave I (1997), Wave II (2002), and Wave III (2007) respectively. Child_TA.csv is the main dataset from the Child Development Supplements. PSID_family_select.csv is part of the PSID Main Family Data. All five datasets can be downloaded from the PSID website: https://simba.isr.umich.edu/VS/f.aspx. More details about the data files and how we organize them follow.

All data files are ASCII files in DOS format. Unix/Linux users should use "unzip -a".

**Data files**:

1.1 Time Diary Datasets

1997TD.csv (N = 131,060)
2002TD.csv (N = 99,467)
2007TD.csv (N = 57,813)

Each row represents an activity done by a CDS child during a continuous time period. We do not rename variables. The PSID-CDS codebook provides detailed information about each variable.

1.2 CDS Main Dataset: Child_TA.csv (N = 3,563)

The dataset includes information about the 3,563 CDS children's demographic characteristics, family background, cognitive and non-cognitive skill measures, etc. We do not rename variables. The PSID-CDS codebook provides detailed information about each variable.

1.3 PSID Main Family Dataset: PSID_family_select.csv (N = 34,004)

The dataset contains 95 family demographic variables selected from the PSID Main Family Data. It covers 1997, 1999, 2001, 2003, 2005 and 2007. We do not rename variables. The PSID Main Family Data codebook provides detailed information about each variable.
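After unzipping, the row counts listed above can be used as a quick integrity check. The following is a minimal sketch, not part of the original replication materials. It assumes that each CSV file has exactly one header row and that N counts the data rows only; the function names `count_data_rows` and `check_row_counts` are illustrative.

```python
import csv
from pathlib import Path

# Row counts as stated in this readme.
# Assumption: N counts data rows and excludes a single header row.
EXPECTED_ROWS = {
    "1997TD.csv": 131060,
    "2002TD.csv": 99467,
    "2007TD.csv": 57813,
    "Child_TA.csv": 3563,
    "PSID_family_select.csv": 34004,
}


def count_data_rows(path):
    """Count the rows that follow the header line.

    newline="" lets the csv module handle DOS (CRLF) line endings itself.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (assumed to be present)
        return sum(1 for _ in reader)


def check_row_counts(data_dir, expected=EXPECTED_ROWS):
    """Return {filename: (found, expected)} for missing or mismatching files."""
    mismatches = {}
    for name, n_expected in expected.items():
        path = Path(data_dir) / name
        found = count_data_rows(path) if path.exists() else None
        if found != n_expected:
            mismatches[name] = (found, n_expected)
    return mismatches
```

Calling `check_row_counts("ckt-data")` on the unzipped folder (the folder name is a placeholder) returns an empty dictionary when every file is present and matches the stated N.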
**Data organization**:

Following the steps below, one should be able to go from the five raw data files to the final dataset used for estimation in this paper.

Step 1. Drop atypical time diaries and recode the original time diary data (i.e. 1997TD.csv, 2002TD.csv and 2007TD.csv) into fewer categories. The categories are described in Section 2.1 of the paper. This step creates three new time diary datasets, one each for 1997, 2002 and 2007, which are used in Step 2.

Step 2. Merge the three new time diary datasets into the CDS main dataset (i.e. Child_TA.csv). This step creates a new CDS dataset, which is used in Step 3.

Step 3. Merge the new CDS dataset with the PSID Main Family Dataset (i.e. PSID_family_select.csv). Then convert the merged dataset into a panel dataset. This step creates the final dataset used for estimation.
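The three steps above can be sketched in Python as follows. This is an illustration of the workflow only, not the authors' code. Every column name (`child_id`, `family_id`, `year`, `typical`, `activity`, `minutes`), the activity codes in `RECODE`, and the category labels are hypothetical placeholders; the actual variable names are in the PSID codebooks and the actual categories are in Section 2.1 of the paper. This readme also does not say how a CDS wave such as 2002 is matched to a family-file year, so the sketch takes that mapping as an explicit argument (`family_year_for_wave`) rather than assuming one.

```python
from collections import defaultdict

# Hypothetical mapping from raw activity codes to coarse categories.
# The real categories are described in Section 2.1 of the paper.
RECODE = {"101": "sleep", "102": "sleep", "201": "school"}


def summarize_diary(rows, recode=RECODE):
    """Step 1: drop atypical diaries, recode activities, and total the
    minutes per child and category. `rows` are dicts, e.g. from csv.DictReader."""
    totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        if r["typical"] != "1":  # hypothetical flag marking a typical diary day
            continue
        category = recode.get(r["activity"], "other")
        totals[r["child_id"]][category] += float(r["minutes"])
    return {cid: dict(cats) for cid, cats in totals.items()}


def build_panel(children, diaries_by_wave, families, family_year_for_wave):
    """Steps 2 and 3: attach diary totals and family variables to each child,
    producing one row per child and wave (a long-format panel)."""
    fam_index = {(f["family_id"], f["year"]): f for f in families}
    panel = []
    for child in children:
        for wave, diary in sorted(diaries_by_wave.items()):
            if child["child_id"] not in diary:
                continue  # no usable diary for this child in this wave
            row = {"child_id": child["child_id"], "wave": wave}
            # Step 2: merge the recoded time-use totals into the child record.
            for category, minutes in diary[child["child_id"]].items():
                row[f"time_{category}"] = minutes
            # Step 3: merge family variables for the year chosen by the caller.
            fam_year = family_year_for_wave[wave]
            fam = fam_index.get((child["family_id"], fam_year), {})
            for key, value in fam.items():
                if key not in ("family_id", "year"):
                    row[key] = value
            panel.append(row)
    return panel
```

Keeping the wave-to-year mapping as an argument makes that choice visible: the family data cover odd years from 1997 to 2007, so whoever runs the sketch has to decide explicitly which family year to pair with the 2002 wave.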