Davide Pettenuzzo and Francesco Ravazzolo, "Optimal Portfolio Choice under Decision-Based Model Combinations," Journal of Applied Econometrics, Vol. 31, No. 7, 2016, pp. 1312-1332.

The aggregate market data used in this paper were made available by Amit Goyal and Michael Roberts on their websites, http://www.hec.unil.ch/agoyal/ and http://finance.wharton.upenn.edu/~mrrobert/styled-9/styled-13/index.html. Relevant information on both datasets can be found in the papers:

  "A Comprehensive Look at the Empirical Performance of Equity Premium Prediction", 2008, Ivo Welch and Amit Goyal, Review of Financial Studies, 21(4), 1455-1508.

  "On the Importance of Measuring Payout Yield: Implications for Empirical Asset Pricing", 2007, Jacob Boudoukh, Roni Michaely, Matthew Richardson, and Michael Roberts, Journal of Finance, 62, 877-915.

The industry return data and classifications were made available by Kenneth French on his website, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html

The remaining industry-specific data were obtained by combining quarterly accounting data from COMPUSTAT with monthly and daily equity market data from CRSP, as described in "In Search of Distress Risk", 2008, John Campbell, Jens Hilscher, and Jan Szilagyi, Journal of Finance, 63, 2899-2939.

All the S&P 500 analyses performed in the paper rely on monthly data extending from January 1926 to December 2010, for a total of 1,020 observations. The MatLab code load_data_monthly_FINAL.m was used to process and prepare the monthly S&P 500 excess returns and the 15 monthly predictors.
The 17 variables (columns) of the resulting datafile may be described as follows:

  Dates: calendar dates, formatted as dd-mm-yyyy
  Excess return (CRSP): S&P 500 returns, computed from the S&P 500 index including dividends, in excess of the 3-month Treasury bill rate
  Log dividend yield: difference between the log of dividends and the log of the lagged S&P 500 index
  Log earning price ratio: difference between the log of earnings and the log of the S&P 500 index
  Log smooth earning price ratio: difference between the log of the 10-year moving average of earnings and the log of the S&P 500 index
  Log dividend-payout ratio: difference between the log of dividends and the log of earnings
  Book-to-market ratio: ratio of book value to market value for the Dow Jones Industrial Average
  T-Bill rate: 3-month Treasury bill rate
  Long-term yield: long-term government bond yield
  Long-term return: long-term government bond return
  Term spread: difference between the long-term yield on government bonds and the Treasury bill rate
  Default yield spread: difference between BAA- and AAA-rated corporate bond yields
  Default return spread: difference between long-term corporate bond and long-term government bond returns
  Stock variance: sum of squared daily returns on the S&P 500
  Net equity expansion: ratio of the 12-month moving sum of net issues by NYSE-listed stocks to the total end-of-year market capitalization of NYSE stocks
  Inflation: change in the Consumer Price Index (All Urban Consumers)
  Log total net payout yield: difference between the log of dividends plus net equity repurchases (repurchases minus issuances) and the log of the lagged S&P 500 index

All the industry portfolio analyses performed in the paper rely on monthly data extending from January 1926 to December 2010, for a total of 1,020 observations. The MatLab code load_data_5industries_monthly_FINAL.m was used to process and prepare the monthly industry portfolio excess returns and predictors.
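
Several of the predictor definitions listed above for the S&P 500 file are simple transformations of underlying series. The Python sketch below restates a few of them as functions, purely as an illustration of the definitions. It is not a translation of load_data_monthly_FINAL.m; all function names, argument names, and input values are hypothetical and are not taken from the data files.

```python
import math

def log_dividend_yield(dividends, lagged_index):
    # log of dividends minus log of the lagged S&P 500 index
    return math.log(dividends) - math.log(lagged_index)

def log_earnings_price(earnings, index):
    # log of earnings minus log of the S&P 500 index
    return math.log(earnings) - math.log(index)

def log_dividend_payout(dividends, earnings):
    # log of dividends minus log of earnings
    return math.log(dividends) - math.log(earnings)

def term_spread(long_term_yield, tbill_rate):
    # long-term government bond yield minus the Treasury bill rate
    return long_term_yield - tbill_rate

def default_yield_spread(baa_yield, aaa_yield):
    # BAA-rated minus AAA-rated corporate bond yield
    return baa_yield - aaa_yield

def stock_variance(daily_returns):
    # sum of squared daily returns within the month
    return sum(r * r for r in daily_returns)

# Made-up inputs, for illustration only (assumed values, not archive data).
example = {
    "log_dy": log_dividend_yield(dividends=2.0, lagged_index=100.0),
    "log_ep": log_earnings_price(earnings=5.0, index=100.0),
    "log_de": log_dividend_payout(dividends=2.0, earnings=5.0),
    "tms": term_spread(long_term_yield=0.05, tbill_rate=0.03),
    "dfy": default_yield_spread(baa_yield=0.07, aaa_yield=0.06),
    "svar": stock_variance([0.01, -0.02]),
}
```

The sketch says nothing about the scaling of the columns in the archive (for example, percent versus decimal returns); that should be checked against the data files themselves.
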
The 15 variables (columns) of each resulting industry datafile follow the same definitions provided above for the S&P 500 data.

The two MatLab files are zipped in the file pr-matlab.zip. They are ASCII files in DOS format. Unix/Linux users should use "unzip -a".

pr-matlab.zip:

    9992  2015-11-18 11:14  load_data_5industries_monthly_FINAL.m
    6504  2015-11-18 11:14  load_data_monthly_FINAL.m

All the data files are zipped in the file pr-data.zip. Some files are in a subdirectory called Temp, because that is where the programs expect to find them. The data files are also ASCII files in DOS format. Unix/Linux users should use "unzip -a".

pr-data.zip:

   36855  2015-11-18 10:34  5_Industry_Portfolios.csv
  853946  2015-11-18 11:12  5_Industry_Portfolios_Daily.csv
  137487  2015-11-18 11:07  Data for modeling (S&P 500 monthly).csv
  366076  2015-11-18 10:37  Industry_Data_for_Matlab.csv
  120236  2015-11-18 11:07  Post2Web PayoutPaperDataTS 23 Sep 2011.csv
  253147  2015-11-18 11:06  PredictorData2010.csv
       0  2015-12-08 13:31  Temp/
  117851  2015-11-18 11:17  Temp/Data for modeling (industry 2 - monthly).csv
  117461  2015-11-18 11:18  Temp/Data for modeling (industry 3 - monthly).csv
  117696  2015-11-18 11:18  Temp/Data for modeling (industry 4 - monthly).csv
  117775  2015-11-18 11:17  Temp/Data for modeling (industry 1 - monthly).csv
  141837  2015-11-18 11:17  Temp/Data for modeling (S&P 500 monthly).csv
  117479  2015-11-18 11:18  Temp/Data for modeling (industry 5 - monthly).csv

Davide Pettenuzzo
Brandeis University
415 South Street
Mailstop 021
Waltham, MA 02453-2728
http://people.brandeis.edu/~dpettenu/