Nikolaus Hautsch, Lada M. Kyj, and Peter Malec, "Do High-Frequency Data Improve High-Dimensional Portfolio Allocations?", Journal of Applied Econometrics, Vol. 30, No. 2, 2015, pp. 263-290.

The raw data used in this paper are confidential. However, the empirical results can be replicated using the blocked correlation matrices, variance estimates, and realized covariance matrices, as well as the open and close prices, all of which are provided here.

The data used in the article are from the Trade and Quote (TAQ) database of the New York Stock Exchange. TAQ is a collection of intraday trades and quotes for all securities listed on the New York Stock Exchange, American Stock Exchange, Nasdaq National Market System, and SmallCap issues. This study considers the subset of 400 stocks from the S&P 500 index with the longest continuous trading history during the sample period, which runs from January 2006 to December 2009 (1003 trading days). The stock symbols are listed in the file SPXTickL.txt. A liquidity index, which sorts the stocks according to the number of mid-quote revisions, is in the file SPXLiquIndx.txt.

Blocked correlation matrices are estimated from raw datasets of mid-quote prices sampled at 1-second increments from 9:45 EST to 16:00 EST. The data filtering procedure is discussed in section 4 of the web appendix. The estimated correlation matrices are in five files called BRKCorrx.txt, where x is the number of liquidity groups used (1, 2, 4, 8 and 10). Each of these files contains an 80200 by 1003 matrix, where each row is one element of the correlation matrix in vech format (400 x 401 / 2 = 80200 elements) and each column is one trading day. The corresponding numbers of refresh times per block are in five files called NRftx.txt, with x defined as above. Each of these files contains a 1003 by x(x+1)/2 matrix, where each row is one trading day and each column is one distinct block.

Variance estimates based on the univariate realized kernel are in the file RKVar.txt. 
This file contains a 1003 by 400 matrix, where each row is one trading day and each column is one stock.

5-minute realized covariance matrices based on previous-tick interpolation of the mid-quote data are in the file RCov5Min.txt. This file contains an 80200 by 1003 matrix, where each row is one element of the covariance matrix in vech format and each column is one trading day.

Daily open and close prices are in the files OpenPrice.txt and ClosePrice.txt, respectively. Each of these files contains a 1003 by 400 matrix, where each row is one trading day and each column is one stock.

Since the five BRKCorrx.txt files and the RCov5Min.txt file are extremely large, they have been compressed using 7zip instead of zip or pkzip, which results in substantially smaller compressed files. 7zip is freely available for many versions of Windows at http://www.7-zip.org/ and command-line versions are also available for many versions of Linux. On Debian and Ubuntu systems, they may be found in the package p7zip-full. Each BRKCorrx.txt file is compressed in the corresponding file BRKCorrx.7z, and RCov5Min.txt is compressed in RCov5Min.7z.

All other files are zipped in the file hkm-data.zip. These files are ASCII files in DOS format. Unix/Linux users should use "unzip -a" so that the line endings are converted on extraction.

There is also a web appendix in the file Hautsch-Kyj-Malec-appendix.pdf.

Please address any questions to:

Peter Malec
Institute for Statistics and Econometrics
Humboldt-Universität zu Berlin
malecpet [AT] hu-berlin.de
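
Note on the vech layout: the following is a minimal sketch, not part of the original archive, of how one column of a BRKCorrx.txt or RCov5Min.txt file (one trading day, 80200 values) might be turned back into a 400 by 400 symmetric matrix, and how a correlation matrix might be combined with the RKVar.txt variances of the same day to give a covariance matrix. It assumes the usual vech convention (lower triangle including the diagonal, stacked column by column); the files themselves do not state the ordering, so this should be checked against the web appendix. The function names unvech, corr_to_cov and n_blocks are hypothetical helpers introduced for illustration.

```python
import math


def unvech(v, n):
    """Rebuild a symmetric n x n matrix from its vech vector.

    Assumption: vech stacks the lower triangle (diagonal included)
    column by column, i.e. (1,1), (2,1), ..., (n,1), (2,2), (3,2), ...
    """
    if len(v) != n * (n + 1) // 2:
        raise ValueError("vech vector must have n(n+1)/2 elements")
    m = [[0.0] * n for _ in range(n)]
    k = 0
    for j in range(n):           # column index
        for i in range(j, n):    # rows on and below the diagonal
            m[i][j] = m[j][i] = float(v[k])
            k += 1
    return m


def corr_to_cov(corr, variances):
    """Scale a correlation matrix by standard deviations: cov_ij = r_ij * s_i * s_j."""
    s = [math.sqrt(x) for x in variances]
    n = len(s)
    return [[corr[i][j] * s[i] * s[j] for j in range(n)] for i in range(n)]


def n_blocks(x):
    """Number of distinct blocks (columns of NRftx.txt) for x liquidity groups."""
    return x * (x + 1) // 2


# Toy 3-stock example; the real data use n = 400.
corr = unvech([1.0, 0.5, 0.2, 1.0, 0.3, 1.0], 3)
cov = corr_to_cov(corr, [4.0, 9.0, 16.0])
```

For the real files, v would be one column of the 80200 by 1003 matrix and variances the matching row of the 1003 by 400 matrix in RKVar.txt, with n = 400. How the text files are delimited is not described above, so the loading step is left out of the sketch.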