Xiangkang Yin and Jing Zhao, "A Hidden Markov Model Approach to Information-Based Trading: Theory and Applications", Journal of Applied Econometrics, Vol. 30, No. 7, 2015, pp. 1210-1234.

The data set of this paper is a sample of 120 stocks traded on the New York Stock Exchange (NYSE) in 2010 and 2011. To choose representative stocks from a variety of industries and market capitalizations, we randomly select 40 stocks each from the S&P 500 Index, the S&P MidCap 400 Index, and the S&P SmallCap 600 Index. The ticker symbols of these sample stocks are listed in Panel A of Table I of the paper.

The data come from three databases:

(1) Transaction and quote data for the sample stocks are taken from the Thomson Reuters Tick History (TRTH) transaction database, from January 1, 2010 through December 31, 2011. We exclude transactions and quotes that occur before and at the open, as well as those at and after the close. To eliminate possible data errors, we also exclude quotes with zero bid or ask prices, quotes for which the bid-ask spread is greater than 50% of the price, and transactions with zero prices. Data for November 26, 2010 and November 25, 2011 are removed because of the early "day after Thanksgiving" closing.

(2) From the Center for Research in Security Prices (CRSP) database, we obtain data on the daily return, price, number of shares traded, and shares outstanding of each sample stock. These are used to calculate variables including share turnover, firm size, and the illiquidity measure.

(3) We identify quarterly earnings announcements using the announcement dates and times recorded in the Thomson Reuters I/B/E/S database. Over the two years (2010 and 2011) there are 960 earnings announcements for the 120 sample stocks. Announcements occurring at or after 4:00 pm are relabeled with the following day's date, so that the event day we consider is the day on which investors and stock prices have time to react to the earnings announcement.
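
The screens described in items (1) and (3) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and several details are assumptions the text does not settle: the open is taken to be 9:30 am and the close 4:00 pm, the "price" in the 50% spread rule is taken to be the quote midpoint, and "the following day" is taken to be the next calendar day, with no adjustment for weekends or holidays.

```python
from datetime import date, datetime, time, timedelta

# Assumed regular-session boundaries; the README only says "the open" and "the close".
MARKET_OPEN = time(9, 30)
MARKET_CLOSE = time(16, 0)


def in_trading_window(t: time) -> bool:
    """Keep only records strictly after the open and strictly before the close."""
    return MARKET_OPEN < t < MARKET_CLOSE


def keep_quote(bid: float, ask: float) -> bool:
    """Apply the quote screens: no zero prices, spread at most 50% of the price.

    Assumption: "price" is the bid-ask midpoint. Negative prices are also
    dropped here, which goes slightly beyond the stated zero-price rule.
    """
    if bid <= 0 or ask <= 0:
        return False
    midpoint = (bid + ask) / 2
    return (ask - bid) <= 0.5 * midpoint


def event_date(announced_at: datetime) -> date:
    """Relabel announcements made at or after 4:00 pm to the following day.

    Assumption: next calendar day, not next trading day.
    """
    if announced_at.time() >= MARKET_CLOSE:
        return announced_at.date() + timedelta(days=1)
    return announced_at.date()
```

For example, under these assumptions an announcement stamped 4:00 pm on July 20, 2010 is assigned to July 21, 2010, while one stamped 3:59 pm stays on July 20.
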
About 46% of the announcements in our sample occur between 4:00 pm and midnight. For the earnings surprise measure, we require at least one observation in the I/B/E/S database for calculating the mean of analyst forecasts prior to an earnings announcement. This screens out five announcements and leads to a final sample of 120 stocks with 955 earnings announcements.

We are not allowed to redistribute any of these data. Researchers need to subscribe to the databases in order to access them.

Please address any questions to:

Dr Jing Zhao
Department of Finance
La Trobe Business School
La Trobe University
Bundoora, VIC 3086
Australia
E-mail: j.zhao@latrobe.edu.au