Adam Nowak and Patrick Smith, "Textual Analysis in Real Estate", Journal of Applied Econometrics, Vol. 32, No. 4, 2017, pp. 896-918.

In our study we employ two unique datasets, both of which include confidential data. The primary data source was provided by the Georgia Multiple Listings Service (GAMLS). We were given consent to use the GAMLS data solely for research purposes, but do not have permission to disseminate the data.

We also collected data from tax assessor offices in the counties of interest to our study. Although these data are available to the public, we do not have permission to disseminate them. Individuals interested in obtaining the county tax assessor data must contact the offices directly and provide documentation that the data will be used for research purposes (if the data are used for commercial purposes, the tax assessor offices charge a fee).

Our contacts at the GAMLS and tax assessor offices are as follows: Brian Chew (GAMLS), Lisa Ballouk (Gwinnett County), Karen Bess (DeKalb County), Constance Mackey (Fulton County), Rodney McDaniel (Clayton County), and Peggy Parker (Cobb County).

Description of the Data

The GAMLS website states "Listing content includes, but is not limited to, photographs, images, graphics, audio and video recordings, virtual tours, drawings, descriptions, remarks, narratives, pricing information, and other details or information related to listed property." Our study incorporates property attributes (square footage and age), transaction attributes (short-sale or agent-owned indicators), and location fields (address) from the GAMLS.

The program "Import-Data.R" reads in the raw GAMLS data and creates the variables used in the estimation. All variable names are self-explanatory. The output is "MLS_Atlanta.csv", a file of sale prices and listing attributes.

Of particular interest in this study are the "public remarks" entered by real estate agents who have toured the properties and assessed their characteristics, quality, and condition. The program "Token-Maker-Program" is used to (1) clean the remarks, (2) create tokens for each remark, and (3) save a list of bigrams and unigrams as the R data objects "big.bigram.list" and "big.unigram.list", respectively.

Both R programs are ASCII files in DOS format. They are zipped in the file ns-programs.zip. Unix/Linux users should use "unzip -a" so that the DOS line endings are converted on extraction.
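
For readers without access to the confidential remarks, the three steps performed by "Token-Maker-Program" (cleaning, tokenizing, and saving unigram and bigram lists) can be illustrated with a minimal base-R sketch. This is not the authors' code: the example remarks, the cleaning rules (lower-casing and dropping everything except letters), the function names, and the object names "example.unigram.list" and "example.bigram.list" are all assumptions made for illustration.

```r
# Hypothetical remarks; the real GAMLS public remarks are confidential.
remarks <- c("Granite counters & HARDWOOD floors!",
             "Needs work; sold as-is. Hardwood floors.")

# (1) Clean a remark (assumed rules): lower-case it, replace every
#     character that is not a letter or a space with a space, then
#     collapse repeated spaces and trim the ends.
clean_remark <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z ]", " ", x)
  x <- gsub(" +", " ", x)
  trimws(x)
}

# (2) Tokenize: unigrams are the single words of the cleaned remark,
#     bigrams are pairs of adjacent words.
make_unigrams <- function(x) strsplit(clean_remark(x), " ", fixed = TRUE)[[1]]

make_bigrams <- function(tokens) {
  if (length(tokens) < 2) return(character(0))
  paste(head(tokens, -1), tail(tokens, -1))
}

unigrams <- lapply(remarks, make_unigrams)
bigrams  <- lapply(unigrams, make_bigrams)

# (3) Collect the distinct tokens across all remarks and save them as
#     R data objects (written to a temporary file here).
example.unigram.list <- sort(unique(unlist(unigrams)))
example.bigram.list  <- sort(unique(unlist(bigrams)))
save(example.unigram.list, example.bigram.list,
     file = tempfile(fileext = ".RData"))
```

In this sketch the bigram "hardwood floors" occurs in both remarks but is stored only once, because the saved lists hold distinct tokens rather than counts.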