Arnab Bhattacharjee, Eduardo Castro, Taps Maiti, and João Marques, "Endogenous Spatial Regression and Delineation of Submarkets: A New Framework with Application to Housing Markets", Journal of Applied Econometrics, Vol. 31, No. 1, 2016, pp. 32-57.

The database used in this empirical work was provided by the firm Janela Digital S.A., which manages the largest real estate portal in Portugal (http://www.casa.sapo.pt). The full database is used by real estate agents and is confidential. Access to part of the data used in this study was obtained through a research contract. These original data are also confidential and cannot be made available for public use. Requests for these data may be made directly to Janela Digital S.A.

The sub-sample used in our study includes 12,467 observations pertaining to the Aveiro-Ílhavo urban area, covering the period between 20 October 2000 and 20 March 2010. Substantial cleaning and transformations were made to make the original data amenable to our research. This modified database is being released for research use.

Each record in the tab-separated values ASCII text file (in DOS format) "bcmm-data.txt" pertains to one house available for purchase in the housing market over the period under analysis. This file is zipped in the file bcmm-data.zip. Unix/Linux users should use "unzip -a" so that the DOS line endings are converted on extraction.

The details of the data cleaning process can be found in Marques (2012, section VII.5.1, p.291). Several issues relating to data inconsistency were resolved. The free text description of each property was extracted, and data mining techniques were applied to populate numerous fields on the hedonic characteristics of each house. Each house was mapped to the centroid of its "zone" -- the smallest homogeneous georeferenced area containing the index house. Distances were then computed to central and local amenities.
In the process, many new variables were constructed, both by combining existing attributes and by extracting useful information from the description field and from the location of each house. Thus, the final dataset was organized with:

* 19 variables related to housing physical attributes (Table 33 in Marques, 2012);
* 24 variables related to location characteristics (Table 18 in Marques, 2012); and
* 58 variables related to time (monthly dummy variables and Time on the Market).

Statistical factor analysis was then applied, transforming the 52 initial dimensions into 5 factors (note that the time variables and the living space variable were not included in the factor analysis). The details of the factor analysis can be found in Bhattacharjee, Castro and Marques (2012) and Marques (2012, section VII.5.4, p.320). This final dataset was used for the study and is being released for research use.

The variables included in our dataset are the following:

zone      smallest homogeneous georeferenced area containing the index house
x         x-coordinate of the centroid of zone
y         y-coordinate of the centroid of zone
lnp_m2    logarithm of the asking price at listing (in Euros per square meter)
ln_m2     logarithm of living space (in square meters)
lntom     logarithm of time on the market (number of days since listing)
factor1   statistical factor: access to the CBD or central amenities
factor2   statistical factor: access to local amenities
factor3   statistical factor: accessibility to the beaches
factor4   statistical factor: housing dimension (other than living space)
factor5   statistical factor: additional desirable features (garage, balcony, central heating)

References:

Bhattacharjee, A., Castro, E. and Marques, J. (2012). Spatial Interactions in Hedonic Pricing Models: The Urban Housing Market of Aveiro, Portugal. Spatial Economic Analysis 7(1), 133-167.

Marques, J. (2012). The notion of space in urban housing markets. PhD Thesis, University of Aveiro, Portugal.
http://ria.ua.pt/handle/10773/8789

Contact:

Arnab Bhattacharjee
Professor of Economics, and Director,
Spatial Economics & Econometrics Centre (SEEC),
Heriot-Watt University
School of Management and Languages
Room 1.06, Mary Burton Building
Edinburgh EH14 4AS, United Kingdom
Email: a.bhattacharjee [AT] hw.ac.uk
Telephone: +44 (0)131 451 3482
Facsimile: +44 (0)131 451 3296
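
Illustrative note on reading the data file:

The following is a minimal sketch, not part of the original study, of how the tab-separated file "bcmm-data.txt" might be read in Python using only the standard library. It assumes that the first line of the file is a header row containing the variable names listed in this readme, and that no values are missing; neither assumption is confirmed here, so please check the file first. The function name load_bcmm and the derived field asking_price are hypothetical and introduced only for illustration. Because lnp_m2 is the log of the price per square meter and ln_m2 is the log of living space, their sum is the log of the total asking price in Euros.

```python
import csv
import math

# Numeric variable names as listed in this readme; "zone" is kept as text.
NUMERIC_FIELDS = ["x", "y", "lnp_m2", "ln_m2", "lntom",
                  "factor1", "factor2", "factor3", "factor4", "factor5"]


def load_bcmm(path):
    """Read the tab-separated data file into a list of dictionaries.

    ASSUMPTION: the first line is a header row with the variable names
    listed in the readme, and every numeric field is populated.
    """
    records = []
    # newline="" lets the csv module deal with DOS (CRLF) line endings.
    with open(path, newline="", encoding="ascii") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            record = {"zone": row["zone"]}
            for name in NUMERIC_FIELDS:
                record[name] = float(row[name])
            # log(price per m2) + log(m2) = log(total asking price)
            record["asking_price"] = math.exp(
                record["lnp_m2"] + record["ln_m2"])
            records.append(record)
    return records
```

For example, load_bcmm("bcmm-data.txt") would return one dictionary per listed house, provided the assumptions above hold for the unzipped file.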