What’s Wrong with the Case-Shiller Price Index?


Last week, the think tank Bruegel hosted an event in Brussels on the use of micro data for the evaluation of the impact of investment at the macro level.

I gave a presentation on the panel. The main point is that macro statistics on house prices, such as the Case-Shiller index, explain very little of the overall variance of house prices. In practice, a simple variance decomposition of transaction-level house prices on metro area fixed effects reveals that only about 35% of the overall variance is captured by the Case-Shiller index. In other words, about 65% of the total variation in house prices is local, occurring within metro areas.
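To make the decomposition concrete, here is a minimal sketch of how the metro share of price variance can be computed. It is not the code behind the 35% figure: the data are simulated, prices are taken in logs (an assumption), and the function name `between_metro_share` is made up for this illustration. The share it returns is the R squared of a regression of prices on metro fixed effects.

```python
import random
from collections import defaultdict
from statistics import fmean


def between_metro_share(log_prices, metros):
    """Share of total price variance explained by metro fixed effects.

    This equals the R squared of a regression of price on metro dummies:
    between-metro sum of squares divided by total sum of squares.
    """
    grand_mean = fmean(log_prices)
    by_metro = defaultdict(list)
    for price, metro in zip(log_prices, metros):
        by_metro[metro].append(price)
    total_ss = sum((p - grand_mean) ** 2 for p in log_prices)
    between_ss = sum(
        len(prices) * (fmean(prices) - grand_mean) ** 2
        for prices in by_metro.values()
    )
    return between_ss / total_ss


# Simulated transactions (hypothetical numbers): 50 metros, 200 sales each.
rng = random.Random(0)
log_prices, metros = [], []
for i in range(50):
    metro_effect = rng.gauss(0.0, 0.35)  # common to all sales in the metro
    for _ in range(200):
        metros.append(f"metro_{i}")
        # Local (within-metro) variation is drawn independently per sale.
        log_prices.append(12.0 + metro_effect + rng.gauss(0.0, 0.5))

share = between_metro_share(log_prices, metros)
print(f"metro share: {share:.2f}, local share: {1 - share:.2f}")
```

Whatever is left after removing the metro share is the variation that a metro-level index cannot capture by construction.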

Talking about prices in New York vs. prices in Montreal isn’t that informative compared to comparing prices in Brooklyn versus prices in Mile End.

Perhaps even more importantly, while prices rise overall in a metro area, prices can fall significantly in individual neighborhoods. The most significant example is San Francisco, where my data show a compression of the price distribution during the house price boom of 2000-2006 (yes, there’s another one now).



Zipf’s Law for City Size Distributions


A striking empirical regularity in urban economics is the linear relationship between the log rank and the log population of metro areas. Take, for instance, the list of U.S. metro areas, ordered by decreasing population. Assign rank 1 to the most populated metro area, rank 2 to the second most populated, and so on. Zipf’s “law” then states that there is a linear relationship between log rank and log population, with a slope of -1 (that is, a Zipf coefficient of 1). With a simple scatter plot using 2015 data, and the 135 largest metro areas, we get this:

This is a fairly striking statistical regularity: the estimated coefficient is 1.01, but perhaps even more surprisingly, the R squared of the regression is 98%! I don’t remember seeing a better-fitting relationship in my career as an economist. Now this coefficient of 1, found in Gabaix (1999) and oft-repeated, turns out to be a non-robust estimate. This is what we get with the full sample of 380+ metro areas.

The coefficient is quite different from 1: it is actually 0.8, with a very small standard error. So the impressive finding here is the linearity of the relationship, not so much the slope of -1. Replications of Zipf’s law for multiple countries reach the same conclusion: for more than half of countries, the Zipf coefficient is statistically different from 1. The R squared remains strikingly close to 100%, though. Zipf’s law should simply be about that fact, not the strong constraint on the slope.
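For readers who want to run the rank-size regression themselves, here is a minimal sketch. It does not use the 2015 metro data from the plots: the populations are synthetic, and the helper names `ols` and `zipf_fit` are made up for this illustration. The regression is log rank on log population, so a Zipf coefficient of 1 shows up as a slope of -1.

```python
import math
import random


def ols(x, y):
    """Simple least squares of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def zipf_fit(populations):
    """Regress log rank on log population, with rank 1 for the largest city."""
    ranked = sorted(populations, reverse=True)
    log_rank = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    log_pop = [math.log(pop) for pop in ranked]
    return ols(log_pop, log_rank)


# Case 1: an exact rank-size rule, population proportional to 1 / rank.
exact = [10_000_000 / rank for rank in range(1, 136)]
slope, _, r2 = zipf_fit(exact)
print(f"exact rule:   slope = {slope:.3f}, R squared = {r2:.3f}")

# Case 2: random draws from a Pareto distribution with tail exponent 1
# (synthetic city sizes, scaled by a hypothetical minimum of 50,000).
rng = random.Random(1)
draws = [50_000 * rng.paretovariate(1.0) for _ in range(380)]
slope, _, r2 = zipf_fit(draws)
print(f"Pareto draws: slope = {slope:.3f}, R squared = {r2:.3f}")
```

Running `zipf_fit` on the top 135 cities and then on the full list is the comparison made in the two plots above.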

So what does a deviation from Zipf’s law mean? Zipf’s law is the consequence of Gibrat’s law: all cities draw their proportional growth rates from the same distribution, with the same mean and the same variance, regardless of their size. If Gibrat’s law is not satisfied, then Zipf’s law typically won’t be either. We typically find that smaller cities have higher proportional growth variance, suggesting that they are subject to more shocks: a lack of sectoral diversification in smaller metro areas makes them more susceptible to both booms and busts.
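A rough way to check Gibrat’s law is to compare the mean and variance of growth rates across city sizes. The sketch below splits cities at the median initial population. The data are simulated, with small cities deliberately given noisier growth, and the function name `growth_by_size` and all parameter values are assumptions made for this illustration.

```python
import math
import random
from statistics import fmean, median, pvariance


def growth_by_size(pop_start, pop_end):
    """Mean and variance of log growth for cities below and above median size.

    Under Gibrat's law the two groups should have similar means and variances.
    """
    growth = [math.log(end / start) for start, end in zip(pop_start, pop_end)]
    cutoff = median(pop_start)
    small = [g for g, start in zip(growth, pop_start) if start <= cutoff]
    large = [g for g, start in zip(growth, pop_start) if start > cutoff]
    return {
        "small": (fmean(small), pvariance(small)),
        "large": (fmean(large), pvariance(large)),
    }


# Simulated cities (hypothetical): same mean growth, but a standard deviation
# of 10% below 200,000 inhabitants and 3% above.
rng = random.Random(2)
pop_start = [50_000 * rng.paretovariate(1.0) for _ in range(380)]
pop_end = [
    p * math.exp(rng.gauss(0.02, 0.10 if p < 200_000 else 0.03))
    for p in pop_start
]

for group, (mean, var) in growth_by_size(pop_start, pop_end).items():
    print(f"{group}: mean growth = {mean:.3f}, variance = {var:.4f}")
```

A clearly higher variance in the "small" group is the kind of deviation from Gibrat’s law described above.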

For more, follow my Master’s-level urban economics course at www.ouazad.com/teaching.html.

Summer School in Urban Economics 2017 in Paris — Call for Papers


for Economics PhD Students

Co-sponsored by Ecole Polytechnique (l'X), CREST, Université Paris-Saclay and the Urban Economics Association (UEA)

June 14-16, 2017
Fondation Hellénique, Cité Universitaire, Paris

This Summer School is designed for Economics PhD students with an interest in Urban Economics. In addition to students specialized in Urban Economics, we encourage participation from PhD students in Labour, Development, Public, Trade and other applied fields who want to improve their knowledge of the field. The objective of the school is twofold: to offer an intensive training program for interested PhD students, and to provide them with the opportunity to present and discuss their own ongoing research with leading researchers in the field in a relaxed and open atmosphere.

Faculty: Christian Hilber (London School of Economics), Henry Overman (London School of Economics), André de Palma (ENS Cachan), Diego Puga (CEMFI), Holger Sieg (University of Pennsylvania), Jacques-François Thisse (Université Catholique de Louvain).

Scientific Committee: Gilles Duranton (University of Pennsylvania), Matthew Kahn (University of Southern California), Isabelle Méjean (Ecole polytechnique), Amine Ouazad (Ecole polytechnique), Diego Puga (CEMFI), Jacques Thisse (Université Catholique de Louvain), Kurt Schmidheiny (Universität Basel), Harris Selod (Development Research Group at the World Bank), Elisabet Viladecans-Marsal (University of Barcelona and IEB).

Organizing Committee: Amine Ouazad (Ecole polytechnique), Weronika Leduc (Ecole polytechnique).

The program and details about the application process can be found at:

Deadline for applications: March 5, 2017.

How Zillow could improve Zestimates


Zillow’s well-known Zestimates provide an ‘estimate of the market value of a property’, in Zillow’s own words:

The Zestimate home valuation is Zillow’s estimated market value for a home, computed using a proprietary formula. It is a starting point in determining a home’s value and is not an official appraisal. The Zestimate is calculated from public and user-submitted data. Updating your home facts can help make your Zestimate more accurate.

That so-called proprietary formula is in fact fairly well-known, standard econometrics: Zillow combines information on properties (tax records have a wealth of information on the topic), information on neighborhoods, past transactions, and location-specific effects to provide a linear estimate. They also provide a confidence interval around these estimates.
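As an illustration of what such a linear estimate can look like, here is a textbook hedonic regression with one property characteristic and neighborhood effects. It is emphatically not Zillow’s formula, which is not public: the data are simulated, the names `fit_hedonic` and `predict` are made up, and the interval is a rough approximation that ignores the uncertainty in the estimated coefficients.

```python
import math
import random
from collections import defaultdict
from statistics import fmean


def fit_hedonic(sqft, neighborhood, price):
    """Fit price = neighborhood effect + beta * sqft by within-group demeaning."""
    groups = defaultdict(list)
    for i, name in enumerate(neighborhood):
        groups[name].append(i)
    mean_x = {n: fmean([sqft[i] for i in idx]) for n, idx in groups.items()}
    mean_y = {n: fmean([price[i] for i in idx]) for n, idx in groups.items()}
    # Deviations from neighborhood means remove the location effects.
    dx = [x - mean_x[n] for x, n in zip(sqft, neighborhood)]
    dy = [y - mean_y[n] for y, n in zip(price, neighborhood)]
    beta = sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)
    effects = {n: mean_y[n] - beta * mean_x[n] for n in groups}
    residuals = [b - beta * a for a, b in zip(dx, dy)]
    dof = len(price) - len(groups) - 1  # observations minus estimated parameters
    sigma = math.sqrt(sum(r * r for r in residuals) / dof)
    return beta, effects, sigma


def predict(model, sqft, neighborhood, z=1.96):
    """Point estimate with an approximate 95% interval (residual noise only)."""
    beta, effects, sigma = model
    point = effects[neighborhood] + beta * sqft
    return point, point - z * sigma, point + z * sigma


# Simulated sales (hypothetical): two neighborhoods, $150 per square foot.
rng = random.Random(3)
sqft, hood, price = [], [], []
for name, base in [("north", 200_000.0), ("south", 80_000.0)]:
    for _ in range(100):
        size = rng.uniform(600, 3000)
        sqft.append(size)
        hood.append(name)
        price.append(base + 150.0 * size + rng.gauss(0.0, 30_000.0))

model = fit_hedonic(sqft, hood, price)
point, low, high = predict(model, 1500, "north")
print(f"estimate: {point:,.0f}  interval: [{low:,.0f}, {high:,.0f}]")
```

The width of the interval is driven by the residual standard deviation, so there will be sizeable prediction error for any single house even when the model is right on average.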

Criticism of Zillow’s Zestimates abounds. Part of the criticism is based on a misunderstanding of basic statistics. The Zestimate is a forecast of the average price, obtained by minimizing the mean squared error between the estimate and the actual transaction price. There will necessarily be prediction error. For instance, this website attacks the accuracy of Zestimates based on 3 data points (!).

There are two more serious issues with the Zestimates. The first issue is that estimating prices requires long time series (that’s how statistical estimates converge), but also requires recent data: house prices can be very volatile in the short run, e.g. Staten Island suddenly became a hot market and experienced some very high upward volatility. There is a fundamental trade-off between how recent the estimates are and how much precision one can expect from the Zestimates.
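The trade-off can be illustrated with a small simulation. Here the market price level drifts over time, each period brings one noisy sale, and the current level is estimated by averaging the last few sales. Everything in this sketch is an assumption made for illustration (the drift, the noise levels, the names `window_estimate` and `rmse_by_window`); it says nothing about how Zillow actually weights past data.

```python
import random
from statistics import fmean


def window_estimate(observed, window):
    """Estimate the current price level as the mean of the last `window` sales."""
    return fmean(observed[-window:])


def rmse_by_window(windows, periods=60, drift=0.01, shock_sd=0.02,
                   noise_sd=0.15, runs=500, seed=0):
    """Root mean squared error of the window estimate, by window length."""
    rng = random.Random(seed)
    squared_error = {w: 0.0 for w in windows}
    for _ in range(runs):
        level, observed = 0.0, []
        for _ in range(periods):
            # The (log) market level moves every period...
            level += drift + rng.gauss(0.0, shock_sd)
            # ...and each sale reveals it only with idiosyncratic noise.
            observed.append(level + rng.gauss(0.0, noise_sd))
        for w in windows:
            squared_error[w] += (window_estimate(observed, w) - level) ** 2
    return {w: (squared_error[w] / runs) ** 0.5 for w in windows}


for window, rmse in rmse_by_window([1, 6, 48]).items():
    print(f"window = {window:2d} periods: RMSE = {rmse:.3f}")
```

With these parameters, a very short window is noisy and a very long window is stale, so an intermediate window has the lowest error.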

A second issue with the Zestimate is that, apart from a tax perspective, the ‘value’ of a house doesn’t really make sense and is not measurable. The buyer’s reservation price, the asking price, and the transaction price do make sense. There is no such thing as the value of a house. Certainly, the maximum price that a buyer is ready to pay for a property depends on:

  • his time horizon, i.e. his ability to wait for more offers.
  • the amount of friction on the market, i.e. the rate of arrival of offers.
  • his specific preferences for amenities; single individuals won’t have the same valuation of a house as couples with kids.
  • the characteristics of the mortgage that the buyer could get.

So at the end of the day there is no specific reason why even displaying the average or median transaction price would make sense from a buyer’s perspective, and it’s clear that Zillow is setting itself up for failure by providing an estimate that is independent of the buyer’s specific characteristics. The composition of the pool of buyers can change quite quickly over time.

Note: a few weeks after writing this blog post, I realized that the following paper in the Review of Economics and Statistics is an in-depth analysis of the points I mentioned.

Goetzmann, William, and Liang Peng. “Estimating house price indexes in the presence of seller reservation prices.” Review of Economics and Statistics 88.1 (2006): 100-112.