My workflow

In my PhD course in econometrics I used to ask students not only for their end-of-year paper but also for the full replication code, in whatever language they used. Being able to fully replicate a submitted paper’s tables is key, as most top journals now require submission of such code upon acceptance. Such data availability and replication policies became the norm in academia after the well-documented Hoxby v. Rothstein debate.

Looking at my students’ code has been revealing. Here, rather than pointing out flaws (whose work doesn’t have flaws?), I’ll describe how I process data and write models:

  • More than two-thirds of the work on an empirical paper involves merging and recoding data from different sources. Many PhD students repeat the same lines of code 10 or 20 times. While simple software like Stata makes it easy to do a merge, it is extremely bad at abstracting code. Even for most heavy projects, data processing can be done in under 100 lines, because data preparation instructions are very repetitive. Hence advice #1: use R, and abstract your code as much as possible. For instance, creating a dummy for missing values of variables ‘var1’ to ‘var100’ in 5 different files shouldn’t take 100 lines. It should take one line. There are more elaborate ways of abstracting even the most complex operations, and functional programming as in R makes abstracting fast and easy. So learn lapply, sapply and tapply, and use functions as much as possible.
  • For GIS operations, which most urban economists need, I use the R libraries rgeos, sp and rgdal as much as possible, given their flexibility. But these libraries can be slow. For large operations, use the command-line tool ogr2ogr, which runs an order of magnitude faster but is harder to handle. Write Makefiles to make your ogr2ogr-based GIS operations replicable.
  • I never start writing source code without first writing the literate (natural language) version of the code. First, write the plain English description as comments, then insert the required equations. Second, write the code in between the comment lines. Some people use R Markdown to intersperse R code and comments, but I prefer plain R code with #' comments, which can be either rendered as pdf/html or executed as R. Donald Knuth, the author of The Art of Computer Programming, has long been a proponent of literate programming, and it does help in clarifying code and in treating code as a work of art.
  • For theory models, Mathematica has been a great help. The notebook interface really is much better suited to theory work (e.g. solving a symbolic system of equations) than to numerical/empirical work. Mathematica is well suited to functional programming, so switching between R and Mathematica shouldn’t be conceptually hard (there is also a Mathematica package called RLink). When simulating a model under different sets of parameters, I have written R Shiny user interfaces to play with the model’s parameters and check the nature of the equilibria.
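
As a minimal sketch of the “one line” idea in the first bullet, here is one way to add missing-value dummies to several data sets at once. It assumes the files have already been read into a list of data frames; the toy data, the helper name add_missing_dummies and the "_missing" suffix are illustrative choices, not taken from any particular project.

```r
# Hypothetical helper: for each variable in `vars`, add a 0/1 column
# named "<var>_missing" that flags missing values.
add_missing_dummies <- function(df, vars) {
  dummies <- lapply(df[vars], function(x) as.integer(is.na(x)))
  names(dummies) <- paste0(vars, "_missing")
  cbind(df, as.data.frame(dummies))
}

# Toy stand-ins for the data sets (assumption: in a real project these
# would come from reading your own files into a list of data frames).
vars <- paste0("var", 1:3)
datasets <- list(
  a = data.frame(var1 = c(1, NA), var2 = c(NA, 2), var3 = c(3, 4)),
  b = data.frame(var1 = c(NA, NA), var2 = c(5, 6), var3 = c(7, NA))
)

# The "one line": apply the same recoding to every data set.
datasets <- lapply(datasets, add_missing_dummies, vars = vars)
```

Scaling this to ‘var1’ to ‘var100’ in five files only changes the definition of vars and the length of the list; the recoding line itself stays the same.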

Disclaimer: Stata has been and still is a great pedagogical tool. Its output is clear, and it is software that almost trains its users. It displays the right statistics for publication and provides useful warnings. But that simplicity can become a straitjacket. In addition, there is simply no GIS support in Stata (apart from the super basic spmap and shp2dta): good luck doing a regression discontinuity design at a border in Stata.




Mumbai’s Economics – 3 Key Proposals to Change the City


Fascinated. That’s the feeling an economist experiences when looking at the vista of Mumbai’s west side from the 30th floor. In the back, next to the two pointy towers, is one family’s skyscraper, the Ambani family tower. At ground level, thousands of slum residents share land next to the swanky (and securely guarded) St Regis hotel. This is Mumbai in 2016: strong, persistent spatial inequalities. Mumbai (most residents still call it Bombay) has the whole range of economic statuses. Shoppers line up to buy the latest mid-range fashion in the northern suburbs of Santa Cruz. Upper-income families browse brands at the upscale Phoenix and Palladium malls. And all buy fruits, vegetables, meat, and day-to-day clothing from street shops. Travelling through the city, either on its famous suburban train or by car (Uber does well), is a never-ending fascination at the cohabitation of upwards of 22 million different personal lives.

Talking to residents, I noted three recurring economic topics.

  • Land use efficiency

Large slum areas next to premium, high-value land built up with gleaming skyscrapers. Abandoned land worth millions of dollars in the center of town. The evidence of inefficient land use is there for all to see, but the more important task is to understand its causes.

In interviews with contacts, the most cited reason for such inefficient land use was litigation delays. Litigation over property rights typically involves two or more parties, and lawsuits do not come to a conclusion within reasonable time frames. The Indian court system is clogged up, and the Chief Justice of India is calling for action; this issue has dire consequences for urban land use patterns.

A large swath of land that could be profitably redeveloped is the mill area in “midtown” (called Lower Parel). There again, there are long-standing property rights lawsuits.

Proposal #1: Speed up lawsuits on property rights.

  • The political accountability of commissioners

Another major concern is investment in infrastructure. The situation is not as challenging as in cities such as Bangalore, whose economic activity would greatly benefit from an FDR-style investment in roads and highways: there, a few kilometers can take hours to drive, and of the metro’s planned 43 km only 8 km has been built so far.

That doesn’t mean that there are no major infrastructure needs in Mumbai. Even benign monsoon rain can lead to traffic paralysis for days. Part of the solution is to increase the length of terms and the accountability of the officials in charge of urban planning policies. Contacts mentioned that the elected mayor is not in charge of urban development; the commissioner is. And the commissioner is not elected, but appointed by the state of Maharashtra. It looks like a change in the political economy of urban management is required.

Proposal #2: Make elected mayors accountable for urban planning and development.

  • Liquidity constraints and redevelopment

Rental costs per square foot in Mumbai can easily reach London or Paris levels. Families typically own inherited real estate that can be worth hundreds of thousands of dollars, but neither redevelop nor sell their property. The quality of the housing stock can explain why households do not move (the alternatives may not be better), but it does not explain why they do not redevelop the land or invest in upgrading their unit. Part of the explanation may lie in liquidity constraints: even if upgrading can pay for itself quite quickly, the added value cannot be easily captured by the owners. There seems to be plenty of room for redevelopment/upgrading loans.

Proposal #3: Offer government-guaranteed loans repaid out of increased equity.