Weekend Op-Ed: Delhi driving restrictions actually work [so far]!

New semester, new blog-resolutions. We're back with a WWP...except that the analysis I'm talking about here hasn't actually made its way into a working paper yet. That said, the work is interesting and cool, and extremely policy-relevant, so it's worth taking a minute to discuss, I think.

For those of you not up on your India news, Delhi's air pollution is horrendous. Air pollution data suggest that Delhi's PM2.5 and PM10 concentrations are the worst in the world - the city has even less breathable air than notoriously dirty Beijing. Having spent some time in Delhi last January, I can add some of my own anecdata (my new favorite Fowlie-ism) as well: after three days of moving around the city, I was hacking up a lung trying to walk up three flights of stairs to our Airbnb. I'm certainly not the fittest environmental economist around, but a few steps don't usually give me trouble.

So I was glad to hear that Delhi has recently been undertaking some efforts to improve its air quality. I was less glad to hear the method for doing so: between January 1 and January 15, Delhi implemented a pilot driving restriction. Cars with license plates ending in odd numbers would be allowed to drive on odd-numbered dates only, while cars with plates ending in even numbers could drive on even-numbered dates only. This sounds good - cutting the number of cars on the road roughly in half each day should have a drastic effect on air quality, right? The problem is that Mexico City has had a similar rule in place for years - Hoy No Circula - and rockstar professor Lucas Davis took a look at its effects in a 2008 paper, published in the Journal of Political Economy. Unfortunately (I thought) for the Indian regulation, Lucas finds that the license-plate-based restriction had no detectable effect on air quality across a range of pollutants.

Here's Lucas' graphical evidence for Nitrogen Dioxide. If the policy had worked, we would've expected a discontinuous jump downwards at the gray vertical line. He shows similar figures for CO, NOx, Ozone, and SO2. 
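The logic behind that figure can be sketched as a toy regression-discontinuity-in-time check. Everything below is synthetic and invented for illustration - the actual paper uses high-frequency monitor data and far richer controls - but it shows what "a discontinuous jump at the vertical line" means statistically:

```python
import numpy as np

# Toy regression-discontinuity-in-time check on made-up data. The idea:
# regress pollution on a smooth time trend plus a dummy that switches on
# when the policy starts; the dummy's coefficient is the "jump" at the cutoff.
rng = np.random.default_rng(42)
days = np.arange(-100, 100).astype(float)   # days relative to policy start
post = (days >= 0).astype(float)            # 1 once the restriction is in force
no2 = 50.0 - 0.02 * days + rng.normal(0, 2, days.size)  # no true jump built in

# OLS via least squares: coefficients are [intercept, time trend, post-policy jump]
X = np.column_stack([np.ones_like(days), days, post])
beta, *_ = np.linalg.lstsq(X, no2, rcond=None)
print(f"estimated jump at the cutoff: {beta[2]:.2f}")  # should be close to zero here
```

In the Mexico City figures, the estimated jump at Hoy No Circula's start date is statistically indistinguishable from zero - that's what "no detectable effect" means in this design.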

Lucas provides an interesting possible explanation for the lack of change: he has suggestive evidence that drivers responded to the regulation by buying additional vehicles. In the Delhi case, if I have a license plate ending in 1 but really value driving on even-numbered days, I might go out and get a second car with a plate ending in 2. In light of this evidence, I was less than optimistic about the Delhi case.

So what actually happened in Delhi? New evidence from Michael Greenstone, Santosh Harish, Anant Sudarshan, and Rohini Pande suggests that the Delhi driving restriction pilot did have a meaningful effect on pollution levels - on the order of 10-13 percent! (A more detailed overview of what they did is available here). These authors use a difference-in-differences design, in which they compare Delhi to similar cities before and after the policy went into effect, doing something like this:

Effect = (Delhi - Others)_Post - (Delhi - Others)_Pre

Under the identifying assumption that Delhi and the chosen comparison cities were on parallel pollution trajectories before the program went into effect, this estimation strategy is nice because it differences out shocks common to all the cities. The money figure from this analysis shows the dip in Delhi's pollution starkly.
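In code, the DiD comparison is just this arithmetic. The pollution numbers below are invented for illustration - they are not from the study - but they show how the control cities' change gets netted out:

```python
def did_effect(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: (treated - control) post minus (treated - control) pre."""
    return (treated_post - control_post) - (treated_pre - control_pre)

# Hypothetical mean PM2.5 levels (ug/m3): Delhi falls while comparison cities stay flat.
delhi_pre, delhi_post = 300.0, 270.0
others_pre, others_post = 200.0, 200.0

effect = did_effect(delhi_pre, delhi_post, others_pre, others_post)
print(effect)  # -30.0, i.e. a 10% drop relative to Delhi's pre-period mean
```

Subtracting the comparison cities' change is what removes common shocks (weather, holidays, fuel prices); the parallel-trends assumption is what licenses attributing the remaining -30 to the policy.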

It looks like, in this brief pilot, Delhi succeeded in reducing pollution with this policy. So why is the result so different from Mexico City's? Obviously, India and Mexico are very different contexts. The channel Lucas highlighted - vehicle owners purchasing additional cars - is also something people would presumably only do once convinced the policy is permanent, so there may be further adjustment that a pilot like this doesn't pick up. (Would you go out and buy a new car because someone told you that over the next two weeks they're trying out a driving restriction? I don't think I have that kind of disposable income...) The control group also matters a lot: I'd like to see (and expect to see, if this gets turned into an actual working paper) further analysis of what's going on in the comparison cities over the same time period. The pollutants being measured differ across the two studies as well - though I doubt this affects much, given how highly correlated PM is with the pollutants measured in Lucas' paper.

In general, I'm encouraged to see both that Delhi is taking active steps to attempt to reduce air pollution, and that these steps are being evaluated in credible ways. As the authors point out in their Op-Ed, and as I've tried to highlight above, we should be cautious about extrapolating the successes of this pilot to the long run - a congestion or emissions pricing scheme might be a more effective long-run approach to tackling air pollution.

I'd also like to briefly highlight the importance of making air pollution data available for these kinds of analyses. There's a cool new initiative online called OpenAQ that's working to download administrative data from pollution monitors and make this information publicly available - and they're not the only ones. Berkeley Earth is also providing some amazing data on Chinese air quality - and rumor has it they'll be adding more locations soon. Understanding the effects of air quality on health, productivity, and welfare is increasingly important as developing-country cities grow and house millions in dirty environments - the more data that's out there to aid in this effort, the better.

Forget weatherization - how do we make evaluation work?

The New York Times put out an important article yesterday on credible policy evaluation, featuring work by the all-star team of Fowlie, Greenstone, and Wolfram. The upshot? When a program is evaluated by people with a stake in seeing that same program succeed, we have reason to worry about the conclusions. The problem is the following: if a non-independent evaluation team concludes that a program is great, it's hard to know whether it's actually great or whether existing incentives distorted the results.

The Weatherization Assistance Program is a large effort by the US government to weatherize low-income households and make them more energy efficient. The aforementioned all-star team of economists put out a paper in June (now R&R at the QJE!) using a state-of-the-art randomized controlled trial to measure the energy savings from the program. They concluded, much to the chagrin of many energy efficiency advocates, that the costs of the program are twice the energy savings benefits among Michigan households.

Cheery weatherization clipart from here.

The DOE recently released its own study of the program: over 4,000 pages spread across 36 documents. If you're cynical like me, you're perhaps not shocked that the DOE's report finds an overall program benefit-cost ratio of 4:1. This takes into account non-energy benefits, such as health, that Meredith, Michael, and Catherine did not directly include in their original study (though, to be fair, they did look at indoor temperature set-points and found no evidence of changes - suggesting little scope for large health effects from reduced exposure to extreme temperatures in their sample).

What even a cynical reader might be surprised by is the magnitude of the problems with DOE's reports. From the Energy Institute blog (I also highly recommend that you read the accompanying deep dive into thermal-stress-related benefits):

We have spent many hours poring over these opaque documents. Our judgment is that many of the DOE’s conclusions are based on dubious assumptions, invalid extrapolations, the invention of a new formula to measure benefits that does not produce meaningful results, and no effort to evaluate statistical significance. Using the DOE’s findings, we show below that costs exceed energy savings by a significant margin. We also document major problems with the valuation of non-energy benefits, which comprise the vast majority of estimated program benefits.

Overall, the poor quality of the DOE’s analysis fails to provide a credible basis for the conclusion that the benefits of energy efficiency investments conducted under WAP substantially outweigh the costs. This blog summarizes our assessment of the DOE’s analysis for a general audience. We provide a more technical discussion, using the important example of benefits relating to thermal stress, here.
— Energy Institute at Haas blog, October 6, 2015

Eduardo Porter, author of the excellent New York Times article described above, also conducted a Q&A with Bruce Tonn, head of the DOE evaluation team. If you ask me, this is almost more damning than the original article - but I'll leave you to judge for yourself.

Full disclosure: I provided research assistance on the economists' response, helping to read through the thousands of pages of documents from the DOE. So maybe I'm a less-than-impartial commentator. But I will say this: I would have been thrilled to see a DOE report that, using modern empirical techniques and direct measurements, provided definitive evidence of large, real health benefits from WAP. I'm disappointed that the evidence the DOE did provide appears unconvincing and flawed. Getting climate policy right - and, furthermore, getting low-income assistance right when governments have limited budgets - demands honest, sometimes hard-to-stomach, independent evaluation. Getting these policies right now will pay off in the long run - as will moving toward an institutional culture of proper ex post evaluation.

PPS: Not-at-all-humble brag moment - guess who the "graduate student" mentioned in the NYT article is? Hopefully not my last NYTimes mention...but my own work has a long way to go before being ready for anything like that, so I should stop writing this blog and get to writing a job market paper.

Disclaimer: I wrote this blog post without the supervision or knowledge of Meredith, Michael, and Catherine. I certainly do not speak for them, nor for the E2e project, EPIC, Energy Institute, etc, etc, etc.