‘All models are wrong, but some models are useful’

What with the comments on Mark Thoma’s posts of Paul Krugman’s earlier attempts to explain economic methodology to non-economists, and the apparent publication bias in favour of significant results in political science and in economics, George Box’s dictum is worth repeating, both for those who are too sceptical of models and for those who aren’t sceptical enough.

‘All models are wrong,’

  • Since models are by their very nature approximations to an unfathomably complicated reality, they are of course literally false. Unfortunately, this rather basic point is hard to remember if you’ve been trained in classical (frequentist) techniques of statistical inference, in which one tests to see if the null hypothesis – which is invariably expressed as a special case of a model – is ‘true’. If we know a priori that the hypothesis is false, what’s the point of testing to see if it is true?
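One way to see the problem is a small simulation. In the sketch below (a minimal, hypothetical illustration: a one-sample two-sided z-test with known unit variance, and a null of zero mean that is slightly false because the true mean is 0.05), the rejection rate is modest at small sample sizes but climbs toward one as the sample grows, even though nothing about the world has changed:

```python
import math
import random

def p_value(sample, mu0=0.0):
    # Two-sided z-test p-value, assuming known unit variance.
    n = len(sample)
    z = (sum(sample) / n - mu0) * math.sqrt(n)
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - cdf)

def rejection_rate(n, reps=200, true_mu=0.05, seed=0):
    # Fraction of simulated samples in which the (false) null mu=0
    # is rejected at the 5% level.
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mu, 1.0) for _ in range(n)]
        if p_value(sample) < 0.05:
            rejections += 1
    return rejections / reps

print(rejection_rate(100))    # small n: the false null usually survives
print(rejection_rate(10000))  # large n: the same false null is almost always rejected
```

Since every point null is at best an approximation, a large enough sample will reject it; rejection by itself tells you little that you didn’t already know.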

‘but some models are useful’

  • Models are pretty much the only instruments we have for understanding complex phenomena. Anyone who claims to be able to understand the economy, the climate or the universe without making any simplifying assumptions is fooling only themselves.
  • The usefulness of a model depends on the task at hand. A static model with two goods might be useful in developing an understanding of some aspects of international trade, but it won’t help you get a handle on consumption and savings decisions.

6 comments

  1. Ryan Webb

    I think you may have it backwards. Doesn’t frequentist statistics only test to see if the null hypothesis can be rejected as false? Making no statement about the actual truth of the model? Whereas a Bayesian would be prepared to make a claim about the truthfulness of his statistical estimation, ex post?

  2. Ryan Webb

    Ha… I just realized that this isn’t your dictum, but George Box’s 🙂 But still, the question remains…

  3. Unknown

    The null hypothesis is generally presented as ‘The data were generated according to the following process.’ We either reject it or not. Since we know with probability one that the data were not generated by that process, what’s the point of testing the null? And if the alternative isn’t well-posed, what’s the point of rejecting it?
    A Bayesian will assign a probability to a hypothesis, but only if there’s a well-specified alternative. If only one model can be used, he’d choose the one that minimises expected posterior loss. But first, he’d ask why he has to choose only one in the first place.
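The model-comparison idea in this reply can be sketched in a few lines (a hypothetical illustration, not code from the comment: three candidate Gaussian models with known unit variance and equal prior weights, where instead of picking a single ‘winner’ the posterior probabilities are kept and averaged over):

```python
import math
import random

def log_likelihood(sample, mu):
    # Gaussian log-likelihood with known unit variance.
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2
               for x in sample)

def posterior_probs(sample, mus, prior=None):
    # Posterior probability of each candidate model; equal priors by default.
    if prior is None:
        prior = [1.0 / len(mus)] * len(mus)
    logs = [math.log(p) + log_likelihood(sample, mu)
            for p, mu in zip(prior, mus)]
    m = max(logs)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logs]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(1)
data = [rng.gauss(0.3, 1.0) for _ in range(50)]
mus = [0.0, 0.5, 1.0]
probs = posterior_probs(data, mus)
# Model-averaged estimate of the mean, rather than committing to one model.
avg_mu = sum(p * mu for p, mu in zip(probs, mus))
print(probs, avg_mu)
```

Nothing here forces a choice: the posterior weights can simply be carried forward, which is the point of the closing question.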

  4. Ryan Webb

    I see your argument now… I’m grasping at straws here, but while we know that the model is false with probability one (assuming an estimate can take on any value on the real line, right?), we still have to test, because if we can reject the null we can rule out a possible model. This is something you don’t know a priori. Essentially it’s a ruling-out process, not a ruling-‘in’ process, but it still must be undertaken.

  5. Unknown

    There’s little point in ruling out a model that you already know to be false if you don’t have a better alternative (which is also false, but perhaps a better approximation) at hand.

  6. justme

    ‘If we know a priori that the hypothesis is false, what’s the point of testing to see if it is true?’
    We don’t know it’s false.