"Behavioural Science", Data Mania, and Academic Knowledge

As readers of this blog will know, I have various mild grievances with trends in economics and academia more broadly. One of these is the frenetic obsession with data and methods. One aspect of this data obsession that I want to write about here is the proliferation of "behavioural scientists" (and, similarly, "data scientists").

Papers get published in top journals because they have a million observations, even if what those observations are used for is kind of banal. When you ask young economists what they work on, they often tell you their data and their methods rather than their subject area: "I do structural equation modelling on administrative data" as opposed to "I'm interested in how to reduce childhood obesity". They also seem pumped up all the time because their work style is to get moar data and run moar regressions. This methodology is very time- rather than thought-intensive, so keeping yourself pepped is critical - only energiser bunnies can stay at their desk for 12 hours a day, 6 days a week.

This focus on methods and data rather than questions is one reason why everyone and their dog jumped on the Corona bandwagon. Such bandwagoning is bad because noise from methods experts jumping into new subject areas crowds out signal from people who do both methods and subject matter in those areas.

I've noticed that the most frenetic data data data people often fly under the label of "behavioural"; either "behavioural economist", "behavioural scientist", or "behavioural policy". They might be a bit more specialised, like "behavioural health". What's weird about this is that I don't see any behaviourism in what they do. "Behavioural" refers, in the first instance, to two bodies of theory. The first is behavioural psychology in the Skinnerian tradition. The emphasis here was on measuring observable, objective behaviour and making inferences about psychology from those behaviours. Pavlov's dog is an example. The second is behavioural economics, which emerges out of Kahneman and Tversky's work (and that of Thaler and others) on the effect of cognitive biases on "rational" decision making.

Contemporary behavioural science pays at best lip service to these roots. As far as I can tell, "behavioural" nowadays just means that you observe what people are doing in data. Everything is led by data, and theory is seen as a trap. Behavioural people are then understandably frenetic about data because it is their lifeblood, and they are obsessed with ever more frequent and larger samples because they can't get external validity any other way.

In terms of economics, this methodology amounts to a rejection of both the rational choice approach of theory building without ever going to data, and the empirical turn approach of testing well-developed theories against data (i.e. the correct approach).

I struggle to see how you can build coherent knowledge when theorising is post-hoc and very ad hoc. In particular, I struggle to see how you ever get external validity.

Consider a paper I saw recently testing whether putting signatures at the top or bottom of a document changes people's signing behaviour. Why would it? On the one hand it doesn't matter why, as long as you observe some significant difference. If what you want is for people to sign more (or less), you can exploit that observation. But you immediately run into external validity problems. Was it something about this particular document that had the effect? Something about the sample? Something about the brain? And then, in order to eliminate these various hypotheses, you need to build some theory to test them.

The response of behavioural people is that you can build very small theories and test them incrementally. This again leads to being frenetic about data. You can't get published in a good journal nowadays without a sample size of 1000+ MTurks from diverse backgrounds. So you run one experiment at great cost, move the knowledge needle 0.1%, then another, another, and another. And you spend your life poring over descriptive statistics, building a model that can barely be transferred to any other context. Compare this to something like evolution as an idea, which is colossally powerful across a huge number of contexts. It gives birth to profound new hypotheses with ease and elegance. OK, maybe evolution is too much to expect from most academics, but something like Baumeister's "willpower is like a muscle" theory isn't, and it's way better than behavioural data mining.

Being theory-avoidant also makes you reluctant to think your hypotheses through in any sophisticated way, which leads to lazy theorising that leans on established concepts. This is the most straightforward explanation for "bias bias", where "biases" are invoked to explain every aspect of human behaviour. Having a think about the complex evolutionary, environmental, and cognitive factors that might give rise to seemingly irrational behaviour requires philosophical skills and wide reading for which there is no time when you need to absorb a suite of 40 empirical methods and spend 8 hours a day collecting, cleaning, and analysing data.

Data-first methods are quite common in psychology (e.g. in subjective well-being studies), but there is at least some allusion to construct validation. A lot of behavioural research doesn't even have a construct it is interested in, perhaps because so much of it is driven by impact evaluation settings. I can appreciate that to assess whether policy A shifted variable X, and whether that in turn changed behaviour Y, you don't need a theory. But again, if you want to understand why X shifted Y then you do, and you're going to need that generalisable theory to port policy A cost-effectively to other settings where it might be useful.

The notion of "data science" is oxymoronic because scientific method involves testing hypotheses. If you draw your hypotheses from the data then you can't get around the induction problem. Simplistically, you might observe a stochastic phenomenon (i.e. luck) and mistake it for a deterministic process because you have no theoretical architecture. This method is fine in the early stages of a research agenda (e.g. in subjective well-being studies), but it's not appropriate for a mature science.
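The multiple-comparisons version of this trap is easy to sketch. The following toy simulation (my illustration, not anything from a real study) trawls pure noise for "significant" correlates of an outcome; with enough candidate predictors and no theory to constrain the search, some will clear a conventional significance threshold by luck alone.

```python
import random

random.seed(0)

# Pure-noise "dataset": 20 candidate predictors, none actually related
# to the outcome.
n_obs, n_predictors = 100, 20
outcome = [random.gauss(0, 1) for _ in range(n_obs)]
predictors = [[random.gauss(0, 1) for _ in range(n_obs)]
              for _ in range(n_predictors)]

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Data-first "discovery": keep every predictor whose |r| clears a naive
# per-test threshold (|r| > 0.2 is roughly p < 0.05 at n = 100).
discoveries = [i for i, p in enumerate(predictors)
               if abs(correlation(p, outcome)) > 0.2]
print(f"'Significant' predictors found in pure noise: {len(discoveries)}")
```

Each individual test behaves fine, but screening 20 of them means the expected number of false "discoveries" is around one per run; without a theoretical architecture there is nothing to tell you which, if any, reflect a real process.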

One last aspect of all this that I find pernicious is that data freneticism ironically obstructs the collection of richer data, because people stop at the data that is available. Justifying the collection of new, carefully worded questions in something like the European Social Survey requires a bucketload of theorising and empirical evidence of the need for change. Who has the time!? I need to publish now, and publishing requires data now, and look, this is the data that we have, so let's just go with that.

What's weird to me about a lot of this is that these arguments were levelled at Skinnerian behavioural psychology. None of this is new; it has just re-emerged because we have bigger data sets now.

Comments

  1. I've seen people calling themselves "behavioural scientists" on Linkedin. As you know, I'm all about what you call "the rational choice approach of theory building without ever going to data". Any idea what my Linkedin tag-line can be?

  2. this is such a zinger "The notion of "data science" is oxymoronic because scientific method involves testing hypotheses. If you draw your hypotheses from the data then you can't get around the induction problem"

  3. "Assuming reality doesn't exist since " X-)

    "Applied mathematician"

    "Economic Dinosaur"

    "Neoclassical Economist"

    "Economist"
