The lead-up to any election these days is an orgy of commentary
on ‘the latest polling numbers’. This is facile. Such numbers don’t tell you anything, can’t be used for causal analysis
and cannot predict the outcome of the election in all but the most obvious of
cases. What they do provide is some convenient noise for pundits to fit their
preferred narrative to and give it a scientific gloss. My main reason for
bringing this up is that polling constitutes a very clear example of the
increasingly common phenomenon of people applying statistical techniques to
questions that don’t meet the basic criteria for statistical inquiry. This worries
me deeply.
The basic idea of a predictive poll is that if you can get
a representative sample of the electorate
then you can predict the outcome of an election. So far so good. The first
problem for polling types is that it's near impossible to get a representative sample, and you can't really ever know whether you've got one or not. Calling round people's landlines certainly doesn't get you one, because most people under 50 living in the city now only have mobiles. Calling around during the day certainly doesn't either. Calling 1,000 people in an electorate of 150,000 (the size of the average poll and the average population of an Australian electorate) means you've sampled roughly 0.7% of the population, which is not going to get you much of a representative sample either. And if 78% of the people you call hang up, you can bet your sample isn't representative, because there is selection involved in who stays on the line.
In Australia you can at least rely on the fact that pretty much everyone you poll will vote, because voting is compulsory. In other countries, notably America, you've also got to contend with
turnout problems. How do you know that the people you polled will actually
vote? One way this is done is by using exit polling data from the previous
election. But exit polls are themselves a non-representative sample. And we're in a time-series context here: those numbers from the previous election might be wildly different this time around.
Combining
erroneous samples doesn’t make them less erroneous. Your measurement errors are
just multiplying. The idea that Nate Silver's team 'unskews' polls (i.e. fits its own biases to the data before presenting the result as 'clean' data) is a bad joke. It's just rough estimation based on rough estimation based on rough estimation. You're not drilling down to the truth here; you're floating slowly away from it towards pure punditry.
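A quick sketch of the point, with invented numbers: if every poll shares the same skew, averaging twenty of them gives you a beautifully precise estimate of the wrong number.

```python
import random
import statistics

random.seed(1)

# Toy illustration: combining erroneous samples doesn't make them less
# erroneous. Assume (hypothetically) every pollster draws from the same
# skewed pool, under-counting one side by 3 points.
true_p, shared_bias, n = 0.50, -0.03, 1000

def one_poll() -> float:
    p = true_p + shared_bias  # every poll hits the same skewed population
    return sum(random.random() < p for _ in range(n)) / n

polls = [one_poll() for _ in range(20)]
print(f"true support:        {true_p:.3f}")
print(f"average of 20 polls: {statistics.mean(polls):.3f}")   # ~0.470: precise, wrong
print(f"spread across polls: {statistics.stdev(polls):.3f}")  # tiny: false confidence
```

Averaging kills the random noise but leaves the shared error untouched, so the aggregate looks more trustworthy while being exactly as wrong.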
The great innovation of groups like 538 is to combine data from multiple polls to form a clearer picture of the true distribution of voting behaviour within an electorate. This is bullshit wrapped in Bayesian updating. Such updating relies on the distribution whose true nature you're trying to ascertain staying the same
between samples. If you’ve got a bag with an unknown quantity of white and
black balls and you draw 5 out, record a distribution, then take another 5 out
and update your presumption about the distribution on the basis of your now 10-ball sample, you're getting closer to knowing the true distribution of the
colour of the balls in the bag because the colour of those balls isn’t
changing. Not so with voting behaviour. A poll before the FBI reopens its
investigations into Clinton and a poll afterwards are polls of two
fundamentally different populations. You can’t aggregate them.
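Here's the contrast as a toy Beta-Binomial simulation (all numbers invented): pooling works on the stationary bag, but on a population that shifts mid-stream the pooled estimate keeps tracking an electorate that no longer exists.

```python
import random

random.seed(2)

def run_polls(p_by_wave, n=1000):
    """Naive Bayesian aggregation: Beta(1, 1) prior on the 'white ball'
    share, updated after each wave of 1,000 draws."""
    a, b = 1, 1
    for wave, p in enumerate(p_by_wave, start=1):
        hits = sum(random.random() < p for _ in range(n))
        a, b = a + hits, b + (n - hits)
        print(f"  wave {wave}: true p = {p:.2f}, posterior mean = {a / (a + b):.3f}")

# Bag of balls: the distribution never changes, so pooling samples works.
print("stationary bag:")
run_polls([0.40, 0.40, 0.40, 0.40])

# Electorate: the distribution shifts mid-stream (an FBI-letter-style
# shock), so the pooled estimate describes neither population.
print("shifting electorate:")
run_polls([0.48, 0.48, 0.41, 0.41])
```

In the second run the posterior mean settles near 0.445, splitting the difference between two populations rather than describing either one.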
The other notion I find specious in polling discourse is
the idea that 538 or anyone else has a ‘probabilistic model’. Again, this is
bullshit. You can’t have a probabilistic model in a practical sense if there is
only one draw from the distribution of the event ‘US Election 2016’ because you
can’t test the probabilities you have come up with without multiple draws. If
Nate Silver’s model says that Trump has a 30% chance of winning and Huffpost’s
says it’s actually 0.3% and then Trump wins the election, you’re no closer to
knowing whether Trump’s chances were 30% or 0.3%. You can’t know that unless
you take far more draws from the distribution, but you only get to make one draw!
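A back-of-the-envelope calculation makes the point concrete: to separate those two forecasts with even a crude frequency check, you'd need dozens of independent re-runs of the same election.

```python
import math

# How many independent re-runs of 'US Election 2016' would a simple
# frequency check need to separate a 30% forecast from a 0.3% one?
# (Crude normal-approximation confidence interval; figures from the post.)
p_a, p_b = 0.30, 0.003
half_gap = (p_a - p_b) / 2

# Standard error of a sample proportion is sqrt(p(1-p)/n); require the
# 95% interval around the larger p to be narrower than half the gap.
n = (1.96 * math.sqrt(p_a * (1 - p_a)) / half_gap) ** 2
print(f"independent elections needed: about {math.ceil(n)}")  # about 37
```

Dozens of 2016s; you get one.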
Some of my friends talk about living on 538 trying to ‘make sense’ of the election in the lead up
to Trump’s victory. Why? There’s nothing to be learnt there. Those numbers don’t
tell you why people are voting a
certain way. When the numbers change, you can't understand why they have changed. Where is the understanding here? You've got
no causation. All these numbers do is encourage you to fit narratives, which is
exactly why nobody saw Trump’s victory coming.
Where this all really goes to shit is when it reaches the
media. Contemporary media commentators don’t actually have expertise in
anything of substance. They are either retired hacks with biases out the wazoo
or people with an expertise in communication. Neither makes for deep analysis.
Poll numbers give such people fabulous fodder for speculation. For example, ‘Hillary’s
poll numbers took a dive this week – it must be because everyone hates that she
called those Trump supporters a basket of deplorables’. Or maybe it’s just
because you got a different sample this time! Such comments are particularly
hilarious when the change in the poll numbers is smaller than the poll's margin of error (meaning the results of the new poll are statistically indistinguishable from those of the old poll).
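For scale, the textbook margin of error for a 1,000-person poll, assuming ideal simple random sampling (which, as argued above, real polls don't achieve, so the true error is larger still), is already about three points:

```python
import math

# 95% margin of error for a sample proportion: 1.96 * sqrt(p(1-p)/n),
# worst case at p = 0.5, for a poll of 1,000 people.
n, p = 1000, 0.50
moe = 1.96 * math.sqrt(p * (1 - p) / n)
print(f"95% margin of error: +/-{moe:.1%}")  # about +/-3.1%

# So a headline swing from, say, 48% to 46% between two such polls sits
# comfortably inside the noise of two independent 1,000-person samples.
swing = 0.48 - 0.46
print(f"swing of {swing:.0%} within noise: {swing < moe}")  # True
```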
The love media types have for endlessly discussing poll results in facile realpolitik terms has a lot to do with why there is no policy analysis anywhere anymore.
When political parties do polls, they aren't just asking 'Democrat or Republican?'; they also collect qualitative information. Parties try to understand
what messages are cutting through, what policies are really at the forefront of
people’s minds, what would change their vote and what they like/dislike about a
candidate etc. This is still far from great data, but it’s something you can
actually conduct an analysis on.
I don’t understand why people have such a desperate need
for information leading into elections. It’s not like you can control the
result. It’s not like you’ll discover something there that won’t be so much
clearer two years after the election once all the data is actually out from the
authorities. I understand why people want something with a glint of science to
it, but they’re looking in the wrong place with predictive polling and
predictive modelling more generally. Probability theory breaks down really fast
in a time-series context and in the context of a dynamic system. Voting is both
of those problems put together. It’s Arkham Asylum for probability.
Comment from Tom: I think you've overstated your case a bit. Think about a situation like the presidential primaries or the French election, where some candidates are polling at 30% and others at 1%. That's highly relevant to any political actor who might be interested in who should get media coverage or how to vote tactically. The polls obviously have unpredictable errors that can sometimes be 5% or so, and there's no doubt that the media makes a circus out of them, but they're still both predictive and important.
Reply: Thanks Tom. Sure, but I do say this in the second sentence: '...cannot predict the outcome of the election in all but *the most obvious of cases*.' When someone is polling atrociously it's obvious. But I would say that the sentiment is then palpable in the streets: you don't need a poll to tell you that Pauline Hanson won't ever be Australian Prime Minister. I also think it's impossible to know the size of those errors you mentioned, and when you're aggregating and unskewing and aggregating again and massaging and adjusting and speculating etc. etc., I think it all turns into a hot mess of garbage.
Comment: I've no expertise in statistics or polling, but I've always doubted the reliability of any survey based on self-reporting. People are so complex and dynamic in their opinions/feelings that the results of any survey will change from day to day and even faster.
As so many psychological findings are based on surveys it's no wonder that psychologists/psychiatrists are often found wanting in their attempts to handle the problems of their patients.
Pollsters may soon suffer the same lack of respect if their predictive powers continue to disappoint.