The Dim-Post

May 26, 2012

More poll chart porn

Filed under: polls — danylmc @ 9:38 am

Peter Green put together an R script that generates an svg file showing historical polling trends. It estimates polling bias and factors it into the trend-line. Unfortunately WordPress won’t let me post svg files, so here’s a screen-capture of the result below. If you download the actual file and open it in a browser that isn’t Internet Explorer you can click on individual polls and highlight trends from different pollsters. The yellow lines below the x axis are significant political events. If this was an svg file you could click on them and find out what they were. Peter’s explanation of the statistics below the jump:

“Generalised” means that a GAM can use non-Gaussian error terms (e.g. Poisson, Binomial).  I haven’t used that feature, so it’s really just an “AM”.  For Gaussian models, least-squares and maximum-likelihood coincide.
“Additive” means that the independent variable is converted into a sum of splines.  So instead of fitting a straight line as in OLS, we’re fitting a penalised curve.
The bias correction might be easiest to understand by analogy to the OLS case — it’s basically the same thing, but with a straight line instead of a curve.  We have a “factor” variable with n+1 levels, one for each polling company plus one for elections.  This can be converted into n dummy variables: “continuous” covariates that happen to only take the values 0 and 1.
Imagine we had three companies A, B, C.  The linear model would be:
y[i] = b0 + b1 * x[i] + b2 * isA[i] + b3 * isB[i] + b4 * isC[i] + e[i].
Note that for an election, isA=0, isB=0, isC=0. We can rewrite the model as:
y[i] = (intercept) + b1 * x[i] + e[i],
(intercept) = b0 * isElection[i] + (b0 + b2) * isA[i] + (b0 + b3) * isB[i] + (b0 + b4) * isC[i]
i.e. we’re fitting a separate intercept for each company, but the same slope.  For OLS this means we get 4 parallel lines, for GAM it means we get a bunch of parallel curves.  The curve that’s plotted corresponds to elections.  The percentages on the right are simply the endpoints of the curves for each party.
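Peter’s script is in R, but the dummy-variable bookkeeping above can be checked in a few lines of any language. Here is a minimal Python sketch on invented, noise-free data — companies A, B and C with offsets of 3, 1 and -1 from the election level — showing that a shared-slope least-squares fit recovers the per-company intercepts exactly:

```python
import numpy as np

# Toy data: dates (x), poll results (y), and which source produced each
# reading. Election readings ("E") get no dummy; companies A, B, C each
# get one. All numbers here are invented purely for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
source = np.array(["E", "A", "B", "C", "A", "B", "C", "E"])
true_slope = 0.5
level = {"E": 40.0, "A": 43.0, "B": 41.0, "C": 39.0}  # A biased +3, B +1, C -1
y = np.array([level[s] for s in source]) + true_slope * x

# Design matrix: intercept, shared slope, and one dummy per company,
# matching y[i] = b0 + b1*x[i] + b2*isA[i] + b3*isB[i] + b4*isC[i].
X = np.column_stack([
    np.ones_like(x),                 # b0: intercept (the election level)
    x,                               # b1: slope, shared by all sources
    (source == "A").astype(float),   # b2: company A's offset from elections
    (source == "B").astype(float),   # b3: company B's offset
    (source == "C").astype(float),   # b4: company C's offset
])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
# b[0] ≈ 40 (election intercept), b[2] ≈ 3, b[3] ≈ 1, b[4] ≈ -1
```

With real polls the offsets are estimated alongside a penalised spline rather than a straight line, but the interpretation of the dummy coefficients is the same.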

opt_ewt weights the election results.  A weight of 1 means that the gam assumes that elections have the same random variation as polls.  A large value for opt_ewt (say 1000) means that the election results have a very small estimated sampling error, and so the curve will pass through them (approximately).
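The role of opt_ewt is easiest to see in a deliberately stripped-down case: fitting a single constant level (instead of a curve) to invented data where six polls all read 45 but the election comes in at 42. The weights below mimic opt_ewt = 1 versus opt_ewt = 1000; none of these numbers come from the real dataset:

```python
import numpy as np

# Made-up data: six polls with a shared upward bias, plus one election.
y = np.array([45.0, 45.0, 45.0, 45.0, 45.0, 45.0, 42.0])
is_election = np.array([0, 0, 0, 0, 0, 0, 1], dtype=bool)

def fit_level(weights):
    # Weighted least squares for a constant level (the simplest possible
    # "trend"): the minimiser is just the weighted mean.
    return np.average(y, weights=weights)

w_equal = np.ones_like(y)                     # opt_ewt = 1
w_heavy = np.where(is_election, 1000.0, 1.0)  # opt_ewt = 1000

fit_level(w_equal)  # ~44.6: the election barely moves the estimate
fit_level(w_heavy)  # ~42.0: the estimate is pinned (approximately) to it
```

The same logic applies to the spline fit: a large election weight corresponds to a very small assumed sampling error for election observations, so the fitted curve is pulled through them.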


41 Comments »

  1. I think you have a big issue here, in that your chart suggests that over 95% of all polls about National over the past six years were overestimates, while over 95% of polls about New Zealand First were underestimates. Speaking personally, I do not think that is true, and I do not think you have the data to back up that claim (the N you are relying on here is basically three). Which means you are more assuming those biases than demonstrating them.

    Comment by Rob Salmond — May 26, 2012 @ 9:46 am

  2. Hi Rob – What’s interesting is that if you cut off the line a day before the 2011 election (i.e. n = 2) it’s still very accurate in predicting the actual election results – more accurate than even aggregates of the polls.

    Comment by danylmc — May 26, 2012 @ 9:56 am

  3. Interesting, though, that the Nat election results seem to be at the lower end of the range of polls taken on the preceding month or so (and something similar for the Greens, but harder to see), whereas Labour election results look to be in the middle of the distribution of pre-election polls.

    Agree you can’t say much on the basis of three elections, though. Is there more data on polls preceding earlier elections?

    Comment by Dr Foster — May 26, 2012 @ 10:09 am

  4. Noob interpretation error: why does the Nat trend line pass mainly through the lowest data points, and given that it does, is it an accurate reflection of their popularity as expressed in the polls?

    L

    Comment by Lew (@LewStoddart) — May 26, 2012 @ 10:11 am

  5. Sigh. Everyone already got in on this one. Note to self: hit refresh before commenting.

    L

    Comment by Lew (@LewStoddart) — May 26, 2012 @ 10:12 am

  6. @Danyl: That is an interesting finding, despite the N of 2. But I struggle to understand how any form of poll / election averaging procedure is accurate when estimating, as at today, that National is on 41.4%, lower than any poll or election result they have scored in the last 3.5 years; and that NZ First is actually on 7.8% even though they haven’t scored that high in any poll or election since mid-2008.

    Comment by Rob Salmond — May 26, 2012 @ 10:43 am

  7. Rob,

    Pretty hard to argue against data. Perhaps some evidence-based speculation please: Why do political polls consistently overestimate the number of voters voting for National, when compared to actual election results? Some sort of selection bias in the polls compared to the general electorate?

    FM

    Comment by Fooman — May 26, 2012 @ 10:45 am

  8. If the question was “do general elections tend to (under/over)estimate opinion polling results?”, we’d have N=3. For the question “do polling results (over/under)estimate general election results?”, it’s more like N=300.

    The trend-line is meant to predict election results, rather than measure popularity. It could be that National are much more popular when people aren’t paying attention, and then tend to drop just before an election; or it could be that the polls consistently overestimate their popularity. Either way, if we’re interested in who’s going to govern next, we need to make the adjustment.

    Comment by pete — May 26, 2012 @ 10:52 am

  9. Epiphany here:

    This is analogous to my power bill – the power company gives an estimate every 2 months (based on prior history), and an actual reading on the third month. They want to stick in a smart meter, to enable accurate day-by-day billing (won’t happen until cell coverage improves). There must be some way to “smart meter” polls rather than random landline calling…something that _everyone_ does regardless of socio-economic/cultural status…

    Wireless part-coded flush buttons works for me…

    FM

    Comment by Fooman — May 26, 2012 @ 10:53 am

  10. er, party-coded

    FM

    Comment by Fooman — May 26, 2012 @ 10:54 am

  11. Ok, I see — you’re modelling an observed variance between opinion polls and election results. That seems … contentious. I wish we had more than 3 elections worth of data.

    L

    Comment by Lew (@LewStoddart) — May 26, 2012 @ 11:07 am

  12. To add: It will make Bomber Bradbury and Chris Trotter very happy, though.

    L

    Comment by Lew (@LewStoddart) — May 26, 2012 @ 11:09 am

  13. Firstly the graph is much more attractive than mine, so kudos for that.

    I think a constant correction for each party for each pollster (which I think is what Peter Green is doing, though I’d have to go through the code to be sure) is inappropriate. To give an example: National polled over 50% for most of the last cycle, but everybody knew they wouldn’t get 50% in the election. The two competing explanations are:

    (i) Regression to the mean: runaway leaders tend to fall back a bit as the election approaches. (This has been widely observed internationally, and is built into e.g. Nate Silver’s prediction model.)
    (ii) The pollsters inherently overestimate National’s support and will continue doing so for the foreseeable future.

    If (ii) is right, a constant correction is appropriate; if (i) is right then it isn’t. I think (i) is right. This is unprovable, but I might as well slip in the totally unfair ad hominem argument that a number of the people who believe in (ii) are the same people who used to tell you how great the Horizon Poll was.

    Comment by bradluen — May 26, 2012 @ 11:18 am

  14. If anyone wants to bump the sample size up to five:

    Last Digipoll before the 2002 election:

    http://www.nzherald.co.nz/election-2002/news/article.cfm?c_id=774&objectid=2197270

    Same for 1999 (with other polls mentioned in passing):

    http://www.nzherald.co.nz/election-1999/news/article.cfm?c_id=717&objectid=103952

    Comment by bradluen — May 26, 2012 @ 11:32 am

  15. How are the significant political events selected? There seems to be a bias towards the most recent, and gaps of 9 or 10 months in a couple of places.

    Comment by bka — May 26, 2012 @ 11:58 am

  16. Fooman @9 don’t be a foo: if I had a smartpoll meter in my house, then “they” would always know what my political leanings are. I’d have to wrap the meter in tinfoil, just as I do my head each night before I lay down.

    Mind you, those smartpoll meters would be great for referendums. Is this what the Swiss use?

    Comment by Clunking Fist — May 26, 2012 @ 12:04 pm

  17. Constant adjustment and regression to the mean are probably indistinguishable with this data.

    It might help to think of the constant adjustment as a local approximation to the regression to the mean model.
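    A made-up numeric sketch of that point (Python; the long-run mean, shrinkage factor and offset are all invented, with the offset chosen so the two models agree at a poll reading of 52):

```python
# Two rival explanations for a party polling ~52 but electing lower:
#   (i)  regression to the mean:  result = mean + k * (poll - mean)
#   (ii) constant pollster bias:  result = poll - c
# Invented numbers: long-run mean 45, shrinkage k = 0.7, and c chosen
# so that the two models coincide at poll = 52.
mean, k = 45.0, 0.7
c = (1 - k) * (52.0 - mean)  # ≈ 2.1

def shrink(poll): return mean + k * (poll - mean)
def offset(poll): return poll - c

# Within the narrow band the polls actually span, the models nearly agree...
diffs = [round(shrink(p) - offset(p), 2) for p in (51.0, 52.0, 53.0)]
# ...but they come apart for poll values well outside that band.
gap = shrink(40.0) - offset(40.0)  # ~3.6
```

    With only three elections, and polls confined to a narrow range, both models fit the data about equally well; distinguishing them would need poll readings far from the observed range.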

    Comment by pete — May 26, 2012 @ 12:06 pm

  18. It will make Bomber Bradbury and Chris Trotter very happy, though.

    I sometimes think that’s why I was put on this earth.

    If anyone wants to bump the sample size up to five:

    I actually have historic polling data around someplace. If I find the time I’ll create the wiki pages and adjust the script.

    Comment by danylmc — May 26, 2012 @ 12:38 pm

  19. @bka

    How are the significant political events selected? There seems to be a bias towards the most recent, and gaps of 9 or 10 months in a couple of places.

    Events come from the Wikipedia polling pages.

    Comment by pete — May 26, 2012 @ 2:41 pm

  20. Code if anyone wants it:

    Comment by pete — May 26, 2012 @ 2:59 pm

  21. a last comment, but these results remind me of some good analysis by this guy from the aussie elections a few years back.

    he demonstrated pretty comprehensive (as far as i can remember) graphs illustrating that individual data points are meaningless, despite the media’s love for trumpeting every changed result. individual polls were putting that rat bastard howard fairly high, but the cumulative results were showing a long, slow decline to nothing. a slow decline punctuated by cynical, deeply racist moves like condemning boat people during election week.

    the only thing that saved the despicable little turd really.

    Comment by Che Tibby — May 27, 2012 @ 10:59 am

  22. Here’s a brain teaser for R geeks :)

    http://www.cerebralmastication.com/2012/03/solving-easy-problems-the-hard-way/

    Comment by ropata — May 27, 2012 @ 3:37 pm

  23. Hmmm, if there is a systemic pro-National bias in polling results, then perhaps Winston has a good case for not allowing opinion polls in the three-six weeks before an election?

    Comment by Sanctuary — May 28, 2012 @ 10:56 am

  24. It could be that National are much more popular when people aren’t paying attention, and then tend to drop just before an election; or it could be that the polls consistently overestimate their popularity.

    Probably a bit of both, though I suspect the ‘not paying attention’ scenario is related to incumbency and external factors rather than party specific.

    People love to hate the government, though conversely the polling does seem to show the effectiveness of National’s ‘Brand Key’ style-over-substance message management with the electorate and Labour’s complete inability to emulate it.

    Comment by Gregor W — May 28, 2012 @ 12:00 pm

  25. Taking a quick look at the 2002 data, most of the polls just before the election do overestimate National’s result, and underestimate NZFirst.

    Comment by danylmc — May 28, 2012 @ 12:14 pm

  26. Would that be indicative of polling method (channel and selection of sample population) or possibly the style of campaigning?

    Comment by Gregor W — May 28, 2012 @ 12:43 pm

  27. Analysing polls is a bit like self-reporting talk therapy, where we dance around the subject without doing what we really want … ripping open the brain and finding out what’s going on in there, seeing which wires are connected – or not.

    So, what are opinion polls, in reality? Answer: a collection of telephone conversations. And what happens in those conversations? Let’s get hold of some transcripts. Here’s one …

    (after preamble, audible sigh of relief from caller as s/he finally gets a willing respondent)

    Caller: “If there were a general election tomorrow, which party would you vote for?”

    Voter: “Tomorrow? Is it? Already? Bloody hell, that’s no good for me, because – oi, kids, keep it down will ya, I’m on the phone to the telly – nah, I’ve got to take Dane to the dentist, and then there’s the soccer, and …”

    Caller: “No, IF there were a general election tomorrow, which party would you vote for?”

    Voter: “Oh, I dunno. That John Key is good for a laugh, he came to my niece’s school once. Signed her hat. He’s all right. Don’t like the government though, which ones are they?”

    Caller: (desperate not to let this one fish go, after dozens of rejections): “So … if there were a general election tomorrow, which party would you vote for?”

    Voter: “Is Winston still around? Yeah, his lot I suppose. National, isn’t he?”

    Caller: (looks at clock, tosses coin, ticks a box …)

    Comment by sammy 2.0 — May 28, 2012 @ 5:01 pm

  28. Link to the interactive version: http://imgh.us/nzpolls.svg

    Comment by pete — May 28, 2012 @ 8:49 pm

  29. we’re fitting a separate intercept for each company, but the same slope. For OLS this means we get 4 parallel lines, for GAM it means we get a bunch of parallel curves.

    Can you tell us the offsets for each polling company, i.e. its systematic bias?

    Comment by Smut Clyde — May 28, 2012 @ 11:22 pm


  30. Green Labour National NZ First
    3 News TNS 0.3 2.5 1.5 -1.8
    Fairfax Media–Nielsen -0.4 -1.2 5.9 -1.3
    Herald-DigiPoll -0.6 2.3 4.1 -2.4
    One News Colmar Brunton -0.5 -0.4 6.0 -2.1
    Roy Morgan Research 1.1 -0.4 2.9 -1.1
    UMR Research 1.0 0.1 3.7 -1.9
    3 News Reid Research 0.5 -0.8 6.4 -2.9
    Fairfax Media–Research International 0.3 -0.4 6.1 -3.6

    Comment by pete — May 28, 2012 @ 11:30 pm

  31. Okay, maybe this will work:

                                         Green Labour National NZ First
    3 News TNS                             0.3    2.5      1.5     -1.8
    Fairfax Media–Nielsen                 -0.4   -1.2      5.9     -1.3
    Herald-DigiPoll                       -0.6    2.3      4.1     -2.4
    One News Colmar Brunton               -0.5   -0.4      6.0     -2.1
    Roy Morgan Research                    1.1   -0.4      2.9     -1.1
    UMR Research                           1.0    0.1      3.7     -1.9
    3 News Reid Research                   0.5   -0.8      6.4     -2.9
    Fairfax Media–Research International   0.3   -0.4      6.1     -3.6
    

    Comment by pete — May 28, 2012 @ 11:32 pm

  32. This chart illustrates that the Green Party have, for at the very least the foreseeable interim, cemented themselves as the third most popular political party. Also, the fact that we currently have a second-term National Government and that the Green Party is a socialist party suggests that their popularity will probably increase further, to around 15%, in the 2014 election, particularly because people don’t seem to be interested in voting for David Shearer to be PM and because Labour no longer reflects the needs of the working class but rather serves as a pathetic platform for those who were bullied in school to have a dig at those who are ignorant and smug. Instead, it is parties such as the Green Party, NZ First, the Conservative Party, and the Mana Party that are increasingly becoming reflective of the needs of different societal groups.

    Comment by Daniel Lang — May 29, 2012 @ 10:43 am

  33. “…because Labour no longer reflects the needs of the working class but rather serves as a pathetic platform for those who were bullied in school …”

    I suspect the ideal coalition partner for the Greens would be ACT, that way the chips on their shoulders would perfectly balance.

    Comment by Sanctuary — May 29, 2012 @ 10:55 am

  34. I’ve populated the 2005 election polling page on wikipedia with more polls from 2002 to 2005. It changes the curve a little bit.

    https://docs.google.com/open?id=0B9H4zSibZV15Ui1FZzdwX0J2Mzg

    Comment by danylmc — May 29, 2012 @ 11:20 am

  35. The adaptive smoother ("opt_adapt <- TRUE") might be worth using with the increased amount of data — the results appear to be more plausible (residuals have a similar magnitude to the expected margin of error).

    Comment by pete — May 29, 2012 @ 5:55 pm

  36. (thinking out loud)

    Those are some huge numbers for pro-National polling bias. If you look at Colmar Brunton’s last polls before each of the last three elections, they overestimated National by 4.9, 2.1, and 2.7 points; obviously these alone don’t justify a 6.0 adjustment. We’re getting these huge offset values because we’re not just comparing to the last poll before the election, we’re also comparing to polls two or three weeks before the election when National’s getting results in the mid-50s. This ties in with my idea that opinions change more close to elections, so the optimal amount of smoothing is time-dependent. Visually, you can see the graphed curve for National is fairly flat going into each election, when I would argue it should fall sharply.

    Comment by bradluen — May 29, 2012 @ 7:50 pm

  37. Also Danyl mislabeled a particularly crucial Morgan poll (Sept 2005) as a DigiPoll, so if his goal was to bait me into editing Wikipedia for the first time in six years, which it surely was, he succeeded.

    Comment by bradluen — May 29, 2012 @ 8:08 pm

  38. Brad, you might prefer this one: http://imgh.us/nzpolls_4.svg

    Adaptive smoothing allows for more rapid changes close to elections when more polling is done. The bias estimates are a bit more believable too.

                                         Green Labour National NZ First
    3 News TNS                             1.2    2.7     -0.2     -1.6
    Fairfax Media–Nielsen                  0.4   -1.3      3.9     -1.1
    Herald-DigiPoll                        0.4    1.0      2.2     -1.6
    One News Colmar Brunton                0.1   -0.1      4.3     -2.2
    Roy Morgan Research                    2.0   -0.8      1.0     -0.7
    UMR Research                           1.9   -0.6      2.9     -1.6
    3 News Reid Research                   1.5   -1.4      4.5     -2.3
    Fairfax Media–Research International   1.2   -1.1      4.3     -2.3
    

    Comment by pete — May 29, 2012 @ 9:59 pm

  39. Those are some huge numbers for pro-National polling bias.
    I guess the pollsters know about their systematic errors and see benefits that outweigh any impact on their credibility, and the people commissioning the polls are happy with the errors, so feature not bug.

    Comment by Smut Clyde — May 29, 2012 @ 10:25 pm

  40. Thanks pete. I certainly prefer the error estimates on that one…

    Comment by bradluen — May 30, 2012 @ 6:44 am

  41. Lol Sanctuary

    No, I think you’d find the ideal coalition partner for the Greens to be Mana

    Annual Parliamentary Picnic Itinerary:
    1. Vegetarian sausage sizzle (hosted by former Labour leader Helen Clark)
    2. Politicians to have races in water-fuelled vehicles. Winner gets to have a gay marriage paid for by the state
    3. Children of politicians to play “put the bow on the head of the non-specific animal and/or human” (cannot play put the tail on the donkey because it is too traditional and doesn’t have any imagination. For example, who said that a donkey had to have a tail, or had to be a realistic brown colour? What’s happened to thinking outside the square?)

    Comment by Daniel Lang — May 30, 2012 @ 1:15 pm



