Sunday, September 27, 2015

On accurate readings

Paul Barber offers a rundown of the problems with an overreliance on polls, while Heather Libby goes further and suggests that we ignore national polls altogether. But I'll follow up on the argument I've made before: rather than taking concerns about poll data as a basis for throwing polling out the window altogether, we should instead treat them as reasons for caution in interpreting useful information.

Barber focuses largely on the methodological issues involved in trying to get a representative sample from an electorate in which people are less and less inclined to respond to requests to participate in the first place. And there are certainly reasons to question each of the workarounds on their own.

That said, if we face the choice of either (a) lending at least some credence to the view that each methodology might have some merit while using competing polls (and ultimately electoral results) as a check, (b) buying completely into one style of poll and thus excluding all other data, or (c) trusting no polling information at all and thus relying solely on parties and pundits to tell us where an election stands, I'd have a hard time seeing how we're well served by any option other than (a).

And fortunately, the poll information we have is then compiled in ways that make it relatively easy to analyze national-level data. So while we should absolutely question whether a single poll tells the full story (particularly in its subsamples), we can check with public aggregators for both a big-picture look at the national race, and a test as to the plausibility of new polling information.
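
For anyone who wants to see the idea in miniature, here is a rough sketch of what "checking a new poll against an aggregate" can look like. It is not how any actual aggregator works: the poll numbers are made up, the function names are my own, and the plausibility test assumes simple random sampling while ignoring house effects and the uncertainty in the average itself.

```python
from math import sqrt

def weighted_average(polls):
    """Average a party's share across polls, weighting each by sample size."""
    total_n = sum(n for _, n in polls)
    return sum(share * n for share, n in polls) / total_n

def is_plausible(new_share, new_n, average, z=1.96):
    """Check whether a new poll falls within its own sampling margin of the average.

    Assumes simple random sampling; real polls carry additional sources of error.
    """
    margin = z * sqrt(average * (1 - average) / new_n)
    return abs(new_share - average) <= margin

# Hypothetical polls for one party: (share of decided voters, sample size)
polls = [(0.31, 1000), (0.29, 1500), (0.33, 500)]
average = weighted_average(polls)

print(round(average, 3))                  # 0.303
print(is_plausible(0.32, 800, average))   # True
print(is_plausible(0.38, 800, average))   # False
```

A poll that fails this kind of check isn't necessarily wrong, but it is the sort of result that deserves a second look before anyone builds a narrative around it.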

Of course, those sites focus largely on the national level. So what about Libby's view that there's a meaningful distinction between national and riding-level poll data, and that we should pay attention only to the latter?

The problem there lies in the limited number of riding-level polls actually conducted. Parties, pollsters and media outlets may decide to conduct polls in ridings of particular interest - but we should have learned by now that national and regional trends make a huge difference in determining which ridings actually affect electoral outcomes in the first place. And then, if only a small number of polls are conducted in a riding, a single skewed sample or methodological issue can grossly warp the results.
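
To put rough numbers on the sample-size side of that concern, here is a small sketch comparing the textbook 95% margin of error for two sample sizes. The sizes (300 for a riding poll, 2000 for a national one) are assumptions chosen for illustration, and the formula assumes a simple random sample, which no real poll fully achieves.

```python
from math import sqrt

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Textbook 95% margin of error for a proportion under simple random sampling."""
    return z * sqrt(share * (1 - share) / sample_size)

# Hypothetical sample sizes, not taken from any particular poll
riding_moe = margin_of_error(300)
national_moe = margin_of_error(2000)

print(round(riding_moe, 3))    # 0.057
print(round(national_moe, 3))  # 0.022
```

On those assumptions a single riding poll carries well over twice the sampling error of a national one, before any skew in the sample or methodology is even considered.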

Again, those are cautions as to the use of riding-level data alone. But if we can compare a single-riding poll to see how it fits into broader national or regional pictures, then we have a far better chance of finding the right balance between the two.

And that should be our ultimate goal. While some partisans who should know better have been particularly motivated to cherry-pick polls to tell only the story they want told, the fact is that all polling information is potentially useful if we recognize its limitations. And rather than looking for excuses to throw out some or all of the data we have based on either partisan preference or methodological squabbles, we should instead be incorporating it into a full analysis of what's happening around us.

5 comments:

  1. Excellent analysis.

    However, I wonder if, moving out from the riding-specific polling, it might be best to first consider regional polling before looking at national trends. In BC (where I am), I think that's critical. Of course, comparing BC to Alberta, regionally, is too easy. But how about BC to Quebec, where the regional numbers will tell you a lot about, say, softness in the NDP and Liberal vote in a number of ridings, especially on Vancouver Island given the comparative Green numbers?

    Regarding who's doing riding-specific polling - in addition to parties, pollsters and media outlets, increasingly there are also the advocacy groups, Leadnow and the Dogwood Initiative for example.

    Replies
    1. Again, my main concern is with the belief that it's safe to exclude any source of information - but the question of which to look at first depends on one's purpose in examining it. Regional and similar data (e.g. adjacent-riding polling) offer a better set of information as to conditions on the ground; national polling, with its larger sample size and increased segmentation of voters, offers a better indication as to how opinions and votes might shift on a larger scale.

      And you're right to point out the work advocacy groups are doing in carrying out additional polling which was lacking in earlier election cycles - which may be particularly valuable if it moves us beyond relying on one or two polls per riding.

  2. Anonymous 12:39 p.m.

    Hi Greg,

    You are correct about the relative deficiencies of individual polls being reduced through the use of aggregation. The threehundredeight.com projections come closest to using the methods recommended by forecasting industry experts at forecastingprinciples.com.

    Most journalists take the view of Warren Kinsella that the election is a horse race too close to call. That's a big mistake, although it might sell newspapers. (He also falls victim to a fundamental mistake of projecting his own view, that of a spurned Liberal candidate, into a type of wishful thinking when forecasting.)

    Glenn Ashton at CuriosityCat.blogspot.ca has a similar analysis to your own. Harper is toast.

    Harper has about a 2.5% chance of forming a majority government, 19 times out of 20 forecasts according to Eric Grenier's ThreeHundredEight.com calculations on cbc.ca/polltracker. Harper has no chance of getting Liberal or NDP support on any vote of confidence. He's gone.

    However flawed polls are, they are better than no polls at all. This problem, as you point out, can be illustrated by Ontario's 905. It has been reliably Conservative throughout Harper's reign.

    ThreeHundredEight.com specifically warns that their riding projections are NOT polls. There have been polls of Ajax where the Minister of Citizenship and Immigration, Chris Alexander, is projected to go down to defeat. In the adjacent new riding of Pickering-Uxbridge municipal politician Liberal Jennifer O'Connell is expected to win an area that was formerly Conservative blue. So much for ridings that have had polls published.

    Nearby ridings have not had any polling, primarily because they have been "safe" Conservative ridings. ThreeHundredEight.com is depending on projecting forward the results of the last election. The model is vulnerable because extending a trend line cannot project changes. That's the flaw of this type of methodology but it is used successfully because often things don't change much from time to time. This isn't one of those times. About two thirds of Canadians want a change of government and that's well established across many polls for a very long period of time.

    In the long held Conservative riding of Durham, Veterans' Affairs Minister Erin O'Toole is supposed to have a 91% chance of winning. But internal polls show he is in a close race and may well lose to municipal politician Corinna Traill. In last year's provincial election a novice politician, Granville Anderson, took this riding for the Liberals. O'Toole's father, John, had held the riding for the Conservatives for almost 20 years. A provincial Liberal victory was not anticipated by ThreeHundredEight.com because there were no source polls done in the riding. (They had the Conservatives at 80% confidence.)

    Jim Flaherty's old riding of Whitby-Oshawa was won in a 2014 by-election by former Mayor of Whitby, Conservative Pat Perkins. Meanwhile first time politician Celina Caesar-Chavannes brought the Liberals from 22% to 40% of the vote. There is no way that the Liberal candidate's support is going to drop back to 24% as ThreeHundredEight.com is presently showing. Their model has not been updated with any new source polls and simply projects the same results as the 2011 election. Why? The ridings were considered too Conservative to allocate money for public polls. It is a huge mistake on the part of pollsters. You're hearing it here first. Celina is going to win. Incidentally, both Granville Anderson and Celina Caesar-Chavannes probably were given no chance because media/polling decision makers judged them by their appearance. They are people of colour. Durham and Whitby-Oshawa are two of the most "old stock" ridings in Ontario but the electorate paid attention to their candidacy and campaigns, not their ancestry.

    p2p

    Replies
    1. Good post. But I will push back on the theory that we can take the "2.5% chance of a majority" analysis as set in stone: that's based on the polling averages today, but doesn't take into account the likelihood that polls will move in one direction or another as the campaign progresses. I'd place the likelihood of any of the parties winning a majority higher than the current seat projections simply because the averages are in a narrow range of three-party competition where a majority is unlikely, but a relative shift of just a few points between any two or more parties could change the picture substantially.

    2. Anonymous 4:04 p.m.

      Even if ThreeHundredEight.com has limitations they are more transparent than the others. Their projections show the probability of a Conservative majority outside the two standard deviations used for high and low. From a statistical point of view the confidence interval of two standard deviations accounts for 95% of all variability. That's how the differences between samples and the population from which the samples are taken are expressed. In other words, it does include your concerns about changes in the numbers.

      We aren't interested in the low side, so the high number represents 97.5% of the variability, leaving (100 - 95)/2 = 2.5% above it.

      In this case I was not picking the eventual winner, just showing that the loser is going to be Harper. That's enough for me and most people looking for change.

      p2p
