Sunday, November 28, 2010

Using probabilities to predict election results

The recent mid-term elections, especially the Senate results for Colorado and Nevada, made me wonder about the efficacy of the likely-voter screens that pollsters normally use. Rather than discarding all "not-likely" voters, could one group respondents by their likelihood of voting and assign each group a probability?
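To make the question concrete, here is a minimal sketch in Python of the two approaches. It is only an illustration of the idea, not any pollster's actual procedure: the names (Respondent, turnout_prob, cutoff) are hypothetical, and the sample data is invented purely to show how the two estimates can differ.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    candidate: str       # stated vote preference
    turnout_prob: float  # assumed probability of voting, between 0.0 and 1.0


def _shares(pairs):
    """Turn (candidate, weight) pairs into each candidate's share of total weight."""
    totals = {}
    for candidate, weight in pairs:
        totals[candidate] = totals.get(candidate, 0.0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {c: w / grand_total for c, w in totals.items()}


def screened_shares(respondents, cutoff=0.5):
    """Likely-voter screen: drop everyone below the cutoff, count the rest equally."""
    return _shares((r.candidate, 1.0) for r in respondents if r.turnout_prob >= cutoff)


def weighted_shares(respondents):
    """Probability weighting: keep everyone, weighted by their chance of voting."""
    return _shares((r.candidate, r.turnout_prob) for r in respondents)


# Invented sample: candidate B's support leans on less-likely voters.
sample = [
    Respondent("A", 0.9),
    Respondent("A", 0.9),
    Respondent("B", 0.9),
    Respondent("B", 0.4),
    Respondent("B", 0.4),
]

print("screen:  ", screened_shares(sample))
print("weighted:", weighted_shares(sample))
```

In this made-up sample the screen throws out both of B's less-likely supporters and shows A ahead two to one, while the weighted version keeps them at a discount and shows a race that is nearly tied. The real difficulty, of course, is estimating the probabilities in the first place.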
Thanks to Mark Blumenthal, I now know that the CBS/New York Times poll uses probabilities instead of a "likely voter" screen. Here is his write-up on the method and its results for Presidential elections through 2004. How did the poll do in 2008?
First, all the polls (from the lovely charts at the erstwhile Pollster.com, now part of HuffPo):


Second, the CBS/Times polls:


These are much more consistent than some other polls, including Rasmussen and USA Today/Gallup, for sure. You can click on any of the points to see the raw trend instead of the Pollster.com-generated trend line. The final result, of course, was Obama 52.9%, McCain 45.7% (Wiki). While the "all polls" trendlines converge to the actual result only at the end, the CBS/Times polls, and especially the CBS/Times-specific trendline, suggest the race was relatively stable throughout the fall.

Here is a description of Registration-Based Sampling, which uses voter registration records to contact only those voters with a documented history of voting, rather than relying on the self-reported history of randomly contacted voters (Random Digit Dialing).

(Many thanks to Mark Blumenthal for the links. Follow him on Twitter, and on the new Huffington Post/Pollster website. Any errors in data interpretation are purely my fault.)
