The election is finally over! At the time of writing, the Coalition stand to hold 89 seats and Labor 57 seats, with the remaining 4 seats going to a Green and independents. These numbers may well change in the coming weeks as postal votes come in for a few close electorates, but we’ll go with them for now.
Leng and I have been sorting through the entrails of our predictions, looking at what worked and what didn’t. What did the betting markets get right, and where did they fall down? What could we have done better?
The betting markets successfully predicted the Coalition would win government with a large majority. This bears repeating, even though by election day pretty much everyone was predicting this. In contrast to the polls, at no point in the campaign were Labor anywhere close to being the favourites in the betting markets. The high-water mark for Labor in the polls was around July 8, when Newspoll recorded a 50-50 2PP. In contrast, our AFR analysis of the electorate-level betting markets on July 11 still put Labor well behind the Coalition, with an expected Coalition-Labor seat count of 84-60. The closest the parties ever got was 81-66 in our July 30 analysis. We aren’t criticising the polls here — the polls and betting markets measure different things — but this clearly demonstrates that betting markets do more than just parrot polls, contrary to a common misconception.
Assuming the 89-57-4 seat count holds, our predicted expected seat count of 99-48-3 was off by 10 seats. While this was well within the margin of error of our maximum covariance model, it’s worse than we’d hoped! There are a few possible reasons for this.
Starting with the most general, could this be considered strong evidence against the underlying theory of prediction markets? Not really. Since prediction markets give probabilistic forecasts, they have to be assessed over multiple campaigns. Results from any one campaign don’t tell you much about the underlying probabilities being predicted, just as rolling a die once doesn’t tell you enough to know whether it’s loaded.
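To put a number on the die analogy (a hypothetical illustration, not part of our election analysis): even if a die were loaded to roll sixes twice as often as a fair one, a single observed six only favours "loaded" over "fair" by 2 to 1 — far too weak to conclude anything. The same logic applies to judging a probabilistic forecaster on one campaign.

```python
from fractions import Fraction

# Hypothetical example: compare a fair die against one loaded to roll
# a six twice as often. After observing a single six, the likelihood
# ratio in favour of "loaded" is only 2:1 -- very weak evidence.
p_six_fair = Fraction(1, 6)    # P(six | fair die)
p_six_loaded = Fraction(2, 6)  # P(six | loaded die), assumed for illustration

likelihood_ratio = p_six_loaded / p_six_fair
print(likelihood_ratio)  # 2 -- one roll barely moves the needle
```

Only after many rolls (or many campaigns) does the evidence accumulate enough to separate the two hypotheses.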
Could this result be due to violations of some of the underlying assumptions of prediction markets? That’s certainly possible. We have no information on the amount of money wagered in each seat, although we have been told that around $250,000 was wagered on Sportsbet’s individual seat market between January and the election, with most of it on marginal seats. Nevertheless, we don’t know how that money was distributed, or whether it was sufficient, so it’s possible that this had something to do with the seat count error. Longshot bias does not appear to have had any large, obvious impact on the result, with the betting markets ultimately underestimating Labor’s seat count rather than overestimating it (this may be partly due to our rather crude correction for longshot bias, but that correction doesn’t appear to have altered the results much). Our understanding of these biases is still very crude. Predictions from seat-level betting odds have little track record, and it will take time to observe and eventually correct for these biases.
However, we think the most likely culprit is in the modelling, not the betting markets themselves. Using electorate-level betting odds to predict seat counts requires knowledge of the underlying covariance structure between electorates (ad hoc methods, such as counting up the number of seats where Labor has > 50% probability of victory, make no statistical sense). There is no obvious way to estimate this structure, so in practice we have to make some assumptions about it. To gauge the impact of this uncertainty, we used two models for our estimates: one that assumed zero covariance between seats (the ‘independent seats’ model), and another that assumed the maximum possible covariance between seats consistent with the betting odds (the ‘maximum covariance’ model). Each model generated a distribution of seat counts for each party.
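The two models can be sketched in a few lines of simulation. This is a toy illustration with made-up seat probabilities, not the odds or code we actually used: the independent seats model draws each seat separately, while the maximum covariance model drives every seat off a single uniform draw (the comonotonic coupling), so seats fall in lockstep.

```python
import random
import statistics

random.seed(0)

# Hypothetical Labor win probabilities for six seats
# (illustrative numbers only, not real betting odds).
p = [0.9, 0.7, 0.55, 0.45, 0.3, 0.1]
n_sims = 50_000

# Independent seats model: a separate uniform draw for every seat.
indep = [sum(random.random() < pi for pi in p) for _ in range(n_sims)]

# Maximum covariance model: one uniform draw decides every seat, so
# all seats swing together (the most correlated structure consistent
# with the same seat-level odds).
maxcov = []
for _ in range(n_sims):
    u = random.random()
    maxcov.append(sum(u < pi for pi in p))

# Both models share the same expected seat count (the sum of the
# seat odds, 3.0 here)...
print(statistics.mean(indep), statistics.mean(maxcov))
# ...but the maximum covariance model has a much wider spread,
# and hence a much wider margin of error.
print(statistics.stdev(indep), statistics.stdev(maxcov))
```

The wider spread of the maximum covariance model is exactly why the final result fell inside its margin of error but outside the independent seats model’s.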
Covariance is key
The independent seats model is commonly used, but there are good intuitive reasons to believe it is badly wrong. The final seat counts were outside the margin of error of the independent seats model’s predictions; that is, the independent seats model assigned less than a 5% probability to an outcome this extreme. We aren’t big fans of hypothesis testing, but if you were to use the independent seats model as the null hypothesis, you would probably reject it given the outcome (p < 0.05). The result was well within the margin of error of the maximum covariance model, however. We don’t suggest the maximum covariance model is necessarily a good representation of the real world. But it demonstrates the sensitivity of the results to the assumed covariance structure.
Why did the two models yield the same expected seat counts, even though their seat count distributions were so different? By linearity of expectation, the expected seat count depends only on the individual seat probabilities, so any covariance structure consistent with the same seat-level odds yields the same mean — covariance changes the spread and shape of the distribution, not its expectation. The maximum covariance model represents a scenario where a large, nationwide swing occurs in one direction, and with our odds it produced a roughly symmetric distribution about that shared mean. However, more subtle covariance structures, such as ones where different states swing by different amounts (as occurred in this election), can produce asymmetric distributions whose median and most likely seat counts sit well away from the mean. So understanding the role of the covariance structure in shaping the seat count distribution is essential. Covariance is key, and it’s something Leng and I want to look at in more detail.
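A small simulation shows what such a structure can do. The setup is entirely hypothetical: two states whose seats swing together within the state but independently of each other, with made-up odds. The mean stays pinned at the sum of the seat odds, but the distribution comes out lumpy and skewed, with its median sitting below the mean.

```python
import random
import statistics

random.seed(1)

# Hypothetical setup (illustrative odds only): seats within a state
# swing in lockstep, but the two states swing independently.
state_a = [0.9, 0.85, 0.8]   # fairly safe seats in one state
state_b = [0.3, 0.25, 0.2]   # long shots in another
n_sims = 50_000

counts = []
for _ in range(n_sims):
    u_a = random.random()  # one shared swing per state
    u_b = random.random()
    counts.append(sum(u_a < p for p in state_a)
                  + sum(u_b < p for p in state_b))

# The mean is pinned at the sum of the seat odds (3.3 here) by
# linearity of expectation, whatever the covariance structure...
print(statistics.mean(counts))
# ...but the distribution is asymmetric: its median (3) sits below
# the mean, because the upside tail is fatter than the downside.
print(statistics.median(counts))
```

In a structure like this, a model that gets the seat-level odds right can still badly mischaracterise the most likely outcomes if it assumes the wrong dependence between seats.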
But for now, we’re going to take a break! We’ve had a great time making this blog. We couldn’t have done it without the support of a lot of great people. We want to thank Bob Chen Ren and Jeff Chan for their assistance with the blog, Edmund Tadros and Jason Murphy at the Australian Financial Review for writing up our work, Kevin Bonham and Simon Jackman for their insights, and everyone who showed an interest in this blog!