The election is finally over! At the time of writing, the Coalition stand to hold 89 seats and Labor 57, with the remaining 4 seats going to a Green and independents. These numbers may well change in the coming weeks as postal votes come in for a few close electorates, but we’ll go with them for now.

Leng and I have been sorting through the entrails of our predictions, looking at what worked and what didn’t. What did the betting markets get right, and where did they fall down? What could we have done better?

## What worked

The betting markets successfully predicted that the Coalition would win government with a large majority. This bears repeating, even though by election day pretty much everyone was predicting it. In contrast to the polls, at no point in the campaign were Labor anywhere close to being the favourites in the betting markets. The high-water mark for Labor in the polls was around July 8, when Newspoll recorded a 50-50 2PP. Yet our AFR analysis of the electorate-level betting markets on July 11 still put Labor well behind the Coalition, with an expected Coalition-Labor seat count of 84-60. The closest the parties ever got was 81-66 in our July 30 analysis. We aren’t criticizing the polls here (the polls and betting markets measure different things), but this clearly demonstrates that betting markets do more than just parrot polls, contrary to a common misconception.

## What didn’t

Assuming the 89-57-4 seat count holds, our predicted expected seat count of 99-48-3 was off by 10 seats. While this was well within the margin of error of our maximum covariance model, it was worse than we’d expected! There are a few possible reasons for this.

Starting with the most general: could this be considered strong evidence against the underlying theory of prediction markets? Not really. Since prediction markets give probabilistic forecasts, they have to be assessed over multiple campaigns. The result of any one campaign doesn’t tell you much about the accuracy of the underlying probabilities, just as rolling a die once doesn’t tell you whether it’s loaded.
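To put a rough number on the die analogy (a toy calculation, not from the post): compare the evidence from one roll against the evidence from many.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# One roll: how strongly does seeing a six favour a loaded die
# (P(six) = 1/3) over a fair one (P(six) = 1/6)?
lr_one_roll = (1 / 3) / (1 / 6)  # likelihood ratio of 2: weak evidence

# 60 rolls with 20 sixes: the same comparison becomes decisive.
lr_many = binom_pmf(20, 60, 1 / 3) / binom_pmf(20, 60, 1 / 6)
print(lr_one_roll, lr_many)  # 2.0 versus roughly 139
```

A single roll barely distinguishes the two hypotheses; repeated trials do, which is why probabilistic forecasters have to be judged over many campaigns.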

Could this result be due to violations of some of the underlying assumptions of prediction markets? That’s certainly possible. We have no information on the amount of money wagered in each seat, although we have been told that around $250,000 was wagered on Sportsbet’s individual seat market between January and the election, most of it on marginal seats. Since we don’t know how that money was distributed, or whether it was sufficient, it’s possible this had something to do with the seat count error. Longshot bias does not appear to have had any large, obvious impact on the result: the betting markets ultimately underestimated Labor’s seat count rather than overestimating it (this may be partly due to our rather crude correction for longshot bias, but that correction doesn’t appear to have altered the results much). Our understanding of these biases remains crude. Predictions from seat-level betting odds have little track record, and it will take time to observe and eventually correct for these biases.

However, we think the most likely culprit is the modelling, not the betting markets themselves. Using electorate-level betting odds to predict seat counts requires knowledge of the covariance structure between electorates (ad hoc methods, such as counting up the number of seats where Labor has a greater than 50% probability of victory, make no statistical sense). There is no obvious way to estimate this structure, so in practice we have to make assumptions about it. To get an idea of the impact of this uncertainty, we used two models for our estimates: one that assumed zero covariance between seats (the ‘independent seats’ model), and another that assumed the maximum possible covariance between seats consistent with the betting odds (the ‘maximum covariance’ model). Both models generated a distribution of seat counts for each party.
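The two models can be sketched in a few lines of Python. The seat probabilities below are made up for illustration (the real inputs were the seat-level betting odds), and the maximum covariance model is implemented here as a comonotone coupling: one shared uniform draw decides every seat.

```python
import random
from statistics import mean, pvariance

random.seed(1)

# Hypothetical Labor win probabilities for 20 seats (illustrative only;
# the real inputs were the seat-level betting odds).
probs = [0.05, 0.1, 0.2, 0.3, 0.4, 0.45, 0.5, 0.5, 0.55, 0.6,
         0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.9, 0.95, 0.95]

def independent_seats():
    """Zero covariance: each seat is its own independent coin flip."""
    return sum(random.random() < p for p in probs)

def maximum_covariance():
    """Maximum covariance: one shared 'national mood' draw decides every
    seat, while each seat keeps its marginal win probability."""
    u = random.random()
    return sum(u < p for p in probs)

n = 100_000
ind = [independent_seats() for _ in range(n)]
cov = [maximum_covariance() for _ in range(n)]

print(mean(ind), mean(cov))            # both close to sum(probs) = 11.7
print(pvariance(ind), pvariance(cov))  # the shared draw inflates the spread
```

Both couplings leave each seat’s win probability, and hence the expected seat count, unchanged; what the shared draw changes is how widely the seat count distribution spreads around that mean.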

## Covariance is key

The independent seats model is commonly used, but there are good intuitive reasons to believe it is badly wrong. The final seat counts fell outside that model’s margin of error; that is, the independent seats model assigned less than a 5% probability to an outcome this extreme. We aren’t big fans of hypothesis testing, but if you were to use the independent seats model as the null hypothesis, you would reject it given this outcome (*p* < 0.05). The final count was well within the margin of error of the maximum covariance model, however. We don’t suggest the maximum covariance model is necessarily a good representation of the real world, but it demonstrates the sensitivity of the results to the assumed covariance structure.
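For the independent seats model, the margin of error can be computed exactly rather than simulated: the seat count follows a Poisson binomial distribution, which a simple dynamic program builds up one seat at a time. The probabilities here are again made up for illustration.

```python
def seat_count_distribution(probs):
    """Exact Poisson binomial distribution of the number of seats won,
    where seat i is an independent Bernoulli trial with probability probs[i]."""
    dist = [1.0]                       # with 0 seats, P(0 wins) = 1
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # seat lost: win count stays at k
            new[k + 1] += q * p        # seat won: win count becomes k + 1
        dist = new
    return dist

# Illustrative win probabilities for 50 seats (the real inputs were betting odds).
probs = [0.1, 0.3, 0.5, 0.7, 0.9] * 10
dist = seat_count_distribution(probs)

expected = sum(k * p for k, p in enumerate(dist))  # = sum(probs) = 25
tail = sum(dist[:19])                              # P(18 or fewer wins)
print(expected, tail)
```

A tail probability below 5%, as here, is the sense in which an observed outcome falls outside the model’s margin of error.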

Why did the two models yield the same expected seat counts, even though their seat count distributions were so different? By linearity of expectation, any model that preserves the seat-level win probabilities implied by the odds must produce the same expected seat count; the covariance structure changes only the shape of the distribution around that mean. The maximum covariance model represents a scenario where a large, nationwide swing occurs in one direction, producing a wide but symmetric distribution. More subtle covariance structures, such as ones where different states swing by different amounts (as occurred in this election), can produce asymmetric distributions whose median and mode sit well away from the mean, even though the mean itself is unchanged. So understanding the role of covariance structure in the overall distribution, not just the expected seat count, is essential. Covariance is key, and it’s something Leng and I want to look at in more detail.
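A toy sketch of a state-level covariance structure (hypothetical numbers throughout): give each state one shared draw, so seats within a state swing together while states stay independent. Every seat keeps its marginal win probability, so the mean seat count is pinned down, but the distribution becomes asymmetric and the median drifts away from the mean.

```python
import random
from statistics import mean, median

random.seed(2)

# Hypothetical win probabilities grouped by state (illustrative only).
states = {
    "big_state": [0.6] * 30,    # many mildly favourable seats
    "small_state": [0.2] * 10,  # a few unfavourable seats
}

def state_swing_draw():
    """One shared uniform per state: seats within a state move together
    (comonotone), states are independent, and every seat keeps its
    marginal win probability."""
    total = 0
    for seat_probs in states.values():
        u = random.random()
        total += sum(u < p for p in seat_probs)
    return total

draws = [state_swing_draw() for _ in range(100_000)]
expected = sum(p for ps in states.values() for p in ps)  # 20.0, by linearity
print(expected, mean(draws), median(draws))  # mean stays ~20, median sits at 30
```

The within-state coupling here is deliberately exaggerated (whole states flip as blocks) to make the asymmetry obvious.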

But for now, we’re going to take a break! We’ve had a great time making this blog. We couldn’t have done it without the support of a lot of great people. We want to thank Bob Chen Ren and Jeff Chan for their assistance with the blog, Edmund Tadros and Jason Murphy at the Australian Financial Review for writing up our work, Kevin Bonham and Simon Jackman for their insights, and everyone who showed an interest in this blog!

## Comments

Thanks for all your hard work and congrats on predicting the outcome! State-by-state covariance may be the trick. Not only do some people identify with their state first, but state politics complicates federal voting. The vestiges of Labor Government in Tassie are part of the reason that state as a whole swung so hard to LNP, for example.

I’ll be posting some comments on this when all seats are decided. My preliminary view is that the final seat markets were actually slightly inaccurate (as a result of a meltdown in the last few days) while the seat total markets will be closer to the mark and perhaps even very close indeed.

Different ways of reading a projection from the seat markets will change the expected range of outcomes (as you’ve noticed) but the effect on the median outcome shouldn’t be that great. Even just counting the favourites shouldn’t lead to an error of more than about two unless there is a very heavy skew in the number of close seats on one side.

I’ll be very surprised if Labor does hold 57 seats from here by the way. They’ve lost the lead in McEwen and Barton already and had a bad result in post-counting in Capricornia last time.

Will be interested to see the final seat count and suspect they will be very close to your final predictions, Kevin. Well done.

Agree that the individual seat markets appeared to go a little wacky towards the end, which is very interesting. Also agree that the median appears to be relatively insensitive to covariance structure (although we haven’t really kicked it hard enough yet to be sure).

But the range and distribution are very sensitive to covariance structure, so even if the median is insensitive, it may not be a very useful point estimate of the seat count. The maximum covariance seat count distribution was typically bimodal, with the median located in a very low density region between the modes. For highly non-Gaussian distributions like that, the median can be very different from the mode and doesn’t really tell us much. The increase in the range is also important, since it can massively increase the error of our estimate (witness the very large 95% credible interval on our maximum covariance seat count predictions).
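As a toy illustration of that point, with made-up numbers standing in for the bimodal maximum covariance distribution:

```python
import random
from statistics import median

random.seed(3)

# Even mixture of two well-separated humps (made-up numbers).
samples = [random.gauss(45, 3) if random.random() < 0.5 else random.gauss(75, 3)
           for _ in range(100_000)]

m = median(samples)

def frac_within(xs, centre, width=2):
    """Fraction of samples within +/- width of centre: a crude density probe."""
    return sum(abs(x - centre) < width for x in xs) / len(xs)

# The median lands in the valley between the humps, where there is
# almost no probability mass; both modes carry far more.
print(m, frac_within(samples, m), frac_within(samples, 45), frac_within(samples, 75))
```

Reporting that median as "the" predicted seat count would point at an outcome the model itself considers very unlikely.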

These may just end up being academic curiosities that only apply to extreme cases. Counting up the favourites in each seat seemed to give reasonable results for most of the campaign, even though there doesn’t seem to be any good reason for this to be the case all the time.

We’ve appreciated your feedback on the blog! Look forward to seeing your final results once the counting is finalized.