Kevin Bonham has written a very detailed summary of the performance of seat-level betting markets in the last election. It’s well worth reading — go check it out. We’ve really enjoyed reading Kevin’s blog and his analysis. In a spirit of friendly debate, we wanted to respond to a few comments that are directly relevant to this blog. Kevin writes:

electionlab in their final analysis considered that the most likely culprit in the errors made by seat betting markets was their modelling and not the markets themselves. In my view, there was nothing significantly wrong in their model’s read of what the seat betting markets were thinking – rather, what actually happened was that seat betting markets themselves were in fact wrong. Different modelling assumptions regarding covariance and so on greatly affect the spread of modelled expectations, but they have little impact on the mean. The seat betting markets were collectively expecting Labor to win fewer than 50 seats at the end. There is no way to remodel the final odds to find 55 seats for Labor in them because it is just not true that those markets thought Labor would win that many seats. Or at least, if someone “finds” such a way to read that result into the markets, the next time they test it I can pretty much guarantee the post hoc overfitting in their new model will cause it to blow up.

We disagree here, for two reasons.

First, changes in covariance structure don’t just change the *spread* of the seat-count distribution; they fundamentally alter the *shape* of the distribution, too, and this has important implications for estimating seat counts from the betting odds. Assuming independence between seats results in a unimodal, bell-shaped distribution. Assuming maximum covariance between seats (as constrained by the betting markets), on the other hand, results in a bimodal distribution with very little density in the middle. More generally, it makes intuitive sense that the distribution can become multimodal once you bring in covariance between seats: if seats tend to move together, outcomes where most close seats fall the same way become more likely than outcomes where they split evenly.

We agree that the mean is relatively insensitive to covariance structure. But for a bimodal distribution, the mean is a very bad point estimate: it lies in a low-probability region of the distribution, so it is not a very useful tool in this situation (in hindsight, we should not have used the mean as a point estimate for our maximum covariance model because of this). This is really important, because it affects how we make inferences using the distribution derived from the betting odds. There is nothing sacred about the mean, and it appears that a point estimate centered on the mode (or two point estimates, centered on the two modes) may make more sense here. Or perhaps a point estimate is just a bad idea, and we should look at 95% credible intervals instead. We don’t know the true covariance structure, so we don’t know the true underlying seat-count distribution. But we have very good reason to believe that the mean will be a poor point estimate in this case.
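To make the shape argument concrete, here is a minimal simulation sketch. It is a toy, not the model behind the numbers quoted in this post: the one-factor Gaussian "shared swing" construction, the function names, and the 40 toss-up seats at a win probability of 0.45 are all assumptions chosen purely for illustration. Each seat keeps its stated win probability no matter what correlation `rho` is used, so only the dependence between seats changes.

```python
import random
from statistics import NormalDist, mean


def simulate_seat_counts(probs, rho, n_sims=20000, seed=0):
    """Simulate seat totals under a one-factor Gaussian model (illustrative).

    probs: per-seat win probabilities (hypothetical, as if implied by odds)
    rho:   correlation between the seats' latent outcomes, 0 <= rho <= 1
    """
    rng = random.Random(seed)
    # A seat is won when its latent N(0, 1) variable falls below this
    # threshold, which preserves each seat's marginal win probability.
    thresholds = [NormalDist().inv_cdf(p) for p in probs]
    a, b = rho ** 0.5, (1 - rho) ** 0.5
    counts = []
    for _ in range(n_sims):
        z = rng.gauss(0, 1)  # swing shared by every seat
        # a*z + b*noise is N(0, 1) for any rho; only the dependence changes
        counts.append(sum(a * z + b * rng.gauss(0, 1) < t for t in thresholds))
    return counts


def share_near_mean(counts, width=3):
    """Fraction of simulations landing within `width` seats of the mean."""
    m = mean(counts)
    return sum(abs(c - m) <= width for c in counts) / len(counts)


probs = [0.45] * 40  # hypothetical: 40 toss-up seats
indep = simulate_seat_counts(probs, rho=0.0)
corr = simulate_seat_counts(probs, rho=0.9)

for label, counts in (("independent", indep), ("rho = 0.9", corr)):
    print(f"{label}: mean {mean(counts):.1f} seats, "
          f"share within 3 seats of the mean {share_near_mean(counts):.2f}")
```

In this toy setup both runs have a mean of roughly 18 seats, but under independence most simulations land close to that mean, while under strong correlation very few do: the mass piles up toward the two extremes and the mean sits in the trough between them.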

Second, a final Labor seat count of 55 is absolutely consistent with the betting odds, as long as you expect a moderate amount of covariance between seats. According to the maximum covariance model, there was a 95% chance Labor would obtain somewhere between 32 and 64 seats. Of course, the maximum covariance model is an extreme case, but it’s not hard to show that, for more moderate covariance structures, the 55-seat result still falls within a 95% credible interval. Almost the only way to obtain a prediction that puts 55 seats outside the 95% credible interval is to assume independence between seats. But as we know, that’s unlikely to be a good assumption.
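The effect of covariance on the interval can be sketched in the same hedged way. The code below is illustrative only: the seat probabilities are made up (they are not the actual betting odds), the one-factor Gaussian model is an assumed stand-in for "moderate covariance", and `seat_count_interval` is a hypothetical helper that reads a central 95% interval off simulated seat totals.

```python
import random
from statistics import NormalDist


def seat_count_interval(probs, rho, level=0.95, n_sims=5000, seed=1):
    """Central interval for the seat total under a one-factor Gaussian model.

    probs: per-seat win probabilities (hypothetical values)
    rho:   assumed correlation between the seats' latent outcomes
    """
    rng = random.Random(seed)
    thresholds = [NormalDist().inv_cdf(p) for p in probs]
    a, b = rho ** 0.5, (1 - rho) ** 0.5
    counts = []
    for _ in range(n_sims):
        z = rng.gauss(0, 1)  # swing shared by every seat
        counts.append(sum(a * z + b * rng.gauss(0, 1) < t for t in thresholds))
    counts.sort()
    tail = (1 - level) / 2
    # Empirical quantiles of the simulated seat totals
    lo = counts[int(tail * n_sims)]
    hi = counts[int((1 - tail) * n_sims) - 1]
    return lo, hi


# Hypothetical 100-seat chamber: 20 safe seats, 30 marginals, 50 long shots.
# The expected seat total is 20*0.95 + 30*0.40 + 50*0.10 = 36 for every rho.
probs = [0.95] * 20 + [0.40] * 30 + [0.10] * 50

intervals = {rho: seat_count_interval(probs, rho) for rho in (0.0, 0.3, 0.6)}
for rho, (lo, hi) in intervals.items():
    print(f"rho = {rho:.1f}: 95% interval {lo} to {hi} seats")
```

In this toy chamber the interval widens steadily as `rho` grows. A result 10 seats above the expected total falls outside the interval under independence but comfortably inside it once a moderate correlation is assumed, which is the same pattern we are describing for the 55-seat result.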

Kevin also writes:

It baffles me that experienced statisticians attempt to determine how many seats betting markets think parties will win by looking at an indirect and problematic measure (aggregation of implied probabilities concerning particular seats) when there are more direct markets available on seat total events and their past track record has been excellent.

This is a reasonable question: why did we derive predicted seat counts from seat-level betting odds, rather than just directly looking at the seat count betting markets? Why use an indirect method when there is a direct one? We did this because we wanted a general approach that allowed us to look at interesting seat-level scenarios (e.g., this and this). The seat count predictions were something we could readily do with our more general model, so we had a go. Unfortunately, the election ended up being a landslide and, for many of the scenarios we thought might be interesting, the betting markets ended up giving pretty obvious and boring predictions! So the seat count predictions ended up being prominent. We agree that, if your sole aim is to predict seat counts, the seat count betting markets are the way to go. But our ultimate goal is not to just get good at predicting seat counts. We want to use the seat-level betting odds to find interesting stories that can’t be revealed without seat-level data.