My buddy Davis is an actual, real statistician, and a damn fine one at that. Leng and I want to be real statisticians one day when we grow up, so we were really pleased to get some feedback on the blog from him.
One thing he pointed out was that the PMFs in the previous post for the ‘maximum covariance’ case look a little funky. Here they are again:
These don’t look like the distributions we’d normally see, such as a normal distribution. They’re truncated at seemingly arbitrary values (a minimum bound at 25 seats for Labor and 48 seats for the Coalition) and reach their peaks at the edges. And they suggest that Labor is roughly twice as likely to win only 25 seats as a more believable 60ish, which doesn’t sound right.
Is something wrong here? We don’t think so.
The maximum covariance model doesn’t represent the real world. It’s only meant to be an upper bound, an outer limit of what could technically be possible given the probabilities we inferred on August 6. In the bizarre maximum covariance model universe, election outcomes between seats are very strongly correlated. This means that when Labor win, they win big, and similarly for the Coalition.
Consider an extreme case, where all 150 seats have 100% correlation with each other. There are then only two possible outcomes: Labor wins 150 seats, or the Coalition wins 150 seats. A brutal environment for career politicians! Here the PMFs would be completely concentrated at the edges. In practice, the probabilities inferred from the betting odds constrain the correlations between many seats to be much less than 100%, resulting in some mass between the two extremes.
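To make the "mass at the edges" picture concrete, here is a minimal sketch of one standard way to push every pairwise covariance to its maximum: drive all seats from a single shared uniform draw U, and let Labor win seat i whenever U is below that seat's probability. We're not claiming this is exactly how the model in the previous post was built; the function name and the toy probabilities are made up for illustration.

```python
def comonotonic_seat_pmf(probs):
    """Exact PMF of seats won when every seat shares one uniform draw U.

    Seat i is won iff U < probs[i], so the seat count equals k exactly when
    U lands between the (k+1)-th and the k-th largest probability.
    """
    # Pad with 1.0 and 0.0 so the "win nothing" and "win everything" cases
    # fall out of the same difference formula.
    ordered = [1.0] + sorted(probs, reverse=True) + [0.0]
    return [ordered[k] - ordered[k + 1] for k in range(len(probs) + 1)]


# Extreme case: 150 identical, perfectly correlated seats (toy probability 0.5).
extreme = comonotonic_seat_pmf([0.5] * 150)
print({k: p for k, p in enumerate(extreme) if p > 0})  # {0: 0.5, 150: 0.5}

# Toy five-seat example with one certain win and one certain loss.
toy = comonotonic_seat_pmf([1.0, 0.7, 0.4, 0.4, 0.0])
print([round(p, 3) for p in toy])  # [0.0, 0.3, 0.3, 0.0, 0.4, 0.0]
```

In the toy example the certain win rules out zero seats and the certain loss rules out five, while the differing probabilities in between spread some mass across the middle. That is the same shape as the funky PMFs, just on a smaller scale.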
What causes the arbitrary truncation points? We make adjustments to the inferred probabilities to counter longshot bias: seats with less than a 0.1 probability of victory for Labor have that probability rounded down to 0 (and similarly for all other parties). When every other party in a seat is rounded down to 0, the remaining party is left with a 100% probability of winning it. On August 6, this resulted in 25 seats with a 100% probability of Labor victory and 48 seats with a 100% probability of Coalition victory. Those certain seats set a floor on each party’s seat count, so the bias adjustment explains the truncation points.
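Here is a rough sketch of how an adjustment like this produces hard floors. The 0.1 threshold comes from the rule above; the renormalisation step, the function name and the three example seats are our assumptions for illustration, not the actual August 6 data.

```python
def adjust_for_longshot_bias(seat_probs, threshold=0.1):
    """Round probabilities below the threshold down to 0, then rescale.

    Rescaling the surviving probabilities so they sum to 1 is an assumption
    made for this sketch.
    """
    kept = {party: (p if p >= threshold else 0.0) for party, p in seat_probs.items()}
    total = sum(kept.values())
    if total == 0:
        # Every party was a longshot; leave the seat untouched.
        return dict(seat_probs)
    return {party: p / total for party, p in kept.items()}


# Hypothetical seats: one safe Labor, one marginal, one safe Coalition.
seats = [
    {"Labor": 0.93, "Coalition": 0.07},
    {"Labor": 0.55, "Coalition": 0.45},
    {"Labor": 0.04, "Coalition": 0.96},
]
adjusted = [adjust_for_longshot_bias(seat) for seat in seats]

# Seats that are now certain form the minimum seat count for each party.
labor_floor = sum(1 for seat in adjusted if seat["Labor"] == 1.0)
coalition_floor = sum(1 for seat in adjusted if seat["Coalition"] == 1.0)
print(labor_floor, coalition_floor)  # 1 1
```

Run over all 150 seats, counts like these would play the role of the 25 and 48 seat floors in the PMFs.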
The aim of this whole exercise has been to build covariance into the model. We don’t know what the true covariances are between seats, so instead we look at upper and lower bounds on the covariances (these bounds are set by the probabilities inferred from the betting odds). The main takeaway is that the overall probability of a Labor victory increases dramatically when you include covariance between seats; but even if you include unrealistically high covariance, the betting markets still believe Labor are likely to lose the election.
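For a single pair of seats, the way the probabilities pin down the covariance can be sketched with the standard bounds on a joint probability: if Labor wins two seats with probabilities p and q, the chance of winning both lies between max(0, p + q - 1) and min(p, q), and the covariance is that joint probability minus p times q. The function below is a minimal illustration with a made-up name; we're assuming the pairwise view is enough here, and the full model has to satisfy all pairs at once.

```python
def covariance_bounds(p, q):
    """Lower and upper bounds on the covariance of two win/lose outcomes.

    p and q are the marginal win probabilities for the two seats.
    """
    lowest_joint = max(0.0, p + q - 1.0)  # outcomes as opposed as possible
    highest_joint = min(p, q)             # outcomes move together as much as possible
    return lowest_joint - p * q, highest_joint - p * q


# Two toss-up seats have the widest possible range.
print(covariance_bounds(0.5, 0.5))  # (-0.25, 0.25)

# A seat that is certain leaves no room for covariance at all:
# both bounds are zero, up to floating-point rounding.
low, high = covariance_bounds(1.0, 0.3)
print(abs(low) < 1e-12, abs(high) < 1e-12)  # True True
```

The second example is why the bias-adjusted seats behave like fixed points: once a seat is certain, it can't co-vary with anything, however extreme the model.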