On probabilistic predictions

Betting markets can be used to generate probabilities of victory. We can then use these probabilities to make predictions. The process of estimating probabilities from betting markets is distinct from using them to make a prediction, and this seems to cause some confusion.

One criticism of using betting markets has followed this chain of argument:

a. In a previous election, betting markets were used.

b. The prices implied one party had a higher probability of winning.

c. That party did not win.

d. Ergo, the betting markets and the probabilities are useless.

It is entirely possible for the first three statements to be true and yet for the betting markets and their implied probabilities to be accurate.

One example:

a. You have a fair coin.

b. You know for certain that the probability of heads = probability of tails = 50%.

c. You know this information perfectly but still won't know for sure whether a head or a tail will come up on the next coin toss.

It is possible that the betting markets are perfectly priced, and the implied probabilities are fully accurate. But the result is still uncertain!

Making predictions is a two-stage affair. First, we try to model the range of possible outcomes and how likely each outcome is. Even if that modeling is fully accurate, it only tells us the probability of each outcome occurring. Second, we try to make inferences or predictions from this modeling. This stage is more subjective: based on the probabilities, we might feel confident in predicting a seat will go to the Coalition if the probability of Coalition victory is, say, 70%. Nevertheless, we would still expect the Coalition to lose this seat 30% of the time. The first stage may be correct (i.e., the estimated probabilities may be correct), and yet an individual prediction may still turn out to be wrong.

This is why arguments of the form “betting markets performed poorly/well in this limited set of elections” are bad arguments. We expect probabilistic predictions, like those from betting markets, to be wrong some of the time, depending on the probability! For instance, if the probability of Coalition victory is 51% in a particular seat, we expect a prediction of Coalition victory to be wrong 49% of the time. This is not a weakness of betting markets, but a property of probabilistic predictions. This may be perceived as a weakness; we see recognizing uncertainty in a highly uncertain situation, such as an election campaign, as a strength. The only way to test betting markets is to look at how they performed over many elections. Fortunately, some very smart people have had a go at this. There is always more research to be done, but they found betting markets tend to outperform polls and pundits.
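To make the arithmetic concrete, here is a minimal Python sketch. It assumes the probabilities are perfectly accurate and that we always predict the favourite in each seat. The seat probabilities are made-up illustrative values, not market data, and the function name `expected_misses` is our own.

```python
def expected_misses(win_probs):
    """Expected number of wrong calls if we always predict the favourite.

    win_probs: probability of Coalition victory in each seat.
    A prediction for a seat is wrong with probability 1 - max(p, 1 - p).
    """
    return sum(1 - max(p, 1 - p) for p in win_probs)


# Hypothetical probabilities of Coalition victory in four seats.
seat_probs = [0.51, 0.70, 0.90, 0.30]

misses = expected_misses(seat_probs)
print(f"Expected wrong predictions out of {len(seat_probs)}: {misses:.2f}")
```

Even with perfect probabilities, we would expect to get more than one of these four seats wrong on average, and most of that comes from the close seats.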

Next week, as the election draws near, Kaighin and I will be making our own predictions. The probabilistic nature of these predictions means we will expect some of them to be incorrect! Betting markets have their weaknesses, but even if they were perfect, we would expect this to be the case. Stay tuned for predictions next week!

Probability of victory vs vote share

We are going to be talking a lot about probabilities on this blog. These are very different to common quantities used in polling, like the two-party preferred, or other measures of vote share. It’d be good to look at how misinterpreting the probabilities can lead to wrong conclusions.

We get our data from the Sportsbet website. They have betting odds for all 150 electorates. For example, here is the link for the betting prices for each Victorian seat. From the betting odds, we can then calculate the probability of each party winning the seat.
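One common way to turn odds into probabilities is sketched below. It assumes decimal odds (the payout per $1 staked), takes the reciprocal of each price, and rescales so the probabilities sum to 1, which removes the bookmaker's margin. The odds shown are hypothetical, the function name `implied_probabilities` is our own, and this is a standard technique rather than necessarily the exact calculation behind the numbers on this blog.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds for one seat into probabilities that sum to 1.

    decimal_odds: dict mapping party name to its decimal odds.
    """
    # Raw implied probability is the reciprocal of the decimal odds.
    raw = {party: 1.0 / odds for party, odds in decimal_odds.items()}
    # Raw values usually sum to more than 1 (the bookmaker's margin),
    # so rescale them proportionally.
    total = sum(raw.values())
    return {party: value / total for party, value in raw.items()}


# Hypothetical prices for a single seat, not real Sportsbet data.
probs = implied_probabilities({"Coalition": 1.50, "ALP": 2.50})
for party, p in probs.items():
    print(f"{party}: {p:.3f}")
```

Proportional rescaling is the simplest way to handle the margin; other adjustments exist and can give slightly different probabilities.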

The wrong approach

The natural response would be to think that the task of making predictions is then very easy. Just count up the seats that the Coalition has a greater than 50% chance of winning and compare it to the number of seats the ALP has a greater than 50% chance of winning. This could then form the predicted number of seats each party will win based on the betting market, and the party with at least 76 seats (a majority of the 150) would be predicted to win government.

But this is wrong.

To demonstrate why, here is a simple example. Imagine there are 3 seats, so a majority of 2 is needed to win the election. Say the ALP has a 51% chance of winning each of seats 1 and 2, but a 0% chance of winning seat 3. The simple approach would predict the ALP to win 2 seats and the Coalition to win 1 seat. But this ignores the fact that there is a lot of uncertainty around who will win seats 1 and 2.

What is the right approach?

We know the ALP is definitely going to lose seat 3. The only way for it to win the election is to win both seats 1 and 2.

Probability of ALP winning the election = Pr(ALP >= 2 seats) = 0.51 * 0.51 = 26.01% (we can multiply because we assume the results in the two seats are independent).

But the Coalition can win the election in a number of ways. It has seat 3 locked up, so it can win the election by winning:

  1. seat 1 only; OR
  2. seat 2 only; OR
  3. both seats 1 and 2

The associated maths is:

  1. Probability of winning seat 1 only (i.e. losing seat 2) is 24.99% (0.49 * 0.51)
  2. Probability of winning seat 2 only (i.e. losing seat 1) is 24.99% (0.51 * 0.49)
  3. Probability of winning both seats 1 and 2 is 24.01% (0.49 * 0.49)

Summary of Results

Probability of Coalition winning the election = 73.99% (24.99% + 24.99% + 24.01%)

Pr(Coalition wins exactly 2 seats) = 49.98%

Pr(Coalition wins exactly 3 seats) = 24.01%
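The numbers in this 3-seat example can be checked by listing every possible outcome. The Python sketch below does that, assuming (as in the example) that seat results are independent. The function and variable names are our own.

```python
from itertools import product

# Probability of the ALP winning each seat in the 3-seat example.
alp_win_prob = [0.51, 0.51, 0.0]


def seat_count_distribution(probs):
    """Probability of winning exactly k seats, for k = 0..len(probs).

    Enumerates every win/lose combination, assuming independent seats.
    """
    dist = {k: 0.0 for k in range(len(probs) + 1)}
    for outcome in product([True, False], repeat=len(probs)):
        pr = 1.0
        for won, p in zip(outcome, probs):
            pr *= p if won else 1 - p
        dist[sum(outcome)] += pr
    return dist


alp_dist = seat_count_distribution(alp_win_prob)
pr_alp_majority = alp_dist[2] + alp_dist[3]
pr_coalition_majority = alp_dist[0] + alp_dist[1]

# The "wrong approach": count seats where the ALP is the favourite.
naive_alp_seats = sum(p > 0.5 for p in alp_win_prob)

print(f"Naive count of ALP seats: {naive_alp_seats}")
print(f"Pr(ALP wins election): {pr_alp_majority:.2%}")
print(f"Pr(Coalition wins election): {pr_coalition_majority:.2%}")
```

The naive count gives the ALP 2 of the 3 seats, while the enumeration shows the ALP winning the election only about a quarter of the time.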

What explains the difference between the two approaches? The wrong approach confuses probability of victory with the proportion of the vote won. Saying the ALP has a 51% chance of winning a seat is not the same as predicting they will win 51% of the vote. In fact, a party could have a 99% chance of victory, and only be expected to win 51% of the vote (since 51% is enough to win you the seat); or it could have a 1% chance of victory and be expected to win 49% of the vote. The two terms are only loosely related.

In practice, instead of having 3 seats with probabilities of the ALP winning of 51%, 51% and 0%, we have data for 150 seats and the corresponding probabilities. We then create a probability mass function (PMF), which gives the probability that the ALP (or the Coalition) will win a certain number of seats. Later posts will outline how we generate the PMF and what it currently looks like.
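As a rough illustration of what such a PMF involves, here is one standard way to build it in Python: add one seat at a time and update the probability of each seat total. This is only a sketch under the assumption that seats are independent, not necessarily the method described in the later posts, and `seat_pmf` is a name we made up for the example.

```python
def seat_pmf(win_probs):
    """PMF of the number of seats won, assuming independent seats.

    Returns a list where entry k is the probability of winning exactly
    k seats. Seats are added one at a time.
    """
    pmf = [1.0]  # with no seats counted, 0 seats won with certainty
    for p in win_probs:
        updated = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            updated[k] += mass * (1 - p)   # this seat is lost
            updated[k + 1] += mass * p     # this seat is won
        pmf = updated
    return pmf


# Check against the 3-seat example (ALP probabilities 51%, 51%, 0%).
pmf = seat_pmf([0.51, 0.51, 0.0])
for seats, prob in enumerate(pmf):
    print(f"Pr(ALP wins exactly {seats} seats) = {prob:.2%}")
```

With 150 seat probabilities, the same function returns 151 entries, and the probability of winning a majority would be `sum(pmf[76:])`. Real seats are unlikely to be fully independent, so this should be read as a simplification.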