Political forecasters convert national vote shares from polling into seat predictions. We do this at Electoral Calculus using our MRP (Multi-level Regression and Post-stratification) model, which has a strong track record of giving accurate seat predictions. But even the best predictions are subject to random error, caused by sampling error in the opinion polling or by model error.
This means that each seat prediction is not 100% definite, but is really a set of possible outcomes with likelihoods attached. Some outcomes are more likely than others, and the uncertainty around them can be important.
There are two main ways to handle these uncertainties and to present seat totals to the public – which we call "Seat winners" and "Expected seats". It can be valuable to understand the difference between these two approaches, because they can lead to variations in forecasts, even when using the same polling data. Although both approaches are statistically valid and have good records of accuracy, we outline why we prefer to use "Seat winners".
The first method is the one that Electoral Calculus usually shows. For each seat, we take the expected votes cast for each party, averaging over any random uncertainty, and declare the party with the largest number of expected votes to be the 'seat winner'. We then add up the number of seats won by each party to get the overall seat totals.
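As a rough illustration of the seat-winner calculation, here is a minimal Python sketch. The seat names, parties and expected vote shares are invented for the example and are not taken from our forecast; the function name `seat_winners` is likewise hypothetical.

```python
from collections import Counter

# Hypothetical expected vote shares (percent) in three made-up seats,
# already averaged over random uncertainty.
expected_votes = {
    "Seat A": {"LAB": 42.0, "CON": 35.0, "LIB": 23.0},
    "Seat B": {"LAB": 30.0, "CON": 31.0, "LIB": 39.0},
    "Seat C": {"LAB": 45.0, "CON": 44.0, "LIB": 11.0},
}


def seat_winners(votes_by_seat):
    """Return the party with the largest expected vote in each seat."""
    return {
        seat: max(shares, key=shares.get)
        for seat, shares in votes_by_seat.items()
    }


winners = seat_winners(expected_votes)

# Add up the seats won by each party to get the overall seat totals.
totals = Counter(winners.values())
print(winners)
print(dict(totals))
```

Note that Seat C counts as one whole Labour seat, even though the expected margin is only one point.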
The second method is to look at the likelihood of each party winning in a seat, and assign that fraction of the seat to them. For example, if a party has a 60% chance of winning the seat, we say that they have won 0.6 expected seats. We then add up the expected seats won by each party to get a different set of overall seat totals.
Similarly, if a party has a 20% chance of winning in each of 100 seats, they will get 20 expected seats. This probabilistic model may sound quite familiar to football fans, as expected goals (or xG) is a metric that works in the same way.
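The expected-seats sum can be sketched in a few lines of Python. The win probabilities below are made up to match the 20%-in-100-seats example, and the party labels and the function name `expected_seats` are assumptions for illustration only.

```python
from collections import defaultdict


def expected_seats(win_probs_by_seat):
    """Sum each party's win probability across all seats."""
    totals = defaultdict(float)
    for seat_probs in win_probs_by_seat:
        for party, prob in seat_probs.items():
            totals[party] += prob
    return dict(totals)


# Hypothetical input: 100 seats where party "X" has a 20% chance of winning
# and a single rival "Y" has the remaining 80%.
seats = [{"X": 0.2, "Y": 0.8} for _ in range(100)]

totals = expected_seats(seats)

# Round for display, since floating-point sums are not exact.
print({party: round(total, 1) for party, total in totals.items()})
```

Party X is not the favourite in any single seat here, yet it is still credited with about 20 expected seats.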
The election night Exit Poll, which will be published at 10pm on 4 July, uses expected seats.
But expected seats have a couple of strange behaviours. The number of expected seats will usually have decimal places and not be a whole number, such as 7.4 seats. (Though sometimes the number is rounded to disguise that.) And you can't see a list of the expected seats that a party will win, because there is a long list of seats that they might win, but many are uncertain.
Applying both methods to our own forecast, Labour would be allocated 424 expected seats, but is the seat winner in 461 seats. Our headline seat totals count the (sole) seat winner in every seat, rather than expected seats. The gap arises because there are a large number of seats which Labour is likely, but not certain, to win: these count as full seats under the seat-winner method, but only as partial seats under the expected-seats method.
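To see how such a gap can arise, consider this small Python sketch. The ten probabilities are invented, and it makes two simplifying assumptions that are not part of our model: each seat is a two-party contest, and the party with the higher win probability is also the one with more expected votes.

```python
# Hypothetical probabilities of Labour winning each of ten made-up seats.
labour_win_prob = [0.95, 0.9, 0.8, 0.7, 0.7, 0.6, 0.6, 0.55, 0.4, 0.1]

# Seat-winner count: under the two-party assumption, Labour is the predicted
# winner wherever its win probability is above 50%, and gets the full seat.
seat_winner_total = sum(p > 0.5 for p in labour_win_prob)

# Expected seats: Labour gets only the fractional probability in every seat.
expected_total = sum(labour_win_prob)

print(seat_winner_total, round(expected_total, 1))
```

Eight of the ten seats are likely Labour wins and count in full under the seat-winner method, but the same seats contribute only fractions to the expected-seats total, which comes to roughly 6.3.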
The expected seats method is a statistically valid approach to election forecasts, and has led to many accurate election forecasts in the past. The election-night exit poll, which has a good track record of predicting general elections since 2005, uses expected seats.
However, under certain circumstances, this approach can run into problems. This is because seat-by-seat results in general elections can be correlated, whereas expectations are at their most reliable as predictors when dealing with independent events. For example, if you were to roll a die 600 times, you would expect to roll a six approximately 100 times. This is a generally reliable prediction because every roll of the die is an independent event; rolling a six once has no bearing on what the next roll will be, and the chance of rolling any given number is equal (1/6, or about 16.67%).
For the large number of Conservative-Labour marginals, this approach makes sense. Whether or not a party wins a marginal seat depends on a number of hard-to-predict factors such as the strength of the candidate, the dedication of the campaign team, and which local issues happen to resonate most with people. If there are 100 such marginal seats where the Conservatives have a 40% chance of winning, then it's sensible to show them winning 40 expected seats.
However, general elections are different from rolls of a die or xG, as results in each seat are not always independent events. If a party does well nationally, its probability of winning in each seat goes up accordingly, so the outcomes between seats are linked. For example, if Reform UK have a 20% chance of winning in each of 40 seats, a probabilistic model will predict them to win 8 seats (0.2 × 40). However, the statistical variation in party performance happens on a national level, as well as on a seat-by-seat basis. Therefore, if Reform UK were to do very well nationally, it is likely that they would win closer to 40 seats. Conversely, if they were to underperform nationally, it is more likely that they would win next to no seats. In both cases, the prediction of 8 seats would be incorrect, so the expected-seats figure isn't a good predictor in either potential scenario.
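A small simulation can make this concrete. The Python sketch below is an illustration under stated assumptions, not our forecasting model: each seat is won when a score, made up of a shared national component and a seat-specific local component, exceeds a threshold chosen so that every seat has a 20% win chance. The correlation value of 0.8 and all function names are hypothetical.

```python
import random
from statistics import NormalDist, mean, pstdev

SEATS = 40
WIN_PROB = 0.2
# Threshold such that a standard normal score exceeds it 20% of the time.
THRESHOLD = NormalDist().inv_cdf(1 - WIN_PROB)


def simulate_seats_won(rho, rng):
    """Simulate one election; rho is the assumed share of variance that is national."""
    national = rng.gauss(0, 1)  # shared by every seat in this election
    won = 0
    for _ in range(SEATS):
        local = rng.gauss(0, 1)  # independent, seat-specific factors
        score = (rho ** 0.5) * national + ((1 - rho) ** 0.5) * local
        if score > THRESHOLD:
            won += 1
    return won


def run(rho, trials=5000, seed=1):
    rng = random.Random(seed)
    return [simulate_seats_won(rho, rng) for _ in range(trials)]


def share_near_expectation(results, target=8, tolerance=2):
    """Fraction of simulated elections landing within `tolerance` seats of `target`."""
    return sum(abs(r - target) <= tolerance for r in results) / len(results)


independent = run(rho=0.0)  # dice-like: seats do not move together
correlated = run(rho=0.8)   # assumed strong national component

for label, results in [("independent", independent), ("correlated", correlated)]:
    print(label, round(mean(results), 1), round(pstdev(results), 1),
          round(share_near_expectation(results), 2))
```

In both cases the average across simulations is close to 8 seats, but with a strong national component the individual outcomes spread out much more widely, so far fewer simulated elections land near 8.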
The 'Seat winners' approach is more straightforward, and easier to understand and explain. This approach worked well for us in the 2019 general election, where we gave the most accurate pre-poll prediction. With our approach, there is a single predicted winner for each seat, even if the seat is marginal. And if we say that Reform will win three seats, there will be three specific seats which we predict that Reform will win.
In elections, as in football, it's the actual goals in the back of the net which count, not hopes for what might have been.