🔮The Underrated Chances of a Democratic Trifecta
A conversation with Josh Taft of Dark Horse Politics
🔮This is the final installment of our pre-election interview series on election modeling with Josh Taft of the Dark Horse Politics Substack.
TL;DR: Josh’s model was born out of dissatisfaction with the polling landscape ahead of the 2022 midterms, which led him to search for alternative indicators such as primary votes, special elections, and generic ballot polling. His model for 2024 shows a D+2.9 environment heading into Election Day. This interview has been edited for length. All answers are his own.
The Oracle: How did you get started in modeling?
I started in the 2022 election when everyone was calling for a red wave. There was some polling to back this up, but I was looking at primary and special election data, and it just wasn’t pointing to a red wave. I thought there must be some other way to forecast elections since polling can be off, as we saw in 2022, 2020, and 2016.
The Oracle: What’s the problem with polling today?
There are probably ten different issues that wouldn't be horrible on their own, but together they cause a lot of problems. For example, many polls use recalled vote to balance their partisan samples to match the last election. The idea is that if they can get the ‘correct’ number of Trump and Biden voters in their sample, it helps ensure they have the right number of Republicans and Democrats. But the problem is that this pushes the sample toward the last election, when you are actually trying to understand what has changed since then.
Another big problem is pollsters’ fear of underestimating Republicans. They are terrified, rightfully so, of making the same mistake in two presidential elections in a row. If they underestimate Democrats, there's a lot less downside for them. We even saw a preview of this in 2022, where pollsters had underestimated Democrats in various races to little detriment.
The Oracle: Why were you skeptical of the polls in 2022?
With Roe having just been overturned, you could see a lot of energy in the special elections in Nebraska and Minnesota. With so much Democratic energy, a red wave just wasn’t in the cards. Not that Republicans couldn't win, but it wouldn't be a 2018-style tsunami. Also, the Washington primary that had just happened showed good results for Democrats, and this usually has good predictive power.
The Oracle: Isn’t that what modelers like Nate Silver are doing? Using the polls plus other data?
Even when you try to control for the flood of Republican pollsters, there's only so much you can do. There was a really egregious example of this in the 2022 Washington Senate race: even non-partisan polls way overestimated Republicans, and the election result ended up being plus 15 for the Democrat. So it's not just a Republican polling problem; it's a non-partisan polling problem too.
If both types of polls are unreliable, you need something that's completely outside of polling. That's why I started looking at other indicators, something to be a check on polling.
The Oracle: What are the inputs to your model? And how did you pick them?
When I started in 2022, I took out a sheet of paper and just made a list of anything that might be predictive. These included primary data, voter registration, polling, fundraising, gas prices, GDP, inflation, unemployment, income, and economic sentiment.
I also looked at favorability of the candidates and the number of candidates in congressional races, basically any quantitative data that I could find. Then I tested to see which had the most predictive power. I ended up throwing most of these out and settling on the three main inputs in the model today.
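A minimal sketch of what that screening step could look like, assuming a simple backtest against past national margins. The indicator readings below are made-up placeholders and the "actual" margins are only approximate, so this is an illustration of the idea rather than a reconstruction of Josh's process.

```python
# Hypothetical sketch of the screening step: score each candidate indicator by
# how closely its implied national margin matched past results. Indicator
# readings are placeholders; the "actual" margins are only approximate.

candidate_indicators = {
    # indicator -> {cycle: implied national margin, D minus R, in points}
    "primary_turnout":   {2018: 7.5, 2020: 3.0, 2022: -1.5},
    "special_elections": {2018: 9.0, 2020: 5.5, 2022: -4.0},
    "generic_ballot":    {2018: 7.0, 2020: 6.8, 2022: -2.5},
    "gas_prices":        {2018: 2.0, 2020: -1.0, 2022: -6.0},
}

actual_margin = {2018: 8.6, 2020: 4.5, 2022: -2.8}  # approximate national results

def mean_abs_error(implied: dict) -> float:
    """Average absolute miss versus the actual national margin."""
    return sum(abs(implied[y] - actual_margin[y]) for y in actual_margin) / len(actual_margin)

# Rank candidate inputs by historical accuracy; keep the best, drop the rest.
for name, implied in sorted(candidate_indicators.items(), key=lambda kv: mean_abs_error(kv[1])):
    print(f"{name:17s} mean abs. error = {mean_abs_error(implied):.1f} pts")
```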
The Oracle: Can you walk us through the inputs?
The first one is data on the primaries, which is about 33% of the model’s weight. I split this data in half: participation in the national primaries—how many people voted in the Democratic primaries versus the Republican primaries—and the Washington state jungle primary.
The Oracle: How does that work when there wasn’t a real Democratic primary this cycle?
I weight the data to account for uncontested primaries, like this cycle where Biden ran uncontested. Ultimately, I see this as a gauge of enthusiasm, and a way to measure what kind of environment we will see in November.
In this cycle, we did see a lot more votes than expected on the Democratic side, probably because of the unaffiliated movement. However, in retrospect, I think if my model fails in any area, this is likely to be it, because it contains a lot of data from before Harris was the candidate.
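Josh doesn't spell out the adjustment, but a minimal sketch of an enthusiasm gauge that scales up an uncontested side's turnout might look like the following. The vote totals and the 0.6 turnout factor are assumptions for illustration, not his numbers.

```python
# Hypothetical sketch of the primary-participation enthusiasm gauge, with an
# adjustment for an uncontested primary. Totals and the turnout factor are
# illustrative assumptions, not the model's actual inputs.

def enthusiasm_margin(dem_votes: float, rep_votes: float,
                      dem_contested: bool, rep_contested: bool,
                      uncontested_turnout_factor: float = 0.6) -> float:
    """Implied D-minus-R gap (points) from primary participation.

    uncontested_turnout_factor is the assumed fraction of normal turnout that an
    uncontested primary still draws; the uncontested side's votes are scaled up
    by its inverse so the comparison isn't distorted.
    """
    d = dem_votes if dem_contested else dem_votes / uncontested_turnout_factor
    r = rep_votes if rep_contested else rep_votes / uncontested_turnout_factor
    return 100.0 * (d - r) / (d + r)

# Illustrative 2024-style call: Biden ran essentially uncontested while the
# Republican primary was contested. The totals are placeholders.
print(round(enthusiasm_margin(18_000_000, 22_000_000,
                              dem_contested=False, rep_contested=True), 1))
```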
The Oracle: So you’re a Washington primary truther?
Yes, the second half of my model’s primary component comes from the Washington primary. Despite being a blue state, Washington has proven to have a more accurate track record than any other state. This is partly because it’s a jungle primary, where candidates from all parties appear on a single ballot. That makes it a lot easier to capture cross-party appeal: a Democrat can vote for a Republican if they're moderate enough, and vice versa.
You have to shift it rightward by around 15 or 17 points, because it's a blue state, but once you do that it almost always ends up around the final result. For example, it showed that Clinton wouldn’t do well in 2016, when many were predicting a landslide win.
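As a hedged worked example of that shift: the Washington shares below are placeholders, and the 15-17 point range is his rough rule of thumb rather than a published constant.

```python
# Worked example of the rightward shift on the Washington jungle-primary margin.
# The shares below are placeholders, not real results.

wa_dem_share = 57.0   # combined share of all Democratic candidates (assumed)
wa_rep_share = 43.0   # combined share of all Republican candidates (assumed)
wa_margin = wa_dem_share - wa_rep_share   # D+14 in this example

for shift in (15.0, 17.0):
    implied_national = wa_margin - shift
    print(f"shift {shift:.0f} pts -> implied national margin {implied_national:+.1f}")
```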
The Oracle: Where are your polling averages from? Do you adjust them in any way?
The generic ballot polling, which is about 62% of the model’s weight, is taken straight from RealClearPolitics and FiveThirtyEight. Both have done amazingly well in the past. I switched in 2018 from RCP to 538 because I found 538 to be more accurate. In recent years RealClearPolitics seems to have a growing bias towards Republicans, which is unfortunate given their archive of data.
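A minimal sketch of that generic-ballot component, assuming a straight average of the two aggregators; the readings below are placeholders, not current numbers.

```python
# Minimal sketch: the generic-ballot input as a straight average of two public
# aggregators. Values are placeholders, not current readings.

fte_average = 1.2   # FiveThirtyEight generic-ballot average, D minus R (assumed)
rcp_average = 0.3   # RealClearPolitics generic-ballot average (assumed)

# Josh says he takes these as published; a modeler worried about a house effect
# could subtract an estimated lean from a source before averaging.
generic_ballot_component = (fte_average + rcp_average) / 2
print(f"generic-ballot component: D{generic_ballot_component:+.2f}")
```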
The Oracle: And lastly, the special elections?
This one carries the least weight in the model, just about 3%. This is because in some elections it ends up being very accurate, but when it’s off, it’s way off. This indicator looks at the performance of Democrats and Republicans in special elections, compares it to how each district voted in the previous presidential election, and then takes that shift and applies it to the national popular vote.
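Putting the pieces together, a hypothetical sketch of the special-elections indicator and of how the three components could be blended. The district results and the other component values are placeholders; the 33/62/3 weights are the rough figures quoted in this interview, normalized here since they don't quite sum to 100.

```python
# Hypothetical sketch of the special-elections indicator and the weighted blend
# of the three components. All placeholder data, not the model's real inputs.

# (special-election margin, same district's prior presidential margin), D minus R
specials = [
    (+4.0, -2.0),    # placeholder district A
    (-1.0, -8.0),    # placeholder district B
    (+12.0, +6.0),   # placeholder district C
]

# Average shift versus the presidential baseline, applied to the prior national
# popular vote (Biden +4.5 in 2020).
avg_shift = sum(special - pres for special, pres in specials) / len(specials)
specials_component = 4.5 + avg_shift

primary_component = 2.0          # placeholder output of the primary indicators
generic_ballot_component = 0.75  # placeholder generic-ballot average

weights = {"primaries": 33, "generic_ballot": 62, "specials": 3}
national_margin = (
    weights["primaries"] * primary_component
    + weights["generic_ballot"] * generic_ballot_component
    + weights["specials"] * specials_component
) / sum(weights.values())
print(f"implied national environment: D{national_margin:+.1f}")
```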
The Oracle: What does your model say right now?
Right now my model says that the national environment is Harris+2.9. It doesn’t make state-by-state predictions, but you could say it’s showing anything from a narrow Harris win with exactly 270 electoral votes to a scenario where she wins all seven of the swing states. It also says that Trump could win, just like he almost won in 2020 with Biden ahead by four and a half. Basically, it means Harris is narrowly favored.
On a personal level, I’d say there's a fairly good chance Democrats could be underestimated. Because of that, I'd put a bit more weight on Harris winning, like I could see her winning all seven states.
The Oracle: On Polymarket, what’s the purest way to express your position? A Harris win of the popular vote?
Yes, it would definitely match up with the popular vote. Even without my model, you could probably give Harris the popular vote, given how much of an edge Democrats have had in the popular vote in the 21st century; they've won it in every election except 2004.
For the electoral college, I would put probably 55% odds on Harris winning based on the model, but that contains a wide range of outcomes. It could be 270 electoral votes for Harris or it could be 319, which would be all the 2020 states plus North Carolina.
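The 55% figure is Josh's own judgment rather than a formula, but a simple way to see how a D+2.9 environment becomes only a modest electoral-college favorite is a single normal error term. The 2.5-point gap between the national vote and the tipping-point state, and the 4-point error standard deviation, are illustrative assumptions that happen to land the answer in the mid-50s.

```python
# Minimal sketch, not Josh's method: translate a D+2.9 national environment into
# a rough electoral-college probability with one normal error term. The ec_gap
# and error_sd values are assumptions for illustration.

from statistics import NormalDist

national_margin = 2.9   # model's national environment, D minus R, points
ec_gap = 2.5            # assumed: tipping-point state runs ~2.5 pts right of the national vote
error_sd = 4.0          # assumed combined national + state error, points

tipping_point = NormalDist(mu=national_margin - ec_gap, sigma=error_sd)
p_harris_ec = 1 - tipping_point.cdf(0)   # chance the tipping-point margin stays above zero
print(f"P(Harris wins the electoral college) ~ {p_harris_ec:.0%}")
```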
The Oracle: Does the model tell us anything about the House?
Yes, if Democrats are up 2.9, and the House popular vote ends up being similar, then I would say Democrats are pretty good favorites to win the House; I’d put that at 60-65%. It would be similar to their 2020 performance.
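Josh doesn't describe a seat-level House model, but a standard way to get from a national margin to a majority probability is uniform swing over district baselines. Everything below is synthetic, including the assumed 2-point Republican tilt at the median seat; with these inputs it lands near his 60-65%, but that is driven entirely by the assumptions.

```python
# Minimal sketch, not Josh's model: translate a national House margin into a
# majority probability via uniform swing. District baselines are a synthetic
# normal spread with an assumed 2-point Republican tilt at the median seat.

import random
from statistics import NormalDist

random.seed(0)

# 435 synthetic district baselines (D minus R, points), median seat at D-2.
district_dist = NormalDist(mu=-2.0, sigma=18.0)
baselines = [district_dist.inv_cdf((i + 0.5) / 435) for i in range(435)]

def dem_seats(national_margin: float) -> int:
    """Seats Democrats win if every district shifts uniformly by national_margin."""
    return sum(1 for b in baselines if b + national_margin > 0)

# Simulate error around the model's D+2.9 national environment.
sims = 10_000
wins = sum(1 for _ in range(sims) if dem_seats(random.gauss(2.9, 3.0)) >= 218)
print(f"P(Democratic House majority) ~ {wins / sims:.0%}")
```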
The Oracle: This will probably come out right before the election. Last chance to get your hottest takes on the record.
My hottest take is that there's going to be a polling error in favor of Democrats, but this has nothing to do with my model.
The other one is that a trifecta, where one party controls both houses of Congress plus the presidency, is a lot more likely than people think. On Polymarket the current odds are 40% for a Republican sweep and 14% for a Democratic sweep. I do think Polymarket overestimates Republicans, but not to a huge extent; I think this is due to the belief that polling could underestimate Trump again, which is somewhat reasonable.
But if you believe my model, which says we’re in a Dem +2.9 environment, and then you look at the Senate races – which all have the Democrats borderline romping it and overperforming Harris, including in the very tight Senate race in Ohio – and then translate that national edge to the House, you can see all the pieces are in place for a Democratic sweep. The current Polymarket odds say 14%, but I’d put it at 20-25%. So still unlikely, but you can see how it could happen.
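The intuition that sweeps are likelier than independent odds suggest comes from all three contests sharing the same national environment. A minimal Monte Carlo sketch, with assumed thresholds and error sizes that are not Josh's numbers, makes the point by comparing the naive independent product with the correlated sweep probability.

```python
# Minimal sketch of why sweeps are likelier than independent odds suggest: all
# three contests share one national environment. Thresholds and error sizes are
# illustrative assumptions.

import random

random.seed(1)
sims = 100_000

# Assumed D-minus-R national margins Democrats need to carry each branch.
NEED_PRESIDENCY = 2.0   # EC tipping point assumed ~2 pts right of the popular vote
NEED_HOUSE = 2.0        # median House seat assumed ~2 pts right of the popular vote
NEED_SENATE = 5.0       # a tough map, despite strong individual Democratic candidates

pres_wins = house_wins = senate_wins = sweeps = 0
for _ in range(sims):
    environment = random.gauss(2.9, 3.0)            # shared national environment
    pres = environment + random.gauss(0, 1.5) > NEED_PRESIDENCY
    house = environment + random.gauss(0, 1.5) > NEED_HOUSE
    senate = environment + random.gauss(0, 2.0) > NEED_SENATE
    pres_wins += pres
    house_wins += house
    senate_wins += senate
    sweeps += pres and house and senate

p_pres, p_house, p_senate = (x / sims for x in (pres_wins, house_wins, senate_wins))
print(f"independent product: {p_pres * p_house * p_senate:.0%}")
print(f"correlated sweep:    {sweeps / sims:.0%}")
```

Because all three outcomes rise and fall with the shared environment, the correlated sweep probability comes out well above the product of the marginals, which is the mechanism behind the 20-25% estimate.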
Disclaimer
Nothing in The Oracle is financial, investment, legal or any other type of professional advice. Anything provided in any newsletter is for informational purposes only and is not meant to be an endorsement of any type of activity or any particular market or product. Terms of Service on polymarket.com prohibit US persons and persons from certain other jurisdictions from using Polymarket to trade, although data and information is viewable globally.