Reflections on My MLB Model’s 2025 Season

Last updated: January 6 2026 – these figures summarise how my model performed during the 2025 Major League Baseball season.

I built my MLB model to estimate win probabilities and identify value in baseball betting.  It uses historical data, player statistics and situational variables to forecast game outcomes.  Throughout the 2025 season I tracked performance in three ways: overlay picks (bets where the model’s probability was substantially higher than the bookmaker’s implied odds), picks where the model’s projected win probability was at least 60 %, and picks where it was at least 70 %.  In this post I’ll review those numbers, share what looks promising and be candid about the shortcomings.

Understanding the break‑even bar

In sports betting, it’s not enough to win half your wagers.  Because bookmakers add a vig (often −110 odds on spreads or totals), a bettor must hit roughly 52.38 % of bets just to break even.  This benchmark applies across sports, including baseball.  Throughout this discussion I’m looking for win rates that clear that break‑even threshold when betting standard −110 lines.  As explained below, though, the break‑even percentage drops when the odds are positive (plus money).
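The break‑even rate falls straight out of the American‑odds payout structure.  As a sketch, here is how I compute it (the function name is my own; the arithmetic is standard):

```python
def breakeven_prob(american_odds: int) -> float:
    """Win rate needed to break even at the given American odds.

    Negative odds (favorites): risk |odds| to win 100.
    Positive odds (underdogs): risk 100 to win |odds|.
    """
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

print(round(breakeven_prob(-110), 4))  # 0.5238
print(round(breakeven_prob(+150), 4))  # 0.4
```

At −110 you risk 110 to win 100, so you must win 110/210 ≈ 52.38 % of the time; at +150 you risk 100 to win 150, and the bar drops to 40 %.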

Overlay picks

Overlay bets are those where my model’s estimated win probability was higher than the odds implied by the bookmaker — in other words, value bets.  Over the course of the season I flagged 275 overlay opportunities, of which 140 won, yielding a 50.91 % win rate.

It’s important to remember that win rate alone doesn’t tell the full story with overlays.  These plays are often plus‑money underdogs, which means the break‑even win percentage can be much lower than the 52.4 % threshold used for −110 bets.  For example, Betting Kings notes that consistently betting games at +150 odds requires winning only about 40 % of the time to break even, and a +130 line has a break‑even rate around 43.5 %.  Without knowing the average price of my overlay bets, I can’t say for certain whether 50.91 % is profitable.  In any case, it isn’t automatically disappointing just because it sits below 52.4 %.
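To make the overlay definition concrete, here is a minimal sketch of the comparison involved.  The 3‑point margin is an arbitrary illustration, not my actual threshold, and the implied probability here still includes the vig:

```python
def implied_prob(american_odds: int) -> float:
    """Bookmaker's implied win probability (vig still included)."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100)
    return 100 / (american_odds + 100)

def is_overlay(model_prob: float, american_odds: int, margin: float = 0.03) -> bool:
    """Flag a bet when the model's probability beats the bookmaker's
    implied probability by at least `margin` (an illustrative choice)."""
    return model_prob - implied_prob(american_odds) >= margin

# A +150 underdog implies ~40 %; a model estimate of 46 % clears a 3-point margin.
print(is_overlay(0.46, 150))  # True
```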

A closer look at the last couple of weeks of the season (10 October – 17 October) shows eight overlay picks and five winners (62.5 %), but the sample size is too small to draw conclusions.

The season‑long numbers tell me that my overlay criteria — how I define “value” relative to the betting market — need refinement.  I plan to review not just win rates but expected value, combining the model’s probabilities with the actual prices available, to determine whether these plays are truly +EV.  I’ll be revisiting the overlay algorithm before next season.
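The EV calculation I have in mind looks roughly like this — expected profit per unit staked, using the model’s probability and the actual price (the function names are mine):

```python
def profit_per_unit(american_odds: int) -> float:
    """Profit on a 1-unit stake if the bet wins."""
    return 100 / -american_odds if american_odds < 0 else american_odds / 100

def expected_value(model_prob: float, american_odds: int) -> float:
    """Expected profit per unit staked, taking the model's probability at face value."""
    return model_prob * profit_per_unit(american_odds) - (1 - model_prob)

# The same 50.91 % win rate is comfortably +EV at +150 but -EV at -110:
print(expected_value(0.5091, 150))   # positive
print(expected_value(0.5091, -110))  # negative
```

This is why the overlay record is ambiguous: without the average price attached to those 275 bets, the 50.91 % win rate could sit on either side of zero EV.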

Predictions with ≥60 % win probability

Across all games where the model’s predicted win probability was 60 % or higher, there were 666 such predictions, and 353 were correct, producing a 53.00 % accuracy rate.  While 53 % edges just above the 52.4 % break‑even mark, the margin is thin.  In practical terms, this means that blindly betting every ≥60 % prediction might yield only a negligible profit once vig and variance are accounted for.
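To show just how thin that margin is, here is a flat‑stake back‑of‑the‑envelope calculation, assuming (which the records don’t confirm) that every pick was available at exactly −110:

```python
# Flat 1-unit bets at -110 on every >=60 % pick: 666 bets, 353 winners.
wins, losses = 353, 666 - 353          # 353 wins, 313 losses
profit = wins * (100 / 110) - losses   # each winner returns 100/110 units
roi = profit / 666
print(round(profit, 1), f"{roi:.2%}")  # ~7.9 units, ~1.2% ROI
```

Roughly eight units of profit on 666 units risked — real but negligible, and easily erased if the average price was worse than −110.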

I’m encouraged that the model’s moderate‑confidence picks did not underperform, but there’s room for improvement.  Increasing the accuracy here even by a percentage point or two could materially improve profitability, so I will focus on feature enhancements and calibrating probabilities.

Predictions with ≥70 % win probability

The highest‑confidence tier had 291 predictions with win probabilities of 70 % or more.  Out of these, 157 were winners, giving a 53.95 % win rate.  This is slightly better than the ≥60 % group, but again the margin above break even is slim.  Intuitively, I would expect my top‑confidence picks to win at a much higher clip than moderate ones; that they barely outperform the 60 % cohort suggests the confidence calibration isn’t as sharp as I’d like.
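The calibration check I want to run amounts to comparing predicted versus observed win rates per confidence bucket.  A minimal sketch, with an illustrative data shape of `(predicted_prob, won)` pairs rather than my actual pick log:

```python
def calibration_buckets(picks, edges=(0.6, 0.7, 1.01)):
    """Compare mean predicted probability with the observed win rate
    in each bucket.  `picks` is a list of (prob, won) pairs, won in {0, 1}.
    Returns (lo, hi, count, mean_predicted, observed_win_rate) per bucket."""
    report, lo = [], edges[0]
    for hi in edges[1:]:
        bucket = [(p, w) for p, w in picks if lo <= p < hi]
        if bucket:
            mean_pred = sum(p for p, _ in bucket) / len(bucket)
            win_rate = sum(w for _, w in bucket) / len(bucket)
            report.append((lo, hi, len(bucket), mean_pred, win_rate))
        lo = hi
    return report
```

A well‑calibrated model would show the observed win rate tracking the mean predicted probability in each bucket; my season numbers (53–54 % observed against 60–70 %+ predicted) suggest it does not.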

Late‑season sample

From October 4 through October 17, my records show six high‑confidence picks (≥70 %), of which three won.  That 50 % result underscores the variability in small samples and reminds me not to overreact to short streaks.  The season‑long totals provide a more reliable gauge.

Key takeaways and plans

  1. Overlay performance needs EV analysis. A 50.91 % win rate across 275 overlay bets might still be profitable if those bets were mostly at plus‑money prices; break‑even percentages drop below 50 % for positive odds.  I need to analyse the average odds and actual returns on these plays before concluding they are weak.
  2. Moderate confidence picks are barely profitable. Predictions with estimated probabilities ≥60 % hit 53.00 %, slightly above the −110 break‑even benchmark.  It’s encouraging that this group isn’t losing money, but the edge is minimal.  Improving the model’s feature set or adjusting for season‑specific factors could push this higher.
  3. High confidence picks need recalibration. Even at ≥70 % probabilities, the win rate is only 53.95 %.  That tells me the model may be overconfident or underestimating the vig’s effect.  I plan to examine why my “sure bets” aren’t translating into a bigger advantage.
  4. Sample sizes matter. The late‑season breakdown shows how a handful of games can skew perception.  While short‑term performance can be exciting or frustrating, only season‑long datasets (hundreds of games) provide meaningful insight.  I will continue to monitor the model’s performance daily but evaluate changes based on larger samples.
  5. Refinement is essential. Baseball is a high‑variance sport with long seasons.  That my model is hovering around break even means I’m on the right track but haven’t yet unlocked a consistent edge.  Before the 2026 season I plan to incorporate additional variables (pitcher form, bullpen usage, weather conditions) and revisit probability calibration to better differentiate between moderate and high‑confidence spots.

Closing thoughts

Building a predictive model for MLB is challenging due to the sport’s variability and the vig imposed by bookmakers.  In 2025 my model generated a small edge on moderate and high‑confidence picks but produced ambiguous results on overlays.  I’m committed to improving the model for the 2026 season by refining how I define and evaluate overlays (using full EV rather than win rate alone), calibrating confidence thresholds, and integrating new data sources.  While the current results are only marginally profitable at best, they provide a clear roadmap for improvement.