Beating the Big Numbers on High Rollers

Trebek and Lee. They're cops!


When you think about dice and game shows, the show you probably think of first is the Heatter-Quigley show High Rollers, hosted by Alex Trebek and his awesome 70’s fro for two separate runs on NBC, and then again by Wink Martindale in 1987. Through every incarnation of the show, the bonus round remained the same: the Big Numbers.  Based on the old gambling game Shut the Box, it heavily relied on luck, but also had some strategy in how you played it.  It also seemed extremely difficult, as players rarely walked away winners.  We wondered if the difficulty of the game was due to poor strategy, or if the game was simply stacked against the contestant.  To answer that question, we first had to figure out just what the optimal strategy should be.

The rules are simple enough. The contestant is faced with the numbers 1 through 9.  In order to win, they must eliminate every number, which they do by rolling two dice.  After each roll, the contestant chooses from their remaining numbers either a single number or a combination of numbers that adds up to the total rolled; those numbers are then eliminated from the board.  If the contestant manages to eliminate all nine numbers before rolling a total that cannot be matched, they win the grand prize.

During the course of the game, the contestant can also earn “insurance markers”. Every time they roll doubles, they earn an insurance marker, which essentially gives the contestant an extra life: if they roll a total which they cannot match, they may spend a marker and roll again.  Contestants can use insurance immediately, even on the same roll that earned them the insurance.

The Big Numbers board in the 1987 revival, in all its glorious 80s-ness.


Sometimes you can only match a roll one way.  But most of the time, especially in the first few rolls, you have several ways to match a total with the numbers remaining.  For example, there are 12 possible ways of matching a roll of 12 on the first roll!  Which of these 12 would give you the best chance of winning the game?

We decided to take a brute force approach to evaluating this game.  Since there are nine different numbers, and each number has two states (either still on the board or removed), there are 512 (2 to the 9th power) possible combinations of numbers that you could be left with at some point during the game.  While that would be a large number to work out by hand, luckily we can make the computer do most of the heavy lifting for us.

First, we can trivially work out the probabilities of every situation where there are no choices to be made.  For example, consider the situation where only the number 7 is lit up and we have no insurance markers.  The only way to win in this case is to roll a 7. Seasoned gamblers know that the chance of rolling a 7 on a pair of dice is ⅙, or 16.67%.  However, we have to remember that not all non-seven rolls will lose the game: a roll of doubles grants an insurance marker that we can immediately cash in, which puts us right back where we started.  That means that of the 36 possible rolls, 6 are winners, 6 let us roll again, and 24 are losers.  Since the re-rolls simply repeat the same situation, the chances of winning are 6 / 30, or 20%.

Once we have these easy cases calculated, we can start working on the cases where we have a choice.  We used an iterative process for this.  We made up a list of all number combinations that we could not evaluate as above, and looked at each of them.  If we came up with a case where every option’s chances of winning had not yet been determined, we skipped it for the moment.  If we could figure out every choice’s chances of victory, we could then figure out which option would provide the greatest chance of victory, and then determine what the overall chances of winning were from that configuration of numbers.

For example, say we have the numbers 3, 4, and 7 left on the board.  We could not have figured out the chances of victory from this position earlier, since if we roll a 7, we can do one of two things: take off the 3 and 4 leaving the 7, or remove the 7 and leave the 3 and 4.  But now that we’ve determined the chances of winning from all of the simple cases, we can use those to figure out this case as well.  We’ve already figured out that the chances of victory if we have a bare 7 left on the board is 20%.  It turns out that the chances of victory with 3 and 4 on the board are slightly better: 20.83%.  That means in this instance, we should eliminate the 7 if we roll a seven with the dice, since a board with a 3 and a 4 left on it would be easier to deal with than a board with only a 7 on it.  Combine that with the chances of winning when we roll the other good numbers, none of which require making a choice (3, 4, 10, 11), we can determine that the overall chances of victory from this position are 7.64%.
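The process described above can be sketched in a few lines of Python. This is a minimal sketch rather than the code we actually ran: it uses a memoized recursion instead of repeated passes over a list, and the names `win_prob` and `matching_subsets` are made up for illustration. It assumes the rules as stated earlier: doubles always earn a marker, and an unmatched roll costs one marker if any are held.

```python
from functools import lru_cache
from itertools import combinations

# All 36 equally likely outcomes of two dice.
ROLLS = [(a, b) for a in range(1, 7) for b in range(1, 7)]


def matching_subsets(board, total):
    """Return every subset of `board` whose numbers add up to `total`."""
    nums = sorted(board)
    return [frozenset(c)
            for size in range(1, len(nums) + 1)
            for c in combinations(nums, size)
            if sum(c) == total]


@lru_cache(maxsize=None)
def win_prob(board, markers=0):
    """Chance of clearing `board` (a frozenset) while holding `markers` markers."""
    if not board:
        return 1.0
    total = 0.0   # summed win chances over rolls that lead to another state
    repeats = 0   # rolls that put us straight back into this same state
    for a, b in ROLLS:
        doubles = a == b
        held = markers + (1 if doubles else 0)  # doubles always earn a marker
        options = matching_subsets(board, a + b)
        if options:
            # Choose whichever removal leaves the best position.
            total += max(win_prob(board - s, held) for s in options)
        elif doubles:
            # Unmatched doubles: the marker just earned is spent at once.
            repeats += 1
        elif markers > 0:
            # Unmatched roll: spend a marker and roll again.
            total += win_prob(board, markers - 1)
        # Otherwise the roll loses and contributes nothing.
    # Solve W = (total + repeats * W) / 36 for W.
    return total / (36 - repeats)


print(win_prob(frozenset({7})))        # a lone 7
print(win_prob(frozenset({3, 4})))     # 3 and 4
print(win_prob(frozenset({3, 4, 7})))  # 3, 4, and 7
```

Under these assumptions the sketch gives 20% for a lone 7, about 20.83% for a 3 and a 4, and about 7.64% for 3, 4, and 7, in line with the figures above. Calling `win_prob(frozenset(range(1, 10)))` evaluates the full starting board the same way.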

After iterating through the list of unsolved configurations several times, we eventually discovered both the best strategy to use and the chances of winning from every possible combination of numbers.  From the starting position, where all nine numbers are lit and you have no insurance markers, you have a 17.1% chance of knocking all the numbers off.

Gene has a 9.32% chance of winning right now. How'd he do?


Can we make any characterizations about the best strategy?  In general, it seems that the best strategy is to remove the largest numbers from the board that your roll allows.  This means that on your first roll, if your roll is nine or less, you should remove just the number you rolled.  If you roll 10 or more, you should remove the 9 and either the 1, 2, or 3, depending on whether you rolled 10, 11, or 12.

I do say this is the best strategy in general, but there appear to be a large number of exceptions.  Say you rolled a five to start with, and took off the 5.  If on your second roll you rolled another five, you might be inclined, following the rule of thumb above, to knock off the 4 and the 1.  Doing this leaves you with a 7.01% chance of winning.  If instead you removed the 3 and 2, you’d have a 7.47% chance of winning.  This is just one of many cases where following the general strategy is not optimal.  We tried to figure out if there was a common thread to these exceptions, but nothing jumped out at us.  Even so, we would expect that a person following the basic strategy and ignoring these exceptions would probably only cost themselves a few tenths of a percent on their overall winning percentage.

If you’d like to play around with these results, I’ve included a little widget at the end of this post for you to play with.  Highlight the numbers remaining on the board and the number of insurance markers you have, and it’ll outline the best strategy to follow at that point, as well as your chances of victory.  Have fun!

Chasing Down The Best Chaser


Pictured Left to Right: Velma, Scooby, Daphne, Shaggy, Fred.

Earlier this year, Jenny Ryan joined the cast of the ITV hit show The Chase as the fifth resident chaser. The Vixen will take her place among trivia’s rogues gallery alongside Mark “The Beast” Labbett, Shaun “The Dark Destroyer” Wallace, Anne “The Governess” Hegerty, and Paul “The Sinnerman” Sinha. Given Jenny’s trivia bona fides (QI elf, Only Connect series champ, University Challenge, Mastermind, and Fifteen-to-One alumna), it’s no surprise that she fit right in alongside the others, who between them have over 700 episodes of experience with crushing the hopes and dreams of unlucky contestants. But, after all those episodes, who among them is the best at their job? Who’s the one you would least want to meet at night in a dark alley of trivia?  (Note: Given that Ryan has not had much time to accumulate data, we will ignore her for the purposes of this question.)

The chasers have two chances to catch contestants and eliminate them during the show. First, they can eliminate contestants individually during each contestant’s head-to-head round. Finally, they can eliminate the team as a whole by catching them during the final chase. Let’s look at each round individually and see what data we can get.

Head-to-Head Round

During the head-to-head round, the contestant is tasked with getting multiple choice questions right, each correct answer allowing the contestant to take one step towards victory and earning money for the communal team bank. The chaser starts eight steps away from the finish. The contestant can choose to start either four, five, or six steps away from the finish (and thus starting with either a four, three, or two step head start on the Chaser), with the further starting locations worth more money. The Chaser and the contestant are asked the same questions, with each right answer bringing them closer to the finish line.  If the contestant manages to stay ahead of the Chaser and reach the finish line, they put their earned cash into the communal bank, and earn a spot in the Final Chase at the end of the show. If the Chaser catches up to them before that happens, they don’t earn the money and are eliminated.

It is tough to get an accurate read on the Chaser’s abilities from the data that we have in this round. We are relying on the results provided by the Chase Wikia, which only gives a one-line summary of how each episode finished. It would be great if we could watch all 707 episodes from the first eight seasons of the show and keep detailed records of the Chaser’s correct answer rate, but that may be a project for when somebody finally invents the 28-hour day. Still, we do have a record of how often each Chaser catches a contestant in this round. Can we do anything with that?


Before we put too much stock in these numbers, I do want to share my reservations about them. First, we do not have the data on how many contestants opted to start with a two, three, or four step head start. We can assume that each Chaser sees about the same number of contestants stepping closer or further away, but it introduces a small element of imprecision we can’t address.

Another issue that muddies the water is that some contestants are uncatchable. If a contestant answers every question correctly (or answers incorrectly fewer times than the size of their head start), then the performance of the Chaser is moot. It will always be chalked up as a loss. It doesn’t seem right that there are times that a Chaser could answer every question correctly or every question incorrectly and have it look the same either way in the data.

Finally, the bigger issue is that these numbers are so close as not to be statistically significant. Even though the Chasers have each faced down between 600 and 800 contestants, that’s still too small a sample to say that these numbers are definitive. The margin of error on each of these numbers at a 95% confidence level is around 3.5 percentage points, as illustrated in this chart.


In this graph, we’ve highlighted the margin of error in red, representing where we think each Chaser’s true value could fall within a 95% certainty.  You can see that even Paul’s low mark could theoretically still be the highest among the four.
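For a rough feel of where a margin of that size comes from, here is a minimal sketch using the standard normal-approximation formula for a proportion. The 50% catch rate is a placeholder assumption (it is the worst case for the margin), not a figure from the data, and `margin_of_error` is just an illustrative name.

```python
import math


def margin_of_error(rate, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion.

    rate: observed catch rate (0 to 1), n: number of head-to-heads,
    z: critical value (1.96 corresponds to 95% confidence).
    """
    return z * math.sqrt(rate * (1.0 - rate) / n)


# Placeholder inputs: a 50% catch rate over 600 and 800 contestants.
for n in (600, 800):
    print(n, round(100 * margin_of_error(0.5, n), 1), "percentage points")
```

With sample sizes in the 600 to 800 range, the margin stays in the neighbourhood of three and a half to four points, which is wider than the gaps between the Chasers.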

So it’s clear that this analysis isn’t the best determination of Chaser performance. Can we do better in the Final Chase?

Final Chase


Pictured Left to Right: Ginger, Baby, Scary, Sporty, Posh

After every contestant has had a chance to play head-to-head against the Chaser, those who survive are brought back to try and win the communal bank. The team is given 2 minutes to answer as many questions as they can. The Chaser then gets another two minutes to try to match the score set by the contestants. If they do that, then it’s game over for the team. If the Chaser falls short, then the surviving team members split the bank. To even the playing field, the contestants are given two major advantages. Firstly, they earn a head start equal to the number of surviving contestants. Secondly, anytime that the Chaser misses a question during their turn, the clock is stopped and the contestants get a chance to answer it themselves. If they get the question correct, they push the Chaser back one step.

The data we have for this round is of a much higher quality. We know how many contestants the team has left and how well they scored during their two minutes. We also have the Chaser’s final score, and how long the Chaser took to catch the team if the Chaser won. What we’d like to do is use the Chaser’s score as the performance metric in this round, but we’ll need to do a few things first to take care of the variable conditions of this round.

The most obvious place to start is by normalizing the amount of time each Chaser has to answer questions. Since the game is over as soon as the Chaser meets the score set by the contestants, we need to figure out what the Chaser would have scored had they had their full two minutes. That’s simple enough: We will give them credit for the missing time by assuming that they will continue to answer questions at the same rate. Thus, if a Chaser catches a team with a score of 15 with 30 seconds left, we will treat that as a score of 20. If the Chaser fails to catch the contestants, their score will not change, as they used the entire two minutes.

Now, there are a couple of issues with treating the scores this way. If a Chaser has to chase down a small score, it’s possible that they might take a couple of seconds extra to think about each question before answering. On the flip side, if they have to chase down a large score, they may rush and become prone to more mistakes. Also, (and I have no proof of this except my own anecdotal experience of watching the show) I feel that the host, Bradley Walsh, will speed up his reading of the questions if time is winding down and the Chaser is close to the target. While these are things we need to be aware of, I still feel comfortable about normalizing the scores in this way.


Pictured Left to Right: Niall, Liam, Harry, Louis, Zayn

The other thing we need to control for is the number of opponents that the Chaser is facing. Since the contestants get offered any questions that the Chaser misses, and their right answers are deducted from the Chaser’s score, the number of contestants left on the team has a direct impact on the Chaser’s final score. Analyzing the Chaser’s scores as a function of team size tells us that a two or three player team will earn around one more pushback than a one player team, while a full four player team earns around 1.5 pushbacks more than the single player. If the Chaser faces a multi-person team, we will give them credit for these extra pushbacks so the data is normalized to a Chaser facing a single player.
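The two adjustments can be sketched together as below. This is an illustration, not the exact calculation behind the charts: the function name is made up, and adding the pushback credit after the time projection (rather than before) is an assumption.

```python
FULL_TIME = 120.0  # seconds available to the Chaser in the Final Chase

# Extra pushbacks credited relative to a one-player team (figures from the text).
PUSHBACK_CREDIT = {1: 0.0, 2: 1.0, 3: 1.0, 4: 1.5}


def normalized_score(score, seconds_used, team_size):
    """Project a Chaser's score to a full two minutes against a single player."""
    # Assume the Chaser keeps answering at the same rate for the unused time.
    projected = score * FULL_TIME / seconds_used
    # Assumption: the pushback credit is added after the time projection.
    return projected + PUSHBACK_CREDIT[team_size]


# The example from the text: 15 steps in 90 seconds against one player.
print(normalized_score(15, 90, 1))
```

A Chaser who uses the whole two minutes keeps their raw score, plus whatever pushback credit the team size calls for.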

Now that we’ve eliminated all variables outside the control of the Chaser, here’s each Chaser’s average performance.


Mark, Anne, and Paul are all very close, but Shaun ends up averaging almost 2 questions fewer. This is borne out by each Chaser’s raw winning percentage: Mark, Anne, and Paul win about three-quarters of the time, while Shaun’s victory rate is only two-thirds.

This data does not suffer from the issues of the data from the head-to-head round. This data is a metric of raw performance on the part of the Chasers; we have eliminated any effects the contestants have on this score. It is also significant to a 95% confidence level. The margin of error on these numbers is between 0.6 and 0.7 of a question for each Chaser, which means that while we can’t say that Mark, Anne or Paul are better than one another, they all have performed better than Shaun.


There’s something else that we can do with this data that’s pretty cool. To illustrate this, let’s take a look at a graph of the frequency of Mark’s normalized scores, rounded to the nearest whole number.


Say, that kinda looks like a bell curve, doesn’t it? Doing some normality testing on the data bears this hypothesis out; this data likely conforms to a normal distribution. The other Chasers’ data has the same feature. Therefore, since we know each Chaser’s average performance and standard deviation during the Final Chase, we can determine the odds of a Chaser beating any given score by fitting a normal distribution to that average and standard deviation.

For example, let’s assume that a full team of 4 sets a score of 17 during their final chase. Not too shabby, right?  What’s the likelihood that each Chaser will chase down that score?

Since our averages are normalized for a 1 person team, and this example uses a 4 person team, we will add 1.5 to their final score to represent the greater number of pushbacks that the team will score. So, the Chaser will have to score at least 18.5 points in order to catch the team. What is the team’s chance of victory against each Chaser?


Here you can see just how much of an effect that two question difference between Shaun and the other three has. Facing Paul, Anne, or Mark, the team has less than a 1 in 4 chance of victory. Up against Shaun, the team will run out winners 42% of the time.
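As a sketch of how such a lookup works, the snippet below fits a normal distribution with Python’s `statistics.NormalDist` and reads off the chance that the Chaser falls short of the target. The means and standard deviations are placeholder values for two imaginary Chasers, not the real figures, and `team_win_chance` is an illustrative name.

```python
from statistics import NormalDist

# Extra pushbacks credited relative to a one-player team (figures from the text).
PUSHBACK_CREDIT = {1: 0.0, 2: 1.0, 3: 1.0, 4: 1.5}


def team_win_chance(team_score, team_size, chaser_mean, chaser_sd):
    """Chance that the Chaser's normalized score falls short of the target."""
    target = team_score + PUSHBACK_CREDIT[team_size]
    return NormalDist(chaser_mean, chaser_sd).cdf(target)


# Placeholder Chasers: the averages and spreads here are made up.
for name, mean, sd in [("Chaser A", 21.0, 3.5), ("Chaser B", 19.0, 3.5)]:
    print(name, round(team_win_chance(17, 4, mean, sd), 3))
```

The weaker the Chaser’s average relative to the target, the larger the share of the bell curve that sits below it, and the better the team’s odds.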

Here’s the full graph that shows the chance that a team will beat each Chaser based on their final score (before pushbacks).


Pictured Left to Right: Wasp, Hulk, Iron Man, Thor, Ant-Man


So who is the best Chaser? With the data we have right now, it’s hard to tell. I’d be inclined to call it a dead heat between Mark and Anne, with Paul just a nose behind them, and Shaun a bit further back. Despite this gap, I want to stress that Shaun is still a formidable opponent, and if the contestants facing him are expecting an easy game, they’re going to be disappointed.  Time will tell how Jenny will fit into this group, but given her quizzing pedigree I expect her to do just as well as the other four regulars.

2015 Jeopardy Tournament of Champions: Semifinal Update

Couple of random thoughts before revealing my predictions for the ToC Semifinals:

– My system did pretty darn well this year, getting 4 of the 5 winners of the quarterfinals correct. Granted, it wasn’t much of a radical prediction to say that Matt Jackson and Alex Jacob would win their games. However, I’d argue that Kerry Greene, despite nominally being the top seed in her game, was not an obvious favorite, nor would Catherine Hardee be easy to pick out as a favorite to win from the third lectern. The system’s one miss was favoring Greg Seroka and Kristin Sausville over the eventual winner from Tuesday’s game, Brennan Bushee, though to be fair Bushee won the game from last place by being the only player to get Final Jeopardy correct.
– The Wild Card cutoff point was higher than average this year at $14,000. The average over the history of the tournament (after doubling the scores of the pre-double dollars era) stood at $10,464. Anecdotally, I’d think that may be the effect of almost all players going into their games with the goal of not necessarily winning, but playing to hit a self-determined goal score that would earn them a wild card. With much more data out there about things like historical wild card totals, I wonder if this is going to lead to a situation where the wild card cutoff will always be higher than historically expected. Then again, last year’s cutoff was $9,100, and most of the data was available then too, so it’s just as likely there’s no great reason for this year’s cutoff being so much higher.
– I would love to know how the Jeopardy team selects the semifinal matchups. I’ve tried to come up with some set of seeding rules, but nothing I can find explains the matchups perfectly. The only rules that I know for sure are that players will not face their opponents from their quarterfinal match, and two people with the same first name will not play each other. This is different from the quarterfinals, where the games are fairly obviously seeded so that in each match one of the top 5 players (ranked by games and money won) plays one of the second five and one of the bottom five.
– I need to thank Andy Saunders of The Jeopardy Fan for his guesses as to what the semifinal matchups would be, which turned out to be correct and gave me a little more time to run the numbers. Jeopardy didn’t officially release the matchups until Monday morning (as far as I saw through the official channels), which is slightly annoying for those of us in the game-show-data-analysis business.

Our prediction of Jackson vs. Jacob vs. [Seroka/Sausville/Hardee] isn’t going to be happening, since neither Greg Seroka nor Kristin Sausville made the second week, and Catherine Hardee is playing Matt Jackson in Wednesday’s game. Instead, the favorite for that third slot becomes Dan Feitel, winner of the Semifinal Matchup sweepstakes. He had the biggest movement in our prediction engine thanks to staying out of the path of the two juggernauts, increasing his chances of winning the tournament from 4% to 14%.


Alex dominated his quarterfinal game, becoming the only player to have a lock game last week. We see no reason to expect a different result from his semifinal matchup against Brennan Bushee and Vaughn Winchell.


If anybody is going to keep us from a final of M Jackson v. A Jacob v. AN Other, Catherine Hardee has the best chance of doing it. She could actually outbuzz Jackson, possibly the first time he’s ever had to face somebody who could do that.  If she can keep her number of wrong answers down and take a few Daily Doubles, she could still certainly crash the finals.  However, the smart money still has to be on Jackson winning this matchup.


Thanks to a slightly easier Semifinal matchup, Alex Jacob takes the tag of favorite by a clear margin over Jackson. Both men are close to 1-in-4 odds of taking the title. As previously stated, Dan Feitel moves from his quarterfinal position of “best of the rest” into a solid third place thanks to avoiding the two favorites. Good luck to all participants, and here’s hoping that the games to come are just as fun, interesting, and exciting as last week’s games.

Lightning Round: Matt Jackson, New Jeopardy Record Holder?

So things have been quiet here for a few months.  I’ve been very busy at work lately, but I do have a couple of articles that are very close to being finished that will be up in the next few weeks. One is a treatise on Daily Double wagering that I’ve been working on for the better part of a year, and another article will be evaluating the performances of the Chasers on ITV’s The Chase.  However, current events have prompted me to write a Lightning Round article about this man, who has polarized fans of Jeopardy over the past two weeks:

The owner of this smile is Matt Jackson, a paralegal from DC who yesterday became the 5th person ever to reach 10 wins, putting him 5th on the all-time win list behind Arthur Chu (11), David Madden (19), Julia Collins (20), and, of course, Ken Jennings (74).  Given Jackson’s performances so far, how many wins is he likely to finish his run with?  Could we be looking at a new record holder?

I’ve taken a look at this sort of thing before, back in June of 2014 after Julia Collins had finished her 20 game run.  I’m going to use the same methodology here: look at Jackson’s game situations heading into Final Jeopardy, and determine how often Jackson should be expected to win if he continues in that fashion.

In Jackson’s 10 games so far, he has achieved 8 lock games and 2 crush games heading into Final Jeopardy.  The lock games are easy to deal with: Jackson wins those 100% of the time.  That leaves the 20% of the time when Jackson has a crush, meaning his nearest opponent has between half and two-thirds of his score.  In order to lose a game that you are crushing heading into Final Jeopardy, two things need to happen: you need to respond incorrectly to Final, while your nearest opponent needs to respond correctly.  So far, Jackson has a 60% correct response rate in Final Jeopardy.  For the chance that his trailing opponent answers correctly, I’ll use the historical correct response rate for an average contestant in Final Jeopardy, which is 48.8%.  Since both events have to happen in order for Jackson to lose, we multiply the chance that Jackson misses (40%) by the chance that his opponent answers correctly (48.8%).  This means that the chance that Jackson loses in a crush situation is about 19.5%.  Or, in other words, Jackson wins a crush about 80.5% of the time.

So, 80% of the time, he locks up the game before Final and wins.  20% of the time, he has a crush heading into Final and wins about 80.5% of those games.  Combine those two probabilities, and you come up with a 96.1% win rate.  That is very impressive: close to Ken Jennings’ 97.0% win rate and well ahead of the third place win rate, David Madden’s 85.6%.

Does that mean he’s a threat to Jennings’ record?  It’s not very likely.  Jennings was very good but also very lucky, and outperformed his expectation (a mere 31 wins) by a large margin.  A person with a 96.1% chance of winning would be expected to “only” win 24.56 games before losing.  In Jackson’s case, we can add his 10 wins to that total to get our current estimate: an astounding but far from record-setting 34 games won.  I put his current chances of setting the record at 7.2%: possible but unlikely.
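The arithmetic in this estimate can be sketched as follows. The function names are illustrative, and because the inputs are the rounded percentages quoted above, the last digit of the output may differ slightly from the figures in the text. The model assumes every future game is an independent trial with the same win rate.

```python
def overall_win_rate(lock_share, crush_share, own_final_rate, rival_final_rate):
    """Win rate when locks always win and crushes are lost only if the
    leader misses Final while the trailing opponent gets it right."""
    crush_loss = (1.0 - own_final_rate) * rival_final_rate
    return lock_share + crush_share * (1.0 - crush_loss)


def expected_further_wins(win_rate):
    """Mean number of wins before the first loss (geometric distribution)."""
    return win_rate / (1.0 - win_rate)


def chance_of_streak(win_rate, games):
    """Chance of winning the next `games` games in a row."""
    return win_rate ** games


rate = overall_win_rate(0.8, 0.2, 0.60, 0.488)
print(round(rate, 4))
print(round(expected_further_wins(rate), 2))
```

`chance_of_streak(rate, games)` then gives the odds of reaching any particular win total, given how many more wins are needed to get there.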

Of course, this analysis is predicated on him keeping up his pace of dominating the first two rounds before heading into Final Jeopardy.  Should he start to leave more openings for his opponents to catch him in Final, or (gasp) actually come into Final behind at some point, his expected win total would plummet.  Still, as long as he keeps up this level of performance, I’d expect to see Matt Jackson on our screens for some time to come.

Lightning Round: 500 Questions

ABC’s latest Big Event Game Show Thing, 500 Questions, is currently in the middle of its nine-night run.  And while pundits are keeping tabs on whether or not the show will actually manage to ask 500 questions during its entire run (spoiler alert: no), a question asked on the LearnedLeague forums got my attention.  A user asked what the chances were of a contestant actually completing the titular 500 questions.  That sounds like something we can look into.  So, we’re starting a new series here at Game Show Theory, The Lightning Round, devoted to questions about game shows that are interesting, but don’t really qualify for a full strategic breakdown.

Let’s have a quick refresher of 500 Questions’ rules.  A contestant is asked trivia questions one at a time, up to a theoretical total of 500 questions.  Answering them correctly can earn money, which they secure after every 50 questions.  However, if they ever miss three questions in a row, they’re off the show.  There are some different types of question, and the presence of another player who may occasionally make life difficult for the contestant, but for our purposes we are going to ignore their effect on the game.

Question 5: How many fingers am I holding up?


So, how likely is it that a contestant sees all of their 500 questions? I know of no simple probability distribution to address this question, so we’re going to take a slightly more manual and iterative approach to the problem.  We’re going to break the problem down by calculating the chance of a contestant surviving 1 question, 2 questions, 3 questions, and so on, up to the goal of 500 questions.

Let’s work through an example.  Let’s assume a prospective contestant will give a correct answer to a question a respectable 60% of the time.  Figuring out the chances of surviving the first two questions is trivial – it’s 100%, since there is no way to get three wrong answers in a row yet.  The first chance of losing comes at 3 questions.  The player would have to get the first three questions wrong in a row, which translates to 3 straight 40% shots:

$$0.4 \times 0.4 \times 0.4 = 0.064$$

The chances that the player would go three-and-out is 6.4%, meaning that 93.6% of the time they’re still in the game after three questions.

With that done, let’s work out the chances that the player bombs out after 4 questions.  You might initially think that it’s the same as above, 6.4%, but there are actually a couple of wrinkles we have to consider.  First of all, we need to factor in the chances that they’ve already been defeated, since you can’t lose after 4 questions if you’ve already lost after 3 questions.  Secondly, losing at 4 questions requires the contestant not only to have gotten questions 2, 3, and 4 incorrect, but also to have gotten question 1 correct.  If question 1 was answered incorrectly, there is no way the player can get their third wrong answer in a row on question 4.  Either they answer questions 2 and 3 wrong, in which case they’ve already been eliminated, or they answer one of those questions correctly, in which case question 4 can’t be the third wrong answer in a row.

This gives us the following formula:

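$$P(\text{out on Q4}) = P(\text{alive after Q3}) \times P(\text{Q1 right}) \times P(\text{Q2 to Q4 wrong}) = 0.936 \times 0.6 \times 0.4^3 \approx 0.036$$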

Factoring all that in, we now have a 3.6% chance of losing after question 4.  Combined with the chances of losing after question 3, the total chances of survival are now a hair above 90%.  If the player will be out after only 4 questions 10% of the time, their chances of surviving 500 questions are not looking too strong.

The odds of losing on questions 5 and beyond can be calculated in the same way as for question 4.  We multiply the chance that we are still in the game at that point by the chance of answering one question correctly and then three questions wrong.
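For readers who want to experiment, here is a minimal sketch of a closely related calculation. Rather than the per-question formula above, it tracks the contestant’s current run of misses as a small state machine (a swapped-in technique, with names of my own choosing), so its results can differ somewhat from the figures quoted here. It ignores the other rule wrinkles, just as the analysis does.

```python
def survival_probability(p_correct, n_questions, max_misses=3):
    """Chance of getting through `n_questions` without ever missing
    `max_misses` questions in a row, given a fixed per-question hit rate."""
    q = 1.0 - p_correct
    # alive[s] = chance of still being in the game with a current run of s misses
    alive = [1.0] + [0.0] * (max_misses - 1)
    for _ in range(n_questions):
        nxt = [0.0] * max_misses
        # A correct answer resets the run of misses from any state.
        nxt[0] = p_correct * sum(alive)
        # A miss extends the run by one; a run of max_misses is dropped (out).
        for s in range(1, max_misses):
            nxt[s] = q * alive[s - 1]
        alive = nxt
    return sum(alive)


for n in (2, 3, 50, 500):
    print(n, survival_probability(0.6, n))
```

For the 60% player this gives 100% after two questions and 93.6% after three, matching the first steps of the worked example. Changing `p_correct` shows how quickly the odds improve for stronger players.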

Question 500: Mark Burnett's Beard. Seriously, WTF?


As you might imagine given the results so far, the numbers do not make pleasant reading for our hypothetical player.  They only have a 50% chance of getting to question 19, well before they have the chance to make any money.  They’re only going to be able to bank the money earned in their first 50 questions 14.8% of the time.  And the chances of getting through all 500 questions?  0.0000003%.  That’s about a 1 in 300 million chance, meaning they have as much chance of surviving 500 questions as they have of dying as a result of a shark attack.  Put another way, our 60% contestant should stick to playing Powerball – they’ll have about twice the chance of winning the jackpot there.

What if we increase the contestant’s average question get rate?  Here’s a chart that breaks down the chances of players of different strengths hitting the 25, 50, 100, 250, and 500 question milestones:


If a player wanted to have a 50/50 shot of getting through the 500 questions, they need to be very, very good.  A player would have to get 88.37% of their questions right to stand a break-even chance of finishing the game, a number I would expect only trivia elite could get close to achieving.

One of the decisions that the producers of 500 Questions have made is to be outspoken in calling their contestants geniuses.  They better be – only geniuses stand a chance of performing well.