1000 Heartbeats: When to Cashout?

Logo

You know what they say about the Deputy Undersecretary of the Interior, they’re only 1,000 heartbeats away from the Presidency.

ITV debuted a new game show on February 23rd, 1000 Heartbeats.  The main gimmick of the show is that the contestant’s own heartbeat determines how long they have to play.  It’s a stylish show, complete with a live string quartet providing the music at the same tempo as the contestant’s heart rate, and has been getting favorable reviews.

The contestant is given a “clock” of 1000 of their own heartbeats (measured by a well-hidden heart monitor) to play a series of minigames testing the contestant’s skills in anagrams, mental arithmetic, and general knowledge. Each successfully completed minigame increases the potential winnings of the contestant, up to a maximum of £25,000.  However, if they run out of heartbeats, their game is over and they leave empty-handed.  After each game is completed, the contestant is given a preview of the next game and, taking into consideration the number of heartbeats they have remaining, may either play on for more money or stop.  However, before the contestant can walk away with their money, they must play one final minigame with the remainder of their heartbeats, named Cashout.

gameplay

My cardiologist’s new stress test has yet to be endorsed by the American Heart Association.

Cashout is a fairly simple game.  As their heartbeats tick down, the contestant is given a series of True or False statements, and must correctly answer 5 in a row in order to win.  Giving an incorrect answer not only forces them to start their chain of 5 answers over again, but also deducts 25 heartbeats from their clock.  It’s an effective denouement, and has provided us with tense finishes already in the show’s short history.  But watching it got me thinking – how many heartbeats would you want to bring into Cashout to maximize your chance of success? And when during the course of the game is it ideal to play Cashout instead of pressing onward for a potentially higher payday?

strings

People were shocked to see the bold new direction being taken by Brian Eno.

Before we can measure how successful a contestant will be in Cashout, we first need to determine what metrics we can measure that will allow us to estimate a contestant’s success.  I’ve identified three metrics: what their heart rate is, how often they give the correct answer, and how long each question takes to read and answer.  Using those three variables, we can figure out the odds of completing a sequence of 5 correct answers, and how many heartbeats would elapse during that time.

These metrics will be different for each player, but for the purposes of this article, we will create an average contestant, using the data from the contestants who played during the first five episodes.  In Cashout, the average contestant would answer 70.14% of the True/False questions correctly, while taking 6.45 seconds per question with a heart rate of 128 BPM.  Using these values, we ran a Monte Carlo simulation to determine the chance that the average contestant will successfully complete Cashout given the number of heartbeats they started with.  The results are displayed in this graph.

Cashout Chart

It’s a curve that starts declining fairly gently, but increases in slope before crashing to 0% around 70 heartbeats.  That’s the minimum number of heartbeats needed to see and answer 5 questions correctly with no wrong answers.  It may not have happened during the first week of shows, but it should happen 17.5% of the time.  Starting Cashout with anything above 250 heartbeats leads to a better than 50% chance of succeeding.
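For anyone who wants to play along at home, here is a rough sketch of what a Cashout simulation could look like in Python. It is not the code behind the chart above. It assumes every question takes the same 6.45 seconds, that each answer is an independent 70.14% coin flip, that the heart rate holds steady at 128 BPM, and that a question only counts if it is answered before the clock hits zero. The function and parameter names are my own invention.

```python
import random


def cashout_win_probability(start_beats, p_correct=0.7014, secs_per_question=6.45,
                            bpm=128, penalty=25, streak_needed=5,
                            trials=20_000, seed=0):
    """Estimate the chance of winning Cashout from a given number of heartbeats.

    Simplifying assumptions (not from the show): constant heart rate, constant
    time per question, independent answers.
    """
    rng = random.Random(seed)
    beats_per_question = secs_per_question * bpm / 60.0  # about 13.76 beats
    wins = 0
    for _ in range(trials):
        beats = float(start_beats)
        streak = 0
        while True:
            beats -= beats_per_question      # time spent reading and answering
            if beats < 0:                    # clock ran out before the answer
                break
            if rng.random() < p_correct:
                streak += 1
                if streak == streak_needed:  # 5 in a row: Cashout is won
                    wins += 1
                    break
            else:
                streak = 0                   # chain resets on a wrong answer
                beats -= penalty             # and 25 heartbeats are deducted
    return wins / trials


if __name__ == "__main__":
    for beats in (60, 100, 250, 366, 500):
        print(beats, round(cashout_win_probability(beats), 3))
```

Because the model is this crude, the numbers it prints should only be expected to follow the general shape of the chart, not match it point for point.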

We can use these values to help decide whether or not to proceed to the next round during gameplay.  At any given point in a player’s game, we can calculate the Expected Value (EV) of the game, which is the amount of money the player would win on average if they played Cashout.  That value is determined by the amount of money banked so far multiplied by their chances of completing Cashout with their remaining heartbeats.  For example, if a contestant has £500 banked, and has a 90% chance of winning Cashout with their remaining heartbeats, the EV of their game at that point would be £450.

If it’s a good idea for the contestant to play onward, the EV of their game after the next round must be higher. We can represent this in the following formula:

$$M' \cdot C(H') > M \cdot C(H)$$

M is the amount of money currently banked, H is the number of Heartbeats remaining, and C is the function that tells us the chances of winning Cashout given a number of heartbeats, as defined by the chart above.  M’ and H’ are the money and heartbeats left after the next round is played.

This function will be easier to work with if we divide both sides by M’, as so:

$$C(H') > \frac{M}{M'} \cdot C(H)$$

Money Ladder

We now see that the contestant should want to move on if their future chances are greater than their current chances multiplied by the ratio of the money banked now to the money banked after the next round.  Looking at the money ladder, we can see that in rounds 2, 3, 4, and 6 the money doubles, and M divided by M’ would be .5.  Thus, the contestant should move on if their future chances of winning Cashout are greater than half of their current chances.  In rounds 3 and 7, the jump is greater, as the money increases by 150%.  M divided by M’ in this case would be .4, so the contestant’s future chances can drop by as much as 60% of their current level before moving on becomes a bad idea.
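The comparison is simple enough to sketch in a few lines of Python. The function names are made up for illustration, and the chances passed in would have to come from the Cashout chart (or your own estimate); the 90% figure is the one from the £500 example earlier.

```python
def expected_value(money, cashout_chance):
    """EV of stopping now: banked money times the chance of winning Cashout."""
    return money * cashout_chance


def break_even_chance(money_now, money_next, chance_now):
    """Lowest future Cashout chance at which playing on is not a losing move.

    This is (M / M') * C(H) from the inequality above.
    """
    return (money_now / money_next) * chance_now


def should_play_on(money_now, money_next, chance_now, chance_next):
    """True when M' * C(H') > M * C(H)."""
    return expected_value(money_next, chance_next) > expected_value(money_now, chance_now)


if __name__ == "__main__":
    # The example from the text: 500 pounds banked, 90% chance at Cashout.
    print(expected_value(500, 0.90))
    # A doubling round: future chance must stay above half the current chance.
    print(break_even_chance(500, 1000, 0.90))
    # A 150% jump: future chance must stay above 40% of the current chance.
    print(break_even_chance(1000, 2500, 0.90))
```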

So, we know the ratio at which our money rises, and we can determine our current chances of winning Cashout.  But how can you determine your future chances?  After all, you don’t know exactly how many heartbeats you’ll have left.  This is the position in the game where an element of estimation comes into play.  Taking a look at your past performances as well as the difficulty level of the next game, you’ll have to estimate how many heartbeats you think you’ll need to successfully complete the next game.

Here’s that data broken down in graph form.  The two lines represent the break-even points of how many heartbeats you can spend given your current number of heartbeats, depending on what round you are playing.

HB Chart


Let’s use this data to take a closer look at the two contestants who completed games on the show that aired on March 2nd.

contestant1

Luanne had a relatively poor first round of Contrast, and continued the theme with an even worse attempt at Unravel in round 2.  By the time Round 3 came along, she only had 366 heartbeats left to play Assemble. According to the chart above, Luanne should have continued to play if she thought that she could complete Assemble in fewer than 200 heartbeats.  Since she hadn’t done that yet in her game, it was probably a wise move that she opted to play an early Cashout.   Taking 366 heartbeats into Cashout should be enough for an average contestant to win about two-thirds of the time.  In this particular example, she managed to play Cashout successfully, ending with a scant 6 heartbeats remaining and taking home a hard-earned £500.

contestant2

Andy blew through Contrast and Unravel before hitting a stumbling block in round 3 with Assemble.  He righted the ship in Round 4 with Link, and faced a decision on whether to play Keep Up, a mathematical game, in round 5.  It seems like most players are loath to play these games requiring mathematical computation, but Andy opted to continue playing.  This could be seen as an aggressive move, but is a move that should lead to more money won as long as he spends fewer than 231 heartbeats on the game.  He only spent 190, which increased the expected value of his game by £414.25.  With only 204 heartbeats remaining, he had an easy choice to opt for Cashout at this point.  Unfortunately, he quickly exhausted his heartbeats with a series of wrong answers, and wound up not converting his banked £5,000.  Andy made a risky, yet mathematically sound choice to play Keep Up, but was not rewarded in the end.

Estimating how many heartbeats each minigame would take to complete would be very useful.  We could look at the past playings of each minigame and determine the time taken to complete it.  However, after only five episodes, the data we would get would not be very reliable due to the small sample size.  Perhaps if ITV gives this show the run it deserves, we will revisit this topic in a future article.

Jeopardy ToC Update: Semifinal Game 2

The Stats

You gotta hand it to Arthur Chu.  Despite being handed the toughest draw in the field, he’s making this look easy.

toc_sf2_table

Mark Japinga’s negging and Rider being crowded out on the buzzer could have made this a runaway, but Rider doubled up on a good Daily Double, and Japinga made a late run to remain relevant.

Daily Doubles

I’ll admit – as soon as Chu wagered $1,000 on his first Daily Double, my bad-bet-o-meter went crazy.  He found it early in the Jeopardy round, having called for the $600 and $800 clues in random categories looking for it as is his M.O.  He was trailing Japinga slightly at the time, $2,600 to $2,800, with Rider yet to open her mouth. I loaded up my still-in-development Daily Double evaluator, and fed it the parameters.  Here’s what it spat out at me:

dd_sf2

The five lines represent how comfortable you are with the category.  They have nothing to do with the expected difficulty of the clue, which has already been accounted for.  The table may be hard to read, but it suggests 3 wagers, depending on your level of confidence:

– $300-400 if you are not confident. Enough to take over the lead, but still enough to stay in touching distance if you miss it.

– $2,000 if your confidence is average. Beyond a certain point when you’re leading, every dollar you add gives you a smaller increase in your chance of winning than the previous dollar. At this point, the diminishing returns of increasing your score catch up with the chances that you’ll actually get the clue right.

–  $2,400 if you’re supremely confident.  The system suggests the diminishing returns beyond that point are not worth an all-in wager, but I certainly wouldn’t begrudge one if that was your choice.

Chu’s selection of $1,000 is an odd one. It’s not conservative enough if you don’t like the category (which I imagine was Chu’s reasoning), as it leaves you at least two clues behind to catch up if you’re wrong. It’s also not aggressive enough to take advantage of the opportunity that a Daily Double represents.

Chu hit the first Daily Double in Double Jeopardy, and again wagered $1,000.  The situation was much different this time, with Chu on a commanding $13,200 over Japinga’s $3,600 and Rider’s $1,800.  Without looking too deep at the details this time, I think the two choices in wagers would be the minimum of $5 if you want to protect your lead, and a more aggressive wager of $6,000, looking to close out the game here but still keep twice Japinga’s score if you’re wrong.  I don’t begrudge Chu his wager that much, since I doubt there was any great difference between wagers of $5 and $1,000. And despite the disclaimer that this could be the stupidest thing she ever did, Rider’s true Daily Double later with her score of $4,600 a long way back of Chu’s $16,600 was the only real move she could have made, especially in a category that she seemed to like.  She answered correctly, and Japinga made a late run to set up a Final Jeopardy where all three players could still take the victory.

Final Jeopardy

The scores were Chu with $17,800, Rider with $12,800, and Japinga with $8,000.  I’ll leave the exact analysis of the situation to Keith at The Final Wager, who does a better job at it than I could ever hope to. But I do want to talk about something that I feel very strongly about which comes into play here: Stratton’s Dilemma.

Coined by Andy Saunders, Stratton’s Dilemma is the term for the situation Rider finds herself in, where she has to choose between two possible wagers. Let’s analyze her situation. We can assume that Chu will wager $7,801 or thereabouts, the amount needed to guarantee his victory as long as he gets Final Jeopardy correct. We must assume that Chu responds incorrectly, otherwise we have no chance of winning.  If Chu is incorrect, he will be left with $9,999. Thus, it is in our best interest to wager no more than $2,800, staying ahead of Chu no matter what our outcome is. However, the presence of Japinga complicates matters.  If Japinga wagers everything and doubles up, he’ll have $16,000.  To ensure that we remain ahead of Japinga in that situation, we have to bet at least $3,201.  Sometimes we can find a wager to satisfy both scenarios, but this is not one of those times. We have to choose one or the other, and know that some of the time we will choose incorrectly and lose when we could have won.  It’s an infuriating position. Which bet should we choose?
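The arithmetic behind the dilemma can be sketched in a few lines of Python. The function and key names are my own, and it assumes the leader makes the standard cover bet, that third place bets everything, and that ties don’t count as wins.

```python
def strattons_dilemma(leader, second, third):
    """Work out second place's two competing wager limits in Final Jeopardy.

    Assumes the leader bets just enough to beat second place doubling up,
    and that third place wagers everything.
    """
    leader_cover_bet = 2 * second - leader + 1    # covers second doubling up
    leader_if_wrong = leader - leader_cover_bet   # leader's score after a miss
    # Largest bet that keeps second ahead of a leader who missed,
    # even if second also misses.
    max_small_bet = second - leader_if_wrong - 1
    # Smallest bet that puts second ahead of third doubling up,
    # provided second responds correctly.
    min_big_bet = 2 * third - second + 1
    return {
        "leader_cover_bet": leader_cover_bet,
        "leader_if_wrong": leader_if_wrong,
        "max_small_bet": max_small_bet,
        "min_big_bet": min_big_bet,
        # It is a dilemma when no single wager satisfies both limits.
        "dilemma": min_big_bet > max_small_bet,
    }


if __name__ == "__main__":
    # Chu, Rider, and Japinga going into Final Jeopardy.
    print(strattons_dilemma(17_800, 12_800, 8_000))
```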

Well, let’s break down the scenarios.  Since each player can either be correct or incorrect in Final Jeopardy, there are 8 possibilities to consider.  Assuming that Chu wagers $7,801, and Japinga wagers his entire $8,000 (our worst case scenario), what happens when we bet $0?

stratton1

Compare that to what happens when we bet everything:

stratton2

We’ll always win in either case if we’re the only person to respond correctly. When we bet small, we will win if everybody answers incorrectly.  If we bet big, we will win if the leader misses and both we and the player in third respond correctly.  Which is more likely to happen?  If only we had a large repository of previously played Jeopardy games to look at…
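If you’d rather not take my word for it, the eight scenarios are easy to enumerate. This Python sketch uses the same assumptions as above (Chu wagers $7,801, Japinga wagers all $8,000) and treats a tie as not winning; the function names are purely illustrative.

```python
from itertools import product

CHU, RIDER, JAPINGA = 17_800, 12_800, 8_000


def rider_wins(rider_bet, chu_right, rider_right, japinga_right,
               chu_bet=7_801, japinga_bet=8_000):
    """True if Rider finishes strictly ahead of both opponents."""
    chu = CHU + (chu_bet if chu_right else -chu_bet)
    rider = RIDER + (rider_bet if rider_right else -rider_bet)
    japinga = JAPINGA + (japinga_bet if japinga_right else -japinga_bet)
    return rider > chu and rider > japinga


def winning_scenarios(rider_bet):
    """List the (chu_right, rider_right, japinga_right) combos Rider wins."""
    return [combo for combo in product((True, False), repeat=3)
            if rider_wins(rider_bet, *combo)]


if __name__ == "__main__":
    for bet in (0, RIDER):
        print(bet, winning_scenarios(bet))
```

Each wager wins exactly two of the eight scenarios, and the one where only Rider is right is common to both, so the whole question comes down to which of the other two happens more often.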

I took a look at 2,184 games played over the last 10 years where all three players made it to Final Jeopardy.  Specifically, I counted the times when each of the above scenarios happened.  I separated the players by their ranking going into Final Jeopardy, so I know how often the leader missed Final Jeopardy while the other two got it right, or only the second-ranked player responded correctly, for example. And what were the results?

stratton3

Unsurprisingly, the scenarios when all three players responded correctly and all three players responded incorrectly are more likely than any other case.  Players possess a shared knowledge base, and a question that one person knows is likely to also be known by the others, and vice versa.  The situation we’re hoping happens when we bet big, where 2nd and 3rd place answer correctly but the leader does not, is the least likely to happen. If we quantify our chances of winning using the above table, we have a 30.9% chance of victory with a small bet, while we only have a 19.1% chance of success if we wager big.  I don’t know about you, but if one choice of a dilemma increases my chances by over 60%, I wouldn’t call it much of a dilemma.

The Odds

Well, we’re going to get the matchup we wanted, Collins vs. Chu. And frankly, this tournament is now Chu’s to lose. Regardless of who wins the next match between Ben Ingram, Joshua Brakhage, and Sandie Baker, I can’t help but see Chu as a strong favorite, especially over a two-day affair where the normally high level of variance inherent in a Jeopardy match is lowered.

toc_sf2_odds

 

Jeopardy ToC Update: Semifinals Game 1

The Stats

The “curse” of second place continues.  This makes it the sixth straight time that the player with the second ranked chance by my system won the game.

toc_sf1_table

Terry O’Shea may have been extra choosy in her clue selection, but her 100% precision and her tendency to score rebounds off of the others still gave her a strong showing.  Jared Hall and Julia Collins played very similar games, but Hall found all three Daily Doubles.  Had he converted his all-in wager early in Double Jeopardy, the result may have been a lot different.  Instead, Julia played up to the expectations placed upon her, and became our first Finalist.

Strategy

Hall, as lampshaded by Trebek at one point, was the only player playing tonight who picked clues from the middle of the board at first, hunting for the Daily Double.  He was rewarded handsomely for his efforts, as all three Daily Doubles landed in his lap.  The first and third were rather straightforward efforts, with Hall having less than $1,000 in the Jeopardy round and $2,000 in Double Jeopardy, which almost always necessitates the maximum allowable wager of $1,000 or $2,000, depending on the round.  His second Daily Double, however, I think deserves some extra attention.

Hall had $4,200 early in the Double Jeopardy round, poised between Collins’ $6,800 and O’Shea’s $3,000.  The Daily Double was hiding in the $1,600 clue in the category Great American Novels.  The average player would probably pick a middling value without too much thought, probably about half their score or less.  Hall did what I imagine most good Jeopardy players would do, and bet it all.  There’s still enough money left on the board ($27,600 out of the $36,000 that started the round, as well as the other Daily Double) that even if you go bust, there’s still enough time to make a comeback.  And if you respond correctly and double up, you’ll have a lead and be well poised for the rest of the game.

I’ve mentioned that I’m working on a system that evaluates Daily Doubles and determines the expected win % of every possible wager, trying to find the “best” bet, if such a thing exists.  It’s still very rough around the edges, but I thought I’d give it this situation to mull over.  Here’s what it said:

dd_sf1

The five lines represent how comfortable you are with the category (not the clue value – that’s already been baked into the system), with red being the least comfortable and green being the most comfortable.  In the end, your comfort level with the category didn’t matter, as bets could be clustered into three categories:

– $0 – $1,200: Pretty bad.  You’re still going to be in second place no matter what.

– $1,200 – $2,600: Even worse.  Best case scenario is that you’re still trailing the leader, but now if you’re wrong you’re dumped down into 3rd place.

– $2,600+: Best. The benefit of being in first place if you are correct greatly outweighs the chance that you’ll be in third.  What’s interesting about this section is that your chances decrease as your wager increases beyond what it would take to get into first, indicating that perhaps what you gain by increasing your score above $6,801 is not worth what you lose by falling further and further behind if you miss.  Like I said before, I don’t consider this system to be ready for prime-time, but it gives us something to think about: that a knee-jerk all-in bet in this scenario may be good, but not optimal.
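The three buckets fall out of simple arithmetic on the scores, quite apart from the win-probability model (which isn’t reproduced here). As a small Python sketch, with made-up function names and ties counted as sharing the higher place:

```python
def rank_after(my_score, opponents):
    """Place on the leaderboard: 1 plus the number of opponents strictly ahead."""
    return 1 + sum(1 for score in opponents if score > my_score)


def positions_for_wager(wager, my_score=4_200, opponents=(6_800, 3_000)):
    """Return (place if correct, place if incorrect) for a Daily Double wager.

    Defaults are Hall's score against Collins and O'Shea at the time.
    """
    return (rank_after(my_score + wager, opponents),
            rank_after(my_score - wager, opponents))


if __name__ == "__main__":
    for wager in (1_000, 2_000, 3_000, 4_200):
        print(wager, positions_for_wager(wager))
```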

Going into Final Jeopardy, it was still anybody’s game.  Collins led with $12,000, O’Shea had $8,200, and Hall was right behind them with $7,600.  This situation may look a little boring at first, but digging a little deeper into the math reveals a truly fascinating situation – and one that’s very scary for Collins, despite being in the lead.  Keith Williams of premier Jeopardy wagering blog The Final Wager has an excellent write-up and video detailing why, one that I highly recommend watching.

The producers of Jeopardy stop tape to give the players as much time as they want to calculate their wagers; Collins took 15 minutes to finally settle on hers.  In a cruel twist of fate, all of the strategizing proved moot, as only Collins was able to get the correct response in Final Jeopardy, and is our first confirmed finalist.  Well done to her.  She rode her luck a little, earning a wild card with a historically below-average score, and not getting any of the Daily Doubles in this match, but I don’t think anybody could begrudge her her place in the finals.

The Odds

Collins moves up to the top by virtue of having secured her finals place.  The remaining players’ chances only moved a fraction of a percentage point – Collins’s stats were darn near close to the average expected stats of the player coming out of the first semifinal.

toc_sf1_odds

We’ll see if tomorrow will get us the final match that everybody wants to see: Arthur Chu vs. Julia Collins, or if Mark Japinga or Rebecca Rider will upset the storyline.

Jeopardy ToC Update: Semifinals Preview

Sorry that I never got an update out on Friday’s match.  Jeopardy was preempted here, so I didn’t get to watch the match until late Saturday.  Quite a good match, mind, as all three players played well and wagered well, and all three were rewarded with advancement.

toc_5_table

The only player who played close to expectations was Mark Japinga.  Jared Hall played conservatively, but never made a misstep (outside of 1 Daily Double), and took the win after Japinga wagered for a Wild Card spot in Final Jeopardy.  Sandie Baker didn’t have the best game either, but still had enough money to bet big in Final and take the last Wild Card spot.

So, we have our nine players.  Stepping back to evaluate the performance of my system so far, I’d have to give the system’s predictive power a rating of B-.  It whiffed on Andrew Moore being the favorite, but on the other hand, of the five players least likely to advance, four of them indeed failed to advance. On an individual game basis, the middle ranked player won all five games, although to the system’s credit in two of those games the favorite was leading heading into Final.

toc_original_odds_after_week_1

Looking forward, the picture is much more clear.  We know what the three semifinal matchups are going to be (thanks again to The Final Wager for releasing them on Friday), and the picture of what our three finalists are going to be is getting clearer.  As a result, there have been some massive shakeups in our odds.  Let’s look at each matchup in detail.

Monday’s Game

toc_sf_1

(Our buzzing and precision scores have been recalculated to include each player’s quarterfinal performance)

After looking at the draw, I’m pretty sure that the semifinal matches are seeded based on each player’s quarterfinal score. The top three winners go into one group, the other two winners and the best wild card go into the second group, and the remaining wild cards go into the final group.  Each game is made up of one player from each group.  Terry O’Shea’s Thursday score was the third best score among the winners, and I bet Jared Hall and Julia Collins are thankful for that fact.  In another universe, they could be playing against one of the other top seeds: Arthur Chu or Ben Ingram.  Instead, Hall and Collins will likely battle it out among themselves for the finalist spot.  Hall’s stronger buzzing percentage gives him the edge over Collins’ better precision.

toc_sf_2

If Arthur Chu ends up winning the tournament, nobody could ever accuse him of getting an easy draw.  After taking down Andrew Moore in his quarterfinal match, he now faces off against Mark Japinga, our system’s current favorite. Rebecca Rider rounds out this trio – she’ll have to be at the top of her game to get past Chu and Japinga.

toc_sf_3

Wow.  Just … wow.  After factoring in each player’s performance in the quarters, we have a situation where each player’s buzzing percentage is practically the same.  Ben Ingram has the edge in the system thanks to his superior precision, but really this one could be anybody’s game.  It would not surprise me to see Joshua Brakhage or Sandie Baker advance.  Expect this game to be close going into Final Jeopardy.

The Odds

With the semifinal matches now set, our odds now look quite different:

toc_srf_odds

 

Mark Japinga now takes up the mantle of favorite.  If Arthur Chu beats him on Tuesday like he did to our previous favorite, I would be very surprised if he didn’t go on to win the tournament.  Ben Ingram moves up to #2 despite Wednesday’s game being little more than a crapshoot.  Monday’s winner will probably end up being the underdog going into the final games; the system only sees one of those three winning one time in four.

No matter who advances, expect some excellent knowledge and strategic game playing on display.  Check back all this week for updates.

Jeopardy ToC Update: Quarterfinal Match 4

When I suggested in my preview that you could “expect [a] game where a number of clues pass by unanswered”, I didn’t quite mean it to this extent.

The Stats

toc_4_table

Whether it was the material, the players, or the lunch break that would have preceded this taping, the game just never got off the ground. All three players underperformed on their buzzing, and only Terry O’Shea hit her previous level of precision. Thirteen out of the 57 non-Daily Double clues passed by our competitors without a single buzz, not to mention several more that only elicited incorrect responses.  It was a bad day at the office for all parties involved, and unfortunately it cost two of the players their shot at the title.

The Strategy

Normally I’d break down the strategic choices made by the players in this space, but there’s not much to say.  All three Daily Doubles were found by the player in distant last place, all three times the player correctly wagered big, all three times the wager was lost. Drew Horwood finished the match in the red, and didn’t play Final Jeopardy, while O’Shea had a slim lead of $8,800 to Sarah McNitt’s $8,600.  Both bet big, since you’d think their scores wouldn’t be worth a wild card, although this tournament is shaping up to have a low cutoff (again, they wouldn’t know that at the time).  O’Shea bet to lock out McNitt, and was correct, earning the automatic advancement.  McNitt bet as big as she could while still keeping enough back to win on a Triple (Double?) Stumper, but missed Final Jeopardy.  Her final total of $500 is already known to be not enough to advance.

The Odds

On a more positive note, with Horwood and McNitt dropping out, we can now welcome Rebecca Rider and Julia Collins into the semifinal fold.  Rani Peffer’s $7,599 is looking pretty good at this point, and Jim Coury’s $5,600 still has a shot of holding up.  We’ll see what happens after the dust settles on tomorrow’s last quarterfinal.

toc_4_odds

Jeopardy ToC Update: Quarterfinal Match 3

It’s a situation I doubt many could have foreseen.  Julia Collins, 20-game winner, had the game on her racket … and double-faulted.

The Stats
toc_3_chart

 

Today’s game stats matched expectations pretty well, with Jim Coury’s forays into the red being the only real deviation.

Daily Doubles

Coury found the first two Daily Doubles, and correctly bet as big as he could both times.  He made the first one, helping atone for his poor first round showing, but found his second early in the Double Jeopardy round with his score at $3,000 to Collins’ $6,600 and Brakhage’s $4,000.  His true Daily Double backfired, digging him a hole that he never really climbed out of.

Collins found the last Daily Double very late in the game, setting up a very interesting situation.  She had $10,600, just trailing Brakhage at $11,600 while Coury had revived a little to sit at $2,800.  Given that the clue was in a $1,200 box and should be relatively easy, I might have considered a large wager, looking to take first place and keep hold of it until Final Jeopardy.  Collins opted for a $4,000 wager instead.  Since this is a quarterfinal and wild cards are in play, I agree with the prudence.  The wager is big enough to give you a very good chance of staying in first if you are correct. Worst case scenario, if you get this wrong and don’t answer another clue for the rest of Double Jeopardy, you can still double up your remaining $6,600 to set a quite decent wild card score.  In the end, she answered correctly, and did carry the lead into Final, with a score of $16,200 to Brakhage’s $11,600 and Coury’s $2,800.

Final Jeopardy

While Coury had an obvious bet (everything), both Collins and Brakhage were faced with dilemmas.

Collins’ traditional lock-out wager of $7,001 gives her the best chance of winning the game outright, but risks losing a wild card spot if she gets it wrong and loses to Brakhage.  If we assume that Brakhage bets for the win (not necessarily a correct assumption, see below), then Collins will win the game with a lockout wager if either she answers Final Jeopardy correctly or Brakhage gets it wrong, a situation which happens on average about 80% of the time.  If she loses, the remaining score of $9,199 would be good for a wild card about 40% of the time (using our estimations from before the tournament, not taking into account the scores already seen this week, as Collins does not know about them at this point). Together, wagering $7,001 adds up to an 88% chance of advancing.

However, what about a bet of $0?  I’d expect a score of $16,200 to earn a wild card spot about 84% of the time.  But Collins could still win even with a bet of $0, since Brakhage still has to answer correctly.  The average get rate for Final Jeopardy is around 50%, so we’ll use that as the chances that Collins will still win.  Combining these chances leads to a 92% chance of success.  Considering that the best case scenario for a lockout wager grants only an 88% chance of survival (possibly less, if Brakhage bets small), I think a tactical $0 wager (or maybe a couple hundred in case Brakhage bets to just cover your total) is the best option here.

Brakhage has a similar situation to consider, as he has to bet at least $4,601 to guarantee a chance at victory. If he bets that, he’ll win the 20% of the time he answers the question correctly and Collins misses (assuming she’s betting for the win). In the remaining 80% of the time, he’ll answer correctly in half of the cases and be left with a very good shot at a wildcard (84%), and in half of the cases he’ll miss and be left with $6,999 (a 26% chance of earning a wild card). Total chance of advancing: 64%. If he stays pat and hopes his $11,600 is good enough, he’ll advance 57% of the time, assuming Collins also bets less than the difference between the two of them, giving Brakhage no chance to win.  However, both of these percentages swing wildly based on what Collins chooses to do.  If she bets big, Brakhage should bet small, while if Collins bets small, Brakhage should bet big. It’s a mind games situation.
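For anyone checking my math, the 88%, 92%, and 64% figures all come from the same weighting: the chance of winning outright, plus the chance of not winning times the chance that the leftover score earns a wild card. A quick Python sketch, with a helper name of my own choosing and the rough estimates quoted above as inputs:

```python
def advance_probability(p_win, p_wild_card_otherwise):
    """Chance of advancing: win outright, or miss out but grab a wild card."""
    return p_win + (1 - p_win) * p_wild_card_otherwise


# Collins, lockout wager: wins about 80% of the time, else 40% wild card odds.
collins_lockout = advance_probability(0.80, 0.40)

# Collins, $0 wager: wins about 50% of the time, else 84% wild card odds.
collins_zero = advance_probability(0.50, 0.84)

# Brakhage, $4,601 wager: wins 20% of the time; otherwise he is right half
# the time (84% wild card odds) and wrong half the time (26% wild card odds).
brakhage_cover = advance_probability(0.20, 0.5 * 0.84 + 0.5 * 0.26)

if __name__ == "__main__":
    print(round(collins_lockout, 2), round(collins_zero, 2), round(brakhage_cover, 2))
```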

In the end, Coury doubled up, Brakhage bet to cover Collins’ score and answered correctly, while Collins … bet to lock out Brakhage … and whiffed.  Brakhage takes the automatic semifinal spot, and Collins is left hoping that $9,100 is enough to get her into the second week.  She had no way of knowing this, but $9,100 is enough to put her second on the wild card list with two games left to play.  She might still be alright.

The Odds

toc_3_odds

Not much change in the odds.  Andrew Moore and John Pearson are now officially out. Brakhage moves up after guaranteeing his semifinal slot. Julia Collins still has a better than even money shot of making the next round. Jim Coury takes up residence on the wild card bubble with $5,600, which is unlikely to be enough.

Tomorrow’s game is Drew Horwood vs. Terry O’Shea vs. Sarah McNitt. We’ll know the identities of six of the nine semifinalists after the game.  Check back here for all the gruesome and over-complicated details.

Jeopardy ToC Update: Quarterfinal Match 2

Jeopardy tournaments are a harsh mistress.  One bad game, and you’re out on your backside.

Tonight’s Stats

 

toc_2_chart

Arthur Chu’s performance was nothing less than I expected from him coming into the game.  It was Andrew Moore’s underachievement (and to a lesser extent Rani Peffer’s as well) that turned the game into a blowout for Chu.

Daily Double

My gut instinct on Daily Doubles in the Jeopardy! round is to always bet aggressively.  When Moore found the Daily Double after he and Chu went on a wild goose chase for it, most of the high-dollar clues were off the board and the two of them were tied at $2,600, with Peffer trailing at $600.  Moore opted to bet only $1,000, and after analyzing the situation, I’m inclined to agree with him.  Given that the DD was hidden in the $1,000 space (indicating a harder clue), and that most of the money had been stripped from the board, a modicum of restraint was called for.

However, the situation was different when Moore found the first DD in the Double Jeopardy round early on.  He had a slight lead at the moment, $5,600 to Chu’s $5,200 and Peffer’s $2,200.  It’s a close decision, but this would have been the time to make a move.  A wager of all or almost all of his money would have given him the opportunity to take a stranglehold on the game with only one DD left on the board.  He instead chose to wager $2,000, and perhaps this was the right move, since he missed the clue.  After that stumble, Chu went on a hell of a run, and by the time he found the 2nd Daily Double late in the round, he had an unassailable lead.

Final Jeopardy

Not much to say here.  Chu had a lock on the game, and could sit this one out.  Moore and Peffer knew their current scores wouldn’t hack it, so each bet everything but $1.  Moore missed, and has gone from our favorite to all but out in the blink of an eye.  Peffer nearly doubled up, and takes the #2 spot on the wild card list for the nonce with $7,599.

Updated Odds

With our favorite knocked out of contention, it’s no surprise who took over at the top of the list.

toc_2_odds

 

A bit of change elsewhere on the list.  With Moore no longer a potential hurdle, many people saw their chances of getting to the finals and winning increase.  Two wild card scores coming in under expectations improve the chances that the people yet to play will take a wild card.  Rebecca Rider’s score in the clubhouse of $11,600 is looking better; she’s even money to advance now.

Arthur Chu has dropped the gauntlet.  Will Julia Collins pick it up tomorrow and keep the chances of everybody’s dream matchup alive? Check back tomorrow for the analysis.

 

Jeopardy ToC Update: Quarterfinal Match 1

Congrats to Ben Ingram, becoming the first person to punch his ticket to the second week of the tournament.

Tonight’s Stats

toc_qf1

When I looked at the difference in contestant metrics between regular games and tournaments, buzzing percentage decreased by an average of 12% while precision remained relatively stable.  I’m happy to report that, whether by luck or design, both Ingram’s and John Pearson’s predicted metrics came darn close to matching the actual result.  Rebecca Rider had a tough match, and her poor precision is probably related to the fact that she only attempted 12 responses.

Daily Doubles

Strategically, the players went searching for Daily Doubles early, and attempted to use them to their best advantage.  Both Ingram’s early Jeopardy round DD and Pearson’s early Double Jeopardy DD came in situations where all-in wagers seemed logical; it was unlucky that both missed their responses.  Ingram’s second DD of the game came midway through the round with him in a commanding position, leading $11,600 to Rider’s $3,800 and Pearson’s $1,200.  I’m doing some preliminary work on theories behind Daily Double wagering, and even in the best of cases I think the maximum one should bet in this situation is around $2,500.  His bet of $100 was born out of smart quarterfinal strategy, looking to secure a high score rather than trying to lock the game down.

It’s possible that his failure to bet big prevented him from securing the lock game, as Pearson went on a late run to stay in the running for the automatic berth. The scores finished $16,100 to Ingram, $8,400 to Pearson, and $6,600 to Rider.

Final Jeopardy

toc_qf_fj

The way I saw it, Ingram had two choices for wagers: the traditional lock-out wager of $701, or the conservative $0, maximizing his chances of earning a wild card while still possibly winning the match.  He chose to wager $701, which I think is the correct choice. If he’s wrong and John doubles up, he’s only cost himself 4% on his chance of winning a wild card by losing $701.  On the other hand, his chances of winning the match outright go up by almost 20% with the wager of $701, since he now wins whenever he answers correctly and Pearson misses, whereas before he was at the mercy of Pearson’s response only.

Rider opted to wager only $5,000 of her $6,600, a wager that I cannot support. She’s playing solely for a wild card spot at this point – there should be no permutation that allows her to win on a Triple Stumper.  In that case, since $6,600 is below the expected wild card cutoff line, she really should be looking to double up to maximize her chances.  If she doubled up, I expect her score to earn a wild card 48% of the time.  As it stands, she answered correctly, and her final total of $11,600 leaves her with a 39% chance of advancing.  Had she missed, a score of $1,600 would have advanced only 2% of the time, so it’s likely not going to be worthwhile to leave something in reserve.
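To put a number on that comparison, here is a rough Python sketch of how the two wagers stack up as a function of Rider’s chance of answering correctly. It uses the wild card percentages quoted above and assumes, as my own simplification, that finishing with $0 never earns a wild card. The function and constant names are purely illustrative.

```python
def advance_chance(p_correct, if_right, if_wrong):
    """Wild card chance, given the odds after a right or a wrong response."""
    return p_correct * if_right + (1 - p_correct) * if_wrong

ALL_IN = (0.48, 0.00)     # $13,200 if right; $0 if wrong (assumed hopeless)
BET_5000 = (0.39, 0.02)   # $11,600 if right; $1,600 if wrong

# Accuracy at which both wagers give the same chance of advancing.
break_even = (BET_5000[1] - ALL_IN[1]) / (
    (ALL_IN[0] - BET_5000[0]) + (BET_5000[1] - ALL_IN[1])
)
print(f"break-even accuracy: {break_even:.1%}")

for p in (0.25, 0.50, 0.75):
    print(p, advance_chance(p, *ALL_IN), advance_chance(p, *BET_5000))
```

Under these numbers the all-in wager comes out ahead whenever she expects to be right more than roughly 18% of the time, which is a low bar for a Tournament of Champions player.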

Pearson was in a similar situation to Rider’s, and chose to bet everything.  If he stood pat, he would only advance 12% of the time.  If he doubled up, in addition to the chances that he would win outright, his score of $16,800 would see him get a wild card about two-thirds of the time.  Unfortunately, he couldn’t come up with the right response, and barring a major miracle his tournament is over.

Updated Odds

The biggest winners tonight might actually have been the other 12 contestants.

toc_qf1_table

 

With Pearson out of the running and Rider putting up a solid, yet beatable score, everybody’s chances of earning a wild card go up a few percentage points.  Even though he won, the shifting wild card situation sees Ingram’s chances of winning the whole event go down, thanks to the potential matchups in the semifinal becoming worse. Sandie Baker’s chances have actually tripled, due to the increased chances of getting a wild card and her own potential road to the final improving.

See you back here tomorrow, where two of the top three play against each other. And Rani Peffer probably beats Arthur Chu and Andrew Moore to throw all my predictions into the garbage.

Predicting the 2014 Tournament of Champions


People who have made nearly $17.3 million from Jeopardy in the past 18 months, not including Colonial Penn Life kickbacks.

This Monday kicks off the 24th Jeopardy! Tournament of Champions, a two week sesquiennial bacchanalia of trivia where one lucky person gets to inscribe their name in the annals of Jeopardy! history.  It’s the closest thing to an end-of-season playoff that we have in American game shows, and in the spirit of March Madness I thought I’d put on my Nate Silver hat and take a shot at predicting who the winners would be.  Let’s just ignore that in this case I’d be “predicting” the results of an event that was actually filmed a couple of months ago.

The Rules

classictoc

There is no truth to the rumor that the first Tournament of Champions was held inside a seaside photo booth.

The tournament format, which was supposedly created by Alex Trebek himself, sees fifteen previous champions come back to play over two weeks.  The fifteen participants are the winners of qualifying tournaments (currently the College Championship and the Teachers Tournament; previously, winners of the Teen Tournament and the Seniors Tournament also got invites), and then enough champions who played since the last Tournament to fill in the remaining slots, ordered by number of games won, then amount of money won.  The cut-off this year was Mark Japinga, who won 4 games and $112,600. (One 6-game champion was not invited back due to legal issues.)  The first week sees the fifteen contestants placed semi-randomly (I’m pretty sure the contestants are seeded so that each game has one of the top five players, one of the middle five players, and one of the bottom five players) into five quarterfinal games.  The five winners of these games are guaranteed passage to the next round, along with the four highest-scoring non-winners as wild cards.  These nine players are again randomly drawn into three semifinal games, with the stipulation that you cannot play against someone you already played against in the first round.  The three semifinal winners then face off in a two-day final, where the cumulative amount won on both days is used to determine the tournament winner.

It’s an interesting format because proper play requires a shift in strategy.  Only the semifinal game is a traditional winner-take-all game of Jeopardy.  In the quarterfinal, you don’t necessarily need to win the game, just have enough money at the end of the game to be one of the top four non-winners. This leads to some conservative wagering on Daily Doubles and in Final Jeopardy.  The catch is that you are sequestered before you play your quarterfinal, so you do not necessarily know how much money would qualify you for the semifinal. There have been years where a score of $20,000 would have sent you home, and one year where multiple people finished with $0 but still made the semifinals.  (I’ve done some work with this subject, which I’ll save for a later post.)  The two-day final should be treated like a marathon compared to the usual format’s sprint, and strategic wagering in the first game could leave you with a strong platform to win, or leave you so far behind that it’ll take a miracle in the second game to come back.

Watson’s System

watson

“OK, you were just shown up on national TV by the monolith from 2001 … Smile!”

I was reading the Journal of Artificial Intelligence Research one night (as you do), and came upon a very interesting article written by part of the team that created Watson for IBM.  In it, they discussed some of the various challenges they faced in building an AI capable of the strategy of playing Jeopardy!. To test out strategies, they built a game simulator where Watson would take on two “average” Jeopardy! contestants. They measured a contestant’s ability using two numbers, the percentage of clues that they attempt to buzz in on (called buzzing percentage), and the percentage of clues that they respond correctly to having buzzed in (called precision).  Using data obtained from the Jeopardy! Archive, they determined that the average contestant attempts to ring in on 61% of clues, and answers 87% correctly.  I wondered if I could use this same methodology to evaluate individual contestants’ skill as well.

Precision was easy enough to determine, just by observation.  Buzzing percentage, however, is only an estimation.  We have no way of knowing for sure how many clues a contestant buzzes in on. Using the Watson team’s methodology, we can make an estimate based upon how many clues a contestant successfully buzzes in on compared to their opponents and the number of “triple stumpers”, clues that nobody attempts.

Now, I had an issue with their buzzing model, and it might be one that you’ve noticed as well.  Anybody who’s been on Jeopardy! or played a similar game knows that knowing the correct response is only half the battle. There’s also the dreaded lockout system that contestants have to conquer. Ever since the second season of the revived series in 1985, contestants have been forced to wait until the clue is read in its entirety by Alex before buzzing in.  If they buzz in before the buzzer is active (represented by a set of lights around the playing board, not shown to the viewers), they are locked-out of buzzing for two tenths of a second.  If they try to buzz in again during those two-tenths of a second, the lock-out resets.  If you have ever seen a contestant frantically trying to ring in but nothing is happening, even if nobody else has rung in, it’s because they’ve fallen prey to the lock-out.

People have spoken of buzzer skill as a major part of a Jeopardy player’s ability.  Some players manage to get in a rhythm with Alex’s cadence and seem to control the buzzer better than their opponents.  The Watson team chooses to ignore that ability.  It instead argues that a person who successfully buzzes in more just attempts to buzz in on more clues.  It assumes that everybody has the same amount of “buzzer skill”; if more than one person attempts to buzz in, the successful contestant is simply a matter of chance.

To test this out, I built a program that took these two contestant metrics and simulated a game, using Eric Feder’s work on Jeopardy Win Expectancy as a starting point.  When I ran simulations against the results of the last couple of tournaments, I was pleasantly surprised to see the simulation reasonably matching the results.  Despite my misgivings, I feel comfortable moving forward with predictions on this year’s tournament.
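To make the buzzing model concrete, here is a minimal one-clue sketch in Python. It is not the simulator described above, just an illustration of the two metrics at work. I’m assuming each contestant decides to buzz independently of the others and that a wrong response ends the clue (no rebounds), neither of which is spelled out in the text. The function name `play_clue` is my own.

```python
import random

def play_clue(players, rng):
    """Simulate one clue under the equal-buzzer-skill assumption.

    players: list of (buzzing_percentage, precision) pairs, each in 0..1.
    Returns (index_of_responder, was_correct), or None for a triple stumper.
    """
    # Each contestant independently decides whether to attempt the clue.
    attempts = [i for i, (buzz, _) in enumerate(players) if rng.random() < buzz]
    if not attempts:
        return None
    # Equal buzzer skill: whoever gets in is a pure coin flip among attempters.
    winner = rng.choice(attempts)
    correct = rng.random() < players[winner][1]
    return winner, correct

# Three "average" contestants, using the 61% / 87% figures from the article.
average_field = [(0.61, 0.87)] * 3
rng = random.Random(2014)
results = [play_clue(average_field, rng) for _ in range(60)]
print("triple stumpers:", results.count(None))
```

A full game would layer clue values, Daily Doubles, and Final Jeopardy wagering on top of this, but the core of the model is just these two probabilities per player.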

Quarterfinal Matchups

Let’s take a look at the first five games. (Thanks to The Final Wager for revealing the matchups early, and thanks to Buzzerblog for the use of the contestant images.)

toc1

Monday’s contestants, and their estimated win percentages.

The first match will be a contrast in styles.  Ben Ingram and Rebecca Rider both made their bones by being selective about their clue selection – they sport the two highest precision scores of the fifteen participants.  John Pearson, on the other hand, buzzed in nearly 10% more often than either of his opponents, but wound up giving the wrong answer almost 10% more often as well.  The system gives a slight edge to Rider in this matchup, but Pearson should be selecting clues more often, giving him a higher chance of finding a well-timed Daily Double.

toc2

Tuesday’s contestants, and their estimated win percentages.

When I first calculated these statistics, I had to double-check Andrew Moore’s scores by hand to ensure their accuracy.  I couldn’t believe that a fairly anonymous 6-time champ in a season full of memorable winners could have a buzzing percentage almost 5% higher than any of the other competitors, but the numbers checked out.  While his precision is below average among tournament participants, it’s still good enough to give him the edge in this matchup, as well as in the overall tournament.  However, he could find trouble playing against Arthur Chu, who the system is also keen on.  Unfortunately, the tough matchup combined with her weak buzzing percentage does not bode well for Rani Peffer’s tournament chances.

toc3

Ednesdayway’s ontestantscay, anday eirthay estimateday inway ercentagespay.

20-game winner Julia Collins is given a slight edge against Joshua Brakhage and Jim Coury, although that’s more down to the quality of the competition than her own skill. Despite coming into the tournament as the favorite or co-favorite with Arthur Chu, the system sees Collins as merely an average ToC-quality player. If she were to play in either Tuesday’s or Friday’s game, she might have had a rough time making it to the second week.  Incidentally, all three players are fairly conservative on the buzzer – expect a good number of Triple Stumpers.

toc4

percentages win estimated their and, contestants Thursday’s.

This matchup threatens to play out much like Wednesday’s, with three conservative buzzers, none of whom has shown extremely strong game play, pitted against each other.  Expect another game where a number of clues pass by unanswered.  Sarah McNitt has the edge over Drew Horwood and Terry O’Shea in both buzzing and precision, and as such is given the favorite tag.

toc5

Friday’s … you know the drill by now.

The week looks to end with a bang, with two strong competitors battling it out.  Mark Japinga may have gotten a bit lucky to even be invited back to take part in the tournament, but now that he’s here he’s a serious threat to win the whole thing.  Jared Hall is another strong player looking to go far.  Sandie Baker probably drew the worst game to play out of all the contestants – if she were playing on Wednesday or Thursday she might have been a favorite to win.  Instead, she could get squeezed out early.

The Odds

Overall, I see this as a fairly wide open field.  Thirteen people have a better than even money chance of making it to the semifinals, and the favorite’s chance of winning is only about 1 in 5.  This could very well be the most exciting, unpredictable tournament in recent history.

toc-odds

Note: There are two ways of making the semifinals, by winning your quarterfinal or qualifying as a wild card. We model the chances of each event happening (the percentages in gray), and sum the two together to determine the total chance that a contestant makes the semifinals.

Check back with us after every game next week for analysis and updates.

Making the Most out of the Money Cards

The game of Card Sharks is one of the best remembered shows to come out of the Goodson-Todman stable. It ran for three years from 1978 to 1981, had a successful revival run from 1986 to 1989, and spawned several international versions, the most famous of which is the British version (titled “Play Your Cards Right”), which ran for a total of 13 years.

The game itself was based on the old gambling card game of Acey-Deucey. As such, it is ripe for strategic and mathematical evaluation.  While we will no doubt one day look at the proper strategy in the front game, today we’re going to take a closer look at the “Money Cards” bonus round.  This was one of the most exciting and lucrative bonus rounds around in the 1970’s, where a contestant, if armed with enough skill and backbone, could parlay $200 into $28,800.  But how should one play the Money Cards to maximize one’s winnings?  Should a player even play to maximize their winnings?

The Rules

Seven cards are laid out on three rows, as seen in the image to the right. An eighth card is turned face up, and the contestant wagers any or all of their bank on whether the next card will be higher or lower (Aces are high). The contestant is staked $200 at the beginning of the round, and must bet in $50 increments, with a $50 minimum bet. Guessing correctly wins the amount bet, guessing incorrectly (even tying) results in the bet being lost.

After the fourth card is turned over, play proceeds to the second row and the contestant is given another $200. Contestants keep wagering, following the same rules up until the last higher/lower decision on the top row, where the minimum bet becomes half of the player’s total. In addition, the player may choose to swap the face-up card before the first, fourth, and seventh decision (the first card on each level) with a new card from the deck.

While some of the rules changed during the run of the show (the ramifications of which we will address later), this set of rules will be the basis of our discussions.
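As a quick sanity check on the top prize, here is a small Python sketch that doubles the bank on every decision under the rules just described: three decisions on each of the first two rows, then one on the top row. The function name and parameters are mine. The $400 second-row stake is the 1986 revival figure discussed under Rule Changes.

```python
def max_money_cards_prize(first_stake=200, second_stake=200):
    """Best case: bet everything and win all seven higher/lower calls."""
    bank = first_stake
    for _ in range(3):       # three decisions on the bottom row
        bank *= 2
    bank += second_stake     # fresh stake on reaching the second row
    for _ in range(3):       # three decisions on the second row
        bank *= 2
    bank *= 2                # single decision on the top row
    return bank

print(max_money_cards_prize())                  # prints 28800
print(max_money_cards_prize(second_stake=400))  # prints 32000
```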

Strategy

Some of the simpler aspects of the game are easy to analyze.  We know that in a deck of cards, the “8” card is the midpoint – there are 24 cards lower and 24 cards higher than it.  We should thus wager on the next card being higher when our base card is lower than 8, and lower when the base card is higher than an 8.  When the card is an 8, we can count the cards we’ve seen (not a difficult task in the end game, as almost all cards remain visible after they are played), determine whether there are more higher or lower cards left in the deck, and predict using this knowledge.

When it comes to choosing whether or not to switch our base card when given the opportunity, we can analyze our chance of success with every base card, as seen in the following table:

Using this data, we can determine that we have, on average, a 72.4% chance of calling the next card correctly given a random base card.  Therefore, it makes sense to switch our base card when our chances of winning are below that number. Using the table above, we see that we should switch on any card from a 5 through a Jack.
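The per-card chances are easy to recompute. The Python sketch below assumes a fresh 52-card deck with only the base card seen, and treats ties as losses per the original rules. It is my own reconstruction rather than the original table, though it does land on the same 72.4% average and the same 5-through-Jack switching range.

```python
from fractions import Fraction

NAMES = {11: "J", 12: "Q", 13: "K", 14: "A"}

def best_call_chance(rank, suits=4, deck_size=52):
    """Chance of a correct higher/lower call on a base card (Ace = 14).

    Ties lose, so only strictly higher or strictly lower cards count.
    """
    higher = (14 - rank) * suits
    lower = (rank - 2) * suits
    return Fraction(max(higher, lower), deck_size - 1)

chances = {rank: best_call_chance(rank) for rank in range(2, 15)}
average = sum(chances.values()) / len(chances)
switch_on = [rank for rank, p in chances.items() if p < average]

for rank, p in chances.items():
    print(f"{NAMES.get(rank, rank):>2}: {float(p):.1%}")
print(f"average: {float(average):.1%}")
print("switch on:", [NAMES.get(r, r) for r in switch_on])
```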

Betting Strategy – The Naive Approach

It’s clear that we should bet the minimum when we are faced with an 8 that we can’t change – barring a large number of lower or higher cards already used, it’s going to be a net loser, so we should minimize our losses.  But what about the other 12 cards?

For our purposes, let’s define a betting strategy as a series of 6 percentages, representing the percentage of your bank that you should bet depending upon your face up card. Why only 6 instead of 12?  Because the chances of winning with a 2 are the same as winning with an Ace, the chances of winning with a 3 are the same as a King, and so forth.  We only need 6 percentages to cover all twelve possibilities.

For example, the following would be a betting strategy: 30%|50%|70%|80%|90%|100%.  These represent the percentage of your bank that you will bet, depending on the face-up card.  If you are facing a 7 or a 9, you bet 30% of your bank, the left-most percentage. If you are blessed with an Ace or a 2, you bet the right-most percentage.  If a betting strategy’s wager falls below the minimum bet allowed, then you would instead bet the minimum.
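In code, looking up a wager from a betting strategy might look like the Python sketch below. The article doesn’t say how a percentage that lands between $50 steps gets rounded, so rounding down is my assumption. The function name `wager` and its parameters are illustrative too.

```python
def wager(strategy, base_card, bank, min_bet=50, step=50):
    """Dollar bet for a face-up card (2..14, Ace = 14) under a strategy.

    strategy: six percentages, index 0 for a 7 or 9 up to index 5 for a 2 or Ace.
    An 8 always gets the minimum bet.
    """
    if base_card == 8:
        pct = 0
    else:
        pct = strategy[abs(base_card - 8) - 1]
    bet = bank * pct // 100 // step * step   # assumption: round down to a $50 step
    return min(bank, max(bet, min_bet))      # never below the minimum, never above the bank

example = [30, 50, 70, 80, 90, 100]
print(wager(example, 9, 1000))   # prints 300
print(wager(example, 14, 450))   # prints 450
print(wager(example, 7, 200))    # prints 50 (30% of $200 is $60, rounded down to $50)
```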

Let’s envision a theoretical contestant, All-In Adrian. He’s a smart gambler, so he knows that when you’re offered even money on a better than even money proposition, you should bet heavily.  In this case, he’s going to bet everything on every card, as long as it’s not an 8.  His betting strategy would be 100%|100%|100%|100%|100%|100%.

Adrian is correct: you should bet heavily when the odds are in your favor.  He’s found the strategy that will earn him the most money on average.  But is that necessarily the best strategy?

Betting Strategy – The Nuanced Approach

Let’s look at another theoretical contestant, Nervous Nick.  Nick hates gambling and hates to lose. He’s going to do nothing but wager the minimum bet of $50 every time except for the last, where he’ll bet the new minimum of half of whatever he has left.  His strategy would be written as 0%|0%|0%|0%|0%|0%.

Something funny happens when you compare Nick and Adrian’s strategies.  On average, Adrian is going to win a heck of a lot more money than Nick.  But if you were to look at who would be more likely to have more money after each contestant played one round, Nick would be ahead most of the time.

This speaks to the risk involved in Adrian’s strategy.  Sure, in the long run, he would come out ahead.  But in the meantime, he’s going to go bust about 70% of the time, while Nick is guaranteed to walk away from the game with some money.  And, unfortunately for Adrian, there is no long run.  Most contestants would only get the privilege of playing the Money Cards once or twice before losing the front game and having to leave the show.  Adrian’s strategy would be great if he could play the game unlimited times, but that’s just not the case.
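If you’d like to watch Nick and Adrian play it out, here is a compact Monte Carlo sketch in Python. It is not the simulator behind the figures in this post, and it leans on several assumptions of mine: bets are rounded down to $50 steps, the half-bank minimum on the last card is rounded up, a contestant who busts on the bottom row skips the rest of that row but still receives the second $200, the base card is swapped whenever it is a 5 through a Jack, and the higher/lower call is made by counting the unseen cards.

```python
import random

RANKS = list(range(2, 15)) * 4   # 2..14 with Ace high, four suits

def play_money_cards(strategy, rng, stake=200):
    """Play one round; strategy holds six percentages (7/9 first, 2/Ace last)."""
    deck = RANKS[:]
    rng.shuffle(deck)
    base, bank = deck.pop(), 0
    for row, decisions in enumerate((3, 3, 1)):
        if row < 2:
            bank += stake                 # $200 on each of the first two rows
        if bank == 0:
            break                         # nothing left for the top row
        if 5 <= base <= 11:
            base = deck.pop()             # swap a below-average base card
        for _ in range(decisions):
            if bank == 0:
                break                     # busted: rest of the row is skipped
            min_bet = -(-bank // 100) * 50 if row == 2 else 50
            pct = 0 if base == 8 else strategy[abs(base - 8) - 1]
            bet = bank * pct // 100 // 50 * 50
            bet = min(bank, max(bet, min_bet))
            higher = sum(card > base for card in deck)
            lower = sum(card < base for card in deck)
            nxt = deck.pop()
            won = nxt > base if higher >= lower else nxt < base  # ties lose
            bank += bet if won else -bet
            base = nxt
    return bank

TRIALS = 5000
rng = random.Random(1)
adrian = [play_money_cards([100] * 6, rng) for _ in range(TRIALS)]
nick = [play_money_cards([0] * 6, rng) for _ in range(TRIALS)]
print("average winnings:", sum(adrian) / TRIALS, "vs", sum(nick) / TRIALS)
print("Adrian busts:", adrian.count(0) / TRIALS)
print("Nick ahead:", sum(n > a for a, n in zip(adrian, nick)) / TRIALS)
```

The exact numbers will shift with the assumptions and the random seed, so treat them as a feel for the shape of the two outcomes rather than a check on the percentages quoted above.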

Risk

The story of Nick and Adrian shows that we can’t just look at the raw value of a strategy in our analysis; we have to consider risk as well.  And that’s going to complicate matters.  We can’t find a strategy that creates a mathematically proven equilibrium between value and risk – such a strategy doesn’t exist.  As any economist or sociologist would tell you, different people quantify the risk/reward ratio differently.  But, just because we can’t provide the solution, doesn’t mean we can’t provide a solution.

What we want to do is analyze every possible betting strategy.  We’ll look at the expected value of the strategy, and to represent the risk we’ll use the standard deviation that the strategy creates.  The higher the standard deviation, the more volatile the possible range of values, and the higher the risk. However, analyzing every possible combination of six percentages would be overwhelming.  We’ll limit the number of possible strategies by doing two things:

  • Every percentage must be less than or equal to the next percentage in line.  This makes sense – there’s no reason why you would ever, say, wager 40% on a 6 but only 30% on a 5.
  • We’ll limit our percentages to increments of 10%.  We’re already losing precision in our potential betting strategies by being forced to wager in $50 increments, so the precision we lose by limiting ourselves to 10 percent increments is not that much.
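For the curious, the strategies allowed by these two restrictions can be listed with a few lines of Python. This is just one way to enumerate them, using the standard library’s `combinations_with_replacement`, which yields non-decreasing tuples when fed sorted input.

```python
from itertools import combinations_with_replacement
from math import comb

LEVELS = range(0, 101, 10)   # 0%, 10%, ..., 100%

# Every non-decreasing run of six percentages drawn from the eleven levels.
strategies = list(combinations_with_replacement(LEVELS, 6))

print(len(strategies))                                  # prints 8008
print(len(strategies) == comb(len(LEVELS) + 6 - 1, 6))  # prints True
print(strategies[0], strategies[-1])
```

The second line is the usual “stars and bars” count: choosing 6 items from 11 levels with repetition gives C(16, 6) = 8,008.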

Those two rules took our potential number of strategies down to a manageable 8,008.  We then ran a Monte Carlo simulator to determine the expected value and standard deviation of each strategy. When looking at the results, we found that we could eliminate a large number of potential strategies from consideration.  To see why, we plotted each strategy’s expected value and standard deviation onto a scatter chart:

The general shape of the graph forms a curve sloping upwards.   In modern portfolio theory, the points on the curve form what is known as the efficient frontier.  The points that form this curve are the strategies that offer the highest value for a given level of risk.  Strategies that do not form part of this curve are not optimal, as they offer less return for their risk than another strategy would.

Let’s take a look at an individual example involving two potential strategies: Strategy A (40%|40%|40%|50%|50%|90%) and Strategy B (0%|0%|30%|50%|60%|100%).  Strategy A was determined to have a value of $1,768.28 with a deviation of $1,562.92.  Strategy B has a value of $1,807.79 and a deviation of $1,496.47.  Since Strategy B has both a higher value and lower risk than Strategy A, there’s never a reason to use Strategy A.  Thus, we can discard Strategy A from further analysis.
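The weeding-out step can be sketched as a simple dominance filter in Python. Strategies A and B use the figures quoted above. Strategy C is a made-up high-risk entry that I added only to show that a riskier strategy survives when nothing beats it on both counts. The function name is illustrative.

```python
def efficient_strategies(results):
    """Keep strategies that no other strategy beats on both value and risk.

    results: dict mapping a strategy label to (expected_value, std_deviation).
    """
    keep = {}
    for name, (value, risk) in results.items():
        dominated = any(
            other_value >= value and other_risk <= risk
            and (other_value > value or other_risk < risk)
            for other, (other_value, other_risk) in results.items()
            if other != name
        )
        if not dominated:
            keep[name] = (value, risk)
    return keep

results = {
    "A 40|40|40|50|50|90": (1768.28, 1562.92),
    "B 0|0|30|50|60|100": (1807.79, 1496.47),
    "C (hypothetical)": (2500.00, 3000.00),   # invented numbers, for illustration only
}
print(sorted(efficient_strategies(results)))
```

This pairwise check is quadratic in the number of strategies, which is perfectly fine for a few thousand of them.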

Removing these inefficient strategies leaves us with a group of 647 viable strategies. Updating our scatter point chart shows us the efficient frontier in clearer detail.

 

Unfortunately, this is where our search for an “optimal strategy” has to end.  Each of these 647 strategies is as viable as the next from a mathematical viewpoint.  It only depends on how much risk you wish to undertake in the pursuit of profit.

Personal Strategies

Now, we could leave it here, and allow you to select a strategy based on your personal level of risk, but standard deviation doesn’t do a very good job of measuring risk aversion. If a person says “I’m willing to accept a standard deviation of $3,000”, what does that really mean?  It’s clear we have to reframe the question if these results are going to be worth anything.

Luckily, we can use the standard deviation of a distribution to determine how likely a given event is to occur.  Given a dollar value, we can find how many standard deviations away from the mean each strategy is, and find the strategy that will give you the highest chance of achieving that dollar value by finding the strategy the lowest number of standard deviations away from the mean.

For example, say you wanted to find the strategy that gave you the highest chance of winning at least the modest sum of $500.  Then you’d want to use the strategy 0%|0%|0%|20%|30%|50%, where $500 is -1.012 standard deviations away from the mean, corresponding to an estimated chance of 84.4% of making at least $500.  If you wanted to shoot higher and maximize your chances of making $2,500, then the riskier strategy of 50%|90%|100%|100%|100%|100% would give you the best chance.
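Turning a number of standard deviations into a percentage is a one-liner with the normal distribution. The Python sketch below treats each strategy’s winnings as roughly normal, which is an approximation (hence “estimated” above). The mean and deviation in the example are placeholder values I picked only because they put $500 exactly 1.012 deviations below the mean; they are not the real figures for that strategy.

```python
import math

def chance_of_at_least(target, mean, std_dev):
    """Normal-approximation chance that winnings reach at least `target`."""
    z = (target - mean) / std_dev
    return 0.5 * (1 - math.erf(z / math.sqrt(2)))   # 1 - Phi(z)

def best_strategy_for(target, results):
    """Pick the label in results (label -> (mean, std_dev)) with the best chance."""
    return max(results, key=lambda name: chance_of_at_least(target, *results[name]))

# Placeholder mean and deviation chosen so that z = (500 - 1512) / 1000 = -1.012.
print(f"{chance_of_at_least(500, 1512, 1000):.1%}")   # prints 84.4%
```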

Rule Changes

The Money Cards underwent several rule changes over the years, each in favor of the contestants.  Two years into the show’s run, the rule concerning ties changed. Where before ties were considered losses (meaning it would be possible to lose on an Ace or 2), ties were now considered pushes – no money was gained or lost if the same card came up twice.  For one thing, this now meant that all strategies should end in 100% for Aces and 2’s, since there was no possible way to lose money.  It also would allow contestants to play more conservatively to find the best strategy for minimum values.

When the show was revived in 1986, two major rule changes happened to the Money Cards.  Firstly, the amount of money given to the contestant upon reaching the second row was increased to $400, making the top prize a round $32,000. The second big change had to do with when the contestant could change cards.  Now, instead of just the first card on each level being changed, you could change any card at any time, but only once per level.  This complicated the strategy involving choosing when to change cards, the analysis of which we will leave for another time.

Our Final Answer

In case you want to try these strategies out yourself, we’ve created a series of charts which will give you the ideal strategy given the minimum amount you want to win for any of the three rule sets.  While we can’t point to a single strategy and say it’s the best strategy to follow, looking at these charts and gauging how risky you would want to be would take you a long way to maximizing your potential earnings.