The depth of Hanabi

This post is a short chrono-logue about my time with the card game Hanabi, which I play with the H-group. Thus, it’s also implicitly an advertisement for why I enjoy the game Hanabi so much.

I think the progression is a bit interesting because it can be divided into almost discrete “stages”, with each stage feeling really different from the last.

0. Casual in-person play: a memory game

Like many other people in my age group, I first met the card game Hanabi in person at some summer math camp or other (either MOP or SPARC?). The rules are pretty simple to explain, so it’s popular. But we didn’t have much strategy behind it. We had the idea that we played from left to right, that a clue means “play all”, and some form of a Finesse-type blind play.

That meant the game felt kind of like a chaotic party game, sort of like the first game of One Night Ultimate Werewolf or something. The game mostly revolved around trying to remember past clues and guesswork at how people would respond. (It’s true that we would allow asking “what do you know about your cards?”, but that was only a subset of the information anyway.) I thought the game was okay, I would play it, but it definitely wasn’t my favorite game, and I’d only play it if others wanted. We were not particularly good at it; we rarely got a perfect score, if ever.

1. Online play: a guesswork game

Around when the pandemic happened, everything got virtualized, and we started playing on the hanab.live website.

We still played with the same crappy convention set (if you can even call it that), but I distinctly remember saying “this is one of the few things that works better virtually than in-person”. That’s because hanab.live is really feature-rich. The website will completely keep track of the “what do you know about all your cards” situation for you, marking both positive and negative clues, as well as letting you make arbitrary additional notes on each individual card. It also keeps track of the entire history of each game for you, so you can always rewind to see what clues had been given before.

So playing online eliminated the memory component from the game. I’m not saying memory games are always bad (I play Fish, too), but for Hanabi I preferred playing without having memory as a constraint. Nonetheless, because the game was still mostly about guesswork, we still rarely got a perfect score.

2. Starting the H-group convention: learning a new game

Around August 2020 I started learning the H-group conventions, which start to standardize the meaning of “normal” clues. This eliminates most of the guesswork component from the game, which I also thought made the game more interesting. The experience became more about logic and learning strategy, rather than guesswork and memory like it had been at the start.

The H-group plays with a spectrum of experience levels (and they are actually named “levels”, numbered 1-25 in the most recent document). For me, I actually enjoyed the experience of learning a lot of strategies. Despite the large number of conventions, there’s a requirement that strategies the H-group uses should be “intuitive and easy to remember”, so learning each individual strategy usually felt like it made sense, rather than memorizing an arbitrary or contrived rule.

So, for a while, I was slowly moving up through the level system, getting used to the various conventions the group played with.

3. Familiarity: a logic game

Once I started getting more familiar with the basic conventions, I didn’t have to spend as much effort interpreting clues anymore. After a while you develop some instincts and things become more automatic.

Instead of the “getting used to conventions” phase, you start thinking about strategy, in addition to just the logic. If I give a red clue, what do I think the next player will do? And what will the player after that do? Versus if I discard, and let someone else give the red clue, what would happen instead? What would be the next outcome?

I talked about this previously under the name foresight. It’s one example of a way in which logic and strategy mix, in which, given the conventions you know, you want to give clues that are as efficient as possible.

One other thing I should mention: my win-rate skyrocketed. It turns out that even with the most basic level 2 conventions, if you don’t make many mistakes, you can get a perfect score more often than not in standard no-variant games.¹ The “base” game often felt easy. So we would often play variants which changed the games in ways that made the game harder, like having a suit that was not touched by any rank clues, or suits where there was only one of each card, or playing clue-starved where discarding gives you 0.5 clues back instead of 1.

4. Intermediate play: bending the rules

Once you play variants, Hanabi becomes difficult enough that you cannot simply follow pre-existing conventions too rigidly. There is a great passage in the strategy document that dispels this myth:

Up until now, you may have the impression that the group has a lot of conventions, and that if you just memorize all of the conventions, you will become a really good Hanabi player. Or, you may have the impression that the conventions are like laws and that you must always follow them. Neither of these things are true.

You can read the full paragraph on the conventions website, but the short version is: now that you know the conventions, break them whenever the situation demands it. This is especially true in really hard variants, where you might always have to take a few risks in order to win because the baseline required efficiency is so high.

5. Advanced play: making trade-offs

When you first learn the H-group conventions, part of the exciting thing is that you can give clues that are really efficient; for example, a single clue could get four or five or more cards to play at once. So, beginners are taught to look for the most efficient clues among the ones they have available to them, and that’s a lot of fun.

However, as you get more advanced, it becomes important to also think about what costs you might incur by trying to set up a really efficient clue. There are a few examples of other parameters to pay attention to:

In serious play, Hanabi also has a parameter called Tempo or Pace, referring to the speed at which the cards are played. A clue can have high efficiency but be slow enough that a less efficient clue would still be better for the team’s win rate. So, it becomes necessary to balance efficiency and tempo.

One related concept that strong players talk about is Bottom Deck Rate; in Hanabi, if you discard a green 3 and the other copy of green 3 was at the bottom of the deck, it’s impossible to win. So there’s a certain notion of “safety”, in which you want to make moves that minimize the number of unsafe discards, and protect cards that might be useful later on in the game. When more efficiency is unnecessary, you often need to trade away some efficiency and give less efficient clues in the hopes of orchestrating safe discards, where you use the foresight I mentioned earlier to try to guide the team down a path where the players whose cards are trash do most of the discarding.

And finally, one more trade-off: the clarity principle. If a player gives a really complicated clue that their teammates technically could have figured out, but don’t, you might burn the whole game. An “optimal” clue might still not be worth it, even if it’s great on paper, if it’s too confusing for the humans trying to interpret the clue.

During advanced play, it no longer felt like I was just trying to set up the most efficient clue I could. Instead, it felt like I always needed to balance several different metrics and make some judgment calls. That felt really cool to me.

The hanab.live website understands this too. During each game, one of the most important numbers it displays is the future required efficiency, which calculates how many cards you still need to get divided by how many clues you will have over the rest of the game. Good players need to always look at this number when making decisions. If future required efficiency is low, play conservatively and prioritize clear, simple clues that protect cards. When future required efficiency is high, play more aggressively, and be willing to take some risks to try and capture more cards.
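As a rough sketch of that ratio (this is not the website's actual code, just the division described above; the function names and the 1.25 cutoff are made up for illustration, and where you draw the line between "low" and "high" is a judgment call):

```python
def future_required_efficiency(cards_needed: int, clues_remaining: float) -> float:
    """Cards the team still needs to get, divided by the clues left in the game.

    clues_remaining is a float so that variants like clue-starved
    (0.5 clues back per discard) can be expressed.
    """
    if cards_needed <= 0:
        return 0.0  # nothing left to get, so no efficiency is required
    if clues_remaining <= 0:
        return float("inf")  # cards still needed but no clues left
    return cards_needed / clues_remaining


def suggested_style(efficiency: float, cutoff: float = 1.25) -> str:
    """Hypothetical rule of thumb; the cutoff is an arbitrary illustrative value."""
    return "aggressive" if efficiency > cutoff else "conservative"


# Example: 12 cards still to get with 8 clues left over the rest of the game.
needed = future_required_efficiency(12, 8)
print(needed, suggested_style(needed))
```

With 12 cards still needed and 8 clues left, each remaining clue has to get 1.5 cards on average, which under this made-up cutoff counts as a reason to take more risks.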

6. Competitive play: trust and teamwork

Once you play competitively with expert players, it’s not enough to just be making judgment calls yourself. Because the foresight of thinking several turns ahead is so important, you need to trust that everyone on your team is in sync about judgments. After all, part of picking each move relies heavily on predicting what clues your teammates will give in the future.

That’s why in the H-group, we did post-game reviews after each game, where we go through every move made in the game and talk through any tough decisions that we had to make, or things that in hindsight were mistakes. Because at the highest levels of play, understanding how your teammates play, and the resulting trust and teamwork, become the heart of the game.

¹ An earlier version of this post had an exaggerated claim of “well over 80%”. It was pointed out in the comments section that this was grossly overconfident, due to both human error and tough deals.

3 thoughts on “The depth of Hanabi”

bbuurrkkss 26 September, 2023 10:12

I love Hanabi and hanab.live! In addition to all of these depths, I’ve found hundreds of hours of enjoyment in theorycrafting, testing, and playing with conventions.

purplejoe 27 November, 2023 17:34

TLDR: I think that you and many others underestimate the difficulty of No Variant.

Hello Evan, I’ve never played with you on hanab.live, but Ramanujan1729 (and myself, MarkusKahlsen, and a few others) recently started a project for No Variant Streakhunting, where we attempt to win as many games in a row of No Variant using max level conventions, along with a set of other conventions (found here: https://pad.abstractnonsen.se/hanabi-novar-conventions) and have been tracking our winrate. You can find the results here: https://docs.google.com/spreadsheets/d/1Ie5Iz4pJ2i3XJDVMDFEguUz8wwMESQL0JDtwXoWhb8w/edit#gid=1038328870

As you can see, over some ~200 games played of max level No Variant, the best of us have averaged around 80-83%. As a three player team, we have barely hit 90%, and even the best bots only win around 96% of their games. We are trying to push the human limit of hanabi and see what the highest winrate humans can achieve in No Variant is (with a modified h-group convention set). It’s sort of easier to conceptualize if you think about it in terms of games lost instead of games won. For example, going from 90% to 95% means going from losing 1/10 games played to 1/20 games played. Also, if the best players make one mistake 1/10 games played (which is already an absurdly low number to be making a mistake), then three of the best players would make 3 mistakes every 10 games. Every mistake has a pretty high possibility of bombing/discarding a crit/bdr, or causing desynchronization that already loses the game.

I wrote this long text to basically summarize that winning No Variant is easy, but consistently winning No Variant is pretty hard.

1. Evan Chen (陳誼廷) 27 November, 2023 22:56
  
  OK, you’re definitely right that I’m overconfident. I’ll nerf the claim like heck :) good call.
  
  I think when I was writing that, I was thinking back to my experience as a teacher, where I would have 3-4 new players in a table that I was spectating and talk through each of them on each turn as they were learning to interpret clues and so on. My experience was generally that even with beginner conventions, we’d still win most of the time if I made sure no one messed up. Obviously that’s a really different “idealized” in-theory situation which doesn’t apply in practice since, as you said, even the best players will make mistakes occasionally in No Variant which can rapidly cause cascading failures.