Wordle Strategies

There has been endless chatter about the best wordle strategy. What starting word to use? Once a few letters have been revealed, should one restrict to use only words with these letters, or pick words that will give the most information about other that letters could be in the hidden word? In this post I compare different strategies by calculating the average number of guesses each strategy takes to find the hidden word. Strategies are tested on the list of 12,972 possible words extracted from the source code of the game.

Dumb strategy

The simplest strategy would be just to pick words at random each time from the list of possible words. This is quite a dumb strategy, but useful to look at as a baseline to compare other strategies to. When there are nn valid words, finding the average number of attempts required for this strategy is equivalent to asking how many rolls of a fair nn sided dice on average does it take before one sees a specific side. This turns out to be nn, so for this dumb strategy it would take on average 12,972 attempts to guess the word. This is obviously far too many since there are only 6 guesses allowed in the game. Deriving that this should be nn was deceptively tricky to derive (for me at least). I outline this in the next section which can safely be skipped.

Derivation of average for dumb strateg

The expected number of attempts is the sum of each number of attempts weighted by the probability of it taking that many, which can be written as

f(n)=1nkk(n1n)k1 f(n) = \frac{1}{n} \sum_k k \left(\frac{n-1}{n}\right)^{k-1}

Writing x=n1nx = \frac{n - 1}{n}, this simplifies to

f(n)=(1x)kkxk1 f(n) = (1 - x) \sum_k k x^{k-1}

Can see that the series kkxk1\sum_k k x^{k-1} is the derivative of the geometric series kxk\sum_k x^k , which converges to 11x\frac{1}{1-x}, for x<1|x| < 1. This means kkxk1\sum_k k x^{k-1} should converge to the derivative of 11x\frac{1}{1-x} which is 1(1x)2\frac{1}{(1-x)^2}. Putting this into the above expression gives

f(n)=1(1x)=n f(n) = \frac{1}{(1-x)} = n

Excluding words

A natural way to improve upon the dumb approach outlined above is to exclude words containing letters known not to be in the hidden word. Trying this simple approach, the average number of guesses taken is just under 17. This is a big improvement over the dumb strategy, but since only six guesses are permitted, there is still a lot of space for improvement.

Next I improved upon the above by also excluding words that do not contain the letters known to be in the hidden word. With this improvement, the average number of guesses is reduced to just over 5. This is a nice jump, and is under the six guess limit. Since this is the average though, in many games the six guess maximum will be exceeded, so more is required to be able to reliably beat the game.

Ranking words

In the above strategies, the word to guess each time is chosen completely at random from the remaining list of words. This means that words with z's and q's are just as likely to be picked as words with a's, e's or s's. To address this, I assign a score to each word based on how often its letters appear. Words that have letters that occur more frequently are ranked more highly. This is done by counting the number of times each letter appears across all words and then summing the letter counts of each of the letters of a word to come up with a score for each word. For example, 'a' appears in 5,990 of the valid words, while 'z' only appears in 434 so these numbers are used as the scores for these letters. There is an added detail here that if a letter appears more than once we only count its score once, as we don't get as much new information the second time a letter appears in a word. Assigning scores in this way and sorting the words using these scores, the highest ranked word found is "aeros", which will always be the first guess. Using this strategy, the average number of guesses is reduced to just under 5 (4.95 to be exact). While this is an improvement, it's not as much as I was hoping for.

I tried improving this further, by also taking into account how often each letter appears in each position. For example 'a' appears in 2,263 words as the second letter, but only in 737 as the first letter. Using this approach, the highest ranked word is "cares". With this strategy though, I only get a marginal improvement, with an average of 4.86 attempts.

I was a little surprising to me that adding all this additional information doesn't offer much improvement over the simpler approaches.

Troubleshooting

To investigate why these strategies don't behaving as well as expected, I looked at some of the worst cases. For the strategy that uses letter position and frequency information, one of the worst was "eater", which took 12 guesses. The strategy choose the following sequency of words

  1. cares

  2. baler

  3. pater

  4. mater

  5. dater

  6. gater

  7. hater

  8. water

  9. oater

  10. tater

  11. rater

  12. eater

This shows that after three guesses, the strategy knew four out of five of the letters and their positions, yet it took a further nine guesses to find the correct word. After the first three guesses, each guess is only ruling out a single letter which isn't a very good use of a guess. Another approach is clearly needed that can maximise the amount of information learned each time a guess is made, which I hope to look into in future.

Software

To test the strategies, I wrote a Julia package called WordleStrategies. This evaluates how each strategy does by testing it on each of the words from the game. In addition one can use a strategy interactively which can be used for playing along with the actual game.