Originally posted January 12, 2022.
As the game has grown in popularity, I have seen more and more people discuss their approach to the game or what they generally use as an opening word (since you have no clues before the first guess). You could, of course, look at how frequent certain letters are in English, and try to find a word using the first five such letters to maximize your odds of a match.
But keep in mind that in this context, you are only dealing with a limited set of 5-letter words, so the frequency of letters may be different. I also recently saw someone share this blog post from Tyler Glaiel proposing a “mathematically optimal” first choice. That started me on my own rough analysis of letter frequency and word choices, leading to the tips you see below.
As Glaiel discovered, you can easily pull up a list of allowed word choices by looking at the source code for the online Wordle. Furthermore, the code includes a list of 2,315 words that the game draws on as possible winning solutions. (Running daily, this still leaves enough challenges for the game to run until October 2027 without repeating a word!)
If you optimize for the winning word list, you will quickly find that the letter frequencies are indeed somewhat different from those in English writing in general; for example, “R” ranks much higher than it does in a typical list.
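For anyone who wants to reproduce this kind of count, here is a minimal Python sketch. The six-word `SAMPLE_WORDS` list and the function name are placeholders for illustration only; a real run would load the full list of winning words instead.

```python
from collections import Counter

# Placeholder sample, NOT the real 2,315-word solution list.
SAMPLE_WORDS = ["crane", "ratio", "pulse", "nymph", "wonky", "clasp"]


def letter_frequencies(words):
    """Count how many times each letter occurs across all words."""
    counts = Counter()
    for word in words:
        counts.update(word.lower())
    return counts


freqs = letter_frequencies(SAMPLE_WORDS)
total_letters = sum(freqs.values())

# Print each letter with its share of all letters, most common first.
for letter, count in freqs.most_common():
    print(f"{letter.upper()}: {count} ({count / total_letters:.1%})")
```

Swapping in a general English word list for `SAMPLE_WORDS` and comparing the two rankings is one way to see how the Wordle-specific frequencies differ.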
Glaiel’s approach leads to a suggested opening word that uses all five of the most common letters above: ROATE. (ROATE can be an acronym for an obscure financial term, but its inclusion in the Scrabble word list appears to be based on it being an archaic variant of “rote.”) That play gives you the best chance of having at least one match; it covers nearly 40% of the letters in the complete list of winning words.
However, that does not take into account the distribution of those letters across the different words. If you look at which combinations of letters produce at least one match with the most unique words in the list, you can do better by focusing on vowels. Many of these combinations do not produce allowed words, but AUREI (plural of aureus, a gold coin in ancient Rome) and URAEI (plural of uraeus, a representation of a cobra in ancient Egyptian art) give you four vowels plus the most common consonant. These openings match against 2,201 different words on the list, compared to 2,120 for ROATE.
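The difference between counting letters and counting words is easy to see in code. The sketch below counts how many words share at least one letter with a guess; the word list is again a small placeholder, so the numbers it produces will not match the 2,201 and 2,120 figures from the full list.

```python
# Placeholder sample, NOT the real solution list.
SAMPLE_WORDS = ["crane", "ratio", "pulse", "nymph", "wonky", "clasp"]


def shares_a_letter(guess, word):
    """True if the guess and the word have at least one letter in common."""
    return bool(set(guess) & set(word))


def coverage(guess, words):
    """Number of words that would show at least one match for this guess."""
    return sum(1 for word in words if shares_a_letter(guess, word))


for guess in ("roate", "aurei", "uraei"):
    print(guess.upper(), coverage(guess, SAMPLE_WORDS))
```

Because the check uses sets, AUREI and URAEI always score the same here; letter order only matters once you look at positions.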
In picking between these two options, we can also look at the frequency of letters in each position. For example, “A” is the last letter in only 64 winning words, but it appears as the second or third letter over 300 times. The table below shows the top 10 letters for each position.
Taking these rankings into account, URAEI has more letters with high frequency for their positions, so it seems more likely to produce a green letter (a matching letter already in the correct position) than AUREI.
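A per-position count like this can be sketched in a few lines of Python. The sample list and the `positional_score` helper are assumptions made for illustration; the score simply adds up how many words have each of the guess's letters in the same slot, which is one rough way to compare two anagrams such as AUREI and URAEI.

```python
from collections import Counter

# Placeholder sample, NOT the real solution list.
SAMPLE_WORDS = ["crane", "ratio", "pulse", "nymph", "wonky", "clasp"]


def positional_frequencies(words, length=5):
    """Return one Counter per position, counting the letters seen there."""
    tables = [Counter() for _ in range(length)]
    for word in words:
        for position, letter in enumerate(word[:length]):
            tables[position][letter] += 1
    return tables


def positional_score(guess, tables):
    """Sum, over positions, of how many words have the guess's letter there."""
    return sum(tables[i][letter] for i, letter in enumerate(guess))


tables = positional_frequencies(SAMPLE_WORDS)
for position, table in enumerate(tables, start=1):
    print(position, table.most_common(3))
```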
But what I really wanted to find was the first word or two that would give you the most information about the solution. Common letters more easily lead to matches, but that often does not tell you much; the presence of an “E” does little to narrow down the possible choices. In contrast, matching a rare letter gives you a big advantage; the presence of a “J” immediately leaves you with just 27 possible solutions.
The problem, of course, is that guessing rare letters is usually inefficient; the flip side of guessing “J” is that the absence of a “J” also tells you next to nothing. So what if we could come up with a word (or two) that strikes a balance between the most common and the rarest letters to give you the best chance of unique matches across the entire word list?
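One way to put a number on this trade-off is to look at how a single letter splits the word list into words that contain it and words that do not. The sketch below is only an illustration with a placeholder list and hypothetical function names; it treats the larger of the two groups as the "worst case" left over after testing that letter.

```python
# Placeholder sample, NOT the real solution list.
SAMPLE_WORDS = ["crane", "ratio", "pulse", "nymph", "wonky", "clasp"]


def split_sizes(letter, words):
    """Return (words containing the letter, words without it)."""
    with_letter = sum(1 for word in words if letter in word)
    return with_letter, len(words) - with_letter


def worst_case_remaining(letter, words):
    """Size of the larger group, i.e. the less helpful outcome."""
    return max(split_sizes(letter, words))


for letter in "akj":
    print(letter.upper(), split_sizes(letter, SAMPLE_WORDS),
          worst_case_remaining(letter, SAMPLE_WORDS))
```

A letter that appears in very few words leaves nearly the whole list standing when it is absent, while a letter that splits the list close to evenly narrows things down whichever way the result goes.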
Determining that set of letters definitively would likely require a good bit of computation, but for now, I just did some simple analysis in Excel to come up with some suggested openings that seem to provide the best coverage based on different optimizations. With just two words, you can guarantee a match with at least one letter in every possible solution.
If you use ROATE as your first guess and want to continue focusing on the most common letters, PULIS (a type of dog) or PILUS (a bacterial appendage) will add enough coverage to ensure you match at least one letter. (This exploration also introduced me to the word SULCI, which essentially describes the wrinkles on your brain; ROATE and SULCI match at least one letter in every winning word except “nymph” and “pygmy.”)
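Checking whether a pair of openers leaves any word without a match takes only a few lines. This is a sketch using a four-word placeholder list (which happens to include “nymph” and “pygmy”), not the full list of winning words.

```python
# Placeholder sample, NOT the real solution list.
SAMPLE_WORDS = ["nymph", "pygmy", "crane", "wonky"]


def uncovered(guesses, words):
    """Words that share no letter with any of the guesses."""
    guessed_letters = set("".join(guesses))
    return [word for word in words if not guessed_letters & set(word)]


print(uncovered(["roate", "sulci"], SAMPLE_WORDS))
print(uncovered(["roate", "pulis"], SAMPLE_WORDS))
```

Run against the full list, an empty result would mean the pair matches at least one letter in every possible solution.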
RATIO is another option that omits “E” but still uses more common letters; it actually matches more unique words than ROATE. By adding PULSE, you can again ensure at least one match. (WHEYS is a follow-up that includes some less-common letters, but leaves 11 words still without matches.) Also, note that RATIO appears on the winning word list, while ROATE does not.
Returning to URAEI, you can combine it with PLOTS to continue using the most common letters while also ensuring coverage of every possible word. But if you want to include more rare letters without sacrificing too much efficiency, you can balance those rarer letters with the high frequency and coverage of URAEI. From experimentation, WONKY URAEI seems to give the best balance of coverage (they match at least one letter in every word) and rarity (“W” and “K” are both in the bottom half of letters by frequency, with “Y” in 12th place).
While working on these suggestions, I noticed my friend Eduardo Vela on Twitter posting lists of 5-letter words and started a discussion. He went one step further, trying to come up with a list of four words that match as many letters as possible. Theoretically, it may be possible to get enough info from those four words to automatically determine the actual solution.
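Whether a fixed set of openers pins down the solution can be tested by simulating the colored feedback for every candidate and looking for candidates that produce identical results. The sketch below assumes the standard Wordle coloring rules (greens first, then yellows limited by how many unmatched copies of a letter remain); the function names and the tiny candidate list are made up for illustration.

```python
from collections import Counter


def feedback(guess, solution):
    """Return G (green), Y (yellow), or - (gray) for each guess letter."""
    result = ["-"] * len(guess)
    unmatched = Counter()
    # First pass: mark greens and tally solution letters not matched in place.
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            result[i] = "G"
        else:
            unmatched[s] += 1
    # Second pass: mark yellows while unmatched copies of the letter remain.
    for i, g in enumerate(guess):
        if result[i] == "-" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)


def ambiguous_groups(guesses, solutions):
    """Groups of solutions that give identical feedback for every guess."""
    groups = {}
    for solution in solutions:
        signature = tuple(feedback(g, solution) for g in guesses)
        groups.setdefault(signature, []).append(solution)
    return [group for group in groups.values() if len(group) > 1]


# Placeholder candidates, NOT the real solution list.
candidates = ["jolly", "golly", "crane"]
print(ambiguous_groups(["clasp"], candidates))
```

If `ambiguous_groups` comes back empty for the full solution list, the openers alone identify every answer; any groups it does return are the words that would still need a tiebreaker.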
In going back over some of my analysis, my best suggestion for this challenge ended up being CLASP EDIFY GROWN THUMB. Eduardo’s best idea so far included the same letters but with “K” instead of “W”; either set seems to produce the most matches, but personally I thought using “W” seemed to give more unique results.
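A quick way to compare four-word sets like these is to count the distinct letters they test and list the ones they leave out. The helper name below is just for illustration.

```python
import string


def distinct_letters(words):
    """Set of all letters used across the given words."""
    return set("".join(words).lower())


opening = ["clasp", "edify", "grown", "thumb"]
used = distinct_letters(opening)
unused = sorted(set(string.ascii_lowercase) - used)

print(len(used), "letters tested; untested:", "".join(unused).upper())
```

Since four five-letter words can test at most 20 distinct letters, any repeated letter in the set costs you coverage.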
Of course, none of these multi-word approaches work on Hard Mode (which requires you to use matches in subsequent guesses); while it is fun to come up with efficient choices for solving arbitrary puzzles, I do suggest trying to do Hard Mode after you have played Wordle for a while! I have been trying to use Hard Mode more recently, and personally I am still debating between RATIO, URAEI, or WONKY as my default opening word.