Quardle oodle ardle wordle doodle
The hunt for the best Wordle opening
(Updated following New York Times acquisition)
Quardle oodle ardle wardle doodle... the magpies said.
When the title of this post isn't busy almost being the refrain of the famous poem by New Zealand poet Denis Glover (no, not related to Danny or Donald), it could almost pass as a game of Wordle, the latest internet sensation to be taking me by storm.
Wordle, in case you didn't already Google it (you got here somehow!), is a variation on the Word Mastermind game, but using five letters per word rather than 3-4. Each guess has to itself be a valid dictionary word.
The Wordle website publishes only one new puzzle each day, and every visitor does the same puzzle, which is a nice touch.
There's a maximum of six guesses allowed. On my first puzzle, I took four attempts:
Green means the letter is in the right place. Yellow means the letter is in the word, but not at that position.
The next day, I solved it in three, using the same opening guesses:
I took a bit of care choosing my first guess - THENS. It's a valid word, but don't ask me to use it in a sentence. I wanted a word that hit some of the most common consonants, with a good chance of being in the right place.
The best opening word?
But is it the best opening word? Well, it only takes a bit of coding and CPU time to crunch through all the possibilities.
Taking a sneaky peek at the Wordle source code reveals a list of 2309 words that appear to be candidate solutions, and an additional list of 10638 words (typically less common ones) that are permitted as guesses. That gives a total of 12947 valid guess words.
We - the computer and I - evaluated all the opening guesses by these measures:
- Number of words that can be solved directly (only one possible solution fits the clues)
- The 'worst case' number of possible solutions remaining
- The 'almost worst case' 90th percentile
The aim is to narrow down the possible solutions as quickly and reliably as possible.
I'm assuming that each day's solution word is drawn completely at random from the list. Otherwise, the advice here mightn't be optimal.
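For the curious, here's a rough Python sketch of how those three measures could be computed for a single opening guess. It isn't necessarily the code behind the numbers in this post: the function names (score, evaluate), the G/Y/- clue notation and the nearest-rank way of taking the 90th percentile are all assumptions made for illustration, and the word list is a tiny stand-in for the real list of 2309 solutions.

```python
from collections import Counter

def score(guess, solution):
    """Clue string for a guess: 'G' green, 'Y' yellow, '-' grey."""
    # Solution letters not matched exactly are the ones available for yellows.
    spare = Counter(s for g, s in zip(guess, solution) if g != s)
    clue = []
    for g, s in zip(guess, solution):
        if g == s:
            clue.append("G")
        elif spare[g] > 0:
            spare[g] -= 1
            clue.append("Y")
        else:
            clue.append("-")
    return "".join(clue)

def evaluate(guess, solutions):
    """Return (solved directly, worst case, 90th percentile) for one guess."""
    # Group the solutions by the clue they would produce for this guess.
    groups = Counter(score(guess, s) for s in solutions)
    # One entry per solution: how many candidates remain if it is the answer.
    remaining = sorted(groups[score(guess, s)] for s in solutions)
    solved = sum(1 for n in groups.values() if n == 1)
    worst = remaining[-1]
    # Nearest-rank 90th percentile (an assumed definition), integer arithmetic.
    rank = (9 * len(remaining) + 9) // 10
    return solved, worst, remaining[rank - 1]

# Tiny stand-in list, assuming every word is equally likely to be the answer.
sample = ["POOCH", "HOUND", "PUPPY", "RAISE", "ARISE", "SEMEN"]
print(evaluate("RAISE", sample))  # (3, 3, 3)
```

Running evaluate over every valid guess and sorting the results would give a ranking by whichever measure you prefer.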
It turns out that just one word isn't enough to solve many puzzles directly - at best around 2% (opening with LATEN or ROTAN).
So I focussed on the remaining two measures. Crudely averaging the two, the first guess that comes out on top is RAISE. Other commentators, such as Matt Rickard, came to the same initial conclusion - but, like me, he later found a better option (his spoiler is also my spoiler).
The worst case clue for RAISE, unsurprisingly, is all grey - no letters match. There are 167 possible solutions, including POOCH, HOUND and PUPPY.
That same list also applies to ARISE, being an anagram of RAISE. But to break the tie, RAISE scores better on the 90th percentile measure, beating ARISE's 51 possible solutions, with just 41:
SEEDY, SPEND, STEED, SWEET, SMELT, ZESTY, FETUS, UPSET, ONSET, UNSET, SPELL, SHEEN, SPELT, SLEEP, SLEET, BESET, SWEEP, SHEET, SHELL, STEEL, SMELL, SPECK, SEVEN, SHELF, STEEP, SPEED, SWELL, PESKY, ETHOS, SWEPT, SCENT, SPENT, PESTO, SLEPT, NOSEY, BUSED, SHEEP, TESTY, SETUP, SLEEK
Wait, that's only 40. Oops, I almost left out SEMEN. That one's sure to cause some consternation if it ever comes up.
(Actually it's a tie between that group, and the one including OVATE and AMAZE, which also has 41 possibilities.)
If you looked closely at SEMEN, it may have alarmed you that it includes two E's. It didn't come up in the Wordle instructions or my first couple of puzzles, but sure enough, many of the solution words have repeated letters.
Repeated letters will give rise to some difficult puzzles - and there are some quite nasty words in there, like SISSY.
To help you place repeated letters, each position in the solution gives at most one clue tile. So for example, if the solution is AMASS but you guessed SASSY, you'll see yellow, grey and green S's:
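In code, that rule could look something like the sketch below. It's an illustration only, not Wordle's actual source: the function name score and the G/Y/- notation (green, yellow, grey) are made up here, and it assumes yellows are handed out left to right once the greens have been claimed.

```python
from collections import Counter

def score(guess, solution):
    """Clue string for a guess: 'G' green, 'Y' yellow, '-' grey."""
    # Solution letters in green positions are used up; the rest are spare
    # and can each earn at most one yellow tile.
    spare = Counter(s for g, s in zip(guess, solution) if g != s)
    clue = []
    for g, s in zip(guess, solution):
        if g == s:
            clue.append("G")       # right letter, right place
        elif spare[g] > 0:
            spare[g] -= 1          # claim one spare copy of this letter
            clue.append("Y")       # right letter, wrong place
        else:
            clue.append("-")       # no unclaimed copy left in the solution
    return "".join(clue)

print(score("SASSY", "AMASS"))  # YY-G-
```

The three S's in SASSY come out yellow, grey and green respectively: AMASS has only two S's, one taken by the green tile and one by the first yellow, leaving nothing for the S in the middle.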
The best opening two words!
Taking RAISE as the first guess then, is there a magic second guess that always solves the puzzle?
Turns out that no, there isn't. A good fixed second guess is LOFTY, but there are still eight words that don't match any letters (CHUMP, PUNCH, HUMPH...).
That's a problem in general for a three-guess method: two guesses cover at most 10 letters, leaving another 16 in the alphabet untouched.
However, the worst outcome from RAISE + LOFTY is 21 possibilities, if the clues give just yellow A and R (GRAND, CHARM, AWARD...), or yellow O and R (JUROR, GROUP, WRONG...).
Are there any fixed first two guesses that do better than RAISE and LOFTY?
Why yes, there are. Crunching through the 83,805,931 valid pairs of guess words, and identifying the top candidates for the three measures already mentioned, a clear winner emerges:
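As a sanity check on that pair count, and to show how a fixed pair of guesses might be judged, here's a small Python sketch. The names score and solved_by_pair and the G/Y/- clue notation are assumptions for illustration rather than the code actually used, and the three-word sample stands in for the full solution list.

```python
from collections import Counter
from math import comb

def score(guess, solution):
    """Clue string for a guess: 'G' green, 'Y' yellow, '-' grey."""
    spare = Counter(s for g, s in zip(guess, solution) if g != s)
    clue = []
    for g, s in zip(guess, solution):
        if g == s:
            clue.append("G")
        elif spare[g] > 0:
            spare[g] -= 1
            clue.append("Y")
        else:
            clue.append("-")
    return "".join(clue)

def solved_by_pair(first, second, solutions):
    """Count solutions identified uniquely by the combined clues of two guesses."""
    groups = Counter((score(first, s), score(second, s)) for s in solutions)
    return sum(1 for n in groups.values() if n == 1)

# Unordered pairs that can be drawn from the 12947 valid guess words.
print(comb(12947, 2))  # 83805931

# Toy sample only; the real comparison runs over all 2309 solutions.
sample = ["PUPPY", "POOCH", "HOUND"]
print(solved_by_pair("RAISE", "LOFTY", sample))  # 1
print(solved_by_pair("SOARE", "CLINT", sample))  # 3
```

Treating the two clues as a single combined key is what makes the order of the two guesses irrelevant, which is why only unordered pairs need to be counted.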
SOARE + CLINT
Let it be known that the dictionary defines SOARE as 'a young hawk'. But now it has a new significance: the best five-letter word to be paired with CLINT since CLYDE the orangutan.
Together, these two guesses directly solve 641 of the 2309 possible puzzles. So around one in four puzzles can be solved with certainty on the third attempt. The runner-up pair, SLANT + PRICE, is close behind with 635.
Our earlier favourite, RAISE + LOFTY, lags well back on 435. RAISE does have better pairings on this measure though; the best is RAISE + CLOTH with 581. Very good, but not quite a SOARE + CLINT.
The worst-case clue for SOARE + CLINT is yellow A, R and E. That has 25 possibilities:
AMBER, BAKER, BREAD, BREAK, DEBAR, DREAD, DREAM, EAGER, FREAK, GAMER, GAYER, GAZER, HAREM, MAKER, PAPER, PARER, PAYER, RARER, REBAR, REHAB, REPAY, WAFER, WAGER, WAVER, WREAK
To its credit, RAISE + LOFTY did do better on this measure, with 21 as mentioned. Another honourable mention goes to SOARE + INTEL (or INLET), with 22 (COMFY, FOGGY, WOOZY...).
There are 15 words that don't match any letters in SOARE + CLINT:
BUDDY, BUGGY, DUMMY, DUMPY, FUZZY, GUMMY, GUPPY, HUMPH, JUMPY, MUDDY, MUMMY, PUDGY, PUFFY, PUPPY, PYGMY
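Finding those untouched words is a simple set check. A minimal Python sketch, with untouched_words as a made-up helper name and a short sample in place of the full solution list:

```python
def untouched_words(guesses, solutions):
    """Solutions that share no letters at all with any of the guesses."""
    covered = set("".join(guesses))
    return [w for w in solutions if covered.isdisjoint(w)]

# Short sample only; the full list would be the 2309 solution words.
sample = ["BUDDY", "FAVOR", "PYGMY", "HUMPH", "AMBER"]
print(untouched_words(["SOARE", "CLINT"], sample))  # ['BUDDY', 'PYGMY', 'HUMPH']
```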
Follow-on guesses
SOARE + CLINT gives good odds for cracking the puzzle on the third attempt. But inevitably some clues leave too many possibilities, so a third exploratory guess may be needed.
You may want to choose a guess based on the clues (green and yellow tiles), and there are plenty of possible approaches, but here are some suggestions:
- Usually good: DUMPY or PUDGY
- Or if E is yellow, consider: FUMED
- Or if A or R is yellow, consider: BARMY
And if a third exploratory guess doesn't get you on the green, hitting it with GOWFS might just keep you in the game (it's a valid word, related to golf).
Letter frequencies
After the opening two or three exploratory guesses suggested above, you're on your own to narrow down the possibilities. Using a computer is cheating (whoops).
The following chart of letter frequencies might help. This is drawn from the number of times each letter appears in the possible solution list, from most to least frequent: EAROT LISNC UYD HPMGB FKWVZ XQJ.
Frequency in each position is shown, abusing the Wordle colour scheme with green meaning very common, yellow fairly common, and grey meaning rare or absent. You can easily see that S is common at the start of a word, and Y is common only at the end.
Note that SOARE + CLINT covers the ten most common letters, and each letter is in its most common location - with the exception of R. That might be why it's such an effective opening.
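If you'd like to rebuild a chart like this yourself, the counting could be done roughly as follows. This is a sketch under assumptions: letter_frequencies is a made-up name, every occurrence of a letter is counted (so SEEDY adds two E's), and the three words are just a sample.

```python
from collections import Counter

def letter_frequencies(words):
    """Return (overall letter counts, list of per-position letter counts)."""
    overall = Counter("".join(words))
    by_position = [Counter(w[i] for w in words) for i in range(5)]
    return overall, by_position

# Three-word sample; the chart is drawn from the full solution list.
sample = ["SEEDY", "SPEND", "STEED"]
overall, by_position = letter_frequencies(sample)
print(overall.most_common(1))  # [('E', 5)]
print(by_position[0]["S"])     # 3
print(by_position[4]["Y"])     # 1
```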
Let's try that out...
For today's Wordle I tried out the SOARE + CLINT opening:
It almost led me straight to the solution - if it wasn't for those pesky duplicate letters.
Now all I need is something to distract me for the next 23 hours, 14 minutes and 27 seconds.
TL;DR: SOARE + CLINT is my pick for the best Wordle opening
BONUS EXTRA solution list analysis
Shortly after publishing this article, I suffered my first Wordle humiliation:
The solution was FAVOR.
What went wrong? First up, I didn't follow my own advice. This word is in a blind spot for SOARE + CLINT, so I should've followed up with BARMY (since A and R are yellow).
Next, as a Commonwealth citizen, I overlooked that -OR is actually a popular word ending. I'd normally spell the solution word as 'favour'. After fumbling a guess with MAYOR (I'd already eliminated M, d'oh!), my final guess was on track with a US spelling, VAPOR, but my luck ran out and I finished with a double bogey.
That experience brought me back to the word lists I yoinked from the Wordle source code, for a closer look. (In case you're wondering now, YOINK is not a valid guess word.)
The solutions are (almost) all US spellings, but both US and UK spellings are accepted as guesses.
If you're a UK speller, solution words to watch out for are:
CHILI, AGING, FECAL, FETUS, FETAL, FETID, FILET, SHEIK, WAGON, WOOLY
... and of course, all those ---OR words:
ARBOR, ARDOR, ARMOR, COLOR, FAVOR, HONOR, HUMOR, LABOR, RIGOR, RUMOR, SAVOR, TUMOR, VALOR, VIGOR, VAPOR
There are a few UK-specific spellings you won't find hidden, but that could be useful as guesses:
ENROL, MOULD, ODOUR, TONNE, TYRES
Also a valid guess is PZAZZ (an accepted UK spelling of 'pizzazz'), but please don't.
One non-US spelling may come up as a solution: MOULT. That one will have the yanks tearing their hair out. Glad we could return the FAVOR.
Plural nouns and conjugated verbs ending in S aren't used as hidden words. (This didn't bode well for my THENS opening.) That's not to say all plurals are excluded - FUNGI, CACTI and RADII are some that could appear.
While most of the solution words are well-known, here are a few less commonly used ones to watch out for:
- Excuse me, Professor Brainiac: AXION, UMBRA, FICUS, BETEL, ILIAC, SUMAC, HUMUS
- Barnyard and beyond: OVINE, SHORN, TAPIR, EGRET
- Geolophy and geograly: SCREE, BUTTE, BAYOU, BIOME
- Ewww, gross: HYMEN, GONAD, SPERM, SEMEN (again)
- Pardon my French: ENNUI, BOULE, ECLAT
- In the vernacular: FELLA, MATEY, MAMMY, HUSSY, NINNY, BIDDY, BILGE
- OK, old-timer: TRICE, TWIXT, OMBRE, ASCOT, UTILE, AUGUR, CRUMP
- Can I say spello?: GIPSY, CAPUT, EMCEE, GAYLY, BELIE, LEANT, BONEY
- Is that even a word? NATAL, RECUT, PLIER, QUASI, STEIN, ABLED, SWAMI, AFIRE, TULLE, DROIT
And one final observation. A word that's been a strong candidate for best first word turns out to be best left to last: ADIEU
Addendum: New York Times changes
Since going viral, Wordle was acquired by the New York Times and moved to their website. The game is mostly unchanged, save for cosmetics and the removal of a few contentious words from the solutions and valid guess lists.
Words removed from the solution list: AGORA, PUPAL, LYNCH, FIBRE, SLAVE, WENCH
In the pre-NYT version of this post, I already highlighted AGORA as one of the more obscure solution words, and FIBRE as being both a non-US spelling and a dupe of FIBER.
AGORA would've been the daily solution shortly after the New York Times acquisition, and this omission caused a brief splintering of the Wordle community as people were still migrating from the original host. Maybe the new owners were concerned about frustrating existing players, who might wrongly suppose the NYT was making the game harder.
There were also 19 words removed as valid guesses, mostly offensive slurs relating to race, sexual orientation or proclivity, which I'm too much of a timid or cowardly person (probably male) to list in full here.
Highlighting a few that are hopefully acceptable to discuss (and if not - sorry!): BITCH, DYKES (along with DYKED and DYKEY), FAGOT, HOMOS, SLUTS and PUSSY.
(If you really want to see the complete list of removals, they're quoted here in a Hacker News comment.)
I refrained from calling myself one just now, but I don't think many people would be offended by PUSSY, at least without some context. It's one of the few sexually suggestive words that's socially acceptable to call out from your front porch (provided you own a cat), or recite to your children at bedtime. But it's also a word that the New York Times notably has a tense relationship with.
Clumsily, none of the words removed as solutions were added to the list of valid guesses - even though only one (looking at you, WENCH) is similar to words removed directly from the guess list.
Clearly the aim was to only remove words contemporarily regarded as likely to be offensive. Some common words with similar meanings, but that lack (or have lost) the negative connotations, are still included. Notably, QUEEN, QUEER, GAILY, GAYLY (again) and GAYER are all in the solution list. It might take more time, but removed words DYKE- and HOMOS are surely getting close to being 'no longer offensive' too. The New York Times themselves referred to 'homo' as 'an old derogatory' in a 2014 article: The Decline and Fall of the 'H' Word.
Continuing the bias towards US English, some potentially offensive words in UK vernacular might have been overlooked by the NYT team. FAIRY and PANSY are still in the solution list, despite being nearly equivalent to FAGOT (or more commonly, 'faggot') in the US, which was removed. Similarly, POOFS and POOFY are accepted as guesses. Although PUSSY isn't being permitted as a guess, MINGE and MUFFS are, and FANNY is even lined up for its day in the sun as a future solution.
You can't guess BITCH, but you can take a punt with BINTS.
There are a few words sometimes regarded as offensive racial slurs still in the solution list. That includes GYPSY and (as mentioned) GIPSY, FRITZ and GOLLY. Outside of Wordle, you really can't win with GOLLY: either it has racist overtones, or it's a blasphemous euphemism.
Looking at the remaining list of valid guesses, there are plenty more ethnic and racial pejoratives that haven't been cleansed. I found 15, mostly with help from Wikipedia's List of ethnic slurs. I'm not callous enough to quote the list as it singles out almost everyone for denigration, but I'm entitled to mention HONKY as one word that I'm personally offended by. (If I ever catch you kids tapping that word into your Wordle puzzle, I'll take you aside and firmly ask you to wash your fingers with soap.)
As a point of reference, the unmentionable word I just mentioned, along with many that I didn't, were expurgated from the Official SCRABBLE Player's Dictionary as far back as 1996.
Another hint that the NYT changes may've been rushed: there's still a 'who's who' of taboo four-letter words that are allowed as guesses, mostly in -S plural or present tense forms. That includes words sharing roots with all of George Carlin's 'Seven Words You Can Never Say on Television'.
Polygon quotes the NYT on the reason for the changes:
Offensive words will always be omitted from consideration.
That's fair and understandable, but I'm curious why disallowing guessing offensive words is at the top of the agenda. When you enter a guess, it only appears on your screen. And if you attempt to guess a disallowed word, it still appears on your screen. Sharing Wordle results via social media only shows the grid colours, not the guessed words (avoiding spoilers) - and it's unlikely anyone will infer that you used a bad word (let alone be offended by it).
It may be that the NYT regard the Wordle code as something they're publishing, and censored it by removing objectionable words appearing there in plaintext.
What makes the changes controversial is the philosophical clash with the stated intent of the game, which is to allow any valid five-letter English word as a guess, regardless of meaning or association. Removing words then implies that they're no longer valid English. Removing all objectionable words from the language makes it difficult to directly express objectionable concepts, and perhaps even to form objectionable thoughts. All of a sudden, our fun little word game is starting to take on a distinctly dystopian vibe.
That reminds me, another two words that rightfully don't belong in the game of Wordle: SAPIR and WHORF.