How does what3words handle similar combinations of words?

With what3words we’ve taken something that many people consider inherently complex and irrelevant to them (GPS coordinates) and turned it into something accessible that people use, want to use, and benefit from. Our goal is always to explain it as simply as possible, so that everyone can understand it.

what3words rightly receives public scrutiny because of its use by emergency services. Our partners and independent researchers often look into how what3words works in some detail. In this post, we’ll give some more context on a few related aspects of the system: shuffling, similar sounding combinations of words, and our Autosuggest feature.

The basics

We chose 3 words, as 3 units of information are easy for most people to work with. And we chose 3 metres because that is precise enough for most everyday purposes and in line with the capabilities of standard phone GPS chips. And 3 of everything keeps it simple.

We then need 25,000 words for the land, and an additional 15,000 for the sea.

For context, many studies in English suggest that an average person knows 20,000–40,000 words. That figure counts only the dictionary form of each word, so once plurals and other forms of the same word are included, the actual vocabulary is much larger.
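To give a feel for the numbers behind those word counts, here is a minimal sketch that counts the ordered 3 word combinations available from each list. It assumes every ordered triple of words is usable and that sea addresses can draw on the combined list of 40,000 words; neither detail is confirmed in this post. The Earth surface area (about 510 million square kilometres) is an approximate outside figure, not one from this article, and all names in the code are illustrative.

```python
# Word-list sizes quoted in the text above.
LAND_WORDS = 25_000
SEA_WORDS = 15_000

# Approximate surface area of the Earth (an outside, rounded figure).
EARTH_SURFACE_KM2 = 510_000_000
SQUARE_SIDE_M = 3


def three_word_combinations(vocabulary_size: int) -> int:
    """Number of ordered triples that can be formed from a word list."""
    return vocabulary_size ** 3


def squares_needed(area_km2: int, side_m: int = SQUARE_SIDE_M) -> int:
    """Rough count of side_m x side_m squares needed to cover an area."""
    area_m2 = area_km2 * 1_000_000
    return area_m2 // (side_m * side_m)


land_combinations = three_word_combinations(LAND_WORDS)
total_combinations = three_word_combinations(LAND_WORDS + SEA_WORDS)
earth_squares = squares_needed(EARTH_SURFACE_KM2)

print(f"Combinations from the land list:     {land_combinations:,}")
print(f"Combinations from the combined list: {total_combinations:,}")
print(f"Approximate 3m squares on Earth:     {earth_squares:,}")
```

Under these assumptions the combined list yields 64 trillion combinations, comfortably more than the roughly 57 trillion squares needed to cover the planet.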

Shuffling

The overwhelming majority of similar-sounding 3 word combinations will be so far apart that an error is obvious, but there will still be cases where potentially confusable word combinations are nearby.

Confusability

When we first designed what3words, we were wary of including related words like plurals, which might be confused with their singulars. Our own research has found that people confuse plurals only about 5% of the time when hearing them read out loud (which makes sense: languages typically evolve to be efficient while still keeping semantically distinct words like plurals or tenses possible to identify). So, to improve the accessibility of the overall vocabulary used, we decided to keep plurals and other inflected forms (this is in English — there are many differences in our other language versions). In addition, words are less easily confusable when people read them out deliberately and with pauses between them — as is typically the case with what3words — compared to free-flowing speech.

Many languages contain homophones (words that are spelled differently but pronounced the same), and English has many, especially once you consider its wide range of accents. We work hard to remove homophones in all our language versions, but being entirely comprehensive is a near-impossible task, particularly given the range of accents and therefore word pairs to consider. As with any potentially confusable words, though, we’re confident that the probability of these resulting in a real-life confusion is very low, and we are able to make adjustments to Autosuggest to help handle any edge cases.

Our team of linguists have several ongoing projects in this area (for example, analysing voice recordings of 3 word addresses across multiple languages and accents to identify potential difficulties in speech recognition).

We’ll look to publish some of our linguistic research in a future article, and share how we’re able to use any findings to improve our products.

Probability

Examples of similar or potentially confusable combinations of words can of course be found close to each other by searching systematically with our software. How many you find depends on the parameters you set for “similar” or “confusable”, and on how you define “close”.

Thousands of 3 word addresses across the UK with a similar-sounding address nearby can sound like a large number, but it’s worth keeping in mind the orders of magnitude mentioned above. Let’s say there are 10,000 potentially confusable combinations ambiguously close to each other in the UK. That is a 1 in 2.5 million (0.00004%) overall chance of hitting a square that could be considered to have a nearby square with a confusably similar address. (It is easy to do the maths to see that the probabilities are still very low even if you broaden the definitions to increase the number of potentially confusable combinations by a factor of 5, 10 or even 100.)
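The arithmetic in this example can be checked with a few lines of Python. This is a sketch only: the 10,000 figure is the hypothetical from the text, and because the total number of UK squares is not stated in this section, it is back-calculated here from the quoted 1 in 2.5 million (which implies roughly 25 billion squares). The function and constant names are illustrative.

```python
# Hypothetical figures taken from the paragraph above.
ASSUMED_CONFUSABLE_SQUARES = 10_000
QUOTED_ONE_IN = 2_500_000

# Assumption: total UK squares implied by the two quoted figures.
IMPLIED_UK_SQUARES = ASSUMED_CONFUSABLE_SQUARES * QUOTED_ONE_IN


def chance_of_confusable_square(confusable_squares: int, total_squares: int) -> float:
    """Chance that a randomly chosen square has a confusable neighbour."""
    return confusable_squares / total_squares


# Broaden the definition of "confusable" by the factors mentioned in the text.
for factor in (1, 5, 10, 100):
    confusable = ASSUMED_CONFUSABLE_SQUARES * factor
    chance = chance_of_confusable_square(confusable, IMPLIED_UK_SQUARES)
    one_in = IMPLIED_UK_SQUARES // confusable
    print(f"x{factor:<3} {chance * 100:.5f}%  (1 in {one_in:,})")
```

Even with 100 times as many potentially confusable combinations, the chance in this sketch stays at 1 in 25,000 (0.004%).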

Because of how the system is designed, these odds differ depending on what country or city you are in, and what language you use what3words in. In English, the chance of similar combinations of words being close to each other is highest in cities, where the system draws on a smaller set of words so that the words are generally easier to use.

Other coordinate systems (and some origin story)

You are therefore far less likely to encounter similar-sounding 3 word addresses nearby than similar codes in the other, hierarchically based systems in use.

When working as a tour manager in the music industry, our co-founder Chris found that so many mistakes were being made when communicating GPS coordinates (“18” vs “80”, digits being omitted or transposed, people simply unable to engage with long strings of numbers, etc.) that there had to be an easier, simpler way. This led him to co-found what3words and to implement shuffling as one of its design features.

We launched what3words in the full understanding that, whilst we had made trade-offs in our shuffling algorithm to balance it against a range of other factors, it provided a huge communication benefit over the commonly used alternative location systems. We have good feedback from our partners and users supporting this view.

Autosuggest

Based on our research of how people actually use what3words, we constantly review the effectiveness of Autosuggest and our user interface (as well as our developer tools for both) to help people get the most out of what3words.

Step-by-step

  1. We start with a very small chance (see the probability section further up) of hitting upon a potentially confusable pair of 3 word addresses (0.00004% in the example above).
  2. Does the potentially confusable pair of words actually result in a confusion? (e.g. if the potentially confusable words are plurals, there is only a 5% chance of them causing confusion).
  3. Does the resulting confusion go unnoticed by both parties, and unresolved by Autosuggest?

The chances start low and become lower.
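The steps above can be sketched as a simple chain of probabilities. This is an illustration under stated assumptions, not a model we use: it treats the three steps as independent, takes the 0.00004% and 5% figures from the examples in this post, and, because no figure is given for step 3, sets that step to 1.0 (nobody notices and Autosuggest does not help) as a worst case. All names are hypothetical.

```python
def chance_of_unresolved_confusion(
    p_confusable_pair: float,
    p_actually_confused: float,
    p_not_caught: float = 1.0,
) -> float:
    """Multiply the chance of each step, assuming the steps are independent."""
    return p_confusable_pair * p_actually_confused * p_not_caught


# Step 1: the 1 in 2.5 million (0.00004%) example from the probability section.
P_CONFUSABLE_PAIR = 1 / 2_500_000
# Step 2: the 5% figure quoted for plurals read out loud.
P_ACTUALLY_CONFUSED = 0.05
# Step 3: no figure is given in the text, so use the worst case of 1.0.
P_NOT_CAUGHT_WORST_CASE = 1.0

worst_case = chance_of_unresolved_confusion(
    P_CONFUSABLE_PAIR, P_ACTUALLY_CONFUSED, P_NOT_CAUGHT_WORST_CASE
)
print(f"Worst case: {worst_case * 100:.6f}% (about 1 in {round(1 / worst_case):,})")
```

With these inputs the worst case comes out at about 1 in 50 million, and any value below 1.0 for step 3 lowers it further.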

And we’re not taking into account any additional contextual information that the people communicating the address might have. For example, in the case of emergency services, the call handler will often (but not always) have a GPS location from the caller’s device when they call 999 (if that location is the relevant one for the emergency).

Call handlers are very experienced in getting information, clarifying and verifying it, and we take time to help them understand how to get a what3words address (after the GPS has settled), confirm it, and then cross-reference or verify it using all their usual tools for establishing location. This is an area that we are constantly working on: for example, we are looking into some design changes to our products to make it more evident to the user when GPS is inaccurate.

Further reading

And why don’t UK Emergency Services just get locations directly from the caller’s device? This blog post gives some interesting context.

And finally

what3words is the simplest way to talk about location. It has divided the world into 3m x 3m squares, each with a unique 3 word address.