How does what3words handle similar combinations of words?
With what3words we’ve taken something that many people would consider inherently complex and not relevant to them (GPS coordinates) and made it accessible and something people use, want to use, and benefit from. Our goal is always to explain it in the simplest possible way for everyone to be able to understand easily.
what3words rightly receives public scrutiny because of its use by emergency services. Our partners and independent researchers often look into how what3words works in some detail. In this post, we’ll give some more context on a few related aspects of the system: shuffling, similar sounding combinations of words, and our Autosuggest feature.
As a reminder, we have split the world into a grid of 3 metre x 3 metre squares, and each of those squares has been assigned an address made up of 3 words. There are around 57 trillion such squares. We’ve done this in English and around 50 other languages and the whole system can be packaged up to work offline.
We chose 3 words, as 3 units of information are easy for most people to work with. And we chose 3 metres because that is precise enough for most everyday purposes and in line with the capabilities of standard phone GPS chips. And 3 of everything keeps it simple.
We then need 25,000 words for the land, and an additional 15,000 for the sea.
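The figures above can be sanity-checked with some quick arithmetic. The sketch below is only a rough illustration, not how the grid is actually built: it assumes a rounded figure of 510 million km² for the Earth's surface, treats every square as exactly 9 m², and counts ordered 3 word combinations with repeated words allowed. The function names are made up for this example.

```python
import math

# Assumption: rounded total surface area of the Earth (land plus sea).
EARTH_SURFACE_KM2 = 510_000_000
SQUARE_SIDE_M = 3  # from the article: 3 metre x 3 metre squares


def approx_square_count(surface_km2=EARTH_SURFACE_KM2, side_m=SQUARE_SIDE_M):
    """Estimate how many side_m x side_m squares cover the given surface."""
    surface_m2 = surface_km2 * 1_000_000  # 1 km^2 = 1,000,000 m^2
    return surface_m2 / (side_m * side_m)


def ordered_triples(vocabulary_size):
    """Count ordered 3 word combinations (assumes words may repeat)."""
    return vocabulary_size ** 3


squares = approx_square_count()
print(f"Estimated squares: {squares / 1e12:.1f} trillion")

# Smallest vocabulary whose ordered triples cover the estimated square count.
min_vocabulary = math.ceil(squares ** (1 / 3))
print(f"Minimum vocabulary for full coverage: {min_vocabulary:,}")

print(f"25,000 words give {ordered_triples(25_000):,} combinations")
print(f"40,000 words give {ordered_triples(40_000):,} combinations")
```

Under these assumptions, the combined vocabulary of 40,000 words yields 64 trillion combinations, comfortably more than the roughly 57 trillion squares that need an address.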
For context, many studies in English suggest that an average person will know 20,000–40,000 words, but this counts only the dictionary form of each word, so including plurals and other forms of the same word makes the actual vocabulary much larger.
Our system shuffles word combinations around the world. This is one of several intentional design features that we balance against each other; another is placing simpler, more common words where they are more likely to be used, and more complex words in more remote areas and in the sea.
The overwhelming proportion of similar-sounding 3 word combinations will be so far apart that an error is obvious, but there will still be cases where potentially confusable word combinations are nearby.
Assessing how effective the shuffling is depends in part on how you define confusable words or confusable 3 word combinations.
When we first designed what3words, we were wary of including related words like plurals, which might be confused with their singulars. Our own research has found that people confuse plurals only about 5% of the time when hearing them read out loud (which makes sense: languages typically evolve to be efficient while still keeping semantically distinct words like plurals or tenses possible to identify). So, to improve the accessibility of the overall vocabulary used, we decided to keep plurals and other inflected forms (this is in English — there are many differences in our other language versions). In addition, words are less easily confusable when people read them out deliberately and with pauses between them — as is typically the case with what3words — compared to free-flowing speech.
Many languages contain homophones (words that are spelled differently but pronounced the same), and English contains many, especially once you consider its large range of accents. We work hard to remove homophones in all our language versions, but being entirely comprehensive is a near-impossible task, particularly given the range of accents and therefore the number of word pairs to consider. As with any potentially confusable words, though, we’re confident that the probability of these resulting in a real-life confusion is very low, and we are able to make adjustments to Autosuggest to help handle any edge cases.
Our team of linguists have several ongoing projects in this area (for example, analysing voice recordings of 3 word addresses across multiple languages and accents to identify potential difficulties in speech recognition).
We’ll look to publish some of our linguistic research in a future article, and share how we’re able to use any findings to improve our products.
As mentioned above, what3words has around 57 trillion squares each with unique 3 word addresses around the world. In the UK alone, there are around 25 billion unique 3 word address squares.
Examples of similar or potentially confusable combinations of words (depending on what parameters you set for “similar” or “confusable”) can of course be found close to each other (depending on how you define “close”) by searching systematically with our software.
Finding thousands of 3 word addresses across the UK that have a similar-sounding address nearby can sound like a lot, but it’s worth keeping in mind the orders of magnitude mentioned above. Let’s say there are 10,000 potentially confusable combinations ambiguously close to each other in the UK. Out of around 25 billion squares, that is an overall chance of 1 in 2.5 million (0.00004%) of hitting a square that could be considered to have a nearby square with a confusably similar address. (It is easy to do the maths to see that the probabilities are still very low even if you broaden the definitions to increase the number of potentially confusable combinations by a factor of 5, 10 or even 100.)
Because of how the system is designed, these odds differ depending on what country or city you are in, and what language you use what3words in. In English, the chance of similar combinations of words being close to each other is highest in cities, where the system draws on a smaller set of words so that addresses are generally easier to use.
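For anyone who wants to do the maths, here is a minimal sketch of that calculation. It uses the two illustrative figures from this post (around 25 billion UK squares and a hypothetical 10,000 confusable combinations) and then scales the second figure by 5, 10 and 100. The function name is made up for this example, and the 10,000 figure is an illustration rather than a measured count.

```python
from fractions import Fraction

UK_SQUARES = 25_000_000_000  # approximate figure quoted in this post
BASE_CONFUSABLE = 10_000     # illustrative figure from this post, not a measurement


def chance_of_confusable_square(confusable, total=UK_SQUARES):
    """Chance that a randomly chosen square is one of the confusable ones."""
    return Fraction(confusable, total)


# Broaden the definition of "confusable" by the factors mentioned in the post.
for factor in (1, 5, 10, 100):
    p = chance_of_confusable_square(BASE_CONFUSABLE * factor)
    print(f"x{factor}: 1 in {round(1 / p):,} ({float(p) * 100:.5f}%)")
```

Even with 100 times as many confusable combinations, the chance under these assumptions is 1 in 25,000.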
Other coordinate systems (and some origin story)
It’s worth noting that other systems used to communicate location where street addresses aren’t available or accurate enough, such as GPS coordinates (latitude/longitude) or grid references, are specifically designed to group similar codes close together, and communication errors with these systems are widely accepted to be very common.
You are therefore far less likely to encounter similar-sounding 3 word addresses nearby than similar codes of the other hierarchically-based systems in use.
When working as a tour manager in the music industry, our co-founder Chris found that so many mistakes were being made when communicating GPS coordinates (“18” vs “80”, digits being omitted or transposed, people just not able to engage with the long strings of numbers etc.) that there had to be an easier, simpler way. This led him to co-found what3words and to implement shuffling as one of the design features.
We launched what3words with full understanding that whilst we had made trade-offs in our shuffling algorithm, balancing against a range of other factors, it provided a huge communication benefit over the commonly-used location system alternatives. We have good feedback from our partners and users supporting this view.
To help our users and our partners, including Emergency Services call handlers, we include our Autosuggest feature in our app, website, and developer products (the developer products also include customisation options to focus on specific geographical areas). Autosuggest actively intercepts possible errors or confusions and highlights other possibilities to the user, helping to identify what might need to be checked.
Based on our research of how people actually use what3words, we constantly review the effectiveness of Autosuggest and our user interface (as well as our developer tools for both) to help people get the most out of what3words.
Let’s work through a scenario to consider the factors that affect the probabilities:
- We start with a very small chance (see the probability section further up) of hitting upon a potentially confusable pair of 3 word addresses (0.00004% in the example above).
- Does the potentially confusable pair of words actually result in a confusion? (e.g. if the potentially confusable words are plurals, there is only a 5% chance of them causing confusion).
- Does the resulting confusion go unnoticed by both parties, and unresolved by Autosuggest?
The chances start low and become lower.
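The scenario above can be sketched as a simple product of probabilities. This is a rough illustration only: it assumes the three steps are independent, it uses the 5% plural figure for the second step, and the value for the third step (the confusion going unnoticed by both parties and by Autosuggest) is a hypothetical placeholder, since this post gives no number for it. The function and variable names are made up for this example.

```python
def chance_of_unresolved_confusion(p_confusable_pair, p_misheard, p_not_caught):
    """Multiply the three steps of the scenario, assuming they are independent."""
    return p_confusable_pair * p_misheard * p_not_caught


# Step 1: illustrative UK figure from this post (10,000 in 25 billion = 0.00004%).
P_CONFUSABLE_PAIR = 10_000 / 25_000_000_000
# Step 2: plurals are confused about 5% of the time when read out loud.
P_MISHEARD = 0.05
# Step 3: hypothetical placeholder, NOT a figure from this post.
P_NOT_CAUGHT = 0.5

p = chance_of_unresolved_confusion(P_CONFUSABLE_PAIR, P_MISHEARD, P_NOT_CAUGHT)
print(f"About 1 in {1 / p:,.0f}")

# Most pessimistic case for step 3: the confusion is never caught.
worst = chance_of_unresolved_confusion(P_CONFUSABLE_PAIR, P_MISHEARD, 1.0)
print(f"If never caught: about 1 in {1 / worst:,.0f}")
```

Under these assumptions, even if a confusion were never caught, the first two steps alone give roughly 1 in 50 million for the plural case.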
And we’re not taking into account any additional contextual information that the people communicating the address might have. For example, in the case of emergency services, the call handler will often (but not always) have a GPS location from the caller’s device when they call 999 (if that location is the relevant one for the emergency).
Call handlers are very experienced in getting information, clarifying and verifying it, and we take time to help them understand how to get a what3words address (after the GPS has settled), confirm it, and then cross-reference or verify it using all their usual tools for establishing location. This is an area that we are constantly working on: for example, we are looking into some design changes to our products to make it more evident to the user when GPS is inaccurate.
People often ask us what our business plan is for making money. You can read this article for more detail.
And why don’t UK Emergency Services just get locations directly from the caller’s device? This blog post gives some interesting context.
We want to deliver technology which supports real life challenges. We take inspiration from our users and partners, and try to take on board constructive feedback for possible improvements to our product and messaging.
We think there is demand from our users to publish more details around some of the workings of what3words, so watch this space!