[RFC] Feature Proposal - Add languages: tags for name rendering

aseigo · November 14, 2024, 6:57am

Yes, they have name:<code> entries, but there is no hint as to which is the default, so without applying heuristics they pick name. When name matches one of these name:<code> entries, and/or the software successfully applies location-aware heuristics (hit and miss in multi-lingual areas), the language can be deduced … but when name contains names in more than one language, it becomes increasingly complicated to deduce which languages are represented.

Keep in mind that “language the user has their nav set to” may not be the same as the language of the streets you are driving through. This happens all the time in the country where I live, where the language of street signs can change multiple times over the course of a half-hour drive, and where many people prefer to use navigation in their native language rather than whatever the town they are driving through speaks.

Can you explain how the following is intended to work:

1. set the navi language to English
2. drive through a multi-lingual town where there are no name:en entries
3. get place names from OSM entries with no language metadata, only name, name:fr, and name:de entries, all of which differ, since name is a by-hand combination of the fr and de entries

Again, the text-to-speech could use heuristics, and it could use them to read only the language-specific entries from the set of name:<code> entries (assuming those are even provided by the driving instructions, which IME they typically are not), but those are a lot of hoops we are asking the software to jump through.

Or OSM could instead provide language metadata that would enable detecting which localizations to use.
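To make that concrete, here is a minimal sketch of how a consumer might pick what to voice if such metadata existed. The `languages` key and its semicolon-separated format are assumptions made for illustration, not the exact syntax of the proposal, and `pick_spoken_name` is a hypothetical helper, not part of any existing router or TTS library. The preference order (user language first, then the listed local languages) is just one possible policy.

```python
from typing import Optional


def pick_spoken_name(tags: dict[str, str], user_lang: str) -> tuple[str, Optional[str]]:
    """Return (text, language_code) to hand to a text-to-speech engine.

    `tags` is a plain dict of OSM tags. The "languages" key is a
    hypothetical tag listing the languages present in `name`, in order.
    """
    # 1. Use an explicit localization in the user's language if one exists.
    localized = tags.get(f"name:{user_lang}")
    if localized:
        return localized, user_lang

    # 2. Otherwise use the first declared local language that has its own
    #    name:<code> entry, so the TTS can use the matching voice.
    for code in tags.get("languages", "").split(";"):
        code = code.strip()
        if code and tags.get(f"name:{code}"):
            return tags[f"name:{code}"], code

    # 3. No metadata: fall back to the raw name with an unknown language,
    #    which is where heuristics (or mispronunciation) come in.
    return tags.get("name", ""), None


# Illustrative tags for a bilingual street with no name:en entry.
street = {
    "name": "Rue de la Gare / Bahnhofstrasse",
    "name:fr": "Rue de la Gare",
    "name:de": "Bahnhofstrasse",
    "languages": "fr;de",  # hypothetical metadata
}
print(pick_spoken_name(street, "en"))
```

With the navi set to English, this sketch hands the French name to a French voice instead of reading the combined name field with English pronunciation rules. Without the `languages` entry it falls through to step 3, which is the situation described above.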

Yes, it is more understandable when voiced in the correct language.

In the case I mentioned, the driver can read German (and even speak it to some degree) fine. They are used to the street names in German. The English vocalizations were nearly indecipherable. Comparing them to the street signs was not straightforward.

It gets even worse if you try to share those wrongly-enunciated instructions with locals, as they will struggle as well. Again, I know this from first-hand experience, on both sides of that interaction.

A related “funny story” is when Biel/Bienne rolled out an automated VoIP system for city services that was poorly configured and would pronounce some names with the wrong language pack, which made them hard to figure out even for native speakers living there. (That story courtesy of my partner, who lived there at the time.)

Vocalizing names in the proper language is very useful.

“German names in Germany” is easy mode, as that is another (mostly) monolingual situation. The more difficult cases are street names in a French-speaking and/or bilingual town in a mostly-German-speaking canton of Switzerland, or the Haida names that now appear in some name fields in Haida Gwaii.

The motivation for this proposal is these multi-lingual use cases, where names appear in multiple languages, including putting two or more names in the name field either with some separator character (“French Name / Flemish Name” in Brussels, e.g.) or just bodged together (<French Street Type> <Street Name> <English Street Type> in various Canadian cities).
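As a rough illustration of the kind of heuristic a data consumer has to write today, the sketch below tries to deduce which languages are in the name field by splitting on a separator and matching the parts against name:<code> values. The function name, the separator list, and the second set of sample tags are assumptions for illustration only.

```python
def guess_name_languages(tags: dict[str, str],
                         separators: tuple[str, ...] = (" / ", " - ")) -> list[str]:
    """Guess the languages present in `name`, in order of appearance.

    Works only when each part of `name` exactly equals some name:<code> value.
    """
    name = tags.get("name", "")
    # Collect the localized names, keyed by language code.
    localized = {k[len("name:"):]: v for k, v in tags.items() if k.startswith("name:")}

    # Split on the first separator that occurs; otherwise treat name as one part.
    parts = [name]
    for sep in separators:
        if sep in name:
            parts = [p.strip() for p in name.split(sep)]
            break

    found: list[str] = []
    for part in parts:
        for code, value in localized.items():
            if value == part and code not in found:
                found.append(code)
    return found


# Separator case: the parts match the localized entries.
brussels = {"name": "Rue de la Loi - Wetstraat",
            "name:fr": "Rue de la Loi", "name:nl": "Wetstraat"}
print(guess_name_languages(brussels))  # ['fr', 'nl']

# Bodged-together case (illustrative tags): nothing matches, the heuristic gives up.
bodged = {"name": "Rue Main Street",
          "name:fr": "Rue Main", "name:en": "Main Street"}
print(guess_name_languages(bodged))  # []
```

The separator case happens to work; the bodged-together case returns nothing, and a street whose own name contains " - " would be split incorrectly.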

It would be great if navi software was sophisticated enough to sort this out, but the OSM dataset makes that harder than necessary. I’ve personally witnessed it result in failure.

Some places have name:<lang> tags in languages other than the ones generally spoken in the location, and which do not appear on street signs.

Some places have names in use which are not in the commonly used local language (e.g. name:hai in Canada), and which also have name:<lang> entries in the commonly spoken language.

Some places, such as the Chinatown or other ethnic communities in major North American cities, also have preferred names in official use that do not match the general language preferences of the area.

The South Africa issue is similarly complex: some names are localized and some, as a matter of general use, are not.

Yes, there are many places which are mono-lingual, or which are easy to figure out. This is for the rest of the world. It incurs little, if any, extra cost for the places which are mono-lingual.

edit: Note that “no extra cost” includes preserving the name field as it currently is. No changes to the dataset are needed for places where the current scheme works fine, and the rendered results will be the same as they currently are.