Dealing with language code overlaps when adding more translations

Right now we use two-letter language codes for the translations that come with Neos and Flow. That worked out so far, but we have since added Brazilian Portuguese and Traditional Chinese as available translation targets. This leads to a conflict that cannot be resolved using just two-letter language codes: pt-BR and pt-PT both end up as pt, and zh-TW and zh-CN both end up as zh.

Some background: the translations are downloaded using the Crowdin CLI tool with a configuration that includes the %two_letters_code% placeholder.

Option 1: Using the locale

We could switch to using the %locale% placeholder instead, and adjust the configuration in Neos to use the “new” language identifiers. But this option would be a breaking change for everyone who created their own translations, because those use de and not de-DE.
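For illustration, the change for Option 1 would roughly come down to swapping the placeholder in the translation pattern. This is only a sketch: the path is the same placeholder path used in the support request further down, and the source pattern is a made-up example, not our actual configuration.

```yaml
files:
  - source: '/some/path/here/en/**/*.xlf'   # hypothetical source pattern
    # before: '/some/path/here/%two_letters_code%/**/%original_file_name%'
    translation: '/some/path/here/%locale%/**/%original_file_name%'
    # German would then be exported to "de-DE" instead of "de"
```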

Option 2a: Using language mapping

What seems to be a better approach is to add a mapping to the existing config.

languages_mapping:
  two_letters_code:
    # crowdin_language_code: local_name
    'pt-BR': 'pt_BR'
    'zh-TW': 'zh_TW'

This way existing translations would “stay where they are” and any added translations that conflict with existing ones could be mapped to a different name (based on their locale). I tried this out and generally speaking it works fine, but currently I am waiting for feedback from Crowdin support on an issue I still have when using this.

Option 2b

This would be the reverse, somehow: switch the config to use the %locale% placeholder and then add a mapping that maps all “existing” translations to their “old names”. This would mean a lot more mapping and moves farther away from what we did so far.

Summary (so far)

Option 1 is not an option, and Option 2b falls behind when compared to Option 2a. Thus I’d suggest going for Option 2a, assuming the problem I still have with it can be resolved.

Summary (a bit further)

So I got an answer from the Crowdin support… The use of two_letters_code leads to things being overridden already when building the translations on the server side, so the mapping cannot have any effect any longer. I got a configuration example back from them that is exactly what I dismissed as Option 2b earlier. Indeed with an approach like this, it works. I am still tweaking this, but the end is near… :wink:

Summary (and a PR)

Ok, locale_with_underscore and a matching mapping it is… see https://github.com/neos/BuildEssentials/pull/5
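To give an idea of the shape of such a configuration, here is a minimal sketch. It is not copied from the PR: the source pattern is a made-up example, and the list of mapped languages is an illustrative subset. Every language that should keep its “old” two-letter directory needs an entry, while pt-BR and zh-TW stay unmapped.

```yaml
files:
  - source: '/some/path/here/en/**/*.xlf'   # hypothetical source pattern
    translation: '/some/path/here/%locale_with_underscore%/**/%original_file_name%'
    languages_mapping:
      locale_with_underscore:
        # crowdin_language_code: local_name (illustrative subset only)
        'de': 'de'
        'fr': 'fr'
        'pt-PT': 'pt'
        'zh-CN': 'zh'
        # pt-BR and zh-TW are intentionally not mapped,
        # so they would be exported as pt_BR and zh_TW
```

The mapping grows with every language that has to keep its old name, which is the “lot more mapping” mentioned under Option 2b.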

For those interested, this is the problem I still have (the text is what I sent to the Crowdin support):

Problem with two letter language codes and language mapping

Hi, in the Neos project (Neos @ Hosted Weblate) we have so far had only languages that could be exported with a two-letter language code. We use the CLI tool (v 0.5.1) to download translations, using a setting like this for the packages:

translation: '/some/path/here/%two_letters_code%/**/%original_file_name%'

Now we added pt-BR (in addition to pt-PT) and zh-TW (in addition to zh-CN). Due to the old two-letter codes having been used for a while, we cannot change everything to use the locale. Instead I added this:

languages_mapping:
  two_letters_code:
    'pt-BR': 'pt_BR'
    'zh-TW': 'zh_TW'

I did “crowdin-cli upload” (as far as I understood this is needed to update the configuration on the server) and a “crowdin-cli download” now extracts to “pt_BR” and “zh_TW” as expected - but there are no files extracted for pt-PT (expected to end up in “pt”) nor for zh-CN (expected to end up in “zh”).

Investigating further showed that what previously ended up in pt was actually pt-BR, so this is correct - but why is pt-PT not downloaded to pt now? The same is true for the Chinese translation: zh actually contained zh-TW, but zh-CN is now missing…

I even tried mapping pt-PT to pt in the configuration, but that didn’t have any effect.

I think Option 2b is better in the long run anyway. Can’t we somehow fall back in Flow? For example, if “pt” is requested and there is no generic “pt”, we choose the best available “pt” variant (like “pt-BR”).