The first wave of test automation largely ignored localization (l10n) checks, since the primary markets centered on English-language interfaces. As applications went global, however, quality requirements for localization rose: the interface must render correctly in every supported language, and text resources and formatted strings must load correctly for the selected locale.
The main problem is that manual checks are highly resource-intensive, while automated tests are complicated by variability in format, context, and language specifics (e.g., right-to-left scripts or grammatical features such as cases and gender). Typical defects include missing translations, formatting errors, and broken layouts.
Solutions include maintaining test data for each locale, snapshot tests, comparing UI elements against benchmarks, key-value validation utilities for resource files, automated extraction and comparison of strings via APIs, and regular linting of resource files.
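The key-value validation mentioned above can be sketched as a parity check between a base locale and a translation. This is a minimal illustration that assumes flat dictionaries; the file names and data here are hypothetical, and a real utility would load them from resource files such as `en.json` and `de.json`:

```python
def find_missing_keys(base: dict, locale: dict) -> set[str]:
    """Keys present in the base resource but absent from the locale."""
    return set(base) - set(locale)

def find_stale_keys(base: dict, locale: dict) -> set[str]:
    """Keys that exist in the locale but were removed from the base."""
    return set(locale) - set(base)

# Hypothetical inline resources standing in for en.json / de.json.
en = {"greeting": "Hello", "farewell": "Goodbye", "cart.empty": "Your cart is empty"}
de = {"greeting": "Hallo", "cart.empty": "Ihr Warenkorb ist leer", "old.key": "Veraltet"}

print(find_missing_keys(en, de))  # untranslated keys, e.g. {'farewell'}
print(find_stale_keys(en, de))   # orphaned keys to clean up, e.g. {'old.key'}
```

Running such a check in CI on every resource-file change catches both untranslated and orphaned keys before they reach a release.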
Key questions:
Is it possible to create a universal test that validates any locale with a single script?
Partially, yes: language nuances (grammatical cases, gender, text direction) often require manual adjustments or extra conditions in such tests, so 100% universality is not achievable.
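A sketch of what "partially universal" looks like in practice: one check loops over all locales, but right-to-left languages still need an extra condition. The locale data below is hypothetical, and the RTL assertion is a heuristic (purely ASCII text in an RTL locale suggests an English fallback slipped through):

```python
# Hypothetical locale fixtures; a real suite would load these from resources.
LOCALES = {
    "en": {"dir": "ltr", "greeting": "Hello"},
    "de": {"dir": "ltr", "greeting": "Hallo"},
    "ar": {"dir": "rtl", "greeting": "مرحبا"},
}

def check_locale(code: str) -> None:
    cfg = LOCALES[code]
    # The "universal" part: every locale must have a non-empty string.
    assert cfg["greeting"], f"missing greeting for {code}"
    # The non-universal part: RTL locales need their own condition.
    if cfg["dir"] == "rtl":
        assert not cfg["greeting"].isascii(), f"{code} looks like an English fallback"

for code in LOCALES:
    check_locale(code)
```

The shared assertion scales to any number of locales; the per-direction branch is exactly the kind of manual adjustment that prevents full universality.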
If the translation file exists and has been successfully loaded, does that mean the i18n test has passed?
No. The file may be wired up incorrectly on the application side, a key may be wrong, a translation may be used in the wrong context, unescaped special characters may slip through, and so on.
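One concrete failure mode a "file loaded" check misses: a translation that drops or renames a format placeholder, which breaks string interpolation at runtime. A minimal sketch, assuming `{name}`-style placeholders:

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")

def placeholders_match(source: str, translation: str) -> bool:
    """True if the translation keeps the same set of format
    placeholders as the source string."""
    return set(PLACEHOLDER.findall(source)) == set(PLACEHOLDER.findall(translation))

print(placeholders_match("Hello, {name}!", "Hallo, {name}!"))  # True
print(placeholders_match("{count} items", "Artikel"))          # False: placeholder lost
```

A check like this runs on the raw resource files, so it catches the defect even when the file itself loads without errors.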
Is it worth automating localization testing for languages with <1% of users?
Yes, when even a single user carries high business criticality, for example under contractual obligations or in markets with special regulatory requirements. Automation still saves significant resources compared to manual checks.
The team implemented automated tests that compared the keys in the .po file against the original English text and assumed this was sufficient. No UI tests were written; in the Arabic release, it turned out that text spilled outside the buttons, and some strings were not translated at all because of incorrect keys.
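The untranslated strings in this scenario could have been caught even without UI tests. A simplified sketch of a .po scan for empty translations (the parser below handles only single-line `msgid`/`msgstr` pairs, not the full gettext format):

```python
import re

PO_ENTRY = re.compile(r'msgid "(.*)"\nmsgstr "(.*)"')

def untranslated(po_text: str) -> list[str]:
    """Return msgids whose msgstr is empty, i.e. untranslated strings."""
    return [mid for mid, mstr in PO_ENTRY.findall(po_text) if mid and not mstr]

sample = '''msgid ""
msgstr ""

msgid "Save"
msgstr "حفظ"

msgid "Cancel"
msgstr ""
'''
print(untranslated(sample))  # ['Cancel']
```

This closes only the missing-translation gap; the layout breakage still requires UI-level checks, as the next scenario shows.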
The team combined resource-file linting with automated tests that cycled through the interface in every supported language, took screenshots, and compared them against benchmark layouts. By detecting mixed RTL/LTR elements, the team identified and fixed the root cause before the release.
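The RTL/LTR mixing check from this scenario can be sketched at the string level using the Unicode bidirectional classes from the standard library. Strongly left-to-right characters have class `L`; strongly right-to-left characters have class `R` (Hebrew) or `AL` (Arabic):

```python
import unicodedata

def has_mixed_direction(text: str) -> bool:
    """Flag strings that mix strongly LTR and strongly RTL characters,
    a common source of broken layouts in Arabic and Hebrew builds."""
    classes = {unicodedata.bidirectional(ch) for ch in text}
    has_ltr = "L" in classes
    has_rtl = bool(classes & {"R", "AL"})
    return has_ltr and has_rtl

print(has_mixed_direction("Install خطأ now"))   # True: Latin and Arabic mixed
print(has_mixed_direction("مرحبا بالعالم"))     # False: consistently RTL
```

Mixed-direction strings are not always defects (brand names often stay Latin), so a check like this is best used to surface candidates for review rather than to fail a build outright.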