CLDR 31 updates the key building blocks for software that supports the world's languages. All major software systems use this data for internationalization and localization, adapting software to the conventions of different languages for common software tasks.
Aside from the regular updates to codes and data, some of the more noticeable changes are:
- Canonical codes
  - The subdivision codes were changed to consistently use the BCP 47 format.
  - The locales in the language-territory population data and the exemplars directory were regularized (dropping likely script subtags).
  - The timezone ID for GMT has been split from UTC.
  - There is a new mechanism for identifying hybrid locales, such as Hinglish.
- Subdivisions
  - Names for Scotland, Wales, and England have been added in many languages.
- Emoji 5.0
  - Short names and keywords have been updated for English.
  - Collation (sorting) now covers the new Emoji 5.0 characters and sequences and includes some fixes for Emoji 4.0 characters and sequences.
- Transforms
  - The Zawgyi→Unicode transform has been improved.
  - Tamil can now be transcribed to the International Phonetic Alphabet (IPA).
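To illustrate the subdivision code change listed under Canonical codes, here is a minimal Python sketch of how an ISO 3166-2 style code such as `GB-SCT` might be mapped to the BCP 47 subdivision form (`gbsct`) used with the `-u-sd-` locale extension. The function names `to_bcp47_subdivision` and `with_subdivision` are hypothetical helpers written for this example and are not part of CLDR or any library. The sketch only reshapes the string; it does not check the result against the CLDR subdivision data.

```python
import re

# ISO 3166-2 style: two-letter region, a hyphen, then 1-3 alphanumerics.
_ISO_3166_2 = re.compile(r"^([A-Za-z]{2})-([A-Za-z0-9]{1,3})$")


def to_bcp47_subdivision(iso_code: str) -> str:
    """Hypothetical helper: 'GB-SCT' -> 'gbsct' (lowercase, no hyphen)."""
    match = _ISO_3166_2.match(iso_code.strip())
    if match is None:
        raise ValueError(f"not an ISO 3166-2 style code: {iso_code!r}")
    region, suffix = match.groups()
    return (region + suffix).lower()


def with_subdivision(locale_tag: str, iso_code: str) -> str:
    """Hypothetical helper: append a '-u-sd-' subdivision extension.

    Assumes locale_tag does not already carry a '-u-' extension.
    """
    return f"{locale_tag}-u-sd-{to_bcp47_subdivision(iso_code)}"


if __name__ == "__main__":
    print(to_bcp47_subdivision("GB-SCT"))
    print(with_subdivision("en-GB", "GB-SCT"))
```

A caller that needs validation would additionally look the resulting code up in the CLDR subdivision data, since not every well-formed string is an actual subdivision.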