Wednesday, March 28, 2018

CLDR Version 33 Released

Unicode CLDR 33 provides an update to the key building blocks for software supporting the world’s languages. This data is used by all major software systems for internationalization and localization, adapting software to the conventions of different languages for common software tasks.

This release had a limited submission phase. The focus was on improvements to emoji keywords and to the Odia and Assamese locales, addition of typographic names data, and improvements to the structure for specifying keyboard layouts. Improvements include:
  • Structure
    • New structure for typographicNames translations (such as terms for Bold, Italic, ...), with data for 33 locales.
    • The structure for specifying keyboard layouts was significantly enhanced, with many new elements and attributes, and expanded syntax for some preëxisting attribute values.
  • Additional Translations/Data
    • Annotations (emoji keywords) for a limited set of locales had a full review (ar, en_GB, de, es, ja, ru).
    • Two additional locales (Odia, Assamese) were brought up to Modern coverage level; some missing items were added in other locales.
    • Added 4 new transforms, and number spellout rules for 6 additional languages.
  • Property files
    • The emoji property data file ExtendedPictographic.txt has been removed from CLDR data, since the contents are now part of the UTS #51 “Unicode Emoji” data.
    • labels.txt was added for emoji categories and subcategories.

For further details and links to documentation, see the CLDR Release Notes.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, Emojipedia, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Shopify, Sultanate of Oman MARA, Oracle, Rajya Marathi Vikas Sanstha, SAP, Symantec, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to

For more information, please contact the Unicode Consortium.