Friday, February 10, 2012

Unicode Releases Common Locale Data Repository, Version 21.0


Mountain View, CA, February 10, 2012 - The Unicode® Consortium announced today the release of a new version of the Unicode Common Locale Data Repository (Unicode CLDR 21.0), providing key building blocks for software to support the world's languages.

Unicode CLDR 21.0 contains data for 193 languages and 170 territories: 528 locales in all. This release did not include a public data submission phase, and focused on improvements to the LDML structure and tools, and consistency of data.

Main features include updates for Unicode 6.1; a major cleanup of time zone names, date format data, and delimiters (“…” vs „…“ vs „…” vs …); the new BCP47 -t- extension; the addition of ordinal categories (1st, 2nd, …), collation reordering (e.g., Cyrillic before Latin), multiple numbering systems for a locale, and abbreviated numbers (e.g., “1.2 B”); and a restructuring of Chinese calendar data. For more information on other changes since the 2.0.1 release, see the CLDR 21 Release Note.
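
As an illustration of the ordinal categories mentioned above, here is a minimal Python sketch of how a category can select the suffix for an English ordinal. The names `ordinal_category`, `format_ordinal`, and `SUFFIXES` are hypothetical, and the English rule is hard-coded as an assumption for illustration; real software would read the rules for each locale from the CLDR data rather than embed them.

```python
# Hypothetical helpers, not part of CLDR or any library.
# Assumption: the English ordinal rule, hard-coded here for illustration only.

def ordinal_category(n: int) -> str:
    """Return the ordinal category of a non-negative integer for English."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"    # 1st, 21st, 101st
    if n % 10 == 2 and n % 100 != 12:
        return "two"    # 2nd, 22nd
    if n % 10 == 3 and n % 100 != 13:
        return "few"    # 3rd, 23rd
    return "other"      # 4th, 11th, 12th, 13th

# Assumed mapping from category to English suffix.
SUFFIXES = {"one": "st", "two": "nd", "few": "rd", "other": "th"}

def format_ordinal(n: int) -> str:
    """Append the suffix selected by the ordinal category."""
    return f"{n}{SUFFIXES[ordinal_category(n)]}"

if __name__ == "__main__":
    for n in (1, 2, 3, 4, 11, 12, 13, 21, 22, 23, 111):
        print(format_ordinal(n))
```

Keeping the category lookup separate from the suffix table mirrors the idea behind the data: the rule that picks a category and the text attached to each category can vary independently from one locale to another.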

Unicode CLDR is by far the largest and most extensive standard repository of locale data. This data is used by a wide spectrum of companies for their software internationalization and localization: adapting software to the conventions of different languages for such common software tasks as formatting dates, times, time zones, numbers, and currency values; sorting text; choosing languages or countries by name; transliterating between different alphabets; and many others. Unicode CLDR 21 is part of the Unicode locale data project, together with the Unicode Locale Data Markup Language (LDML: http://unicode.org/reports/tr35/). LDML is an XML format used for general interchange of locale data, such as in Microsoft's .NET.

For web pages with different views of CLDR data, see http://cldr.unicode.org/index/charts. For more information about the Unicode CLDR project (including charts), see http://cldr.unicode.org/.