Wednesday, September 18, 2013

CLDR Version 24 released

Unicode CLDR 24 has been released, providing an update to the key building blocks for software supporting the world's languages. This data is used by a wide spectrum of companies for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks such as formatting dates, times, numbers, and units.
Unicode CLDR 24 focused on additional structure for formatting units, dates, and times, and on improving data coverage. This version contains data for 238 languages and 259 territories, for 740 locales in all. Ten languages were added to the 100%-modern-coverage list, for a total of 70 languages. Between the new languages and the new structure, more data was entered than in any previous release.

The new structure focused primarily on formatting of units and improvements to date and time formatting.
  • fractional plural forms. major extension to handle fractions (e.g., some languages use the equivalent of “1.2 teaspoons” but “2.1 teaspoon”)
  • measurement units. many additional unit types (“10.3 kg”), in up to 6 plural forms per language
  • compound units. video length: "23 hrs, 7 mins", or "23:07"
  • dates/times. new relative fields such as "last Sunday" and "now"; 12-hour time formats that omit "am/pm"; neutral eras ("405 BCE"); additional timezone fallback regional patterns ("{city} Daylight Time")
  • number formatting. exponential notation (1.42×10²³), at-least ("99+"), ranges ("3.5-4.5 kg"), narrow currency symbols (both "US$12.23" and "$12.23").
  • collation. major simplification of rule syntax, updated root files to Unicode 6.3; preliminary version of European Ordering Rules; documentation of the CLDR Collation Algorithm (extending UCA)
  • JSON. improved support, including new structure and data.
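The fractional plural forms in the list above depend on more than a number's integer value: the visible fraction digits matter too. The following is a minimal Python sketch of that idea, not the CLDR implementation. The operand names (n, i, v, w, f, t) follow the plural-rule operands as described in the LDML specification; the function names are hypothetical, and the second rule is an illustrative pattern chosen to reproduce the “2.1 teaspoon” behavior, not a rule copied from any locale's data.

```python
def plural_operands(number: str) -> dict:
    """Split a decimal string such as "1.20" into plural-rule operands.

    Working from the string (not a float) preserves trailing zeros,
    which some plural rules treat as significant.
    """
    number = number.lstrip("-")
    int_part, _, frac_part = number.partition(".")
    trimmed = frac_part.rstrip("0")
    return {
        "n": float(number),              # absolute value
        "i": int(int_part or "0"),       # integer digits
        "v": len(frac_part),             # count of visible fraction digits
        "w": len(trimmed),               # same, without trailing zeros
        "f": int(frac_part or "0"),      # visible fraction digits as an integer
        "t": int(trimmed or "0"),        # same, without trailing zeros
    }


def english_like_category(number: str) -> str:
    """'one' only for the integer 1 with no visible fraction digits."""
    op = plural_operands(number)
    return "one" if op["i"] == 1 and op["v"] == 0 else "other"


def fraction_sensitive_category(number: str) -> str:
    """Illustrative (assumed) rule: the last fraction digit can select 'one'."""
    op = plural_operands(number)
    if (op["v"] == 0 and op["i"] % 10 == 1) or op["f"] % 10 == 1:
        return "one"
    return "other"
```

Under these assumptions, `english_like_category("1.2")` yields `"other"` (hence "1.2 teaspoons"), while `fraction_sensitive_category("2.1")` yields `"one"`, the pattern behind "2.1 teaspoon".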
In addition, the data already present from CLDR v23 was reviewed for the supported languages, and many improvements were made.
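As a rough illustration of the compound-unit item in the list above, the sketch below formats a video length in the two styles mentioned, "23 hrs, 7 mins" and "23:07". It is a hand-written Python example under stated assumptions: the function name, the `style` parameter, and the hard-coded English abbreviations are all hypothetical. A real implementation would read the unit patterns and plural forms from the locale data rather than embedding them in code.

```python
def format_duration(total_minutes: int, style: str = "short") -> str:
    """Format a duration given in minutes as hours and minutes.

    style="short"   -> abbreviated units, e.g. "23 hrs, 7 mins"
    style="numeric" -> clock-like form, e.g. "23:07"
    """
    hours, minutes = divmod(total_minutes, 60)
    if style == "numeric":
        # Minutes are zero-padded to two digits; hours are not padded.
        return f"{hours}:{minutes:02d}"
    # Hard-coded English singular/plural abbreviations (an assumption;
    # locale data would supply one pattern per plural category).
    hr_unit = "hr" if hours == 1 else "hrs"
    min_unit = "min" if minutes == 1 else "mins"
    return f"{hours} {hr_unit}, {minutes} {min_unit}"
```

For example, `format_duration(1387)` gives `"23 hrs, 7 mins"` and `format_duration(1387, style="numeric")` gives `"23:07"`.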

Details of coverage improvements and new features are provided in the CLDR 24 release notes, along with a detailed Migration section.

About the Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. Members are: Adobe Systems, Apple, Google, Government of Andhra Pradesh, Government of Bangladesh, Government of India, IBM, Microsoft, Monotype Imaging, Sultanate of Oman MARA, Oracle, SAP, Tamil Virtual University, The University of California (Berkeley), Yahoo!, plus well over a hundred Associate, Liaison, and Individual members. For more information, please contact the Unicode Consortium: