Monday, June 3, 2019

CLDR v36 open for data submission

The Unicode CLDR Technical Committee is pleased to announce the opening of the CLDR Survey Tool for general data submission. CLDR relies on community contributions for its ongoing data refinement and to offer new data to the CLDR user community. The collected data will be released as Version 36 on October 15.

Unicode CLDR provides key building blocks for software to support the world’s languages, and is used by much of the world’s software. For example, all major browsers and all modern mobile phones use CLDR for language support.

Version 36 is focusing on:
  • New measurement units and patterns
  • New names and search keywords for the draft candidate emoji for Emoji 13.0 (scheduled for release in 2020Q1)
  • Adding more locales for data contributions
  • Fleshing out Islamic calendar support
  • Improving translation quality in general

For more information on contributing to CLDR, see the CLDR Information Hub. If you would like to contribute missing data for your language, see Survey Tool Accounts.

The Common Locale Data Repository (CLDR) provides key building blocks for software to support the world's languages, with the largest and most extensive standard repository of locale data available. This data is used by a wide spectrum of companies for their software internationalization and localization, adapting software to the conventions of different languages for such common software tasks as:
  • Locale-specific patterns for formatting and parsing: dates, times, time zones, numbers and currency values, measurement units,…
  • Translations of names: languages, scripts, countries and regions, currencies, eras, months, weekdays, day periods, time zones, cities, time units, emoji characters and sequences (and search keywords),…
  • Language & script information: characters used; plural cases; gender of lists; capitalization; rules for sorting & searching; writing direction; transliteration rules; rules for spelling out numbers; rules for segmenting text into graphemes, words, and sentences; keyboard layouts;…
  • Country information: language usage, currency information, calendar preference, week conventions,…
  • Validity: definitions, aliases, and validity information for Unicode locales, languages, scripts, regions, and extensions,…
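
As a rough illustration of a few of the tasks listed above, the sketch below uses the standard ECMAScript `Intl` API, which JavaScript engines commonly implement on top of locale data derived from CLDR. The helper names (`formatNumberFor`, `formatDateFor`, `sortFor`) are hypothetical and exist only for this example, and the sketch assumes the runtime ships full locale data; it is not an official CLDR interface.

```typescript
// Sketch only: locale-sensitive formatting and sorting via the standard Intl API.
// Assumption: the runtime includes full locale data for the requested locales.

// Hypothetical helper: format a number with a locale's grouping and decimal conventions.
function formatNumberFor(locale: string, value: number): string {
  return new Intl.NumberFormat(locale).format(value);
}

// Hypothetical helper: format a date with a locale's default numeric date pattern.
// The time zone is pinned to UTC so the result does not depend on the host machine.
function formatDateFor(locale: string, date: Date): string {
  return new Intl.DateTimeFormat(locale, { timeZone: "UTC" }).format(date);
}

// Hypothetical helper: sort a copy of the words using a locale's collation rules.
function sortFor(locale: string, words: string[]): string[] {
  return words.slice().sort(new Intl.Collator(locale).compare);
}

const releaseDate = new Date(Date.UTC(2019, 9, 15)); // month index 9 is October

for (const locale of ["en-US", "de-DE"]) {
  console.log(locale, formatNumberFor(locale, 1234567.891), formatDateFor(locale, releaseDate));
}

// German and Swedish place "ä" differently relative to "z".
console.log(sortFor("de", ["z", "ä", "a"]), sortFor("sv", ["z", "ä", "a"]));
```

The same calls yield different results per locale because the patterns and sorting rules come from the locale data rather than from the application code, which is the kind of adaptation the list above describes.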
