The Unicode Blog: Unicode CLDR Version 35 Language/Locale Data Released

Wednesday, March 27, 2019

Unicode CLDR Version 35 Language/Locale Data Released

Unicode CLDR 35 provides an update to the key building blocks for software supporting the world's languages. CLDR data is used by all major software systems for internationalization and localization: adapting software to the conventions of different languages for common software tasks.

CLDR 35 included a limited Survey Tool data collection phase. The following summarizes the changes in the release.

Data	70,000+ new data fields, 13,400+ revised data fields
Basic coverage	New languages at Basic coverage: Cebuano (ceb), Hausa (ha), Igbo (ig), Yoruba (yo)
Modern coverage	Languages Somali (so) and Javanese (jv) increased coverage from Moderate to Modern
Emoji 12.0	Names and annotations (search keywords) for 90+ new emoji; also includes fixes for previous names and keywords
Collation	Collation updated to Unicode 12.0, including new emoji; Japanese single-character (ligature) era names added to collation and search collation
Measurement units	23 additional units
Date formats	Two additional flexible formats, and 20 new interval formats
Japanese calendar	In Japanese locale, updated to use Gannen (元年) year numbering for non-numeric formats (which include 年), and to consistently use narrow eras in numeric date formats such as “H31/3/27”.
Region Names	Many names updated to local equivalents of “North Macedonia” (MK) and “Eswatini” (SZ).
Segmentation	Enhanced Grapheme Cluster Boundary rules for 6 Indic scripts: Gujr, Telu, Mlym, Orya, Beng, Deva.

A dot release, version 35.1, is expected in April, with further changes for the Japanese calendar.
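
To make the Japanese calendar change in the table above concrete, here is a minimal Python sketch of the two behaviors it describes: Gannen (元年) numbering in non-numeric formats, and narrow era names in numeric formats. This is not CLDR data or an ICU API. The `ERAS` table and the function names `era_year`, `format_long`, and `format_numeric` are hypothetical, and the output patterns are simplified assumptions rather than the exact CLDR patterns.

```python
from datetime import date

# Illustrative era table (Gregorian start date, full name, narrow name).
# Only the two eras needed for this example are listed; this is an
# assumption-laden sketch, not data read from CLDR.
ERAS = [
    (date(1989, 1, 8), "平成", "H"),
    (date(1926, 12, 25), "昭和", "S"),
]


def era_year(d):
    """Return (full era name, narrow era name, year within era) for a date."""
    for start, name, narrow in ERAS:
        if d >= start:
            return name, narrow, d.year - start.year + 1
    raise ValueError("date precedes the eras covered by this sketch")


def format_long(d):
    """Non-numeric format (contains 年): the first year of an era is written 元年."""
    name, _, year = era_year(d)
    year_text = "元" if year == 1 else str(year)
    return f"{name}{year_text}年{d.month}月{d.day}日"


def format_numeric(d):
    """Numeric format: narrow era name, and a plain number even for year 1."""
    _, narrow, year = era_year(d)
    return f"{narrow}{year}/{d.month}/{d.day}"


print(format_numeric(date(2019, 3, 27)))  # H31/3/27
print(format_long(date(2019, 3, 27)))     # 平成31年3月27日
print(format_long(date(1989, 1, 8)))      # 平成元年1月8日
```

In production code these strings would come from a CLDR-backed library rather than a hand-written table, so that era additions in later releases are picked up automatically.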

For details, see Detailed Specification Changes, Detailed Structure Changes, Detailed Data Changes.

Over 136,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.
