Tuesday, December 13, 2011

Two New Public Review Issues: UTR #36, UTS #39

The Unicode Technical Committee has posted two new issues for public
review and comment. Details are on the following web page:


Review periods for the new items close on January 30, 2012.

Please see the page for links to discussion and relevant documents.
Briefly, the new issues are:

Issue #208 Proposed Update UTR #36: Unicode Security Considerations

This UTR is being updated to bring its IDNA 2008 references up to date.
Public review and comment are invited on this draft.

Issue #209 Proposed Update UTS #39: Unicode Security Mechanisms

This UTS is being updated to align with Unicode 6.1. Public review and
comment are invited on this draft.

To supply feedback on these issues, see
http://www.unicode.org/review/#feedback .

CLDR v21 Milestone 2 available for testing

Milestone releases of CLDR provide an opportunity to test a snapshot of the next version of CLDR; they are not intended for use in production. CLDR v21 is not a data submission release; instead, the CLDR group is engaged in improving tools, and making specific changes to data.

Note that the CLDR v21 release is intended to support Unicode 6.1, and depends on some new Unicode 6.1 property values for grapheme break and line break. This Milestone 2 release depends on values from the beta versions of Unicode 6.1 data files.

New additions in this Milestone 2 release include:
  • Changes to the segmentation data to match Unicode 6.1. The behaviors associated with the former "th" grapheme break tailoring and "he" line break tailoring have been moved into the root behavior, so those tailorings are no longer necessary and have been deleted.
  • Two new calendar element structures needed for support of the Chinese lunar calendar (and other calendars such as the Hindu lunar calendars); for more information see http://cldr.unicode.org/development/development-process/design-proposals/chinese-calendar-support:
    • Addition of the <monthPatterns> element structure to indicate how to modify standard month names to mark intercalary leap months, as well as (for some calendars) months adjacent to leap months and combined months. This is supported via the standard month pattern characters 'M' and 'L', so the pattern character 'l' (SMALL LETTER L) formerly provided as a way to mark leap months has been deprecated (it was never supported by underlying data).
    • Addition of the <cyclicNameSets> element structure to support cyclic names for years (and other calendar entities in some calendars).
  • A new "ar_001" locale for Modern Standard Arabic as the default content for "ar". This will permit the "ar_EG" locale (formerly the default content for "ar") to use some Egypt-specific names.
  • Addition of codes for South Sudan.
  • Other specific data fixes such as for Ukrainian collation, Ewe day periods, various metazones, and some specific translation errors.
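
As a rough illustration of the <monthPatterns> idea described in the list above, the Python sketch below applies a leap-month pattern to a standard month name by substituting the name for a {0} placeholder. The pattern string "{0}bis", the dictionary layout, the sample month name, and the function names are assumptions made for this example; they are not taken from the CLDR data files or from any CLDR API.

```python
# Hypothetical month-pattern data, keyed by pattern type.
# "{0}" stands for the standard month name; the "bis" suffix is
# only an illustrative marker, not a value quoted from CLDR.
MONTH_PATTERNS = {"leap": "{0}bis"}


def apply_month_pattern(month_name: str, pattern: str) -> str:
    """Substitute the standard month name for the {0} placeholder."""
    return pattern.replace("{0}", month_name)


def format_month(month_name: str, is_leap: bool, patterns=MONTH_PATTERNS) -> str:
    """Return the month name, modified by the leap pattern if one applies."""
    if is_leap and "leap" in patterns:
        return apply_month_pattern(month_name, patterns["leap"])
    # No applicable pattern: the standard month name is used unchanged.
    return month_name


if __name__ == "__main__":
    print(format_month("Month4", is_leap=False))
    print(format_month("Month4", is_leap=True))
```

Because the leap marking is kept as a separate pattern wrapped around the ordinary month name, the same 'M' and 'L' month fields can serve both regular and leap months, which is why a dedicated pattern character is not needed.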

Highlights in the Milestone 1 release (Sept. 29) included:
  • Work in support of pending -t- extension in BCP47
  • Deprecation of 'commonlyUsed' element in timezone names
  • Removal of "whole-locale" aliases (data for constructing them is in supplementaldata.xml)
  • First cut at incorporating European Ordering Rules (EOR)

The data is available from SVN under "tag/release-21-d02" as described in
The full list of changes in this milestone is
The current draft LDML specification is