Friday, September 14, 2018

Unicode CLDR 34 alpha available for testing

The alpha version of Unicode CLDR 34 is available for testing. The alpha period lasts until the beta release on September 26, which will include updates to the LDML spec. The final release is expected on October 10.

CLDR 34 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 34 includes a full Survey Tool data collection phase. Other enhancements include several changes to prepare for the new Japanese calendar era starting 2019-05-01; updated emoji names, annotations, collation and grouping; and other specific fixes. The draft release page lists the major features and has pointers to the newest data and charts. It will be fleshed out over the coming weeks with more details, migration issues, known problems, and so on. Particularly useful for review are:

Please report any problems that you find using a CLDR ticket. We'd also appreciate it if programmatic users of CLDR data would download the XML files and do a trial integration to see if any problems arise.

Thursday, September 6, 2018

New Japanese Era

A new era in the Japanese calendar is expected to begin on May 1, 2019, following the announced abdication of Japanese Emperor Akihito. This era will be represented in dates by two names: one consisting of a sequence of two existing kanji and one consisting of a new single Japanese character that combines those two. (Similarly, the current era Heisei can be represented by either “平成” or “㍻”.)
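
The relationship between the two forms of an era name can be seen with the existing Heisei character. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `expand_era_square` is only illustrative. It relies on the compatibility decomposition of U+337B, so NFKC normalization turns the single character into the two-kanji sequence.

```python
import unicodedata

# U+337B is the single-character ("square") form of the Heisei era name.
SQUARE_HEISEI = "\u337b"  # ㍻


def expand_era_square(text: str) -> str:
    """Replace compatibility characters such as ㍻ with their decomposed form.

    NFKC normalization applies compatibility decompositions, so the square
    era character becomes the ordinary two-kanji sequence. Note that NFKC
    also folds other compatibility characters in the text.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(unicodedata.name(SQUARE_HEISEI))
    print(expand_era_square(SQUARE_HEISEI + "30年"))
```

Software that searches or compares dates may want to apply this kind of normalization so that both spellings of an era match; whether NFKC is appropriate for a given application is a separate design decision.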

The Japanese calendar system and support for era names are essential for important public-sector business functions. Therefore, most software distributed in Japan will need to adopt the new era name and add font support for the new character.

The current Heisei era has been in place since 1989, a period that spans the evolution of modern computer systems. Because of this, most software systems have never been tested against an era change. The exact date of the announcement of the new era name is unknown, but current expectations are that there will be a very narrow window for implementing the new era information in IT environments, perhaps less than a month. Until the announcement, dates in 2019 and beyond will continue to be written with the Heisei era name and its year numbering.
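
The year numbering described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how CLDR or ICU implement it: the table `ERAS`, the function `japanese_era_year`, the flag `new_era_announced`, and the placeholder name `NEW_ERA` are all hypothetical. The Heisei start date of January 8, 1989 is the historical date; the May 1, 2019 start is the expected date given above.

```python
from datetime import date

# Era table, newest first. "NEW_ERA" is a placeholder (an assumption for
# this sketch), since the real name has not been announced.
ERAS = [
    (date(2019, 5, 1), "NEW_ERA"),
    (date(1989, 1, 8), "平成"),  # Heisei
]


def japanese_era_year(d: date, new_era_announced: bool = False):
    """Return (era name, year within era) for a Gregorian date.

    Until the new era is announced, dates on or after 2019-05-01 keep the
    Heisei name and numbering, as the text above describes.
    """
    for start, name in ERAS:
        if name == "NEW_ERA" and not new_era_announced:
            continue  # skip the placeholder era before the announcement
        if d >= start:
            # The first calendar year of an era is year 1.
            return name, d.year - start.year + 1
    raise ValueError("date precedes the eras in this table")


if __name__ == "__main__":
    print(japanese_era_year(date(2018, 9, 6)))
    print(japanese_era_year(date(2019, 6, 1)))
    print(japanese_era_year(date(2019, 6, 1), new_era_announced=True))
```

Keeping the era table as data rather than hard-coding a single era is what makes a late change like this one a data update instead of a code change.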

To prepare as well as possible for this unprecedented event, the Unicode Consortium has taken the following actions:

  • The code point U+32FF has been reserved for the new era character.
  • Once the new era name is announced, the Unicode Consortium will quickly issue a dot-release (Version 12.1) that will add that character at the reserved code point, U+32FF, with an appropriate character name, decomposition, and representative glyph.
  • Unicode CLDR and ICU are including test mechanisms in the October 2018 releases of CLDR 34 and ICU 63. Systems that use CLDR or ICU (all smartphones, for example) can test using these mechanisms.
  • Systems and applications that do not use CLDR or ICU will need to take similar steps for testing.

The short time window between the actual announcement and the effective date will present challenges to the IT industry. IT systems in Japan will be expected to have the support in place seamlessly. Because of the narrow timeframe and the need to upgrade or patch legacy software, it is important to start now to determine how soon your application/system can add support to your current implementations, stacks, and dependencies.
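
As one small first step in such an audit, a runtime can be asked whether its character database already assigns the reserved code point U+32FF. The sketch below uses Python's standard `unicodedata` module; the function name `new_era_character_name` is only illustrative. It checks the Python runtime's own Unicode data and says nothing about fonts, CLDR or ICU data, or other layers of a stack.

```python
import unicodedata

# Code point reserved for the new era character, per the list above.
NEW_ERA_CODE_POINT = 0x32FF


def new_era_character_name(default=None):
    """Return the character name for U+32FF, or `default` if unassigned.

    unicodedata.name() returns the supplied default when the runtime's
    Unicode data has no name for the character.
    """
    return unicodedata.name(chr(NEW_ERA_CODE_POINT), default)


if __name__ == "__main__":
    name = new_era_character_name()
    version = unicodedata.unidata_version
    if name is None:
        print(f"U+{NEW_ERA_CODE_POINT:04X} is unassigned in Unicode {version} data")
    else:
        print(f"U+{NEW_ERA_CODE_POINT:04X} is {name} in Unicode {version} data")
```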