Monday, April 17, 2017

ICU 59 Released

Unicode® ICU 59 has just been released! ICU is the main avenue for many software products and libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

ICU 59 upgrades to CLDR 31 and to emoji 5.0 data, together with segmentation and bidi updates from Unicode 10 beta. The Java code for number formatting has been completely rewritten for reliability and performance. There is also a new case mapping API for styled text, and a technology preview of enhanced language matching.

There are major changes for ICU4C that will make ICU easier to use but require changes in projects using ICU: C++11, char16_t, UTF-8 source files.

For details please see

Thursday, April 13, 2017

Call for Unicode 10.0 Cover Design Art

The Unicode Consortium is inviting artists and designers to submit cover design proposals for Version 10.0 of The Unicode Standard.

The cover design will appear on the Unicode Standard 10.0 web page, in the print-on-demand publication, and in associated promotional literature on the Unicode website. The chosen artist will receive full credit in the colophon of the publication, and wherever else the design appears, and receive $700. The two runner-up artists will receive $150 apiece.

Please see the announcement web page for requirements and more details.

Friday, April 7, 2017

PRI #351: Combined registration of the KRName collection and of sequences in that collection

The Unicode Consortium has posted a new issue for public review and comment.

Public Review Issue #351: A submission for the “Combined registration of the KRName collection and of sequences in that collection” has been received by the IVD Registrar.

This submission is currently under review according to the procedures of UTS #37, Unicode Ideographic Variation Database, with an expected close date of 2017-07-07. Please see the submission page for details and instructions on how to review this issue and provide comments:

The IVD (Ideographic Variation Database) establishes a registry for collections of unique, and sometimes shared, variation sequences for Ideographs, which enables standardized interchange in plain text, in accordance with UTS #37, Unicode Ideographic Variation Database.

Monday, March 27, 2017

Unicode Emoji 5.0 characters now final

Fifty-six new emoji characters are in the just released Emoji 5.0 data, including such characters as:

shushing face, mage, flying saucer, pie
* for healthy eaters!

The new Emoji 5.0 set is fixed, and available for vendors to begin working on their emoji fonts and code ahead of the release of Unicode 10.0, scheduled for June 2017.

Most of the new emoji characters fall into Smileys & People (34), followed by Food & Drink (13), Animals & Nature (6), and a few others.

There are an additional 180 emoji sequences for gender and skin-tone in Smileys & People — such as woman in lotus position: medium skin tone — and new regional flags for England, Scotland, and Wales. This makes a total of 239 new emoji (characters and sequences). For a full list, see Emoji Recently Added.
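As an illustration of how such sequences are put together, here is a minimal Python sketch. It assumes the usual emoji encoding mechanisms: subdivision flags are a black flag followed by tag characters spelling the subdivision code and a cancel tag, and the gendered, skin-toned sequence is a base character plus a modifier, a zero width joiner, and a gender sign. The helper name `subdivision_flag` is hypothetical and not part of any library.

```python
# Sketch only: builds emoji sequences from their code points.
BLACK_FLAG = "\U0001F3F4"   # base character of a subdivision flag
CANCEL_TAG = "\U000E007F"   # terminates the tag sequence
TAG_BASE = 0xE0000          # tag characters mirror ASCII at this offset


def subdivision_flag(subdivision_id: str) -> str:
    """Return the emoji tag sequence for a subdivision code such as 'gbeng'."""
    tags = "".join(chr(TAG_BASE + ord(ch)) for ch in subdivision_id.lower())
    return BLACK_FLAG + tags + CANCEL_TAG


flags = {
    "England": subdivision_flag("gbeng"),
    "Scotland": subdivision_flag("gbsct"),
    "Wales": subdivision_flag("gbwls"),
}

# woman in lotus position: medium skin tone
# = person in lotus position + medium skin tone modifier + ZWJ + female sign + VS16
WOMAN_LOTUS_MEDIUM = "\U0001F9D8\U0001F3FD\u200D\u2640\uFE0F"

# The totals quoted above: new characters + gender/skin-tone sequences + flags.
new_characters, new_sequences, new_flags = 56, 180, len(flags)
total_new_emoji = new_characters + new_sequences + new_flags

if __name__ == "__main__":
    for name, flag in flags.items():
        print(name, " ".join(f"U+{ord(ch):04X}" for ch in flag))
    print("total new emoji:", total_new_emoji)
```

Whether a given sequence renders as a single glyph depends on the fonts and platform in use; the code only shows the underlying code points.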

The emoji charts have been updated to show the new characters and sequences. The draft Emoji 5.0 specification will be finalized in the May UTC meeting, and is still available for comment.

The 239 new emoji are also now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.

Adopt a Character

Monday, March 20, 2017

CLDR Version 31 Released

Unicode CLDR 31 provides an update to the key building blocks for software supporting the world’s languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

Some of the improvements in the release are:
  • Canonical codes
    • The subdivision codes have been changed to all have the BCP 47 format.
    • The locales in the language-territory population data are in canonical format.
    • The timezone ID for GMT has been split from UTC.
    • There is a mechanism for identifying hybrid locales, such as Hinglish.
  • Emoji 5.0
    • Short names and keywords have been updated for English. (Data for other languages will be gathered in the next cycle.)
    • Collation (sorting) adds the new 5.0 Emoji characters and sequences, and some fixes for Emoji 4.0 characters and sequences.
    • For Emoji usage, subdivision names for Scotland, Wales, and England have been added for 65 languages.
For further details and links to documentation, see the CLDR Release Notes.
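
To make the subdivision code change under "Canonical codes" concrete, here is a small Python sketch. It assumes that the BCP 47 form of a subdivision code is the ISO 3166-2 code lowercased with the hyphen removed (for example, "GB-SCT" becomes "gbsct"). The function name `to_bcp47_subdivision` is hypothetical, and the validation pattern is a simplification, not the CLDR definition.

```python
import re

# Simplified shape of an ISO 3166-2 code: two letters, a hyphen, and a
# one-to-three character alphanumeric suffix. This is an assumption made
# for the sketch, not a full validator.
_ISO_3166_2 = re.compile(r"^[A-Za-z]{2}-[A-Za-z0-9]{1,3}$")


def to_bcp47_subdivision(iso_code: str) -> str:
    """Convert an ISO 3166-2 style code such as 'GB-SCT' to the form 'gbsct'."""
    if not _ISO_3166_2.match(iso_code):
        raise ValueError(f"not an ISO 3166-2 style code: {iso_code!r}")
    return iso_code.replace("-", "").lower()


if __name__ == "__main__":
    for code in ("GB-ENG", "GB-SCT", "GB-WLS", "US-CA"):
        print(code, "->", to_bcp47_subdivision(code))
```

The CLDR Release Notes remain the authoritative source for the exact format and for any exceptions.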