Wednesday, April 26, 2017

Last Call on Unicode 10.0 Beta Review

The beta review period for Unicode 10.0 and related technical standards will close on May 1, 2017. This is the last opportunity for technical comments before version 10.0 is released in Q2 2017. Implementers and interested parties are encouraged to download data files, review proposed updates, and submit comments soon.

In addition to the Unicode Standard proper, three other Unicode Technical Standards have significant text and data file updates that are correlated with the new additions for Unicode 10.0.0. Review of that text and data is also encouraged during the beta review period.

UTS #10, Unicode Collation Algorithm (data files)
UTS #39, Unicode Security Mechanisms (data files)
UTS #46, Unicode IDNA Compatibility Processing (data files)

Additional documents are available for public review and will be discussed at the May UTC meeting, including the final Emoji 5.0 text and a proposed Unicode character property. For more information, see the open public review issues and the UTC document registry.

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 10.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications for Unicode 10.0, often in coordination with changes to character properties. In particular, there are changes to UAX #14, Unicode Line Breaking Algorithm, UAX #29, Unicode Text Segmentation, and UAX #31, Unicode Identifier and Pattern Syntax. In addition, UAX #50, Unicode Vertical Text Layout, has been newly incorporated as a part of the standard. Four new scripts have been added in Unicode 10.0, including Nüshu. There are also 56 additional emoji characters, a major new extension of CJK ideographs, and 285 hentaigana, important historic variants for Hiragana syllables.

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by May 1, 2017. Feedback instructions are on the beta page.
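As one possible starting point for such testing, the short Python sketch below reports which Unicode version a runtime's character database implements and whether a given code point is assigned in it. It uses only the standard `unicodedata` module. The helper names `parse_version` and `is_assigned` are illustrative, and U+1F6F8 is assumed here to be one of the characters new in 10.0.

```python
import unicodedata


def parse_version(text):
    """Turn a dotted version string such as '10.0.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_assigned(code_point):
    """Return True if the runtime's character database assigns this code point."""
    # General category 'Cn' means "unassigned" in the database in use.
    return unicodedata.category(chr(code_point)) != "Cn"


runtime = parse_version(unicodedata.unidata_version)
print("Unicode data version:", unicodedata.unidata_version)
print("At least 10.0.0:", runtime >= (10, 0, 0))
# Assumption: U+1F6F8 is one of the characters added in Unicode 10.0.
print("U+1F6F8 assigned:", is_assigned(0x1F6F8))
```

A check like this only describes the runtime in use; it does not replace testing against the beta data files themselves.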

See for more information about testing the 10.0.0 beta.

See for the current draft summary of Unicode 10.0.0.

Monday, April 17, 2017

ICU 59 Released

Unicode® ICU 59 has just been released! ICU is the main avenue for many software products and libraries to support the world's languages, implementing the latest versions of both the Unicode encoding standard and the Unicode locale data (CLDR).

ICU 59 upgrades to CLDR 31 and to emoji 5.0 data, together with segmentation and bidi updates from Unicode 10 beta. The Java code for number formatting has been completely rewritten for reliability and performance. There is also a new case mapping API for styled text, and a technology preview of enhanced language matching.

There are major changes for ICU4C that will make ICU easier to use but require changes in projects using ICU: C++11, char16_t, UTF-8 source files.

For details please see

Thursday, April 13, 2017

Call for Unicode 10.0 Cover Design Art

The Unicode Consortium is inviting artists and designers to submit cover design proposals for Version 10.0 of The Unicode Standard.

The cover design will appear on the Unicode Standard 10.0 web page, in the print-on-demand publication, and in associated promotional literature on the Unicode website. The chosen artist will receive full credit in the colophon of the publication, and wherever else the design appears, and receive $700. The two runner-up artists will receive $150 apiece.

Please see the announcement web page for requirements and more details.

Friday, April 7, 2017

PRI #351: Combined registration of the KRName collection and of sequences in that collection

The Unicode Consortium has posted a new issue for public review and comment.

Public Review Issue #351: A submission for the “Combined registration of the KRName collection and of sequences in that collection” has been received by the IVD Registrar.

This submission is currently under review according to the procedures of UTS #37, Unicode Ideographic Variation Database, with an expected close date of 2017-07-07. Please see the submission page for details and instructions on how to review this issue and provide comments:

The IVD (Ideographic Variation Database) establishes a registry for collections of unique, and sometimes shared, variation sequences for Ideographs, which enables standardized interchange in plain text, in accordance with UTS #37, Unicode Ideographic Variation Database.
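To make the structure concrete: a variation sequence of this kind is a base ideograph followed by one variation selector from the range U+E0100..U+E01EF. The Python sketch below checks only that shape. It does not validate the base character or consult the IVD, so it cannot say whether a sequence is actually registered. The function name `looks_like_ivs` and the sample sequence are illustrative.

```python
VS17 = 0xE0100   # first selector in the Variation Selectors Supplement
VS256 = 0xE01EF  # last selector in the Variation Selectors Supplement


def looks_like_ivs(text):
    """Return True if text is exactly two code points ending in VS17..VS256."""
    if len(text) != 2:
        return False
    # The base character (text[0]) is deliberately not validated here.
    return VS17 <= ord(text[1]) <= VS256


# Illustrative sample: an ideograph followed by VS17.
sample = "\u845B\U000E0100"
print(looks_like_ivs(sample))
print(looks_like_ivs("\u845B"))
```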

Monday, March 27, 2017

Unicode Emoji 5.0 characters now final

Fifty-six new emoji characters are in the just released Emoji 5.0 data, including such characters as:

shushing face
mage
flying saucer
pie
* for healthy eaters!

The new Emoji 5.0 set is fixed, and available for vendors to begin working on their emoji fonts and code ahead of the release of Unicode 10.0, scheduled for June 2017.

Most of the new emoji characters are in Smileys & People (34), followed by Food & Drink (13), Animals & Nature (6), and a few others.

There are an additional 180 emoji sequences for gender and skin-tone in Smileys & People — such as woman in lotus position: medium skin tone — and new regional flags for England, Scotland, and Wales. This makes a total of 239 new emoji (characters and sequences). For a full list, see Emoji Recently Added.
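The totals quoted above can be checked with a few lines of Python. This is only a tally of the figures given in this post; the variable names are illustrative, and the count of "other" characters is derived by subtraction, not stated in the text.

```python
# Figures quoted in this post.
new_characters = 56
gender_and_skin_tone_sequences = 180
regional_flags = 3  # England, Scotland, Wales

total_new_emoji = new_characters + gender_and_skin_tone_sequences + regional_flags
print(total_new_emoji)  # 239

# Category breakdown of the 56 new characters.
smileys_and_people = 34
food_and_drink = 13
animals_and_nature = 6
# Derived here by subtraction; the post only says "a few others".
others = new_characters - (smileys_and_people + food_and_drink + animals_and_nature)
print(others)  # 3
```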

The emoji charts have been updated to show the new characters and sequences. The draft Emoji 5.0 specification will be finalized at the May UTC meeting, and is still available for comment.

The 239 new emoji are also now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.

