Thursday, March 10, 2016

Unicode 9.0 Beta Review

Mountain View, CA, USA – The Unicode® Consortium today announced the start of the beta review for the forthcoming Unicode 9.0.0, which is scheduled for release in June 2016. All beta feedback must be submitted by May 2, 2016.

Unicode is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones – plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). Thus it is important to ensure a smooth transition to each new version of the Unicode Standard.

Unicode 9.0.0 comprises several additions and changes which require careful migration in implementations. These include asymmetric case mappings, numerous variation sequences, new fractional numeric values, and changes to property values, especially East_Asian_Width values. The line breaking and text segmentation algorithms handle character sequences that represent emoji as indivisible units via the addition of new property values and rules. Implementers need to modify code and check assumptions for all affected processes to support these additions and changes.
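One way to check such assumptions is sketched below in Python. It reports the Unicode data version bundled with the interpreter and flags characters whose lowercase form does not survive an upper-then-lower round trip, which is how an asymmetric case mapping shows up in practice. The helper name `asymmetric_lowercase` and the sample characters are illustrative assumptions, not part of the beta materials.

```python
import unicodedata


def asymmetric_lowercase(chars):
    """Return the characters in `chars` that change under upper() then lower()."""
    return [c for c in chars if c.upper().lower() != c]


if __name__ == "__main__":
    # Version of the Unicode Character Database this interpreter ships with.
    print("Unicode data version:", unicodedata.unidata_version)

    # U+017F LATIN SMALL LETTER LONG S is a long-standing asymmetric case.
    # U+1C80 is assumed here to be one of the Unicode 9.0 additions of this
    # kind; on older data it is unassigned and will simply round-trip.
    samples = ["a", "\u017f", "\u1c80"]
    for c in asymmetric_lowercase(samples):
        print(f"U+{ord(c):04X} does not round-trip through upper()/lower()")
```

Code that caches or compares text by lowercasing after uppercasing (or the reverse) is the kind of process worth re-testing against the beta data files.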

The new character repertoire includes 74 emoji symbols, 19 symbols used in Japanese TV broadcasting, and multiple additions to existing scripts. There are six new scripts, of which three are in modern use (Adlam, Osage, and Newa) and three are historic (Bhaiksuki, Marchen, and Tangut). Adlam and Osage have case pairs and require data updates for casing functions. Tangut is a large ideographic script whose addition required changes to the Unicode Collation Algorithm (used as the basis for sorting text in all languages).
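As a rough illustration of the casing data update, the following Python sketch tests whether the runtime maps one Osage and one Adlam case pair in both directions. The code points are assumptions taken from the Unicode 9.0 code charts (the first capital and small letter of each script), and `casing_supported` is a hypothetical helper name. On a runtime whose character data predates 9.0, the pairs are unassigned and the check reports them as missing.

```python
import unicodedata

# (uppercase, lowercase) code points for the first letter of each script.
# These values are assumptions based on the Unicode 9.0 code charts.
CASE_PAIRS = {
    "Osage": (0x104B0, 0x104D8),
    "Adlam": (0x1E900, 0x1E922),
}


def casing_supported(upper_cp, lower_cp):
    """Return True if the runtime maps the pair in both directions."""
    upper, lower = chr(upper_cp), chr(lower_cp)
    return upper.lower() == lower and lower.upper() == upper


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for script, (upper_cp, lower_cp) in CASE_PAIRS.items():
        status = "ok" if casing_supported(upper_cp, lower_cp) else "missing"
        print(f"{script}: case mapping {status}")
```

A check like this only samples one pair per script; a fuller test would iterate over the case mappings listed in the beta data files.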

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by May 2, 2016. Feedback instructions are on the beta page.

See the beta page for more information about testing the 9.0.0 beta.

The current draft summary of Unicode 9.0.0 is also available for review.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, Emoji One, EmojiXpress, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Sultanate of Oman MARA, Oracle, SAP, Tamil Virtual University, The University of California (Berkeley), Yahoo!, plus well over a hundred Associate, Liaison, and Individual members. For more information, please contact the Unicode Consortium.