Thursday, September 25, 2014

Updated Unicode Security Specifications and Guidelines

The major Unicode security-related specifications and guidelines have been updated for Unicode 7.0. The security-related data files have undergone a major revision to improve their algorithmic consistency, as well as to take into account new information about confusable character data. We strongly advise that implementations be updated to make use of this new data. Pay particular attention to persistent data stores, such as database indexes, that use strings folded with the previous version of the data files. Mixing strings folded with new and old data files in the same persistent store will likely cause failures. It may be necessary to provide APIs for both old and new folding during a migration.
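As an illustration of the migration concern above, the following is a minimal sketch of an index that tags each folded key with the data version used to fold it, so that lookups can try both foldings until old entries are re-folded. Everything in it is an assumption made for illustration: the two tiny mapping tables stand in for the old and new confusables data, and `VersionedIndex`, `make_folder`, and `migrate` are hypothetical names, not part of any Unicode specification or library.

```python
import unicodedata
from typing import Callable, Dict, Optional, Tuple

# Hypothetical stand-ins for the "old" and "new" confusable data files.
# The real data is far larger; these two tables only differ in one entry.
OLD_MAP = {"\u0430": "a"}                  # Cyrillic a -> Latin a
NEW_MAP = {"\u0430": "a", "\u0435": "e"}   # new data also maps Cyrillic e

def make_folder(mapping: Dict[str, str]) -> Callable[[str], str]:
    """Build a folding function: NFD-normalize, then map each character."""
    def fold(text: str) -> str:
        decomposed = unicodedata.normalize("NFD", text)
        return "".join(mapping.get(ch, ch) for ch in decomposed)
    return fold

FOLDERS = {"old": make_folder(OLD_MAP), "new": make_folder(NEW_MAP)}

class VersionedIndex:
    """Keeps folded keys separated by the data version that produced them."""

    def __init__(self) -> None:
        # (version, folded key) -> (original string, value)
        self._entries: Dict[Tuple[str, str], Tuple[str, str]] = {}

    def put(self, original: str, value: str, version: str = "new") -> None:
        self._entries[(version, FOLDERS[version](original))] = (original, value)

    def get(self, query: str) -> Optional[str]:
        # Fold the query once per version and never compare across versions.
        for version in ("new", "old"):
            hit = self._entries.get((version, FOLDERS[version](query)))
            if hit is not None:
                return hit[1]
        return None

    def migrate(self) -> None:
        # Folding is lossy, so re-fold from the stored original string.
        for key in [k for k in self._entries if k[0] == "old"]:
            original, value = self._entries.pop(key)
            self.put(original, value, version="new")
```

In this sketch an entry stored as `"p\u0435ar"` under the old table is not found by a lookup for `"pear"` until `migrate()` has re-folded it with the new table, which is the kind of mismatch that mixing folded strings in one store can produce. Keeping the original string alongside the folded key is a design choice of the sketch; without it, re-folding would not be possible.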

The guidelines have also been updated with descriptions of additional security issues. In particular, it is now clear that display of Punycode URLs as a security measure can, in some circumstances, actually make the spoofing problem worse.
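To make concrete what "display of Punycode" means, here is a small sketch using Python's built-in `idna` codec, which implements the older IDNA 2003 rules and may differ from what a given browser does. The helper names `to_ascii` and `to_unicode` and the example domain are assumptions chosen for illustration.

```python
def to_ascii(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")

def to_unicode(domain: str) -> str:
    """Convert an ASCII (Punycode) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")

if __name__ == "__main__":
    ascii_form = to_ascii("bücher.example")
    print(ascii_form)              # the "xn--" form a browser might display
    print(to_unicode(ascii_form))  # round-trips to the Unicode form
```

The ASCII form is an opaque string that users cannot easily read, which is part of why showing it is not automatically a safe fallback.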

For details, see:

Unicode Security Considerations:
Unicode Security Mechanisms:

Wednesday, September 24, 2014

Proposed Update UAXes for Unicode 8.0

Proposed updates for several of the Unicode Standard Annexes for Version 8.0 of the Unicode Standard have been posted for public review. See for details and links to the various documents.

UTS #10, Unicode Collation Algorithm, has also been posted for public review. In this update, Cyrillic contractions have been removed. See the Modifications section of the draft document for further information.

The review period for feedback on these proposed updates closes on October 20, 2014, in time for the November UTC meeting, but there will be further opportunities for feedback on the annexes after that meeting.

To supply feedback on these issues, please see

Monday, September 22, 2014

New version of UTR #50, Unicode Vertical Text Layout released

A new revision of UTR #50, Unicode Vertical Text Layout, has been released. The data tables have been updated to bring them into line with Unicode 7.0. A few additional changes to the property values were made, mainly for consistency across similar characters.

Thursday, September 18, 2014

CLDR Version 26 Released

Unicode CLDR 26 has been released, providing an update to the key building blocks for software supporting the world's languages. This data is used by a wide spectrum of companies for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks. This release focused primarily on Unicode 7.0 compatibility, Survey Tool improvements, increased coverage, new units, and improvements to collation and RBNF. Changes include the following:
  • Data Growth: Major increase in the number of translations, with 77 locales now reaching the 100% modern coverage level, and an overall growth of about 20% in data.
  • Units: Added 72 new units, added display names for all units and a new perUnitPattern (e.g., liters per second).
  • Collation: Updated collation (sorting) to Unicode 7.0, moved Unihan radical-stroke collation into root to avoid duplication, used import to reduce source size by 23% and ease maintenance. Major changes to Arabic collation.
  • Spell-out numbers: improvements for round-trip fidelity; new syntax for use of plural categories.
  • Specification: documented new structure, \x{h...h} syntax for Unicode code points; construction of “unit per unit” formats; clarified BCP47 and Unicode identifiers, and different kinds of locale lookup, matching, and inheritance.
  • Survey Tool: Major improvements to the UI to make it easier and faster to enter and check data.
Details are provided in, along with a detailed Migration section.
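As a rough illustration of the \x{h...h} code point syntax mentioned in the Specification item, the sketch below expands such escapes into the characters they denote. It handles only this one escape form and accepts one to six hex digits; the function name `unescape_code_points` and those limits are assumptions of the sketch, not taken from the CLDR specification.

```python
import re

# Matches \x{h...h} with one to six hexadecimal digits (an assumed limit).
_ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]{1,6})\}")

def unescape_code_points(text: str) -> str:
    """Replace each \\x{h...h} escape with the code point it names."""
    def replace(match: "re.Match[str]") -> str:
        code_point = int(match.group(1), 16)
        if code_point > 0x10FFFF:
            raise ValueError(f"not a Unicode code point: {match.group(0)}")
        return chr(code_point)
    return _ESCAPE.sub(replace, text)

if __name__ == "__main__":
    print(unescape_code_points(r"\x{61}\x{1F680}"))
```

Text that contains no escapes passes through unchanged, and values above U+10FFFF are rejected rather than silently dropped.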

Wednesday, September 3, 2014

IUC 38 Keynote Presenter Announced


絵文字 : 🏰, 🎁, and 🚀 = Emoji: Past, Present, and Future
Dr. Mark Davis 
Unicode President and Co-Founder
The Unicode Consortium has announced that its president and co-founder, Dr. Mark Davis, will deliver the keynote address at this year’s Internationalization & Unicode Conference (IUC), November 3-5. Dr. Davis’s talk, Emoji: Past, Present, and Future, will discuss where emoji came from, why they have gotten so popular, where they’ve gone wrong, and what the future will bring.

“Emoji became very popular in Japan right after they were introduced in 1999,” said Dr. Davis. “Once they were added to Unicode in 2010, they became popular worldwide, used in modern mobile phones, texting systems, email, and so on. For example, there were some 6,000 articles on emoji in the month after Unicode 7.0 released, according to Google News.”

Dr. Davis will explore the history of emoji, how they came to be added to Unicode, how they are used in practice, and some of the deficiencies that people see. For example, what about the lack of human diversity and why isn’t there a hot dog emoji? He will then illuminate some of the future additions from Unicode and answer some of the most common questions about emoji.

IUC is the premier event covering the latest in industry standards and best practices for bringing software and Web applications to worldwide markets. Subject areas include the global impact, programming practices, fonts and rendering, and mobile computing. For the eighth year, Adobe will be sponsoring the conference.

To view the full IUC agenda and to register, visit

Tuesday, September 2, 2014

PRI #281: Proposed encoding model change for New Tai Lue

The UTC is considering a significant change to the encoding model for New Tai Lue script from logical order to visual order for Unicode 8.0. Details of the proposal are in the background document.

Please see the PRI page: for further information about how to discuss this Public Review Issue and how to supply formal feedback.