Wednesday, March 14, 2018

Unicode 11.0 Beta Review

The beta review period for Unicode 11.0 has started. The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones, plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 11.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications for Unicode 11.0, often in coordination with changes to character properties. In particular, there are major changes to UAX #29, Unicode Text Segmentation. Seven new scripts have been added in Unicode 11.0, including Hanifi Rohingya. A major adjustment has been made to the Georgian script, with the introduction of uppercase Georgian letters. There are also 66 additional emoji characters.

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by April 23, 2018. Feedback instructions are on the beta page.
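
As one way to approach testing the data files, the sketch below parses lines in the semicolon-separated UnicodeData.txt layout (15 fields, with simple uppercase and lowercase mappings in fields 12 and 13) and checks that each uppercase mapping has a matching lowercase mapping back. The names CaseRecord, parse_line, and find_case_mismatches are hypothetical helpers, not part of any Unicode tooling, and the two sample lines are illustrative entries for the new Georgian case pairs that should be verified against the actual beta file.

```python
from typing import Dict, List, NamedTuple, Optional


class CaseRecord(NamedTuple):
    code_point: int
    name: str
    category: str
    upper: Optional[int]  # simple uppercase mapping (field 12)
    lower: Optional[int]  # simple lowercase mapping (field 13)


def parse_line(line: str) -> CaseRecord:
    """Parse one line in UnicodeData.txt format (15 semicolon-separated fields)."""
    fields = line.strip().split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")

    def to_cp(text: str) -> Optional[int]:
        # Empty fields mean "no mapping".
        return int(text, 16) if text else None

    return CaseRecord(
        code_point=int(fields[0], 16),
        name=fields[1],
        category=fields[2],
        upper=to_cp(fields[12]),
        lower=to_cp(fields[13]),
    )


def find_case_mismatches(records: List[CaseRecord]) -> List[str]:
    """Report uppercase mappings whose target does not map back via lowercase."""
    by_cp: Dict[int, CaseRecord] = {r.code_point: r for r in records}
    problems: List[str] = []
    for r in records:
        if r.upper is None:
            continue
        target = by_cp.get(r.upper)
        if target is None:
            problems.append(
                f"U+{r.code_point:04X}: uppercase U+{r.upper:04X} not in data"
            )
        elif target.lower != r.code_point:
            problems.append(
                f"U+{r.code_point:04X}: U+{r.upper:04X} does not lowercase back"
            )
    return problems


# Illustrative sample lines (assumed; compare with the real beta data file).
SAMPLE_LINES = [
    "10D0;GEORGIAN LETTER AN;Ll;0;L;;;;;N;;;1C90;;10D0",
    "1C90;GEORGIAN MTAVRULI CAPITAL LETTER AN;Lu;0;L;;;;;N;;;;10D0;",
]
RECORDS = [parse_line(line) for line in SAMPLE_LINES]
```

Calling find_case_mismatches(RECORDS) returns a list of human-readable problem strings; an empty list means every uppercase mapping in the input round-trips. A real check would read the full data file rather than a hard-coded sample.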

See for more information about testing the 11.0.0 beta.

See for the current draft summary of Unicode 11.0.0.

Wednesday, March 7, 2018

Call for Unicode 11.0 and 12.0 Cover Design Art

The Unicode Consortium is inviting artists and designers to submit cover design proposals for Versions 11.0 and 12.0 of The Unicode Standard. This call is being issued simultaneously for the next two versions of the standard, scheduled for publication in 2018 and 2019, respectively.

The two selected cover designs will appear on the Unicode Standard 11.0 and 12.0 web pages, in the print-on-demand publications, and in associated promotional literature on the Unicode website. The two artists whose designs are selected for the covers will receive full credit in the colophon of the publication for which the art is used, and wherever else the design appears, and will each receive $700. Two selected runner-up artists will receive $150 apiece.

Please see the announcement page for requirements and more details.

Tuesday, February 27, 2018

Unicode CLDR 33 alpha available for testing

The alpha version of Unicode CLDR 33 is available for testing. The alpha period lasts until the beta release on March 7, which will include updates to the LDML spec. The final release is expected on March 21.

CLDR 33 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 33 included a limited Survey Tool data collection phase focusing on emoji names/annotations and certain specific locales (Odia, Assamese). Other enhancements include a new typographic Names element, four new transforms, changes to properties data files for emoji, and other specific fixes. The draft release page lists the major features, and has pointers to the newest data and charts. It will be fleshed out over the coming weeks with more details, migration issues, known problems, and so on. Particularly useful for review are:
Please report any problems that you find using a CLDR ticket. We'd also appreciate it if programmatic users of CLDR data would download the XML files and do a trial integration to see if any problems arise.

Unicode Emoji 11.0 characters now ready for adoption!

And by the way, we have updated charts for the new emoji, with some fixed glyphs (thanks to Emojipedia!).

Monday, February 26, 2018

Adopt-A-Character Grant to Support Three Historic Scripts

The Adopt-a-Character Program has awarded a grant to support development of proposals for encoding the following three historic scripts in the Unicode Standard:
  • Book Pahlavi, an Aramaic-based script important to Zoroastrian and Parsi communities worldwide
  • Persian Siyaq Numbers, a numerical system used in Iran from the 9th to 20th centuries for accounting and administration
  • Uighur, a script used in the region spanning Uzbekistan to Mongolia from the 8th to the 19th century
The work will be done by Anshuman Pandey under the direction of Deborah Anderson (SEI, UC Berkeley) and Rick McGowan (Unicode Consortium).

