Wednesday, December 5, 2018

Support Unicode with an Adopt-a-Character Gift this Holiday Season!

This holiday season you can give a unique gift by adopting any emoji, letter, or symbol, and help support the Unicode Consortium’s mission to enable all languages to be used on computers. Three levels of sponsorship are available, starting at $100. With over 130,000 characters to choose from, you are certain to find an appropriate character for even the most demanding recipient. All sponsors will receive a custom digital badge featuring the adopted character for use on the web and elsewhere. Sponsors at the two highest levels will receive a special thank-you gift engraved with the name you supply and the adopted character.

The program funds work on “digitally disadvantaged” languages, both modern and historic. In 2018 the program awarded grants to support work on improved keyboard layouts, additional work on Mayan hieroglyphs, and more historic Indic scripts, among others.

To date, the Adopt-a-Character program has had over 500 sponsors. Be part of the next wave, with a worthwhile gift!

For more information on the program, or to adopt a character, see the Adopt-a-Character Page.

Monday, November 5, 2018

Unicode 12.0 Beta Review

The beta review period for Unicode 12.0 has started. The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones, plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 12.0 includes a number of changes and 554 new characters. Some of the Unicode Standard Annexes have modifications for Unicode 12.0, often in coordination with changes to character properties. In particular, there are minor changes to UAX #29, Unicode Text Segmentation, to account for differences in Georgian casing behavior. Four new scripts have been added in Unicode 12.0. There are also 61 additional emoji characters, as well as very significant enhancements to the representation and behavior of multiperson emoji.

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by January 7, 2019. Feedback instructions are on the beta page.

See for more information about testing the 12.0.0 beta.

See for the current draft summary of Unicode 12.0.0.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, Emojipedia, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Sultanate of Oman MARA, Oracle, SAP, Shopify, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to

Over 130,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.


Tuesday, October 23, 2018

Draft Candidates for Emoji 12.0 Beta (2019)

The Emoji 12.0 Beta contains 236 Emoji Draft Candidates, consisting of 61 characters plus 175 sequences. These are slated for release in 2019Q1 together with Unicode Version 12.0.

The emoji are in the following categories: 3 smileys & emotion, 209 people & body, 7 animals & nature, 9 food & drink, 6 travel & places, 3 activities, 15 objects, and 12 miscellaneous symbols. 50 of the new emoji (including gender/skin-tone variants) are for accessibility, such as ear with hearing aid and woman in manual wheelchair. The hearts, circles, and squares now have the same set of colors for decorative and/or descriptive uses.

Multi-person emoji now have skin-tone variants:

(A) Full Emoji v12.0 support requires that the holding-hands emoji (👫 👬 👭) with specific genders be supported with 55 skin-tone combinations, both uniform and mixed, such as:
  • man with dark skin tone and woman with light skin tone holding hands
  • woman with medium skin tone and woman with medium-light skin tone holding hands
  • man with light skin tone and man with light skin tone holding hands
(B) Full Emoji v12.0 support requires that the 6 multi-person emoji (👯 🤼 🤝 💏 💑 👪) without specific gender be supported with the 5 human skin tones, such as:
  • family (adult+adult+child) with dark skin tone
  • couple with heart (adult+adult) with medium skin tone
  • couple kissing (adult+adult) with light skin tone
A mechanism is provided for mixed skin tones for emoji in group B, such as with a family of man+woman+girl+boy, but support is optional.
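
The figure of 55 combinations for group A can be reproduced with a short count. The sketch below is illustrative only and rests on one assumption that is not stated in the announcement: for the two same-gender emoji, swapping the two people's skin tones is treated as the same combination, while for the woman-and-man emoji the order matters. The function and variable names are hypothetical.

```python
from itertools import combinations_with_replacement, product

# The five skin tones that the emoji modifiers can express.
SKIN_TONES = ["light", "medium-light", "medium", "medium-dark", "dark"]


def holding_hands_combinations():
    """Enumerate (emoji, tone_a, tone_b) tuples for the three holding-hands emoji."""
    combos = []
    # Woman and man: the two people are distinguishable, so order matters (5 x 5).
    for tone_a, tone_b in product(SKIN_TONES, repeat=2):
        combos.append(("woman and man", tone_a, tone_b))
    # Assumption: for same-gender pairs, swapping the two tones adds nothing new,
    # so unordered pairs (with repetition) are counted.
    for name in ("two men", "two women"):
        for tone_a, tone_b in combinations_with_replacement(SKIN_TONES, 2):
            combos.append((name, tone_a, tone_b))
    return combos


combos = holding_hands_combinations()
uniform = [c for c in combos if c[1] == c[2]]
mixed = [c for c in combos if c[1] != c[2]]
print(len(combos), len(mixed), len(uniform))  # 55 40 15
```

Under that assumption the count splits into 40 mixed and 15 uniform combinations, which agrees with the totals given in the implementer notes.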

The following notes are relevant for implementers:
  1. The 40 holding-hands emoji with mixed skin tones have a simpler internal representation, compared to the previous draft. The 15 with uniform skin tones use a single character plus skin-tone modifiers.
  2. Implementations may optionally support all combinations of mixed skin tones for the 6 multi-person emoji in the B group. This can be a large number — over 4,000 for the family emoji alone — and thus may not be practical for all devices.
  3. Clearer definitions are now provided in the specification, along with a new set for Basic_Emoji. For other details, see the specification.
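
As a rough illustration of note 1, a uniform skin-tone variant can be built by appending one skin-tone modifier (U+1F3FB through U+1F3FF) to the single holding-hands character. This is a minimal sketch: the helper names are hypothetical, and the mixed-tone sequences are left out because their representation is not spelled out here.

```python
# Code points of the three holding-hands emoji.
HOLDING_HANDS = {
    "woman and man": "\U0001F46B",
    "two men": "\U0001F46C",
    "two women": "\U0001F46D",
}

# The five emoji skin-tone modifiers.
SKIN_TONE_MODIFIERS = {
    "light": "\U0001F3FB",
    "medium-light": "\U0001F3FC",
    "medium": "\U0001F3FD",
    "medium-dark": "\U0001F3FE",
    "dark": "\U0001F3FF",
}


def uniform_holding_hands(pair, tone):
    """Return the base emoji followed by a single skin-tone modifier."""
    return HOLDING_HANDS[pair] + SKIN_TONE_MODIFIERS[tone]


def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


sequence = uniform_holding_hands("two women", "dark")
print(code_points(sequence))  # ['U+1F46D', 'U+1F3FF']

# Three emoji times five tones gives the 15 uniform variants.
all_uniform = [
    uniform_holding_hands(pair, tone)
    for pair in HOLDING_HANDS
    for tone in SKIN_TONE_MODIFIERS
]
print(len(all_uniform))  # 15
```

Whether a given platform renders such a sequence as a single glyph depends on its font and emoji support.
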
The complete list of emoji sequences for Emoji 12.0 will be finalized during the next UTC meeting in January 2019. The CLDR English names and keywords for the new emoji characters will be finalized within the next month, and translation into 80+ languages (such as Slavic languages) will begin. Feedback is welcome on the sorting order and the English names and keywords.




Tuesday, October 16, 2018

ICU 63 Released

Unicode® ICU 63 has just been released. It updates to CLDR 34 locale data with many additions and corrections, and some new languages. ICU adds an API for number and currency range formatting, and an API for additional Unicode properties and for constructing custom properties. CLDR and ICU include data for testing readiness for the upcoming Japanese calendar era.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see

Monday, October 15, 2018

CLDR Version 34 Language/Locale Data Released

Version 34 is the latest version of CLDR, the core open-source language data that major software systems use to adapt software to the conventions of over 80 different languages. CLDR data is used by many products for Unicode and language support, including Android, Cloudant, Chrome OS, Db2, iOS, macOS, Windows, and many others.

CLDR 34 included a full Survey Tool data collection phase, increasing coverage to 85 languages at the “modern” (full) level, 4 at the lower “moderate” level (suitable for document content), 18 at the basic level, and about 100 others that don’t meet the level requirements.

Among the other changes: new units were added (e.g., atmosphere, petabyte); many new emoji keywords and names were corrected/refined, with updated emoji sort order; and preparations for the New Japanese Era (affecting most software for Japan) were made. The specification was also updated with many changes for Unicode Locale Identifier and BCP 47 Conformance sections, plus defining the syntax of unit identifiers. For other changes, details, and links to documentation, see the CLDR 34 Release Notes.

