Wednesday, June 27, 2018

New Gold Sponsor dotFM .FM TLD

The Unicode Consortium is pleased to announce that dotFM .FM TLD is now a gold sponsor for:

dotFM .FM TLD's sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. Adopt a Character (AAC) donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

BRS Media’s dotFM is pleased to sponsor Adopt a Character. This year, dotFM launched Emoji Domains within the .FM Top-Level Domain. Emoji domain is a domain name with an expressive digital image or icon in it. dotFM pioneered the ‘multimedia’ domain space since launching the .FM Top Level Domains in 1998. Today, the comprehensive portfolio of registrants not only includes broadcasters, Internet radio and the music community, but also interactive companies, premier social media ventures and podcast entrepreneurs worldwide.  — dotFM .FM TLD

The Unicode Consortium thanks dotFM .FM TLD for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 140,000 other characters are available for adoption; see Adopt a Character.

Wednesday, June 20, 2018

ICU 62 Released

Unicode® ICU 62 has just been released. It upgrades to Unicode 11 and to CLDR 33.1 locale data. A new syntax for locale-neutral number skeleton strings can be used in MessageFormat for more control over number formatting. Several still-draft NumberFormatter methods and helper classes have been modified or renamed. In C++, DecimalFormat wraps the new NumberFormatter code, and there is a new implementation for number parsing.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see


Over 130,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.


CLDR Version 33.1 Language/Locale Data Released for Unicode 11.0

Unicode CLDR 33.1 adds support for the recently released Unicode 11.0. Version 33.1 is the latest version of CLDR, the core open-source language data that major software systems use to adapt software to the conventions of over 80 different languages. The open-source Unicode ICU library incorporates the CLDR Version 33.1 data as part of its update to Unicode 11.0 in its ICU 62 release. ICU code is used by many products for Unicode and language support, including Android, Cloudant, ChromeOS, Db2, iOS, macOS, Windows, and many others.

The CLDR 33.1 release focuses on updates for Unicode 11.0: new names and keywords for the Unicode 11.0 emoji, Chinese collation stroke order, and script metadata. In addition, there are major improvements for names and annotations for the pre-11.0 emoji in CLDR languages. More extensive updates are planned for CLDR 34 (release expected in early October), with data submission still continuing.

For further details and links to documentation, see the CLDR 33.1 Release Notes.




Tuesday, June 5, 2018

Announcing The Unicode® Standard, Version 11.0

Version 11.0 of the Unicode Standard is now available, both the core specification and data files. Version 11.0 adds 684 characters, for a total of 137,374 characters. These additions include seven new scripts, for a total of 146 scripts, as well as 66 new emoji characters.
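Whether a given runtime already knows the Version 11.0 repertoire depends on the Unicode Character Database it ships with. The following is a minimal sketch in Python, not part of the announcement, that reports the bundled database version and tests whether a code point is assigned. The helper names are hypothetical, and U+1C90 is used on the assumption that it is the first of the Georgian Mtavruli capital letters added in Version 11.0.

```python
import unicodedata


def unicode_db_version():
    """Return the runtime's Unicode database version as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def is_assigned(code_point):
    """Return True if the runtime's database assigns a character here."""
    # General category "Cn" means "unassigned" in the Unicode Character Database.
    return unicodedata.category(chr(code_point)) != "Cn"


def supports_unicode_11():
    """Return True if the bundled database is Version 11.0 or later."""
    return unicode_db_version() >= (11, 0)


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    # Assumption: U+1C90 is a Mtavruli capital letter introduced in 11.0.
    print("U+1C90 assigned:", is_assigned(0x1C90))
    print("Supports Unicode 11.0:", supports_unicode_11())
```

On a runtime whose database predates Version 11.0, the last two lines would be expected to report False; the result reflects the interpreter's bundled data, not the fonts or operating system.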

The new scripts and characters in Version 11.0 add support for lesser-used languages and unique written requirements worldwide, including:
  • Georgian Mtavruli capital letters, newly added to support modern casing practices
  • Hanifi Rohingya, used to write the modern Rohingya language in Southeast Asia
  • Medefaidrin, used for modern liturgical purposes in Africa
  • Mazahua, a Mesoamerican language recognized by law in Mexico
  • Mayan numerals used in printed materials in Central America
  • Historic Sanskrit, Gurmukhi, and the Buryats
  • Five urgently needed CJK unified ideographs: three for chemical names and two for Japan's government administration
Popular symbol additions:
  • Copyleft symbol
  • Half stars for rating systems
  • More astrological symbols
  • Xiangqi Chinese chess symbols
  • New emoji characters including:
🦸 👨🏽‍🦰
🧸 🦞
🧨 🥳

For the full list of emoji characters, see emoji additions for Unicode 11.0, and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji. Version 11.0 also includes other improvements for emoji handling:
  • a mechanism to request the glyph direction for emoji
  • descriptions of the four new emoji hair components
  • descriptions of gender neutral emoji
  • simplified statements of emoji-related rules for grapheme cluster boundaries and for word boundaries.
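To illustrate how the hair components combine with existing emoji, here is a small sketch in Python that breaks the red-haired man emoji shown above into its code points. It assumes the sequence is man (U+1F468), medium skin tone (U+1F3FD), zero width joiner (U+200D), and the red hair component (U+1F9B0); the function names are hypothetical and chosen only for this example.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER, which glues emoji into one sequence

# Assumed composition: man + medium skin tone + ZWJ + red hair component.
RED_HAIRED_MAN = "\U0001F468\U0001F3FD\u200D\U0001F9B0"


def code_points(text):
    """List each code point in the text using U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def split_on_zwj(text):
    """Split an emoji ZWJ sequence into the pieces joined by ZWJ."""
    return text.split(ZWJ)


if __name__ == "__main__":
    print(code_points(RED_HAIRED_MAN))
    # ['U+1F468', 'U+1F3FD', 'U+200D', 'U+1F9B0']
    print(len(split_on_zwj(RED_HAIRED_MAN)))  # 2
```

A single user-perceived emoji is therefore four code points here, which is one reason the grapheme cluster boundary rules matter for cursor movement and deletion.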
Three other important Unicode specifications have been updated for Version 11.0:

Unicode 11.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications, often in coordination with changes to character properties. In particular, there are changes to:

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases.


All the new characters, including the new emoji, are now available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.