Tuesday, June 21, 2016

Announcing The Unicode® Standard, Version 9.0

🥂 Version 9.0 of the Unicode Standard is now available. Version 9.0 adds exactly 7,500 characters, for a total of 128,172 characters. These additions include six new scripts and 72 new emoji characters.

The new scripts and characters in Version 9.0 add support for lesser-used languages worldwide, including:
  • Osage, a Native American language
  • Nepal Bhasa, a language of Nepal
  • Fulani and other African languages
  • The Bravanese dialect of Swahili, used in Somalia
  • The Warsh orthography for Arabic, used in North and West Africa
  • Tangut, a major historic script of China

Important symbol additions include:
  • 19 symbols for the new 4K TV standard
  • 72 emoji characters, such as 🦋 BUTTERFLY (in the Animals category)

For the full list, see emoji additions for Unicode 9.0. For a detailed description of support for emoji characters by the Unicode Standard, see UTR #51, Unicode Emoji.
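
One way to check whether a given runtime already knows the new characters is to look them up by name. The following is a minimal sketch, assuming a Python 3 interpreter whose `unicodedata` module is built on Unicode 9.0 or later; the `SAMPLES` list and the `describe` helper are illustrative names, not part of any Unicode tooling.

```python
import unicodedata

# A few characters first assigned in Unicode 9.0, written as escapes so the
# source file stays ASCII-only: butterfly, shrug, bacon, and an Osage letter.
SAMPLES = ["\U0001F98B", "\U0001F937", "\U0001F953", "\U000104B0"]


def describe(ch):
    """Return (code point label, character name); name is None if unknown."""
    return "U+%04X" % ord(ch), unicodedata.name(ch, None)


if __name__ == "__main__":
    # unidata_version reports which Unicode version this interpreter ships.
    print("Unicode data version:", unicodedata.unidata_version)
    for ch in SAMPLES:
        print(*describe(ch))
```

On an interpreter built on older Unicode data, `describe` returns `None` for the name, which is a simple signal that the platform has not yet been updated to Version 9.0.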

Three other important Unicode specifications have been updated for Version 9.0: UTS #10, UTS #39, and UTS #46.

Some of the changes in Version 9.0 and associated Unicode technical standards and reports may require modifications in implementations. For more information, see Unicode 9.0 Migration and the migration sections of UTS #10, UTS #39, UTS #46, and UTR #51. For full details on Version 9.0, see http://unicode.org/versions/Unicode9.0.0/

Thursday, June 16, 2016

Unicode 9.0 Emoji Available for Adoption

The Unicode Consortium’s Adopt-a-Character program is an opportunity to permanently adopt and dedicate an emoji, letter or any symbol on the keyboard. The new Unicode 9.0 emoji are now available for adoption, including 🤷 (shrug), 🤦 (face palm), 🤞 (crossed fingers), 🥓 (bacon), and 68 others. The funds help the consortium’s work of supporting the world’s languages in digital form.

We welcome sponsors of the new characters to join existing sponsors like Elastic in helping to further the work of the Unicode Consortium.

The emoji charts have also been updated with these new emoji, and with new images from Messenger, EmojiOne, EmojiXpress, and others. Soon after Unicode 9.0 is released, the other new Unicode 9.0 characters will be available for adoption.

Monday, June 6, 2016

72 New Emoji Characters

The 72 new emoji characters for Unicode 9.0 are now final, and are listed in Emoji Recently Added. They include 7 faces, 7 people, 7 hand gestures, 14 plants/animals, 18 food emoji, 12 sports emoji, and a few others. The corresponding documentation in UTR #51 Unicode Emoji, Version 3.0 has also been updated, with additional guidelines for implementers and the new versions of the emoji data files. The new emoji should appear on smartphones and other devices that support emoji once vendors have a chance to update them.

Four of the new emoji are added to complete gender pairs. Work has already begun on Version 4.0 of Unicode Emoji, which focuses on further enhancing gender representation and is targeted to appear in the near future.

The new emoji characters will soon be available for adoption, helping support projects to improve language support.

Friday, June 3, 2016

Encoding the Mayan Script: your Adopt-a-Character sponsorships at work

The first grant of funds from Unicode’s Adopt-a-Character program has been awarded to UC Berkeley’s Script Encoding Initiative (SEI), for the first two phases of a project to include Mayan hieroglyphs as Unicode characters.

Thanks go to our sponsors for providing funds to support this grant. Adopting a character helps the Unicode Consortium in its goal to support the world’s languages.

Mayan hieroglyphs were used from 250 BCE until the 1500s. Mayan textual records include historical, literary, religious, and mythological information, as well as a sophisticated mathematical system on par with that of the Romans. Mayan astronomical records continue to capture the attention of astronomers today. Including Mayan hieroglyphs as Unicode characters will allow them to be used on computers around the world. See more about Mayan.

Mayan is a complex script, requiring special support in layout and presentation. The first phase is a catalog and analysis of the Dresden codex, resulting in a draft set of Unicode atomic signs and composition mechanisms needed for full Mayan text. The second phase is based on that analysis: preparation of a proposal for layout and presentation mechanisms in Unicode text, using those atomic elements. These two phases are to be completed in 2017.

Wednesday, May 18, 2016

ICU joins the Unicode Consortium

Today we are welcoming the ICU project into the Unicode Consortium.

Every smartphone and laptop uses the Unicode encoding and Unicode CLDR data for language support: from Arabic to Japanese to Zulu — and even plain English. The Unicode Consortium provides the data, but has not provided software to directly use that data, until now.

The ICU (International Components for Unicode) project has long provided software that implements the Unicode data and algorithms. ICU is a mature, very widely deployed set of C/C++ and Java software libraries, open-sourced since 1999 under the stewardship of IBM. When you see a date or number written in your language on your smartphone, for example, or a list of sorted names, the formatting and sorting are done with ICU.

There has long been a close working relationship between the various Unicode Consortium committees and the ICU team, with many people working on Unicode projects as well as ICU. That has ensured that Unicode data and algorithms can be effectively and quickly implemented.

IBM made the decision to transfer ICU to the Unicode Consortium so that ICU could benefit from the formal and open governance that the Unicode Consortium offers. “IBM has a long history in our commitment to open standards as a driver of innovation for our customers worldwide,” said Helena Chapman, IBM Globalization Executive. Moving ICU under the Unicode Consortium provides a cross-industry, open source collaboration that will drive greater consistency and interoperability across computing platforms, to the benefit of technology users worldwide. IBM has been an active member of the Unicode Consortium since its inception, and is pleased to see this further consolidation of foundational open source globalization standards.

The ICU team has become a new Consortium technical committee, along with the other Unicode committees. ICU will be released under the Unicode open-source license (similar to the previous license), just like the Unicode Character Database and the CLDR data. For users of ICU, we’ll try to make this transition as smooth as possible.

The Unicode Consortium and the ICU team would like to thank IBM for many years of project stewardship, as well as for major past and ongoing contributions to the project.

For more information, see http://site.icu-project.org/