The Unicode Blog: 2018

Tuesday, December 18, 2018

Unicode Board of Directors Election Results

The Unicode Consortium announces the election of director Tim Brandall for a one year term beginning January 2019. Michele Coady has decided to retire.

Tim has over 19 years of experience in the globalization industry, working in an internationalization capacity for companies like Apple, Vivendi Universal, and most recently Netflix. He has built and led the internationalization team at Netflix for over 6 years, taking the Netflix product from a US-only service to a truly global product available in 190 countries and 27 languages. Much of his work at Netflix has revolved around innovation to support globalization at scale. Tim has a software engineering background, holding a degree in Computer Science.

For the listing of current directors and officers of the Consortium please see Unicode Directors, Officers and Staff.

Adopt-a-Character

Over 130,000 characters are available for adoption, to help the Unicode Consortium’s work on digitally disadvantaged languages.

Wednesday, December 5, 2018

Support Unicode with an Adopt-a-Character Gift this Holiday Season!

This holiday season you can give a unique gift by adopting any emoji, letter, or symbol — and help support the Unicode Consortium’s mission to enable all languages to be used on computers. Three levels of sponsorship are available, starting at $100. With over 130,000 characters to choose from, you are certain to find an appropriate character, for even the most demanding recipient. All sponsors will receive a custom digital badge featuring the adopted character for use on the web and elsewhere. Sponsors at the two highest levels will receive a special thank-you gift engraved with the name you supply and the adopted character.

The program funds work on “digitally disadvantaged” languages, both modern and historic. In 2018 the program awarded grants to support work on improved keyboard layouts, additional work on Mayan hieroglyphs, and more historic Indic scripts, among others.

To date, the Adopt-a-Character program has had over 500 sponsors. Be part of the next wave, with a worthwhile gift!

For more information on the program, or to adopt a character, see the Adopt-a-Character Page.

Monday, November 5, 2018

Unicode 12.0 Beta Review

The beta review period for Unicode 12.0 has started. The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 12.0 includes a number of changes and 554 new characters. Some of the Unicode Standard Annexes have modifications for Unicode 12.0, often in coordination with changes to character properties. In particular, there are minor changes to UAX #29, Unicode Text Segmentation, to account for differences in Georgian casing behavior. Four new scripts have been added in Unicode 12.0. There are also 61 additional emoji characters, as well as very significant enhancements to the representation and behavior of multiperson emoji.

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by January 7, 2019. Feedback instructions are on the beta page.

See http://unicode.org/versions/beta-12.0.0.html for more information about testing the 12.0.0 beta.

See http://unicode.org/versions/Unicode12.0.0/ for the current draft summary of Unicode 12.0.0.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. Members include: Adobe, Apple, Emojipedia, Facebook, Google, Government of Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging, Netflix, Sultanate of Oman MARA, Oracle, SAP, Shopify, Tamil Virtual University, The University of California (Berkeley), plus well over a hundred Associate, Liaison, and Individual members. For a complete member list go to http://www.unicode.org/consortium/members.html.

Tuesday, October 23, 2018

Draft Candidates for Emoji 12.0 Beta (2019)

The Emoji 12.0 Beta contains 236 Emoji Draft Candidates, consisting of 61 characters plus 175 sequences. These are slated for release in 2019Q1 together with Unicode Version 12.0.

The emoji are in the following categories: 3 smileys & emotion, 209 people & body, 7 animals & nature, 9 food & drink, 6 travel & places, 3 activities, 15 objects, and 12 miscellaneous symbols. 50 of the new emoji (including gender/skin-tone variants) are for accessibility, such as ear with hearing aid and woman in manual wheelchair. The hearts, circles, and squares now have the same set of colors for decorative and/or descriptive uses.

Multi-person emoji now have skin-tone variants:

(A) Full Emoji v12.0 support requires that the holding-hands emoji (👫 👬 👭) with specific genders be supported with 55 combinations of mixed skin tones, such as:

man with dark skin tone and woman with light skin tone holding hands
woman with medium skin tone and woman with medium light skin tone holding hands
man with light skin tone and man with light skin tone holding hands

(B) Full Emoji v12.0 support requires that the 6 multi-person emoji (👯 🤼 🤝 💏 💑 👪) without specific gender be supported with the 5 human skin tones, such as:

family (adult+adult+child) with dark skin tone
couples with heart (adult+adult) with medium skin tone
couples kissing (adult+adult) with light skin tone

A mechanism is provided for mixed skin tones for emoji in group B, such as with a family of man+woman+girl+boy, but support is optional.

The following notes are relevant for implementers:

The 40 holding-hands emoji with mixed skin tones have a simpler internal representation, compared to the previous draft. The 15 with uniform skin tones use a single character plus skin-tone modifiers.
Implementations may optionally support all combinations of mixed skin tones for the 6 multi-person emoji in the B group. This can be a large number — over 4,000 for the family emoji alone — and thus may not be practical for all devices.
Clearer definitions are now provided in the specification, along with a new set for Basic_Emoji. For other details, see the specification.
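
The count of 55 holding-hands combinations in group A, split into 40 mixed and 15 uniform, can be checked with a short enumeration. This is only a sketch under stated assumptions: woman-and-man pairs are treated as ordered (25 tone pairings), same-gender pairs as unordered (15 each), and mixed-tone pairs are spelled as person plus tone, ZWJ, handshake, ZWJ, person plus tone. That spelling is an assumption about the simpler representation mentioned above; the specification and data files remain authoritative.

```python
from itertools import combinations_with_replacement, product

TONES = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]  # the five skin-tone modifiers
WOMAN, MAN = "\U0001F469", "\U0001F468"
ZWJ, HANDSHAKE = "\u200D", "\U0001F91D"
# Single-character couples, used when both people share one skin tone.
COUPLES = {
    (WOMAN, MAN): "\U0001F46B",
    (MAN, MAN): "\U0001F46C",
    (WOMAN, WOMAN): "\U0001F46D",
}

def holding_hands():
    """Yield (sequence, is_mixed) for every gendered holding-hands combination."""
    for (left, right), couple in COUPLES.items():
        if left == right:
            # Assumption: same-gender pairs are unordered.
            pairs = combinations_with_replacement(TONES, 2)
        else:
            # Assumption: woman+man pairs are ordered.
            pairs = product(TONES, repeat=2)
        for t1, t2 in pairs:
            if t1 == t2:
                yield couple + t1, False  # single character plus one modifier
            else:
                yield left + t1 + ZWJ + HANDSHAKE + ZWJ + right + t2, True

sequences = list(holding_hands())
mixed = [s for s, is_mixed in sequences if is_mixed]
print(len(sequences), len(mixed), len(sequences) - len(mixed))  # 55 40 15
```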

The complete list of emoji sequences for Emoji 12.0 will be finalized during the next UTC meeting in January 2019. The CLDR English names and keywords for the new emoji characters will be finalized within the next month, and translation into 80+ languages (such as Slavic languages) will begin. Feedback is welcome on the sorting order and the English names and keywords.

Tuesday, October 16, 2018

ICU 63 Released

Unicode® ICU 63 has just been released. It updates to CLDR 34 locale data with many additions and corrections, and some new languages. ICU adds an API for number and currency range formatting, and an API for additional Unicode properties and for constructing custom properties. CLDR and ICU include data for testing readiness for the upcoming Japanese calendar era.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see http://site.icu-project.org/download/63.

Monday, October 15, 2018

CLDR Version 34 Language/Locale Data Released

Version 34 is the latest version of CLDR, the core open-source language data that major software systems use to adapt software to the conventions of over 80 different languages. CLDR data is used by many products for Unicode and language support, including Android, Cloudant, Chrome OS, Db2, iOS, macOS, Windows, and many others.

CLDR 34 included a full Survey Tool data collection phase increasing to 85 languages at the “modern” (full) level, 4 at the lower “moderate” level (suitable for document content), 18 at the basic level, and about 100 others that don’t meet the level requirements.

Among the other changes: new units were added (e.g., atmosphere, petabyte); many new emoji keywords and names were corrected/refined, with updated emoji sort order; and preparations for the New Japanese Era (affecting most software for Japan) were made. The specification was also updated with many changes for Unicode Locale Identifier and BCP 47 Conformance sections, plus defining the syntax of unit identifiers. For other changes, details, and links to documentation, see the CLDR 34 Release Notes.

Tuesday, October 9, 2018

Unicode Arabic Mark Rendering UTR #53 Now Published

The combining classes of Arabic combining characters in Unicode are different from the combining classes in most other scripts. They are a mixture of special classes for specific marks plus two more generalized classes for all the other marks. This has resulted in inconsistent and/or incorrect rendering for sequences with multiple combining marks since Unicode 2.0.

The Arabic Mark Transient Reordering Algorithm (AMTRA) described in UTR #53 is the recommended solution to achieving correct and consistent rendering of Arabic combining mark sequences. This algorithm provides results that match user expectations and assures that canonically equivalent sequences are rendered identically, independent of the order of the combining marks.
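
To make the combining-class situation concrete, the following sketch prints the canonical combining class of a few Arabic marks and shows that two different mark orders are canonically equivalent. It does not implement AMTRA; it only uses Python's standard `unicodedata` module, and the choice of marks and the base letter BEH is an illustrative assumption.

```python
import unicodedata

# A few Arabic combining marks: some have mark-specific classes (30-33),
# others fall into the generalized "below" (220) and "above" (230) classes.
MARKS = {
    "FATHA": "\u064E",
    "KASRA": "\u0650",
    "SHADDA": "\u0651",
    "MADDAH ABOVE": "\u0653",
    "HAMZA BELOW": "\u0655",
}

for label, ch in MARKS.items():
    print(f"U+{ord(ch):04X} {label}: ccc={unicodedata.combining(ch)}")

BEH = "\u0628"
a = BEH + MARKS["SHADDA"] + MARKS["FATHA"]  # shadda typed first
b = BEH + MARKS["FATHA"] + MARKS["SHADDA"]  # fatha typed first

# Different code point orders, but the same text after canonical reordering.
same = unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)
print(a != b, same)  # True True
```

Because such sequences are canonically equivalent, a renderer should display them identically, which is the property the algorithm is meant to guarantee.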

The concepts in this algorithm were first proposed four years ago by Roozbeh Pournader. We are pleased it has now been published as an official Technical Report.

Monday, October 8, 2018

Unicode Board of Directors Election Results

The Unicode Consortium announces the election of four Directors for three year terms beginning January 2019: Bob Jung, Iris Orriss, Alolita Sharma, and Greg Welch. A fifth candidate, Michele Coady, was elected for a one year term.

Michele Coady and Iris Orriss join the Consortium Board of Directors for the first time. Bob Jung, Alolita Sharma, and Greg Welch have been re-elected to continue their service as Directors.

Michele Coady is a Director of Global Readiness at Microsoft, responsible for the Microsoft Global Readiness policy, which includes driving geopolitical, globalization and internationalization compliance, risk management and awareness company-wide. She has been providing geopolitical support and guidance for the Microsoft Unicode emoji work for several years.

Iris Orriss serves as Director of Internationalization at Facebook. She has been with Facebook since January 2013 and is passionate about eliminating language and cultural barriers on the internet. Her work focuses on growing Facebook in international markets. In addition, Iris is a member of the board at Translators without Borders, a nonprofit organization that provides vital information in the right language at the right time. Prior to Facebook, Iris was a director at Microsoft working on product internationalization and development process in the enterprise and language technology divisions. She is a native of Germany, speaks four languages, and was educated at Freie Universität Berlin.

The Unicode Consortium would like to thank Dachuan Zhang who will step down in 2019 after four years as a member of the Board of Directors.

For the listing of current directors and officers of the Consortium please see Unicode Directors, Officers and Staff.

Monday, October 1, 2018

New Unicode Technical Director

The Unicode Consortium would like to welcome a new Technical Director, Dr. Ken Lunde.

Ken Lunde has worked at Adobe since 1991, specializing in CJKV Type Development, meaning that he develops East Asian fonts, along with the specifications on which they are based. He architected the Adobe-branded “Source Han” and Google-branded “Noto CJK” open source Pan-CJK typeface families that were released in 2014 and 2017, is the author of “CJKV Information Processing” Second Edition that was published by O’Reilly Media at the end of 2008, and frequently publishes articles on Adobe’s CJK Type Blog. Ken holds BA, MA, and PhD degrees in linguistics from The University of Wisconsin-Madison. Ken has been Adobe’s representative to Unicode since 2006, has been the primary representative since 2015, serves as the IVD Registrar, participates in the Unicode Editorial Committee, and received the 2018 Unicode Bulldog Award.

For the listing of current directors and officers of the Consortium please see Unicode Directors, Officers and Staff.

Friday, September 14, 2018

Unicode CLDR 34 alpha available for testing

The alpha version of Unicode CLDR 34 is available for testing. The alpha period lasts until the beta release on September 26, which will include updates to the LDML spec. The final release is expected on October 10.

CLDR 34 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks.

CLDR 34 included a full Survey Tool data collection phase. Other enhancements include several changes to prepare for the new Japanese calendar era starting 2019-05-01; updated emoji names, annotations, collation and grouping; and other specific fixes. The draft release page at http://cldr.unicode.org/index/downloads/cldr-34 lists the major features, and has pointers to the newest data and charts. It will be fleshed out over the coming weeks with more details, migration issues, known problems, and so on. Particularly useful for review are:

Delta Charts - the data that changed during the release
By-Type Charts - a side-by-side comparison of data from different locales
Annotation Charts - new emoji names and keywords

Please report any problems that you find using a CLDR ticket. We'd also appreciate it if programmatic users of CLDR data download the xml files and do a trial integration to see if any problems arise.

Thursday, September 6, 2018

New Japanese Era

A new era in the Japanese calendar is expected to begin on May 1, 2019, following the announced abdication of Japanese Emperor Akihito. This era will be represented in dates by two names: one consisting of a sequence of two existing kanji and one consisting of a new single Japanese character that combines those two. (Similarly, the current era Heisei can be represented by either “平成” or “㍻”.)
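
The relationship between the two forms of an era name can be inspected with Python's standard `unicodedata` module. This is a small sketch for the existing Heisei character only: the single-character form carries a compatibility decomposition to the two kanji, so NFKC normalization maps one to the other.

```python
import unicodedata

heisei_square = "\u337B"  # the single-character form, ㍻

# The character name as recorded in the Unicode Character Database.
print(unicodedata.name(heisei_square))  # SQUARE ERA NAME HEISEI

# Compatibility normalization expands it to the two-kanji form.
two_kanji = unicodedata.normalize("NFKC", heisei_square)
print(two_kanji, [f"U+{ord(c):04X}" for c in two_kanji])
```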

The Japanese calendar system and support for era names are essential for important public sector business functions. Therefore, most software distributed in Japan will need to adopt the new era name and add font support for the new character.

The current Heisei era has been in place since 1989 — during the evolution of modern computer systems. Because of this, most software systems have not been tested for such an event. The exact date of the announcement of the new era name is unknown, but current expectations are that there will be a very narrow window for implementing the new era information in IT environments, perhaps less than a month. Until the announcement, dates in 2019 and beyond will continue to be written with the Heisei era name and its year numbering.

To prepare as well as possible for this unprecedented event, the Unicode Consortium has taken the following actions:

The code point U+32FF has been reserved for the new era character.
Once the new era name is announced, the Unicode Consortium will quickly issue a dot-release (Version 12.1) that will add that character at the reserved code point, U+32FF, with an appropriate character name, decomposition, and representative glyph.
Unicode CLDR and ICU are including test mechanisms in the 2018 October releases of CLDR 34 and ICU 63. Systems that use CLDR or ICU (all smartphones, for example) can test using these mechanisms.
Systems and applications that do not use CLDR or ICU will need to take similar steps for testing.
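
For systems that do not use CLDR or ICU, one way to take similar steps is to keep era boundaries in a data table and run tests against a table that already contains a placeholder era. The sketch below is hypothetical throughout: the function `era_year`, the table names, and the placeholder name "〓〓" are assumptions for illustration, not part of any Unicode, CLDR, or ICU mechanism.

```python
from datetime import date

# Start dates (Gregorian) of the two most recent eras.
ERAS = [
    (date(1926, 12, 25), "昭和"),  # Showa
    (date(1989, 1, 8), "平成"),    # Heisei
]

# Hypothetical test table: a placeholder era beginning on the expected date.
TEST_ERAS = ERAS + [(date(2019, 5, 1), "〓〓")]

def era_year(d, eras=ERAS):
    """Return (era_name, year_in_era) using the latest era that has started by d."""
    started = [(start, name) for start, name in eras if start <= d]
    if not started:
        raise ValueError("date precedes the era table")
    start, name = max(started)
    return name, d.year - start.year + 1

# Without the new era, dates in 2019 keep Heisei numbering.
print(era_year(date(2019, 5, 1)))             # ('平成', 31)
# With the placeholder era, the same date falls in year 1 of the new era.
print(era_year(date(2019, 5, 1), TEST_ERAS))  # ('〓〓', 1)
```

Keeping the table as data rather than code means that only the placeholder entry has to be replaced once the real name is announced.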

The short time window between the actual announcement and the effective date will present challenges to the IT industry. IT systems in Japan will be expected to have the support in place seamlessly. Because of the narrow timeframe and the need to upgrade or patch legacy software, it is important to start now to determine how soon your application/system can add support to your current implementations, stacks, and dependencies.

Thursday, August 23, 2018

IUC 42: Keynote Speaker Announced

The Advent of Mayan Script Encoding: Mapping the Last Frontiers of Mayan Hieroglyphic Decipherment

Carlos Pallan Gayol
Archaeologist & Epigrapher, Dept. of Old American Studies & Ethnology, University of Bonn

Mayan hieroglyphs rank among the most visually complex writing systems ever created. Deciphering them has entailed a 200+ year scholarly quest, but this task is not yet completed and poses an inviting challenge for applying new tools from the information age, culminating in the encoding of the Mayan script. Join us Tuesday morning, September 11th, as this keynote highlights the latest milestones attained in this pursuit by the NcodeX Project, where Carlos Pallan collaborates with Dr. Deborah Anderson, Researcher, Dept. of Linguistics, UC Berkeley, the Script Encoding Initiative and members of the Unicode advisory board. Stemming from research funded by Unicode’s Adopt-a-Character Program, it has been possible to produce new database tools and advanced functionalities, capable of mapping and analyzing all the textual contents of the extant Mayan books or Codices by relying on a novel catalog of Mayan signs with assigned code points.

See What’s Happening At IUC 42

For over 27 years the Internationalization & Unicode® Conference (IUC) has been the preeminent event highlighting the latest innovations and best practices of global and multilingual software providers. Join us in Santa Clara to promote your ideas and experiences working with natural languages, multicultural user interfaces, producing and supporting multinational and multilingual products, linguistic algorithms, applying internationalization across mobile and social media platforms, or advancements in relevant standards.

Join expert practitioners and industry leaders as they present detailed recommendations for businesses looking to expand to new international markets and those seeking to improve time to market and cost-efficiency of supporting existing markets. Recent conferences have provided specific advice on designing software for European countries, Latin America, China, India, Japan, Korea, the Middle East, and emerging markets.

Thursday, August 9, 2018

More Emoji Draft Candidates for 2019

There are now 179 proposed Emoji Draft Candidates (61 characters plus variants) for 2019. These are the short-listed candidates for Emoji 12.0, which is planned for release in 2019Q1 together with Unicode 12.0.

The following changes were made in the recent Unicode Technical Committee (UTC) meeting:

Added a candidate emoji for deaf person
Changed service animal vest to safety vest, and added a candidate emoji sequence using it: service dog
Added candidate emoji sequences for couple holding hands, with 55 combinations of skin tone and gender
Changed names and ordering for various characters

The list of draft candidates will be reviewed and finalized in the next UTC meeting, this coming September. Feedback is solicited on short names, keywords, and ordering. See also the Emoji 11.0 charts.

Eight Emoji Provisional Candidates for 2020 were also added (ninja, military helmet, mammoth, feather, dodo, magic wand, carpentry saw, screwdriver). For example:


[images: ninja, magic wand]

Between now and March 2019, these and other Provisional Candidates will be collected. The Unicode emoji subcommittee will then assess the whole set, and make recommendations to the UTC for which emoji to advance to Draft Candidate status for 2020.

Wednesday, July 18, 2018

ICU moves to GitHub and Jira

International Components for Unicode (ICU) is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications. ICU is widely portable and gives applications the same results on all platforms and between C/C++ and Java software.

As of this week, ICU has moved from a self-hosted source code and bug tracking environment, to git on GitHub and Jira on Atlassian Cloud, respectively. Pull requests are welcome, as are bug reports on the new issue tracking system.

For more information, please see the following links:

ICU Repository Access: http://site.icu-project.org/repository
ICU Bug Tracking: http://site.icu-project.org/bugs

Friday, July 13, 2018

Unicode 11.0 Paperback Available

The Unicode 11.0 core specification is now available in paperback book form with a new, original cover design. This edition consists of a pair of modestly priced print-on-demand volumes containing the complete text of the core specification of Version 11.0 of the Unicode Standard.

Each of the two volumes is a compact 6×9 inch US trade paperback size. The two volumes may be purchased separately or together, although they are intended as a set. The cost for the pair is US $16.58, plus postage and applicable taxes. Please visit the description page to order.

Note that these volumes do not include the Version 11.0 code charts, nor do they include the Version 11.0 Standard Annexes and Unicode Character Database, which are all freely available on the Unicode website.

Purchase The Unicode Standard, Version 11.0 - Core Specification

Thursday, July 5, 2018

Unicode Consortium Announces Version 11.0 and Version 12.0 Cover Designs

The Unicode Consortium is pleased to announce the design selected for the cover of the forthcoming print-on-demand publication of The Unicode Standard, Version 11.0. The Unicode Consortium issued an open call for artists and designers to submit cover design proposals. An independent panel reviewed all submitted designs. Because of the accelerated release schedule for Version 12.0 (March 2019), the design for the print-on-demand publication of The Unicode Standard, Version 12.0 was also selected at this time.

Unicode 11.0 Books

The cover for Version 11.0 is an original design by Joyce S. Lee, a graduate student in the UC Berkeley School of Information. Her artwork was inspired by the well-known early 20th-century Bauhaus design school. She explains, “I see numerous parallels between the Bauhaus and the Unicode Consortium, including an intersection of workmanship and technological reproduction, a spirit of collaboration, as well as a widespread cultural influence. With this Bauhaus inspired cover, I thus aim to represent the Unicode Standard as a form of instructional reference for technologists around the world.”

[cover art by Monica Tang]

Cover artwork for Version 12.0 was created by Monica Tang, a computer science student at UC Berkeley. Her design was inspired by the simplicity of the geometric shapes that comprise the diversity of characters and symbols represented in the Unicode Standard. She notes, “Incorporating a variety of shapes and colors into a patterned design, I seek to convey the sheer breadth of the languages covered in the Unicode Standard as well as a sense of commonality.”

Runner-up designs by Feixiong “Hasutai” Liu and Maurice Meilleur were also selected. Hasutai is the founder and chief designer of Sir Sebsihiyan Sibe-Manchu Culture Center. Maurice Meilleur is Assistant Professor of Graphic Design at Appalachian State University.

Hasutai:
[art by Hasutai]

Maurice Meilleur:
[art by Maurice Meilleur]

Wednesday, June 27, 2018

New Gold Sponsor dotFM .FM TLD

The Unicode Consortium is pleased to announce that dotFM .FM TLD is now a gold sponsor for:

dotFM .FM TLD's sponsorship directly funds the work of the Unicode Consortium in enabling modern software and computing systems to support the widest range of human languages. There are approximately 7,000 living human languages. Fewer than 100 of these languages are well-supported on computers, mobile phones, and other devices. AAC donations are used to improve support for digitally disadvantaged languages, and to help preserve the world’s linguistic heritage.

BRS Media’s dotFM is pleased to sponsor Adopt a Character. This year, dotFM launched Emoji Domains within the .FM Top-Level Domain. Emoji domain is a domain name with an expressive digital image or icon in it. dotFM pioneered the ‘multimedia’ domain space since launching the .FM Top Level Domains in 1998. Today, the comprehensive portfolio of registrants not only includes broadcasters, Internet radio and the music community, but also interactive companies, premier social media ventures and podcast entrepreneurs worldwide. — dotFM .FM TLD

The Unicode Consortium thanks dotFM .FM TLD for their support!

All sponsors are listed on Sponsors of Adopted Characters. More than 140,000 other characters are available for adoption — see Adopt a Character.

Wednesday, June 20, 2018

ICU 62 Released

Unicode® ICU 62 has just been released. It upgrades to Unicode 11 and to CLDR 33.1 locale data. A new syntax for locale-neutral number skeleton strings can be used in MessageFormat for more control over number formatting. Several still-draft NumberFormatter methods and helper classes have been modified or renamed. In C++, DecimalFormat wraps the new NumberFormatter code, and there is a new implementation for number parsing.

ICU is a software library widely used by products and other libraries to support the world's languages, implementing both the latest version of the Unicode encoding standard and of the Unicode locale data (CLDR).

For details please see http://site.icu-project.org/download/62

CLDR Version 33.1 Language/Locale Data Released for Unicode 11.0

Unicode CLDR 33.1 adds support for the recently released Unicode 11.0. Version 33.1 is the latest version of CLDR, the core open-source language data that major software systems use to adapt software to the conventions of over 80 different languages. The open-source Unicode ICU library incorporates the CLDR Version 33.1 data as part of its update to Unicode 11.0 in its ICU 62 release. ICU code is used by many products for Unicode and language support, including Android, Cloudant, ChromeOS, Db2, iOS, macOS, Windows, and many others.

The CLDR 33.1 release focuses on updates for Unicode 11.0: new names and keywords for the Unicode 11.0 emoji, Chinese collation stroke order, and script metadata. In addition, there are major improvements for names and annotations for the pre-11.0 emoji in CLDR languages. More extensive updates are planned for CLDR 34 (release expected in early October), with data submission still continuing.

For further details and links to documentation, see the CLDR 33.1 Release Notes.

Tuesday, June 5, 2018

Announcing The Unicode® Standard, Version 11.0

Version 11.0 of the Unicode Standard is now available, both the core specification and data files. Version 11.0 adds 684 characters, for a total of 137,374 characters. These additions include seven new scripts, for a total of 146 scripts, as well as 145 new emoji.

The new scripts and characters in Version 11.0 add support for lesser-used languages and unique written requirements worldwide, including:

Georgian Mtavruli capital letters, newly added to support modern casing practices
Hanifi Rohingya, used to write the modern Rohingya language in Southeast Asia
Medefaidrin, used for modern liturgical purposes in Africa
Mazahua, a Mesoamerican language recognized by law in Mexico
Mayan numerals used in printed materials in Central America
Historic Sanskrit, Gurmukhi, and the Buryats
Five urgently needed CJK unified ideographs: three for chemical names and two for Japan's government administration
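
As a quick check of the first item, the sketch below tests whether a Python runtime's Unicode database already knows the Mtavruli capitals and their case mappings. It assumes only the standard `unicodedata` module; the helper name `mtavruli_casing_supported` is hypothetical, and the check is skipped on runtimes whose database predates Version 11.0.

```python
import unicodedata

def mtavruli_casing_supported():
    """True if this runtime's Unicode database is Version 11.0 or later."""
    major = int(unicodedata.unidata_version.split(".")[0])
    return major >= 11

mkhedruli_an = "\u10D0"  # GEORGIAN LETTER AN (the everyday lowercase form)
mtavruli_an = "\u1C90"   # GEORGIAN MTAVRULI CAPITAL LETTER AN

if mtavruli_casing_supported():
    print(unicodedata.name(mtavruli_an))
    # Uppercasing Mkhedruli yields Mtavruli, and lowercasing reverses it.
    print(mkhedruli_an.upper() == mtavruli_an)
    print(mtavruli_an.lower() == mkhedruli_an)
else:
    print("Unicode database is older than 11.0:", unicodedata.unidata_version)
```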

Popular symbol additions:

Copyleft symbol
Half stars for rating systems
More astrological symbols
Xiangqi Chinese chess symbols
New emoji characters including:

For the full list of emoji characters, see emoji additions for Unicode 11.0, and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji. Version 11.0 also includes other improvements for emoji handling:

a mechanism to request the glyph direction for emoji
descriptions of the four new emoji hair components
descriptions of gender neutral emoji
simplified statements of emoji-related rules for grapheme cluster boundaries and for word boundaries.

Three other important Unicode specifications have been updated for Version 11.0:

UTS #10, Unicode Collation Algorithm — sorting Unicode text
UTS #39, Unicode Security Mechanisms — reducing Unicode spoofing
UTS #46, Unicode IDNA Compatibility Processing — compatible processing of non-ASCII URLs

Unicode 11.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications, often in coordination with changes to character properties. In particular, there are changes to:

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases.

Adopt-a-Character

All the new characters including the new emoji are now available for adoption to help the Unicode Consortium’s work on digitally disadvantaged languages.

Wednesday, May 9, 2018

Emoji Draft Candidates for 2019

104 proposed Emoji Candidates (60 characters plus variants) have advanced to Draft Candidate status for 2019. These are the short-listed candidates for Emoji 12.0, which is planned for release in 2019Q1 together with Unicode 12.0.

The draft candidates include the following:


[images: guide dog, kite, white heart]

See Emoji Candidates for the full list.

That list of draft candidates will be reviewed and finalized this September. Feedback is solicited on short names, keywords, and ordering. See also the Emoji 11.0 charts.

Tuesday, April 17, 2018

Submissions open for 2020 Emoji

The deadline for emoji for 2019 was April 1, so any submissions received after that date are considered for release in 2020.

The submission form has undergone some revision, so please be sure to review the new text before putting together a proposal. There is a limited number of emoji characters considered each year, so be sure to follow the form so that you can provide the best case for any proposed emoji.

The emoji subcommittee has also produced a new page which shows the Emoji Requests submitted so far. You can look at what other people have proposed or suggested. In many cases, people have made suggestions, but have not followed through with complete submission forms, or have submitted forms, but not followed through on requested modifications to the forms.
