Version 10.0 of the Unicode Standard is now available. For the first time,
both the core specification and the data files are available on the same
date. Version 10.0 adds 8,518 characters, for a total of 136,690
characters. These additions include four new scripts, for a total of 139 scripts, as well as 56 new emoji characters.
The new scripts and characters in Version 10.0 add
support for lesser-used languages and unique writing requirements worldwide,
including:
- Masaram Gondi, used to write Gondi in Central and Southeast India
- Nüshu, used by women in China to write poetry and other discourses
until the late twentieth century
- Soyombo and Zanabazar Square, used in historic Buddhist texts to
write Sanskrit, Tibetan, and Mongolian
- Syriac letters used for writing Suriyani Malayalam, also known as
Garshuni and as Syriac Malayalam
- Gujarati signs used for the transliteration of the Arabic script
into Gujarati by Ismaili Khoja communities
- A set of 285 Hentaigana characters used in Japan (historic variants
of Hiragana characters)
- CJK Extension F (7,473 Han characters)
Among important symbol additions are:
- Bitcoin sign
- A set of Typicon marks and symbols
- 56 emoji characters, including mage, coconut, fairy, broccoli, vampire,
and sandwich
For the full list of emoji characters, see the emoji additions for
Unicode 10.0 and Emoji Counts. For a detailed description of support for
emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji.
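Whether a given environment already recognizes these additions depends on the version of the Unicode Character Database it ships with. As a rough sketch, the Python snippet below uses the standard `unicodedata` module to look up a few of the characters named above. The code points in `SAMPLE_CODE_POINTS` are assumptions taken from the published code charts rather than from this announcement, and the helper names are purely illustrative.

```python
import unicodedata

# Code points are assumptions based on the published Unicode code charts;
# they are not stated in the announcement itself.
SAMPLE_CODE_POINTS = [
    0x20BF,   # Bitcoin sign
    0x1F9D9,  # mage
    0x1F965,  # coconut
    0x1F9DA,  # fairy
    0x1F966,  # broccoli
    0x1F9DB,  # vampire
    0x1F96A,  # sandwich
]


def runtime_unicode_major():
    """Return the major version of the character database bundled with Python."""
    return int(unicodedata.unidata_version.split(".")[0])


def character_name(code_point):
    """Return the character's formal name, or None if this runtime does not know it."""
    return unicodedata.name(chr(code_point), None)


def report(code_points=SAMPLE_CODE_POINTS):
    """Map 'U+XXXX' labels to names (None where the character is unknown)."""
    return {f"U+{cp:04X}": character_name(cp) for cp in code_points}


if __name__ == "__main__":
    print("Bundled Unicode data version:", unicodedata.unidata_version)
    for label, name in report().items():
        print(label, name if name is not None else "(not in this runtime's database)")
```

On a runtime whose bundled data predates Version 10.0, the lookups simply return `None`, so the same script doubles as a quick compatibility check.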
Three other important Unicode specifications
have been updated for Version 10.0:
Unicode 10.0 includes a number of changes. Some of the Unicode Standard
Annexes have modifications for Unicode 10.0, often in coordination with
changes to character properties. In particular, there are changes to UAX
#14, Unicode Line Breaking Algorithm, UAX #29, Unicode Text Segmentation,
and UAX #31, Unicode Identifier and Pattern Syntax. In addition, UAX #50,
Unicode Vertical Text Layout, has been newly incorporated as a part of the
standard.
The Unicode Standard is the foundation for all modern
software and communications around the world, including all modern operating
systems, browsers, laptops, and smart phones—plus the Internet and Web
(URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated
standards, and data form the foundation for CLDR and ICU releases.
Adopt-a-Character
All 8,518 additional characters, including 239 new emoji, are now available for adoption, to help support the Unicode Consortium’s work on digitally disadvantaged languages.
About the Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, extend
and promote use of the Unicode Standard and related globalization standards.
The membership of the consortium represents a broad spectrum of corporations
and organizations, many in the computer and information processing industry.
Members include: Adobe, Apple, EmojiXpress, Facebook, Google, Government of
Bangladesh, Government of India, Huawei, IBM, Microsoft, Monotype Imaging,
Netflix, Sultanate of Oman MARA, Oracle, Rajya Marathi Vikas Sanstha, SAP,
Symantec, Tamil Virtual University, The University of California (Berkeley),
plus well over a hundred Associate, Liaison, and Individual members. For a
complete member list, go to
http://www.unicode.org/consortium/members.html.