The Unicode Blog: Egyptian hieroglyphs

Tuesday, September 10, 2024

Announcing The Unicode® Standard, Version 16.0

Version 16.0 of the Unicode Standard is now available. This is a major version update that includes new characters and code charts, new data files and annexes, an updated core specification, and updated annexes and synchronized standards.

This version adds 5,185 new characters, for a total of 154,998 characters. The additions include seven new scripts, 3,995 additional Egyptian Hieroglyph characters, seven new emoji characters, and over 700 symbols from legacy computing environments. See the delta code charts for details on all the new scripts and characters. For additional details regarding new emoji, see Emoji Recently Added, v16.0.

In addition to new characters, new “Moji Jōhō Kiban” (文字情報基盤) Japanese source references have been added for over 36,000 CJK unified ideographs. This is reflected in the code charts for virtually all CJK unified ideograph blocks by additional representative glyphs in the “J” column.

The core specification for Version 16.0 is now available for browsing online as per-chapter web pages with “breadcrumb” and other links for easy navigation.

Two new annexes have been added to this version:

UAX #53, Unicode Arabic Mark Rendering: This annex, which was previously published as a Technical Report, specifies an algorithm for handling combining marks when rendering to ensure correct and consistent display of Arabic script text.
UAX #57, Unicode Egyptian Hieroglyph Database (Unikemet): This annex documents the format of the Unikemet.txt data file, which provides information clarifying the identity of Egyptian Hieroglyph characters and properties useful for implementations.

For complete details on Unicode Version 16.0, see https://www.unicode.org/versions/Unicode16.0.0/.

Adopt a Character and Support Unicode’s Mission

Looking to give that special someone a special something?
Or maybe something to treat yourself?
🕉️💗🏎️🐨🔥🚀爱₿♜🍀

Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode’s mission to ensure everyone can communicate in their own languages across all devices.

Each adoption includes a digital badge and certificate that you can proudly display!

Have fun and support a good cause

You can also donate funds or gift stock

As Unicode, Inc. is a US-based, open source, open standards, non-profit 501(c)(3) organization, your contribution may be eligible for a tax deduction. Please consult a tax advisor for details.

Tuesday, September 13, 2022

Announcing The Unicode® Standard, Version 15.0

Version 15.0 of the Unicode Standard is now available, including the core specification, annexes, and data files. This version adds 4,489 characters, bringing the total to 149,186 characters. These additions include two new scripts, for a total of 161 scripts, along with 20 new emoji characters, and 4,193 CJK (Chinese, Japanese, and Korean) ideographs. The new scripts and characters in Version 15.0 add support for modern language groups including:

Nag Mundari, a modern script used to write Mundari, a language spoken in India
A Kannada character used to write Konkani, Awadhi, and Havyaka Kannada in India
Kaktovik numerals, devised by speakers of Iñupiaq in Kaktovik, Alaska for the counting systems of the Inuit and Yupik languages

Among the popular symbol additions are 20 new emoji, including hair pick, maracas, jellyfish, khanda, and pink heart. For the full list of new emoji characters, see emoji additions for Unicode 15.0, and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji.

Other symbol and notational additions include:

The nine pointed white star, used by members of the Bahá’í faith
Eight symbols for celestial bodies, used by astronomers and astrologers
Twenty-nine additional Egyptian hieroglyph format controls, which will enable Egyptologists to better represent texts

Support for other languages and scholarly work includes:

Kawi, a historical script found in Southeast Asia, used to write Old Javanese and other languages
Three additional characters for the Arabic script to support Quranic marks used in Turkey
Three Khojki characters found in handwritten and printed documents
Ten Devanagari characters used to represent auspicious signs found in inscriptions and manuscripts
Six Latin letters used in Malayalam transliteration
Sixty-three Cyrillic modifier letters used in phonetic transcription

Important chart font updates include:

A set of updated glyphs for Egyptian hieroglyphs, in addition to standardized variation sequences to support rotated glyphs found in texts
Improved glyphs for Unified Canadian Aboriginal Syllabics, which provide better support for Carrier and other languages
A new Wancho font, with improved and simplified shapes

Updates to the CJK blocks add:

4,192 ideographs in the new CJK Unified Ideographs Extension H block
One ideograph in the CJK Unified Ideographs Extension C block

Unicode properties and specifications determine the behavior of text on computers and phones. The following six Unicode Standard Annexes and Technical Standards have noteworthy updates for Version 15.0:

UAX #9, Unicode Bidirectional Algorithm, amends the note in UAX9-C2 to emphasize the use of higher-level protocols to mitigate potential source code spoofing attacks.
UAX #31, Unicode Identifier and Pattern Syntax, provides more guidance on profiles for default identifiers, clarifies the use of default ignorable code points in identifiers, and discusses the relationship between Pattern_White_Space and bidirectional ordering issues in programming languages.
UAX #38, Unicode Han Database, adds the kAlternateTotalStrokes property. The kCihaiT property’s category was changed to Dictionary Indices, the kKangXi property was expanded, and Sections 3.0, 3.10, and 4.5 were added.
UTS #39, Unicode Security Mechanisms, changes the zero width joiner (ZWJ) and zero width non-joiner (ZWNJ) characters from Identifier_Status=Allowed to Identifier_Status=Restricted; they are therefore no longer allowed by the General Security Profile by default.
UAX #45, U-Source Ideographs, adds records for new ideographs to its data file, adds “ExtH” as a new status, improves the status identifiers for the existing CJK Unified Ideographs blocks, and adds Section 2.5.
UTS #46, Unicode IDNA Compatibility Processing, clarifies the edge case of the empty label in ToASCII and adds documentation regarding the new IDNA derived property data files.
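
For implementers, the UTS #39 change noted above means that an identifier containing ZWJ (U+200D) or ZWNJ (U+200C) no longer passes the General Security Profile by default. The following is only a minimal sketch of what flagging those two characters might look like. It is not the UTS #39 algorithm: it checks nothing but these two code points, and the function names `find_restricted_joiners` and `passes_joiner_check` are hypothetical, chosen for illustration.

```python
from typing import List, Tuple

# The two code points that the announcement says moved to
# Identifier_Status=Restricted in UTS #39 for Version 15.0.
# Assumption: this sketch checks only these two characters and
# ignores every other rule of the General Security Profile.
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"   # ZERO WIDTH JOINER
RESTRICTED_JOINERS = {ZWNJ, ZWJ}


def find_restricted_joiners(identifier: str) -> List[Tuple[int, str]]:
    """Return (index, 'U+XXXX') for every ZWJ or ZWNJ in the identifier."""
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(identifier)
        if char in RESTRICTED_JOINERS
    ]


def passes_joiner_check(identifier: str) -> bool:
    """Return True when the identifier contains neither ZWJ nor ZWNJ."""
    return not find_restricted_joiners(identifier)


if __name__ == "__main__":
    # An identifier with an invisible ZWJ between "b" and "c".
    suspicious = "ab\u200Dc"
    print(find_restricted_joiners(suspicious))
    print(passes_joiner_check("abc"))
```

A real implementation would more likely read the Identifier_Status data published with UTS #39 than hard-code code points, since that data covers far more than these two characters.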

About the Unicode Standard

The Unicode Standard provides the basis for processing, storage and seamless data interchange of text data in any language in all modern software and information technology protocols. It provides a uniform, universal architecture and encoding for all languages of the world, with over 140,000 characters currently encoded.

Unicode is required by modern standards such as XML, Java, C#, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc., and is the official way to implement ISO/IEC 10646. It is a fundamental component of all modern software.

For additional information on the Unicode Standard, please visit https://home.unicode.org/.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. The membership of the consortium represents a broad spectrum of corporations and organizations, many in the computer and information processing industry. For a complete member list go to https://home.unicode.org/membership/members/.
For more information, please contact the Unicode Consortium at https://home.unicode.org/connect/contact-unicode/.
