Thursday, February 26, 2026

From Central Bank to Code Point: A Roadmap for Currency Symbol Implementation

In the past year, several new currency symbols have been proposed for encoding in the Unicode Standard:

February 2025: The Saudi Central Bank announced the creation of a new symbol for the Saudi riyal.
March 2025: The Central Bank of the U.A.E. announced creation of a new symbol for the UAE Dirham (cf. Dirham Currency Symbol Guideline).
May 2025: A proposal was submitted to encode the symbol for the Maldivian Rufiyaa. (The symbol was created by the Maldives Monetary Authority in 2022.)
November 2025: The Central Bank of Oman announced the creation of a new symbol for the Omani Rial.

The Saudi riyal sign was proposed for encoding just in time for inclusion in version 17.0 of the Unicode Standard, released in September 2025. Proposals for the other currency symbols were submitted too late for version 17.0, so those symbols will be encoded in version 18.0, which will be released in September 2026.

Recent currency symbol trend

Distinct currency symbols are not essential for local or international financial transactions, and most currencies are denoted with their written name or an abbreviation; e.g. “kr” for krone. However, in recent years, since the creation of the euro currency and its distinct symbol, several monetary authorities have created distinct symbols to denote their currency. A currency symbol could potentially be created only for private use by the monetary authority, such as printing on bills or embossing on coins. Usually, however, currency symbols are intended for public use: to appear on shop signs, online retail sites, or anywhere that currency amounts are presented.

Such public usage leads to a need for the symbol to be encoded in the Unicode Standard and supported in commercial software and services. Standardization of a new character and subsequent support by vendors takes time: typically, at least one year, and often longer. All too often, however, monetary authorities announce creation of a new currency symbol anticipating immediate public adoption, then later discover there will be an unavoidable delay before the new symbol is widely supported in products and services.

By contrast, Bulgaria transitioned from its local currency, the lev, to the euro in January 2026, but the transition was formally decided and announced in July 2025. This gave vendors several months to prepare for the change.

Implementing support for the new currency symbols

Vendor support for a new currency symbol can involve many different things, such as the following:

Updates to fonts
Updates to software keyboard layouts or new designs for physical keyboards
Updating locale data and programming interfaces for formatting currency values
Updating software used for generation of financial statements and reports
Updates to applications, online services or devices for commercial transactions

However, all of these require development time, and development can only begin after the new symbol is encoded in the Unicode Standard. People wishing to start using a new currency symbol in applications and services should anticipate that, from the time the symbol is proposed for encoding, it could take many months or even years before vendors have distributed product updates.

Because there is unavoidable delay from when a new currency symbol is proposed to when it can be supported by vendors, monetary authorities are strongly encouraged to engage with the Unicode Consortium at least one year in advance of when a new currency symbol is expected to go into public usage.

Note regarding support on devices

Many vendors do not routinely provide updates for devices, including some mobile phones, or they discontinue updates for older devices. For this reason, users should not be surprised if a new currency symbol is not supported natively on a device years after the symbol was introduced. Applications or online services accessed on those devices can have a different update policy, however, so the experience on such devices could reflect partial support.

Recommendations for implementation of Unicode 18.0 currency symbols

The following three new currency symbols have been approved for encoding in Unicode version 18.0, which will be published in September 2026:

U+20C2 RUFIYAA SIGN
U+20C3 UAE DIRHAM SIGN
U+20C4 OMANI RIAL SIGN

Complete details for these characters are included in the Unicode 18.0 Alpha preview release. The technical details — character names, code points, property data — are unlikely to change before Unicode 18.0 is released, but these details are not completely stable until the Unicode Technical Committee has made the final technical decisions for Unicode 18.0. For this reason, vendors can choose to start working on implementations once the Alpha preview is available, but vendors should not distribute product updates until after Unicode version 18.0 is released in September 2026.
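Whether a given runtime already recognizes these code points depends on the version of the Unicode Character Database it bundles. The following is a minimal sketch in Python, assuming only the standard `unicodedata` module; the helper name `describe` is illustrative, and the result for the three new code points will vary with the Python build in use:

```python
import unicodedata

# Code points approved for Unicode 18.0, as listed above.
NEW_CURRENCY_SIGNS = {
    0x20C2: "RUFIYAA SIGN",
    0x20C3: "UAE DIRHAM SIGN",
    0x20C4: "OMANI RIAL SIGN",
}


def describe(code_point: int) -> str:
    """Report what the bundled Unicode data knows about a code point."""
    ch = chr(code_point)
    name = unicodedata.name(ch, None)      # None when the character has no name yet
    category = unicodedata.category(ch)    # "Sc" = currency symbol, "Cn" = unassigned
    if name is None:
        version = unicodedata.unidata_version
        return f"U+{code_point:04X}: not in Unicode {version} data"
    return f"U+{code_point:04X}: {name} (category {category})"


if __name__ == "__main__":
    for cp in NEW_CURRENCY_SIGNS:
        print(describe(cp))
```

A check like this only says whether the character data is present. It says nothing about whether an installed font has a glyph for the symbol.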

Extending support with CLDR

Many implementations use Unicode CLDR data for currency formatting, so incorporating the new symbols is an important step for widespread support. A CLDR release will follow not long after release of Unicode version 18.0, and will contain the new currency symbols for applicable currencies and locales.

However, the symbols will initially be listed as “alternative” symbols for the respective currencies. The reason for a symbol being an alternative, rather than the default, is to avoid the symbol being displayed in contexts in which available fonts might not yet support the new symbol, causing users to see a missing glyph for their currency instead of the intended symbol.

Later, when there is confidence that the symbols are more widely supported in platforms and fonts, a future CLDR version can update details to list the new currency symbol as the default, rather than as an alternative.
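The default-versus-alternative idea can be sketched in application code as follows. This is an illustration only: the `SYMBOLS` table is hypothetical data (not actual CLDR content, and it assumes the ISO code as the default symbol), and `font_supports` stands in for whatever platform-specific font coverage check an application has available.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CurrencySymbols:
    default: str                        # symbol assumed safe to show everywhere
    alternative: Optional[str] = None   # newer symbol, used only when supported


# Hypothetical data for illustration; not actual CLDR content.
SYMBOLS = {
    "MVR": CurrencySymbols(default="MVR", alternative="\u20C2"),
    "AED": CurrencySymbols(default="AED", alternative="\u20C3"),
    "OMR": CurrencySymbols(default="OMR", alternative="\u20C4"),
}


def choose_symbol(code: str, font_supports: Callable[[str], bool]) -> str:
    """Prefer the alternative symbol only if every character in it can be rendered."""
    entry = SYMBOLS.get(code)
    if entry is None:
        return code  # unknown currency: fall back to the code itself
    alt = entry.alternative
    if alt is not None and all(font_supports(ch) for ch in alt):
        return alt
    return entry.default
```

With a coverage check that always fails, `choose_symbol("AED", lambda ch: False)` returns the default `"AED"`, so a user never sees a missing glyph; once fonts catch up, the same call returns the new sign.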

Working together to support local monetary authorities

When monetary authorities introduce a new symbol for their currency, it marks a significant milestone for financial and commercial activity in their domain. The Unicode Consortium is honored to work with monetary authorities, and would like to help make the launch of a new symbol as smooth as possible. With that in mind, we invite monetary authorities planning creation of a new currency symbol to engage with us well in advance of a planned launch.

----------------------------------------------

Adopt a Character and Support Unicode’s Mission

Looking to give that special someone a special something?
Or maybe something to treat yourself?
🕉️💗🏎️🐨🔥🚀爱₿♜🍀

Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode’s mission to ensure everyone can communicate in their own languages across all devices.

Each adoption includes a digital badge and certificate that you can proudly display!

Have fun and support a good cause

You can also donate funds or gift stock

Tuesday, February 24, 2026

UTS #58: Making URLs Readable for Humans: From %E0%A4%AE… to महात्मा

People around the world need to use their writing systems in URLs. This is important: in writing their native languages, the majority of humanity uses characters outside of A-Z, and they expect those characters to also work seamlessly.

Browsers and other programs generally handle Unicode in domain names well, though not all of them do, and many make the rest of the URL unreadable. For example, consider the common practice of providing user handles such as the following two:

x.com/rihanna

www.youtube.com/@핑크퐁

The first of these works well in practice because it is all ASCII: copying from the address bar and pasting into text provides a readable result. However, in the second example, in many browsers and other programs, copying from the address bar gives an unreadable string:

www.youtube.com/@핑크퐁

⇩

youtube.com/@%ED%95%91%ED%81%AC%ED%90%81

The names also expand in size and turn into very long, unreadable strings, such as:

hi.wikipedia.org/wiki/महात्मा_गांधी

⇩

hi.wikipedia.org/wiki/%E0%A4%AE%E0%A4%B9%E0%A4%BE%E0%A4%A4%E0%A5%8D%E0%A4%AE%E0%A4%BE_%E0%A4%97%E0%A4%BE%E0%A4%82%E0%A4%A7%E0%A5%80
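The unreadable strings above are the percent-encoded UTF-8 bytes of the original characters. A small sketch using Python's standard `urllib.parse` shows the round trip; the function names are illustrative, and this handles only the path portion, not the full readable serialization described in UTS #58:

```python
from urllib.parse import quote, unquote


def to_percent_encoded(path: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8 bytes, keeping '/' and '@' literal."""
    return quote(path, safe="/@")


def to_readable(path: str) -> str:
    """Decode percent-encoded UTF-8 back into readable characters."""
    return unquote(path)


if __name__ == "__main__":
    handle = "/@핑크퐁"
    encoded = to_percent_encoded(handle)
    print(encoded)                      # each Hangul syllable becomes three %XX bytes
    print(to_readable(encoded))         # back to the readable form
    print(len(handle), len(encoded))    # the encoded form is several times longer
```

Each character in these examples occupies three bytes in UTF-8, and each byte becomes a three-character `%XX` escape, which is why the encoded form grows so much.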

The other side of the coin is making sure that programs add links to URLs in a predictable way, linkifying the entire URL without extending the link to include sentence punctuation. For example, many programs don’t add links properly to:

… see

https://example.com/αβγ/δεζ?θικ#λμν.

A commonly used email program, for example, stops midway through:

… see

https://example.com/αβγ/δεζ?θικ#λμν.

Others may include the sentence period, question mark, surrounding parentheses, etc.:

… see

https://example.com/αβγ/δεζ?θικ#λμν.

Users often insert spaces to prevent this. It should be automatic:

… see

https://example.com/αβγ/δεζ?θικ#λμν.
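The desired behavior in this example (link the whole URL, but not the sentence-final period) can be approximated with a simple heuristic. The sketch below is not the UTS #58 algorithm, which relies on per-character property data. It is only an illustration, and the punctuation set, the bracket-balancing rule, and all names are assumptions:

```python
import re

# Characters treated as sentence punctuation when they end a candidate URL (assumed set).
TRAILING_PUNCTUATION = ".,;:!?"
# Closing brackets and their openers, used to drop an unbalanced trailing closer.
CLOSERS = {")": "(", "]": "[", "}": "{"}

# Candidate: an http(s) scheme followed by any run of non-whitespace characters,
# which includes non-ASCII letters such as Greek.
CANDIDATE = re.compile(r"https?://\S+")


def trim_candidate(url: str) -> str:
    """Strip trailing sentence punctuation and unbalanced closing brackets."""
    while url:
        last = url[-1]
        if last in TRAILING_PUNCTUATION:
            url = url[:-1]
        elif last in CLOSERS and url.count(last) > url.count(CLOSERS[last]):
            url = url[:-1]  # closer has no opener inside the URL: belongs to the sentence
        else:
            break
    return url


def find_links(text: str) -> list[str]:
    """Return the trimmed link targets found in running text."""
    return [trim_candidate(m.group()) for m in CANDIDATE.finditer(text)]
```

Applied to the example sentence, `find_links` returns the full URL through `#λμν` and leaves the final period outside the link.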

The new UTS #58: Unicode Link Detection and Serialization specifies how to format and linkify URLs and email addresses in readable, predictable, user-friendly ways. The data files cover all of the 159,000+ characters in Unicode.

We encourage implementers to adopt this specification for a consistent experience for users worldwide.

Tuesday, February 10, 2026

Unicode 18.0 Alpha review

Unicode 18.0 Alpha Review Opens for Feedback

By: Peter Constable, Chair of the Unicode Technical Committee

The repertoire for Unicode Version 18.0 is now open for early review and comment. During alpha review, the repertoire is reasonably mature and stable but is not yet completely locked down. Discussion regarding whether certain characters should be removed from the repertoire for publication is welcome. Character names and code point assignments are reasonably firm, but suggestions for improvement may still be considered.

For the alpha review, preliminary data files are also available, with data covering existing and new character repertoire. In addition, a draft for the core specification is available, with new block descriptions for some of the newly-added blocks and scripts.

The primary focus for the alpha review should be on the new character repertoire. This early review is provided so that reviewers may consider the repertoire and data file issues prior to the start of beta review (currently scheduled to start in May 2026). Once beta review begins, the repertoire, code points, and character names will all be locked down, and no longer be subject to changes.

Sample of representative glyphs for Seal script ideographs

Notable changes

The planned repertoire for Unicode 18.0 adds 13,048 new characters, which would bring the total number of characters to 172,849 characters.

The additions include four new scripts:

Small Seal (“Seal”): This comprises the largest portion of the new characters, with 11,328 ideographs. Seal is an important precursor to modern Han ideographs (aka, “CJK”), and has important cultural significance in China and for Chinese speakers throughout the world.
Chisoi: A modern script used in eastern India.
Jurchen: A historic ideographic script that was used in northeastern China during the Jin and Ming dynasties.
Proto-cuneiform: In this version, just numeric signs (other characters have been proposed for a future version).

Other additions include nine new emoji characters, 72 historical mathematical symbols, 323 Cuneiform numeric signs, and three new currency symbols for modern currencies:

Maldivian Rufiyaa
Omani Rial
UAE Dirham

Feedback for the alpha review should be reported under PRI #536 using the Unicode contact form by March 31, 2026.

Tuesday, February 3, 2026

Highlights from UTC Meeting #186

By: Peter Constable, Chair of the Unicode Technical Committee

The Unicode Technical Committee (UTC) met January 21 – 23 in Sunnyvale, CA. Thanks to Unicode member organization Google for hosting. Here are some highlights.

Progress on Unicode 18.0

Version 18.0 of the Unicode Standard is being prepared for publication in September of this year. At meetings 184 and 185, UTC had approved 12,995 characters for encoding in version 18.0. At this meeting, some additional characters were approved for this version. One of these new characters is the Omani rial sign, a currency symbol recently created by the Central Bank of Oman. Other additions include 51 mathematical symbols and 10 standardized variation sequences proposed by the PHILIUMM Project.

UTC authorized the Unicode 18.0 Alpha preview release, which will be available February 10 for public review.

Future additions

A typical step in the process for encoding new characters is provisional assignment of code points for characters that UTC has deemed eligible for encoding. This allows working groups to begin development of content — property data, code charts, text for the core spec — for a future version. At this meeting, code points were provisionally assigned for several characters including three new scripts: Leke script, used in SE Asia; and Mwangwego and Shaaldaa scripts, used in Eastern Africa.

New UTS on links

UTC approved a new Unicode Technical Standard, UTS #58 Unicode Link Detection and Serialization. This standard includes character data, and this first version includes data for characters in Unicode 17.0. Starting with Unicode 18.0, this will become a synchronized standard, with a new version released together with each new version of the Unicode Standard.

New joint working group for orthographic sequences

At UTC #185, the Government of India proposed that Unicode develop specifications for orthographically valid cluster sequences for Hindi and other language orthographies. (See L2/26-061.) Such work would overlap the scopes of both the Unicode Technical Committee and the CLDR Technical Committee: Specs would deal with character sequences in a manner similar to UAX #29, Unicode Text Segmentation, which is maintained by UTC; but each document would be for the orthography of a specific language, which puts this in the scope of CLDR-TC.

After UTC #185, Unicode leaders discussed options and proposed formation of a joint working group (JWG) between CLDR-TC and UTC. (See L2/26-045.) At this UTC meeting, this JWG was approved by UTC. It was similarly approved by CLDR-TC at one of their recent meetings. This new JWG will get organized and begin working during the next quarter.

Metadata embedded in “plain” text

It recently came to light that another organization, C2PA, has developed a specification to embed AI-related metadata into “unstructured” (i.e., “plain”) text. (See L2/26-042.) This has been motivated by the EU AI Act (AIA), which goes into enforcement in August of this year. Article 50 of the AIA obligates vendors to “mark” AI-generated content with machine-readable metadata so that the content can be detected as artificially generated. This requirement applies to text content as well as other content types. However, Article 50 doesn’t specify what would count as “marking” of text, nor does it distinguish between different text formats: does it apply to generated source code? SMS messages? file names? C2PA has taken a conservative approach, anticipating that the EU might enforce the requirement on any AI-generated text.

Unfortunately, the scheme added to the C2PA specification embeds sequences of Unicode variation selector characters in a manner that does not conform to the Unicode Standard.
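A variation sequence pairs a base character with a single variation selector, so a run of several consecutive selectors is one thing an implementation might look for in plain text. The sketch below is a heuristic detector only. It does not reproduce or decode the C2PA scheme, whose details are not described here, and the function names and the run-length threshold are assumptions. It covers the ranges U+FE00..U+FE0F and U+E0100..U+E01EF and ignores the Mongolian free variation selectors.

```python
def is_variation_selector(ch: str) -> bool:
    """True for VS1..VS16 (U+FE00..U+FE0F) and VS17..VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def find_selector_runs(text: str, min_length: int = 2) -> list[tuple[int, int]]:
    """Return (start_index, length) for each run of consecutive variation selectors."""
    runs = []
    start = None
    for i, ch in enumerate(text):
        if is_variation_selector(ch):
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the text.
    if start is not None and len(text) - start >= min_length:
        runs.append((start, len(text) - start))
    return runs
```

A single selector after a base character, as in an emoji presentation sequence, is not reported; only runs of at least `min_length` selectors are.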

UTC discussed this situation together with a representative from C2PA. On the one hand, it brought to light that the text of the Unicode Standard wasn’t sufficiently clear about conformance requirements in relation to variation sequences. But UTC was clear that this scheme is non-conformant. Other concerns were mentioned, including that it is a contradiction in terms to say that “unstructured” text can contain metadata. An outcome of this discussion was to recommend that Unicode establish a liaison relationship with C2PA, and that the topic be discussed further between the two organizations.

For complete details on outcomes from UTC #186, see the draft minutes.
