The Unicode Blog: CLDR

Thursday, May 14, 2026

Unicode CLDR 49: Submission open through June 2026

The Unicode® CLDR Survey Tool is open for submission for version 49 through June (see detailed schedule below). CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort order, etc.). All major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Via the online Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems.

The new areas in CLDR 49 are focused on:

Unicode 18 additions: new emoji, script names, …
Improvements to the formatting of dates, times, and locale display names
New languages available for submission in Survey Tool: Adyghe [ady], Brahui [brh], Hunsrik [hrx], Interslavic [isv], Kabardian [kbd], Kaitag [xdq], Mara [mrh], and Susu [sus]

General Submission for TC locales* opened recently and is slated to finish on June 10, 2026. The Survey Tool then enters a vetting phase, where contributors select the best data for each field. That vetting phase is slated to finish on June 29. The draft data will be available in a public alpha in early August, and the final release is targeted for mid-October.

Other locales, managed by the DDL Working Group, have a longer submission period to allow smaller organizations to submit data on a more flexible timeline. The Survey Tool opened earlier for these locales, and will stay in Extended Submission until the end of June, so that these organizations can contribute data for the current release.

Each new locale starts with a small set of Core data, such as a list of characters used in the language. Submitters of those locales need to bring the coverage up to Basic level (very basic formats for dates, times, numbers, and endonyms) during the following submission cycle. Once a language reaches Basic coverage, it has the minimum support for use in language selection, such as on mobile devices. In the next submission cycle, the name of that language is also added for translation for all languages at Modern coverage. Locales that reach a higher level of coverage (Moderate or Modern) are suitable for general-purpose support in applications and operating systems.

If you would like to contribute missing data for your language, see Survey Tool Accounts. For more information on contributing to CLDR, see the CLDR Information Hub.

* TC Locales are ones for which major organizations commit to adding data in concert over a short span of time each year.

----------------------------------------------

Adopt a Character and Support Unicode’s Mission

Looking to give that special someone a special something?
Or maybe something to treat yourself?
🕉️💗🏎️🐨🔥🚀爱₿♜🍀

Adopt a character or emoji to give it the attention it deserves, while also supporting Unicode’s mission to ensure everyone can communicate in their own languages across all devices.

Each adoption includes a digital badge and certificate that you can proudly display!

Have fun and support a good cause

Tuesday, March 31, 2026

Unicode ICU 78.3 and CLDR 48.2 released

Unicode® CLDR is the most widely used provider of locale data. It provides the essential building blocks that allow software to display dates, times, and currencies correctly in every language and region. Unicode® ICU provides widely used C/C++/Java internationalization (i18n) libraries and APIs.

We have just published new maintenance releases of ICU and CLDR, with some small but significant changes. To find out more and to download these releases, go to:

CLDR and ICU have each published a maintenance release in March instead of a major release. The next major releases, CLDR 49 and ICU 79, are planned for October and will include the data from the next CLDR general submission period, planned to start in early Q2 2026, as well as Unicode 18.

The following issues are fixed in the CLDR 48.2 and ICU 78.3 maintenance releases:

Several important locale data bug fixes including:

Group separator for number formatting was updated to ' in fr_CH for consistency with other Swiss locales.
Some fixes to date and time formats including: Hv available formats were updated to match behavior in CLDR 47. The previous change caused web compatibility issues related to current JS capabilities.
Fixes for Emoji annotations issues, such as collisions between emoji short names.
Updated abbreviated and narrow AM/PM for ko and ps for consistency with how the wide forms are localized.
The full list of changes is available in Δ48.2.
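
To illustrate what a group separator change looks like in formatted output, here is a minimal Python sketch. It is not ICU or CLDR code: `format_grouped` is a hypothetical helper written for this illustration, and it uses the ASCII apostrophe as printed in the fr_CH note above, whereas the actual CLDR data may specify a different apostrophe character.

```python
def format_grouped(value: int, group_separator: str = "'") -> str:
    """Format an integer with a separator between each group of three digits.

    Hypothetical helper for illustration only; real applications should use
    a CLDR-backed library rather than hard-coding separators.
    """
    # Python's "," format spec groups digits in threes; swap in the separator.
    return f"{value:,}".replace(",", group_separator)


if __name__ == "__main__":
    print(format_grouped(1234567))       # apostrophe separator, as in the note above
    print(format_grouped(1234567, " "))  # a space separator, for comparison
```

Keeping the separator as a parameter rather than a constant mirrors how locale data works: the formatting logic stays the same and only the data changes between releases.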

ICU 78.3 includes the CLDR 48.2 changes
ICU also fixes a C++ code point iterator bug
Updates for timezone data 2026a

Friday, December 19, 2025

Opening CLDR Survey Tool early for DDL locales

We are announcing an early submission window for the CLDR Survey Tool, exclusively for Digitally Disadvantaged Languages (DDLs). These include languages across the world that lack full digital support, such as Qʼeqchiʼ with about 1.3M speakers, and many more.

The early submission window will allow more time for individuals and organizations that make DDL contributions, providing crucial data to close the digital support gap. The data will go into the CLDR v49 release, targeted at October 2026. Languages maintained by the CLDR Technical Committee are not available during this special window. They will be available for submission in Q2 2026.

See DDL: Help Center for more information on how to contribute to a DDL language.

If your language is not yet in CLDR, organizations can submit a formal request to add it; see adding a new language.

CLDR Organizations are needed for approval of CLDR data, so that it can be picked up by libraries, applications, programming languages, and operating systems. To register a new CLDR Organization, see adding an organization to CLDR. Individuals can also request languages and submit/approve data; however, the data cannot reach even Basic coverage without at least one CLDR Organization supporting it.

What is CLDR?

CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). All major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Contributors supply data for their languages via the online Survey Tool. This data is widely used to support much of the world’s software and is also a factor in determining which languages are supported on mobile phones and computer operating systems.

The Survey Tool opened on December 18, 2025 for DDL languages. The tool will remain open for data submission and correction until July 2026. A public alpha will make the draft data available in early August 2026. Data contributed at this time will be scheduled for publication and available for use in October 2026.

Each additional CLDR language starts with a small set of Core Data, such as a list of characters used in the language. Submitters of new languages commit to bringing the coverage up to a minimum of Basic coverage (very basic formats for dates, times, numbers, and endonyms).

Once a language reaches Basic coverage, it will have the minimum support for use in language selection, such as on mobile devices. That is the first step; for broader support the Moderate level is typically required.

If you would like to contribute missing data for your language, see Survey Tool Accounts.

Friday, October 25, 2024

ICU 76 Released

Unicode® ICU 76 has just been released. ICU is the premier library for software internationalization, used by a wide array of companies and organizations to support the world's languages, implementing both the latest version of the Unicode Standard and of the Unicode locale data (CLDR).

ICU 76 updates to Unicode 16 (blog), including new characters and scripts, emoji, collation & IDNA changes, and corresponding APIs and implementations. It also updates to CLDR 46 (beta blog) locale data with new locales, significant updates to existing locales, and various additions and corrections. For example, the CLDR and Unicode default sort orders are now very nearly the same.

Most of the java.time (Temporal) types can now be formatted directly using the existing ICU4J date/time formatting classes.

There are some new APIs to make ICU easier to use with modern C++ and Java patterns. Most of the C/C++ APIs added for this purpose are implemented as C++ header-only APIs, and usable on top of binary stable C APIs, which is a first for ICU.

The Java and C++ technology preview implementations of the (also in tech preview) CLDR MessageFormat 2.0 specification have been updated to match recent changes.

ICU 76 and CLDR 46 are major releases, including a new version of Unicode and major locale data improvements.

For details, please see
https://unicode-org.github.io/icu/download/76.html.

Unicode CLDR 46 available

Unicode CLDR 46 is now available and has been integrated into version 76 of ICU.

The most significant data changes in this release were:

Updated to Unicode 16.0 (including major changes to collation)
Substantial additions and modifications of Emoji search keyword data
‘Upleveling’ the locale coverage (see below)

The most significant changes in the specification were:

Updates to Message Format in tech preview
Updates to conformance
New tech preview section on semantic skeletons

CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Via the Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems.

In version 46, the following levels were reached:

New / Upleveled Locales

±	New Level	Locales
📈	Modern	Nigerian Pidgin, Tigrinya
📈	Moderate	Akan, Baluchi (Latin), Kangri, Tajik, Tatar, Wolof
📈	Basic	Ewe, Ga, Kinyarwanda, Konkani (Latin), Northern Sotho, Oromo, Sichuan Yi, Southern Sotho, Tswana
📉	Basic*	Chuvash, Anii

We are currently planning for CLDR 47 to be a closed release with no data submission period. The focus will be on improving the Survey Tool used for data submission, making necessary infrastructure changes, and some high priority data quality fixes.

For more information

See the CLDR 46 release page, which has information on accessing the data, reviewing charts of the changes, and, importantly, Migration issues.

Monday, May 20, 2024

Unicode CLDR Version 46 Submission Open

The Unicode CLDR Survey Tool is open for submission for version 46. CLDR provides key building blocks for software to support the world’s languages (dates, times, numbers, sort-order, etc.). All major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Via the online Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems.

Version 46 is focusing on:

Unicode 16 additions: new emoji, script names, collation data (Chinese & Japanese), …
Emoji search keywords: Expanding keyword coverage to make it easier for users to find the right emoji
New Languages targeting Basic:
- Ewe (ee)
- Ga (gaa)
- Kinyarwanda (rw)
- Northern Sotho (nso)
- Oromo (om)
- Sesotho (st)
- Setswana (tn)
Up-leveling: Akan (ak)

Submission of new data opened recently, and is slated to finish on June 11. The new data then enters a vetting phase, where contributors work out which of the supplied data for each field is best. That vetting phase is slated to finish on July 1. A public alpha makes the draft data available around August 28, and the final release targets October 16.

Each new locale starts with a small set of Core data, such as a list of characters used in the language. Submitters of those locales need to bring the coverage up to Basic level (very basic formats for dates, times, numbers, and endonyms) during the next submission cycle.

Once a language reaches Basic coverage, it has the minimum support for use in language selection, such as on mobile devices. In the next submission cycle, the name for that language is also added for translation for all languages at Modern coverage.

If you would like to contribute missing data for your language, see Survey Tool Accounts. For more information on contributing to CLDR, see the CLDR Information Hub.

Thursday, April 18, 2024

Unicode CLDR v45 released

The Unicode CLDR v45 is now available and has been integrated into version 75 of ICU. The CLDR v45 release page has information on accessing the data, reviewing charts of the changes, and — importantly — Migration issues.

CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

CLDR 45 did not have a Survey Tool submission phase, and focused on tooling and just a few functional areas:

MessageFormat 2.0 Tech Preview

Software needs to construct messages that incorporate various pieces of information. The complexities of the world's languages make this challenging. The goal for MessageFormat 2.0 is to allow developers and translators to create natural-sounding, grammatically correct user interfaces that can appear in any language and support the needs of various cultures.

The new MessageFormat defines the data model, syntax, processing, and conformance requirements for the next generation of dynamic messages. It is intended for adoption by programming languages, software libraries, and software localization tooling. It enables the integration of internationalization APIs (such as date or number formats), and grammatical matching (such as plurals or genders). It is extensible, allowing software developers to create formatting or message selection logic that adds on to the core capabilities. Its data model provides the means of representing existing syntaxes, thus enabling gradual adoption by users of older formatting systems.
See also:

UTW { } MessageFormat v2 (November 7, 2023)
Message Format Virtual Open House (February 20, 2024)
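
The idea of grammatical matching described above can be sketched in a few lines of Python. This is only an illustration of selecting a message variant by plural category; it is not MessageFormat 2.0 syntax and not an ICU API. The names `plural_category_en`, `NEW_MESSAGES`, and `select_and_format` are hypothetical, and the plural rule is a simplified English-only rule for whole numbers, whereas real implementations take plural rules from CLDR data.

```python
def plural_category_en(count: int) -> str:
    """Simplified English cardinal rule for whole numbers (assumption:
    'one' for exactly 1, 'other' for everything else)."""
    return "one" if count == 1 else "other"


# Message variants keyed by plural category; "*" is the catch-all variant.
NEW_MESSAGES = {
    "one": "You have {count} new message.",
    "*": "You have {count} new messages.",
}


def select_and_format(variants: dict, count: int) -> str:
    """Pick the variant matching the plural category, then fill in the value."""
    category = plural_category_en(count)
    pattern = variants.get(category, variants["*"])
    return pattern.format(count=count)


if __name__ == "__main__":
    for n in (0, 1, 5):
        print(select_and_format(NEW_MESSAGES, n))
```

Separating selection (which variant) from formatting (how the value is rendered) is what lets translators add or remove variants for languages with more plural categories without touching application code.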

Keyboard 3.0 stable version

Keyboard support for digitally disadvantaged languages (DDLs) is often lacking or inconsistent between platforms. The updated LDML Keyboard 3.0 format specifies an interchange format for keyboard data. This will allow keyboard authors to create a single mapping file for their language, which implementations can use to provide that language’s keyboard mapping on their own platform. This format allows both physical and virtual (that is, on-screen or touch) keyboard layouts for a language to be defined in a single file.

See also:

CLDR, Beyond Locale Data (June 22, 2023)

Tooling changes

Many tooling changes are difficult to accommodate in a data-submission release, including performance work and UI improvements. The changes in v45 provide faster turn-around for linguists and higher data quality. They are targeted at the v46 submission period, starting in May 2024.

Wednesday, April 17, 2024

ICU 75 Released

Unicode® ICU 75 has just been released. ICU is the premier library for software internationalization, used by a wide array of companies and organizations to support the world's languages, implementing both the latest version of the Unicode Standard and of the Unicode locale data (CLDR). ICU 75 updates to CLDR 45 (beta blog) locale data with new locales and various additions and corrections. C++ code now requires C++17 (C code now requires C11) and is being made more robust.

The CLDR MessageFormat 2.0 specification is now in technology preview, together with a corresponding update of the ICU4J (Java) tech preview and a new ICU4C (C++) tech preview.

For details, please see https://icu.unicode.org/download/75.

Tuesday, March 5, 2024

Unicode CLDR v45 Alpha available for testing

The Unicode CLDR v45 Alpha is now available for integration testing.

The alpha has already been integrated into the development version of ICU. We would especially appreciate feedback from non-ICU consumers of CLDR data and on Migration issues. Feedback can be filed at CLDR Tickets.

CLDR 45 is a closed release with no submission period, focusing on just a few areas:

MessageFormat 2.0 Tech Preview

The new MessageFormat defines the data model, syntax, processing, and conformance requirements for the next generation of dynamic messages. It is intended for adoption by programming languages, software libraries, and software localization tooling. It enables the integration of internationalization APIs (such as date or number formats), and grammatical matching (such as plurals or genders). It is extensible, allowing software developers to create formatting or message selection logic that adds on to the core capabilities. Its data model provides a means of representing existing syntaxes, thus enabling gradual adoption by users of older formatting systems.

Keyboard 3.0 stable version

Keyboard support for digitally disadvantaged languages is often lacking or inconsistent between platforms. The updated LDML Keyboard 3.0 format specifies an interchange format for keyboard data. This will allow keyboard authors to create a single mapping file for their language, which implementations can use to provide that language’s keyboard mapping on their own platform. This format allows both physical and virtual (that is, on-screen or touch) keyboard layouts for a language to be defined in a single file.

Tooling changes

For more information

See the draft CLDR v45 release page, which has information on accessing the data, reviewing charts of the changes, and — importantly — Migration issues.

Tuesday, October 31, 2023

ICU 74 Released

Unicode® ICU 74 has just been released. ICU is the premier library for software internationalization, used by a wide array of companies and organizations to support the world's languages, implementing both the latest version of the Unicode Standard and of the Unicode locale data (CLDR). ICU 74 updates to Unicode 15.1, and to CLDR 44 locale data with various additions and corrections.

ICU 74 and CLDR 44 are major releases, including a new version of Unicode and major locale data improvements. They subsume the changes for the ICU 73.2 and CLDR 43.1 maintenance releases.

Unicode 15.1 adds source code security mechanisms, improves line breaking for southeast Asian scripts, and adds important CJK unified ideographs.

CLDR 44 has added or improved data for a number of languages that have been newly added to ICU, and has improved measurement unit handling, conversion, and formatting.

ICU 74 implements these improvements, adds new C APIs for locale handling, adds a plug-in API for word segmentation, and switches the Java build system to Maven.

For details, please see https://icu.unicode.org/download/74.

Support Unicode
To support Unicode’s mission to ensure everyone can communicate in their languages across all devices, please consider adopting a character, making a gift of stock, or making a donation. As Unicode, Inc. is a US-based open source, open standards, non-profit, 501(c)3 organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.

Unicode CLDR v44 available

Unicode CLDR version 44 is now available and has been integrated into version 74 of ICU. In CLDR 44, the focus is on:

Formatting Person Names. Added further enhancements (data and structure) for formatting people's names. For more information on why this feature is being added and what it does, see Background.
Emoji 15.1 Support. Added short names, keywords, and sort-order for the new Unicode 15.1 emoji.
Unicode 15.1 additions. Made the regular additions and changes for a new release of Unicode, including names for new scripts, collation data for Han characters, etc.
Digitally disadvantaged language coverage. Work began to improve DDL coverage, with the following DDL locales now having higher coverage levels:
1. Modern: Cherokee, Lower Sorbian, Upper Sorbian
2. Moderate: Anii, Interlingua, Kurdish, Māori, Venetian
3. Basic: Esperanto, Interlingue, Kangri, Kuvi, Kuvi (Devanagari), Kuvi (Odia), Kuvi (Telugu), Ligurian, Lombard, Low German, Luxembourgish, Makhuwa, Maltese, N’Ko, Occitan, Prussian, Silesian, Swampy Cree, Syriac, Toki Pona, Uyghur, Western Frisian, Yakut, Zhuang

CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Via the online Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems.

There are many other changes: to find out more, see the CLDR v44 release page, which has information on accessing the data, reviewing charts of the changes, and, importantly, Migration issues.

In version 44, the following levels were reached:

v44 Level	Langs	Usage
Modern	95	Suitable for full UI internationalization
	čeština, ‎Deutsch, ‎français, Kiswahili‎, Magyar‎, O‘zbek‎, Română‎‎, Tiếng Việt‎, Ελληνικά‎, Беларуская‎, ‎ᏣᎳᎩ‎, Ქართული‎, ‎Հայերեն‎, ‎עברית‎, ‎اردو‎, አማርኛ‎, ‎नेपाली‎, অসমীয়া‎, ‎বাংলা‎, ‎ਪੰਜਾਬੀ‎, ‎ગુજરાતી‎, ‎ଓଡ଼ିଆ‎, தமிழ்‎, ‎తెలుగు‎, ‎ಕನ್ನಡ‎, ‎മലയാളം‎, ‎සිංහල‎, ‎ไทย‎, ‎ລາວ‎, မြန်မာ‎, ‎ខ្មែរ‎, ‎한국어‎, 中文, 日本語‎, … ‎
Moderate	13	Suitable for “document content” internationalization, e.g. in a spreadsheet
	brezhoneg, ‎føroyskt, IsiXhosa, ‎sardu, чӑваш, …
Basic	50	Suitable for locale selection, e.g. choice of language on a mobile phone
	asturianu, ‎Rumantsch, Māori, ‎Wolof, тоҷикӣ, ‎‎کٲشُر, ‎ትግርኛ, कॉशुर‎, ‎মৈতৈলোন্, ‎ᱥᱟᱱᱛᱟᱲᱤ, …
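
As a rough way to reason about the levels in the table above, they can be modeled as an ordered enumeration. The following Python sketch is only an illustration: `Coverage`, `SUITABLE_FOR`, and `usable_for_language_selection` are hypothetical names, not part of CLDR or ICU, and the Core level (mentioned in other posts on this page as the starting point for new locales) is assumed to sit below Basic.

```python
from enum import IntEnum


class Coverage(IntEnum):
    """Coverage levels, ordered from least to most complete (assumed ordering)."""
    CORE = 1
    BASIC = 2
    MODERATE = 3
    MODERN = 4


# Usage descriptions paraphrased from the table above.
SUITABLE_FOR = {
    Coverage.BASIC: "locale selection, e.g. choice of language on a mobile phone",
    Coverage.MODERATE: '"document content" internationalization, e.g. in a spreadsheet',
    Coverage.MODERN: "full UI internationalization",
}


def usable_for_language_selection(level: Coverage) -> bool:
    """Basic is described as the minimum level for use in language selection."""
    return level >= Coverage.BASIC


if __name__ == "__main__":
    for level in Coverage:
        usage = SUITABLE_FOR.get(level, "not yet suitable for language selection")
        print(f"{level.name}: {usage}")
```

Using an ordered type makes threshold checks such as "at least Basic" a simple comparison instead of a list of special cases.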

We are currently planning for CLDR version 45 to be a closed release with no submission period. The focus will be on improving the Survey Tool used for data submission, making necessary infrastructure changes, and some high priority data quality fixes.
