Thursday, June 1, 2023

Unlocking the Power of CLDR Person Name Formatting: A Solution for Formatting Names in a Globalized World

By Mike McKenna, Chair of CLDR Person Names Subcommittee


CLDR Person Names has moved from “tech preview” to “draft” status and is available for initial testing by implementors through ICU4J.

How a person’s name is displayed and used can convey respect or familiarity, or can even come across as rude if handled improperly. That’s why it’s important to format names correctly, especially because naming practices vary across the globe. In many cultures, names can indicate gender, status, birthplace, nationality, ethnicity, religion, and more.

Until now, there have been no good standards for how to format people’s names in various contexts. A number of Unicode members wanted to address this problem and provide a mechanism that anyone could use to format people’s names in a wide variety of applications, such as contact lists, air travel, billing applications, CRMs, social media, and any other application that asks for user information and presents it back to the user or others.

The Unicode® Person Name Formats specification defines patterns used to take a person’s name and format it correctly in a given language or locale depending on a chosen context. With the Unicode Common Locale Data Repository (CLDR), locale codes and name sequences can be selected to create a specific pattern for formatting a person’s name, including preferences for formal, informal, or abbreviated versions. As a result, designers and developers can correctly display names according to the user’s native locale and culture, which is especially important when integrating names written in different scripts, such as those used for Japanese, Chinese, or Russian.
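As a rough illustration of pattern-based name formatting, the following Java sketch fills placeholder patterns from a map of name fields. It does not use ICU or CLDR data: the class NamePatternSketch, the format helper, and the pattern strings are all hypothetical, chosen only to show how one name can be rendered differently depending on the pattern selected.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: not the CLDR algorithm and not the ICU API.
public class NamePatternSketch {
    // Matches placeholders such as {given} or {surname}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([a-z0-9]+)\\}");

    // Replaces each placeholder with the matching field value.
    // Missing fields become empty, and leftover whitespace is collapsed.
    static String format(String pattern, Map<String, String> fields) {
        Matcher m = PLACEHOLDER.matcher(pattern);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = fields.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString().trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        Map<String, String> name = new HashMap<>();
        name.put("given", "Mary");
        name.put("surname", "Smith");

        // Hypothetical patterns standing in for locale- and context-specific ones.
        System.out.println(format("{given} {given2} {surname}", name)); // Mary Smith
        System.out.println(format("{surname}, {given}", name));         // Smith, Mary
        System.out.println(format("{given}", name));                    // Mary
    }
}
```

In a real application, the patterns would come from CLDR locale data and be applied through an implementation such as ICU rather than being hard-coded as they are here.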

The Unicode Consortium added Person Name formatting to CLDR in v42 and refined and enhanced it for v43, which was released in April. In CLDR v43, with the help of linguists from around the world, we completed data for formatting people’s names for CLDR locales at modern coverage. The specification’s formal name is "Unicode Technical Standard #35 Unicode Locale Data Markup Language (LDML); Part 8: Person Names". ICU has added the PersonNameFormatter class, which is available in ICU 73.

To learn more, and get an idea of the implications for user experience and application design, see the following paper, which provides an illustration of the many contexts in which names can be formatted through CLDR Person Names.

LDML (UTS#35) Part 8: Person Names - a story teller’s case study

