Thursday, June 7, 2012

CLDR 21.0.2: New T Extensions for language/locale identifiers

New T Extension fields and subfields [RFC 6497] are now available for use in BCP 47 and Unicode Locale/Language Identifiers. These T extensions identify transforms, and can be used for tagging content or requesting resources. The new T extension fields and subfields are defined in data files that are part of the CLDR 21.0.2 release.
For example:
  • "zh-t-i0-pinyin", to indicate Chinese text generated with a pinyin input method
  • "en-t-k0-dvorak", to identify a Dvorak keyboard for English
  • "it-t-k0-osx-extended", to request an extended Mac keyboard for Italian
The private use subfields can be used for private agreements, such as:
  • "ru-t-en-x0-mobile", to indicate a translation from English to Russian for use on a mobile device, or
  • "ja-t-de-t0-und-x0-medical", to identify a machine translation from German to Japanese with a specialized dictionary for medical terms.
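To make the structure of these identifiers concrete, here is a minimal sketch of how the examples above could be split into their parts. It assumes the general shape described in RFC 6497: after the "t" singleton comes an optional source language tag, followed by fields whose keys are one letter plus one digit (such as "k0" or "x0"), each with one or more value subtags. The function name parse_t_extension and the returned dictionary layout are hypothetical, chosen only for illustration; the sketch does not validate subtags against the CLDR data and does not handle a "t" that appears inside a private-use ("x-...") sequence.

```python
import re

# Field keys in the T extension are one letter followed by one digit, e.g. "k0".
FIELD_KEY = re.compile(r"[a-z][0-9]")


def parse_t_extension(tag):
    """Split a language tag into target, source language, and T-extension fields.

    Returns None if the tag has no "t" singleton after a base language.
    This is an illustrative sketch, not a validating BCP 47 parser.
    """
    subtags = tag.lower().split("-")
    if "t" not in subtags:
        return None
    start = subtags.index("t")
    if start == 0:
        return None  # a T extension must follow a base language tag

    target = "-".join(subtags[:start])
    source = []   # subtags of the optional source language tag
    fields = {}   # field key -> list of value subtags
    key = None

    for sub in subtags[start + 1:]:
        if len(sub) == 1:
            break  # another singleton (e.g. "u" or "x") ends the T extension
        if FIELD_KEY.fullmatch(sub):
            key = sub
            fields[key] = []
        elif key is None:
            source.append(sub)  # still inside the source language tag
        else:
            fields[key].append(sub)

    return {
        "target": target,
        "source": "-".join(source) or None,
        "fields": {k: "-".join(v) for k, v in fields.items()},
    }
```

With this sketch, "ja-t-de-t0-und-x0-medical" yields a target of "ja", a source of "de", and the fields t0 = "und" and x0 = "medical", while "it-t-k0-osx-extended" yields no source and the single field k0 = "osx-extended".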
Related to this, there is draft keyboard layout data currently slated for CLDR 22.0: see Draft Keyboard Charts.