We are pleased to announce that Localization World is organizing a
one-day workshop on Unicode, including an introduction with
Richard Ishida and three additional sessions.
The workshop will take place on the preconference day, June 4, 2012, in Paris.
Richard is an experienced presenter at Unicode conferences and is well
known for his clear and effective presentations.
The Unicode Consortium’s goal is to enable people around the world to
use computers in any language. The Consortium is involved in core
internationalization specifications at the heart of all modern software,
such as the Unicode Standard for character encoding. The Consortium’s
involvement in localization is a key extension of this work. The Unicode
Consortium maintains and extends the Common Locale Data Repository (CLDR),
and in 2011 established the Unicode Localization Interoperability
Technical Committee to improve the interoperability of localization
data interchange.
For more information, including the program of the June
Localization World conference, please see
http://www.localizationworld.com/lwparis2012/program.php .
Helena Chapman, chair, Unicode Localization Interoperability Technical Committee
Ulrich Henes, Donna Parrish and Daniel Goldschmidt, chair, vice-chairs, Localization World Conference Program Committee
Friday, February 17, 2012
Friday, February 10, 2012
Unicode Releases Common Locale Data Repository, Version 21.0
Unicode CLDR 21.0 contains data for 193 languages and 170 territories: 528 locales in all. This release did not include a public data submission phase, and focused on improvements to the LDML structure and tools, and consistency of data.
The main features are updates for Unicode 6.1; a major cleanup of
time zone names, date format data, and delimiters
(“…” vs „…“ vs „…” vs …); the new BCP47 -t- extension; the addition of
ordinal categories (1st, 2nd, …), collation reordering (e.g., Cyrillic
before Latin), multiple numbering systems for a locale, and
abbreviated numbers (e.g., “1.2 B”); and a restructuring of
Chinese calendar data. For more information on other changes
since the 2.0.1 release, see the CLDR 21 Release Note.
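To make the idea of ordinal categories more concrete, here is a minimal Python sketch. It assumes the English ordinal rules as they are commonly stated (categories "one", "two", "few", "other"); the function names and the suffix table are hypothetical and are not taken from the CLDR data files, which should be consulted for the authoritative rules of any locale.

```python
def ordinal_category(n: int) -> str:
    """Return an ordinal category for a non-negative integer.

    Assumption: these conditions mirror the usual English ordinal rules;
    other languages need different rules.
    """
    if n % 10 == 1 and n % 100 != 11:
        return "one"    # 1st, 21st, 101st
    if n % 10 == 2 and n % 100 != 12:
        return "two"    # 2nd, 22nd
    if n % 10 == 3 and n % 100 != 13:
        return "few"    # 3rd, 23rd
    return "other"      # 4th, 11th, 12th, 13th


# Hypothetical per-category suffixes for English.
SUFFIX = {"one": "st", "two": "nd", "few": "rd", "other": "th"}


def format_ordinal(n: int) -> str:
    """Format n with the English suffix selected by its category."""
    return f"{n}{SUFFIX[ordinal_category(n)]}"


if __name__ == "__main__":
    print([format_ordinal(n) for n in (1, 2, 3, 4, 11, 21)])
```

The point of the category indirection is that application code only needs one message per category, while the locale data decides which numbers fall into which category.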
Unicode CLDR is by far the largest and most
extensive standard repository of locale data. This data is
used by a wide spectrum of companies for their software
internationalization and localization: adapting software to
the conventions of different languages for such common
software tasks as formatting of dates, times, time zones,
numbers, and currency values; sorting text; choosing
languages or countries by name; transliterating different
alphabets; and many others. Unicode CLDR 21 is part of
the Unicode locale data project, together with the
Unicode Locale Data Markup Language (LDML:
http://unicode.org/reports/tr35/). LDML is an XML format
used for general interchange of locale data, such as in
Microsoft's .NET.
For web pages with different views of CLDR data,
see
http://cldr.unicode.org/index/charts. For more
information about the Unicode CLDR project (including
charts) see
http://cldr.unicode.org/.
Thursday, February 2, 2012
UTS #10, Unicode Collation Algorithm, Version 6.1 Released
Mountain View, CA, USA – February 2, 2012 – The new version of
Unicode Technical Standard #10, Unicode Collation Algorithm, has been
released, updating to Unicode Version 6.1.
This new version adds a number of features:
- The collation ordering for the 732 new Unicode characters.
- A major revision to the ordering of "variable" characters into groups, separating punctuation and symbols. This change may present migration issues for some implementations.
- Options added for ignoring spaces and punctuation (but not symbols), and for reordering groupings of characters, such as putting Latin characters before Greek (for Greek users), or digits after letters.
- A new section on asymmetric search (where a query of the base character 'e' matches é, è,…, but a query of the more specific é doesn't match other accented versions or the base character).
- Important restructuring and clarifications of other sections.
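The asymmetric search behavior in the list above can be illustrated with a small Python sketch. This is not the Unicode Collation Algorithm, which works on collation weights; it is a simplified character-by-character comparison built on the standard `unicodedata` module, and the function names are hypothetical. It only compares whole strings of equal length and ignores case.

```python
import unicodedata


def _base(ch: str) -> str:
    """First code point of the canonical decomposition, e.g. 'é' -> 'e'."""
    return unicodedata.normalize("NFD", ch)[0]


def _has_marks(ch: str) -> bool:
    """True if the character decomposes into a base plus combining marks."""
    return any(unicodedata.combining(c) for c in unicodedata.normalize("NFD", ch))


def asymmetric_match(query: str, target: str) -> bool:
    """Unaccented query characters match any accented variant in the target;
    accented query characters must match exactly."""
    q = unicodedata.normalize("NFC", query)
    t = unicodedata.normalize("NFC", target)
    if len(q) != len(t):
        return False
    for qc, tc in zip(q, t):
        if _has_marks(qc):
            if qc != tc:          # specific query: exact match only
                return False
        elif qc != _base(tc):     # loose query: compare base characters
            return False
    return True


if __name__ == "__main__":
    print(asymmetric_match("resume", "résumé"))  # loose query
    print(asymmetric_match("résumé", "resume"))  # specific query
```

A real implementation would compare collation elements rather than decomposed characters, so that tailored orderings and contractions are respected.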
Wednesday, February 1, 2012
UTS #46, Unicode IDNA Compatibility Processing, Version 6.1 Released
Mountain View, CA, USA – February 1, 2012 – The new version of
Unicode Technical Standard #46, Unicode IDNA Compatibility Processing,
has been released, updating to Unicode Version 6.1. It adds support
for 528 additional characters in internationalized domain names (IDN).
The specification provides two main features for use with the internationalized domain names specification released in August 2010 (IDNA2008):
- A comprehensive mapping to reflect user expectations for casing and other variants of domain names. This mapping is allowed by IDNA2008, and follows the same principles as in the previous version of that specification (IDNA2003). It thus provides users consistency between old and new versions.
- A compatibility mechanism that supports internationalized domain names valid under the IDNA2003 specification and the IDNA2008 specification. This second feature allows browsers, search engines, and other clients to handle both old and new domain names during the transitional period until registries update their rules to follow IDNA2008.
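As a rough illustration of the kind of mapping described above, the following Python sketch uses the standard library's built-in `idna` codec. Note the assumption: that codec implements the older IDNA2003 rules, not UTS #46 itself, so it only shows the general principle of mapping case and other variants before conversion to the ASCII form. The helper name `to_ascii_idna2003` is hypothetical.

```python
def to_ascii_idna2003(domain: str) -> bytes:
    """Convert a domain name to its ASCII form using Python's built-in
    'idna' codec (IDNA2003 mapping; assumption: not a UTS #46 implementation)."""
    return domain.encode("idna")


if __name__ == "__main__":
    # Uppercase in a non-ASCII label is mapped to lowercase before encoding.
    print(to_ascii_idna2003("Bücher.de"))
    # Under the IDNA2003 mapping, 'ß' is mapped to 'ss'; this is one of the
    # characters that IDNA2003 and IDNA2008 treat differently.
    print(to_ascii_idna2003("faß.de"))
    # The ASCII form decodes back to the mapped (lowercase) name.
    print(b"xn--bcher-kva.de".decode("idna"))
```

A client that follows the compatibility approach would need a full UTS #46 implementation with its own mapping table; the standard codec is used here only because it is available without extra dependencies.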