The Unicode Blog: 2009

Thursday, December 10, 2009

[Unicode Announcement] Unicode Releases Common Locale Data Repository, Version 1.7.2

Mountain View, CA, December 10, 2009 - The Unicode® Consortium
announced today the release of version 1.7.2 of the Unicode Common
Locale Data Repository (CLDR). Unicode CLDR 1.7.2 is an update release,
with no new translations. The main changes are modifications to Unicode
language and locale identifiers to correspond to the recent version of
BCP47.

*About Unicode CLDR*
Unicode CLDR provides key building blocks for software to support the
world's languages. Unicode CLDR is by far the largest and most extensive
standard repository of locale data. This data is used by a wide spectrum
of companies for their software internationalization and localization:
adapting software to the conventions of different languages for such
common software tasks as formatting of dates, times, time zones,
numbers, and currency values; sorting text; choosing languages or
countries by name; transliterating different alphabets; and many others.
For more information about the Unicode CLDR project (including charts)
see http://cldr.unicode.org.

*About the Unicode Consortium*
The Unicode Consortium is a non-profit organization founded to develop,
extend and promote use of the Unicode Standard and related globalization
standards. The membership of the consortium represents a broad spectrum
of corporations and organizations in the computer and information
processing industry: Adobe Systems, Apple, DENIC eG, Google, Government
of India, Government of Tamil Nadu, IBM, Microsoft, Monotype Imaging,
Oracle, SAP, Society for Natural Language Technology Research, Sun
Microsystems, Sybase, The University of California at Berkeley, Yahoo!,
plus well over a hundred Associate, Liaison, and Individual members. For
more information, please contact the Unicode Consortium
(http://www.unicode.org/).

Thursday, November 19, 2009

[Unicode Announcement] New draft Unicode specifications for IDNA and Security

There are new drafts of Unicode specifications connected with IDNA and
Security:

* UTS #46: Unicode IDNA Compatibility Processing
* UTS #39: Unicode Security Mechanisms
* UTR #36: Unicode Security Considerations

The major changes include the following:

http://www.unicode.org/reports/tr46/tr46-2.html

* Major rework based on the UTC and editorial committee decisions.
The text and specification are simplified considerably.
* The FAQ section was separated from the document and rewritten as
an independent FAQ page
* Draft data files were restructured
* Added a comparison table of IDNA2003, UTS46, and IDNA2008

/NOTE: IDNA2008 is not final, and the draft UTS46 may undergo changes
depending on the final form of IDNA2008/

http://www.unicode.org/reports/tr39/tr39-3.html

* The confusable data was revised to add data extracted from a
comparison of font data from Windows and Mac
* Additional mappings were also added, such as "rn" ~ "m"
* The characters recommended for identifiers were updated based on
UAX 31
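The "rn" ~ "m" mapping above can be illustrated with a small sketch. The table here is a hand-written, hypothetical stand-in for the real confusable data, and the names `TOY_CONFUSABLES`, `skeleton`, and `confusable` are illustrative only. The normalize, map, normalize shape is an approximation of the idea, not the procedure as specified in UTS #39.

```python
import unicodedata

# Hypothetical toy table; the real mappings live in the UTS #39 data files.
TOY_CONFUSABLES = {
    "m": "rn",       # the "rn" ~ "m" pair mentioned above
    "\u0430": "a",   # CYRILLIC SMALL LETTER A looks like Latin "a" (assumed entry)
}

def skeleton(text: str) -> str:
    """Reduce text to a comparison form by replacing look-alike characters."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(TOY_CONFUSABLES.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)

def confusable(a: str, b: str) -> bool:
    """Two strings are treated as confusable if their skeletons match."""
    return skeleton(a) == skeleton(b)

print(confusable("corner", "comer"))
```

With this toy table, "comer" and "corner" share a skeleton because "m" is rewritten to "rn" before comparison.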

http://www.unicode.org/reports/tr36/tr36-8.html

* Added new sections
o 3.6 Secure Encoding Conversion
o 3.7 Enabling Lossless Conversion to Unicode

The public review period for these specifications ends on January 26, 2010.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please
use the following link to subscribe (if necessary). Please be aware
that discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above
to generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Tuesday, November 17, 2009

[Unicode Announcement] Two New Public Review Issues (UAX #15, UAX #44 Proposed Updates)

The Unicode Technical Committee has posted two new issues for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new items closes on January 26, 2010.

Please see the page for links to discussion and relevant documents.
Briefly, the new issues are:

Issue #151 Proposed Update UAX #44: Unicode Character Database
http://www.unicode.org/reports/tr44/tr44-5.html
This revision indicates the changed status of several properties as
Deprecated, adds tables listing Deprecated and Stabilized properties,
and extends the discussion of the significance of the
Bidi_Mirroring_Glyph property.

Issue #152 Proposed Update UAX #15: Unicode Normalization Forms
http://www.unicode.org/reports/tr15/tr15-32.html
This revision corrects the definitions of classes of characters in the
Composition Exclusion Table.
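As a rough illustration of what a composition exclusion means in practice, the sketch below uses Python's standard `unicodedata` module. That module follows whatever Unicode version the interpreter ships with, which is not necessarily the version under review here, so treat it as an example of the concept only.

```python
import unicodedata

# "e" followed by COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
composed = unicodedata.normalize("NFC", "e\u0301")

# U+0958 DEVANAGARI LETTER QA is listed as a composition exclusion:
# it decomposes to U+0915 U+093C and NFC does not recompose it.
excluded = unicodedata.normalize("NFC", "\u0958")

print([hex(ord(ch)) for ch in composed])
print([hex(ord(ch)) for ch in excluded])
```

The second string stays decomposed even under NFC, which is the behavior the Composition Exclusion Table exists to define.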

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

Wednesday, October 21, 2009

[Unicode Announcement] Unicode Collation Algorithm Version 5.2 Released

Version 5.2 of the Unicode Collation Algorithm has been released.
See http://www.unicode.org/reports/tr10/.
This version resynchronizes the Unicode Collation Algorithm with all
of the updates for the Unicode Standard, Version 5.2. Please note
the following changes and issues for implementations:

* The text of UTS #10 has been updated. Among other changes, the
revised text for UTS #10 makes it clear that the BASE for
implicit generation of weights for Han characters does not
include unassigned code points.
* There are small changes in Gujarati, Telugu, Malayalam
(including weighting for chillus), Tamil, and Sinhala. While
these changes move in the direction of expected behavior, good
results will only come from tailoring for particular languages,
such as with CLDR.
* There have been significant changes to the ordering of many
combining marks. Many combining marks that are not in customary
use in modern languages now have the same secondary weight, and
will only be distinguished on a fourth level, by code point
ordering. This can be seen by looking at the Unicode Collation
Charts (http://unicode.org/charts/collation/). In 5.2, many
characters now have a white background, indicating that they
sort exactly the same as the previous character, unless a 4th
(codepoint) level is used.
* Implementations of UCA should take note that the increased
number of characters may cause overflows if the implementing
code makes certain assumptions or optimizations. This can result
either from the new character additions (which increase the
number of distinct weights in the table) or because of changes
in the way the weights, particularly for secondary weight
values, are assigned in the table. The latter change may result
in unexpected numbers of characters having the same weight.
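The effect of marks sharing a secondary weight can be sketched with a toy multi-level sort key. The weights below are invented for illustration and are not taken from the UCA data files; `TOY_WEIGHTS` and `sort_key` are hypothetical names, and the key layout is a simplification of what UTS #10 specifies.

```python
# Hypothetical (primary, secondary, tertiary) weights; 0 means "ignorable"
# at that level. Real weights come from the UCA data files.
TOY_WEIGHTS = {
    "a": (10, 1, 1),
    "A": (10, 1, 2),
    "b": (11, 1, 1),
    "\u0301": (0, 2, 1),  # combining acute, given its own secondary weight
    "\u0351": (0, 3, 1),  # two rarely used marks that are assumed here
    "\u0357": (0, 3, 1),  # to share one secondary weight
}

def sort_key(text: str, levels: int = 3) -> tuple:
    """Build a key that compares all primaries, then secondaries, and so on."""
    weights = [TOY_WEIGHTS[ch] for ch in text]
    key = tuple(
        tuple(w[level] for w in weights if w[level]) for level in range(3)
    )
    if levels >= 4:
        # Optional fourth level: fall back to code point order.
        key += (tuple(ord(ch) for ch in text),)
    return key

print(sorted(["b", "A", "a"], key=sort_key))
print(sort_key("a\u0351") == sort_key("a\u0357"))
```

At three levels the two marked strings compare equal; only when `levels=4` is requested does code point order separate them.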

Tuesday, October 20, 2009

[Unicode Announcement] Public Review Issue #150: Draft UTS #46 Updated

The draft UTS#46 Unicode IDNA Compatible Preprocessing has been updated.
There are a number of new review notes pointing out issues and asking
for feedback. There are also new tables: one comparing behavior of
compatibility and escaped versions of FULL STOP in delimiting labels
between different browsers, and one comparing the allowed and disallowed
repertoires when processing IDNs according to the IDNA2003, IDNA2008,
and UTS #46 specifications. There are also many improvements and
clarifications of the text.
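For readers unfamiliar with the compatibility forms of FULL STOP, the sketch below splits a domain name on the four label separators recognized by IDNA2003 (RFC 3490). The helper name `split_labels` is hypothetical. Python's built-in `idna` codec, which implements IDNA2003, is shown only as a point of comparison; the browser behavior compared in the draft tables may differ.

```python
import re

# The label separators recognized by IDNA2003: FULL STOP, IDEOGRAPHIC FULL
# STOP, FULLWIDTH FULL STOP, and HALFWIDTH IDEOGRAPHIC FULL STOP.
LABEL_SEPARATORS = re.compile("[\u002e\u3002\uff0e\uff61]")

def split_labels(domain: str) -> list:
    """Split a domain name into labels on any of the four separators."""
    return LABEL_SEPARATORS.split(domain)

name = "www\u3002example\uff0ecom"
labels = split_labels(name)

# The standard-library IDNA2003 codec treats the same characters as dots.
ascii_form = name.encode("idna")

print(labels)
print(ascii_form)
```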

See: http://www.unicode.org/reports/tr46/

Review period closes October 26, 2009.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

Thursday, October 1, 2009

[Unicode Announcement] Unicode 5.2.0 Released

Unicode 5.2 has been released! The data files, code charts, and Unicode
Standard Annexes for this version are final and are posted on the
Unicode site.

For Unicode 5.2, the core specification is no longer just a delta
document applied to the book; instead, the entire core specification,
with all textual changes integrated, will be available on the Unicode
site. As of this announcement, the first five chapters are available;
the other chapters will follow soon.

For full details about what is new or changed in this release, see the
version documentation for Unicode 5.2 at:

http://www.unicode.org/versions/Unicode5.2.0/

Tuesday, September 29, 2009

[Unicode Announcement] Unicode Haiku Contest

Here's your chance to show what you think of Unicode - with poetry!
Enter the Unicode Haiku contest, and meet the bar set by the immortal Haiku:

Chaos reigns within.
Reflect, repent, and reboot.
Order shall return.
(aka Blue Screen)

The tricorder broke
Communicator is dead
And my shirt is red

The winners are to be announced at the upcoming Unicode Conference, Oct 14-16 (but you don't have to attend the conference to win). The first prize is a myTouch 3G phone, sponsored by Google. (If your company is interested in sponsoring an additional prize, contact Magda Danish, http://www.unicode.org/reporting.html). All submissions must arrive by October 12, 2009.

Please submit your entry at http://unicode.org/conference/haiku.html.
Each entry should be 3 lines, with 5 syllables on the first, 7 on the second, and 5 on the third. You can enter as many different submissions as you want. Submissions are judged based on their relation to Unicode and/or SW Globalization, and most importantly, cleverness and whimsy.

Thursday, September 24, 2009

[Unicode Announcement] Remote Access Registration Now Offered at 33rd Internationalization & Unicode Conference

What's Your Excuse For Not Attending IUC 33?

I can't attend IUC 33 because...

1. chained to my desk
2. don't like flying
3. baby is due
4. standby for jury duty
5. no travel budget

Well, stop the excuses! Attend remotely!

You can attend through the new remote access option for the 33rd Internationalization & Unicode conference!

The conference organizer will be broadcasting via secure connection all of the IUC 33 conference for the first time. Every presentation on every track, including the keynote, will be available. Remotely sit in on presentations from different tracks from the comfort of your home or office. Standard registration fee is US$795, with additional discounts for Unicode and LISA Members. The remote access IUC conference is BYOC (Bring Your Own Coffee).

Or, if you would still prefer to attend in person, visit http://www.unicodeconference.org/vc-e.

About the Internationalization & Unicode Conference
The Internationalization & Unicode Conference is the premier technical conference for both software and Web internationalization. Unicode and internationalization experts, implementers, clients and vendors are invited to attend this unique conference. The program committee has created an exciting program full of new and cutting-edge topics that is relevant and engaging for the internationalization community. The three-day conference will feature a full day of tutorials followed by two days of presentations, panels and discussions. There will also be technology exhibits and demonstrations. The interactive format makes the Internationalization & Unicode Conference a great place to meet and exchange ideas with leading experts, find out about the needs of potential clients, or get information about new and existing Unicode and internationalization-enabled products.

The 33rd Internationalization & Unicode Conference is sponsored by Gold Sponsors Adobe, Inc. and WinSoft; Media Sponsors LISA Globalization Insider and MultiLingual Computing Inc. and Organizational Sponsor Localization Industry Standards Association (LISA).

The hotel registration deadline has been extended to September 30, 2009.

Sponsorships and exhibit space are available; for more information on sponsoring contact Ken Berk at kenberk@omg.org, +1-781-444 0404. For exhibiting questions email event_marketing@omg.org. For all other questions email info@unicodeconference.org.

________________________________

About The Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. Members are: Adobe Systems, Apple, DENIC eG, Google, Government of India, Government of Tamil Nadu, IBM, Microsoft, Monotype Imaging, Oracle, The Society for Natural Language Technology Research, Sun Microsystems, Sybase, The University of California at Berkeley, Yahoo!, plus well over a hundred Associate, Liaison, and Individual members.

For more information, please contact the Unicode Consortium www.unicode.org/contacts.html.

About the Event Producer
OMG™ is the Event Producer for the Internationalization & Unicode Conferences. OMG is an open membership, not-for-profit consortium that produces and maintains computer industry specifications for interoperable enterprise applications. Our specifications include MDA®, UML®, CORBA®, MOF™, XMI® and CWM™. OMG's specifications are all available for download by everyone without charge.

For more information about OMG, visit us online at www.omg.org.

Tuesday, September 15, 2009

[Unicode Announcement] Public Review Issue: Draft UTS #46: Unicode Compatible IDNA Preprocessing

The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on October 26, 2009.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:

http://www.unicode.org/reports/tr46/tr46-2.html

Issue #150 Draft UTS #46: Unicode Compatible IDNA Preprocessing

This document provides a specification for processing that provides for
compatibility between older and newer versions of internationalized
domain names (IDN) in client software (lookup). It allows
applications--browsers, emailers, and so on--to be able to handle both
the original version of internationalized domain names (IDNA2003) and the
newer version (IDNA2008), avoiding possible interoperability and
security problems.
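To make the compatibility concern concrete, the sketch below shows how an IDNA2003 lookup maps a name before it is resolved. It relies on Python's built-in `idna` codec, which implements IDNA2003 only; it is not an implementation of the draft UTS #46 processing, and `to_ascii_idna2003` is a hypothetical helper name.

```python
def to_ascii_idna2003(domain: str) -> str:
    """Convert a domain name to its ASCII form using IDNA2003 rules."""
    return domain.encode("idna").decode("ascii")

# Non-ASCII labels are case-folded and then Punycode-encoded.
print(to_ascii_idna2003("b\u00fccher.de"))
print(to_ascii_idna2003("B\u00dcCHER.de"))

# IDNA2003 maps LATIN SMALL LETTER SHARP S to "ss" before lookup, so the
# name below resolves as a plain ASCII name. IDNA2008 treats that letter
# as a distinct valid character, which is one source of the
# interoperability problems mentioned above.
print(to_ascii_idna2003("fa\u00df.de"))
```

Under IDNA2003 the first two calls yield the same ASCII name, and the third loses the sharp s entirely.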

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

Tuesday, September 8, 2009

[Unicode Announcement] Unicode Collation Algorithm 5.2.0 Beta Data Files Now Available

Version 5.2.0 of The Unicode Collation Algorithm (UCA) is being prepared
for release in parallel with Unicode 5.2. The UCA data files have been
recently updated and are ready for review. Please see the Public Review
Issue:
http://www.unicode.org/review/#pri143
as well as the beta data files and collation test files:
http://www.unicode.org/Public/UCA/5.2.0/

1. The data files contain weights for all new assigned characters.
a. There have been significant changes to the ordering of
many combining marks. Many of those that are not in customary
use in modern languages now have the same secondary weight,
and will only be distinguished on a fourth level, by code
point ordering.
b. The ordering for Tamil and Malayalam has been improved,
but would still need tailoring for the Tamil and Malayalam
languages.
2. The text of UTS#10 has been updated. See the
modifications section for details:
http://www.unicode.org/reports/tr10/tr10-19.html#Modifications

Time is very short for this beta review, which closes on September 23,
2009, so reviewers are urged to download and test the files as soon as
they can.

Feedback should be sent through the usual Error Reporting Form:
http://www.unicode.org/reporting.html

Wednesday, August 26, 2009

[Unicode Announcement] Last Call for Unicode 5.2 Data

The data files in the Unicode Character Database for Unicode 5.2 have
been revised to include all of the authorized changes from the last UTC
meeting. If you use any of the Unicode data in your implementations,
please update a test version of your implementation to use those files
and run your tests. If there are any showstopper bugs, please report
them (using http://www.unicode.org/reporting.html) as soon as possible.

From this point, the only adjustments that will be made to the data
will be on the basis of showstopper bugs, including bugs uncovered in
the process of updating the Unicode Collation data files for UCA 5.2.

[Unicode Announcement] 33rd Internationalization & Unicode Conference - Keynote Speaker Announcement

33rd Internationalization & Unicode Conference
Features Sessions on Security, Open Source,
Social Networking and Cloud Computing

The Unicode® Consortium announces that Nicholas Ostler, Chairman, Foundation for Endangered Languages, will keynote the 33rd Internationalization & Unicode® Conference (IUC). The conference, sponsored by Gold Sponsors Adobe, Inc. and WinSoft, will take place in San Jose, Calif., USA; October 14-16, 2009. For more information and to register please visit www.unicodeconference.org/keynote-e.

Mr. Ostler will present "The Alphabetic Principle and its Enemies."

The alphabetic principle for writing seems brilliantly simple, and its implementation, often subverting other options, has often caused explosive growths in literacy, with important historical consequences for cultural survival. Its great advantages are economy of effort in the learner, and ready application to new languages. However, it has drawbacks as to speed for the initiated user, and also (by being essentially mechanical and phonetic) in representing many of the cultural overtones which people like their written language to have. There is, too, a certain resistance to the role of art in writing. But as alphabetic traditions age, becoming less purely alphabetic, these disadvantages can be reduced. New structures may emerge, meaningful patterns that leave alphabets far behind. Alphabetic scripts have more recently revealed new aspects, defining a convenient order to index anything, inspiring the phonemic principle of structural linguistics, and later mapping more easily than other systems onto digital systems, and hence a whole new set of functions for written language. But the alphabet remains a rather arbitrary means of representing meanings, since its icons are parasitic on the particular sounds of particular words in particular languages, a long way from thoughts.

About the Keynote Presenter
---------------------------
Nicholas Ostler holds an MA in classics, philosophy and economics from Oxford, and a PhD in linguistics from MIT. His first job was teaching in Japan, later consulting on machine translation for Fujitsu. Returning to England, he worked in IT research during the 1980s and '90s, especially with the UK government, and the European Union. He has been Chairman of the Foundation for Endangered Languages (www.ogmios.org) since its inception in 1996. He also edited its newsletter Ogmios until 2006. Within descriptive linguistics, his main research field has been the grammar of the (extinct) Chibcha language of Colombia. He has served on the board of the British National Corpus, the LSA's Committee for Endangered Languages, and on the editorial board of the International Journal of American Linguistics. As a writer, his book "Empires of the Word: a language history of the world" (HarperCollins, 2005) traced the histories of the large literate languages, from Sumerian to English, considering the factors that make for large-scale expansion. Later, "Ad Infinitum: a biography of Latin" (Walker & Co., 2007) considered the attitudes that have accompanied the Latin language throughout its 2,500 year recorded history. He is now at work on a book about the prospects of English as a global lingua franca, in the light of the competition, past and present. This is due for publication in 2010.

About the Internationalization & Unicode Conference
---------------------------------------------------
The Internationalization & Unicode Conference is the premier technical conference for both software and Web internationalization. Unicode and internationalization experts, implementers, clients and vendors are invited to attend this unique conference. The program committee has created an exciting program full of new and cutting-edge topics that is relevant and engaging for the internationalization community. The three-day conference will feature a full day of tutorials followed by two days of presentations, panels and discussions. There will also be technology exhibits and demonstrations. The interactive format makes the Internationalization & Unicode Conference a great place to meet and exchange ideas with leading experts, find out about the needs of potential clients, or get information about new and existing Unicode and internationalization-enabled products.
The 33rd Internationalization & Unicode Conference is sponsored by Gold Sponsors Adobe, Inc. and WinSoft; Media Sponsors LISA Globalization Insider and MultiLingual Computing Inc. and Organizational Sponsor Localization Industry Standards Association (LISA).

The early-bird registration deadline is September 4, 2009; the hotel registration deadline is September 23, 2009. For full conference details and to register, please click here.
Sponsorships and exhibit space are available; for more information on sponsoring contact Ken Berk at kenberk@omg.org, +1-781-444 0404. For exhibiting questions email event_marketing@omg.org. For all other questions email info@unicodeconference.org.

About The Unicode Consortium
----------------------------
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

For more information, please contact the Unicode Consortium www.unicode.org/contacts.html.

About the Event Producer
------------------------
OMG™ is the Event Producer for the Internationalization & Unicode Conferences. OMG is an open membership, not-for-profit consortium that produces and maintains computer industry specifications for interoperable enterprise applications. Our specifications include MDA®, UML®, CORBA®, MOF™, XMI® and CWM™. OMG's specifications are all available for download by everyone without charge.

For more information about OMG, visit us online at www.omg.org.

Tuesday, July 28, 2009

[Unicode Announcement] Unicode 5.2 Beta - Chapters 1-5 Available

The ongoing beta review for Unicode 5.2 has been supplemented
today with the availability of drafts for the first part of the
consolidated text of the Unicode Standard, Version 5.2.

The landing page for Unicode 5.2 summarizes the major
new additions and changes for Version 5.2:

http://www.unicode.org/versions/Unicode5.2.0/

Links to PDF versions of Chapters 1 through 5 of the standard
are available on that page.

We would like to remind folks that the period for beta review
of the Version 5.2 data files and the Unicode Standard Annexes
is rapidly drawing to a close. The meeting of the Unicode
Technical Committee in August will be making the final decisions
on any reported problems in the data or the annexes, so now
is the time to check the posted data files and documents.
See the beta review page for details:

http://www.unicode.org/versions/beta.html

==================================================

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

Wednesday, July 15, 2009

[Unicode Announcement] 33rd Internationalization & Unicode Conference - Program Online

33rd Internationalization & Unicode Conference
Features Sessions on Security, Open Source, Social Networking and Cloud Computing

The Unicode® Consortium announces the program for the 33rd Internationalization & Unicode® Conference (IUC). The conference, sponsored by Gold Sponsor Adobe, Inc., will take place in San Jose, Calif., USA; October 14-16, 2009. The conference program is available online here.

The program committee has created an exciting program full of new and cutting-edge topics that is relevant and engaging for the internationalization community. The three-day conference will feature a full day of tutorials followed by two days of presentations, panels and discussions. There will also be technology exhibits and demonstrations.

Highlights of the Conference:
============================
Tutorials in Three Tracks:
-------------------------
An Introduction to Writing Systems & Unicode
Internationalization: An Introduction
Building a Custom Keyboard Layout for the Mac with Ukulele and XML
Arabic Script: Structure, Geographic and Regional Classification
Unicode - a Grand Tour
Web Internationalization - Standards and Best Practices
Building Multilingual Websites in Joomla [Drupal]
Creating XHTML/HTML Pages with Right-to-Left Scripts
Free Software stack for Unicode Text Rendering
Presenters come from such organizations as DecoType, Amazon, Penn State, Red Hat/GNOME, W3C, XenCraft, and Yahoo! Inc.

Sessions in Three Tracks:
------------------------
* Session tracks are categorized by Programming Languages, Fonts and Typography, Unicode News, and I18n Standards News on Thursday morning; with Open Source Libraries, Assuring Quality, and Scripts tracks in the afternoon. On Friday, track topics include Development Platforms, Mobile Programming, Internationalization in Practice, and Leveraging CLDR in the morning; and Translation Services API, Bidirectional Text, and Case Studies in the afternoon.

* The following is just a small sample of some of the cutting-edge presentations that will be given at IUC 33. For the full program, visit the IUC 33 Web site.

- Internationalization for JavaScript applications
- Emoji in Unicode: Cell Phones Meet the Internet
- Banking in the Cloud: Challenges of Internationalizing Banking Software
- Twanguages of the World: a Language Census of Twitter
- HarfBuzz, the Free and Open OpenType Shaping Engine

* Session Presenters come from such organizations as Adobe, Inc.; Amazon; Apple, Inc.; Casaba Security; DataDirect Technologies; DecoType; UC Berkeley; Google Inc.; HighTech Passport; IBM; Intel Corporation; Microsoft; Monotype Imaging; University of Michigan; XenCraft; Yahoo! Inc.; and Yale University.

The Internationalization & Unicode Conference is the premier technical conference for both software and Web internationalization. Unicode and internationalization experts, implementers, clients and vendors are invited to attend this unique conference. The interactive format makes the Internationalization & Unicode Conference a great place to meet and exchange ideas with leading experts, find out about the needs of potential clients, or get information about new and existing Unicode and internationalization-enabled products.

Gold sponsor: ADOBE - Media Sponsor: Multilingual

The early-bird registration deadline is September 4, 2009; the hotel registration deadline is September 23, 2009. For full conference details and to register, please click here. Sponsorships and exhibit space are available; for more information on sponsoring contact Ken Berk at kenberk@omg.org, +1-781-444 0404. For exhibiting questions email event_marketing@omg.org. For all other questions email info@unicodeconference.org.

--------------------------------------------------------------------------------

About The Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.
The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. Members are: Adobe Systems, Apple, DENIC eG, Google, Government of India, Government of Tamil Nadu, IBM, Microsoft, Monotype Imaging, Oracle, SAP, The Society for Natural Language Technology Research, Sun Microsystems, Sybase, The University of California at Berkeley, Yahoo!, plus well over a hundred Associate, Liaison, and Individual members.
For more information, please contact the Unicode Consortium www.unicode.org/contacts.html.

Wednesday, July 8, 2009

[Unicode Announcement] New Public Review #149: UTS #22; and other PRI updates

The Unicode Technical Committee has posted a new issue for public review
and comment. Details are on the following web page:

http://www.unicode.org/review/

Review period for the new item closes on August 3, 2009.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:

PRI #149
Proposed Update UTS #22: Unicode Character Mapping Markup Language

This proposed update includes editorial fixes and clarifications based
on community feedback. There is a small change in the DTD from version
three to this proposed version five (a new default attribute value). See
the Modification History and the highlighted changes for details.

There are also two updates to open Public Review Issues:

PRI #136
Proposed Update UAX #14: Unicode Line Breaking Algorithm

The text of UAX #14 has been revised throughout, with both substantive
and editorial changes. A new Line_Break class CP has been added, and the
rule LB30 has been reintroduced, to address an edge case involving
breaks around parenthesized letters. More new Southeast Asian scripts
and characters have been added to the Line_Break class SA. The lists of
characters representing each Line_Break class are now exemplary, rather
than exhaustive in the text. Please review the new text carefully.

PRI #134
Proposed Update UAX #9: Unicode Bidirectional Algorithm

The latest revision also includes a new conformance test file, which
implementers should carefully review. See BidiTest.txt in the data files
directory: http://www.unicode.org/Public/5.2.0/ucd/

The closing dates remain the same.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please use
the following link to subscribe (if necessary). Please be aware that
discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above to
generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Wednesday, July 1, 2009

[Unicode Announcement] Draft code charts for Unicode 5.2 beta review

Draft code charts are now available for the Unicode 5.2 beta review.
Please check the code charts carefully to verify correctness of the new
characters added to Unicode 5.2 and to ensure that there are no
regressions for previously encoded characters. The draft code charts
are located in:

http://www.unicode.org/Public/5.2.0/charts/ or
ftp://www.unicode.org/Public/5.2.0/charts/

The Unicode Consortium appreciates its many volunteers, who help ensure
the best possible quality for the published code charts for the Unicode
Standard.

For further information about the beta, please see the beta page
http://www.unicode.org/versions/beta.html and the associated
Public Review Issues page: http://www.unicode.org/review/#148

Monday, June 29, 2009

[Unicode Announcement] Unicode Releases Common Locale Data Repository, Version 1.7.1

Mountain View, CA, June 29, 2009 - The Unicode® Consortium announced
today the release of a new version of the Unicode Common Locale Data
Repository (Unicode CLDR 1.7.1), providing key building blocks for
software to support the world's languages. Unicode CLDR is by far the
largest and most extensive standard repository of locale data. This data
is used by a wide spectrum of companies for their software
internationalization and localization: adapting software to the
conventions of different languages for such common software tasks as
formatting of dates, times, time zones, numbers, and currency values;
sorting text; choosing languages or countries by name; transliterating
different alphabets; and many others.

------------------------------------------------------------------------

CLDR 1.7.1 is an update release, with no new translations. The main
changes are fixes for numbering systems and currencies, but a number of
other bugs were fixed. See the CLDR 1.7.1 Release Note
<http://sites.google.com/site/cldr/index/downloads/cldr-1-7-1> for a
full list of changes. There were no changes in the LDML specification.

------------------------------------------------------------------------

Unicode CLDR 1.7 is part of the Unicode locale data project, together
with the Unicode Locale Data Markup Language (LDML:
http://unicode.org/reports/tr35/). LDML is an XML format used for
general interchange of locale data, such as in Microsoft's .NET. For web
pages with different views of CLDR data, see
http://unicode.org/cldr/charts.html. For more information about the
Unicode CLDR project (including charts) see http://cldr.unicode.org. The
latest features of CLDR will also be showcased at the 33rd
Internationalization and Unicode Conference (IUC) on October 14-16, 2009
in San Jose, CA (see http://unicodeconference.org/).

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop,
extend and promote use of the Unicode Standard and related globalization
standards. The membership of the consortium represents a broad spectrum
of corporations and organizations in the computer and information
processing industry. Members are: Adobe Systems, Apple, DENIC eG,
Google, Government of India, Government of Tamil Nadu, IBM, Microsoft,
Monotype Imaging, Oracle, SAP, The Society for Natural Language
Technology Research, Sun Microsystems, Sybase, The University of
California at Berkeley, Yahoo!, plus well over a hundred Associate,
Liaison, and Individual members.

For more information, please contact the Unicode Consortium
(http://www.unicode.org/).

Tuesday, June 23, 2009

[Unicode Announcement] Unicode 5.2 beta, updates of UAX #9 and UAX #31

As part of the Unicode 5.2 beta, the following proposed updates for
Unicode Standard Annexes have significant new revisions:

===

UAX #31 Unicode Identifier and Pattern Syntax
http://www.unicode.org/reports/tr31/tr31-10.html

The main changes are the addition of characters and new scripts to
tables for:

* Candidates for Inclusion in Identifiers
* Candidate Characters for Exclusions from Identifiers
* Recommended Scripts
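For readers who want a quick feel for identifier syntax in practice, the following minimal sketch uses Python's built-in `str.isidentifier()`, whose rules are based on the XID_Start and XID_Continue properties described in UAX #31. It reflects whatever Unicode version the running Python interpreter ships with and Python's own profile of the rules, not the candidate tables in the proposed update; the sample strings are arbitrary.

```python
# Sample strings chosen for illustration only.
candidates = ["café", "Ελλάδα", "x1", "1x", "a-b"]

# str.isidentifier() applies Python's UAX #31-based identifier definition.
results = {text: text.isidentifier() for text in candidates}

for text, ok in results.items():
    print(f"{text!r}: {'identifier' if ok else 'not an identifier'}")
```

Letters from non-Latin scripts are accepted, while a leading digit or a hyphen disqualifies the string.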

===

UAX #9 Unicode Bidirectional Algorithm
http://www.unicode.org/reports/tr9/tr9-20.html

The main changes are the addition of:

* a section on Bidi Conformance Testing
* the Bidi_Class BN to Rule X6 (removing certain characters from
Bidi processing)
* a clause in HL6 providing for mirroring of R and AL characters in
certain circumstances
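The Bidi_Class values named above (BN, R, AL) can be inspected with Python's standard `unicodedata` module, as in the minimal sketch below. The sample characters are arbitrary, and the values come from the Unicode data bundled with the running interpreter rather than from the 5.2 beta files. The `mirrored` flag shown is the Bidi_Mirrored character property, which is related to, but not the same as, the HL6 clause mentioned above.

```python
import unicodedata

# A few sample characters, chosen for illustration.
samples = {
    "LATIN A": "A",
    "HEBREW ALEF": "\u05D0",
    "ARABIC ALEF": "\u0627",
    "ZERO WIDTH SPACE": "\u200B",
}

# Bidi_Class of each sample: L, R, AL, BN, and so on.
bidi_classes = {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}

# Bidi_Mirrored property: True for characters such as parentheses.
is_mirrored = {ch: bool(unicodedata.mirrored(ch)) for ch in "(A"}

for label, bidi_class in bidi_classes.items():
    print(f"{label}: {bidi_class}")
```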

===

More details are in the Modifications section of each document, and
there remain some editorial notes asking for feedback on particular
issues. Feedback for both of these documents is solicited by August 3, 2009.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

If you wish to discuss issues on the Unicode mail list, then please use
the following link to subscribe (if necessary). Please be aware that
discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above to
generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html

Wednesday, June 10, 2009

[Unicode Announcement] Unicode 5.2.0 Beta Review Starts

The next version of the Unicode Standard will be Version 5.2.0. The beta
information page for Unicode 5.2.0 is located at:

http://www.unicode.org/versions/beta.html

This version is planned for release in October 2009. A beta version of
the 5.2.0 Unicode Character Database files is also available for public
comment. We strongly encourage implementers to download these files and
test them with their programs, well before the end of the beta period,
August 3, 2009. These files are located in:

http://www.unicode.org/Public/5.2.0/
ftp://www.unicode.org/Public/5.2.0/

For detailed information and guidance on how to focus your review, see
the section Notable Issues for Beta Testers on the beta page.

The beta information page tells how to report comments and initiate
discussions.

Thursday, May 28, 2009

[Unicode Announcement] New Public Review Issue #147: Proposed Deprecation of U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW

The Unicode Technical Committee has posted a new issue for public
review and comment. Details are on the following web page:

http://www.unicode.org/review/

The review period for the new item closes on August 3, 2009.

Please see the page for links to discussion and relevant documents.
Briefly, the new issue is:

PRI #147: Proposed Deprecation of U+0673 ARABIC LETTER ALEF WITH WAVY
HAMZA BELOW

The UTC has recently approved a proposal to encode an ARABIC WAVY HAMZA
BELOW for a future version of the Unicode Standard. That character is
used productively in Kashmiri and other languages, and is applied to
letters other than ALEF. The intent is to deprecate the existing
character U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW, in favor of
the sequence of an ALEF plus the new ARABIC WAVY HAMZA BELOW. (Because
of normalization stability constraints, a canonical equivalence relation
cannot be established.)
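The absence of a canonical equivalence can be observed directly. The minimal sketch below uses Python's standard `unicodedata` module to show that U+0673 has no decomposition mapping and is left unchanged by every normalization form, in contrast with U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE, which does decompose. The contrast character is an illustrative choice, not part of the review issue, and results reflect the Unicode data bundled with the running interpreter.

```python
import unicodedata

ALEF_WAVY_HAMZA_BELOW = "\u0673"
ALEF_MADDA_ABOVE = "\u0622"  # contrast case, chosen for illustration

name = unicodedata.name(ALEF_WAVY_HAMZA_BELOW)

# An empty string means the character has no decomposition mapping.
decomposition = unicodedata.decomposition(ALEF_WAVY_HAMZA_BELOW)

# With no mapping, normalization leaves the character untouched in all four forms.
unchanged = all(
    unicodedata.normalize(form, ALEF_WAVY_HAMZA_BELOW) == ALEF_WAVY_HAMZA_BELOW
    for form in ("NFC", "NFD", "NFKC", "NFKD")
)

# A character with a canonical decomposition, for comparison.
contrast = unicodedata.decomposition(ALEF_MADDA_ABOVE)

print(name, repr(decomposition), unchanged, contrast)
```

Because the stability policy freezes existing normalization results, a sequence of ALEF plus a newly encoded combining mark would compare as distinct from U+0673 under normalization, which is why deprecation rather than equivalence is being proposed.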

The UTC is seeking feedback on whether U+0673 should be deprecated when
ARABIC WAVY HAMZA BELOW is encoded. Pertinent information would include
data on how widespread usage of this character is. Note that deprecation
of a character does not mean removal of that character from the
standard; it merely constitutes a strong recommendation not to use the
character.

If you have comments for official UTC consideration, please post them by
submitting your comments through our feedback & reporting page:

http://www.unicode.org/reporting.html

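If you wish to discuss issues on the Unicode mail list, then please use
the following link to subscribe (if necessary). Please be aware that
discussion comments on the Unicode mail list are not automatically
recorded as input to the UTC. You must use the reporting link above to
generate comments for UTC consideration.

http://www.unicode.org/consortium/distlist.html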

Thursday, May 21, 2009

[Unicode Announcement] 33rd IUC - Call for Participation Deadline EXTENDED to Thursday, May 28th

33rd Internationalization & Unicode Conference
REMINDER: Call for Participation Deadline EXTENDED to Thursday, May 28th

The Unicode® Consortium announces a call for participation in The Thirty-third Internationalization & Unicode® Conference (IUC 33), taking place in San Jose, Calif., USA; October 14-16, 2009. The call for participation runs until Thursday, May 28. The annual conference is produced by OMG™. Details about the conference and the call for participation are available at http://www.unicodeconference.org/iuc33call.

The Internationalization & Unicode Conference is the premier annual technical conference focusing on multilingual, global software and Web internationalization. Each IUC conference features a variety of tutorials and conference sessions that cover current topics related to Web and software internationalization, globalization, and Unicode.

IUC 33 will include sessions with a special focus on emerging technologies, social networks, cloud computing, evolving standards, and best practices in internationalization for management, development and testing. Organizations are specifically invited to submit proposals on their real-world experiences with globalization efforts. The conference Program Committee is also seeking technical and business-focused presentations on case studies, experience reports, evaluations or research papers on topics relevant to (but not limited to):

New and upcoming globalization, internationalization and Unicode technologies
Internationalized Domain Names
Implementation of Unicode, including new scripts and characters in Unicode 5.1
Current state of font development with regard to Unicode
Unicode in "the cloud"
Social networking and its impact on globalization and internationalization issues
Unicode conformance and international standards compliance issues
Common Locale Data Repository (CLDR)
Internationalization or enabling of applications or Web sites
Working with multilingual text and data
Global development best practices
Security and data-exchange issues
Business cases and technical issues for globalized software
Publishing and broadcasting for a global audience
Encoding and Internationalization challenges for governments
Unicode in the library and in university curricula
Internationalization of Web services and XML-based data formats
Support of South Asian and African languages

Tutorial Sessions are an important part of the conference. The Program Committee is seeking proposals on topics of interest to general software users, to project and program managers, and to technical attendees who need to build basic knowledge of Unicode and software internationalization. Tutorial topics can also include (but aren't limited to):

Best practices in localization process and technology
Users: making the most of international features in common applications
Unicode and internationalization in programming languages
Solutions for handling complex scripts
Platform technologies for internationalization
Program and project management of internationalization
Strategies and best practices for managing multilingual sites & content

Tutorial presenters receive complimentary registration and two nights' lodging. Session presenters receive a fifty percent conference discount and two nights' lodging. See the web site for full details and restrictions.

Proposals for panel sessions in any of the above areas are also welcomed. Interested individuals or organizations are invited to submit a brief (up to 600 word) abstract of their proposed conference presentation by Thursday, May 28 using this web form: http://www.unicodeconference.org/abstracts.

The Program Committee will select presentations for inclusion in the program and notify authors by Tuesday, June 9. Final presentation materials will be required from all selected presenters by Monday, August 28. The conference agenda will be available by Friday, June 12, and posted at www.unicodeconference.org.

Internationalization and Unicode experts, implementers, clients, teachers and vendors are invited to attend this unique conference. The interactive format makes the Internationalization & Unicode Conference a great place to meet and exchange ideas with leading experts, find out about the needs of potential clients, or get information about new and existing Unicode-enabled products.

--------------------------------------------------------------------------------

About The Unicode Consortium
The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards.

The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. Members are: Adobe Systems, Apple, DENIC eG, Google, Government of India, Government of Tamil Nadu, IBM, Microsoft, Monotype Imaging, NetApp, Oracle, SAP, Sun Microsystems, Society for Natural Language Technology Research, Sybase, The University of California at Berkeley, Yahoo!, plus well over a hundred Associate, Liaison, and Individual members.

For more information, please contact the Unicode Consortium http://www.unicode.org.

For more information about OMG, visit us online at http://www.omg.org.

Tuesday, May 12, 2009

[Unicode Announcement] 33rd Internationalization & Unicode Conference - Call for Participation

33rd Internationalization & Unicode Conference
REMINDER: Call for Participation Deadline - Friday, May 22nd

The Unicode® Consortium announces a call for participation in The Thirty-third Internationalization & Unicode® Conference (IUC 33), taking place in San Jose, Calif., USA; October 14-16, 2009. The call for participation runs until Friday, May 22. The annual conference is produced by OMG™. Details about the conference and the call for participation are available at http://www.unicodeconference.org/iuc33call.

* New and upcoming globalization, internationalization and Unicode technologies
* Internationalized Domain Names
* Implementation of Unicode, including new scripts and characters in Unicode 5.1
* Current state of font development with regard to Unicode
* Unicode in "the cloud"
* Social networking and its impact on globalization and internationalization issues
* Unicode conformance and international standards compliance issues
* Common Locale Data Repository (CLDR)
* Internationalization or enabling of applications or Web sites
* Working with multilingual text and data
* Global development best practices
* Security and data-exchange issues
* Business cases and technical issues for globalized software
* Publishing and broadcasting for a global audience
* Encoding and Internationalization challenges for governments
* Unicode in the library and in university curricula
* Internationalization of Web services and XML-based data formats
* Support of South Asian and African languages

* Best practices in localization process and technology
* Users: making the most of international features in common applications
* Unicode and internationalization in programming languages
* Solutions for handling complex scripts
* Platform technologies for internationalization
* Program and project management of internationalization
* Strategies and best practices for managing multilingual sites & content

Proposals for panel sessions in any of the above areas are also welcomed. Interested individuals or organizations are invited to submit a brief (up to 600 word) abstract of their proposed conference presentation by Friday, May 22 using this web form: www.unicodeconference.org/abstracts.

Sponsorships and exhibit space are available; for more information on sponsoring contact Ken Berk at kenberk@omg.org, +1-781-444 0404. For exhibiting questions email event_marketing@omg.org. For all other questions email info@unicodeconference.org.

Friday, May 8, 2009

Unicode Releases Common Locale Data Repository, Version 1.7

Mountain View, CA, May 8, 2009 - The Unicode® Consortium announced today the release of the new version of the Unicode Common Locale Data Repository (Unicode CLDR 1.7), providing key building blocks for software to support the world's languages. Unicode CLDR is by far the largest and most extensive standard repository of locale data. This data is used by a wide spectrum of companies for their software internationalization and localization: adapting software to the conventions of different languages for such common software tasks as formatting of dates, times, time zones, numbers, and currency values; sorting text; choosing languages or countries by name; transliterating different alphabets; and many others.

CLDR 1.7 contains data for 146 languages and 159 territories: 468 locales in all. Version 1.7 of the repository contains over 21% more locale data than the previous release, with over 40,000 new or modified data items from over 140 different contributors. Major contributors to CLDR 1.7 include Adobe, Apple, Google, IBM, and Sun, plus official representatives from a number of countries. Many other organizations and volunteers around the globe, including Gnome, Kotoistus, LISA, OpenOffice, and Utilika, have also made important contributions. The data for CLDR is gathered through the CLDR Survey Tool, which allows organizations and volunteers to contribute, compare, and vet locale data. In the development of this release, the process of gathering data was sped up, and the voting process was simplified.

The new features of Unicode CLDR 1.7 include:

New and improved data, including Indic data
Enhanced number system support, including many non-decimal formats as well as spelled-out forms ("twenty-three")
Postal code format validity
New IETF BCP 47 (RFC 4646) support
Calendar preference data
Improved language population data, and language-script mapping data
Local DTD access
Improved currency symbols
Clarified specification of timezone parsing
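To make the BCP 47 item above more concrete, here is a minimal sketch that splits a simple language tag into language, script, and region subtags and applies the conventional casing. It deliberately covers only that subset of the syntax: variants, extensions, private-use subtags, and grandfathered tags are out of scope, and no subtag is checked against the registry. The function name and the regular expression are illustrative assumptions, not part of CLDR or of any library.

```python
import re
from typing import Optional, Tuple

# Simplified pattern: language (2-3 letters), optional script (4 letters),
# optional region (2 letters or 3 digits). This is a subset of BCP 47 only.
_SIMPLE_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def parse_simple_tag(tag: str) -> Optional[Tuple[str, Optional[str], Optional[str]]]:
    """Return (language, script, region) for a simple tag, or None if it does not match."""
    match = _SIMPLE_TAG.match(tag)
    if match is None:
        return None
    language = match.group("language").lower()
    script = match.group("script")
    region = match.group("region")
    # Conventional casing: language lowercase, script titlecase, region uppercase.
    return (
        language,
        script.title() if script else None,
        region.upper() if region else None,
    )


print(parse_simple_tag("zh-hant-tw"))
print(parse_simple_tag("es-419"))
```

A production implementation would rely on a maintained library and the subtag registry rather than a hand-written pattern.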

For more information about the Unicode CLDR project (including charts) see http://cldr.unicode.org. The latest features of CLDR will also be showcased at the 33rd Internationalization and Unicode Conference (IUC) on October 14-16, 2009 in San Jose, CA — see http://unicodeconference.org/.

Thursday, May 7, 2009

Call for Participation: 33rd Internationalization & Unicode Conference

Contact: Stephanie Covert, OMG, +1-843-737 0637, info@unicodeconference.org

Call for Participation: 33rd Internationalization & Unicode Conference San Jose, Calif., USA; October 14-16, 2009

Mountain View, CA, USA – April 2, 2009 – The Unicode® Consortium today announced a call for participation in The Thirty-third Internationalization & Unicode® Conference (IUC 33), taking place in San Jose, Calif., USA; October 14-16, 2009. The call for participation runs until Friday, May 22. The annual conference is produced by OMG™. Details about the conference and the call for participation are available at http://www.unicodeconference.org/iuc33call.

New and upcoming globalization, internationalization and Unicode technologies
Internationalized Domain Names
Implementation of Unicode, including new scripts and characters in Unicode 5.1
Current state of font development with regard to Unicode
Unicode in “the cloud”
Social networking and its impact on globalization and internationalization issues
Unicode conformance and international standards compliance issues
Common Locale Data Repository (CLDR)
Internationalization or enabling of applications or Web sites
Working with multilingual text and data
Global development best practices
Security and data-exchange issues
Business cases and technical issues for globalized software
Publishing and broadcasting for a global audience
Encoding and Internationalization challenges for governments
Unicode in the library and in university curricula
Internationalization of Web services and XML-based data formats
Support of South Asian and African languages

Best practices in localization process and technology
Users: making the most of international features in common applications
Unicode and internationalization in programming languages
Solutions for handling complex scripts
Platform technologies for internationalization
Program and project management of internationalization
Strategies and best practices for managing multilingual sites & content

Proposals for panel sessions in any of the above areas are also welcomed. Interested individuals or organizations are invited to submit a brief (up to 600 word) abstract of their proposed conference presentation by Friday, May 22 using this web form: http://www.unicodeconference.org/abstracts.
