Regular expressions are used throughout much of the world's software for matching and manipulating text. UTS #18: Unicode Regular Expressions provides the foundation for the
handling of Unicode text in those expressions.
Version 17 of this standard adds the Bidi_Paired_Bracket and Bidi_Paired_Bracket_Type properties, both new in Unicode 6.3, and it expands the guidelines and requirements for support of the Script_Extensions property.
Tuesday, November 19, 2013
Friday, November 15, 2013
Unicode Security Standard version 6.3 Released
Version 6.3 of UTS #39: Unicode Security
Mechanisms has been released. Because the Unicode Standard contains
such a large number of characters for the writing systems of the world,
caution is necessary to avoid exposing programs and systems to possible
security attacks. This document provides mechanisms for reducing the risk of
problems, while the associated
UTR #36: Unicode Security
Considerations describes a variety of security considerations for
Unicode and guidelines for dealing with them.
UTS #39 includes a new Restriction Level
(Single Script), and a number of clarifications for confusable
detection, restriction levels, and optional detection. It also contains a
new section describing how the identifier data is generated. That identifier
data has been expanded to include certain characters from
UAX #31: Unicode Identifier
and Pattern Syntax, a few extra characters allowed in IDNA2008
(Internationalized Domain Names for Applications,
http://tools.ietf.org/html/rfc5890), and certain characters based on
user feedback. The version numbering has also been changed to align with
versions of the Unicode Standard.
The associated UTR #36 has some smaller
changes: a few important corrections and new
sections discussing security issues with transitivity and idempotence. There
are also a few related new FAQ entries on
http://www.unicode.org/faq/security.html.
Wednesday, November 13, 2013
Version 6.3 of UTS #46, Unicode IDNA Compatibility Processing
Unicode Technical Standard #46 version 6.3 has been released, synchronized with Unicode
6.3. The data tables are identical to those of the previous version,
with the exception of the five new Bidi_Control characters. The
table derivation has been modified to forbid Bidi_Control
characters, now and in the future: this is consistent with the
intent of IDNA2003, and with the treatment of these characters
in IDNA2008.
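
As a rough illustration of what forbidding Bidi_Control characters means in practice, the following Python sketch flags any such character in a candidate label. It is not the UTS #46 processing algorithm, which works from the published data tables. The set below simply hardcodes the code points that carry the Bidi_Control property as of Unicode 6.3, and the names BIDI_CONTROL, find_bidi_controls, and is_label_allowed are hypothetical, chosen only for this example.

```python
# Illustrative sketch only; not the UTS #46 algorithm.
# Code points with the Bidi_Control property as of Unicode 6.3.
# U+061C and U+2066..U+2069 are the five added in Unicode 6.3.
BIDI_CONTROL = frozenset(
    [0x061C, 0x200E, 0x200F]           # ALM, LRM, RLM
    + list(range(0x202A, 0x202F))      # LRE, RLE, PDF, LRO, RLO
    + list(range(0x2066, 0x206A))      # LRI, RLI, FSI, PDI
)


def find_bidi_controls(label):
    """Return (index, 'U+XXXX') pairs for each Bidi_Control character in label.

    Hypothetical helper name, used only for this example.
    """
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(label)
        if ord(ch) in BIDI_CONTROL
    ]


def is_label_allowed(label):
    """True if label contains no Bidi_Control characters (hypothetical check)."""
    return not find_bidi_controls(label)
```

For example, find_bidi_controls("ab\u2066c") reports the isolate at index 2, and is_label_allowed rejects that label. A real implementation would consult the mapping table shipped with the standard rather than a hardcoded set.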