#522 — Use Unicode character names consistently


The specification in many places references Unicode characters, sometimes using the names provided by the Unicode standard, but often using other names of unknown provenance. Sometimes, as in the Quote algorithm in 15.12.3, two different names are used for the same character (reverse solidus vs. backslash).

I'd suggest consistently using the Unicode character names, along with their code point value or UTF-16 code unit value, throughout the document.
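As a rough illustration of the suggested "code point plus Unicode name" form, here is a minimal Python sketch built on the standard `unicodedata` module. The helper name `describe` and the `<unnamed>` placeholder are assumptions made for this example, not anything taken from the specification.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character.

    Control characters have no formal name in the Unicode database,
    so they fall back to the placeholder '<unnamed>' (an assumption
    of this sketch).
    """
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name}"


print(describe("/"))   # U+002F SOLIDUS
print(describe("\\"))  # U+005C REVERSE SOLIDUS
```

Control characters such as line feed come back as `<unnamed>` here, which is one reason names like LINE FEED (LF) need separate handling.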


I think I have switched them all to the official Unicode names.

Fixed in rev26 editor's draft

(but we'll see if any others turn up).


fixed in rev26


Checked in rev 26 draft: Searching the document still finds one or more occurrences of:
- slash (which the Unicode Standard calls SOLIDUS)
- backslash (REVERSE SOLIDUS)
- quote (QUOTATION MARK or APOSTROPHE)
- open left bracket or opening left bracket (LEFT SQUARE BRACKET)
- closing right bracket or right bracket (RIGHT SQUARE BRACKET)
- underscore (LOW LINE)
- brace or curly brace (LEFT CURLY BRACKET or RIGHT CURLY BRACKET)
- BYTE ORDER MARK (ZERO WIDTH NO-BREAK SPACE)
- FORM FEED (FORM FEED (FF))
- LINE FEED (LINE FEED (LF))
- CARRIAGE RETURN (CARRIAGE RETURN (CR))
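A search like the one described above could be sketched in Python roughly as follows. The `INFORMAL_NAMES` table and the function `find_informal_names` are hypothetical, cover only a few of the listed names, and are not the tooling actually used to check the draft.

```python
import re
import unicodedata

# Hypothetical table: informal name -> the character it denotes.
INFORMAL_NAMES = {
    "slash": "/",
    "backslash": "\\",
    "underscore": "_",
}


def find_informal_names(text):
    """Return (line_number, informal_name, unicode_name) for each hit."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, INFORMAL_NAMES)) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            informal = match.group(1).lower()
            official = unicodedata.name(INFORMAL_NAMES[informal])
            hits.append((line_number, informal, official))
    return hits


sample = "a backslash here\nuse slash or Underscore"
for hit in find_informal_names(sample):
    print(hit)
```

The word-boundary anchors keep "slash" from matching inside "backslash", so each informal name is reported once with its official counterpart.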


fixed again in rev29 editor's draft


fixed in rev29


In rev29, the code point value is still not consistently being used alongside the canonical symbol name.

Section 11, for example: s/SOLIDUS/U+002F SOLIDUS/
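The substitution above can be generalized. The following Python sketch prefixes bare Unicode names with their code point and leaves names that already carry a prefix untouched. The function name `add_code_points` is an assumption of this example, and it relies on `unicodedata.lookup` knowing each supplied name.

```python
import re
import unicodedata


def add_code_points(text, names):
    """Prefix each bare Unicode name in `text` with its U+XXXX value."""
    # Longest names first, so REVERSE SOLIDUS is tried before SOLIDUS.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(
        r"(U\+[0-9A-F]{4,6} )?\b("
        + "|".join(map(re.escape, ordered))
        + r")\b"
    )

    def replace(match):
        if match.group(1):
            # Already carries a code point prefix: leave unchanged.
            return match.group(0)
        name = match.group(2)
        code_point = ord(unicodedata.lookup(name))
        return f"U+{code_point:04X} {name}"

    return pattern.sub(replace, text)


names = ["SOLIDUS", "REVERSE SOLIDUS", "LEFT CURLY BRACKET"]
print(add_code_points("escape SOLIDUS and REVERSE SOLIDUS", names))
```

Because prefixed names are matched whole and returned as-is, running the function twice gives the same result as running it once.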


11.8.4: REVERSE SOLIDUS (\), CARRIAGE RETURN (CR), LINE SEPARATOR, PARAGRAPH SEPARATOR, and LINE FEED (LF).


13.4 LEFT CURLY BRACKET

21.2.3.1 REVERSE SOLIDUS


24.3.2

QUATION MARK (sic)
LEFT CURLY BRACKET
COMMA
RIGHT CURLY BRACKET
COLON
LEFT SQUARE BRACKET
RIGHT SQUARE BRACKET


fixed QUATION spelling in 24.3.2


fixed in rev36 editor's draft

or at least the ones listed in Comment 6 - Comment 9


in rev36