#513 — Clarify Unicode processing in eval

The specification of eval() needs to clarify how its argument x is converted to the sequence of Unicode code points expected by parsing. In particular, it needs to clarify that unpaired surrogates are converted to their corresponding surrogate code points and are not treated as errors or converted to a fallback character. An inverse of the UTF-16 Encoding function in clause 6 would be the best solution.
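The requested mapping can be illustrated with a minimal sketch (this is illustrative Python, not the spec's algorithm text): a lead/trail surrogate pair combines into a supplementary code point, while an unpaired surrogate passes through as its own code point value instead of raising an error or becoming U+FFFD.

```python
def utf16_decode(code_units):
    """Map a sequence of UTF-16 code units to a sequence of code points.

    Illustrative sketch of the inverse of UTF-16 encoding: a valid
    surrogate pair yields a supplementary code point; an unpaired
    surrogate is kept as its own code point value, never treated as
    an error or replaced by a fallback character.
    """
    points = []
    i = 0
    while i < len(code_units):
        cu = code_units[i]
        if (0xD800 <= cu <= 0xDBFF and i + 1 < len(code_units)
                and 0xDC00 <= code_units[i + 1] <= 0xDFFF):
            # Lead surrogate followed by trail surrogate: combine.
            points.append(0x10000
                          + ((cu - 0xD800) << 10)
                          + (code_units[i + 1] - 0xDC00))
            i += 2
        else:
            # BMP code unit or unpaired surrogate: value passes through.
            points.append(cu)
            i += 1
    return points

# A paired surrogate sequence decodes to one supplementary code point:
# utf16_decode([0xD83D, 0xDE00]) -> [0x1F600]
# An unpaired lead surrogate survives as-is:
# utf16_decode([0xD800, 0x0041]) -> [0xD800, 0x0041]
```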

Actually, I provided just such an inverse function, and it found a home in section 8.4 :-)

corrected in editor's draft

fixed in rev10, Sept. 27 2012 draft

Checked in rev 26 draft: There's now a cross-reference to section 10.1.1, but that's not the right section to reference - it describes the mapping from a code point to UTF-16 code units, while we need the mapping from a sequence of UTF-16 code units to a sequence of code points. The algorithm that needs to be referenced here is the one at the end of section 6.1.1.

(This bug was actually fixed correctly in rev 10, but somewhere along the way the reference was changed to the wrong section.)

fixed in rev27 editor's draft

fixed in rev27 draft

Verified in rev 28 draft.