#277 — Keywords vs unicode escapes in IdentifierName


Section 7.6 isn't particularly clear on whether identifier names with Unicode escapes can be treated as keywords. The only relevant text is the sentence "All interpretations of identifiers within this specification are based upon their actual characters regardless of whether or not an escape sequence was used to contribute any particular characters." It is probably intended, but not clearly stated, that this overrides/amends the formal grammar definition.

Implementations differ in their interpretation. For example,

v\u0061r x = 0
eval("v\\u0061r y = 1")

is accepted by FF 10 and rejected by V8 3.8, while JSC seems to reject the former but accept the latter. Conversely,

var v\u0061r = 1
eval("var v\\u0061r = 2")

is rejected by FF but accepted by V8 (actually introducing a variable named "var").
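
The two cases above can be checked mechanically in any given engine. The following is only a rough sketch: the helper name `parses` and the probe labels are made up for illustration, and it uses `new Function` instead of `eval` so that nothing is actually executed. The results depend entirely on the engine it runs in.

```javascript
// Returns true if the engine parses `source` as a function body
// without raising a SyntaxError. The code is parsed, never run.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected, so surface it
  }
}

// Hypothetical labels for the two cases discussed above.
// The doubled backslash keeps the escape intact inside the string,
// so the parser sees v\u0061r, not var.
const probes = {
  escapedKeywordUsedAsKeyword: "v\\u0061r x = 0",
  escapedKeywordUsedAsIdentifier: "var v\\u0061r = 1",
};

const results = {};
for (const [label, source] of Object.entries(probes)) {
  results[label] = parses(source);
}
console.log(results);
```

An engine that reports `true` for the first probe treats the escaped name as the keyword; one that reports `true` for the second treats it as an ordinary identifier.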


From http://mathiasbynens.be/notes/javascript-identifiers:

For compatibility reasons, browsers must support identifiers that unescape to a reserved word, as long as at least one character is escaped using a Unicode escape sequence. For example, `var var;` wouldn’t work, but e.g. `var v\u0061r;` would — even though strictly speaking, the ECMAScript spec disallows it. Subsequent use of such identifiers must also have at least one character escaped (otherwise the reserved word will be used instead), but it doesn’t have to be the same character(s) that were originally used to create the identifier. For example, `var v\u0061r = 42; alert(va\u0072);` would alert `42`.

This is documented here: http://wiki.whatwg.org/wiki/Web_ECMAScript#Identifiers


http://wiki.whatwg.org/wiki/Web_ECMAScript#Identifiers shouldn't be used to justify any particular interpretation of the ES specification. There is nothing normative about that document, and the section on Identifiers is actually labelled as "this is very rough". Nobody should be making implementation decisions on the basis of that document.

There may well be web compatibility requirements that are not covered by the current ES5.1 spec. We should try to understand those requirements before implementations start trying to match each other's deviations from the spec. The differences between implementations show that there currently isn't complete interoperability in this area, so we shouldn't be creating interoperability among spec. deviations (and with it future compatibility requirements) unless we actually decide we want those deviations to be part of the standard language.

On es-discuss Andreas reported that Waldemar told him that keywords are supposed to be recognized before canonicalization. Waldemar's observation may well be correct for ES<=3. He would know, and if that was the intent then I can see how the spec. could be read in that manner. But for ES5 we rewrote that portion of the specification and introduced the concept of IdentifierName as a lexical category that includes both ReservedWord and Identifier and all the escape and canonicalization language was applied to IdentifierName rather than Identifier. This was all intentional. We certainly didn't expect true and tru\u0065 to be recognized as different identifier names in newly allowed contexts such as:

obj.true == obj.tru\u0065
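
As a small illustration of that intent (a sketch only; the object and variable names are made up, and an engine that follows the ES5 IdentifierName rules is assumed):

```javascript
// In property-name position a reserved word is just an IdentifierName,
// and the escape only contributes the character "e".
const obj = { true: "value" };

const viaPlain = obj.true;
const viaEscape = obj.tru\u0065; // same IdentifierName, spelled with an escape

// Under the reading described above, both accesses name the same property.
const sameProperty = viaPlain === viaEscape;
console.log(sameProperty);
```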

I'm pretty sure that no reviewers brought up issues related to an ES3 interpretation of unicode escapes as keyword escapes.

At this point I think we need to do two things:

1) Understand the actual browser interop situation. For example, do all major browsers accept:
var tru\u0065;
2) Within the constraints of 1) decide what we actually want to specify. Do we want
console.log(fals\u0065)
to print "false" or "undefined"?
3) For ES6 we have to decide how \u{0065} fits in.


To echo the discussion in https://bugzilla.mozilla.org/show_bug.cgi?id=744784:

(In reply to comment #2)
> At this point I think we need to do two things:
>
> 1) Understand the actual browser interop situation. For example, do all major
> browsers accept:
> var tru\u0065;

The latest versions of Firefox and IE throw an error in this case. Other browsers accept it. Firefox used to support this syntax, but removed this non-standard extension a little over half a year ago, and hasn’t received any reports of compatibility problems since.

Let’s try to get other browsers/engines to remove this non-standard extension as well. I’ve filed the following bugs:

* Opera/Carakan bug: https://bugs.opera.com/browse/DSK-369398
* Chrome/V8: http://code.google.com/p/v8/issues/detail?id=2222
* Safari/JavaScriptCore: https://bugs.webkit.org/show_bug.cgi?id=90678


(In reply to comment #2)
> At this point I think we need to do two things:
>
> 1) Understand the actual browser interop situation. For example, do all major
> browsers accept:
> var tru\u0065;
> 2) Within the constraints of 1) decide what we actually want to specify. Do we
> want
> console.log(fals\u0065)
> to print "false" or "undefined"?
> 3) For ES6 we have to decide how \u{0065} fits in.

A closely related issue: is /foo/\u0069 a legal regexp? The grammar parses the regexp flags as IdentifierParts, which can contain unicode escapes, but is then rather vague about the semantics via conversion to a string (end of Section 7.8).


Carakan (Opera) has a ready patch waiting to be integrated that changes to throwing here.


(In reply to comment #4)
>
> A closely related issue: is /foo/\u0069 a legal regexp? The grammar parses the
> regexp flags as IdentifierParts, which can contain unicode escapes, but is then
> rather vague about the semantics via conversion to a string (end of Section
> 7.8).

I agree, the spec. is vague. On 03/21/13 I tested this on recent versions of Firefox, Chrome, Safari, and IE9. Only IE9 accepted the escaped flag as valid.

Based on this, for ES6 I will make it clear that escaped flags are not allowed.
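
For anyone who wants to repeat that test, a minimal sketch follows. The function name `acceptsRegExpLiteral` is hypothetical, and the outcome for the escaped flag depends on the engine it runs in.

```javascript
// Returns true if the engine accepts `literalSource` as a regexp literal.
// The literal is only parsed inside a function body, never evaluated.
function acceptsRegExpLiteral(literalSource) {
  try {
    new Function("return " + literalSource + ";");
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // not a parse failure, so do not hide it
  }
}

// Baseline: the unescaped "i" flag should always be accepted.
const plainFlagAccepted = acceptsRegExpLiteral("/foo/i");

// The case from this thread: the same flag written as \u0069.
const escapedFlagAccepted = acceptsRegExpLiteral("/foo/\\u0069");

console.log({ plainFlagAccepted, escapedFlagAccepted });
```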


Marking as in_progress to flag that this needs to go into a future ES5.1 errata.


FWIW, Safari/JavaScriptCore just dropped the “escaped reserved words as identifiers” compatibility measure. https://trac.webkit.org/changeset/185414


Bulk resolving ES5.1 errata issues as a sampling suggests these are all fixed. If this is in error, please open a new issue on GitHub.