#1108 — String.prototype.normalize

bug_id: 1108
creation_ts: 2012-12-02 11:04:00 -0800
short_desc: String.prototype.normalize
delta_ts: 2013-03-08 14:44:19 -0800
product: Draft for 6th Edition
component: new feature
version: Rev 12: November 22, 2012 Draft
rep_platform: All
op_sys: All
bug_status: RESOLVED
resolution: FIXED
priority: Normal
bug_severity: enhancement
everconfirmed: true
reporter: Rick Waldron
assigned_to: Allen Wirfs-Brock
cc: ecmascriptbugs

commentid: 2974
comment_count: 0
who: Rick Waldron
bug_when: 2012-12-02 11:04:33 -0800

Add String.prototype.normalize

http://wiki.ecmascript.org/doku.php?id=strawman:unicode_normalization#add_normalize_method

Resolution:

- total
- deterministic
- idempotent (normalizing an already-normalized result returns that same result)

WH: Note that Unicode got this wrong a while back (their normalization algorithm wasn't idempotent, and it didn't even form proper equivalence relations). They fixed it since then and now explicitly state that it's idempotent.
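
The idempotence requirement above can be spot-checked with a small sketch. It assumes the method shape that was later standardized (`str.normalize(form)`, with `form` one of "NFC", "NFD", "NFKC", "NFKD"). The sample strings and the helper name `isIdempotent` are illustrative and do not come from this bug.

```javascript
// Sketch: normalizing a second time should return the first result unchanged.
const forms = ["NFC", "NFD", "NFKC", "NFKD"];

// Illustrative inputs: precomposed letters, a base letter plus combining
// acute accent, a ligature, and the angstrom sign.
const samples = ["\u03D3", "\u038E", "e\u0301", "\uFB01", "\u212B"];

// Hypothetical helper: true if a second pass leaves the string unchanged.
function isIdempotent(str, form) {
  const once = str.normalize(form);
  return once.normalize(form) === once;
}

const allIdempotent = samples.every((s) =>
  forms.every((f) => isIdempotent(s, f))
);
console.log(allIdempotent);
```

A check like this only samples a few strings, so it illustrates the property rather than proving it.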

commentid: 2990
comment_count: 1
who: Allen Wirfs-Brock
bug_when: 2012-12-04 20:09:55 -0800

So this presumably means that if we have:

var ϓ = 5; // U+03D3
var Ύ = 6; // U+038E

we will have two distinct variables, even though the Unicode code points logically represent the same character (assuming both code points are valid identifier characters).

Right?

If so, it seems we should explicitly state this, probably somewhere in section 7.

Norbert, do you have any language you would like to suggest that would clarify that after your proposed deletions are made?
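
As a rough illustration of the two-variable question above, the sketch below compares the two code points as strings rather than as identifiers, and uses object property keys as a stand-in for variable names. It assumes the `normalize(form)` signature that was later standardized. The names `hookSymbol`, `tonos`, and `scope` are hypothetical. U+03D3 and U+038E are compatibility-equivalent, not canonically equivalent, so only the NFKC/NFKD forms unify them.

```javascript
// Sketch: without normalization, the two code points are distinct names.
const hookSymbol = "\u03D3"; // ϓ GREEK UPSILON WITH ACUTE AND HOOK SYMBOL
const tonos = "\u038E";      // Ύ GREEK CAPITAL LETTER UPSILON WITH TONOS

// Property keys stand in for variable names here (an illustrative assumption).
const scope = {};
scope[hookSymbol] = 5;
scope[tonos] = 6;
const distinctNames = Object.keys(scope).length; // two separate keys

// Canonical normalization keeps them apart; compatibility normalization does not.
const sameUnderNFC = hookSymbol.normalize("NFC") === tonos.normalize("NFC");
const sameUnderNFKC = hookSymbol.normalize("NFKC") === tonos.normalize("NFKC");

console.log(distinctNames, sameUnderNFC, sameUnderNFKC);
```

A program that wants such names treated as one would have to normalize explicitly, for example with NFKC, before comparing them.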

commentid: 3275
comment_count: 2
who: Allen Wirfs-Brock
bug_when: 2013-03-05 16:52:23 -0800

fixed in rev 14 editor's draft

commentid: 3346
comment_count: 3
who: Allen Wirfs-Brock
bug_when: 2013-03-08 14:44:19 -0800

in Rev 14 draft
