#2368 — 21.2.3.3.2: Pattern interpretation doesn't account for code unit patterns

bug_id: 2368
creation_ts: 2013-12-08 20:36:00 -0800
short_desc: 21.2.3.3.2: Pattern interpretation doesn't account for code unit patterns
delta_ts: 2015-02-11 22:11:51 -0800
product: Draft for 6th Edition
component: technical issue
version: Rev 21: November 8, 2013 Draft
rep_platform: All
op_sys: All
bug_status: VERIFIED
resolution: FIXED
priority: Normal
bug_severity: normal
everconfirmed: true
reporter: Norbert
assigned_to: Allen Wirfs-Brock

commentid: 6905
comment_count: 0
who: Norbert
bug_when: 2013-12-08 20:36:39 -0800

Step 8 of RegExpInitialise starts with "Parse P interpreted as UTF-16 encoded Unicode code points using the grammars in 21.2.1." This fails to mention that, as explained in 21.2.2, patterns can be interpreted as either code unit or code point ("BMP" or "Unicode" - see bug 2367) patterns. It should mention this distinction as well as the necessary conversion to a List of SourceCharacter values, which differs for code unit and code point patterns.
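
To make the two conversions concrete, here is a rough sketch of how a pattern string could be turned into a list of character values under each interpretation. The helper names `toCodeUnitList` and `toCodePointList` are hypothetical and do not appear in the draft; the sketch also assumes that an unpaired surrogate in a code point pattern is kept as its own element.

```typescript
// Hypothetical helpers for illustration; the names are not from the specification.

// Code unit ("BMP") pattern: each UTF-16 code unit is one pattern character.
function toCodeUnitList(p: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < p.length; i++) {
    out.push(p.charCodeAt(i));
  }
  return out;
}

// Code point ("Unicode") pattern: a well-formed surrogate pair is one pattern character.
function toCodePointList(p: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < p.length; i++) {
    const lead = p.charCodeAt(i);
    if (lead >= 0xd800 && lead <= 0xdbff && i + 1 < p.length) {
      const trail = p.charCodeAt(i + 1);
      if (trail >= 0xdc00 && trail <= 0xdfff) {
        // Combine the pair into a single supplementary code point.
        out.push((lead - 0xd800) * 0x400 + (trail - 0xdc00) + 0x10000);
        i++; // skip the trail surrogate that was just consumed
        continue;
      }
    }
    out.push(lead); // BMP code unit, or an unpaired surrogate (assumed kept as-is)
  }
  return out;
}

const pattern = "a\uD83D\uDE00"; // "a" followed by U+1F600
const units = toCodeUnitList(pattern); // [0x61, 0xD83D, 0xDE00]
const points = toCodePointList(pattern); // [0x61, 0x1F600]
```

The same pattern text therefore yields three pattern characters when read as code units and two when read as code points, which is the distinction the step should spell out.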

commentid: 7338
comment_count: 1
who: Allen Wirfs-Brock
bug_when: 2014-02-17 09:55:52 -0800

fixed in rev23 editor's draft.

Made things a bit more explicit with respect to these points.

commentid: 7508
comment_count: 2
who: Allen Wirfs-Brock
bug_when: 2014-04-06 11:29:18 -0700

fixed in rev23 draft

commentid: 8719
comment_count: 3
who: Norbert
bug_when: 2014-05-31 00:49:27 -0700

Looking at the rev 25 draft, I like the clean separation of code paths for BMP and Unicode. However, some of the details still need improvement:

- In step 9.a, P is not interpreted as a list of UTF-16 encoded code points, but as a list of UTF-16 code units individually interpreted as source characters.

- In step 10.b, the description of the list is hard to parse. How about "... List whose elements are the code points resulting from interpreting P as a sequence of UTF-16 encoded Unicode code points."?
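
The difference between the two steps is observable from script. As a small sketch (the variable names are illustrative only), the same pattern text behaves differently depending on whether it is read as code units or as code points, because the quantifier binds to a different pattern character in each case:

```typescript
// Pattern text: ^, U+1F600 (written as a surrogate pair), {2}, $
const src = "^\uD83D\uDE00{2}$";

const asCodeUnits = new RegExp(src); // no "u" flag: pattern read as code units
const asCodePoints = new RegExp(src, "u"); // "u" flag: pattern read as code points

const twoEmoji = "\uD83D\uDE00\uD83D\uDE00"; // U+1F600 twice
const leadPlusTwoTrails = "\uD83D\uDE00\uDE00"; // lead surrogate, then two trail surrogates

const results = [
  asCodeUnits.test(twoEmoji), // false: {2} applies only to the trail surrogate
  asCodeUnits.test(leadPlusTwoTrails), // true
  asCodePoints.test(twoEmoji), // true: {2} applies to the whole code point
  asCodePoints.test(leadPlusTwoTrails), // false
];
```

The `RegExp` constructor is used instead of a literal here only so that the same source string can be compiled both ways.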

commentid: 9748
comment_count: 4
who: Allen Wirfs-Brock
bug_when: 2014-08-09 13:01:58 -0700

fixed in rev27 editor's draft

commentid: 9866
comment_count: 5
who: Allen Wirfs-Brock
bug_when: 2014-08-25 08:29:18 -0700

fixed in rev27 draft

commentid: 10699
comment_count: 6
who: Norbert
bug_when: 2014-12-01 20:58:21 -0800

Small grammatical error resulting from the edits in step 10.b: "code points of resulting". Remove "of".

commentid: 10810
comment_count: 7
who: Allen Wirfs-Brock
bug_when: 2014-12-06 15:32:23 -0800

fixed in rev29 editor's draft

commentid: 10821
comment_count: 8
who: Allen Wirfs-Brock
bug_when: 2014-12-07 14:34:57 -0800

fixed in rev29

commentid: 12356
comment_count: 9
who: Norbert
bug_when: 2015-02-11 22:11:51 -0800

Verified in rev 32 draft.
