#2662 — 21.2.5.10 Significant issues with RegExp.prototype.split


The algorithm as currently written does not correctly handle full Unicode regular expressions. In particular, it needs to translate endIndex values from code point offsets to code unit offsets, as is done by RegExpBuiltinExec (RegExpExec in rev23).
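
To illustrate the kind of translation meant here, the following is a minimal sketch, not the spec's own steps. The helper name `codePointToCodeUnitOffset` is hypothetical; it assumes the offset counts whole code points from the start of the string and that the result should count UTF-16 code units.

```javascript
// Hypothetical helper (not from the spec): convert an offset counted in
// code points into the matching offset counted in UTF-16 code units.
function codePointToCodeUnitOffset(str, codePointOffset) {
  let units = 0;  // position in code units
  let points = 0; // code points consumed so far
  while (points < codePointOffset && units < str.length) {
    const cp = str.codePointAt(units);
    // Code points above U+FFFF occupy a surrogate pair (two code units).
    units += cp > 0xFFFF ? 2 : 1;
    points += 1;
  }
  return units;
}
```

For a string such as "a\u{1F600}b", the code point offset 2 (just after the emoji) corresponds to code unit offset 3, which is the value that substring operations on the string expect.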

More generally, the algorithm is not usable by subclasses that override exec to change the matching semantics. This algorithm is the last place where the RegExp matcher is used directly without going through exec.


> More generally, the algorithm is not usable by subclass that over-ride exec
> to change the matching semantics. This algorithm is the last place that the
> RegExp matcher is directly used without going through exec.

An issue I see here is that, if one wants to use `exec`, one should run it as if the global flag were set to true and the sticky flag were set to false.

In order to work around the state of the global flag, one can run the `exec` method against a substring of the original string. That is, whenever it is said:

Let z be the result of calling the matcher with arguments S and q.

one can say, approximately:

Let T = S.substring(q).
Let rx.lastIndex = 0.
Let result = rx.exec(T).

Moreover, if the sticky flag is on, one can add additional logic in order to try to match at positions q+1, q+2, etc. in case of failure.
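
The workaround above, including the retry for sticky regular expressions, can be sketched roughly as follows. The names `execFrom` and `execFromAnyPosition` are hypothetical, and the returned object shape is an assumption made for illustration. Note that the sketch mutates rx.lastIndex, and that matching against a substring changes what `^` and lookbehind assertions see, so it is not equivalent to calling the matcher on S at position q.

```javascript
// Hypothetical helper: run exec against the tail of S starting at q and
// report the match position relative to the original string.
function execFrom(rx, S, q) {
  const T = S.substring(q);
  rx.lastIndex = 0; // side effect on rx, as in the steps above
  const result = rx.exec(T);
  if (result === null) return null;
  const index = q + result.index;
  return { index: index, end: index + result[0].length, match: result };
}

// Hypothetical helper: for a sticky rx, exec only tries position 0 of T,
// so retry at q+1, q+2, ... until a match is found or S is exhausted.
function execFromAnyPosition(rx, S, q) {
  for (let start = q; start <= S.length; start++) {
    const hit = execFrom(rx, S, start);
    if (hit !== null) return hit;
    // A non-sticky exec has already searched the whole tail.
    if (!rx.sticky) return null;
  }
  return null;
}
```

With these helpers, both `/b/` and `/b/y` report the match at index 4 when searching "abcabc" from position 2.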


(In reply to Claude Pache from comment #1)
I've just realised that RegExpBuiltinExec (Section 21.2.5.2.2) uses Get(..) in order to obtain the value of the global and sticky flags, instead of taking the original values stored in [[OriginalFlags]]. Instead of the workaround of Comment #1, one could try to play with the values of these properties. However, there is currently no natural way to do so.


fixed in rev29 editor's draft.

The algorithm now uses 'exec' instead of directly calling the matcher procedure. However, to preserve the legacy observable side effects (or lack thereof) on the this value, 'exec' is applied to a clone of the this RegExp rather than to the RegExp itself.
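
As a rough illustration of the clone-based approach, here is a simplified sketch. It is loosely modeled on the technique of cloning with the sticky flag added and is not the rev29 text itself, which may differ in detail. The names `splitWithClone` and `advance` are hypothetical, the limit argument is omitted, and the clone is built with `new rx.constructor(rx.source, flags)`, which assumes a subclass constructor accepts (source, flags).

```javascript
// Hypothetical helper: step past one code unit, or one full code point
// when the regular expression is in Unicode mode.
function advance(str, index, unicode) {
  if (!unicode) return index + 1;
  const cp = str.codePointAt(index);
  return index + (cp !== undefined && cp > 0xFFFF ? 2 : 1);
}

// Hypothetical sketch: split str via exec on a clone, so rx.lastIndex is
// never touched. Adding "y" makes each exec try only position lastIndex.
function splitWithClone(rx, str) {
  const flags = rx.flags.includes("y") ? rx.flags : rx.flags + "y";
  const splitter = new rx.constructor(rx.source, flags); // assumed signature
  const unicode = flags.includes("u");
  const parts = [];
  if (str.length === 0) {
    // Empty input: [""] unless the pattern matches the empty string.
    return splitter.exec(str) === null ? [str] : parts;
  }
  let p = 0; // start of the piece being accumulated
  let q = 0; // position where a match is attempted
  while (q < str.length) {
    splitter.lastIndex = q;
    const z = splitter.exec(str); // goes through exec, so overrides apply
    const e = z === null ? p : Math.min(splitter.lastIndex, str.length);
    if (z === null || e === p) {
      q = advance(str, q, unicode); // no match here, or empty match at p
    } else {
      parts.push(str.slice(p, q));
      for (let i = 1; i < z.length; i++) parts.push(z[i]); // captures
      p = e;
      q = p;
    }
  }
  parts.push(str.slice(p));
  return parts;
}
```

Because every attempt goes through `exec` on the clone, a subclass that overrides `exec` would have its semantics honored, while the lastIndex of the original RegExp stays as it was.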


fixed in rev29