#2070 — \b should work for Unicode strings


In section 15.10.2.6, the assertion \b is currently defined for ASCII only, because the standard defines the abstract operation IsWordChar to return true for only 63 ASCII characters (a-z, A-Z, 0-9, and _).
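To make the behavior concrete, here is a minimal C++ sketch of the ASCII-only rule and of the word-boundary test built on it. The function names `is_word_char_ascii` and `is_word_boundary` are hypothetical, chosen for illustration; this is not code from the standard or from any regex library.

```cpp
#include <cstddef>
#include <string>

// Sketch of IsWordChar as described in this issue: true only for
// a-z, A-Z, 0-9 and '_' (63 characters in total).
// The name is hypothetical, not taken from any library.
bool is_word_char_ascii(char32_t c) {
    return (c >= U'a' && c <= U'z') ||
           (c >= U'A' && c <= U'Z') ||
           (c >= U'0' && c <= U'9') ||
           c == U'_';
}

// \b holds at index i when exactly one of the characters on either
// side of i is a word character. Positions outside the string count
// as non-word characters.
bool is_word_boundary(const std::u32string& s, std::size_t i) {
    const bool before = i > 0 && is_word_char_ascii(s[i - 1]);
    const bool after = i < s.size() && is_word_char_ascii(s[i]);
    return before != after;
}
```

Under this rule, `is_word_boundary(U"a\u00E9", 1)` is true: a boundary is reported between "a" and "é" even though both are letters, because "é" is not among the 63 characters.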

I believe it should instead be phrased as "The abstract operation IsWordChar returns true if and only if CharacterClassEscape w returns true".

The reason I'm requesting this change is that I'm a C++ developer, and the C++11 standard points to ECMA-262 for regular expression syntax. For Unicode strings, \w matches all Unicode alphanumeric characters, whereas \b only recognizes ASCII characters. I am using the workaround of changing \b into (?!\w), and it works, but it's very confusing to have \w match one set of characters and \b work with a different set.


Substantive changes should go through the proposal process defined here: https://github.com/tc39/ecma262/blob/master/CONTRIBUTING.md.