#2376 — Term "character" used without definition


The spec uses the term "character" in several places without any definition.

Since the spec is written primarily in terms of code points, you might define it along the lines of "a character is a code point corresponding to a Unicode encoded character".


Unicode defines the term "Unicode scalar value" (any code point except the surrogate code points), which is what I think should be used.

Remember that Unicode contains many code points that do not (yet) correspond to a character, but it is generally good form to allow these code points (while not allowing surrogate code points).
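To make the distinction concrete, here is a minimal sketch of the check that "Unicode scalar value" implies: accept every code point in the Unicode range, assigned or not, and reject only the surrogates. The function names are hypothetical and are not taken from the spec.

```python
# Hypothetical helpers, for illustration only (not part of any spec).

def is_surrogate(cp: int) -> bool:
    """True for surrogate code points U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_scalar_value(cp: int) -> bool:
    """True for any code point U+0000..U+10FFFF that is not a surrogate.

    Deliberately does not ask whether the code point is currently
    assigned to a character, so code points assigned in a later
    Unicode version are already allowed.
    """
    return 0 <= cp <= 0x10FFFF and not is_surrogate(cp)
```

With this definition, `is_scalar_value(0x41)` and `is_scalar_value(0x10FFFF)` are true, while `is_scalar_value(0xD800)` and `is_scalar_value(0x110000)` are false.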

As a JSON text is a sequence of Unicode code points, it is not concerned with any Unicode encoding (except in string escapes for characters not in the Basic Multilingual Plane, which have their own problems).
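As an illustration of the exception mentioned above, the sketch below shows how a code point outside the Basic Multilingual Plane ends up as two `\uXXXX` escapes forming a UTF-16 surrogate pair. The function name `json_escape` is hypothetical; the round trip through Python's standard `json` module is only there as a sanity check.

```python
import json


def json_escape(cp: int) -> str:
    """Return the \\uXXXX escape(s) for a code point (hypothetical helper)."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if cp <= 0xFFFF:
        # BMP code point: a single escape. Note that nothing here stops
        # a surrogate code point from being escaped on its own.
        return f"\\u{cp:04x}"
    # Outside the BMP: split into a UTF-16 surrogate pair.
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return f"\\u{high:04x}\\u{low:04x}"


# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
escaped = json_escape(0x1D11E)
decoded = json.loads('"' + escaped + '"')
```

Here `escaped` is the twelve-character text `\ud834\udd1e`, and `decoded` is the single code point U+1D11E. This is the one place where a UTF-16 detail shows through in an otherwise encoding-independent text.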