Some algorithms and semantics in this proposal are presented as modifications to existing ES262 algorithms and semantics. The base revision of ES262 for the modifications (here denoted "ES7") is currently f35248729043089b47ccf29e4c47559386f5f6fd (2016-06-29). At the moment, ES7 (ES2016) section numbering differs from ES6 (ES2015) only for TypedArrays (section 22.2).
This proposal adds a simple form of shared memory to ECMAScript. Shared memory is being exposed in the form of a new SharedArrayBuffer type; the existing TypedArray and DataView types are adapted in such a way that they can be used to create views on shared memory. The new global Atomics object provides atomic operations on shared memory locations, including operations that can be used to create blocking synchronization primitives. Though not a part of this specification, we intend for the role of "threads" to be played by Web Workers in web browser environments.
At this time the proposal adds only low-level primitives to ECMAScript; we expect the developer community to put together abstractions that are useful in specific domains.
The work has been driven by the following use cases:
The proposal makes only very basic assumptions about the required hardware and should be broadly implementable with good performance. Prototype implementations of the API are available in Firefox and in Google Chrome.
This specification constitutes a restatement and formalization of an earlier work, the spec document for which also contains additional rationale, background information, discussions, implementation notes, and comments.
Changelog:
In the third paragraph, include SharedArrayBuffer after ArrayBuffer.
In the NOTE, include SharedArrayBuffer along with ArrayBuffer.
In table 7, include rows for
Insert the following paragraphs after the third:
A data block that resides in memory that can be referenced from multiple agents concurrently is designated a Shared Data Block. A Shared Data Block has an identity (for the purposes of equality testing Shared Data Block values) that is address-free: it is tied not to the virtual addresses the block is mapped to in any process, but to the set of locations in memory that the block represents. Shared Data Blocks can also be distinguished from Data Blocks.
The semantics of Shared Data Blocks is defined using Shared Data Block events by the Atomics
Shared Data Block events are modeled by Records, defined in the Atomics
One may think of the evaluation semantics of ECMAScript as a generating function for candidate executions. The Math.random()
) always has exactly one
This algorithm is modified as follows:
When the abstract operation CopyDataBlockBytes is called, the following steps are taken:
"Unordered"
, [[NoTear]]: "Unordered"
, [[NoTear]]: When the abstract operation CreateSharedByteDataBlock is called with integer argument size, the following steps are taken:
"Init"
, [[NoTear]]: The forward progress guarantee is provided by PR 522 on ES262.
The forward progress guarantee is further amended by the following clause.
Implementations must ensure that an agent does not cause another agent to become blocked except via explicit APIs that provide blocking.
This, along with the liveness guarantee in the "SeqCst"
writes eventually become observable to all agents.
(Spec draft note) This should be folded into PR 522 above.
Add the following properties to the Agent Record (which is provided by PR 522 on ES262):
Field name | Value | Meaning |
---|---|---|
[[Signifier]] | A value that admits equality testing | Uniquely identifies the agent within its agent cluster. |
[[LittleEndian]] | A Boolean | |
[[IsLockFree1]] | A Boolean | |
[[IsLockFree2]] | Boolean | |
[[CandidateExecution]] | A |
See the Atomics |
Once the values of [[Signifier]], [[IsLockFree1]], and [[IsLockFree2]] have been observed by any agent in the agent cluster they cannot change.
The values of [[IsLockFree1]] and [[IsLockFree2]] are not necessarily determined by the hardware, but may also reflect implementation choices that can vary over time and between ECMAScript implementations.
There is no [[IsLockFree4]] property: 4-byte atomic operations are always lock-free.
Formally, atomic operations are lock-free if, infinitely often, some atomic operation finishes in a finite number of program steps. In practice, if an atomic operation is implemented with any type of lock the operation is not lock-free. Lock-free does not imply wait-free: there is no upper bound on how many machine steps may be required to complete a lock-free atomic operation.
That an atomic access of size n is lock-free does not imply anything about the (perceived) atomicity of non-atomic accesses of size n, specifically, non-atomic accesses may still be performed as a sequence of several separate memory accesses. See ReadSharedMemory and WriteSharedMemory for details.
An agent cluster is a maximal set of agents that can communicate by operating on shared memory.
Programs within different agents may share memory by unspecified means. At a minimum, the backing memory for SharedArrayBuffer objects can be shared among the agents in the cluster.
There may be agents that can communicate by message passing that cannot share memory; they are never in the same cluster.
Every agent belongs to exactly one agent cluster.
The agents in a cluster need not all be alive at some particular point in time. If agent A creates another agent B, after which A terminates and B creates agent C, the three agents are in the same cluster if A could share some memory with B and B could share some memory with C.
All agents within a cluster must have the same value for the [[LittleEndian]] property in their respective Agent Records.
If different agents within an agent cluster have different values of [[LittleEndian]] it becomes hard to use shared memory for multi-byte data.
All agents within a cluster must have the same values for the [[IsLockFree1]] property in their respective Agent Records; similarly for the [[IsLockFree2]] property.
All agents within a cluster must have the same value for the [[CandidateExecution]] in their respective Agent Records.
All agents within a cluster must have different values for the [[Signifier]] property in their respective Agent Records.
An agent cluster is a specification mechanism and need not correspond to any particular artefact of an ECMAScript implementation.
An embedding may deactivate (stop forward progress) or activate (resume forward progress) an agent without the agent's knowledge or cooperation. If the embedding does so, it must not leave some agents in the cluster active while other agents in the cluster are deactivated indefinitely.
The purpose of the preceding restriction is to avoid a situation where an agent deadlocks or starves because another agent has been suspended. For example, if a DOM SharedWorker shares memory with a regular worker, and the regular worker is suspended while it holds a lock (because the web page the regular worker is in is pushed into the window history), and the SharedWorker tries to acquire the lock, then the SharedWorker will be blocked until the regular worker wakes up again, if ever. Meanwhile other workers trying to access the SharedWorker from other web pages will starve.
The implication of the restriction is that it will not be possible to share memory between agents that don't belong to the same suspend/wake collective within the embedding.
An embedding may terminate an agent without any of the agent's cluster's other agents' prior knowledge or cooperation. If an agent is terminated not by programmatic action of its own or of another agent in the cluster but by forces external to the cluster, then the embedding must choose one of two strategies: Either terminate all the agents in the cluster, or provide reliable APIs that allow the agents in the cluster to coordinate so that at least one remaining member of the cluster will be able to detect the termination, with the termination data containing enough information to identify the agent that was terminated.
Examples of that type of termination are: operating systems or users terminating agents that are running in separate processes; the embedding itself terminating an agent that is running in-process with the other agents when per-agent resource accounting indicates that the agent is runaway.
This proposal additionally suggests (see later text) that if
termination is signaled then the signal creates a
All agents in an agent cluster reference the same
Before any evaluation is performed, perform the following steps.
Add a new subsection for SharedArrayBuffer, pointing to the appropriate new section (below).
Add a new subsection for Atomics, pointing to the appropriate new section (below).
This algorithm is modified as follows:
In the call to
This algorithm is modified as follows:
Sets multiple values in this TypedArray, reading the values from the typedArray argument object. The optional offset value indicates the first element index in this TypedArray where values are written. If omitted, it is assumed to be 0.
"Uint8"
, "Uint8"
, This algorithm is modified as follows:
In the calls to
This algorithm is modified as follows:
In the calls to
Modify steps 17 and 18 to take shared memory into account, as follows:
This algorithm is modified as follows:
The abstract operation DetachArrayBuffer with argument arrayBuffer performs the following steps:
The abstract operation RawBytesToNumber takes three parameters, a String type, a
"Float32"
, then"Float64"
, then"U"
, thenThis algorithm is modified as follows:
The abstract operation GetValueFromBuffer takes foursix parameters, an ArrayBuffer or SharedArrayBuffer arrayBuffer, an integer byteIndex, a String type, a Boolean isTypedArray, a String order, and optionally a Boolean isLittleEndian. This operation performs the following steps:
"Int8"
, "Uint8"
, "Int16"
, "Uint16"
, "Int32"
, or "Uint32"
, otherwise "Float32"
, then"Float64"
, then"U"
, thenAll existing uses of the form GetValueFromBuffer(arrayBuffer, byteIndex, type, isLittleEndian) in TypedArray methods and abstract operations are changed to GetValueFromBuffer(arrayBuffer, byteIndex, type, "Unordered"
, isLittleEndian).
All existing uses of the form GetValueFromBuffer(arrayBuffer, byteIndex, type, isLittleEndian) in non-TypedArray methods and abstract operations are changed to GetValueFromBuffer(arrayBuffer, byteIndex, type, "Unordered"
, isLittleEndian).
All existing uses of the form GetValueFromBuffer(arrayBuffer, byteIndex, type) in TypedArray methods and abstract operations are changed to GetValueFromBuffer(arrayBuffer, byteIndex, type, "Unordered"
).
All existing uses of the form GetValueFromBuffer(arrayBuffer, byteIndex, type) in non-TypedArray methods and abstract operations are changed to GetValueFromBuffer(arrayBuffer, byteIndex, type, "Unordered"
).
The abstract operation NumberToRawBytes takes three parameters, a String type, a Number value, and a Boolean isLittleEndian.
"Float32"
, then"Float64"
, thenThis algorithm is modified as follows:
The abstract operation SetValueInBuffer takes fiveseven parameters, an ArrayBuffer or SharedArrayBuffer arrayBuffer, an integer byteIndex, a String type, a Number value, a Boolean isTypedArray, a String order, and optionally a Boolean isLittleEndian. This operation performs the following steps:
"Float32"
, then"Float64"
, then"Int8"
, "Uint8"
, "Int16"
, "Uint16"
, "Int32"
, or "Uint32"
, otherwise All existing uses of the form SetValueInBuffer(arrayBuffer, byteIndex, type, value, isLittleEndian) in TypedArray methods and abstract operations are changed to SetValueInBuffer(arrayBuffer, byteIndex, type, value, "Unordered"
, isLittleEndian).
All existing uses of the form SetValueInBuffer(arrayBuffer, byteIndex, type, value, isLittleEndian) in non-TypedArray methods and abstract operations are changed to SetValueInBuffer(arrayBuffer, byteIndex, type, value, "Unordered"
, isLittleEndian).
All existing uses of the form SetValueInBuffer(arrayBuffer, byteIndex, type, value) in TypedArray methods and abstract operations are changed to SetValueInBuffer(arrayBuffer, byteIndex, type, value, "Unordered"
).
All existing uses of the form SetValueInBuffer(arrayBuffer, byteIndex, type, value) in non-TypedArray methods and abstract operations are changed to SetValueInBuffer(arrayBuffer, byteIndex, type, value, "Unordered"
).
The abstract operation GetModifySetValueInBuffer takes six parameters, An ArrayBuffer or SharedArrayBuffer arrayBuffer, a nonnegative integer byteIndex, a String type, a semantic function op, a Number value, and optionally a Boolean isLittleEndian. This operation performs the following steps given a valid execution execution (see the Atomics
"SeqCst"
, [[NoTear]]: This algorithm is modified as follows:
ArrayBuffer.prototype.byteLength
is an accessor property whose set accessor function is
This algorithm is modified as follows:
The following steps are taken:
The abstract operation AllocateSharedArrayBuffer with arguments constructor and byteLength is used to create a SharedArrayBuffer object. It performs the following steps:
IsSharedArrayBuffer tests whether an object that is an ArrayBuffer, a SharedArrayBuffer, or a subtype of either is a SharedArrayBuffer or a subtype of it. It performs the following steps:
The SharedArrayBuffer constructor is the %SharedArrayBuffer% intrinsic object and the initial value of the SharedArrayBuffer
property of the
The SharedArrayBuffer constructor is designed to be subclassable. It may be used as the value of an extends
clause of a class definition. Subclass constructors that intend to inherit the specified SharedArrayBuffer behaviour must include a super
call to the SharedArrayBuffer constructor to create and initialize subclass instances with the internal state necessary to support the SharedArrayBuffer.prototype
built-in methods.
Unlike an ArrayBuffer, a SharedArrayBuffer cannot become detached, and its internal [[ArrayBufferData]] slot is never
SharedArrayBuffer called with argument length performs the following steps:
The value of the [[Prototype]] internal slot of the SharedArrayBuffer constructor is the intrinsic object
Besides its length
property (whose value is 1), the SharedArrayBuffer constructor has the following properties:
The initial value of SharedArrayBuffer.prototype is the intrinsic object %SharedArrayBufferPrototype% (q.v.).
This property has the attributes { [[Writable]]:
SharedArrayBuffer[@@species]
is an accessor property whose set accessor function is
The value of the name property of this function is "get [Symbol.species]
".
The SharedArrayBuffer prototype object is the intrinsic object %SharedArrayBufferPrototype%. The value of the [[Prototype]] internal slot of the SharedArrayBuffer prototype object is the intrinsic object
SharedArrayBuffer.prototype.byteLength is an accessor property whose set accessor function is
The initial value of SharedArrayBuffer.prototype.constructor
is the intrinsic object %SharedArrayBuffer%.
The following steps are taken:
The initial value of the @@toStringTag property is the String value "SharedArrayBuffer"
.
This property has the attributes { [[Writable]]:
SharedArrayBuffer instances inherit properties from the SharedArrayBuffer prototype object. SharedArrayBuffer instances each have an [[ArrayBufferData]] internal slot and a [[ArrayBufferByteLength]] internal slot.
SharedArrayBuffer instances, unlike ArrayBuffer instances, are never detached.
The Atomics object is the %Atomics% intrinsic object and the initial value of the Atomics
property of the
The Atomics object provides functions that operate indivisibly (atomically) on shared memory array cells as well as functions that let agents wait for and dispatch primitive events. When used with discipline, the Atomics functions allow multi-agent programs that communicate through shared memory to execute in a well-understood order even on parallel CPUs. The rules that govern shared-memory communication are provided by the memory model, defined below.
The value of the [[Prototype]] internal slot of the Atomics object is the intrinsic object
The Atomics object is not a function object. It does not have a [[Construct]] internal method; it is not possible to use the Atomics object as a constructor with the new
operator. The Atomics object also does not have a [[Call]] internal method; it is not possible to invoke the Atomics object as a function.
(Spec draft notes)
Atomicity means two things. One, it means that an action is indivisible. For example, writing a value to memory should write the whole value all at once. This is called access atomicity. Two, it means the timing with which different threads can observe a memory access, i.e., the ordering of memory events. For example, on x86, a memory write becomes visible to all cores other than the writing core at exactly the same time. On ARM, a memory write may become visible to other cores at different times. This is called copy atomicity.
Access atomicity is the easier problem. Hardware provides access atomic instructions that provide the guarantee. Copy atomicity is more difficult. Memory models concern themselves with prescribing some flavor of copy atomicity by ordering memory events. Access atomicity may be thought of as an orthogonal property that is applied on top of the
The
Thought of another way, the operational model of ECMA262 constructs an execution that is correct by construction. The
The model is best understood in two parts: the axiomatic part and the interfacing with the rest of ECMA262.
The axiomatic part is a weak consistency model. It needs to be flexible enough to allow a set of astonishing observable behaviors that are exhibited on weak hardware like ARM and Power and result from compiler transformations. For example, in certain cases, causality is allowed to be violated. At the same time, the
To understand the axiomatic model, we start with the input to the model, candidate executions. Assume that there is a set of shared memory events (reads and writes) that correspond to some evaluation of a program. The evaluation gives an initial partial ordering of these events called
The partial order
The SharedArrayBuffer API raises the difficulty that it allows aliased access on the same memory location via different sizes. That is, a read event can
In addition to
These relations and orders mutually constrain each other so that they are coherent. The coherence constraints roughly correspond to the intuition that a read event cannot read from a write event that's "too old", in the sense that there's another write event on the memory location that happens-after the write event that was read from.
Putting all of these together nets us a decision procedure that can decide if a
The decision procedure alone is not yet enough. How does a set of memory events arise in the first place? How do we connect this decision procedure, which acts on all possible executions, with the evaluation semantics of ECMA262?
To connect the axiomatic semantics to the operational semantics, the operational semantics are made nondeterministic. Instead of constructing a single, valid execution, the nondeterministic operational semantics constructs all candidate executions. This is captured by
With this nondeterminism the evaluation semantics can accumulate events as a side effect of performing operations on SharedArrayBuffers. Events and nondeterministically chosen values are accumulated in the [[EventLists]] and [[ChosenValues]] fields in a
So what we talk about when we talk about the meaning of a program that uses SharedArrayBuffer is the set of valid executions. To wit, the evaluation semantics first generates the set of all possible candidate executions, then the
The memory consistency model, or
The
This section provides an axiomatic model on events introduced by the abstract operations on SharedArrayBuffers. It bears stressing that the model is not expressible algorithmically, unlike the rest of this specification. The nondeterministic introduction of events by abstract operations is the interface between the operational semantics of ECMAScript evaluation and the axiomatic semantics of the
The Set type is used to explain a collection of unordered elements. Values of the Set type are simple collections of elements, where no element appears more than once. Elements may be added to and removed from Sets. Sets may be unioned, intersected, or subtracted from each other.
The Relation type is used to explain constraints on Sets. Values of the Relation type are Sets of ordered pairs of values from its value domain. For example, a Relation on events is a set of ordered pairs of events. For a Relation R and two values a and b in the value domain of R, a R b is shorthand for saying the ordered pair (a, b) is a member of R. A Relation is least with respect to some conditions when it is the smallest Relation that satisfies those conditions.
A strict partial order is a Relation value R that satisfies the following conditions.
The two properties above are called, in order, irreflexivity and transitivity.
A strict total order is a Relation value R that satisfies the following conditions.
The three properties above are called, in order, totality, irreflexivity, and transitivity.
Shared memory accesses (reads and writes) are divided into two groups, atomic accesses and data accesses, defined below. Atomic accesses are sequentially consistent, i.e., there is a strict total ordering of events agreed upon by all agents in an agent cluster. Non-atomic accesses do not have a strict total ordering agreed upon by all agents, i.e., unordered.
No orderings weaker than sequentially consistent and stronger than unordered, such as release-acquire, are supported.
A Shared Data Block event is either a ReadSharedMemory, WriteSharedMemory, or ReadModifyWriteSharedMemory
Field | Value | Meaning |
---|---|---|
[[Order]] | "SeqCst" or "Unordered" |
The weakest ordering guaranteed by the |
[[NoTear]] | A Boolean | Whether this event is allowed to read from multiple write events on equal |
[[Block]] | A |
The block the event operates on. |
[[ByteIndex]] | A nonnegative integer | The byte address of the read in [[Block]]. |
[[ElementSize]] | A nonnegative integer | The size of the read. |
Field | Value | Meaning |
---|---|---|
[[Order]] | "SeqCst" , "Unordered" , or "Init" |
The weakest ordering guaranteed by the |
[[NoTear]] | A Boolean | Whether this event is allowed to be read from multiple read events with equal |
[[Block]] | A |
The block the event operates on. |
[[ByteIndex]] | A nonnegative integer | The byte address of the write in [[Block]]. |
[[ElementSize]] | A nonnegative integer | The size of the write. |
[[Payload]] | A |
The |
Field | Value | Meaning |
---|---|---|
[[Order]] | "SeqCst" |
Read-modify-write events are always sequentially consistent. |
[[NoTear]] | Read-modify-write events cannot tear. | |
[[Block]] | A |
The block the event operates on. |
[[ByteIndex]] | A nonnegative integer | The byte address of the read-modify-write in [[Block]]. |
[[ElementSize]] | A nonnegative integer | The size of the read-modify-write. |
[[Payload]] | A |
The |
[[ModifyOp]] | A semantic function | A pure semantic function that returns a modified |
These events are introduced by abstract operations or by methods on the Atomics object.
In addition to
Let the range of a ReadSharedMemory, WriteSharedMemory, or ReadModifyWriteSharedMemory event be the Set of contiguous integers from its byteIndex to byteIndex+elementSize-1. Two events' ranges are equal when the events have the same block, and the ranges are element-wise equal. Two events' ranges are overlapping when the events have the same block, the ranges are not equal and their intersection is non-empty. Two events' ranges are disjoint when the events do not have the same block or their ranges are neither equal nor overlapping.
Examples of host-specific synchronizing events that should be accounted for are: sending a SharedArrayBuffer from one agent to another (e.g., by postMessage
in a browser), starting and stopping agents, and communicating within the agent cluster via channels other than shared memory. It is assumed those events are appended to
An Agent Events Record is a
Field | Value | Meaning |
---|---|---|
[[AgentSignifier]] | A value that admits equality testing | The agent whose evaluation resulted in this ordering. |
[[EventList]] | A |
Events are appended to the list during evaluation. |
A Chosen Value Record is a
Field | Value | Meaning |
---|---|---|
[[Event]] | A |
The ReadSharedMemory or ReadModifyWriteSharedMemory event that was introduced for this chosen value. |
[[ChosenValue]] | A |
The bytes that were nondeterministically chosen during evaluation. |
A candidate execution of the evaluation of an agent cluster is a
Field | Value | Meaning |
---|---|---|
[[EventLists]] | A |
Maps an agent to Lists of events appended during the evaluation. |
[[ChosenValues]] | A |
Maps ReadSharedMemory or ReadModifyWriteSharedMemory events to the |
[[AgentOrder]] | An |
Defined below. |
[[ReadsBytesFrom]] | A |
Defined below. |
[[ReadsFrom]] | A |
Defined below. |
[[HostSynchronizesWith]] | A |
Defined below. |
[[SynchronizesWith]] | A |
Defined below. |
[[HappensBefore]] | A |
Defined below. |
An empty candidate execution is a candidate execution
The abstract operation EventSet takes one argument, a
The abstract operation SharedDataBlockEventSet takes one argument, a
The abstract operation HostEventSet takes one argument, a
For a
Each agent introduces events in a per-agent
For a
For a
For a
For two host-specific events E and D, E host-synchronizes-with D implies E
The host-synchronizes-with relation allows the host to provide additional synchronization mechanisms, such as postMessage
between HTML workers.
For a
"SeqCst"
and R "SeqCst"
and R and W have equal ranges then W "Init"
then"Init"
then set allInitReads to Owing to convention, write events synchronizes-with read events, instead of read events synchronizes-with write events.
Not all "SeqCst"
events related by
For an event R and a event W such W synchronizes-with R, R may
For a
"Init"
, and E and D have overlapping ranges then:"Init"
.Because happens-before is a superset of
The abstract operation ComposeWriteEventBytes takes four arguments, a
The semantic function [[ModifyOp]] is given by the function properties on the Atomics object that introduce ReadModifyWriteSharedMemory events.
The abstract operation ValueOfReadEvent takes two arguments, a
A
A
A
An event's [[NoTear]] field is
Intuitively, this requirement says when a memory
For a
"SeqCst"
."SeqCst"
events on equal ranges."SeqCst"
then it is not the case there is an infinite number of ReadSharedMemory or ReadModifyWriteSharedMemory events in "SeqCst"
writes become visible to "SeqCst"
reads with equal A
While memory-order includes all events in
A
All programs have at least one valid execution.
(Spec draft notes)
The intuition that the set of valid executions for any ECMAScript program is nonempty owes to that a sequentially consistent (i.e., interleaving) execution may always be demonstrated. Clearly, this valid execution is too strong and insufficient to account for all observable behaviors in implementations. Instead, it serves as an existence proof that the
The idea is to build an interleaving of events, a
The inputs to the "Init"
events introduced by "Init"
events in cluster-order. For example, if Agent 1 has events [A, B] and Agent 2 has events [C, D], cluster-order can be [A, C, B, D], or [C, A, D, B], or [A, B, C, D], and so forth. It, however, cannot be [D, C, A, B], as that would be inconsistent with Agent 2's
Note that there are caveats to the construction of cluster-order we gloss over in this sketch. It is not the case that all possible strict total orders for each agent's own strict total orders in [[EventLists]] give rise to valid executions. For example, if an agent's local order of events assumes certain control dependences on shared memory nondeterminism (e.g., assuming a call to Atomics.wait completed), then cluster-order must order events in such a way that the value depended upon was read. If such a cluster-order does not exist, the construction procedure can start over with different orders in [[EventLists]]. The control dependence nondeterminism introduced by shared memory reads may be explored until a valid execution is found. Note that an execution may be undesirable in reality, such as a deadlock, but still be valid with respect to the
With this cluster-order, a valid "Init"
, such a W always exists. This
With this "SeqCst"
reads that happen to read from "SeqCst"
writes on equal ranges are
Next,
Thus far, we have shown how to construct the needed relations. They must now be checked to satisfy the constraints the model sets out for them: valid chosen reads, coherent reads, tear free reads, and sequentially consistent atomics.
Valid chosen reads may be assumed to be satisfied, as it is a formal mechanism to interface the axiomatic
Coherent reads is satisfied by the construction of
Tear free reads is satisfied by construction of
This execution has sequentially consistent atomics because cluster-order is a "SeqCst"
events must also be sequentially consistent. In other words, cluster-order is the
This concludes the sketch: the constructed execution is valid.
For an execution execution, two events E and D in
For an execution execution, two events E and D in
"SeqCst"
or E and D have overlapping ranges.
An execution execution is data race free if there are no two events in
A program is data race free if all its executions are data race free.
In an execution, a ReadSharedMemory or ReadModifyWriteSharedMemory event is access atomic when it
The Tear Free Reads requirement does not guarantee access atomicity for non-atomic events. Consider the following program where U8
and U16
are aliased 1-byte and 2-byte views on the same a
and b
represent the same memory address (scaled appropriately for the data type).
W1: U16[a] = 1; W2: U8[b] = 2; || W3: U16[a] = 3; || R: observe(U16[a]);
A
R
W1
R
W2
R
reads a value composed of bytes from W1
and W2
, and is thus not access atomic. At the same time, it does not read-from both W2
and W3
, and thus is tear free.
Similarly, neither the U8
and U16
are aliased 1-byte and 2-byte views on the same
W1: Atomics.store(U16, a, 1); W2: Atomics.store(U8, b, 2); || W3: Atomics.store(U16, a, 3); || R: observe(Atomics.load(U16, a));
A
R
W1
R
W1
R
W2
W1
is W2
and W2
is R
and R
is W3
W2
is allowed to come between W1
and R
in R
do not have equal ranges. Thus, R
may read a value composed of bytes written by both W1
and W2
.
Finally,
W1: Atomics.store(U16, a, 1); W2: Atomics.store(U8, b, 2); R: observe(Atomics.load(U16, a));
The only valid execution is as follows.
R
W1
R
W1
R
W2
W1
is W2
and W2
is R
All valid executions of the program are R
is not access atomic since it
The central problem the
A program wishing to avoid reasoning about access atomicity or wishing to guarantee all its aligned and atomic accesses are access atomic should avoid accesses on overlapping ranges. When a program does not use address ranges in an overlapping fashion, aligned and atomic accesses are access atomic.
The
(Spec draft note): A formal proof of the property also requires the extending the model with a host-provided data dependence relation that prohibits out of thin air reads.
"Int32Array"
then throw a "Int8Array"
, "Uint8Array"
, "Int16Array"
, "Uint16Array"
, "Int32Array"
, or "Uint32Array"
then throw a Perform the following steps:
When AgentSignifier is called without arguments then the following steps are taken.
When AgentCanSuspend is called without arguments then the following steps are taken.
In some environments it may not be reasonable for a given agent to suspend. For example, in a web browser environment, it may be reasonable to disallow suspending a document's main event handling thread, while still allowing workers' event handling threads to suspend.
A WaiterList is a semantic object that contains an ordered list of those agents that are waiting on a location (block, i) in shared memory; block is a
The agent cluster has a store of WaiterList objects; the store is indexed by (block, i). WaiterLists are agent-independent: a lookup in the store of WaiterLists by (block, i) will result in the same WaiterList object in any agent in the agent cluster.
Operations on a WaiterList -- adding and removing waiting agents, traversing the list of agents, suspending and waking agents on the list -- may only be performed by agents that have entered the WaiterList's critical section.
When GetWaiterList is called with
When EnterCriticalSection is called with
When LeaveCriticalSection is called with
When AddWaiter is called with
When RemoveWaiter is called with
When RemoveWaiters is called with
When Suspend is called with
When WakeWaiter is called with
The embedding may delay waking W, eg for resource management reasons, but W must eventually be woken in order to guarantee forward progress.
AtomicReadModifyWrite is a semantic function that atomically loads a value, combines it with another value, and stores the result of the combination. It returns the loaded value. It is parameterized by the (pure) combining operation op that takes two
Let add
denote a semantic function of two
The following steps are taken:
add
).
Let and
denote a semantic function of two
The following steps are taken:
and
).
The following steps are taken:
compareExchange
denote a semantic function of two compareExchange
).
Let second
denote a semantic function of two
The following steps are taken:
second
).
Atomics.isLockFree() is an optimization primitive. The
intuition is that if the atomic step of an atomic primitive
(compareExchange
, load
, store
, add
, sub
, and
,
or
, xor
, or exchange
) on a datum of size n bytes
will be performed without the calling agent acquiring a lock
outside the n bytes comprising the datum, then
Atomics.isLockFree(n) will return
Atomics.isLockFree(4) always returns
The following steps are taken:
"SeqCst"
).
Let or
denote a semantic function of two
The following steps are taken:
or
).
The following steps are taken:
"SeqCst"
).Let subtract
denote a semantic function of two
The following steps are taken:
subtract
).
Atomics.wait puts the calling agent in a wait queue and puts it to sleep until it is awoken or the sleep times out. The following steps are taken:
"not-equal"
."ok"
."timed-out"
.
Atomics.wake wakes up some agents that are sleeping in the wait queue. The following steps are taken:
Let xor
denote a semantic function of two
The following steps are taken:
xor
).
Keep programs
More generally, even if a program is not
(Spec draft notes)
The
The question arises as to which transformations that are allowed in a multi-agent setting, where transformations that affect shared memory writes are observable by other agents. An agent that writes 0 to x
and then writes 1 to y
can be transformed in a single-agent setting to one that writes 1 to y
and then writes 0 to x
. But is that a legal transformation if those variables are in shared memory and another agent might observe the order of writes? The transformation would affect the x
and y
are written with "SeqCst"
operations or not?
It is desirable to allow most program transformations that are valid in a single-agent setting in a multi-agent setting, to ensure that the performance of each agent in a multi-agent program is as good as it would be in a single-agent setting. (There seems to be broad agreement in the programming language design community that that is the right trade-off.) But "most" does not mean "all". In the example above, if x
or y
(or both) were intended to be "SeqCst"
writes, and the writes were reordered, not only might x
and y
were both "Unordered"
then in a data-
Frequently these transformations are hard to judge. Can an "Unordered"
load be moved out of a loop? Even if the loop's termination condition depends on the loaded value? The load is necessarily racy, and the
With that background, we outline some rules about program transformations that are intended to be taken as normative (in that they fall out of the
Let an agent-order slice be the subset of the
Let possible read values of a read event be the set of all values of
Any transformation of an agent-order slice that is valid in the absence of shared memory is valid in the presence of shared memory, with the following exceptions.
"SeqCst"
events in an agent-order slice to be reordered with its "Unordered"
operations, nor its "SeqCst"
operations to be reordered with each other, nor may a program transformation remove a "SeqCst"
operation from the (In practice, the prohibition on reorderings forces a compiler to assume that every "SeqCst"
operation is a synchronization and included in the final "SeqCst"
operations.)
(For example, if what is semantically a single read in the program is executed multiple times then the program is subsequently allowed to observe only one of the values read. A transformation known as rematerialization can violate this rule.)
(For example, a transformation may not introduce certain observable writes, such as by using read-modify-write operations on a larger location to write a smaller datum, writing a value to memory that the program could not have written, or writing a just-read value back to the location it was read from, if that location could have been overwritten by another agent after the read.)
(Counterintuitively, this rule in effect restricts transformations on writes, because writes have force in "SeqCst"
operations, but the transformation may not remove every write that updates a location; some write must be preserved.)
(Spec draft note) These rules are technically stronger than what the "SeqCst"
events that "SeqCst"
events that
The rules apply only to shared memory accesses, so an implementation that knows the program is not accessing shared memory will not be affected by them. In practice, the optimizations that are affected by the rules are of limited value, and a practical implementation could avoid those optimizations even for non-shared memory without significant impact.
Examples of transformations that remain valid are: merging multiple non-atomic reads from the same location, reordering non-atomic reads, introducing speculative non-atomic reads, merging multiple non-atomic writes to the same location, reordering non-atomic writes to different locations, and hoisting non-atomic reads out of loops even if that affects termination. Note in general that aliased TypedArrays make it hard to prove that locations are different.
For architectures with memory models no weaker than those of ARM or Power, non-atomic stores and loads may be compiled to bare stores and loads on the target architecture. Atomic stores and loads may be compiled down to instructions that guarantee sequential consistency. If no such instructions exist, memory barriers are to be employed, such as placing barriers on both sides of a bare store or load. Read-modify-write operations may be compiled to read-modify-write instructions on the target architectrue, such as LOCK
-prefixed instructions on x86, load-exclusive/store-exclusive instructions on ARM, and load-link/store-conditional instructions on Power.
(Spec draft notes)
While the
Specifically, the
With that in mind, naive code generation uses these patterns:
That mapping is correct so long as an atomic operation on an address
A number of local improvements to those basic patterns are also intended to be legal:
One informative reference about local improvements is the writeup from the Cambridge Relaxed Memory Group.
On the other hand, the restrictions the
This section outlines how the Shared Memory and Atomics specification fits into the current web ecosystem with the minimum amount of change to that ecosystem. This section is not part of the proposal, it is informative only.
A separate proposal, the DOM companion spec, contains information about changes to the DOM APIs and specifications.
In a web browser an agent is an HTML event loop [here]. The event loop is realized as either a main thread (which may be shared among tabs, as it is in Firefox) or some type of worker thread. The event loops are running jobs in the sense of ES262, and the forward-progress requirement of this specification (section 3.1.1) is generally met as long as each agent has its own dedicated operating system thread or shared-thread agents can't block.
Browsers will typically let agents that run on the browser's
main thread have [[CanBlock]] equal to
Normal ("dedicated") Worker
s must be supported as agents for
this proposal to make much sense.
Memory will be shared among agents by using postMessage
to
transmit SharedArrayBuffer objects or TypedArray objects that
have SharedArrayBuffer buffers; this requires an extension to
the Structured Clone mechanism. There is agreement on the general
syntax and semantics for that extension.
Agent-to-agent communication extends the
Worker
constructor in the parent
postMessage
to another agent So long as the browser does not allow a ServiceWorker
or
SharedWorker
("non-page worker") to share memory with a Worker
it will not violate the suspend/wake cohort rule of this
specification (section 3.3). The restriction on sharing
memory can be implemented in the extension to the Structured Clone
mechanism. For example, a non-page worker may simply not be
allowed to receive shared memory (leading to a null value or an
error signal). There is not yet any agreement on this point, but
it is clear that the restriction must be on the receiving side of
the communication, as the sender may be sending on a MessagePort
that is not yet connected, but may in the future be connected to
either a valid or invalid recipient.
At the moment, I believe dedicated workers are in the same process as their owning tab in all browsers, so if the process crashes then the workers will crash too. I also don't know any reason a browser should forcibly terminate a worker except when a tab is closed. In sum, the termination signaling requirement of this specification (section 3.3) is probably met by existing browsers.
The web platform should evolve to serve the shared memory use case better: by incorporating inspectable worker state and worker lifecycle events, and by tightening the wording in the HTML spec (currently the browser is allowed to kill a worker at any time for any reason, which is not quite what we want). WebIDL should evolve to allow Web APIs to be described as to when they can and cannot receive shared memory parameters. However, only an extension to the Structured Clone algorithm is needed to support the Shared Memory and Atomics specification in practice on current browsers.
© 2017 Mozilla, Inc.
All Software contained in this document ("Software") is protected by copyright and is being made available under the "BSD License", included below. This Software may be subject to third party rights (rights from parties other than Ecma International), including patent rights, and no licenses under such third party rights are granted under this license even if the third party concerned is a member of Ecma International. SEE THE ECMA CODE OF CONDUCT IN PATENT MATTERS AVAILABLE AT http://www.ecma-international.org/memento/codeofconduct.htm FOR INFORMATION REGARDING THE LICENSING OF PATENT CLAIMS THAT ARE REQUIRED TO IMPLEMENT ECMA INTERNATIONAL STANDARDS.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
THIS SOFTWARE IS PROVIDED BY THE ECMA INTERNATIONAL "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL ECMA INTERNATIONAL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.