In concurrent programming, an operation is atomic, or linearizable, if it appears to take effect instantaneously. Implementation details may be ignored by the user, except insofar as they affect performance. On the other hand, if an operation is not atomic, the user will also need to understand and cope with sporadic extraneous behaviour caused by concurrent operations, which by its nature is likely to be hard to reproduce and debug.

Those familiar with the ACID properties of databases should note that, in concurrent programming, atomicity is an isolation level, strictly stronger than serializable. Databases have a different definition of the term atomicity.

Contents

History

Linearizability was first introduced as a consistency model by Herlihy and Wing in 1987. It encompassed more restrictive definitions of atomic, such as "an atomic operation is one which cannot be (or is not) interrupted by concurrent operations", which are usually vague about when an operation is considered to begin and end.

An atomic object can be understood immediately and completely from its sequential definition, as a set of operations run in parallel will always appear to occur one after the other; no inconsistencies may emerge. Specifically, linearizability guarantees that the invariants of a system are observed and preserved by all operations: if all operations individually preserve an invariant, the system as a whole will.

Primitive atomic instructions

All modern computers provide basic atomic primitives, which are then used to build more complex atomic objects, e.g. queues or stacks. In addition to atomic read and write operations, most platforms provide an atomic read-and-update operation like Test-and-set or CAS, or a pair of operations like load-link/store-conditional that only have an effect if they occur atomically (that is, with no intervening, conflicting update). These can be used to implement locks, a vital mechanism for multithreaded programming, allowing invariants and atomicity to be enforced across groups of operations.

Many processors, especially 32-bit ones with 64-bit floating point support, provide some read and write operations that are not atomic: one thread reading a 64-bit register while another thread is writing to it may see a combination of both "before" and "after" values, a combination that may never actually have been written to the register. Further, only single operations are guaranteed to be atomic; threads arbitrarily performing groups of reads and writes will also observe a mixture of "before" and "after" values. Clearly, invariants cannot be relied on when such effects are possible.

High-level atomic operations

The easiest way to achieve linearizability is running groups of primitive operations in a critical section. Strictly independent operations can then be carefully permitted to overlap their critical sections, provided this does not violate linearizability. Such an approach must balance the cost of large numbers of locks against the benefits of increased parallelism.

Another approach, favoured by researchers (but not yet widely used in the software industry), is to design a linearizable object using the native atomic primitives provided by the hardware. This has the potential to maximise available parallelism and minimise synchronisation costs, but requires mathematical proofs which show that the objects behave correctly.

A promising hybrid of these two is to provide a transactional memory abstraction. As with critical sections, the user marks sequential code that must be run in isolation from other threads. The implementation then ensures the code executes atomically. This style of abstraction is common when interacting with databases; for instance, when using the Spring Framework, annotating a method with @Transactional will ensure all enclosed database interactions occur in a single database transaction. Transactional memory goes a step further, ensuring that all memory interactions occur atomically. As with database transactions, issues arise regarding composition of transactions, especially database and in-memory transactions.

A common theme when designing linearizable objects is to provide an all-or-nothing interface: either an operation succeeds completely, or it fails and does nothing. (ACID databases refer to this principle as atomicity.) If the operation fails (usually due to concurrent operations), the user must retry, usually performing a different operation. For example:

  • Compare-and-swap writes a new value into a location only if it matches a supplied old value. This is commonly used in a read-modify-CAS sequence: the user reads the location, computes a new value to write, and writes it with a CAS; if the value changes concurrently, the CAS will fail and the user tries again.
  • Load-Link/Store-Conditional encodes this pattern more directly: the user reads the location with load-link, computes a new value to write, and writes it with store-conditional; if the value has changed concurrently, the SC will fail and the user tries again.
  • In a database transaction, if the transaction cannot be completed due to a concurrent operation (e.g. in a deadlock), the transaction will be aborted and the user must try again.

Rigorous definition

A history is a sequence of invocations and responses made of an object by a set of threads. Each invocation of a function will have a subsequent response. This can be used to model any use of an object. Suppose, for example, that two threads, A and B, both attempt to grab a lock, backing off if it's already taken. This would be modeled as both threads invoking the lock operation, then both threads receiving a response, one successful, one not.

A invokes lock B invokes lock A gets "failed" response B gets "successful" response

A sequential history is one in which all invocations have immediate responses. A sequential history should be trivial to reason about, as it has no real concurrency; the previous example was not sequential, and thus is hard to reason about. This is where linearizability comes in.

A history is linearizable if:

  • its invocations and responses can be reordered to yield a sequential history
  • that sequential history is correct according to the sequential definition of the object
  • if a response preceded an invocation in the original history, it must still precede it in the sequential reordering

(Note that the first two bullet points here match serializability: the operations appear to happen in some order. It is the last point which is unique to linearizability, and is thus the major contribution of Herlihy and Wing.)

Let us look at two ways of reordering the locking example above.

A invokes lock A gets "failed" response B invokes lock B gets "successful" response

Reordering B's invocation below A's response yields a sequential history. This is easy to reason about, as all operations now happen in an obvious order. Unfortunately, it doesn't match the sequential definition of the object: A should have successfully obtained the lock, and B should have subsequently aborted.

B invokes lock B gets "successful" response A invokes lock A gets "failed" response

This is another correct sequential history. It is also a linearization! Note that the definition of linearizability only precludes responses that precede invocations from being reordered; since the original history had no responses before invocations, we can reorder it as we wish. Hence the original history is indeed linearizable.

An object (as opposed to a history) is linearizable if all valid histories of its use can be linearized. Note that this is a much harder assertion to prove.

Linearizability versus serializability

Consider the following history, again of two objects interacting with a lock:

A invokes lock A successfully locks B invokes unlock B successfully unlocks A invokes unlock A successfully unlocks

This history is visibly not linearizable, as it cannot be reordered to another sequential history without violating the ordering rule. However, under serializability, we may reorder B's unlock operation to before A's original lock, which is a valid history assuming the object begins the history in a locked state:

B invokes unlock B successfully unlocks A invokes lock A successfully locks A invokes unlock A successfully unlocks

While weird, this reordering is sensible provided there is no alternative means of communicating between A and B. Linearizability is better when considering individual objects separately, as the reordering restrictions ensure that multiple linearizable objects are, considered as a whole, still linearizable.

Linearization points

This definition of linearizability is equivalent to the following:

  • All function calls have a linearization point at some instant between their invocation and their response
  • All functions appear to occur instantly at their linearization point, behaving as specified by the sequential definition

This alternative is usually much easier to prove. It is also much easier to reason about as a user, largely due to its intuitiveness. This property of occurring instantaneously, or indivisibly, leads to the use of the term atomic as an alternative to the longer "linearizable".

Strict consistency

Strict consistency in computer science is the most stringent consistency model.

It says that a read operation has to return the result of the latest write operation which occurred on that data item. This is only possible when a global clock exists. Since its impossible to implement a global clock across nodes of a distributed system, this model has traditionally only been possible on a uniprocessor.

See also

References

  • M. Herlihy and J. Wing, "Axioms for Concurrent Objects", Proceedings of the 14th ACM SIGACT-SIGPLAN symposium on Principles of programming languages (January 1987), pp. 13-26 [1].
  • M. Herlihy, "A methodology for implementing highly concurrent data structures", ACM SIGPLAN symposium on Principles & practice of parallel programming, 1990, pp. 197-206 [2].

No comments have been added.



Your name:

City:

Country:

Your comments:

Security check *
(Please enter the number into adjoining box)