fbpx
Wikipedia

Compare-and-swap

In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of that memory location to a new given value. This is done as a single atomic operation. The atomicity guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail. The result of the operation must indicate whether it performed the substitution; this can be done either with a simple boolean response (this variant is often called compare-and-set), or by returning the value read from the memory location (not the value written to it).

Overview edit

A compare-and-swap operation is an atomic version of the following pseudocode, where * denotes access through a pointer:[1]

function cas(p: pointer to int, old: int, new: int) is if *p ≠ old return false *p ← new return true 

This operation is used to implement synchronization primitives like semaphores and mutexes,[1] as well as more sophisticated lock-free and wait-free algorithms. Maurice Herlihy (1991) proved that CAS can implement more of these algorithms than atomic read, write, or fetch-and-add, and assuming a fairly large[clarification needed] amount of memory, that it can implement all of them.[2] CAS is equivalent to load-link/store-conditional, in the sense that a constant number of invocations of either primitive can be used to implement the other one in a wait-free manner.[3]

Algorithms built around CAS typically read some key memory location and remember the old value. Based on that old value, they compute some new value. Then they try to swap in the new value using CAS, where the comparison checks for the location still being equal to the old value. If CAS indicates that the attempt has failed, it has to be repeated from the beginning: the location is re-read, a new value is re-computed and the CAS is tried again. Instead of immediately retrying after a CAS operation fails, researchers have found that total system performance can be improved in multiprocessor systems—where many threads constantly update some particular shared variable—if threads that see their CAS fail use exponential backoff—in other words, wait a little before retrying the CAS.[4]

Example application: atomic adder edit

As an example use case of compare-and-swap, here is an algorithm for atomically incrementing or decrementing an integer. This is useful in a variety of applications that use counters. The function add performs the action *p ← *p + a, atomically (again denoting pointer indirection by *, as in C) and returns the final value stored in the counter. Unlike in the cas pseudocode above, there is no requirement that any sequence of operations is atomic except for cas.

function add(p: pointer to int, a: int) returns int done ← false while not done value ← *p // Even this operation doesn't need to be atomic. done ← cas(p, value, value + a) return value + a 

In this algorithm, if the value of *p changes after (or while!) it is fetched and before the CAS does the store, CAS will notice and report this fact, causing the algorithm to retry.[5]

ABA problem edit

Some CAS-based algorithms are affected by and must handle the problem of a false positive match, or the ABA problem. It is possible that between the time the old value is read and the time CAS is attempted, some other processors or threads change the memory location two or more times such that it acquires a bit pattern which matches the old value. The problem arises if this new bit pattern, which looks exactly like the old value, has a different meaning: for instance, it could be a recycled address, or a wrapped version counter.

A general solution to this is to use a double-length CAS (DCAS). E.g., on a 32-bit system, a 64-bit CAS can be used. The second half is used to hold a counter. The compare part of the operation compares the previously read value of the pointer and the counter, with the current pointer and counter. If they match, the swap occurs - the new value is written - but the new value has an incremented counter. This means that if ABA has occurred, although the pointer value will be the same, the counter is exceedingly unlikely to be the same (for a 32-bit value, a multiple of 232 operations would have to have occurred, causing the counter to wrap and at that moment, the pointer value would have to also by chance be the same).

An alternative form of this (useful on CPUs which lack DCAS) is to use an index into a freelist, rather than a full pointer, e.g. with a 32-bit CAS, use a 16-bit index and a 16-bit counter. However, the reduced counter lengths begin to make ABA possible at modern CPU speeds.

One simple technique which helps alleviate this problem is to store an ABA counter in each data structure element, rather than using a single ABA counter for the whole data structure.

A more complicated but more effective solution is to implement safe memory reclamation (SMR). This is in effect lock-free garbage collection. The advantage of using SMR is the assurance a given pointer will exist only once at any one time in the data structure, thus the ABA problem is completely solved. (Without SMR, something like a freelist will be in use, to ensure that all data elements can be accessed safely (no memory access violations) even when they are no longer present in the data structure. With SMR, only elements actually currently in the data structure will be accessed).

Costs and benefits edit

CAS, and other atomic instructions, are sometimes thought to be unnecessary in uniprocessor systems, because the atomicity of any sequence of instructions can be achieved by disabling interrupts while executing it. However, disabling interrupts has numerous downsides. For example, code that is allowed to do so must be trusted not to be malicious and monopolize the CPU, as well as to be correct and not accidentally hang the machine in an infinite loop or page fault. Further, disabling interrupts is often deemed too expensive to be practical. Thus, even programs only intended to run on uniprocessor machines will benefit from atomic instructions, as in the case of Linux's futexes.

In multiprocessor systems, it is usually impossible to disable interrupts on all processors at the same time. Even if it were possible, two or more processors could be attempting to access the same semaphore's memory at the same time, and thus atomicity would not be achieved. The compare-and-swap instruction allows any processor to atomically test and modify a memory location, preventing such multiple-processor collisions.

On server-grade multi-processor architectures of the 2010s, compare-and-swap is cheap relative to a simple load that is not served from cache. A 2013 paper points out that a CAS is only 1.15 times more expensive than a non-cached load on Intel Xeon (Westmere-EX) and 1.35 times on AMD Opteron (Magny-Cours).[6]

Implementations edit

Compare-and-swap (and compare-and-swap-double) has been an integral part of the IBM 370 (and all successor) architectures since 1970. The operating systems that run on these architectures make extensive use of this instruction to facilitate process (i.e., system and user tasks) and processor (i.e., central processors) parallelism while eliminating, to the greatest degree possible, the "disabled spinlocks" which had been employed in earlier IBM operating systems. Similarly, the use of test-and-set was also eliminated. In these operating systems, new units of work may be instantiated "globally", into the global service priority list, or "locally", into the local service priority list, by the execution of a single compare-and-swap instruction. This substantially improved the responsiveness of these operating systems.

In the x86 (since 80486) and Itanium architectures this is implemented as the compare and exchange (CMPXCHG) instruction (on a multiprocessor the LOCK prefix must be used).

As of 2013, most multiprocessor architectures support CAS in hardware, and the compare-and-swap operation is the most popular synchronization primitive for implementing both lock-based and non-blocking concurrent data structures.[4]

The atomic counter and atomic bitmask operations in the Linux kernel typically use a compare-and-swap instruction in their implementation. The SPARC-V8 and PA-RISC architectures are two of the very few recent architectures that do not support CAS in hardware; the Linux port to these architectures uses a spinlock.[7]

Implementation in C edit

Many C compilers support using compare-and-swap either with the C11 <stdatomic.h> functions,[8] or some non-standard C extension of that particular C compiler,[9] or by calling a function written directly in assembly language using the compare-and-swap instruction.

The following C function shows the basic behavior of a compare-and-swap variant that returns the old value of the specified memory location; however, this version does not provide the crucial guarantees of atomicity that a real compare-and-swap operation would:

int compare_and_swap(int* reg, int oldval, int newval) {  ATOMIC();  int old_reg_val = *reg;  if (old_reg_val == oldval)  *reg = newval;  END_ATOMIC();  return old_reg_val; } 

old_reg_val is always returned, but it can be tested following the compare_and_swap operation to see if it matches oldval, as it may be different, meaning that another process has managed to succeed in a competing compare_and_swap to change the reg value from oldval.

For example, an election protocol can be implemented such that every process checks the result of compare_and_swap against its own PID (= newval). The winning process finds the compare_and_swap returning the initial non-PID value (e.g., zero). For the losers it will return the winning PID.

This is the logic in the Intel Software Manual Vol 2A:

bool compare_and_swap(int *accum, int *dest, int newval) {  if (*accum == *dest) {  *dest = newval;  return true;  } else {  *accum = *dest;  return false;  } } 

Extensions edit

Since CAS operates on a single pointer-sized memory location, while most lock-free and wait-free algorithms need to modify multiple locations, several extensions have been implemented.

Double compare-and-swap (DCAS)
Compares two unrelated memory locations with two expected values, and if they're equal, sets both locations to new values. The generalization of DCAS to multiple (non-adjacent) words is called MCAS or CASN. DCAS and MCAS are of practical interest in the convenient (concurrent) implementation of some data structures like deques or binary search trees.[10][11] DCAS and MCAS may be implemented however using the more expressive hardware transactional memory[12] present in some recent processors such as IBM POWER8 or in Intel processors supporting Transactional Synchronization Extensions (TSX).
Double-wide compare-and-swap
Operates on two adjacent pointer-sized locations (or, equivalently, one location twice as big as a pointer). On later x86 processors, the CMPXCHG8B and CMPXCHG16B instructions[13] serve this role, although early 64-bit AMD CPUs did not support CMPXCHG16B (modern AMD CPUs do). Some Intel motherboards from the Core 2 era also hamper its use, even though the processors support it. These issues came into the spotlight at the launch of Windows 8.1 because it required hardware support for CMPXCHG16B.[14]
Single compare, double swap
Compares one pointer but writes two. The Itanium's cmp8xchg16 instruction implements this,[15] where the two written pointers are adjacent.
Multi-word compare-and-swap
Is a generalisation of normal compare-and-swap. It can be used to atomically swap an arbitrary number of arbitrarily located memory locations. Usually, multi-word compare-and-swap is implemented in software using normal double-wide compare-and-swap operations.[16] The drawback of this approach is a lack of scalability.
Persistent compare-and-swap
Is a combination of persist operation and the normal compare-and-swap. It can be used to atomically compare-and-swap a value and then persist the value, so there is no gap between concurrent visibility and crash visibility. The extension solves the read-of-non-persistent-write problem.[17]

See also edit

References edit

  1. ^ a b Mullender, Sape; Cox, Russ (2008). Semaphores in Plan 9 (PDF). 3rd International Workshop on Plan 9.
  2. ^ Herlihy, Maurice (January 1991). "Wait-free synchronization" (PDF). ACM Trans. Program. Lang. Syst. 13 (1): 124–149. CiteSeerX 10.1.1.56.5659. doi:10.1145/114005.102808. S2CID 2181446. Retrieved 2007-05-20.
  3. ^ J. H. Anderson and M. Moir. "Universal constructions for multi-object operations". In Proc. 14th Annual ACM Symposium on Principles of Distributed Computing, pages 184–193, 1995. See their Table 1, Figures 1 & 2 and Section 2 in particular.
  4. ^ a b Dice, Dave; Hendler, Danny; Mirsky, Ilya (2013). "Lightweight Contention Management for Efficient Compare-and-Swap Operations". arXiv:1305.5800 [cs.DC].
  5. ^ Goetz, Brian (23 November 2004). "Java theory and practice: Going atomic". IBM developerWorks.
  6. ^ Tudor David, Rachid Guerraoui, and Vasileios Trigonakis. "Everything you always wanted to know about synchronization but were afraid to ask." Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, 2013, pp. 33-48. Detail on p. 34
  7. ^ David S. Miller. "Semantics and Behavior of Atomic and Bitmask Operations, for Linux port maintainers" 2012-03-20 at the Wayback Machine.
  8. ^ https://en.cppreference.com/w/c/atomic/atomic_compare_exchange
  9. ^ "GNU C Extensions to the C Language Family: Built-in functions for atomic memory access"
  10. ^ Simon Doherty et al., "DCAS is not a silver bullet for nonblocking algorithm design". 16th annual ACM symposium on Parallelism in algorithms and architectures, 2004, pp. 216–224. doi:10.1145/1007912.1007945
  11. ^ Keir Fraser (2004), "Practical lock-freedom" UCAM-CL-TR-579.pdf
  12. ^ Dave Dice, Yossi Lev, Mark Moir, Dan Nussbaum, and Marek Olszewski. (2009) "Early experience with a commercial hardware transactional memory implementation." Sun Microsystems technical report (60 pp.) SMLI TR-2009-180. A short version appeared at ASPLOS’09 doi:10.1145/1508244.1508263. The full-length report discusses how to implement DCAS using HTM in section 5.
  13. ^ "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2A: Instruction Set Reference, A-M" (PDF). Retrieved 2007-12-15.
  14. ^ "New Windows 8.1 requirements strand some users on Windows 8".
  15. ^ "Intel Itanium Architecture Software Developer's Manual Volume 3: Instruction Set Reference" (PDF). Retrieved 2007-12-15.
  16. ^ "A Practical Multi-Word Compare-and-Swap Operation" (PDF). Retrieved 2009-08-08.
  17. ^ Wang, William; Diestelhorst, Stephan (June 17, 2019). "Persistent Atomics for Implementing Durable Lock-Free Data Structures for Non-Volatile Memory (Brief Announcement)". The 31st ACM Symposium on Parallelism in Algorithms and Architectures. Association for Computing Machinery. pp. 309–311. doi:10.1145/3323165.3323166. ISBN 9781450361842. S2CID 195064876 – via ACM Digital Library.

External links edit

Basic algorithms implemented using CAS edit

  • Sundell, Håkan; Tsigas, Philippas. "Lock-Free and Practical Deques using Single-Word Compare-And-Swap" (PDF).
  • Valois, John D. Lock-Free Linked Lists Using Compare-and-Swap. Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing. CiteSeerX 10.1.1.41.9506.
  • Prakash, S.; Lee, Yann Hang; Johnson, T. "A Nonblocking Algorithm for Shared Queues Using Compare-and-Swap". IEEE Transactions on Computers.
  • 2003 discussion "Lock-Free using cmpxchg8b..." on Intel x86, with pointers to various papers and source code

Implementations of CAS edit

  • AIX compare_and_swap Kernel Service
  • Java package java.util.concurrent.atomic implements 'compareAndSet' in various classes
  • .NET Class methods Interlocked::CompareExchange
  • Windows API InterlockedCompareExchange

compare, swap, this, article, uses, bare, urls, which, uninformative, vulnerable, link, please, consider, converting, them, full, citations, ensure, article, remains, verifiable, maintains, consistent, citation, style, several, templates, tools, available, ass. This article uses bare URLs which are uninformative and vulnerable to link rot Please consider converting them to full citations to ensure the article remains verifiable and maintains a consistent citation style Several templates and tools are available to assist in formatting such as reFill documentation and Citation bot documentation September 2022 Learn how and when to remove this template message In computer science compare and swap CAS is an atomic instruction used in multithreading to achieve synchronization It compares the contents of a memory location with a given value and only if they are the same modifies the contents of that memory location to a new given value This is done as a single atomic operation The atomicity guarantees that the new value is calculated based on up to date information if the value had been updated by another thread in the meantime the write would fail The result of the operation must indicate whether it performed the substitution this can be done either with a simple boolean response this variant is often called compare and set or by returning the value read from the memory location not the value written to it Contents 1 Overview 1 1 Example application atomic adder 1 2 ABA problem 1 3 Costs and benefits 2 Implementations 2 1 Implementation in C 3 Extensions 4 See also 5 References 6 External links 6 1 Basic algorithms implemented using CAS 6 2 Implementations of CASOverview editA compare and swap operation is an atomic version of the following pseudocode where denotes access through a pointer 1 function cas p pointer to int old int new int is if p old return false p new return true This operation is used to implement synchronization primitives like semaphores and mutexes 1 as well as more sophisticated lock free and wait free algorithms Maurice Herlihy 1991 proved that CAS can implement more of these algorithms than atomic read write or fetch and add and assuming a fairly large clarification needed amount of memory that it can implement all of them 2 CAS is equivalent to load link store conditional in the sense that a constant number of invocations of either primitive can be used to implement the other one in a wait free manner 3 Algorithms built around CAS typically read some key memory location and remember the old value Based on that old value they compute some new value Then they try to swap in the new value using CAS where the comparison checks for the location still being equal to the old value If CAS indicates that the attempt has failed it has to be repeated from the beginning the location is re read a new value is re computed and the CAS is tried again Instead of immediately retrying after a CAS operation fails researchers have found that total system performance can be improved in multiprocessor systems where many threads constantly update some particular shared variable if threads that see their CAS fail use exponential backoff in other words wait a little before retrying the CAS 4 Example application atomic adder edit As an example use case of compare and swap here is an algorithm for atomically incrementing or decrementing an integer This is useful in a variety of applications that use counters The function add performs the action p p a atomically again denoting pointer indirection by as in C and returns the final value stored in the counter Unlike in the cas pseudocode above there is no requirement that any sequence of operations is atomic except for cas function add p pointer to int a int returns int done false while not done value p Even this operation doesn t need to be atomic done cas p value value a return value a In this algorithm if the value of p changes after or while it is fetched and before the CAS does the store CAS will notice and report this fact causing the algorithm to retry 5 ABA problem edit Main article ABA problem Some CAS based algorithms are affected by and must handle the problem of a false positive match or the ABA problem It is possible that between the time the old value is read and the time CAS is attempted some other processors or threads change the memory location two or more times such that it acquires a bit pattern which matches the old value The problem arises if this new bit pattern which looks exactly like the old value has a different meaning for instance it could be a recycled address or a wrapped version counter A general solution to this is to use a double length CAS DCAS E g on a 32 bit system a 64 bit CAS can be used The second half is used to hold a counter The compare part of the operation compares the previously read value of the pointer and the counter with the current pointer and counter If they match the swap occurs the new value is written but the new value has an incremented counter This means that if ABA has occurred although the pointer value will be the same the counter is exceedingly unlikely to be the same for a 32 bit value a multiple of 232 operations would have to have occurred causing the counter to wrap and at that moment the pointer value would have to also by chance be the same An alternative form of this useful on CPUs which lack DCAS is to use an index into a freelist rather than a full pointer e g with a 32 bit CAS use a 16 bit index and a 16 bit counter However the reduced counter lengths begin to make ABA possible at modern CPU speeds One simple technique which helps alleviate this problem is to store an ABA counter in each data structure element rather than using a single ABA counter for the whole data structure A more complicated but more effective solution is to implement safe memory reclamation SMR This is in effect lock free garbage collection The advantage of using SMR is the assurance a given pointer will exist only once at any one time in the data structure thus the ABA problem is completely solved Without SMR something like a freelist will be in use to ensure that all data elements can be accessed safely no memory access violations even when they are no longer present in the data structure With SMR only elements actually currently in the data structure will be accessed Costs and benefits edit CAS and other atomic instructions are sometimes thought to be unnecessary in uniprocessor systems because the atomicity of any sequence of instructions can be achieved by disabling interrupts while executing it However disabling interrupts has numerous downsides For example code that is allowed to do so must be trusted not to be malicious and monopolize the CPU as well as to be correct and not accidentally hang the machine in an infinite loop or page fault Further disabling interrupts is often deemed too expensive to be practical Thus even programs only intended to run on uniprocessor machines will benefit from atomic instructions as in the case of Linux s futexes In multiprocessor systems it is usually impossible to disable interrupts on all processors at the same time Even if it were possible two or more processors could be attempting to access the same semaphore s memory at the same time and thus atomicity would not be achieved The compare and swap instruction allows any processor to atomically test and modify a memory location preventing such multiple processor collisions On server grade multi processor architectures of the 2010s compare and swap is cheap relative to a simple load that is not served from cache A 2013 paper points out that a CAS is only 1 15 times more expensive than a non cached load on Intel Xeon Westmere EX and 1 35 times on AMD Opteron Magny Cours 6 Implementations editCompare and swap and compare and swap double has been an integral part of the IBM 370 and all successor architectures since 1970 The operating systems that run on these architectures make extensive use of this instruction to facilitate process i e system and user tasks and processor i e central processors parallelism while eliminating to the greatest degree possible the disabled spinlocks which had been employed in earlier IBM operating systems Similarly the use of test and set was also eliminated In these operating systems new units of work may be instantiated globally into the global service priority list or locally into the local service priority list by the execution of a single compare and swap instruction This substantially improved the responsiveness of these operating systems In the x86 since 80486 and Itanium architectures this is implemented as the compare and exchange CMPXCHG instruction on a multiprocessor the LOCK prefix must be used As of 2013 most multiprocessor architectures support CAS in hardware and the compare and swap operation is the most popular synchronization primitive for implementing both lock based and non blocking concurrent data structures 4 The atomic counter and atomic bitmask operations in the Linux kernel typically use a compare and swap instruction in their implementation The SPARC V8 and PA RISC architectures are two of the very few recent architectures that do not support CAS in hardware the Linux port to these architectures uses a spinlock 7 Implementation in C edit Many C compilers support using compare and swap either with the C11 lt stdatomic h gt functions 8 or some non standard C extension of that particular C compiler 9 or by calling a function written directly in assembly language using the compare and swap instruction The following C function shows the basic behavior of a compare and swap variant that returns the old value of the specified memory location however this version does not provide the crucial guarantees of atomicity that a real compare and swap operation would int compare and swap int reg int oldval int newval ATOMIC int old reg val reg if old reg val oldval reg newval END ATOMIC return old reg val old reg val is always returned but it can be tested following the compare and swap operation to see if it matches oldval as it may be different meaning that another process has managed to succeed in a competing compare and swap to change the reg value from oldval For example an election protocol can be implemented such that every process checks the result of compare and swap against its own PID newval The winning process finds the compare and swap returning the initial non PID value e g zero For the losers it will return the winning PID This is the logic in the Intel Software Manual Vol 2A bool compare and swap int accum int dest int newval if accum dest dest newval return true else accum dest return false Extensions editSince CAS operates on a single pointer sized memory location while most lock free and wait free algorithms need to modify multiple locations several extensions have been implemented Double compare and swap DCAS Compares two unrelated memory locations with two expected values and if they re equal sets both locations to new values The generalization of DCAS to multiple non adjacent words is called MCAS or CASN DCAS and MCAS are of practical interest in the convenient concurrent implementation of some data structures like deques or binary search trees 10 11 DCAS and MCAS may be implemented however using the more expressive hardware transactional memory 12 present in some recent processors such as IBM POWER8 or in Intel processors supporting Transactional Synchronization Extensions TSX Double wide compare and swap Operates on two adjacent pointer sized locations or equivalently one location twice as big as a pointer On later x86 processors the CMPXCHG8B and CMPXCHG16B instructions 13 serve this role although early 64 bit AMD CPUs did not support CMPXCHG16B modern AMD CPUs do Some Intel motherboards from the Core 2 era also hamper its use even though the processors support it These issues came into the spotlight at the launch of Windows 8 1 because it required hardware support for CMPXCHG16B 14 Single compare double swap Compares one pointer but writes two The Itanium s cmp8xchg16 instruction implements this 15 where the two written pointers are adjacent Multi word compare and swap Is a generalisation of normal compare and swap It can be used to atomically swap an arbitrary number of arbitrarily located memory locations Usually multi word compare and swap is implemented in software using normal double wide compare and swap operations 16 The drawback of this approach is a lack of scalability Persistent compare and swap Is a combination of persist operation and the normal compare and swap It can be used to atomically compare and swap a value and then persist the value so there is no gap between concurrent visibility and crash visibility The extension solves the read of non persistent write problem 17 See also editConditional Put and Delete Fetch and add Load link store conditional Non blocking synchronization Read modify write Test and set Transactional memory Treiber stackReferences edit a b Mullender Sape Cox Russ 2008 Semaphores in Plan 9 PDF 3rd International Workshop on Plan 9 Herlihy Maurice January 1991 Wait free synchronization PDF ACM Trans Program Lang Syst 13 1 124 149 CiteSeerX 10 1 1 56 5659 doi 10 1145 114005 102808 S2CID 2181446 Retrieved 2007 05 20 J H Anderson and M Moir Universal constructions for multi object operations In Proc 14th Annual ACM Symposium on Principles of Distributed Computing pages 184 193 1995 See their Table 1 Figures 1 amp 2 and Section 2 in particular a b Dice Dave Hendler Danny Mirsky Ilya 2013 Lightweight Contention Management for Efficient Compare and Swap Operations arXiv 1305 5800 cs DC Goetz Brian 23 November 2004 Java theory and practice Going atomic IBM developerWorks Tudor David Rachid Guerraoui and Vasileios Trigonakis Everything you always wanted to know about synchronization but were afraid to ask Proceedings of the Twenty Fourth ACM Symposium on Operating Systems Principles ACM 2013 pp 33 48 Detail on p 34 David S Miller Semantics and Behavior of Atomic and Bitmask Operations for Linux port maintainers Archived 2012 03 20 at the Wayback Machine https en cppreference com w c atomic atomic compare exchange GNU C Extensions to the C Language Family Built in functions for atomic memory access Simon Doherty et al DCAS is not a silver bullet for nonblocking algorithm design 16th annual ACM symposium on Parallelism in algorithms and architectures 2004 pp 216 224 doi 10 1145 1007912 1007945 Keir Fraser 2004 Practical lock freedom UCAM CL TR 579 pdf Dave Dice Yossi Lev Mark Moir Dan Nussbaum and Marek Olszewski 2009 Early experience with a commercial hardware transactional memory implementation Sun Microsystems technical report 60 pp SMLI TR 2009 180 A short version appeared at ASPLOS 09 doi 10 1145 1508244 1508263 The full length report discusses how to implement DCAS using HTM in section 5 Intel 64 and IA 32 Architectures Software Developer s Manual Volume 2A Instruction Set Reference A M PDF Retrieved 2007 12 15 New Windows 8 1 requirements strand some users on Windows 8 Intel Itanium Architecture Software Developer s Manual Volume 3 Instruction Set Reference PDF Retrieved 2007 12 15 A Practical Multi Word Compare and Swap Operation PDF Retrieved 2009 08 08 Wang William Diestelhorst Stephan June 17 2019 Persistent Atomics for Implementing Durable Lock Free Data Structures for Non Volatile Memory Brief Announcement The 31st ACM Symposium on Parallelism in Algorithms and Architectures Association for Computing Machinery pp 309 311 doi 10 1145 3323165 3323166 ISBN 9781450361842 S2CID 195064876 via ACM Digital Library External links editThis article s use of external links may not follow Wikipedia s policies or guidelines Please improve this article by removing excessive or inappropriate external links and converting useful links where appropriate into footnote references February 2015 Learn how and when to remove this template message Basic algorithms implemented using CAS edit Sundell Hakan Tsigas Philippas Lock Free and Practical Deques using Single Word Compare And Swap PDF Valois John D Lock Free Linked Lists Using Compare and Swap Proceedings of the Fourteenth Annual ACM Symposium on Principles of Distributed Computing CiteSeerX 10 1 1 41 9506 Prakash S Lee Yann Hang Johnson T A Nonblocking Algorithm for Shared Queues Using Compare and Swap IEEE Transactions on Computers 2003 discussion Lock Free using cmpxchg8b on Intel x86 with pointers to various papers and source codeImplementations of CAS edit AIX compare and swap Kernel Service Java package java util concurrent atomic implements compareAndSet in various classes NET Class methods Interlocked CompareExchange Windows API InterlockedCompareExchange Retrieved from https en wikipedia org w index php title Compare and swap amp oldid 1189977377, wikipedia, wiki, book, books, library,

article

, read, download, free, free download, mp3, video, mp4, 3gp, jpg, jpeg, gif, png, picture, music, song, movie, book, game, games.