C++ atomic operations
C++ atomic operations are essential for writing safe and efficient concurrent code by ensuring that operations on shared variables are performed atomically and with well-defined memory ordering semantics. They are a powerful tool for writing high-performance multithreaded applications while avoiding data races and synchronization issues.
Atomic operations can provide synchronization without locks, avoiding the overhead and the pitfalls of locking, such as deadlock. Used correctly for simple operations such as counters and flags, they are typically lighter-weight than a mutex.
However, atomic operations have limitations and are suitable for specific scenarios. For more complex synchronization, such as protecting critical sections or implementing more intricate synchronization patterns, you may still need to use mutexes or other synchronization primitives.
Atomic Operations use hardware
Atomic operations use hardware-supported atomic instructions (e.g., compare-and-swap, load-link/store-conditional) to ensure that certain operations on shared variables are performed indivisibly.
These operations are typically implemented using CPU instructions that guarantee atomicity without requiring explicit locking mechanisms.
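Whether an atomic operation is lock-free or wait-free depends on the specific implementation and the underlying hardware support. The standard only guarantees that std::atomic_flag is lock-free; for other types, is_lock_free() reports whether the implementation falls back to an internal lock.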
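Whether a given atomic type is lock-free can be checked rather than assumed. The sketch below is a minimal illustration; the helper name int_atomic_is_lock_free is hypothetical, while is_lock_free() and the ATOMIC_INT_LOCK_FREE macro come from the standard <atomic> header.

```cpp
#include <atomic>

// Hypothetical helper: asks the runtime whether std::atomic<int>
// is implemented with lock-free instructions on this platform.
bool int_atomic_is_lock_free() {
    std::atomic<int> counter{0};
    return counter.is_lock_free();
}

// Compile-time view of the same question:
// 0 = never lock-free, 1 = sometimes lock-free, 2 = always lock-free.
constexpr int int_lock_free_level = ATOMIC_INT_LOCK_FREE;
```

On mainstream desktop and server CPUs this is expected to report lock-free for int, but the result is a property of the platform, so code that depends on it should check it.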
Locks use software
Locks (e.g., mutexes, spinlocks) use software-based synchronization techniques to control access to shared resources.
A lock must be acquired and released explicitly, for example a mutex (std::mutex) or a spinlock built on std::atomic_flag.
When a thread acquires a lock, it gains exclusive access to the critical section of code, preventing other threads from entering that section until the lock is released.
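A critical section that updates several variables together is a case where a lock fits better than individual atomics. The following is a minimal sketch under that assumption; Account, deposit and run_deposits are hypothetical names made up for the illustration.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state: two fields that must always change together.
struct Account {
    std::mutex m;
    long deposits = 0;
    long balance = 0;
};

// The lock_guard acquires the mutex here and releases it at scope exit,
// so both fields are updated as one indivisible step.
void deposit(Account& acct, long amount) {
    std::lock_guard<std::mutex> guard(acct.m);
    acct.deposits += 1;
    acct.balance += amount;
}

// Starts `threads` workers, each depositing 1 unit `per_thread` times,
// and returns the final balance after all workers have joined.
long run_deposits(int threads, int per_thread) {
    Account acct;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&acct, per_thread] {
            for (int i = 0; i < per_thread; ++i) deposit(acct, 1);
        });
    }
    for (auto& w : workers) w.join();
    return acct.balance;
}
```

Calling run_deposits(4, 1000) should return 4000, because no two threads can be inside deposit at the same time.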
Memory order
memory_order_seq_cst
Sequentially consistent (the default if no order is specified). To guarantee it, the compiler may have to:
- perform additional store flushes
- perform additional cache refreshes
- restrict instruction reordering
- restrict optimization
memory_order_relaxed
- gets maximum performance by avoiding the restrictions above
- you MUST resolve race conditions explicitly, because only atomicity is guaranteed, not ordering
memory_order_acquire
memory_order_release
memory_order_acq_rel
memory_order_consume
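A shared event counter is a typical case where memory_order_relaxed is enough: each increment must be atomic, but no other data is published through the counter. The sketch below assumes that scenario; count_events is a hypothetical name.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` workers each increment a shared counter
// `per_thread` times using relaxed read-modify-write operations.
int count_events(int threads, int per_thread) {
    std::atomic<int> hits{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&hits, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                // Atomic increment, with no ordering imposed on other memory.
                hits.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    // join() synchronizes with the end of each thread, so the final
    // relaxed load below observes every increment.
    for (auto& w : workers) w.join();
    return hits.load(std::memory_order_relaxed);
}
```

No increment is lost even with relaxed ordering, because fetch_add is still a single atomic read-modify-write; what relaxed gives up is any guarantee about the visibility of other variables.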
Categorization of atomic operations
- Load / acquire operations
memory_order_relaxed
memory_order_consume
memory_order_acquire
memory_order_seq_cst
- Store / release operations
memory_order_relaxed
memory_order_release
memory_order_seq_cst
- Read-modify-write / acquire-release operations
memory_order_relaxed
memory_order_consume
memory_order_acquire
memory_order_release
memory_order_acq_rel
memory_order_seq_cst
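The three categories above can be seen together in a compare-and-swap retry loop. This is a minimal sketch, not a definitive pattern; store_max is a hypothetical helper that raises an atomic value to a candidate if the candidate is larger.

```cpp
#include <atomic>

// Hypothetical helper: atomically sets `target` to max(target, candidate).
void store_max(std::atomic<int>& target, int candidate) {
    // Plain load: only a load ordering applies here.
    int seen = target.load(std::memory_order_relaxed);
    while (seen < candidate &&
           !target.compare_exchange_weak(
               seen, candidate,
               std::memory_order_acq_rel,    // success: a read-modify-write
               std::memory_order_relaxed)) { // failure: only a load happened
        // On failure `seen` now holds the current value of `target`;
        // the loop condition re-checks whether an update is still needed.
    }
}
```

The failure ordering takes a load ordering because a failed compare-exchange writes nothing. The weak form may fail spuriously, which is why it sits inside a loop.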
An atomic store operation tagged with memory_order_release has a synchronizes-with relationship with an atomic load operation tagged with memory_order_acquire that reads the stored value.
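The release/acquire pairing is often used to hand plain, non-atomic data from one thread to another. The sketch below assumes that message-passing scenario; pass_message, payload and ready are hypothetical names.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: a producer publishes `payload` through the
// `ready` flag and a consumer reads it; returns what the consumer saw.
int pass_message() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // the synchronization variable
    int received = -1;

    std::thread producer([&] {
        payload = 42;                                  // written before the release
        ready.store(true, std::memory_order_release);  // publishes the write above
    });

    std::thread consumer([&] {
        // Spin until the acquire-load reads the value stored by the producer.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // The acquire-load synchronized with the release-store,
        // so the write to `payload` is visible here.
        received = payload;
    });

    producer.join();
    consumer.join();
    return received;
}
```

If both operations on ready were memory_order_relaxed instead, the read of payload would be a data race, since nothing would order it after the producer's write.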
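#include <atomic>
#include <fmt/core.h>

int main() {
    std::atomic<int> x{ 2 };
    fmt::print( "Initial value: {}\n", x.load() );
    x.store( 5 );
    fmt::print( "New value: {}\n", x.load() );
    // fetch_add returns the previous value. Capture it first: the evaluation
    // order of function arguments is unspecified, so x.load() could run first.
    const int previous = x.fetch_add( 3 );
    fmt::print( "fetch_add: was {} now {}\n", previous, x.load() );
    // Release-store: writes made before this store become visible to a thread
    // whose acquire-load reads the value stored here
    x.store( 10, std::memory_order_release );
    // Acquire-load: pairs with the release-store above
    int value = x.load( std::memory_order_acquire );
    // x holds 10, so an expected value of 11 makes the exchange fail;
    // on failure `value` is overwritten with the current value of x
    value = 11;
    bool ret_val = x.compare_exchange_weak(
        value,                       // expected value, updated on failure
        13,                          // desired value, stored on success
        std::memory_order_release,   // for success scenario
        std::memory_order_relaxed ); // for failure scenario
    // Here ret_val is false, value is 10, and x is unchanged
}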
Possible output