Is this a correct implementation of atomic reference counting in C?

Question 1

Given the following constraints, is the optimization in this code correct?

Before an object is transferred to another thread, its reference count is incremented and is_shared is set to true.
Every time an object reaches a reference count of 1, is_shared is set to false.
Header_is_unique is 100% thread safe.

The idea behind the optimization is to try to reduce the number of atomic operations.

Is this a fool's errand? Is this kind of optimization impossible to do in a thread safe manner without using more atomic values and/or mutexes, thus invalidating the idea of the optimization itself?

I only want to target the GCC and Clang compilers.

#include <stdint.h>

typedef struct Header {
    uint8_t is_shared;
    uint64_t reference_count;
} Header;

void
Header_acquire(
    Header * const self
) {
    if (!self->is_shared) {
        ++self->reference_count;
    } else {
        __atomic_fetch_add(&self->reference_count, 1, __ATOMIC_RELAXED);
    }
}

uint8_t
Header_release(
    Header * const self
) {
    if (!self->is_shared) {
        return !--self->reference_count;
    }
    uint64_t const reference_count =
        __atomic_fetch_sub(&self->reference_count, 1, __ATOMIC_RELEASE);
    if (reference_count == 2) {
        __atomic_thread_fence(__ATOMIC_ACQUIRE);
        self->is_shared = 0;
    }
    return reference_count == 1;
}

uint8_t
Header_is_unique(
    Header const * const self
) {
    return !self->is_shared && self->reference_count == 1;
}

Question 2

Perhaps I am blind, but I don't see how is_shared ever becomes non-zero.

Question 3

@vnp: From the description it seems that a separate function to set it to non-zero would exist, somewhere, and it would be the responsibility of the caller to call it appropriately.

Question 4

@MatthieuM. Most likely so, but there is no point in reviewing the code without that separate function in view.

Question 5

This implementation is wrong. if (reference_count == 2) means that only a single thread now holds a reference to self, but that is *another* thread, not the current one. See stackoverflow.com/a/79320186/6401656 for more.

Question 6

Use C11 atomic types and functions

Since C11, support for atomic types and operations has been added to the standard. Prefer to use those over non-atomic types and non-standard functions.
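As a minimal sketch of what that could look like, here is a plain reference count written against <stdatomic.h>. The AtomicHeader type and its functions are hypothetical names introduced for illustration, not part of the code under review, and the sketch deliberately drops the is_shared flag so that every operation is atomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C11 variant of Header: always atomic, no is_shared flag. */
typedef struct AtomicHeader {
    atomic_uint_fast64_t reference_count;
} AtomicHeader;

/* A freshly created object is owned by exactly one reference. */
static void AtomicHeader_init(AtomicHeader *self) {
    atomic_init(&self->reference_count, 1);
}

static void AtomicHeader_acquire(AtomicHeader *self) {
    /* Relaxed is enough here: the caller already holds a reference. */
    atomic_fetch_add_explicit(&self->reference_count, 1, memory_order_relaxed);
}

/* Returns true when the caller has just dropped the last reference. */
static bool AtomicHeader_release(AtomicHeader *self) {
    uint_fast64_t previous = atomic_fetch_sub_explicit(
        &self->reference_count, 1, memory_order_release);
    assert(previous > 0); /* releasing more often than acquiring is a bug */
    if (previous == 1) {
        /* Pairs with the release decrements made by other threads. */
        atomic_thread_fence(memory_order_acquire);
        return true;
    }
    return false;
}
```

The memory orderings mirror the ones in the question (relaxed increment, release decrement, acquire fence before destruction); only the spelling changes from compiler builtins to standard C.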

Be careful combining multiple atomic operations

It's a mistake to think that because you use atomic operations on the members of Header, higher-level operations on Header, like acquiring and releasing, are also atomic. Consider:

if (!self->is_shared) {
 return !--self->reference_count;
}

Between the load !self->is_shared and the decrement --self->reference_count, is_shared could potentially have been modified. The same goes for checking the reference count and setting is_shared to zero, later on in that function, and in Header_is_unique().
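One way to avoid such a gap is to base each decision on the result of a single atomic operation instead of on two separate reads. The sketch below is an assumption about how a uniqueness check could be written that way; Counted and Counted_is_unique are hypothetical names, and the sketch assumes the caller itself holds one of the counted references.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical header with a single atomic counter and no separate flag. */
typedef struct Counted {
    atomic_uint_fast64_t reference_count;
} Counted;

/*
 * The answer comes from one atomic load, so there is no window between
 * "check a flag" and "read the count". If the caller holds a reference
 * and the count is 1, no other thread holds one that it could copy.
 */
static bool Counted_is_unique(Counted *self) {
    return atomic_load_explicit(&self->reference_count,
                                memory_order_acquire) == 1;
}
```

The acquire load is meant to pair with release decrements performed by other threads, so that their earlier writes to the object are visible once the count is observed to be 1.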

Avoid premature optimization

The idea behind the optimization is to try to reduce the number of atomic operations.

You added a conditional check to Header_release() and Header_acquire() that you pay for every time you call those functions. And if the object is not shared, consider that no other threads are simultaneously trying to access that Header object, so the atomic operations are likely to be cheap.

Measure the performance in a real scenario, with and without is_shared, and then decide whether the added complication is worth it.
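As a rough starting point, a single-threaded micro-benchmark along the following lines can compare plain and uncontended atomic increments. This is only a sketch: BenchResult, bench_plain and bench_atomic are hypothetical names, clock() measures CPU time coarsely, and a synthetic loop like this is no substitute for measuring the real workload.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical result type: the final counter value and elapsed CPU seconds. */
typedef struct BenchResult {
    uint64_t final_count;
    double seconds;
} BenchResult;

/* Non-atomic increments; volatile keeps the loop from being folded away. */
static BenchResult bench_plain(uint64_t iterations) {
    volatile uint64_t count = 0;
    clock_t start = clock();
    for (uint64_t i = 0; i < iterations; ++i) {
        count = count + 1;
    }
    clock_t end = clock();
    BenchResult result = { count, (double)(end - start) / CLOCKS_PER_SEC };
    return result;
}

/* Uncontended relaxed atomic increments on a counter no other thread touches. */
static BenchResult bench_atomic(uint64_t iterations) {
    atomic_uint_fast64_t count;
    atomic_init(&count, 0);
    clock_t start = clock();
    for (uint64_t i = 0; i < iterations; ++i) {
        atomic_fetch_add_explicit(&count, 1, memory_order_relaxed);
    }
    clock_t end = clock();
    BenchResult result = {
        (uint64_t)atomic_load_explicit(&count, memory_order_relaxed),
        (double)(end - start) / CLOCKS_PER_SEC
    };
    return result;
}
```

Calling both functions with the same large iteration count and comparing the seconds fields gives a first impression of the uncontended cost on a given machine; the version with the is_shared branch would need to be timed the same way before drawing conclusions.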

G. Sliepen · Answer 1 · 2024-12-30 00:13:08Z
