I'm working on my own generic-type dynamic array in C99 and I would like to hear your input, suggestions, ideas, anything!
What I most care about is: 1) correctness and 2) performance.
So far, I've tested it with both standard malloc
and mimalloc, although with the standard implementation it seems to run about 50% faster.
I've also run benchmarks against libsrt, vec and Nim's implementation of dynamic sequences, and my code seems to beat them all.
Also, it's important to note that this code is not meant to be used as a library, but as part of a bytecode VM/interpreter that I've been working on - which means I am the one who is going to use it.
Here's the source:
#ifndef __VARRAY_H__
#define __VARRAY_H__

/**************************************
 Type definitions
 **************************************/

#define Array(TYPE) \
    struct { \
        size_t size; \
        size_t cap; \
        size_t typeSize; \
        TYPE *data; \
    }

/**************************************
 Macros
 **************************************/

#define aInit(DEST,TYPE,CAP) { \
    int k=0; \
    while (powerOfTwo[++k]<CAP); \
    \
    DEST = malloc(sizeof(Array(TYPE))); \
    DEST->size = 0; \
    DEST->typeSize = sizeof(TYPE); \
    DEST->cap = powerOfTwo[k]; \
    DEST->data = calloc(DEST->cap, DEST->typeSize); \
}

#define aNew(NAME,TYPE,CAP) \
    Array(TYPE)* NAME; \
    aInit(NAME,TYPE,CAP);

#define aResize(DEST) { \
    DEST->cap *= 2; \
    DEST->data = realloc(DEST->data, DEST->cap * DEST->typeSize); \
}

#define aAdd(DEST,X) \
    if (++(DEST->size) >= DEST->cap) aResize(DEST); \
    DEST->data[DEST->size-1] = X

#define aAppend(DEST,X) \
    DEST->cap += X->cap; \
    DEST->data = realloc(DEST->data, DEST->cap * DEST->typeSize); \
    memcpy(DEST->data + DEST->size, X->data, X->size * X->typeSize); \
    DEST->size += X->size

#define aEach(DEST,INDEX) \
    for (int INDEX=0; INDEX<DEST->size; INDEX++)

#define aFree(DEST) \
    free(DEST->data); \
    free(DEST)

#endif
The powerOfTwo lookup constant:
/**************************************
Constants
**************************************/
static size_t powerOfTwo[] = {
0,
1, 2, 4, 8, 16, 32, 64, 128,
256, 512, 1024, 2048, 4096, 8192, 16384, 32768,
65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608,
16777216, 33554432, 67108864, 134217728, 268435456, 536870912, 1073741824, 2147483648,
4294967296, 8589934592, 17179869184, 34359738368, 68719476736, 137438953472, 274877906944, 549755813888
};
And a little usage example:

Array(int)* arr;
aInit(arr,int,0);

aAdd(arr,0);
aAdd(arr,1);

aEach(arr,i) {
    printf("item @ %d = %d\n",i,arr->data[i]);
}

aFree(arr);
2 Answers
See this Stack Overflow answer for a discussion of the optimal growth factor for dynamic arrays. The gist of it is that it depends, but growth factors around the golden ratio are easier for the memory allocator to handle. I'd recommend 1.5 because it is easy to implement:

DEST->cap += DEST->cap / 2

It likely fares worse on artificial benchmarks but better on real workloads. This requires that the minimum capacity of the dynamic array is 2, which is reasonable. But a more realistic minimum capacity should probably be 8 or 16, since the minimum malloc allocation is often around 32 bytes.
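A minimal sketch of that policy as a helper function (the name `grow_cap` and the floor of 16 are illustrative assumptions, not from the answer); note that a floor also keeps `cap + cap / 2` from stalling when `cap` is 0 or 1:

```c
#include <stddef.h>

/* Next capacity under 1.5x growth, with an assumed floor of 16 elements
 * so tiny allocations are avoided and cap + cap/2 can never stall. */
static size_t grow_cap(size_t cap) {
    size_t next = cap + cap / 2;
    return next < 16 ? 16 : next;
}
```

With this, successive capacities go 16, 24, 36, 54, ... instead of doubling.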
In aAppend you are always realloc-ing, which will cause performance to suffer if many small dynamic arrays are collected into one big one in a loop. Instead, you want something like:
size_t req = DEST->size + X->size;
while (req > DEST->cap) {
    aResize(DEST);
}
memcpy(DEST->data + DEST->size, X->data, X->size * X->typeSize);
DEST->size = req;
The last thing I'd change is the rounding of the capacity up to a power of two in aInit. On realistic workloads, a good chunk of all dynamic arrays never grow, so the rounding just wastes space. But if you insist on it, this function is better than the loop:
static inline int
next_pow2(int v) {
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v++;
    return v;
}
I also wonder why you are using macros when C99 has inline functions? Inline functions have many advantages over code macros.
Comments:

- DEST->cap += DEST->cap / 2 won't change anything if the current value of DEST->cap is 1 (or 0, but 0 would be a problem in the original code as well). — 1201ProgramAlarm, Jan 8, 2020
- The minimum malloc size is platform dependent. It might be 32 in your environment, but is likely different for others. The point is good though: allocating very small sizes will hurt performance; but choosing a good minimum size isn't necessarily trivial, especially given that we're measuring in units of TYPE but multiplying up to talk to malloc() in units of char. — Toby Speight, Jan 8, 2020
- Yep! 32 bytes is just an "educated guess" that has been true for some 64-bit systems but might very well change in the future. It all depends on the VM OP is building. — Gaslight Deceive Subvert, Jan 8, 2020
- The expansion by 1.5 was used by standard libraries for a while. But in modern ones they gave up on 1.5 and started using 2. They did not see the theoretical gains in real life enough to make it worth it. But I tried a maths proof of why 1.5 is better :-) lokiastari.com/blog/2016/03/25/resizemaths/index.html — Loki Astari, Jan 9, 2020
- Cool! Can you link to their discussion? — Gaslight Deceive Subvert, Jan 9, 2020
We're missing an include of <stddef.h> to define size_t (or any of the other headers which define it).

We're missing an include of <stdlib.h>, needed for malloc() and friends (and this would define size_t for us, too). aAppend also needs <string.h> for memcpy().

With those fixed, and sufficient minor changes to the test program, I managed to compile with only a few warnings.
DEST = malloc(sizeof(Array(TYPE))); \
DEST->size = 0; \
Oops! If malloc()
returns a null pointer, we have undefined behaviour. Replace with
DEST = malloc(sizeof(Array(TYPE))); \
if (DEST) { \
    DEST->size = 0; \
    /* ...rest of the initialisation... */ \
}
Similarly, when we delete, let's accept a null pointer:
#define aFree(DEST) \
    if (DEST) { \
        free(DEST->data); \
    } \
    free(DEST)
We have a dangerous realloc():

#define aResize(DEST) { \
    DEST->cap *= 2; \
    DEST->data = realloc(DEST->data, DEST->cap * DEST->typeSize); \
}
If the realloc() fails, then we have a null pointer in data and no way to access the old contents. The standard idiom is to check the return value before assigning it to data. We're going to need some way to report failure, too.
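A sketch of that idiom (the function name and the int return convention are illustrative, not part of the original API): the old pointer is kept until the new one is known to be valid, so on failure the caller still has usable data.

```c
#include <stdlib.h>

/* Hypothetical failure-reporting resize: double the capacity, but only
 * commit the new pointer and capacity if realloc() succeeded. Returns 0
 * on success, -1 on failure (old block left intact). */
static int array_resize(void **data, size_t *cap, size_t type_size) {
    size_t new_cap = *cap * 2;
    void *p = realloc(*data, new_cap * type_size);
    if (!p)
        return -1;      /* old allocation is still valid */
    *data = p;
    *cap = new_cap;
    return 0;
}
```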
In aAppend(), we should be using the aResize() we've defined instead of resizing to fit - as it is, we're going to reallocate on every single call.
Overall, I think that writing everything as macros is a poor choice. It's probably better to use macro expansion (or equivalent technique, such is multiple includes with varying definitions) to create a set of (small, inlinable) functions for a given type.
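For instance, a hypothetical DEFINE_ARRAY macro (all names here are illustrative, not from the post) could stamp out a typed struct plus a small inline function per element type, giving real functions with type checking and error reporting:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Sketch of the "macro expansion to create typed functions" idea:
 * one invocation generates a struct and an inlinable push function
 * for that element type. */
#define DEFINE_ARRAY(NAME, T) \
    typedef struct { size_t size, cap; T *data; } NAME; \
    static inline bool NAME##_push(NAME *a, T x) { \
        if (a->size == a->cap) { \
            size_t ncap = a->cap ? a->cap * 2 : 8; \
            T *p = realloc(a->data, ncap * sizeof(T)); \
            if (!p) return false;   /* old buffer stays valid */ \
            a->data = p; \
            a->cap = ncap; \
        } \
        a->data[a->size++] = x; \
        return true; \
    }

DEFINE_ARRAY(IntArray, int)
```

Usage is then ordinary C: `IntArray a = {0, 0, NULL}; IntArray_push(&a, 42);` - and the compiler can check argument types and warn about an ignored return value.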
The macros we have here look like functions, but can't be used like functions (in particular, aNew expands to a declaration and a statement, and aAdd and aFree both expand to multiple statements, making them dangerous and confusing near conditionals).
It's frustrating that the array control block must be allocated from dynamic memory - C++ programmers expect to be able to create std::vector
objects on the stack, with only the object storage itself on the heap.
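A sketch of such a layout (names are illustrative): the control block is a plain struct the caller can put on the stack, and only the element storage is heap-allocated, so there is no second malloc/free pair for the header.

```c
#include <stdlib.h>

typedef struct {
    size_t size, cap, typeSize;
    void *data;
} VArray;

/* Initialise an array whose control block the caller owns (e.g. on the
 * stack); only data comes from the heap. The default capacity of 8 is
 * an assumption. Returns 0 on success, -1 if calloc() fails. */
static int varray_init(VArray *a, size_t type_size, size_t cap) {
    a->size = 0;
    a->cap = cap ? cap : 8;
    a->typeSize = type_size;
    a->data = calloc(a->cap, a->typeSize);
    return a->data ? 0 : -1;
}

static void varray_destroy(VArray *a) {
    free(a->data);          /* free(NULL) is a no-op */
    a->data = NULL;
}
```

Usage mirrors the C++ expectation: `VArray a; varray_init(&a, sizeof(int), 16); ... varray_destroy(&a);`.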
Comments:

- See Malloc Never Fails. Imo, it's not practical to check malloc's return value. Even if @Dr.Kameleon adds NULL checks to his mallocs, clients of his library also need to check every invocation of the aAdd macro to ensure that it didn't cause memory to run out. — Gaslight Deceive Subvert, Jan 8, 2020
- Ouch, that's a scary attitude. Personally, I find that allowing overcommit does more harm than good, so it's turned off on my non-embedded systems (and certain problem processes get a specific ulimit value). And yes, I did say that we need functions that report errors properly; that's another reason to eschew the macros. — Toby Speight, Jan 8, 2020
- Oh, having read further, I see, "To clarify, the surprising behaviour malloc has does not mean we should ignore its return value". I feel better now. — Toby Speight, Jan 8, 2020
- It's perfectly reasonable. There's lots of discussion on whether to check the return value of malloc. I usually don't, because it's impractical or impossible to test all OOM errors; even if you check them, you can't handle them gracefully, and it gives a false sense of security. For really robust software, I'd eschew dynamic memory entirely. — Gaslight Deceive Subvert, Jan 8, 2020
- @BjörnLindqvist It's reasonable under some conditions, IF you're aware of what those conditions are, and IF you've explicitly acknowledged the tradeoffs. In your specific applications, you believe it will take too long to make your software fail gracefully, for a situation that rarely happens, and it's fine that you make that call. I don't think it's good practice to generalise that to a blanket statement of "it's not practical" though. I see far too many cases of newbies ignoring return values; their code works fine on a sunny day, but it dies horribly when anything goes wrong. — Graham, Jan 9, 2020