What?
I have a reference counted dynamic byte array written in C. I'm currently using this implementation in a FIFO fashion. Particularly reading data from files into the arrays, then parsing the data in the array byte by byte.
I'm looking for feedback on anything! Even the decision to use a dynamic byte array versus using a linked list or any other data structure.
Edit: I'm trying to follow the Linux Kernel coding style, so I'm looking for feedback on how well I follow that convention.
Files
Header
#ifndef DATA_BYTE_ARRAY
#define DATA_BYTE_ARRAY
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <assert.h>
#include <string.h>
#include "../memory.h"
/**
* Dynamic array that grows when it's full (given that
* the util functions in this module are used).
*
* For each grow it uses the current length doubled. See
* byte_array_push for more info.
*/
struct byte_array {
uint8_t refs;
uint32_t length;
uint32_t capability;
uint8_t *bytes;
};
/**
* Creates a byte array with the given length.
*
* Values of the returned struct:
* - 1 reference
* - All bytes will be set to 0
* - Current position will be set to 0
* - Capability will reflect how many bytes are allocated. Which is
* the same as given capability.
*/
struct byte_array *byte_array_create(uint32_t capability);
/**
* Adds the given input at the current position of the array.
*
* Note that if the pointer to the byte array is NULL an assertion will crash
* the program.
*
* If the byte array is full more memory will be allocated. The amount
* of new memory that will be allocated is the current amount doubled.
*/
void byte_array_push(struct byte_array *a, uint8_t input);
void byte_array_incref(struct byte_array *a);
void byte_array_decref(struct byte_array *a);
#endif
Source
#include "byte_array.h"
#define NEW_CAPABILITY a->capability * 2
static void zero_init_bytes(struct byte_array *a,
uint32_t offset,
uint32_t nmemb)
{
uint8_t *bytes = a->bytes;
bytes += offset;
memset(bytes, 0, nmemb);
}
struct byte_array *byte_array_create(uint32_t capability)
{
struct byte_array *result = assert_malloc(sizeof(*result));
result->refs = 1;
result->length = 0;
result->capability = capability;
result->bytes = assert_calloc(capability, sizeof(uint8_t));
return result;
}
void byte_array_push(struct byte_array *a, uint8_t input)
{
assert(a != NULL);
if (a->length == a->capability) {
void *tmp = assert_realloc(a->bytes, NEW_CAPABILITY);
a->bytes = tmp;
zero_init_bytes(
a,
a->capability,
NEW_CAPABILITY - a->capability
);
a->capability = NEW_CAPABILITY;
}
a->bytes[a->length++] = input;
}
void byte_array_incref(struct byte_array *a)
{
assert(a != NULL);
a->refs++;
}
void byte_array_decref(struct byte_array *a)
{
assert(a != NULL);
if (--a->refs != 0) return;
free(a->bytes);
free(a);
}
2 Answers 2
This looks mostly good to me. Your code looks like fairly clean C. I have some comments:
I think
capacity
is a better name thancapability
.I would avoid calling your header
memory.h
, sincememory
is a header in C++. Maybechecked_alloc.h
instead? Also, include it as"memory.h"
and let your build system figure out its path.How is your error handling in the
assert_...alloc
functions? Do they just terminate the program?Two of your public interface functions are without documentation. Are they not supposed to be part of the public interface? If not, put them in the implementation file (only) and make them
static
. If they are, document them as well.Note that the
assert()
macro is only evaluated if your library is compiled withoutNDEBUG
. This means that the code using it will be unchecked for production builds. It might be fine, but maybe you want checking in production as well; I'm just pointing it out.As far as I can see, you adhere to the Linux coding style.
Consider
typedef uint8_t byte
or something like that to emphasize what the role of theuint8_t
s are. Remember that you have no guarantee thatCHAR_BIT
is 8. (Maybe you want to usechar
instead, which is guaranteed to always be one byte.)I would change
if (a->length == a->capability)
toif (a->length >= a->capability)
. The latter is a bit clearer, and if you at one point allow adding larger objects the code will still work.Your macro looks nasty.
I'd do:
#define CAPACITY_GROWTH_FACTOR 2 /* ... */ a->capability *= CAPACITY_GROWTH_FACTOR;
Or even better, change the
#define
into aconst size_t
.The
nmemb
variable isn't very readable. Change it tonum_members
or something like that.Consider sorting your
#include
s alphabetically.While I don't see anything wrong with your reference counting, you should thoroughly test it to verify that it works.
Regarding reference counting and your interface, ask yourself these questions:
- What happens when you copy your array? What do you want to happen? How should a copy be made?
- How should copies behave when you modify an array that has multiple references to it?
- Will this be used in a multi-threaded environment?
The answers affect how the interface should be designed. Personally, it seems to me that the reference counting is on a very low level (the caller has to take care of it himself), meaning it would be easy to make mistakes. Make your interface easy to use correctly, and hard to use incorrectly. Reference counting might not even be necessary here, depending on what the array is meant to be used for.
-
\$\begingroup\$ Thank you very much! Awesome list of comments, you have given me a lot to think about. - All
assert_*alloc
functions usesassert
to check that the return was notNULL
from the*alloc
call. I didn't even realize what happens with NDEBUG until you mentioned it now. - The undocumented functions are supposed to be public, I just didn't think it was worth commenting them since they were so simple. \$\endgroup\$rzetterberg– rzetterberg2013年07月19日 12:42:51 +00:00Commented Jul 19, 2013 at 12:42
On the header, you are exporting extra headers that have no part of the interface. Anything you include in the public header is part of the interface of your module. That means anything that #includes your header becomes dependent upon those headers, which is undesirable.
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <assert.h>
#include <string.h>
#include "../memory.h"
All you need to include in the header is stdint.h
to resolve the uint8_t
etc. As dependencies are generally best avoided, you might consider whether
you really need to use stdint types at all. What are you gaining over char
and
int
?
In the struct byte_array
the capability
field is misnamed. capacity
seems to be what you mean.
In the C file, you should delete the NEW_CAPABILITY
macro. Macros are
generally unsafe and should be avoided unless there is a good justification for
them (which there isn't here - you are just hiding some details). Here is how I would do it
if (a->length == a->capacity) {
size_t new_size = a->capacity * 2;
a->bytes = realloc(a->bytes, new_size);
memset(a + capacity, 0, new_size - a->capacity);
a->capacity = new_size;
}
Your zero_init_bytes
function is just a wrapper for memset
and is
redundant (ie. it adds nothing).
On style, it looks nice. I am unfamiliar with Linux rules, so cannot comment
there. There is some excess whitespace, for example byte_array_push
has
four unnecessary blank lines (an a redundant variable tmp
).
Note that sizeof(uint8_t)
is 1, so just use 1
assert_malloc
andassert_calloc
, they are wrapper functions that raises an assertion ifmalloc
orcalloc
fails. \$\endgroup\$assert
macro. \$\endgroup\$