API for data structure with indices, size_t vs int?

Question 1

For a data structure with indices (e.g. an array list, a dynamic array, etc.), should the indices be of type size_t or int? Is there a clear reason to use one over the other?

fooGetByIndex(struct foo* foo, size_t index);

or

fooGetByIndex(struct foo* foo, int index);

Until it was suggested to me that I use size_t, I had always defaulted to int without giving it much thought. Having experimented with both, I'm not sure which makes for the better API.

There exist many discussions on size_t vs int on a more general level, and that's not what I'm asking. I'm interested in the more specific case of designing an API for a data structure that uses indices (i.e. is array-like) but abstracts away direct array access through an API.

Semantically, size_t is appropriate for indices of C arrays, which is the primary argument for it in this case. However, if the C array is hidden behind an API (which might not even use one internally), that argument weakens. Additionally, returning -1 as an error value is much easier when using int, whereas (size_t)-1 is arguably more error-prone and confusing for the user of the API, despite being well-defined and even used by the C standard library in its mbstowcs function.
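To make the two error conventions concrete, here is a minimal sketch of a lookup written both ways. The names fooIndexOfInt, fooIndexOfSize and FOO_NOT_FOUND are hypothetical and appear in neither API under discussion; the plain int array stands in for whatever storage the data structure hides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sentinel for the size_t variant; its value equals SIZE_MAX. */
#define FOO_NOT_FOUND ((size_t)-1)

/* int variant: returns the index of value, or -1 if it is absent. */
int fooIndexOfInt(const int *items, int count, int value)
{
    assert(items != NULL && count >= 0);
    for (int i = 0; i < count; ++i) {
        if (items[i] == value) {
            return i;
        }
    }
    return -1;
}

/* size_t variant: returns the index of value, or FOO_NOT_FOUND. */
size_t fooIndexOfSize(const int *items, size_t count, int value)
{
    assert(items != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (items[i] == value) {
            return i;
        }
    }
    return FOO_NOT_FOUND;
}
```

With the size_t variant the caller has to compare against the sentinel explicitly, because a check such as `idx < 0` is always false for an unsigned type.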

If relevant, the two APIs I'm currently working on can be found on CodeReview here and here, though I'm looking for an answer that applies to API design of index-based data structures in general, not just those two examples.

Is using either size_t or int better API design in this case, or are both equally valid (i.e. the choice is subjective)?

Comment 1

This has been discussed on SO's main site under the C tag.

Comment 2

Is there a logical upper limit on the maximum size of your data structure?

Comment 3

Here you go: stackoverflow.com/q/6004415/694576 (there probably are more)

Comment 4

@IllusiveBrian Not in theory. However, on any platform where int is 32 bits wide, INT_MAX is going to be more than enough capacity in practice. I suppose the question only makes sense when int is 32-bit because when it's smaller you have to use size_t. This hidden assumption may well hold the answer, as designing around int being a particular size isn't good C.

Comment 5

"I suppose the question only makes sense when int is 32-bit because when it's smaller you have to use size_t." Then again, there's enough of a correlation between int width and resource availability that on any platform with a 16-bit int, 32767 of something is going to be plenty. Data structures that need the extra range provided by size_t could arguably be considered a special case.

bernie · Answer 1 · 2018-04-04 11:28:24Z

I would clearly prefer size_t, as it is an unsigned integer type and indices are >= 0. You immediately know how to use this parameter.

It is not good style to return special values such as -1 for error conditions. Doing so requires extra checking code, and if you forget those checks in some places, it can cause hard-to-find bugs.

You should handle errors in some other way, e.g.:

Throw an exception.
If you request an index, for example, return the error condition as the return value and the index through an output parameter:
```
bool GetMyIndex (size_t &result);
```
Usage:
```
size_t returned_index = 0;
if (!GetMyIndex(returned_index))
{
 // handle the error
}
```
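The snippet above relies on a C++ reference parameter. In plain C, which is what the question targets, the same "status as return value, index as output parameter" idea can be sketched with a pointer. The struct layout and the name fooFindIndex below are assumptions made for illustration only, not part of any API in the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container, defined here only so the example is complete. */
struct foo {
    const int *items;
    size_t count;
};

/* On success, stores the index of value in *out_index and returns true.
   On failure, returns false and leaves *out_index untouched. */
bool fooFindIndex(const struct foo *foo, int value, size_t *out_index)
{
    assert(foo != NULL && out_index != NULL);
    for (size_t i = 0; i < foo->count; ++i) {
        if (foo->items[i] == value) {
            *out_index = i;
            return true;
        }
    }
    return false;
}
```

This keeps the full size_t range available for valid indices, at the cost of a slightly more verbose call site.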

Yeah, the good old C exceptions.

– bool3max · Commented May 29, 2019 at 23:01

gnasher729 · Answer 2 · 2018-04-07 09:45:37Z

There are various advantages for each of the possible approaches.

Using only one type for all indices is nice, allowing you to pass a pointer to an index, or in C++ a reference to an index, around without having to worry what exactly you are indexing.

Having an unsigned index is nice if the index values cannot ever be negative.

Having a signed index is nice because you don't have to be paranoid about loops like for (i = count - 1; i >= 0; --i), whose condition is always true when i is unsigned. And you can use -1 to denote an invalid index.

Not having artificial restrictions because of the index type is nice. It's rubbish to use an int index on a 64 GByte machine that could easily handle much bigger indexes.

Not wasting space is nice. It's rubbish to have a 64-bit index that can access one of two items only.
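The remark about reverse loops can be illustrated with a short sketch. Both functions below are hypothetical helpers written for this example: the first uses the signed loop quoted above, the second shows one common way to write the same traversal with size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Copies src into dst in reverse order using a signed index. */
void reverseCopySigned(const int *src, int *dst, int count)
{
    assert(src != NULL && dst != NULL && count >= 0);
    for (int i = count - 1; i >= 0; --i) {
        dst[count - 1 - i] = src[i];
    }
}

/* Same traversal with size_t. The condition `i >= 0` would always be
   true here, so the loop tests i before decrementing it instead. */
void reverseCopyUnsigned(const int *src, int *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = count; i-- > 0; ) {
        dst[count - 1 - i] = src[i];
    }
}
```

The unsigned form is correct for a count of zero as well, but it is the kind of idiom a reader has to know, which is the paranoia the answer refers to.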
