I implemented the following excercise:
Implement a lookup table with operations such as
find(struct table*, const char*)
,insert(struct table*, const char*,int)
, andremove(struct table*, const char*)
. The representation of the table could be an array of astruct
pair or a pair of arrays (const char*[]
andint*
); you choose, also choose return types for your functions.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#define ARR_SIZE 10
struct Pair {
const char* word;
int val;
};
struct Table {
struct Pair* pairs[ARR_SIZE];
size_t sz;
};
struct Pair* make_pair(const char* word, int val)
{
assert(word);
struct Pair* pair = (struct Pair*)malloc(sizeof(struct Pair));
pair->val = val;
pair->word = word;
return pair;
}
void table_empty(struct Table* tbl)
{
assert(tbl);
size_t i = 0;
for (i = 0; i < tbl->sz; ++i) {
free(tbl->pairs[i]);
tbl->pairs[i] = NULL;
}
tbl->sz = 0;
}
int search_word(struct Table* tbl, const char* word)
{
assert(tbl);
assert(word);
size_t i = 0;
for (i = 0; i < tbl->sz; ++i) {
//printf("%s %i\n", tbl->pairs[i]->word,i);
if (strcmp(tbl->pairs[i]->word, word) == 0) {
return i;
}
}
return -1; // error
}
void table_insert(struct Table* tbl, const char* word, int val)
{
assert(tbl);
assert(word);
int i = search_word(tbl, word);
if (i != -1) { // replace val
tbl->pairs[i]->val = val;
}
else { // add new pair
struct Pair* pair = make_pair(word, val);
tbl->pairs[tbl->sz] = pair; // add pair at the last position
++tbl->sz;
}
}
int table_find(struct Table* tbl, const char* word, int* return_val)
{
assert(tbl);
assert(word);
assert(return_val);
int i = search_word(tbl, word);
if (i != -1) {
*return_val = tbl->pairs[i]->val;
return 0;
}
return -1; // error not found
}
int table_remove(struct Table* tbl, const char* word)
{
assert(tbl);
assert(word);
int i = search_word(tbl, word);
if (i == -1) return -1;
free(tbl->pairs[i]); // free value at current pos
tbl->pairs[i] = tbl->pairs[tbl->sz - 1]; // put address of last word at the pos of the current
--tbl->sz; // "erase" last word
return 0;
}
void table_print(struct Table* tbl)
{
assert(tbl);
printf("\n");
printf("table size = %i\n", tbl->sz);
for (int i = 0; i < tbl->sz; ++i)
printf("%s %i\n", tbl->pairs[i]->word, tbl->pairs[i]->val);
fflush(stdout);
}
void print_search_result(struct Table* tbl, const char* word)
{
assert(tbl);
assert(word);
int val = 0;
if (table_find(tbl, word, &val) == 0)
printf("%s %i\n",word, val);
else
printf("%s not found in table\n", word);
printf("\n");
fflush(stdout);
}
void print_remove_result(struct Table* tbl, const char* word)
{
assert(tbl);
assert(word);
if (table_remove(tbl, word) == -1)
printf("%s not deleted\n", word);
else
printf("%s deleted\n", word);
printf("\n");
fflush(stdout);
}
int main()
{
struct Table table = { 0 };
int val = 0;
table_insert(&table, "Hello", 10);
table_insert(&table, "Test", 15);
table_insert(&table, "Hello", 18); // testing overrite val
table_insert(&table, "What", 5);
table_insert(&table, "is", 3);
table_insert(&table, "going", 4);
table_insert(&table, "on", 77);
table_print(&table);
print_search_result(&table, "Hello");
print_search_result(&table, "Test");
print_search_result(&table, "keyword");
print_remove_result(&table, "Hello");
print_remove_result(&table, "Hello"); // double delete == false
print_remove_result(&table, "What");
print_remove_result(&table, "going");
print_remove_result(&table, "is");
print_remove_result(&table, "on");
print_remove_result(&table, "Test");
table_print(&table);
table_insert(&table, "Hello", 10);
table_insert(&table, "Test", 15);
table_insert(&table, "Hello", 18); // testing overrite val
table_insert(&table, "What", 5);
table_insert(&table, "is", 3);
table_insert(&table, "going", 4);
table_insert(&table, "on", 77);
table_print(&table);
table_empty(&table);
table_print(&table);
getchar();
}
Feel free to comment on any improvements. Is there a better way to do this?
I have one specific question also: are my uses of assert
appropriate?
3 Answers 3
Using struct
Personally, I think you made the right choice by using a struct
with a char*
and int
in it rather than 2 arrays of char*
and int
, respectively. The Pair
is a conceptual unit of data in your app, so it makes sense to put them together. If you had 2 separate arrays, it would be easy for them to get out of synch and hard to debug why that happened. Nice work!
Prefer const
and functions to macros
You've defined the array size using a macro. (削除) This puts you at a disadvantage while programming. You've removed the type information for the value. By making it:
const int ARR_SIZE = 10;
you get type safety. (削除ここまで) Edit: That's a C++-ism that doesn't work in C. My bad! But the rest of the advice in the next paragraph is correct as far as I know.
With macros that take parameters you run the risk of them being used in unexpected ways and causing hard to debug issues. Luckily, you haven't done that here. But in general, if you find yourself reaching for a macro, ask yourself if you would be better off with a constant or a function. You almost always will be (assuming you can use the constant in the desired way).
Errors
There are some errors in your code. In make_pair()
, you don't check to see if malloc()
succeeded. If it fails, no memory is allocated, and pair
points to NULL
. When you try to assign pair->val
or pair->word
you will crash.
If you did fix that, table_insert()
uses the result of make_pair()
without checking to see if it's NULL
first. This won't crash immediately, because you're just assigning the array pointer in tbl->pairs[tbl->sz]
to have the value of pair
. What will happen is later when you try to search or print or insert another item, you'll get a crash when you iterate over that entry in the table and try to do anything with it.
You could make these errors not possible by not dynamically allocating the array entries. Simply make the array an array of Pair
structs rather than a pointer to them.
Naming
A lot of your names are really good. Pair
and Table
are decent, readable names for this task. make_pair()
, table_insert()
, etc. are informative. Some could be improved, though. tbl
does not save you much typing over table
. Just use the whole word. Same with sz
vs. size
. i
is acceptable as a loop variable name, but it would be even better if it was more descriptive. For example, entry_index
or pair_index
. It should describe what you're iterating over.
-
\$\begingroup\$ when i change the define to
const int ARR_SIZE = 10;
it gives me an error instruct Table { struct Pair* pairs[ARR_SIZE]; size_t sz; };
that ARR_SIZE is not const, So how to use it in this case then? \$\endgroup\$Sandro4912– Sandro49122018年06月25日 15:40:52 +00:00Commented Jun 25, 2018 at 15:40 -
\$\begingroup\$ Sorry - that's a C++-ism. I thought this was a C++ question. My bad. I'll update my answer. \$\endgroup\$user1118321– user11183212018年06月26日 02:25:40 +00:00Commented Jun 26, 2018 at 2:25
-
1\$\begingroup\$ I'm a staunch supporter of descriptive variable names, but with the simplicity of the loops in this program I think the variable name
i
works just fine. \$\endgroup\$Jakob– Jakob2018年06月26日 02:38:03 +00:00Commented Jun 26, 2018 at 2:38 -
\$\begingroup\$ If
malloc()
fails, assigning topair->val
orpair->word
might crash, if you are lucky. It might just continue and give wrong results, or it might do something totally unexpected. That's the joy of Undefined Behaviour! \$\endgroup\$Toby Speight– Toby Speight2018年06月26日 14:01:47 +00:00Commented Jun 26, 2018 at 14:01
Do not cast the result of
malloc
.malloc
returnsvoid *
, and in C it is valid to convertvoid *
to any pointer. If you cast to suppress a warning about integer to pointer conversion, it means that a compiler doesn't have amalloc
prototype, and defaults its to return anint
(it is an ancient C convention). Now ifint
and pointer are of different size, the malloc return value would be truncated, with all nasty consequences.It is not recommended to
sizeof(type)
. Prefersizeof(expression)
. In your case, considerstruct Pair * pair = malloc(sizeof(*pair));
table_insert
blindly inserts into a full table. It should test fortbl->sz < ARR_SIZE
and return an error indication if it is not.The actual insertion
struct Pair* pair = make_pair(word, val); tbl->pairs[tbl->sz] = pair; // add pair at the last position
should be really a single line:
tbl->pairs[tbl->sz] = make_pair(word, val);
-
\$\begingroup\$ why is it not a good idea to cast malloc? In c++ it even would get a compiler error? \$\endgroup\$Sandro4912– Sandro49122018年06月21日 19:06:07 +00:00Commented Jun 21, 2018 at 19:06
-
2\$\begingroup\$ @Sandro4912 The question is tagged C. C++ is a different language, particularly in this respect. Now if the C compiler complains, it means that the
malloc
prototype is missing, which may lead to nasty problems. \$\endgroup\$vnp– vnp2018年06月21日 19:08:22 +00:00Commented Jun 21, 2018 at 19:08 -
\$\begingroup\$ yes i know is c. i was just wondered why here its not a good practice to indicate the type changes with a cast \$\endgroup\$Sandro4912– Sandro49122018年06月21日 19:15:17 +00:00Commented Jun 21, 2018 at 19:15
-
\$\begingroup\$ @Sandro4912 See edit. \$\endgroup\$vnp– vnp2018年06月21日 19:21:37 +00:00Commented Jun 21, 2018 at 19:21
-
2\$\begingroup\$ @Sandro4912
pair = malloc(sizeof *pair)
is easier to code correctly the first time, easier to review and maintain. \$\endgroup\$chux– chux2018年06月26日 02:38:22 +00:00Commented Jun 26, 2018 at 2:38
Enable all compiler warnings
With a well enabled compiler, for (int i = 0; i < tbl->sz; ++i)
should have warned about mixing sign-ness of i
and tbl->sz
as well as range. Save time and enable all warnings and use for (size_t i = 0; i < tbl->sz; ++i)
.
In general, code is squishy on using int,size_t
nearly interchangeably. I'd re-architect and use size_t
only.
Mixed use of shallow and deep copy
make_pair(const char* word, int val)
allocates and forms a whole new struct Pair
(deep copy), yet does not copy what word
points to.
Perhaps
// pair->word = word;
pair->word = strdup(word);
Use const
search_word()
does not modify *tbl
, so use const to convey that. Same for
table_find(),
table_print(),
print_search_result()`.
// int search_word(struct Table* tbl, const char* word)
int search_word(const struct Table* tbl, const char* word)
Naming
Code uses const char* word;
yet it is not a "word", but a string as used in strcmp()
.
----- Additions
Contract violation?
Nothing in the requirements "Implement a lookup table with operations such as ..." indicate that const char*
is a pointer to a string. So calling strcmp()
is questionable unless unstated requirements exist. As a char *
, code could use a simple compare
// if (strcmp(tbl->pairs[i]->word, word) == 0)
if (tbl->pairs[i]->word == word)
Use of assert()
are my uses of assert appropriate?
If the char *
pointer to add/search is not specified to be a string, assert(word);
is not proper as word == NULL
is not known to be invalid.
The assert(return_val)
in table_find(struct Table* tbl, const char* word, int* return_val)
is OK, yet I would re-design to allow return_val == NULL
if (i != -1) {
// add
if (return_val) {
*return_val = tbl->pairs[i]->val;
}
return 0;
}
*
in a pointer type be adjacent to the variable name (e.g.const char *word
). The C type syntax was designed to mimic usage, where for example you'd type*word
to access aconst char
element. Note also that this makes it more clear what typeb
is inint *a, b;
. \$\endgroup\$