Here is some code I have that has been extracted and shrunk down from a project of mine.
#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <string.h>
#define streq(x, y) (strcmp((x), (y)) == 0)
#define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))
typedef struct
{
const char *cmd;
void* (*fn)(void);
} __attribute__((__packed__)) Command;
void* getTime(void)
{
return ((void*)((uintptr_t)time(NULL)));
}
void* getDay(void)
{
time_t t = time(NULL);
struct tm tm = *localtime(&t);
return ((void*)((uintptr_t)(tm.tm_wday)));
}
static Command commands[] =
{
{"time", getTime},
{"day", getDay},
};
int main(int argc, char *argv[])
{
for (int i = 0; i < ARRAY_SIZE(commands); ++i)
{
Command *p = commands+i;
if (streq(argv[1], p->cmd)) printf("%d\n", (int)p->fn());
}
}
When run with some given input, it runs the method associated with that input. Since this is a scaled down version of my project, it can only run two commands.
$ ./test-command day
4
Right now, my implementation should run with \$ O \left( n\right)\$ time complexity. This would be unacceptable with the hundreds of thousands of commands my project could scale to. I am looking for reviews on scalability and optimization, and how well my Command structure and function pointer template would scale as well.
-
\$\begingroup\$ Sort your array and bin search it, no? \$\endgroup\$vnp– vnp2014年06月12日 23:45:39 +00:00Commented Jun 12, 2014 at 23:45
-
\$\begingroup\$ @vnp Would a binary search be better than a hash search? \$\endgroup\$syb0rg– syb0rg2014年06月12日 23:50:42 +00:00Commented Jun 12, 2014 at 23:50
-
\$\begingroup\$ It is much simpler. Hundreds of thousands of commands translate to 30 or so binary steps. If even that is unacceptable, maybe hash is a way to go. However, the hash function must be very very good, for any conflict resolution incurs significant cost. \$\endgroup\$vnp– vnp2014年06月12日 23:55:02 +00:00Commented Jun 12, 2014 at 23:55
-
\$\begingroup\$ @vnp Sounds like you should be writing an answer ;) \$\endgroup\$syb0rg– syb0rg2014年06月12日 23:58:26 +00:00Commented Jun 12, 2014 at 23:58
2 Answers 2
Sorting the command list prior to compile time is not that much of a burden. If there a propensity to fail to do so, a simple run-time check in the
DEBUG
version of code would detect it.Hash is the way to go for high speed, but a simple binary search would be useful to begin with to iron out initial code development issues.
To help with string compare speeds have done this
((x[0] == y[0]) && (strcmp((x), (y)) == 0))
, but this may be premature optimization.Minor: Suggest replacing
void*
with some large unsigned integer likeuintmax_t
as the generic return type for the various functionsvoid* (*fn)(void)
. Or better yet, aunion
of the various types.typedef union { time_t t; void *p ; int i; } Return_T;
Minor:
localtime()
could returnNULL
. Best to test function return value before dereference.Minor: Suggest
const
.static const Command commands[] = ...
.Minor: Best to make array indexes
size_t
:size_t i = 0; ... Command *p = commands+i;
Minor: Casting
(uintptr_t)time(NULL)
and then toint
inprintf("%d\n", (int)p->fn());
can easily lose significant data. See note #4.
bsearch()
example: something like the following untested code
static int compar(const void *a, const void *b) {
const char *key = (const char *) a;
const Command *element = (const Command *) b;
return strcmp(key, element->cmd);
}
void main(int argc, char *argv[]) {
Command *match = bsearch(argv[1], commands, ARRAY_SIZE(commands),
sizeof commands[0], compar);
if (match != NULL) {
printf("%d\n", (int) (match->fn()));
}
}
-
\$\begingroup\$ I could not get your
bsearch()
example to work for me. \$\endgroup\$syb0rg– syb0rg2014年06月17日 02:34:06 +00:00Commented Jun 17, 2014 at 2:34 -
\$\begingroup\$ @syb0rg Sorry can not hep more right now, maybe tomorrow. Good luck. Try cplusplus.com/reference/cstdlib/bsearch \$\endgroup\$chux– chux2014年06月17日 02:42:27 +00:00Commented Jun 17, 2014 at 2:42
-
\$\begingroup\$ An advice #3 is surely of a wrong kind. A cost of avoiding a function call is a conditional: it not taken, a function call follows; if taken, the pipeline is flushed anyway. Where's a win? Besides,
strcmp
is most likely inlined. \$\endgroup\$vnp– vnp2014年06月17日 07:13:05 +00:00Commented Jun 17, 2014 at 7:13 -
\$\begingroup\$ @vnp I doubt you would agree but most processors in 2014 are not pipelined. Consider embedded processors quantities are over 1 Billion per year. These simpler devices do benefit from code as suggested. In any case, "may be premature optimization" already addresses the concern that such a tactic may indeed be of negative value. As with many optimizations, looking at the order of complexity (hash, bsearch,, etc.) of the algorithm typically trumps linear speed improvements. \$\endgroup\$chux– chux2014年06月17日 14:42:27 +00:00Commented Jun 17, 2014 at 14:42
As requested.
An unstructured array may only be searched in linear order. To get better asymptotics you need a more search-friendly structure. A simplest solution would be to presort the Command
array, and apply a binary search to find the command. A definite advantage here is a possibility to use standard library. You may even have the array sorted during build time.
For hundreds of thousands of commands the binary search would complete in about 20 lookups. If this is unacceptable, and your commands' names let you come up with a perfect hash function, hash is a way to go. Beware that a hash function must be really perfect - that is, almost no conflicts and almost no misses. Otherwise, the asymptotic constant would defeat all the advantages of O(1).
Update: assuming a unixish environment, the best way to sort at the build time is to have a commands.c
file as
static Commands commands[] = {
#include "command-list.c"
};
with a command-list.txt
text file
"time", getTime
"day", getDay
and a Makefile rule saying something like (may require some backslashes)
%.c: %.txt
sort $< | sed -e 's/.*/{&},/' > $@
-
\$\begingroup\$ What would be the best way to sort the array during build time? I would prefer not to do it manually. \$\endgroup\$syb0rg– syb0rg2014年06月13日 00:13:08 +00:00Commented Jun 13, 2014 at 0:13
-
2\$\begingroup\$ Terminology snipe: A perfect hash function means no conflicts at all. Technical snipe: The binary search is 20 random lookups. Even a bad hashing scheme can be made to do quite well here if you store all the command names, in hash order, in a single string. \$\endgroup\$tmyklebu– tmyklebu2014年06月15日 02:22:11 +00:00Commented Jun 15, 2014 at 2:22
-
\$\begingroup\$ @tmyklebu You should form an answer out of that. It's something I would upvote. \$\endgroup\$syb0rg– syb0rg2014年06月17日 01:53:22 +00:00Commented Jun 17, 2014 at 1:53
-
\$\begingroup\$ @syb0rg: chux already nailed it. \$\endgroup\$tmyklebu– tmyklebu2014年06月17日 01:59:14 +00:00Commented Jun 17, 2014 at 1:59
Explore related questions
See similar questions with these tags.