So here is my input
\\?\usb#vid_04d9&pid_1702#6&12e3c9ed&0&2#{a5dcbf10-6530-11d2-901f-00c04fb951ed}
I need to extract 04d9 and 1702 from this string. Instead of copying the values I just point my pointer to their respective locations.
And here is my horrible code
#DEFINE FIRST_STRING 1
#DEFINE SECOND_STRING 2
#DEFINE THIRD_STRING 3
VOID
StringDelimiterReplace(
PCHAR string,
INT delimiter,
INT replace
)
/*
Description:
Replaces the delimiter in the string with the replace value specified
*/
{
size_t stringLen = strlen(string);
for (size_t size = 0; size < stringLen; size++) {
if (string[size] == delimiter) string[size] = replace;
}
}
VOID
GenericStringSplit(
PCHAR string,
INT delimiter,
PCHAR *dest,
INT tokenNum,
INT replace
)
/*
Description:
A generic string splitting function which returns the string at token number
specified,
Also provision to replace the delimiter with string
*/
{
int i = 1;
size_t size;
if( delimiter != replace) StringDelimiterReplace(string, delimiter, replace);
if (tokenNum == 1) *dest = string;
for (size = 0; ; size++) {
if (string[size] != replace) continue;
if (i == tokenNum - 1) break;
if (i > tokenNum - 1) return;
i++;
}
*dest = &string[++size];
}
//main
GenericStringSplit(string, '#', &dest, SECOND_STRING, '0円');
GenericStringSplit(dest, '&', &vendorId, FIRST_STRING, '0円');
GenericStringSplit(dest, '0円', &productId, SECOND_STRING, '0円');
GenericStringSplit(vendorId, '_', &vendorId, SECOND_STRING, '0円');
GenericStringSplit(productId, '_', &productId, SECOND_STRING, '0円');
2 Answers 2
Here are some things that may help you improve your code.
Use the correct pragma
There is no #DEFINE
in standard C. It should instead be #define
. Case is significant in C. Similarly, INT
and VOID
are not keywords in standard C, but int
and void
are.
Fix the formatting
The code is diabolically difficult to read with its current formatting. First, put the description above the function itself. Second, write a meaningful comment; the current description of GenericStringSplit
is so vague and confusing as to be useless. Third, don't put multiple statements on a line. For example, instead of this:
if (string[size] == delimiter) string[size] = replace;
write this:
if (string[size] == delimiter) {
string[size] = replace;
}
It's wise to get into the habit of using braces even with a one-liner. Doing so lessens the likelihood of introducing subtle bugs as you maintain the code.
Use the required #include
s
The code uses strlen
and size_t
both of which indicate that the code needs #include <string.h>
. It was not difficult to infer, but it helps reviewers if the code is complete.
Use standard functions
Your StringDelimiterReplace
is, as it's currently used, very similar to the standard strtok
function. It could also benefit from using the standard strchr
function.
/*
* replaces all instances of delimiter with replace in string.
*
* Note that the passed string is modified in place and so cannot
* be const and cannot be NULL. Also, the NUL char ('0円') cannot
* be used as the delimiter. The use of an int that does not
* fit within a char for either the delimiter or the replace
* parameters is not supported. Wide strings are also not supported.
*
* In short, don't actually use this function in production code.
*/
static void StringDelimiterReplace(char *string, int delimiter, int replace) {
while (NULL != (string = strchr(string, delimiter))) {
*string++ = replace;
}
}
Note that this has the effect of only traversing the string once instead of twice in the original code (once for strlen
and again in the loop).
Do one thing well
The GenericStringSplit
does multiple things. It modifies the passed string and it passes back a pointer. I'd strongly suggest that the code would be easier to understand if each function only did one thing.
Allow the use of const
strings
The current code modifies the passed string in place, but an enhancement would be to allow const
strings to be passed in.
Rethink the interface
I'd suggest that an alternative approach would be to actually look for "vid_"
or "pid_"
and then extract the corresponding ID. One approach:
unsigned hexval(const char *string, const char *target) {
char *found = strstr(string, target);
if (found == NULL) {
return 0;
}
return 0xffff & strtoul(&found[strlen(target)], NULL, 16);
}
Usage:
unsigned vid=hexval(string, "vid_");
unsigned pid=hexval(string, "pid_");
printf("%4.4x:%4.4x\n", vid, pid);
-
\$\begingroup\$ Good answer. Corner case:
StringDelimiterReplace(..., '0円', ...)
will cause code to attempt to access memory past the end of the string \$\endgroup\$chux– chux2017年11月02日 19:42:49 +00:00Commented Nov 2, 2017 at 19:42 -
\$\begingroup\$ @chux: True, so never call the function with that parameter equal to
'0円'
. It will also segfault and die if the first string isconst
(at least on Linux). These things could be explicitly checked within the code, provided as comments, or both. Or better, IMHO, would be to not use it at all. \$\endgroup\$Edward– Edward2017年11月02日 19:57:08 +00:00Commented Nov 2, 2017 at 19:57 -
\$\begingroup\$ I've added some of the many caveats to a comment above that code. \$\endgroup\$Edward– Edward2017年11月02日 20:04:27 +00:00Commented Nov 2, 2017 at 20:04
-
\$\begingroup\$ "The use of an int that does not * fit within a char for either the delimiter or the replace * parameters is not supported." is
unsigned char
See c11 7.24.1 3 fordelimiter
andstrchr()
. Forreplace
, perhaps pedantic code should be*string++ = (unsigned char) replace;
. Thestr...()
functions with non-ASCII is always a bit of a trick. \$\endgroup\$chux– chux2017年11月02日 20:29:20 +00:00Commented Nov 2, 2017 at 20:29
Points about the general readability of the code:
- First of all I would try to not separate the function bodies from from the parameters and return type, it just makes the code really unreadable. Put the comments above the declaration not in the middle.
- Try to write short concise and descriptive comments, we know the comment is a description of a function, and that the function is a string splitter.
- If you want to explain your code well then explain not what it does but how it does it, but don't put an essay at the top of functions explain lines that might be harder to understand.
- Don't use multi line comments where you don't need them.
Your comment for generic string splitter says: "function which returns the string at token number specified..." but it has a void return type.
Why are you using defines for your types instead of the normal void, int, char*? Don't use defines where it doesn't make the code more readable.
It's easier to learn this early instead of developing bad habits then trying to fix them.
You are using pointers very weirdly by passing dest as a char**(a pointer to a pointer of a character) then dereferencing it to set it as a pointer to the correct location in the string. This is again why you shouldn't use defines such as PCHAR, because you forget that they are actually pointers. Don't think of char* as a string, think of it as an array of characters or a pointer to the first character in a string.
VOID
,PCHAR
etc. Did you omit some typedefs? \$\endgroup\$<Windows.h>
as everyone suggested I should not be using those \$\endgroup\$