This function takes a string from an HTTP POST or GET string, finds the specified variable, and allocates memory for the result and puts it in a string. The destination is given as an address of an empty pointer.
username=johndoe&password=password123
Would produce:
password123
when finding variable password
.
void httpString(char **dest, char *input, const char *find) {
char *start;
char *o_input = input;
const char *o_find = find;
size_t length = 0;
size_t i = 0;
while (*input) {
if (*input == '&' || input == o_input) {
if (*input == '&') {
input++;
if (*input == 0) {
return;
}
}
while (*input == *find) {
if (*input == 0 || *find == 0) {
return;
}
input++;
find++;
if (*input == '=' && *find == 0) {
input++;
if (*input == 0) {
return;
}
start = input;
while (*input != '&' && *input) {
input++;
length++;
}
*dest = malloc(length + 1);
input = start;
while (*input != '&' && *input) {
(*dest)[i] = *input;
input++;
i++;
}
(*dest)[i] = 0;
return;
}
}
}
find = o_find;
input++;
}
}
Any feedback related to how this function can be improved would be greatly appreciated. I am worried about potential edge cases where a memory access violation could occur.
-
1\$\begingroup\$ I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag). \$\endgroup\$Calak– Calak2018年11月13日 23:58:56 +00:00Commented Nov 13, 2018 at 23:58
-
\$\begingroup\$ @Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple. \$\endgroup\$wispi– wispi2018年11月14日 03:55:15 +00:00Commented Nov 14, 2018 at 3:55
3 Answers 3
The easiest way to improve this function is to use the C standard library. As it is, it's difficult to read; you can't understand what it does at a glance. And it's unnecessary difficult, because most of the building blocks are already available for free in <string.h>
:
- find a string in another string with
strstr
; - find a character in a string with
strchr
; - copy a string with
strcpy
or a substring withmemcpy
Once you have simplified your code, you'll have more brain space to care for not so negligeable things such as testing memory allocations:
void httpString(char **dest, char *input, const char *find) {
char* found = strstr(input, find);
if (!found) {
printf("find not found!");
return;
}
char* assign = found + strlen(find);
if (*assign != '=') {
printf("ill-formed!");
return;
}
char* value = assign + 1;
char* end_value = strchr(value, '&');
if (!end_value) end_value = strchr(value, 0);
int length = end_value - value;
*dest = (char*) malloc(length + 1);
if (!*dest) {
printf("Not enough memory");
return;
}
memcpy(*dest, value, length);
(*dest)[length] = 0;
}
-
2\$\begingroup\$ In the last line, you could write
(*dest)[length + 1] = '0円'
, which would make the intention a bit clearer. \$\endgroup\$Roland Illig– Roland Illig2018年11月14日 01:02:23 +00:00Commented Nov 14, 2018 at 1:02 -
\$\begingroup\$ @chux, RolandIllig : thanks, I've edited my answer \$\endgroup\$papagaga– papagaga2018年11月15日 15:16:16 +00:00Commented Nov 15, 2018 at 15:16
-
\$\begingroup\$ This code skips leading
&
detection and so will incorrectly findnope
inusername=johndoe¬password=nope&password=password123
. It also incorrectly exits onusername=johndoe&passwordnot=nope&password=password123
\$\endgroup\$chux– chux2018年11月15日 16:26:15 +00:00Commented Nov 15, 2018 at 16:26
Passwords and library functions
Code dealing with passwords needs to be careful about calling library functions that are not secure as those functions might leave copies of data lingering who-knows-where or leak timing information. That is a good reason to not call standard functions. Still, for developing code, better to first use standard functions and then later replace with secure code.
Flaw: Ambiguous allocation
httpString(char **dest, )
along some paths will allocate memory for *dest
, but not all. Function lacks any notification to the caller if allocation occurred or not. This is one of those "I am worried about potential edge cases".
const
As char *input
data does not change, add const
for greater applicability and potential optimizations.
//void httpString(char **dest, char *input, const char *find) {
// char *start;
// char *o_input = input;
void httpString(char **dest, const char *input, const char *find) {
const char *start;
const char *o_input = input;
Minor
No allocation check
*dest = malloc(length + 1);
if (*dest == NULL) {
// do something
Missing proto
Add #include <stdlib.h>
for malloc()
.
Unneeded code
while (*input == *find) {
// if (*input == 0 || *find == 0) {
if (*input == 0) {
return;
}
Naming
char *input
is not that useful. Yes it is input, but input about what?
For such searching goals, code could use boring s1, s2
like C lib strstr(const char *s1, const char *s2)
, yet I prefer something more illustrative.
void httpString(char **dest, const char *src, const char *pattern)
... or more fun: Needle in a haystack
void httpString(char **dest, const char *haystack, const char *needle)
Candidate alternative:
// Find ("&" needle "=") within haystack.
// Allocate & return pointer to string after "=" up to a potential following `&`.
char *httpString(const char *haystack, const char *needle) {
size_t needle_len = strlen(needle);
while (*haystack) {
if (*haystack++ == '&' && strncmp(haystack, needle, needle_len) == 0
&& haystack[needle_len] == '=') {
haystack += needle_len + 1;
size_t password_len = strcspn(haystack, "&");
char *pw = malloc(password_len + 1u);
if (pw == NULL) {
return NULL; // Out of memory
}
pw[password_len] = '0円';
return memcpy(pw, haystack, password_len);
}
}
return NULL; // Not found
}
[Edit] restrict
removed as not really needed.
-
\$\begingroup\$ Well,
haystack
must start with an &. Out-of-memory-line could also be abreak;
line (personal preference). Nice piece of code btw. \$\endgroup\$Grim– Grim2023年05月22日 16:30:55 +00:00Commented May 22, 2023 at 16:30 -
\$\begingroup\$ @Grim Thanks. "haystack must start with an &." is unclear. I expect
httpString("username=johndoe&password=password123", "password");
to work fine. \$\endgroup\$chux– chux2023年05月22日 16:38:42 +00:00Commented May 22, 2023 at 16:38 -
\$\begingroup\$ Yes, it work fine. I am not sure if you could address the first parameter. i.e.:
httpString("username=johndoe&password=password123", "username");
\$\endgroup\$Grim– Grim2023年05月22日 16:43:06 +00:00Commented May 22, 2023 at 16:43 -
1\$\begingroup\$ @Grim Fair enough. Given "finds the specified variable", sounds like OP wanted something bracketed with
&...=
. Some start of _token_/_delimiter is useful else amending code to not need the&
could match onuusername
as inhttpString("uusername=johndoe&password=password123", "username");
. All depends on coding goals. \$\endgroup\$chux– chux2023年05月22日 16:49:15 +00:00Commented May 22, 2023 at 16:49
In both GET query strings and POST bodies, the key-value pairs are percent-encoded. Therefore, username
and %75%73%65%72name
would be considered semantically equivalent keys, and your parser should also look for the percent-encoded variants in the input. Conversely, the function should automatically percent-decode any value that it finds, both for symmetry and utility.
Why not return the result instead of returning void
?
However, I'd prefer a design that avoids malloc()
altogether, because malloc()
could fail, and your caller could easily forget to free()
the allocated memory. Consider writing a parser that helps you iterate over the keys and values, overwriting the input
with the decoded results. It's kind of ugly, but avoids malloc()
altogether.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/**
* Percent-decodes a string in-place.
*/
static void percentDecode(char *s) {
/* TODO */
}
/**
* Returns a pointer to the beginning of the a key-value pair, writing
* a NUL delimiter to the input. Advances input to the next key-value pair.
*/
char *keyValuePair(char **input) {
return strsep(input, "&");
}
/**
* Splits keyValue into two strings, and performs percent-decoding on both.
* Returns a pointer to the key, and advances keyValue to point to the value.
*/
char *extractKey(char **keyValue) {
char *key = strsep(keyValue, "=");
percentDecode(key);
percentDecode(*keyValue);
return key;
}
int main() {
char *input = strdup("username=johndoe&password=password123");
for (char *key; (key = keyValuePair(&input)); ) {
char *value = key;
if (0 == strcmp("password", extractKey(&value))) {
printf("Found %s: %s\n", key, value);
}
}
free(input);
}