4
\$\begingroup\$

This function takes a string from an HTTP POST or GET string, finds the specified variable, and allocates memory for the result and puts it in a string. The destination is given as an address of an empty pointer.

username=johndoe&password=password123

Would produce:

password123

when finding variable password.

void httpString(char **dest, char *input, const char *find) {
char *start;
char *o_input = input;
const char *o_find = find;
size_t length = 0;
size_t i = 0;
while (*input) {
 if (*input == '&' || input == o_input) {
 if (*input == '&') {
 input++;
 if (*input == 0) {
 return;
 }
 }
 while (*input == *find) {
 if (*input == 0 || *find == 0) {
 return;
 }
 input++;
 find++;
 if (*input == '=' && *find == 0) {
 input++;
 if (*input == 0) {
 return;
 }
 start = input;
 while (*input != '&' && *input) {
 input++;
 length++;
 }
 *dest = malloc(length + 1);
 input = start;
 while (*input != '&' && *input) {
 (*dest)[i] = *input;
 input++;
 i++;
 }
 (*dest)[i] = 0;
 return;
 }
 }
 }
 find = o_find;
 input++;
}
}

Any feedback related to how this function can be improved would be greatly appreciated. I am worried about potential edge cases where a memory access violation could occur.

200_success
146k22 gold badges190 silver badges479 bronze badges
asked Nov 13, 2018 at 22:15
\$\endgroup\$
2
  • 1
    \$\begingroup\$ I'll try to do a full review tomorrow but for now I have to be pedantic : What you want to parse is called a Query String and for the sake of completness, it can contains more than simple value pairs. You can have multiples values for a key (eg, array of values) or keys without value (act like a flag). \$\endgroup\$ Commented Nov 13, 2018 at 23:58
  • \$\begingroup\$ @Calak Thanks for letting me know. I do realize that query strings can be more complex than this, but I would likely keep inputs simple. \$\endgroup\$ Commented Nov 14, 2018 at 3:55

3 Answers 3

3
\$\begingroup\$

The easiest way to improve this function is to use the C standard library. As it is, it's difficult to read; you can't understand what it does at a glance. And it's unnecessary difficult, because most of the building blocks are already available for free in <string.h>:

  • find a string in another string with strstr;
  • find a character in a string with strchr;
  • copy a string with strcpy or a substring with memcpy

Once you have simplified your code, you'll have more brain space to care for not so negligeable things such as testing memory allocations:

void httpString(char **dest, char *input, const char *find) {
 char* found = strstr(input, find);
 if (!found) {
 printf("find not found!");
 return;
 }
 char* assign = found + strlen(find);
 if (*assign != '=') {
 printf("ill-formed!");
 return;
 }
 char* value = assign + 1;
 char* end_value = strchr(value, '&');
 if (!end_value) end_value = strchr(value, 0);
 int length = end_value - value;
 *dest = (char*) malloc(length + 1);
 if (!*dest) {
 printf("Not enough memory");
 return;
 }
 memcpy(*dest, value, length);
 (*dest)[length] = 0;
}
answered Nov 13, 2018 at 23:37
\$\endgroup\$
3
  • 2
    \$\begingroup\$ In the last line, you could write (*dest)[length + 1] = '0円', which would make the intention a bit clearer. \$\endgroup\$ Commented Nov 14, 2018 at 1:02
  • \$\begingroup\$ @chux, RolandIllig : thanks, I've edited my answer \$\endgroup\$ Commented Nov 15, 2018 at 15:16
  • \$\begingroup\$ This code skips leading & detection and so will incorrectly find nope in username=johndoe&notpassword=nope&password=password123. It also incorrectly exits on username=johndoe&passwordnot=nope&password=password123 \$\endgroup\$ Commented Nov 15, 2018 at 16:26
3
\$\begingroup\$

Passwords and library functions

Code dealing with passwords needs to be careful about calling library functions that are not secure as those functions might leave copies of data lingering who-knows-where or leak timing information. That is a good reason to not call standard functions. Still, for developing code, better to first use standard functions and then later replace with secure code.

Flaw: Ambiguous allocation

httpString(char **dest, ) along some paths will allocate memory for *dest, but not all. Function lacks any notification to the caller if allocation occurred or not. This is one of those "I am worried about potential edge cases".

const

As char *input data does not change, add const for greater applicability and potential optimizations.

//void httpString(char **dest, char *input, const char *find) {
// char *start;
// char *o_input = input;
void httpString(char **dest, const char *input, const char *find) {
 const char *start;
 const char *o_input = input;

Minor

No allocation check

*dest = malloc(length + 1);
if (*dest == NULL) {
 // do something

Missing proto

Add #include <stdlib.h> for malloc().

Unneeded code

while (*input == *find) {
 // if (*input == 0 || *find == 0) {
 if (*input == 0) {
 return;
 }

Naming

char *input is not that useful. Yes it is input, but input about what?

For such searching goals, code could use boring s1, s2 like C lib strstr(const char *s1, const char *s2), yet I prefer something more illustrative.

void httpString(char **dest, const char *src, const char *pattern)

... or more fun: Needle in a haystack

void httpString(char **dest, const char *haystack, const char *needle)

Candidate alternative:

// Find ("&" needle "=") within haystack.
// Allocate & return pointer to string after "=" up to a potential following `&`.
char *httpString(const char *haystack, const char *needle) {
 size_t needle_len = strlen(needle);
 while (*haystack) {
 if (*haystack++ == '&' && strncmp(haystack, needle, needle_len) == 0
 && haystack[needle_len] == '=') {
 haystack += needle_len + 1;
 size_t password_len = strcspn(haystack, "&");
 char *pw = malloc(password_len + 1u);
 if (pw == NULL) {
 return NULL; // Out of memory
 }
 pw[password_len] = '0円';
 return memcpy(pw, haystack, password_len);
 }
 }
 return NULL; // Not found
}

[Edit] restrict removed as not really needed.

answered Nov 15, 2018 at 15:34
\$\endgroup\$
4
  • \$\begingroup\$ Well, haystack must start with an &. Out-of-memory-line could also be a break; line (personal preference). Nice piece of code btw. \$\endgroup\$ Commented May 22, 2023 at 16:30
  • \$\begingroup\$ @Grim Thanks. "haystack must start with an &." is unclear. I expect httpString("username=johndoe&password=password123", "password"); to work fine. \$\endgroup\$ Commented May 22, 2023 at 16:38
  • \$\begingroup\$ Yes, it work fine. I am not sure if you could address the first parameter. i.e.:httpString("username=johndoe&password=password123", "username"); \$\endgroup\$ Commented May 22, 2023 at 16:43
  • 1
    \$\begingroup\$ @Grim Fair enough. Given "finds the specified variable", sounds like OP wanted something bracketed with &...=. Some start of _token_/_delimiter is useful else amending code to not need the & could match on uusername as in httpString("uusername=johndoe&password=password123", "username");. All depends on coding goals. \$\endgroup\$ Commented May 22, 2023 at 16:49
2
\$\begingroup\$

In both GET query strings and POST bodies, the key-value pairs are percent-encoded. Therefore, username and %75%73%65%72name would be considered semantically equivalent keys, and your parser should also look for the percent-encoded variants in the input. Conversely, the function should automatically percent-decode any value that it finds, both for symmetry and utility.

Why not return the result instead of returning void?

However, I'd prefer a design that avoids malloc() altogether, because malloc() could fail, and your caller could easily forget to free() the allocated memory. Consider writing a parser that helps you iterate over the keys and values, overwriting the input with the decoded results. It's kind of ugly, but avoids malloc() altogether.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/**
 * Percent-decodes a string in-place.
 */
static void percentDecode(char *s) {
 /* TODO */
}
/**
 * Returns a pointer to the beginning of the a key-value pair, writing
 * a NUL delimiter to the input. Advances input to the next key-value pair.
 */
char *keyValuePair(char **input) {
 return strsep(input, "&");
}
/**
 * Splits keyValue into two strings, and performs percent-decoding on both.
 * Returns a pointer to the key, and advances keyValue to point to the value.
 */ 
char *extractKey(char **keyValue) {
 char *key = strsep(keyValue, "=");
 percentDecode(key);
 percentDecode(*keyValue);
 return key;
}
int main() {
 char *input = strdup("username=johndoe&password=password123");
 for (char *key; (key = keyValuePair(&input)); ) {
 char *value = key;
 if (0 == strcmp("password", extractKey(&value))) {
 printf("Found %s: %s\n", key, value);
 }
 }
 free(input);
}
answered Nov 14, 2018 at 1:32
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.