3
\$\begingroup\$

This is part of a two-part post (Part 2).


Here, I have two recent projects that parse a file. The first uses a loop that's kind of hard to follow, and the second uses "modes" to decide what to do.

Project 1: A "bookmark manager" for managing URLs easier than a text editor.

The data structures:

typedef struct {
 char *text;
 int length;
} string;
typedef struct {
 string *path;
 string *title;
} url;
typedef struct {
 url **urls; // An array of pointers to urls
 int length;
 int sel_url;
 int modified;
 char *filename;
} urllist;

For the data, I am considering using a linked list instead of an array.

The parser:

void parse_file(char *fname, urllist *ul){
 /* Dumb char-by-char parser that follows the syntax "URL[Title]"
 * [Title] is optional. If left unspecified, then the url is used as
 * a title instead.
 *
 * There's some basic idiot-proofing:
 * 
 * 1. Empty lines are ignored (lazy-ignore)
 * Rather, '\n' and ' ' are skipped if they begin a line.
 * " http://google.com" is still valid; it'll just take two
 * iterations to extract anything from it
 *
 * 2. lines beginning in '[' or '#' are fully ignored (full-ignore).
 * So you can write comments like "#this" or "[this]". The
 * 'comment char' has to be the first character on the line. If
 * there is a space before it, for example, the parser will think
 * it's a url.
 */
 int c;
 url *u;
 FILE *f = fopen(fname, "r");
 while(1){
 c = fgetc(f);
 if(c == -1){ /* Check EOF */
 break;
 }
 if(c != '\n' && c != ' ' && c != '[' && c != '#'){
 /* Thanks to the EOF check, I have to add the first char
 * NOW, while I still have it. This is also an opportunity
 * for a sanity check */
 u = new_url();
 string_push_char(u->path, c);
 } else if(c == '#' || c == '['){ /* full-ignore */
 while((c = fgetc(f)) != '\n');
 continue;
 } else { /* lazy-ignore */
 continue;
 }
 while(1){ /* Step through individual line */
 c = fgetc(f);
 if(c == '\n'){
 /* Line ends without title, copy path to title */
 string_copy(u->title, u->path);
 break;
 }
 if(c == '['){
 /* Title specified. */
 while(1){ /* Get it. */
 c = fgetc(f);
 if(c == ']'){
 while(fgetc(f) != '\n');
 break;
 }
 string_push_char(u->title, c);
 }
 break;
 }
 string_push_char(u->path, c);
 }
 url_push(ul, u);
 }
 fclose(f);
}

It works, but I feel like it could be done better. As I said earlier, it's hard to follow. You can tell because it's littered with comments, which I personally try to avoid unless describing a block of code as a whole.

An example file

# Cool sites
http://google.com/search?q=help+with+bananas[Google: banana help]
http://stackoverflow.com[Stack Overflow]
# YouTube stuff
http://youtube.com

Result

[Google: banana help ]
 Stack Overflow
 http://youtube.com
http://google.com/search?q=help with bananas
asked Dec 1, 2015 at 17:42
\$\endgroup\$
0

1 Answer 1

1
\$\begingroup\$

The code sure must be streamlined. I didn't find any better way to review but to rearrange the code and put my notes as comments.

 // First, you test the character against the same condition twice.
 // A more direct way would be to do it once
 if (c == '#' || c == '[') {
 while (fgetc(f) != '\n')
 ;
 continue;
 }
 if (c == ' ' || c == '\n')
 continue;
 // Notice that here you still have `c` intact (so the comment about
 // NOW is misleading). Time to initialize the `url`:
 u = new_url();
 // Since you have the first character of the `u->path`, a correct
 // idiom is a `do {} while ();` loop:
 do {
 string_push_char(u->path, c);
 c = fgetc(f);
 } while(c != '\n' && c != '[');
 // Here `c` is the reason of loop termination. Test it and proceed accordingly:
 if (c == '[') {
 while ((c = fgetc(f)) != ']') {
 string_push_char(u->title, c);
 }
 while(fgetc(f) != '\n')
 ;
 } else {
 string_copy(u->title, u->path);
 }

This almost it. The code parses the well-formed file, but will fall into an infinite loop if, for example, the title-closing bracket is missing.

As a side note, isspace(c) is strongly preferred to a manual testing.

answered Dec 1, 2015 at 19:35
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.