Trim trailing lines and remove blank lines in C

Question 1

This simple program, that I call trim(1), trims trailing lines and, if the flag -l is passed as an argument, it also removes empty lines and lines containing only blanks.

I did this as an exercise for learning C and good practices of C programming in UNIX.

It reads from standard input if no argument is passed or read from the files passed as arguments.

I am a bit suspicious of that realloc idiom for reallocing the buffer if it gets full. Is it right?

#include <err.h>
#include <errno.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define MID 0 /* in middle of line */
#define BEG 1 /* at beginning of line */
static int blankline = 0;
static int exitval = EXIT_SUCCESS;
static void trim(FILE *fp);
static void usage(void);
/* remove trailing blanks and delete blank lines */
int
main(int argc, char *argv[])
{
 int ch;
 FILE *fp;
 while ((ch = getopt(argc, argv, "l")) != -1) {
 switch (ch) {
 case 'l':
 blankline = 1;
 break;
 default:
 usage();
 break;
 }
 }
 argc -= optind;
 argv += optind;
 if (argc == 0) {
 trim(stdin);
 } else {
 while (*argv) {
 if ((fp = fopen(*argv, "r")) == NULL) {
 warn("%s", *argv);
 exitval = EXIT_FAILURE;
 } else {
 trim(fp);
 fclose(fp);
 }
 argv++;
 }
 }
}
/* trim trailing blanks from fp; if blankline is set also remove blank lines */
static void
trim(FILE *fp)
{
 char *buf = NULL;
 size_t i, size, newsize;
 int status;
 int c;
 size = BUFSIZ;
 if (buf == NULL)
 if ((buf = malloc(size)) == NULL)
 err(EXIT_FAILURE, "malloc");
 i = 0;
 status = BEG;
 while ((c = getc(fp)) != EOF) {
 if (isblank(c)) { /* collect blanks in buf */
 if (i >= size) { /* realloc */
 newsize = size + BUFSIZ;
 if (newsize <= size) /* check for overflow */
 errc(EXIT_FAILURE, EOVERFLOW, "realloc");
 size = newsize;
 if ((buf = realloc(buf, size)) == NULL)
 err(EXIT_FAILURE, "realloc");
 }
 buf[i++] = c;
 } else if (c == '\n') {
 if (status == MID || (!blankline))
 putchar('\n');
 status = BEG;
 i = 0;
 } else {
 fwrite(buf, 1, i, stdout);
 putchar(c);
 status = MID;
 i = 0;
 }
 }
 if (ferror(fp)) {
 warn("getc");
 exitval = EXIT_FAILURE;
 }
}
/* show usage */
static void
usage(void)
{
 (void)fprintf(stderr, "usage: trim [-l] [file...]\n");
 exit(EXIT_FAILURE);
}

Question 2

Well designed code.

I am a bit suspicious of that realloc idiom for reallocing the buffer if it gets full. Is it right?

I see no errors.

Nit: I'd rather see newsize declared locally. It is only used there.

// newsize = size + BUFSIZ;
size_t newsize = size + BUFSIZ;

Other

Unneeded code

The initial allocation is not needed. realloc() in the loop can handle starting from 0

//size = BUFSIZ;
//if (buf == NULL)
// if ((buf = malloc(size)) == NULL)
// err(EXIT_FAILURE, "malloc");
size = 0;

Perhaps there is a concern about ...

fwrite(NULL, 1, 0, stdout);

... yet given the spec "If size or nmemb is zero, fwrite returns zero and the state of the stream remains unchanged.", I see no issue.

Either way I'd consider the micro-optimization below, just for the sake of avoiding the fwite() function call as i==0 will be quite common. (Usually I do not encourage any micro-op - I'll make an exception here)

// fwrite(buf, 1, i, stdout);
if (i > 0) fwrite(buf, 1, i, stdout);

Checking output

Code does not check the return values of fwrite(), putchar(c). Robust code would and report the problem - even if it is rare.

Missing free()

Free the allocation, else trim() leaks memory.

free(buf);

ferror()

Good that code checks this.

main() always reports success

main() employs the implied return 0;. Yet failure is possible.

Given code has exitval = EXIT_FAILURE;, I'd expect main() to return exitval; in the end.

Initialize

Rather than declare then assign lines later, consider declare with initialization.

// int status;
// ...
// status = BEG;
int status = BEG;
// size_t size;
// ...
// size = BUFSIZ;
size_t size = BUFSIZ;

Same for i.

Question 3

"Given code has exitval = EXIT_FAILURE;, I'd expect main() to return exitval; in the end." LOL, it was supposed to return exitval, I really forgot it. Thanks for seeing it!

Question 4

"Code does not check the return values of fwrite(), putchar(c)". Should I use an ferror(stdout) in the end of the function instead of checking the return values of fwrite(3) and putchar(3)?

Question 5

@barthooper Perhaps. A more direct approach would take action as soon as the error occurs. IMO a write error is most likely on the first write - otherwise it is rare. Your call.

chux chuxchux 36.4k2 gold badges43 silver badges96 bronze badges · Accepted Answer · 2020-05-06 12:02:59Z

Well designed code.

I am a bit suspicious of that realloc idiom for reallocing the buffer if it gets full. Is it right?

I see no errors.

Nit: I'd rather see newsize declared locally. It is only used there.

// newsize = size + BUFSIZ;
size_t newsize = size + BUFSIZ;

Other

Unneeded code

The initial allocation is not needed. realloc() in the loop can handle starting from 0

//size = BUFSIZ;
//if (buf == NULL)
// if ((buf = malloc(size)) == NULL)
// err(EXIT_FAILURE, "malloc");
size = 0;

Perhaps there is a concern about ...

fwrite(NULL, 1, 0, stdout);

... yet given the spec "If size or nmemb is zero, fwrite returns zero and the state of the stream remains unchanged.", I see no issue.

Either way I'd consider the micro-optimization below, just for the sake of avoiding the fwite() function call as i==0 will be quite common. (Usually I do not encourage any micro-op - I'll make an exception here)

// fwrite(buf, 1, i, stdout);
if (i > 0) fwrite(buf, 1, i, stdout);

Checking output

Code does not check the return values of fwrite(), putchar(c). Robust code would and report the problem - even if it is rare.

Missing free()

Free the allocation, else trim() leaks memory.

free(buf);

ferror()

Good that code checks this.

main() always reports success

main() employs the implied return 0;. Yet failure is possible.

Given code has exitval = EXIT_FAILURE;, I'd expect main() to return exitval; in the end.

Initialize

Rather than declare then assign lines later, consider declare with initialization.

// int status;
// ...
// status = BEG;
int status = BEG;
// size_t size;
// ...
// size = BUFSIZ;
size_t size = BUFSIZ;

Same for i.

"Given code has exitval = EXIT_FAILURE;, I'd expect main() to return exitval; in the end." LOL, it was supposed to return exitval, I really forgot it. Thanks for seeing it!
"Code does not check the return values of fwrite(), putchar(c)". Should I use an ferror(stdout) in the end of the function instead of checking the return values of fwrite(3) and putchar(3)?
@barthooper Perhaps. A more direct approach would take action as soon as the error occurs. IMO a write error is most likely on the first write - otherwise it is rare. Your call.

Stack Exchange Network

Trim trailing lines and remove blank lines in C

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Trim trailing lines and remove blank lines in C

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions