This simple program, that I call trim(1)
, trims trailing lines and, if the flag -l
is passed as an argument, it also removes empty lines and lines containing only blanks.
I did this as an exercise for learning C and good practices of C programming in UNIX.
It reads from standard input if no argument is passed or read from the files passed as arguments.
I am a bit suspicious of that realloc idiom for reallocing the buffer if it gets full. Is it right?
#include <err.h>
#include <errno.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define MID 0 /* in middle of line */
#define BEG 1 /* at beginning of line */
static int blankline = 0;
static int exitval = EXIT_SUCCESS;
static void trim(FILE *fp);
static void usage(void);
/* remove trailing blanks and delete blank lines */
int
main(int argc, char *argv[])
{
int ch;
FILE *fp;
while ((ch = getopt(argc, argv, "l")) != -1) {
switch (ch) {
case 'l':
blankline = 1;
break;
default:
usage();
break;
}
}
argc -= optind;
argv += optind;
if (argc == 0) {
trim(stdin);
} else {
while (*argv) {
if ((fp = fopen(*argv, "r")) == NULL) {
warn("%s", *argv);
exitval = EXIT_FAILURE;
} else {
trim(fp);
fclose(fp);
}
argv++;
}
}
}
/* trim trailing blanks from fp; if blankline is set also remove blank lines */
static void
trim(FILE *fp)
{
char *buf = NULL;
size_t i, size, newsize;
int status;
int c;
size = BUFSIZ;
if (buf == NULL)
if ((buf = malloc(size)) == NULL)
err(EXIT_FAILURE, "malloc");
i = 0;
status = BEG;
while ((c = getc(fp)) != EOF) {
if (isblank(c)) { /* collect blanks in buf */
if (i >= size) { /* realloc */
newsize = size + BUFSIZ;
if (newsize <= size) /* check for overflow */
errc(EXIT_FAILURE, EOVERFLOW, "realloc");
size = newsize;
if ((buf = realloc(buf, size)) == NULL)
err(EXIT_FAILURE, "realloc");
}
buf[i++] = c;
} else if (c == '\n') {
if (status == MID || (!blankline))
putchar('\n');
status = BEG;
i = 0;
} else {
fwrite(buf, 1, i, stdout);
putchar(c);
status = MID;
i = 0;
}
}
if (ferror(fp)) {
warn("getc");
exitval = EXIT_FAILURE;
}
}
/* show usage */
static void
usage(void)
{
(void)fprintf(stderr, "usage: trim [-l] [file...]\n");
exit(EXIT_FAILURE);
}
1 Answer 1
Well designed code.
I am a bit suspicious of that realloc idiom for reallocing the buffer if it gets full. Is it right?
I see no errors.
Nit: I'd rather see newsize
declared locally. It is only used there.
// newsize = size + BUFSIZ;
size_t newsize = size + BUFSIZ;
Other
Unneeded code
The initial allocation is not needed. realloc()
in the loop can handle starting from 0
//size = BUFSIZ;
//if (buf == NULL)
// if ((buf = malloc(size)) == NULL)
// err(EXIT_FAILURE, "malloc");
size = 0;
Perhaps there is a concern about ...
fwrite(NULL, 1, 0, stdout);
... yet given the spec "If size or nmemb is zero, fwrite returns zero and the state of the stream remains unchanged.", I see no issue.
Either way I'd consider the micro-optimization below, just for the sake of avoiding the fwite()
function call as i==0
will be quite common. (Usually I do not encourage any micro-op - I'll make an exception here)
// fwrite(buf, 1, i, stdout);
if (i > 0) fwrite(buf, 1, i, stdout);
Checking output
Code does not check the return values of fwrite(), putchar(c)
. Robust code would and report the problem - even if it is rare.
Missing free()
Free the allocation, else trim()
leaks memory.
free(buf);
ferror()
Good that code checks this.
main()
always reports success
main()
employs the implied return 0;
. Yet failure is possible.
Given code has exitval = EXIT_FAILURE;
, I'd expect main()
to return exitval;
in the end.
Initialize
Rather than declare then assign lines later, consider declare with initialization.
// int status;
// ...
// status = BEG;
int status = BEG;
// size_t size;
// ...
// size = BUFSIZ;
size_t size = BUFSIZ;
Same for i
.
-
\$\begingroup\$ "Given code has exitval = EXIT_FAILURE;, I'd expect main() to return exitval; in the end." LOL, it was supposed to
return exitval
, I really forgot it. Thanks for seeing it! \$\endgroup\$phillbush– phillbush2020年05月06日 14:04:29 +00:00Commented May 6, 2020 at 14:04 -
\$\begingroup\$ "Code does not check the return values of fwrite(), putchar(c)". Should I use an
ferror(stdout)
in the end of the function instead of checking the return values offwrite(3)
andputchar(3)
? \$\endgroup\$phillbush– phillbush2020年05月06日 14:08:06 +00:00Commented May 6, 2020 at 14:08 -
\$\begingroup\$ @barthooper Perhaps. A more direct approach would take action as soon as the error occurs. IMO a write error is most likely on the first write - otherwise it is rare. Your call. \$\endgroup\$chux– chux2020年05月06日 14:18:37 +00:00Commented May 6, 2020 at 14:18