I'm looking for feedback on the code I came up with for exercise 1-21 in "The C Programming Language" by K&R.
Write a program entab that replaces strings of blanks with the minimum number of tabs and blanks to achieve the same spacing. Use the same stops as for detab . When either a tab or a single blank would suffice to reach a tab stop, which should be given preference?
The feedback I would mainly like is on readability, conventions and general good (or bad) practices.
This was a fun problem to solve and I think I came up with a efficient solution.
#include <stdio.h>
#define TABSIZE 8
#define MAXLENGTH 80
void entab(char s[]);
int _getline(char s[], int lim);
int main()
{
int len;
char line[MAXLENGTH];
while ((len = _getline(line, MAXLENGTH)) > 0)
entab(line);
}
void entab(char s[])
{
int i, c, nb;
nb = 0;
for (i = 0; (c = s[i]) != '0円'; ++i)
{
if (c == ' ')
{
if (++nb > 1)
{
if (i + 1 % TABSIZE == 0)
{
putchar('\t');
nb = 0;
}
}
}
else
{
if (nb > 0)
{
for (; nb > 0; --nb)
putchar('x');
}
putchar(c);
}
}
}
int _getline(char s[], int lim)
{
int c, i;
for (i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; i++)
s[i] = c;
if (c == '\n')
{
s[i] = c;
++i;
}
s[i] = '0円';
return i;
}
The function _getline()
was given by the book.
Only practices encountered in the book up until this assignment are used, I have not yet read about things like pointers and other more advanced features
My own thoughts
- Use (more) comments when coding
- Use more "self-explaining" variable names
- Input longer than 80 characters isn't handled correctly
-
1\$\begingroup\$ Just in case you don't know - never begin your own identifiers with an underscore like that. Standard C reserves those names for the implementation's own use, so you may get a collision. I'll let you off since K&R gave that to you (the book pre-dates the standardisation of the language, so it wasn't as certainly an issue back when it was written). \$\endgroup\$Toby Speight– Toby Speight2019年09月20日 12:00:22 +00:00Commented Sep 20, 2019 at 12:00
-
\$\begingroup\$ @TobySpeight This is definitely my own fault! The book gave the function with the name "getline" however I ran into issues because I think the stdio library also contains a function with that name. Coming from python I thought it was a good idea to use an underscore. I will avoid doing so from now on, thank you for the feedback! \$\endgroup\$Ghxst– Ghxst2019年09月20日 21:26:56 +00:00Commented Sep 20, 2019 at 21:26
1 Answer 1
Welcome to Code Review! I'm always glad to see someone who reads the classics.
You made some interesting observations about the program.
The statement Use more "self-explaining" variable names is absolutely correct: the variable name nb
could definitely be improved. I would like to point out that self-documenting might be better than self-explaining.
If the variable and symbolic constant names are improved it may not need any additional comments.
The header file <stdio.h>
includes a symbolic constant BUFSIZ
that might be a better length than 80
. In some cases BUFSIZ
may be the maximum line length for the system; this was true on older Unix systems.
Possible Bug
This code seems out of scope for the problem as it is defined, I would expect to see that it was outputing blanks.
for (; blankCount > 0; --blankCount)
putchar('x');
A Good Coding Practice
Code needs to be maintained. This may include adding additional lines to control structures such as if
statements and loops. In C and C++ a good programming practice is to have code blocks (complex statements) in all if
, else
and loop statements even if it isn't currently necessary.
if (CONDITION)
{
one statement
}
else
{
one statement
}
while (CONDITION)
{
one statement
}
Alternate Solution with Simplified Functions
It might be easier to read, write and modify entab()
if it called a function to count all the blanks and print the necessary tabs and blanks. Programming in many cases is breaking down a problem into smaller and smaller pieces until each piece is easy to implement. While this may make the entire program a little more complex, each function is simplified.
This example uses a concept you haven't gotten to in the book yet called pointers.
void print_tabs_or_spaces(int tab_count, int out_value)
{
for (int i = 0; i < tab_count; i++)
{
putchar(out_value);
}
}
char* count_blanks_and_output_tabs_and_spaces(char *c)
{
int blank_count = 0;
while (*c == ' ')
{
++blank_count;
c++;
}
int tab_count = blank_count / TABSIZE;
int space_count = blank_count % TABSIZE;
print_tabs_or_spaces(tab_count, '\t');
print_tabs_or_spaces(space_count, ' ');
return c;
}
void entab(char str[])
{
char *c = &str[0];
while(*c != '0円')
{
if (*c == ' ')
{
c = count_blanks_and_output_tabs_and_spaces(c);
}
else
{
putchar(*c);
c++;
}
}
}
-
1\$\begingroup\$ Thank you very much for your observations and feedback! The reason for this loop is because when the global loop starts, we aren't sure yet whether the spaces / blanks we encounter should be tabs, in a case where multiple spaces / blanks are used, but not until the nearest tab stop, a tab shouldn't be used. example:
"hello world"
(2 spaces) will be printed as"helloxxworld"
where ashello world
(3 spaces) will be printed as"hello\tworld"
. I did however forget to change the'x'
to a' '
(space). \$\endgroup\$Ghxst– Ghxst2019年09月19日 20:48:00 +00:00Commented Sep 19, 2019 at 20:48 -
2\$\begingroup\$ @Ghxst I wasn't questioning why the loop was there, just the
x
. :) \$\endgroup\$2019年09月19日 21:06:51 +00:00Commented Sep 19, 2019 at 21:06