In Kernighan and Ritchie (the C programming language):
Write a program to copy its input to its output, replacing each string of one or more blanks by a single blank.
I wrote the following code. Do you have advice to improve it?
#include <stdio.h>
/* copy input to output, replacing each
string of one or more blanks by a single blank */
main(){
int c;
double nc;
char otherBlank;
otherBlank = 'f';
while ((c = getchar()) != EOF){
if (otherBlank == 't' && c != ' '){
/* read a non blank and had a blank before */
putchar(' ');
putchar(c);
otherBlank = 'f';
}
else if (c == ' ')
/* read a blank */
otherBlank = 't';
else
/* read a non blank and had no blank before */
putchar(c);
}
}
4 Answers 4
There a few concerns I have with this program. Let's go through it line by line:
You don't declare what you are returning from
main()
, or what parameters you are going to take in. You should always be declaring these instead of letting the compiler assume them.int main(void)
You asked in the comments why we should declare this to return an
int
and notvoid
. We don't want to just throw away that useful exit status information. There is great answer here that goes into great detail as to why, you should take a look over there."But wait!", you say. "We didn't actually return anything from our
main
function! How does this black magic work?" Yep, I know. That is just what is declared in the standards.C99 & C11 §5.1.2.2(3):
"...reaching the
}
that terminates themain()
function returns a value of 0."You don't use your
nc
variable in your program. It should be removed.You declare
otherBlank
as achar
on one line and assign it the characterf
on the next line. You should be assigning on the same line as you are declaring in this case.Analyzing your program further, it is questionable as to why you aren't using a boolean value to represent
otherBlank
. You only seem to be using it as a "flag" of a sort, so it makes sense to me to represent it as thebool
type. Keep in mind that you will have to include<stdbool.h>
if you choose to change this.Further refactoring your program, there is no need for the
otherBlank
variable. Use a while loop to capture all of the' '
characters and then output only one' '
character. Then continue with the printing of the other non-blank characters.You don't check
putchar
for it's return value, as you should be doing. @Michael went into great detail with this, give his answer a look and an upvote to see why!
Here is what I managed to come up with:
#include <stdio.h>
int main(void)
{
int c;
while ((c = getchar()) != EOF) {
if (c == ' ') {
while ((c = getchar()) == ' ');
if (putchar(' ') == EOF) return -1;
if (c == EOF) break;
}
if (putchar(c) == EOF) return -1;
}
}
Here is how this could be further improved:
Use the function
fgets()
to capture the input once; looping over each character to capture it manually isn't very efficient (and if your program doesn't handle input properly, it could be dangerous).Store the input string in a character array and loop over the characters to test them for
' '
. Then make the necessary changes and print out the entire string at once, instead of looping and printing out the individual characters (again, inefficient).
-
\$\begingroup\$ The return from
getchar
is anint
so that it can holdEOF
. Soc
should beint
. Alsoget_s
is not widely supported and is unlikely to reliably capture the "entire input" - usefgets
for max compatibility. And be aware that thefgets
(orgets_s
) will discard each\n
. \$\endgroup\$William Morris– William Morris2014年05月24日 23:40:28 +00:00Commented May 24, 2014 at 23:40 -
\$\begingroup\$ @WilliamMorris
EOF
is defined in<stdio.h>
as#define EOF (-1)
, and since the range of achar
is defined from -128 to 127 it should be safe for the implicit conversion in order to comparec
to theint
typeEOF
. And you are probably right about usingfgets
, I'm just trying to use all this new-fangled C11 standard stuff :P \$\endgroup\$syb0rg– syb0rg2014年05月25日 03:55:54 +00:00Commented May 25, 2014 at 3:55 -
\$\begingroup\$ syb0rg; please can you put in checks for == EOF on the return code of the putchars; see my comment below. \$\endgroup\$Michael– Michael2014年05月25日 05:54:04 +00:00Commented May 25, 2014 at 5:54
-
\$\begingroup\$ Using gets_s will fail on some input; see stackoverflow.com/questions/12893774/… for more. \$\endgroup\$Michael– Michael2014年05月25日 06:00:00 +00:00Commented May 25, 2014 at 6:00
-
1\$\begingroup\$ @bigTree; there's a nice StackOverflow answer that covers return values stackoverflow.com/questions/204476/… ; note especially the comments about EXIT_SUCCESS and EXIT_FAILURE. \$\endgroup\$Michael– Michael2014年05月26日 08:37:39 +00:00Commented May 26, 2014 at 8:37
Three things
- you should check the return from putchar and fail with an error message if it fails
- normally in C you want to use a boolean value not a character for true and false
- from my understanding of the original question from K&R you seem to have a bug if you have a character then a blank at the end of your file
I see that some other answers have been given with examples that don't include checking the return value. For this reason I'll expand the comment a but more.
If you don't check the return value from putchar it's possible that some of your writes fail then others succeed. This leaves you with a silently corrupted file. This would happen, for example, if your disk filled temporarily during running your program. This is much nastier than just truncating the file to the size of the free space. If the putchar falls the program should visibly fail.
I was brewing an answer to this question at the same time as the others, and it overlaps a lot with Syborg's .... but, it is different enough to present it.
The reasons why this is an improvement over your code is that it has just the one variable. Additionally, it has tight loops without conditionals. Even though there is an outer loop, in many instances (long strings of words with few spaces, or words with many huge gaps), the tight loops will make the code highly optimizable. So, here's how I would code it (note that Java is my strongest language, so I normally get help with C....).
int c = getchar();
while (c != EOF){
do {
putchar(c);
} while (c != ' ' && (c = getchar()) != EOF);
while (c == ' ') {
c = getchar();
}
}
I have put this in an ideone to test it...
Using a do-while loop makes things simpler, sometimes.
-
\$\begingroup\$ The return from
getchar
is an int so that it can holdEOF
. Soc
should beint
. \$\endgroup\$William Morris– William Morris2014年05月24日 23:54:06 +00:00Commented May 24, 2014 at 23:54 -
\$\begingroup\$ @WilliamMorris - Hmmm ... fixing that now... of course (and I'm sure there's other problems too) \$\endgroup\$rolfl– rolfl2014年05月25日 00:04:21 +00:00Commented May 25, 2014 at 0:04
Instead of declaring otherBlank
and then assigning to it, just initialize it right away:
char otherBlank = 'f';
This is a bit cleaner and keeps the variable within a lower scope.
Also, nc
is declared but unused, so it should be removed. Leaving unused variables around will clutter the code and cause compiler warnings (if warning levels are high enough, which they should be).
-
\$\begingroup\$ 'it keeps the variable in a lower scope' Could you explain me please? \$\endgroup\$bigTree– bigTree2014年05月24日 14:27:44 +00:00Commented May 24, 2014 at 14:27
-
1\$\begingroup\$ @bigTree: By doing so, it'll be easier to tell what has access to a variable. This is mostly a maintenance thing. For instance, a variable created in a function is in local scope and exists only within that function. While scope is not a large issue for this small program, it's still a useful thing to know about. \$\endgroup\$Jamal– Jamal2014年05月24日 14:30:12 +00:00Commented May 24, 2014 at 14:30
otherBlank
eitherstdbool
'sbool
or atleastint
with 1/0 perhaps#define
d toTRUE
/FALSE
(old school) \$\endgroup\$