Programming noob going through Kernighan and Ritchie's "The C Programming Language" for the first time. There are some exercises right at the start that require the use of EOF triggers.
I'm on Linux, where an EOF trigger is supposedly sent with Ctrl+d, but this just exits the program when being tested, or closes the terminal otherwise. I know that is how bash handles EOF triggers, but how else can I properly test these programs?
4 Answers
As mentioned in the comments, control-d (on an empty input buffer) will signal EOF for STDIN.
You said:
Ctrl+d ... just exits the program when being tested, or closes the terminal
Control-d issued to a shell will exit that shell. But it's not like EOF forces a program to exit, it just tells the program that it reached the end of the input file. The program has to decide what to do with that. As you suspected, the shell program was written to exit on EOF. If that shell is at the top level of a terminal window, it might close that window, depending on the terminal program.
Same with other programs. For example:
$ cat
^d
$
The cat program was reading from standard input. I hit control-d, which signalled EOF on standard input. cat was written to exit when it sees EOF.
So, how might we want to test cat? Maybe three test cases:
- A full input line, followed by EOF.
- A partial input line that does not include a newline, followed by EOF.
- No input at all, just EOF right away.
You could do these tests interactively from the keyboard:
$ cat
abc <-- I typed 'abc<return>'
abc <-- this is printed by cat
$ cat
abcabc$ <-- I typed 'abc^d'; cat printed 'abc'; I typed '^d'; cat exited; shell printed '$'
$ cat
$ <-- I typed '^d'; shell printed '$'
Or I could do the same three tests in a shell script ("x.sh") to automate the testing:
#!/bin/sh
echo test 1
echo "abc" | cat
echo test 2
echo -n "abc" | cat # echo -n suppresses the final newline
echo test 3
cat /dev/null | cat
Here's its output:
$ ./x.sh
test 1
abc
test 2
abctest 3
$
Hard to interpret. Let's try again, modifying the shell script to capture cat's output:
#!/bin/sh
echo test 1
echo "abc" | cat >test1.out
hexdump -C test1.out
echo test 2
echo -n "abc" | cat >test2.out
hexdump -C test2.out
echo test 3
cat /dev/null | cat >test3.out
ls -s test3.out
Here's the output:
$ ./x.sh
test 1
00000000 61 62 63 0a |abc.|
00000004
test 2
00000000 61 62 63 |abc|
00000003
test 3
0 test3.out
$
This shows you exactly the output from cat for the first two test cases. The final ls -s shows zero bytes in the file.
On Linux, Ctrl-D (pressed twice if the current line isn't empty) does indeed make reads on stdin report end of file, so getchar() returns EOF:
#include <stdio.h>
int main(void) {
    int ch = getchar();
    if (ch == EOF) puts("EOF");
    else printf("%c\n", ch);
    return 0;
}
and example runs:
$ ./a.out
<ctrl-d>
EOF
$ ./a.out < /dev/null
EOF
$ ./a.out
1<enter>
1
short answer:
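#include <stdio.h>
int main(void) {
    int a;
    /* scanf returns the number of items converted, or EOF at end of input.
       Comparing with 1 also stops on non-numeric input, which a test
       against EOF alone would loop on forever. */
    while (scanf("%d", &a) == 1) {
        // Do code
    }
    return 0;
}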
Change the scanf format specifier to match your data, or modify the way you read input (maybe using a buffer or something else), according to your code.
For testing, run this in the terminal:
gcc -o your_program your_program.c
echo -e "1\n2\n3\n4\n5" > input.txt
./your_program < input.txt
There's a lot of confusion about EOF and the end of a file. There is no such thing as an EOF character.
On UNIX systems, since the seventies, when read(2) is called at the end of a file, it simply returns 0 as the number of characters read, without blocking or waiting for more data to become available. So no special character marks the end of a file, unlike systems such as CP/M or VMS, in which Control-Z was used to indicate where in the last block the file actually ended.
It is the UNIX tty driver that interprets the Ctrl-D character as a special control. It unblocks any read that is waiting for input and delivers to the program exactly the characters that were in the buffer before the Ctrl-D was received; the Ctrl-D itself is discarded.
The tty driver unblocks pending read requests when either of two special characters is input. The first is the newline character (normally an ASCII CR from the Return key, which the tty converts into an ASCII '\n' newline in canonical mode): the complete buffer is returned to the program, including the newline. The second is Ctrl-D: the same thing happens, but the Ctrl-D is excluded because it is interpreted and consumed by the driver. As a result, the current input buffer is passed to the program without the control character that triggered the delivery, and on an empty buffer this can result in no characters read at all, just as when a file has no more characters in it. How the terminal interprets and processes characters in canonical mode (line mode, or non-raw mode) is described in the termios(3) man page, which you can read in the online manual on all POSIX systems.
This means, for example, that you can escape a Ctrl-D (with Ctrl-V on today's systems, or with '\' on old Unices) and include it in the input stream as a normal character passed to the program. Or you can press it when the input buffer is empty, in which case the read returns no characters at all (the read() system call returns 0, indicating that 0 characters were read). This is the convention for indicating end of input: zero characters in the input buffer.
Because a single Ctrl-D is not an end of file when characters were already present in the input buffer, some programs do a second read() and signal end of input only when two consecutive reads return zero characters. This happens in the program itself: it is how the program chooses to interpret two consecutive empty reads. For example, the shell will accept a command line that does not end in a \n, so only after two empty command lines (truly empty, that is, without even the newline) will it recognize an EOF.
So the net result is that EOF is not a thing in itself, but a convenience protocol used to detect that the end of input has arrived. There is no special character that indicates EOF (the terminal uses a control character to simulate something that most programs interpret as the end of input), it is not permanent (a later read can return a full input buffer), and there is no special state indicating that input has been exhausted. If you make two consecutive reads (or just one) that return nothing, you can assume you have reached the end of input. But beware: some programs read a third time, and after n empty reads you can get new data again. For example, a tty returns the available data (say, nothing) immediately after receiving a Ctrl-D, but the next read on the device can block until more data is available, and you will then read data that arrived after EOF was detected.
echo "hi" | yourprogram puts end-of-file after hi. Similarly, yourprogram </dev/null means that stdin is in an EOF condition as of the very first attempt to read. Or if you create a file sample-input.txt and run ./yourprogram <sample-input.txt, then an EOF condition takes place after all the content in the file has been consumed. Etc, etc.