Yesterday I posted a request for a code review. After reviewing the code and the reviews, and after some sleep, I decided I can do better. In fact I logged on with the intent of deleting the question just as the first review was posted; so I was stuck with it.
This is pretty much the same program, but it is the result of much more effort. It even includes a thing I've never used before: select
. To use the program, and the previous program, the user provides two arguments: the target IP and the target port. Contrary to the last program, this program terminates with ctrlc.
#include <signal.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#define BUFLEN 1024
#define STDIN 0
#define STDOUT 1
void sigint_handle(int dummy);
int connect_to(char *host, int port);
int transfer(int fd_in, char *buf, int buf_len, int fd_out);
// boolean flag for program continuation
volatile int run = 1;
// entry point of linux ELF
int main(int args, char **argv) {
int sd, ret_val = 0;
fd_set fds;
struct timeval tv;
char buffer[BUFLEN];
assert(args == 3);
signal(SIGINT, sigint_handle);
sd = connect_to(argv[1], atoi(argv[2]));
FD_ZERO(&fds);
tv.tv_sec = 0;
tv.tv_usec = 10000;
while (run) {
FD_SET(STDIN, &fds);
FD_SET(sd, &fds);
ret_val = select(sd + 1, &fds, NULL, NULL, &tv);
if (FD_ISSET(STDIN, &fds))
ret_val = transfer(STDIN, buffer, BUFLEN, sd);
else if (FD_ISSET(sd, &fds))
ret_val = transfer(sd, buffer, BUFLEN, STDOUT);
assert(ret_val == 0);
}
close(sd);
return 0;
}
// SIGINT handler
void sigint_handle(int dummy) {
run = 0;
}
// a very picky connect to tcp host function
int connect_to(char *host, int port) {
struct sockaddr_in addr;
int sd = socket(AF_INET, SOCK_STREAM, 0);
memset(&addr, 0, sizeof(addr));
assert(sd > -1);
addr.sin_family = AF_INET;
assert(inet_pton(AF_INET, host, &addr.sin_addr) > 0);
addr.sin_port = htons(port);
assert(connect(sd, (struct sockaddr *)&addr, sizeof(addr)) > -1);
return sd;
}
// transfer function - basically a one time pipe
int transfer(int fd_in, char *buf, int buf_len, int fd_out) {
int len = read(fd_in, buf, buf_len);
return len - write(fd_out, buf, len);
}
I should note it's not really meant to be user friendly. It's meant to do one specific task as a subprocess, and panics otherwise; again using ctrlc to quit.
import subprocess
import signal
import os
import socket
def dostuff():
host = socket.gethostbyname("google.com")
proc = subprocess.Popen(["./client " + host + " 80"],
stdout=subprocess.PIPE, shell=True,
stdin=subprocess.PIPE)
global pid
pid = proc.pid
proc.stdin.write("GET /\n\n")
while proc.poll() is None:
print proc.stdout.readline()
if __name__ == "__main__":
try: dostuff()
except: os.kill(pid, signal.SIGINT)
5 Answers 5
Detect closed connection 1
select
promises that aread
wouldn't block. This is a case when there are data, but this is also a case that the remote end is closed. In this caseread
immediately returns 0. Without testing forlen > 0
intransfer
end up in an infinite emptyselect/read
cycle.Detect closed connection 2
The remote may close connection while you are attempting to write. This would result in
SIGPIPE
most likely killing the program. You need a handler, as simple as setting a flag to be tested in the main loop.assert
is for detecting bugs. Is not a way to handle errors.
-
\$\begingroup\$ To expand a bit on
assert
: It is a form of sanity check, you should use it for expressions that should always be true, no matter the environment. See post on SO for more info stackoverflow.com/questions/8114008/… \$\endgroup\$jacwah– jacwah2015年07月01日 10:13:20 +00:00Commented Jul 1, 2015 at 10:13
A few notes:
As Jamal noted in the comments, your definitions of
STDIN
andSTDOUT
are somewhat useless. You can see here that using the givenSTDIN_FILENO
andSTDOUT_FILENO
are already given those values. To counter your claim in the comments that they help readability, I would argue that they actually clutter up your code and should be removed.Declaring
run
as a global variable isn't the best way to use signal handlers IMO. I would consider just testing the return value fromsignal
forSIGINT
in your while loop and quitting if that becomes true. I haven't tested this, but you shouldn't even need asighandler_t
for that.The typical way that
main()
is written is as such:int main(int argc, char **argv)
Note the
argc
, not theargs
. This actually stands for something ("argument count"), just likeargv
has a meaning as well ("argument vector").All declarations should be on their own line. From Code Complete, 2d Edition, p. 759:
With statements on their own lines, the code reads from top to bottom, instead of top to bottom and left to right. When you’re looking for a specific line of code, your eye should be able to follow the left margin of the code. It shouldn’t have to dip into each and every line just because a single line might contain two statements.
Always initialize your values. Right now, you've only initialized one value to
0
. Initialize them all. This is to safeguard against some possible weird bugs.Don't
assert
anything that has to do with user input. Assertions are to be used to indicate the programmer's errors, not errors of the users. Instead, test if the input is what you'd like, and if it's not then print a message telling the user what they did wrong and return an error code.Use braces with your
if-else
conditionals please. See the Applegoto
fail as for a good reason why.
-
1\$\begingroup\$ I didn't even notice the
args
. It's uncommon and might break existing code. Good catch. \$\endgroup\$Jamal– Jamal2015年06月26日 01:07:44 +00:00Commented Jun 26, 2015 at 1:07
This may not be useful for accounting for invalid command line arguments:
assert(args == 3);
It's more common to inform the user of the correct arguments upon failure. This message should be printed to stderr
and the program should terminate with some valid error value.
Here's an example of this:
if (argc != 3)
{
fprintf(stderr, "usage: %s arg1 arg2", argv[0]);
exit(EX_USAGE);
}
(EX_USAGE
is part of <sysexits.h>
.)
-
\$\begingroup\$ I don't really find such error codes useful but 0 and 1 or
EXIT_{SUCCESS,FAILURE}
. First, the message is what matters, as you noted, and often error codes aren't considered unless you set up your shell to show non-success ones. Second,sysexits.h
is not portable. \$\endgroup\$edmz– edmz2015年06月26日 10:02:35 +00:00Commented Jun 26, 2015 at 10:02
You are assuming that the socket and the console are immediately writeable, so your program can get blocked if that is ever not the case. I'd use the "write" set of fds in the select
call to test the waters, my answer over here has a bit of example code.
This can also be used to connect asynchronously -- you'd wait for the socket fd to become writeable, then you know that the connection is established. If the connection fails, the fd becomes readable, and you get an error for actually trying to read.
A few other points:
In function declarations, parameters need not have names because the compiler only needs to know what types such function may be called with.
void sigint_handle(int dummy); int connect_to(char *host, int port); int transfer(int fd_in, char *buf, int buf_len, int fd_out);
In
volatile int run = 1
,int
is not totally a good choice. The type guaranteed to be able to be accessed atomically -across asynchronous interrupts, for instance- issig_atomic_t
. See this answer, in particular:(§7.14 , paragraph 2) The type defined is sig_atomic_t, which is the (possibly volatile-qualified) integer type of an object that can be accessed as an atomic entity, even in the presence of asynchronous interrupts. (§7.14 , paragraph 2)
(§7.14p5) If [a] signal occurs other than as the result of calling the abort or raise function, the behavior is undefined if the signal handler refers to any object with static storage duration other than by assigning a value to an object declared as
volatile sig_atomic_t
.struct sockaddr_in addr; memset(&addr, 0, sizeof(addr))
is not really needed because you want to set each field of the struct to zero, not each byte which may also include padding ones. Simply dostruct sockaddr_in addr = {0}
.This rearrangement looks better and more solid
addr.sin_family = AF_INET; addr.sin_port = htons(port); assert(inet_pton(addr.sin_family, host, &addr.sin_addr) > 0);
Watch out for
assert
: as it has been said, it should not be used for error handling but for debugging bugs.assert
can, indeed, do nothing (and so hide bugs) ifNDEBUG
is defined!
-
1\$\begingroup\$ I think it can still be useful to include parameter names in declarations. It's not strictly required, but it makes the code more readable. The names might be just enough for someone reading the code to understand the parameters.
int transfer(int, char *, int, int);
Which of theint
s was the buffer length again? \$\endgroup\$jacwah– jacwah2015年07月01日 11:51:51 +00:00Commented Jul 1, 2015 at 11:51 -
\$\begingroup\$ That's true, @jacwah. Indeed, I said compilers don't need them; humans often do, alternatively to in-place documentation about the function itself and parameters thereof. \$\endgroup\$edmz– edmz2015年07月01日 13:20:27 +00:00Commented Jul 1, 2015 at 13:20
-
\$\begingroup\$ If you agree I don't see the point of that paragraph. The implied message, as I read it, is that OP should omit parameter names in forward declarations. \$\endgroup\$jacwah– jacwah2015年07月01日 13:45:12 +00:00Commented Jul 1, 2015 at 13:45
STDIN
andSTDOUT
? \$\endgroup\$stdin
andstdout
already have the same file descriptor values. \$\endgroup\$