I am wondering about the feasibility of the following basic implementation of a server and how well it would scale. I know that large-scale, distributed servers should probably be written in a language like Erlang, but I'm interested in the viability of the following code "these days".
Other than bugs/issues I'd primarily like to know 3 things:
- The C headers contain many compatibility functions, structs, etc., some of which do similar things. Is this a correct "modern" way to handle incoming IPv4 and IPv6 connections?
- How scalable is it? If I have a single VPN and don't need a distributed server, is it adequate for today's applications? (Potentially thousands/millions of concurrent connections? I appreciate the latter would also be very much hardware-dependent!)
// SimpleCServer.c
// Adapted from http://beej.us/guide/bgnet/output/print/bgnet_A4.pdf
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <pthread.h>
#include <stdint.h> // intptr_t, used when passing the descriptor to the worker thread
// The port users will be connecting to
#define PORT "12345"
// Prototype for processing function
void *processRequest(void *sdPtr);
// Get sockaddr, IPv4 or IPv6
void *get_in_addr(struct sockaddr *sa) {
if (sa->sa_family == AF_INET) {
return &(((struct sockaddr_in*)sa)->sin_addr);
}
return &(((struct sockaddr_in6*)sa)->sin6_addr);
}
int main(int argc, char *argv[]) {
// Basic server variables
int sockfd = -1; // Listen on sock_fd
int new_fd; // New connection on new_fd
int yes=1;
int rv;
struct addrinfo hints, *servinfo, *p;
struct sockaddr_storage their_addr; // connector's address information
socklen_t sin_size;
char s[INET6_ADDRSTRLEN];
// pthread variables
pthread_t workerThread; // Worker thread
pthread_attr_t threadAttr; // Set up detached thread attributes
pthread_attr_init(&threadAttr);
pthread_attr_setdetachstate(&threadAttr, PTHREAD_CREATE_DETACHED);
// Server hints
memset(&hints, 0, sizeof hints);
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;
hints.ai_flags = AI_PASSIVE; // use my IP
if ((rv = getaddrinfo(NULL, PORT, &hints, &servinfo)) != 0) {
fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
return 1;
}
// Loop through all the results and bind to the first we can
for(p = servinfo; p != NULL; p = p->ai_next) {
if ((sockfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol)) == -1) {
perror("server: socket");
continue;
}
if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(int)) == -1) {
perror("setsockopt");
exit(2);
}
if (bind(sockfd, p->ai_addr, p->ai_addrlen) == -1) {
perror("server: bind"); // Report first: close() may overwrite errno
close(sockfd);
continue;
}
break;
}
if (p == NULL) {
fprintf(stderr, "server: failed to bind\n");
return 3;
}
// All done with this structure
freeaddrinfo(servinfo);
// SOMAXCONN - Maximum queue length specifiable by listen. (128 on my machine)
if (listen(sockfd, SOMAXCONN) == -1) {
perror("listen");
exit(4);
}
printf("server: waiting for connections...\n");
// Main accept() loop
while (1) {
// Accept
sin_size = sizeof their_addr;
new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
if (new_fd == -1) {
perror("accept");
continue;
}
// Get IP Address for log
inet_ntop(their_addr.ss_family, get_in_addr((struct sockaddr *)&their_addr), s, sizeof s);
printf("server: got connection from %s\n", s);
// Process the request on a new thread. Spawn (detaching) worker thread
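rv = pthread_create(&workerThread, &threadAttr, processRequest, (void *)((intptr_t)new_fd));
if (rv != 0) {
// pthread_create returns the error number instead of setting errno
fprintf(stderr, "pthread_create: %s\n", strerror(rv));
close(new_fd); // No worker will own this descriptor, so close it here
}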
}
return 0;
}
void *processRequest(void *sdPtr) {
int sd = (int)(intptr_t)sdPtr; // Undo the intptr_t cast made in main()
fprintf(stderr, "Processing fd: %d\n", sd);
// Processing goes here
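FILE *fpIn = fdopen(sd, "r");
FILE *fpOut = fdopen(dup(sd), "w"); // Separate descriptor so each stream owns (and closes) its own
fprintf(fpOut, "Processing fd %d on server.", sd);
fflush(fpOut);
//
fclose(fpIn); // Also closes sd
fclose(fpOut); // Also closes the dup()ed descriptor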
return NULL;
}
1 Answer
First of all, each connection consumes a local port. Therefore, the number of concurrent connections is hard-limited by the range of an unsigned short, that is, 65536 (so millions are out of the question). There are also other limitations you may or may not care about.
Second, thread creation is somewhat expensive. Consider pre-allocating a thread pool.
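A pre-allocated pool could look roughly like the following. This is only a minimal sketch: the names `enqueue_fd`, `dequeue_fd`, `start_pool` and `handle_connection`, as well as the sizes `POOL_SIZE` and `QUEUE_CAP`, are made up for illustration and are not part of your code. The accept loop would call `enqueue_fd(new_fd)` instead of `pthread_create`.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define POOL_SIZE 8     /* hypothetical number of worker threads */
#define QUEUE_CAP 1024  /* hypothetical limit on queued connections */

/* Bounded FIFO of accepted descriptors, shared by all workers. */
static int queue[QUEUE_CAP];
static size_t q_head, q_tail, q_count;
static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t q_not_full = PTHREAD_COND_INITIALIZER;

/* Called from the accept() loop; blocks while the queue is full. */
void enqueue_fd(int fd) {
    pthread_mutex_lock(&q_lock);
    while (q_count == QUEUE_CAP)
        pthread_cond_wait(&q_not_full, &q_lock);
    queue[q_tail] = fd;
    q_tail = (q_tail + 1) % QUEUE_CAP;
    q_count++;
    pthread_cond_signal(&q_not_empty);
    pthread_mutex_unlock(&q_lock);
}

/* Called by workers; blocks while the queue is empty. */
int dequeue_fd(void) {
    pthread_mutex_lock(&q_lock);
    while (q_count == 0)
        pthread_cond_wait(&q_not_empty, &q_lock);
    int fd = queue[q_head];
    q_head = (q_head + 1) % QUEUE_CAP;
    q_count--;
    pthread_cond_signal(&q_not_full);
    pthread_mutex_unlock(&q_lock);
    return fd;
}

/* Placeholder for the real per-connection work. */
static void handle_connection(int fd) {
    static const char msg[] = "hello from the pool\n";
    if (write(fd, msg, sizeof msg - 1) < 0)
        perror("write");
    close(fd);
}

static void *worker(void *arg) {
    (void)arg;
    for (;;)
        handle_connection(dequeue_fd());
    return NULL;
}

/* Creates the workers once at startup; returns 0 or a pthread error number. */
int start_pool(void) {
    for (int i = 0; i < POOL_SIZE; i++) {
        pthread_t t;
        int rc = pthread_create(&t, NULL, worker, NULL);
        if (rc != 0)
            return rc;
        pthread_detach(t);
    }
    return 0;
}
```

The bounded queue also gives you back-pressure: when all workers are busy and the queue is full, the accept loop stalls instead of spawning ever more threads.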
Third, the code for reading data from the connection is missing. I assume that the intention is for each thread to issue a recv system call. This may lead to many thousands of outstanding system calls, each consuming kernel resources. Using poll is far more scalable.
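As a rough illustration of the poll approach, here is one pass of a single-threaded event loop that watches the listening socket and all client sockets at once. It is a sketch under assumptions: `poll_once` is a hypothetical helper, the convention that `fds[0]` holds the listening socket is mine, and it simply echoes data back with a blocking `send`, which a real server would want to avoid.

```c
#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* One pass of the loop. fds[0] holds the listening socket (or -1 for none),
 * fds[1..nfds-1] hold client sockets, and the array has room for max_fds.
 * Returns the new number of entries, or -1 if poll() fails. */
int poll_once(struct pollfd *fds, int nfds, int max_fds, int timeout_ms) {
    if (poll(fds, (nfds_t)nfds, timeout_ms) < 0) {
        perror("poll");
        return -1;
    }

    /* Handle clients first so a freshly accepted socket is not read
     * before poll() has reported on it. */
    for (int i = 1; i < nfds; ) {
        if (fds[i].revents == 0) {
            i++;
            continue;
        }
        char buf[512];
        ssize_t n = recv(fds[i].fd, buf, sizeof buf, 0);
        if (n > 0) {
            /* Echo back; a real server would parse the request here. */
            if (send(fds[i].fd, buf, (size_t)n, 0) < 0)
                perror("send");
            fds[i].revents = 0;
            i++;
        } else {
            /* Peer closed or error: drop the entry by moving the last
             * one into its slot, then re-examine slot i. */
            close(fds[i].fd);
            fds[i] = fds[--nfds];
        }
    }

    /* New connection pending on the listening socket? */
    if ((fds[0].revents & POLLIN) && nfds < max_fds) {
        int c = accept(fds[0].fd, NULL, NULL);
        if (c >= 0) {
            fds[nfds].fd = c;
            fds[nfds].events = POLLIN;
            fds[nfds].revents = 0;
            nfds++;
        }
    }
    return nfds;
}
```

The caller would put the listening socket in `fds[0]` with `events = POLLIN` and call `poll_once` in a loop. No thread is parked in `recv` per connection; the only blocking call is the single `poll`.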
Finally, a code review. Your main does way too much, and variables are declared too far away from their uses. Consider restructuring. At least two functions (setup_listener_socket and mainloop) should be factored out.
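One possible shape for that split is sketched below. The function names come from the suggestion above, but the signatures (a port string in, a descriptor out; a callback for each accepted connection) are my own assumptions, and the body reuses the getaddrinfo logic from your main.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns a listening socket bound to `port`, or -1 on failure. */
int setup_listener_socket(const char *port) {
    struct addrinfo hints, *servinfo, *p;
    int sockfd = -1;
    int yes = 1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE; /* wildcard address */

    int rv = getaddrinfo(NULL, port, &hints, &servinfo);
    if (rv != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
        return -1;
    }

    /* Use the first result that can be bound and listened on. */
    for (p = servinfo; p != NULL; p = p->ai_next) {
        sockfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (sockfd == -1)
            continue;
        if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) == 0 &&
            bind(sockfd, p->ai_addr, p->ai_addrlen) == 0 &&
            listen(sockfd, SOMAXCONN) == 0)
            break;
        close(sockfd);
        sockfd = -1;
    }

    freeaddrinfo(servinfo); /* freed on every path, including failure */
    return sockfd;
}

/* Accepts connections forever and hands each one to on_connection,
 * which takes ownership of the descriptor. */
void mainloop(int listen_fd, void (*on_connection)(int fd)) {
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd == -1) {
            perror("accept");
            continue;
        }
        on_connection(fd);
    }
}
```

With this, main shrinks to a call to `setup_listener_socket(PORT)`, a check for -1, and a call to `mainloop`, and each variable lives in the function that uses it.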
- Thanks! I am not 100% sure what you mean in the first paragraph, however this SF question seems to indicate more than 65536 connections are possible. I did wonder about thread pools though, I shall go hunting for a decent implementation. – Ephemera, Jun 11, 2014 at 1:11
- "each connection consumes a local port." This is plain wrong! The server listens on exactly one port, that's it. The accepted socket does not occupy any port. What each connection does use is a socket descriptor, which indeed is a limited system resource. – alk, Jun 14, 2014 at 10:25
- Nope. Same local port for all connections. – tmyklebu, Jun 15, 2014 at 2:25
- I stand corrected – vnp, Jun 17, 2014 at 7:24
- +1 for thread pools. You definitely don't want 1 million threads fighting for the scheduler's attention. – jliv902, Jul 10, 2014 at 19:14
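The point settled in the comments (accepted sockets share the listener's local port) is easy to check with getsockname. The following is a small IPv4-only experiment on the loopback interface; `local_port` and `accepted_sockets_share_port` are hypothetical helper names, not anything from the code under review.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Local port of an IPv4 socket in host byte order, or -1 on error. */
int local_port(int fd) {
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
        return -1;
    return ntohs(addr.sin_port);
}

/* Listens on a kernel-chosen loopback port, accepts two connections and
 * returns 1 if both accepted sockets report the listener's local port,
 * 0 if they do not, and -1 if any socket call fails. */
int accepted_sockets_share_port(void) {
    int result = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int c1 = socket(AF_INET, SOCK_STREAM, 0);
    int c2 = socket(AF_INET, SOCK_STREAM, 0);
    int a1 = -1, a2 = -1;

    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel choose a free port */

    if (lfd != -1 && c1 != -1 && c2 != -1 &&
        bind(lfd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        listen(lfd, 2) == 0 &&
        /* Learn which port was chosen so the clients can connect to it. */
        getsockname(lfd, (struct sockaddr *)&addr, &len) == 0 &&
        connect(c1, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        connect(c2, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        (a1 = accept(lfd, NULL, NULL)) != -1 &&
        (a2 = accept(lfd, NULL, NULL)) != -1) {
        int port = local_port(lfd);
        result = (local_port(a1) == port && local_port(a2) == port);
    }

    /* close() on a descriptor that was never opened (-1) just fails. */
    close(a1);
    close(a2);
    close(c1);
    close(c2);
    close(lfd);
    return result;
}
```

Connections are told apart by the full (local address, local port, remote address, remote port) tuple, so the per-process and system-wide descriptor limits are what bound the connection count on the server side, not the 16-bit port range.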