1
\$\begingroup\$

I am a C# developer and don't know much about VC++ or C++; I have never used either. For various reasons I have decided to use a C++ DLL in my app for downloading content from the web.

I don't want to use WebClient.

I want it to download HTML content given the URL of the resource; the DLL will return the response as a string.

This is the code I have so far (source):

#include <cstdlib>
#include <cstring>
#include <iostream>
#include <string>
#include <winsock.h> // remember to add wsock32.lib to the linker dependencies
using namespace std;
#define BUFFERSIZE 1024
void die_with_error(const char *errorMessage);
void die_with_wserror(const char *errorMessage);
int main(int argc, char *argv[])
{
 string request;
 string response;
 int resp_leng;
 char buffer[BUFFERSIZE];
 struct sockaddr_in serveraddr;
 SOCKET sock; // Winsock sockets are unsigned handles, not ints
 WSADATA wsaData;
 const char *ipaddress = "208.109.181.178";
 int port = 80;
 request+="GET /test.html HTTP/1.0\r\n";
 request+="Host: www.zedwood.com\r\n";
 request+="\r\n";
 //init winsock (WSAStartup returns its error code directly)
 if (WSAStartup(MAKEWORD(2, 0), &wsaData) != 0)
  die_with_error("WSAStartup() failed");
 //open socket
 if ((sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) == INVALID_SOCKET)
  die_with_wserror("socket() failed");
 //connect
 memset(&serveraddr, 0, sizeof(serveraddr));
 serveraddr.sin_family = AF_INET;
 serveraddr.sin_addr.s_addr = inet_addr(ipaddress);
 serveraddr.sin_port = htons((unsigned short) port);
 if (connect(sock, (struct sockaddr *) &serveraddr, sizeof(serveraddr)) == SOCKET_ERROR)
  die_with_wserror("connect() failed");
 //send request
 if (send(sock, request.c_str(), (int) request.length(), 0) != (int) request.length())
  die_with_wserror("send() sent a different number of bytes than expected");
 //get response
 response = "";
 resp_leng = BUFFERSIZE;
 while (resp_leng == BUFFERSIZE)
 {
  resp_leng = recv(sock, buffer, BUFFERSIZE, 0);
  //buffer is not null-terminated, so append exactly the bytes received
  if (resp_leng > 0)
   response.append(buffer, resp_leng);
  //note: download lag is not handled; the first short read ends the loop
 }
 //display response
 cout << response << endl;
 //disconnect
 closesocket(sock);
 //cleanup
 WSACleanup();
 return 0;
}
//print a plain error message and exit
void die_with_error(const char *errorMessage)
{
 cerr << errorMessage << endl;
 exit(1);
}
//print an error message with the last Winsock error code and exit
void die_with_wserror(const char *errorMessage)
{
 cerr << errorMessage << ": " << WSAGetLastError() << endl;
 exit(1);
}

The instructions that came with the code say that it will end an HTTP download 'early' if the download is anything slower than instant.
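
To make the 'early' ending concrete, here is a minimal sketch that compares a loop that stops on the first partial read with one that reads until the peer closes the connection. It does not touch the network: `FakeSocket` and `fake_recv` are hypothetical stand-ins for a socket and `recv()`, invented here only to simulate data arriving in pieces smaller than the buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a socket: delivers at most max_chunk bytes per
// call, the way recv() may return fewer bytes than the buffer can hold.
struct FakeSocket {
    std::string data;       // bytes the "server" will send
    std::size_t pos;        // bytes delivered so far
    std::size_t max_chunk;  // largest delivery per call (simulates slow arrival)
};

// Mimics recv(): returns the number of bytes copied, or 0 once everything
// has been delivered (the peer has closed the connection).
int fake_recv(FakeSocket &s, char *buf, int len) {
    assert(len > 0);
    std::size_t n = std::min({s.data.size() - s.pos, s.max_chunk,
                              static_cast<std::size_t>(len)});
    std::memcpy(buf, s.data.data() + s.pos, n);
    s.pos += n;
    return static_cast<int>(n);
}

// Loop shaped like the one in the question: it continues only while every
// read fills the whole buffer, so the first partial read ends the download.
std::string read_until_short(FakeSocket &s) {
    char buf[1024];
    std::string out;
    int n = static_cast<int>(sizeof buf);
    while (n == static_cast<int>(sizeof buf)) {
        n = fake_recv(s, buf, static_cast<int>(sizeof buf));
        if (n > 0) out.append(buf, n);
    }
    return out;
}

// Alternative loop: keeps reading until the read function reports 0 bytes
// (or an error), regardless of how small each individual read was.
std::string read_until_closed(FakeSocket &s) {
    char buf[1024];
    std::string out;
    int n;
    while ((n = fake_recv(s, buf, static_cast<int>(sizeof buf))) > 0)
        out.append(buf, n);
    return out;
}
```

With a 3000-byte response delivered 500 bytes at a time, `read_until_short` returns only the first 500 bytes, while `read_until_closed` returns all 3000. Reading until the connection closes is only sufficient for a simple HTTP/1.0 exchange like the one in the question; it says nothing about the rest of the protocol.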

Please suggest better code, or modifications, that would make it suitable for use in a fast crawler.

asked Jul 27, 2011 at 4:05
\$\endgroup\$
1
  • 7
    \$\begingroup\$ Your implementation is completely broken with respect to HTTP compliance. You might be lucky if it works in more than one case. Implementing HTTP is not trivial. Use WebClient or any other well-tested HTTP library. Furthermore, your performance will not magically increase by switching from C# to C++. \$\endgroup\$ Commented Jul 27, 2011 at 4:36

1 Answer

7
\$\begingroup\$

The above implementation has little to no chance of working reliably. HTTP is quite a complex protocol, and the code handles none of that complexity.
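
As a rough illustration of where that complexity starts, the sketch below splits a raw response into status code, headers, and body. The names `HttpResponse` and `parse_response` are hypothetical, made up for this example. It covers only the simplest case and deliberately ignores things a real client has to deal with, such as chunked transfer encoding, redirects, compressed bodies, and persistent connections.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical container for the pieces of a response.
struct HttpResponse {
    int status;                                  // e.g. 200, 404
    std::map<std::string, std::string> headers;  // names lower-cased
    std::string body;                            // everything after the blank line
    bool ok;                                     // false if the head was incomplete
};

// Splits "status line + headers + blank line + body". Simplest case only.
HttpResponse parse_response(const std::string &raw) {
    const std::size_t npos = std::string::npos;
    HttpResponse r{0, {}, "", false};

    // The head ends at the first empty line.
    std::size_t head_end = raw.find("\r\n\r\n");
    if (head_end == npos) return r;  // headers not fully received
    r.body = raw.substr(head_end + 4);
    std::string head = raw.substr(0, head_end);

    // Status line: "HTTP/1.0 200 OK" -> the code follows the first space.
    std::size_t line_end = head.find("\r\n");
    std::string status_line = head.substr(0, line_end);
    std::size_t sp = status_line.find(' ');
    if (sp == npos) return r;
    r.status = std::atoi(status_line.c_str() + sp + 1);

    // Remaining lines are "Name: value" pairs; header names are case-insensitive.
    while (line_end != npos) {
        std::size_t start = line_end + 2;
        line_end = head.find("\r\n", start);
        std::string line =
            head.substr(start, line_end == npos ? npos : line_end - start);
        std::size_t colon = line.find(':');
        if (colon == npos) continue;  // skip malformed lines
        std::string name = line.substr(0, colon);
        for (char &c : name)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        std::size_t v = line.find_first_not_of(" \t", colon + 1);
        r.headers[name] = (v == npos) ? "" : line.substr(v);
    }
    r.ok = true;
    return r;
}
```

Even this small amount of parsing is something the code in the question skips entirely: it prints status line, headers, and body as one undifferentiated string.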

My advice is: don't waste time and energy on this. Use a high-level library.

Also, using low-level code will NOT improve the performance of fetching a document via HTTP. Even if your implementation were blazing-fast, 100% polished assembly code, you would still have to wait centuries (from the CPU's point of view) for the data to arrive from the network.
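
A back-of-the-envelope sketch of that wait, using assumed figures that are not measurements from the question: a 3 GHz clock and a 50 ms network round trip. The helper name `cycles_spent_waiting` is hypothetical.

```cpp
#include <cstdint>

// Number of clock cycles that elapse while waiting wait_ms milliseconds
// on a CPU running at clock_hz cycles per second.
std::uint64_t cycles_spent_waiting(std::uint64_t clock_hz, std::uint64_t wait_ms) {
    return clock_hz / 1000 * wait_ms;  // cycles per millisecond times milliseconds
}
```

Under those assumptions, `cycles_spent_waiting(3000000000ULL, 50)` gives 150,000,000 cycles for a single round trip, which dwarfs any saving from hand-tuning the code that issues the request.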

answered Jul 27, 2011 at 6:29
\$\endgroup\$
