Needed a quick stream to get JSON objects.
#ifndef THORSANVIL_SIMPLE_STREAM_THOR_STREAM_H
#define THORSANVIL_SIMPLE_STREAM_THOR_STREAM_H
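#include <istream>
#include <string>
#include <functional>
#include <algorithm>
#include <mutex>
#include <condition_variable>
#include <vector>
#include <curl/curl.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>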
namespace ThorsAnvil
{
namespace Stream
{
extern "C" size_t writeFunc(char* ptr, size_t size, size_t nmemb, void* userdata);
extern "C" size_t headFunc(char* ptr, size_t size, size_t nmemb, void* userdata);
class IThorStream;
class IThorSimpleStream: public std::istream
{
    friend class IThorStream;
    struct SimpleSocketStreamBuffer: public std::streambuf
    {
        typedef std::streambuf::traits_type traits;
        typedef traits::int_type int_type;
        SimpleSocketStreamBuffer(std::string const& url, bool useEasyCurl, bool preDownload, std::function<void()> markStreamBad)
            : empty(true)
            , open(true)
            , sizeMarked(false)
            , droppedData(false)
            , preDownload(preDownload)
            , sizeLeft(0)
            , markStreamBad(markStreamBad)
        {
            curl = curl_easy_init();
            if(!curl)
            {   markStreamBad();
            }
            curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
            curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
            curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, writeFunc);
            curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, headFunc);
            curl_easy_setopt(curl, CURLOPT_WRITEHEADER, this);
            curl_easy_setopt(curl, CURLOPT_WRITEDATA, this);
            curl_easy_setopt(curl, CURLOPT_PRIVATE, this);
            if (useEasyCurl)
            {
                /* Perform the request, result will get the return code */
                CURLcode result = curl_easy_perform(curl);
                if ((result != CURLE_OK) && (result != CURLE_WRITE_ERROR))
                {   markStreamBad();
                }
            }
        }
        ~SimpleSocketStreamBuffer()
        {
            curl_easy_cleanup(curl);
        }
        virtual int_type underflow()
        {
            if (droppedData)
            {   markStreamBad();
            }
            return EOF;
        }
        protected:
        friend size_t writeFunc(char* ptr, size_t size, size_t nmemb, void* userdata)
        {
            std::size_t bytes = size*nmemb;
            SimpleSocketStreamBuffer* owner = reinterpret_cast<SimpleSocketStreamBuffer*>(userdata);
            std::unique_lock<std::mutex> lock(owner->mutex);
            if ((!owner->empty) && (!owner->preDownload))
            {
                // It's not bad yet.
                // It only becomes bad if the user tries
                // to read any of this data. Then we mark
                // it bad. So the actual marking bad is done
                // in underflow().
                owner->droppedData=true;
                return 0;//CURL_WRITEFUNC_PAUSE;
            }
            owner->empty = false;
            std::size_t oldSize = owner->buffer.size();
            owner->buffer.resize(oldSize + bytes);
            std::copy(ptr, ptr + bytes, &owner->buffer[oldSize]);
            owner->setg(&owner->buffer[0], &owner->buffer[0], &owner->buffer[oldSize + bytes]);
            owner->cond.notify_one();
            if (owner->sizeMarked)
            {
                owner->sizeLeft -= bytes;
                owner->open = (owner->sizeLeft != 0);
            }
            return bytes;
        }
        friend size_t headFunc(char* ptr, size_t size, size_t nmemb, void* userdata)
        {
            if (strncmp(ptr, "HTTP/", 5) == 0)
            {
                int respCode = 0;
                char* space = strchr(ptr+5, ' ');
                if ((space != NULL) && (sscanf(space," %d OK", & respCode) == 1) && (respCode == 200))
                {   /* GOOD */ }
                else
                {
                    SimpleSocketStreamBuffer* owner = reinterpret_cast<SimpleSocketStreamBuffer*>(userdata);
                    std::unique_lock<std::mutex> lock(owner->mutex);
                    owner->markStreamBad();
                }
            }
            if (strncmp(ptr, "Content-Length:", 15) == 0)
            {
                SimpleSocketStreamBuffer* owner = reinterpret_cast<SimpleSocketStreamBuffer*>(userdata);
                std::unique_lock<std::mutex> lock(owner->mutex);
                owner->sizeLeft = atoi(ptr+15);
                owner->sizeMarked = true;
                if (owner->preDownload)
                {
                    owner->buffer.reserve(owner->sizeLeft);
                }
            }
            return size*nmemb;
        }
        bool empty;
        bool open;
        bool sizeMarked;
        bool droppedData;
        bool preDownload;
        std::size_t sizeLeft;
        std::mutex mutex;
        std::condition_variable cond;
        std::vector<char> buffer;
        CURL* curl;
        std::function<void()> markStreamBad;
    };
    SimpleSocketStreamBuffer buffer;
    public:
    IThorSimpleStream(std::string const& url, bool preDownload = false)
        : std::istream(NULL)
        , buffer(url, true, preDownload, [this](){this->setstate(std::ios::badbit);})
    {
        std::istream::rdbuf(&buffer);
    }
};
}
}
#endif
It was inspired by this gist:
https://gist.github.com/Loki-Astari/8201956
Which uses an early version of this code and my JSON library to easily make REST calls to an HTTP endpoint and retrieve the JSON response object directly into an object with no manual parsing. The JSON parsing code has already been reviewed here:
1 Answer
I have a few comments that are unrelated to the synchronous/asynchronous and/or header-only nature of the code.
parameters
bools
I don't like passing bools as parameters. I really dislike a function like your SimpleSocketStreamBuffer constructor that takes multiple bools. You need to do a fair amount of looking to be sure how:
foo x("www.google.com", false, true, bar);
differs from:
foo x("www.google.com", true, false, bar);
(...and the same for the other variations as well). I'd strongly prefer to create an enumeration and have the parameters of that type.
std::function
I don't like passing a std::function as a parameter either. I'd prefer to see a template parameter to specify the function type, and then use a std::function only internally for storage. With those, the constructor looks something like this:
enum class curl_type { easy, hard };
enum class dl_strategy { lazy, greedy };
template <class func>
SimpleSocketStreamBuffer(std::string const& url,
                         curl_type ct,
                         dl_strategy download,
                         func markStreamBad)
Then we modify the remainder to suit:
, preDownload(download == dl_strategy::greedy)
and:
if (ct == curl_type::easy)
// ...
Of course, if you're going to do this, you want to do it throughout (e.g., to the stream definition, not just the stream buffer definition).
Using these, defining an object looks more like this:
foo x("www.google.com", curl_type::easy, dl_strategy::greedy, bar);
...which strikes me as quite a bit more self-explanatory.
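To make the combined change concrete, here is a minimal, curl-free sketch of the pattern. DemoBuffer, its accessors, and demoEnumParameters are hypothetical names invented for illustration; they stand in for SimpleSocketStreamBuffer and are not part of the original code. The callable's type is deduced by the template, and std::function appears only as the storage type.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

enum class curl_type   { easy, hard };
enum class dl_strategy { lazy, greedy };

// Hypothetical stand-in for SimpleSocketStreamBuffer (no curl involved).
class DemoBuffer
{
    public:
        // Func is deduced from the argument; std::function is used only for storage.
        template <class Func>
        DemoBuffer(std::string const& url, curl_type ct, dl_strategy download, Func markStreamBad)
            : url(url)
            , useEasyCurl(ct == curl_type::easy)
            , preDownload(download == dl_strategy::greedy)
            , markStreamBad(std::move(markStreamBad))
        {}

        std::string const& target() const   { return url; }
        bool usesEasyCurl() const           { return useEasyCurl; }
        bool preDownloads() const           { return preDownload; }
        void fail()                         { markStreamBad(); }

    private:
        std::string             url;
        bool                    useEasyCurl;
        bool                    preDownload;
        std::function<void()>   markStreamBad;
};

// Usage: every argument states its meaning at the call site.
inline int demoEnumParameters()
{
    int failures = 0;
    DemoBuffer x("www.google.com", curl_type::easy, dl_strategy::greedy,
                 [&failures]{ ++failures; });
    assert(x.usesEasyCurl());
    assert(x.preDownloads());
    x.fail();           // invokes the stored callback exactly once
    return failures;
}
```

A side benefit of the enumerations is that swapping the two arguments no longer compiles, whereas two swapped bools compile silently.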
userdata pointer
In writeFunc and headFunc, it would probably be best to verify that userdata isn't a null pointer before using it. Once you've verified that it's non-null, I think I'd prefer to define owner as a reference instead of a pointer:
SimpleSocketStreamBuffer & owner = *reinterpret_cast<SimpleSocketStreamBuffer*>(userdata);
This prevents accidentally re-assigning owner to point anywhere else, and (arguably) simplifies the rest of the code a bit by letting you use . instead of -> when you're dealing with the owner object.
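As a rough sketch of the null check followed by the reference binding, the callback below has the same signature as writeFunc but does not depend on curl. DemoOwner, demoWriteFunc, and demoWriteFuncUsage are hypothetical names used only for illustration. I've used static_cast here, which is sufficient for converting from void*; reinterpret_cast as in the original works the same way for this purpose. Returning 0 on a null pointer relies on the caller treating a short count as an error, as libcurl does for its write callback.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SimpleSocketStreamBuffer; only the buffer matters here.
struct DemoOwner
{
    std::vector<char> buffer;
};

// Same signature as the original writeFunc, but nothing here calls into curl.
extern "C" std::size_t demoWriteFunc(char* ptr, std::size_t size, std::size_t nmemb, void* userdata)
{
    if (userdata == nullptr)
    {
        // A count different from size*nmemb signals failure to the caller.
        return 0;
    }
    // Bound once, cannot be re-seated, and members are reached with '.'.
    DemoOwner& owner = *static_cast<DemoOwner*>(userdata);

    std::size_t bytes = size * nmemb;
    owner.buffer.insert(owner.buffer.end(), ptr, ptr + bytes);
    return bytes;
}

// Usage: one normal call and one call with a null userdata pointer.
inline void demoWriteFuncUsage()
{
    char data[] = {'a', 'b', 'c'};
    DemoOwner owner;
    assert(demoWriteFunc(data, 1, 3, &owner) == 3);     // data appended
    assert(owner.buffer.size() == 3);
    assert(demoWriteFunc(data, 1, 3, nullptr) == 0);    // null userdata rejected
}
```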
buffer manipulation
I also don't particularly like the code you used to add data to the end of the buffer:
owner->buffer.resize(oldSize + bytes);
std::copy(ptr, ptr + bytes, &owner->buffer[oldSize]);
I think I'd prefer to just insert the data in a single step:
owner->buffer.insert(owner->buffer.end(), ptr, ptr + bytes);
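To show that the two forms produce the same buffer, here is a small self-contained comparison. The helper names appendByResize, appendByInsert, and demoAppend are hypothetical and exist only for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Two-step append, following the original writeFunc.
inline void appendByResize(std::vector<char>& buffer, char const* ptr, std::size_t bytes)
{
    std::size_t oldSize = buffer.size();
    buffer.resize(oldSize + bytes);                         // value-initializes the new tail first
    std::copy(ptr, ptr + bytes, buffer.begin() + oldSize);  // then overwrites it
}

// Single-step append, as suggested above.
inline void appendByInsert(std::vector<char>& buffer, char const* ptr, std::size_t bytes)
{
    buffer.insert(buffer.end(), ptr, ptr + bytes);
}

// Usage: both helpers leave the vectors with identical contents.
inline bool demoAppend()
{
    char const data[] = {'a', 'b', 'c'};
    std::vector<char> viaResize{'x'};
    std::vector<char> viaInsert{'x'};
    appendByResize(viaResize, data, 3);
    appendByInsert(viaInsert, data, 3);
    return viaResize == viaInsert;
}
```

Either way the vector may reallocate, so any pointers previously handed to setg have to be refreshed afterwards, as the original code already does.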
lambda syntax
There's one other change I'd consider, but I'm a bit less certain about whether I'd actually make it. A lambda that takes no parameters doesn't need to include empty parens. Using this (and taking the preceding changes into account) the ctor for IThorSimpleStream can end up looking like this:
IThorSimpleStream(std::string const& url, dl_strategy s = dl_strategy::lazy)
    : std::istream(NULL)
    , buffer(url, curl_type::easy, s, [this]{this->setstate(std::ios::badbit); })
{
    std::istream::rdbuf(&buffer);
}
Particularly given the degree to which people are undoubtedly accustomed to using empty parens when defining functions, this could lead to some confusion. I suspect when people are more accustomed to lambda syntax, we'll probably view the empty parens as kind of foolish looking, but for now it may be better to leave them there.
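For anyone who wants to experiment, this sketch shows that the two spellings behave identically when stored in a std::function. BadFlag and demoLambdaParens are hypothetical names made up for the example; BadFlag merely mimics the "mark the stream bad" callback.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper that hands out a "mark bad" callback in both spellings.
struct BadFlag
{
    bool bad = false;

    // With the empty parameter list, as in the original code.
    std::function<void()> withParens()      { return [this]() { bad = true; }; }

    // Without it: the parameter list is simply omitted.
    std::function<void()> withoutParens()   { return [this]   { bad = true; }; }
};

// Usage: both callbacks set the flag when invoked.
inline bool demoLambdaParens()
{
    BadFlag a;
    BadFlag b;
    a.withParens()();
    b.withoutParens()();
    return a.bad && b.bad;
}
```

One caveat: at least before C++23, the parens can only be dropped when the lambda has nothing else after the capture list, so adding mutable or a trailing return type brings them back.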
Thanks, I like all those. I did not know about dropping the empty param (not sure I like it (might get used to) but some experimenting is in order). - Loki Astari, Mar 2, 2014 at 8:05
#include the header file no further requirements on libraries required.