One of my colleagues told me that you can use streambuffer
for std::string
, and instead of using string id
in the for
loop, you can declare it outside.
I am not sure whether this will make any difference or not.
void getRecord(uint64_t currentTime, const std::vector<uint32_t> &shards) {
try {
currentTime = (currentTime / 1000) * 1000;
uint64_t start = (currentTime / (60 * 60 * 1000 * 24)) % 3;
string query;
Cassandra::Result result;
std::string base(boost::lexical_cast<std::string>(currentTime) + ".");
for (std::vector<uint32_t>::const_iterator it = shards.begin() ; it != shards.end(); ++it) {
string id = base + boost::lexical_cast<std::string>(*it); // use stream buffer
query = "select name, value from keyspace.hour_"+boost::lexical_cast<std::string>(start)+" where id ='"+id+"';";
result = Cassandra::execute_query(query);
}
}
}
I am coming from a Java background, so I'm not sure which one will be more efficient compared to the other, as C++ is all about memory allocation. Is there any way to optimize the above code, even slightly?
For me what I am trying to optimize is the other part leaving Cassandra query part as it is. I know there might be few minimal optimization that I am supposed to make but if it is worth doing it, then I might do it now then later figuring it out.
Below is my full code. Again, I am mainly worried about my other string manipulation instead of Cassandra query part - I have removed some extra code which was not needed.
void getData(uint64_t currentTime, const std::vector<uint32_t> &shards) {
uint16_t id;
uint64_t lmd;
uint32_t length;
const char* binary_value;
string column_key;
string bt;
uint64_t user_id;
string colo;
bool flag = false;
try {
currentTime = (currentTime / 1000) * 1000;
uint64_t start = (currentTime / (60 * 60 * 1000 * 24)) % 3;
string query;
Cassandra::Result result;
std::string base(boost::lexical_cast<std::string>(currentTime) + ".");
for (std::vector<uint32_t>::const_iterator it = shards.begin() ; it != shards.end(); ++it) {
string id = base + boost::lexical_cast<std::string>(*it); // use stream buffer
query = "select name, value from keyspace.hour_"+boost::lexical_cast<std::string>(start)+" where id ='"+id+"';";
result = Cassandra::execute_query(query);
while (result && result.value()->next()) {
char* key = 0;
for (size_t i = 0; i < result.value()->column_count(); ++i) {
cql::cql_byte_t* data = NULL;
cql::cql_int_t size = 0;
result.value()->get_data(i, &data, size);
if (!flag) {
key = reinterpret_cast<char*>(data);
flag = true;
} else {
int index=0;
id = Mgr::get_uint16((&data[index]));
index += 2;
lmd = Mgr::get_uint64((&data[index]));
index += 8;
length = Mgr::get_uint32((&data[index]));
index += 4;
binary_value = reinterpret_cast<const char *>(&data[index]);
flag = false;
}
}
if(key) {
vector<string> res;
char* p;
char* totken = strtok_r(key, ".", &p);
while(totken != NULL)
{
res.push_back(totken);
totken = strtok_r(NULL, ".", &p);
}
column_key= res[0];
bt= res[1];
user_id= atoi(res[2].c_str());
colo = res[3];
}
}
}
} catch (const std::exception& ex) {
}
}
1 Answer 1
This looks like incomplete code to me.... what do you do with result
once you have it?
Still, I look at what your code does, and can't help but think your priorities are wrong in this instance.....
sure, streambuffer
may hypothetically be faster than your String manipulation, but, really, who cares? (Note: I say 'who cares' from the context of the work you are doing).
Running any database query is going to take much, much longer than the string concatenation operations you are doing.....
When Cassandra gets your query, it is going to strip it down to components, analyze them for validity, relate them to database structures, optimize the query to use the best possible access plans, and then execute the query....
I would guess that the work you are doing in your method is probably 'orders of magnitude' less than what Cassandra is going to do...
In this instance, for your code specifically, I would recommend that you use the most readable and maintainable mechanisms. If you want to use streambuffer
s then do it, but whatever you do the motivation should not, in this case, be because of performance.
Read up on Amdahl's Law ... ;-)
-
\$\begingroup\$ For me what I am trying to optimize is the other part leaving Cassandra query part as it is.. I know there might be few minimal optimization that I am supposed to make but if it is worth doing it, then I might do it now then later figuring it out such as the way I am doing for string manipulation and other minor things if any.. \$\endgroup\$david– david2013年12月04日 20:10:57 +00:00Commented Dec 4, 2013 at 20:10
-
\$\begingroup\$ I have updated the question with more details..!! \$\endgroup\$david– david2013年12月04日 20:40:15 +00:00Commented Dec 4, 2013 at 20:40
Cassandra::execute_query(query);
by many orders of magnitude. \$\endgroup\$