I wrote a piece of code that reads and writes multiple data structures to a file using C++. I would be grateful for your feedback on the code (it works, at least when I tested it). Thanks.
#include "stdafx.h"
#include <iostream>
#include <fstream>
#include <sstream>
using namespace std;

// data structure to be written to a file
struct WebSites
{
    char SiteName[100];
    int Rank;
}s1,s2,s3,s4;

int _tmain(int argc, _TCHAR* argv[])
{
    strcpy(s1.SiteName, "www.ppp.com");
    s1.Rank = 0;
    strcpy(s2.SiteName, "www.rrr.com");
    s2.Rank = 111;
    strcpy(s3.SiteName, "www.code.com");
    s3.Rank = 123;
    strcpy(s4.SiteName, "www.yahoo.com");
    s4.Rank = 14;

    // write
    fstream binary_file("c:\\test.dat",ios::out|ios::binary|ios::app);
    binary_file.write(reinterpret_cast<char *>(&s1),sizeof(WebSites));
    binary_file.write(reinterpret_cast<char *>(&s2),sizeof(WebSites));
    binary_file.write(reinterpret_cast<char *>(&s3),sizeof(WebSites));
    binary_file.write(reinterpret_cast<char *>(&s4),sizeof(WebSites));
    binary_file.close();

    // Read data
    fstream binary_file2("c:\\test.dat",ios::binary|ios::in| ios::ate );
    int size = binary_file2.tellg();
    for(int i = 0; i<size/sizeof(WebSites); i++)
    {
        WebSites p_Data;
        binary_file2.seekg(i*sizeof(WebSites));
        binary_file2.read(reinterpret_cast<char *>(&p_Data),sizeof(WebSites));
        cout<<p_Data.SiteName<<endl;
        cout<<"Rank: "<< p_Data.Rank<<endl;
    }
    binary_file2.close();
    return 0;
}
1 Answer
The trouble with writing binary blobs is that they lead to brittle storage.
The stored objects have a tendency to break over time as the assumptions you make about the hardware no longer hold true (in this case, that sizeof(int) is constant and that the endianness of int will not change).
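To make the brittleness concrete, here is a minimal sketch that exposes the bytes a raw write() would put in the file. The helpers rawBytes, isLittleEndian and toHex are names made up for this illustration; they are not part of the original code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copy the in-memory representation of a trivially
// copyable object into a byte vector. These are the same bytes that
// write(reinterpret_cast<char*>(&obj), sizeof(obj)) would store.
template <typename T>
std::vector<unsigned char> rawBytes(T const& value)
{
    std::vector<unsigned char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Hypothetical helper: true when the least significant byte comes first.
bool isLittleEndian()
{
    unsigned int const probe = 1;
    return rawBytes(probe)[0] == 1;
}

// Render the bytes as lowercase hex so the layout can be inspected.
std::string toHex(std::vector<unsigned char> const& bytes)
{
    static char const digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char b : bytes)
    {
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}
```

On a little-endian machine with a 4-byte int, toHex(rawBytes(123)) gives 7b000000; on a big-endian machine with the same int size it gives 0000007b. A file written on one cannot be read back correctly on the other, and a different sizeof(int) shifts every record that follows.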
It has therefore become more standard to use a method known as serialization. With it, you convert the object to a format that is hardware agnostic (and usually human readable).
Note: Binary blobs have advantages. But you must weigh those against the brittleness. Therefore your first choice should be serialization (unless you have specific requirements that prevent this). Look at binary blobs only after you have shown that serialization has too much overhead (unlikely for most situations, but it is a possibility).
In C++ you would do this using operator<< and operator>>.
I would re-write your code as:
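#include <cstddef>
#include <istream>
#include <ostream>
#include <string>
#include <utility>

struct WebSites
{
    std::string siteName;
    int         rank;

    WebSites()
        : siteName("")
        , rank(0)
    {}
    WebSites(std::string const& siteName, int rank)
        : siteName(siteName)
        , rank(rank)
    {}
    void swap(WebSites& other) noexcept
    {
        std::swap(rank, other.rank);
        std::swap(siteName, other.siteName);
    }
};

// Stored format: <rank> <length>:<siteName>
std::ostream& operator<<(std::ostream& stream, WebSites const& data)
{
    stream << data.rank << " "
           << data.siteName.size() << ":"
           << data.siteName;
    return stream;
}

std::istream& operator>>(std::istream& stream, WebSites& data)
{
    WebSites    tmp;
    std::size_t size;
    char        sep;
    if (stream >> tmp.rank >> size && stream.get(sep))
    {
        if (sep != ':')
        {
            // Malformed record: flag the failure instead of guessing.
            stream.setstate(std::ios::failbit);
            return stream;
        }
        tmp.siteName.resize(size);
        // The length prefix tells us exactly how many characters to read,
        // so the name may contain spaces.
        if (stream.read(&tmp.siteName[0], size))
        {
            // Only modify the output object once the whole read succeeded.
            data.swap(tmp);
        }
    }
    return stream;
}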
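To see what this format looks like in the file, here is a self-contained round trip through string streams. It is a sketch, not the only way to do it: it assumes the separating colon is consumed before the name is read, and encode and decodeAll are hypothetical helpers introduced only for this illustration.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct WebSites
{
    std::string siteName;
    int         rank;

    WebSites() : siteName(""), rank(0) {}
    WebSites(std::string const& siteName, int rank) : siteName(siteName), rank(rank) {}
};

// Stored format: <rank> <length>:<siteName>
std::ostream& operator<<(std::ostream& stream, WebSites const& data)
{
    return stream << data.rank << " " << data.siteName.size() << ":" << data.siteName;
}

std::istream& operator>>(std::istream& stream, WebSites& data)
{
    WebSites    tmp;
    std::size_t size = 0;
    char        sep  = 0;
    if (stream >> tmp.rank >> size && stream.get(sep))
    {
        if (sep != ':')
        {
            // Malformed record: mark the stream as failed.
            stream.setstate(std::ios::failbit);
            return stream;
        }
        tmp.siteName.resize(size);
        if (stream.read(&tmp.siteName[0], static_cast<std::streamsize>(size)))
        {
            data = tmp;
        }
    }
    return stream;
}

// Hypothetical helper: serialize all records into one string.
std::string encode(std::vector<WebSites> const& items)
{
    std::ostringstream out;
    for (std::size_t i = 0; i < items.size(); ++i)
    {
        out << items[i];
    }
    return out.str();
}

// Hypothetical helper: read records until the stream runs out or fails.
std::vector<WebSites> decodeAll(std::string const& text)
{
    std::istringstream    in(text);
    std::vector<WebSites> result;
    WebSites              item;
    while (in >> item)
    {
        result.push_back(item);
    }
    return result;
}
```

Encoding ("www.ppp.com", 0) followed by ("www.rrr.com", 111) produces the text 0 11:www.ppp.com111 11:www.rrr.com. No delimiter is needed between records because the length prefix says exactly where each name ends, which also means a name containing spaces survives the round trip.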
Now you can write your code like this:
// In addition to the headers in your original code:
#include <algorithm>
#include <iterator>
#include <vector>

int _tmain(int argc, _TCHAR* argv[])
{
    WebSites s1("www.ppp.com", 0);
    WebSites s2("www.rrr.com", 111);
    WebSites s3("www.code.com", 123);
    WebSites s4("www.yahoo.com", 14);

    // write
    fstream binary_file("c:\\test.dat",ios::out|ios::binary|ios::app);
    binary_file << s1 << s2 << s3 << s4;
    binary_file.close();

    // Read data
    // Note: no ios::ate here. It would position the stream at the end of
    // the file, so the first read would fail immediately.
    fstream binary_file2("c:\\test.dat",ios::binary|ios::in);
    WebSites p_Data;
    while(binary_file2 >> p_Data)
    {
        cout << p_Data.siteName << endl;
        cout << "Rank: "<< p_Data.rank << endl;
    }
    binary_file2.close();

    // Read data into a vector
    fstream binary_file3("c:\\test.dat",ios::binary|ios::in);
    std::vector<WebSites> v;
    std::copy(std::istream_iterator<WebSites>(binary_file3),
              std::istream_iterator<WebSites>(),
              std::back_inserter(v)
             );
    binary_file3.close();
    return 0;
}
- Hey Loki, what you described is serialization, right? Do you have some links which explain this concept further? – user1999360, Jun 17, 2013 at 9:49
- Loki, I asked a question related to your suggestion above: stackoverflow.com/questions/17277070/… Your feedback would be very useful. Thanks. – user1999360, Jun 24, 2013 at 14:23
- @user1999360: Yes, this is serialization. – Loki Astari, Jun 24, 2013 at 22:24
- Loki, what would you do if the struct Websites had one member variable of type std::string and another member variable of type std::wstring? – user1999360, Jun 25, 2013 at 9:07
- +1 for "Binary blobs have advantages. But you must weigh those against the brittleness." I have several situations where I absolutely need data in a particular per-byte format, but it's definitely not the norm for most programmers using basic (non-byte) types. – underscore_d, Jan 7, 2016 at 10:36