I would like to know does this code follow common best practices in C++ using STL.
Goal of the code:
- There are 10 players that were playing a game 1000 times. Each time someone has won they have put winners name into the binary file
data.bin
. Each name takes 20 bytes of data in the binary file and uses0円
as a string terminator. - This program uses that file as an input and prints out ranked list of each player's scores.
Here is my code:
#include <iostream>
#include <fstream>
#include <map>
#include <algorithm>
#include <vector>
using std::map, std::string, std::ifstream, std::ios, std::string, std::cerr, std::vector, std::pair, std::sort, std::cout;
constexpr size_t NAME_SIZE=20;
const char* FILE_NAME = "data.bin";
constexpr int FILE_NOT_OPENED = 1;
struct zuint { uint16_t val = 0; };
int main() {
char tmp[NAME_SIZE];
map<string, zuint > points;
ifstream file(FILE_NAME, ios::in | ios::binary);
if(!file){
cerr << "File not opened. 'data.bin' probably does not exist." << "\n";
exit(FILE_NOT_OPENED);
}
while (!file.eof()) {
file.read(tmp, NAME_SIZE);
points[tmp].val++;
}
file.close();
vector<pair<string, zuint>> pointscpy(points.begin(), points.end());
points.clear();
sort(
pointscpy.begin(),
pointscpy.end(),
[](pair<string, zuint> v1, pair<string, zuint> v2) {
return v1.second.val > v2.second.val;
}
);
for(int i=0;i<pointscpy.size(); i++){
cout << i + 1 << ". " << pointscpy[i].first << ": " << pointscpy[i].second.val << "\n";
}
return EXIT_SUCCESS;
}
1 Answer 1
That long line of using
is problematic. First, it's far too long (it's not even sorted, which might explain why std::string
appears twice). Secondly, it looks like you've been told to avoid using namespace std;
but have heard only half the reason it's a bad idea. Those many using
s tend to obscure which identifiers in your code are from the library and which are your own. I really recommend you drop that habit and start referring to things by their full names most of the time.
size_t
and uint16_t
are never defined. I'm guessing you meant std::size_t
and std::uint16_t
(the latter from <cstdint>
header). It's not clear why you need an exactly-16-bit type for this simple counting - what's wrong with std::uint_fast16_t
? Or even a plain unsigned int
? Other identifiers not defined include exit
(presumably std::exit
from <cstdlib>
) and EXIT_SUCCESS
(also from <cstdlib>
).
The zuint
structure doesn't seem to provide any benefit over using the integer type directly.
The tmp
array doesn't need to be scoped to the whole function - it can be temporary, as the name suggests, within the read loop.
No need to write two separate literal strings to std::cerr
- combine them into a single string:
std::cerr << "File not opened. 'data.bin' probably does not exist.\n";
Perhaps consider std::perror
(from <cstdio>
) as an alternative error reporting function which provides more information.
The loop is flawed (testing for eof
before reading is a known anti-pattern). When istream::read()
fails, we haven't populated tmp
and shouldn't be adding an entry. Even when it fails, we're not defensive against malformed data (missing final null character). It's better to read into an oversize buffer and supply the null even if it's absent in the input.
Consider using std::unordered_map
instead of std::map
when we don't care about the order of its elements. That will likely be more performant when there are many different names in the file.
We can save writing long types in full, by taking advantage of auto
.
The comparator we pass to std::sort
can avoid copying strings, by accepting references to pairs.
We don't need to return success value at the end of main()
- the main function is magic, and simply running off the end implies a return value of 0.
Modified code
#include <algorithm>
#include <cerrno>
#include <cstdlib>
#include <cstring> // for strerror()
#include <fstream>
#include <iostream>
#include <unordered_map>
#include <vector>
constexpr std::size_t name_size = 20;
const char* file_name = "data.bin";
int main()
{
std::ifstream file(file_name, std::ios::in | std::ios::binary);
if (!file) {
std::cerr << file_name << ": " << std::strerror(errno) << '\n';
return EXIT_FAILURE;
}
std::unordered_map<std::string, unsigned int> points;
{
char name[name_size + 1];
name[name_size] = '0円';
while (file.read(name, name_size)) {
++points[name];
}
}
if (!file.eof()) {
std::cerr << file_name << ": " << std::strerror(errno) << '\n';
return EXIT_FAILURE;
}
std::vector<std::pair<std::string, unsigned int>>
pointscpy(points.begin(), points.end());
std::sort(pointscpy.begin(), pointscpy.end(),
[](auto const& v1, auto const& v2) {
return v1.second > v2.second;
});
int i = 0;
for (auto const& p: pointscpy) {
std::cout << ++i << ". " << p.first << ": " << p.second << '\n';
}
}
Future directions
Instead of hard-coding the file name in the program, accept it as a run-time argument, and/or accept input on the standard input stream so that it can be used as a filter.
-
\$\begingroup\$ I guess I was not clear about '0円' terminated name. It is stored in form of "<Name>0円" then data after '0円' is ignored ( it is like C-style string). When it comes to zuint I used it because of default value(0) but I guess that is the default. Main reason why I chose uint16_t is because I knew that 1000 is the biggest possible number so I chose the smallest possible container. But everything you said stands. \$\endgroup\$Marko Bošnjak– Marko Bošnjak2022年05月06日 21:26:05 +00:00Commented May 6, 2022 at 21:26
-
\$\begingroup\$ I had some doubts about the null-terminated names (particularly whether the 20-char record size was the maximum name length with or without the null), but the code resolved the questions. My point is that when the input is as expected, all is well, but in the real world we need to be prepared for accidents or malicious actions: if I write a file that has 20 non-null characters in a row, I can force a buffer overrun. Never trust your inputs! \$\endgroup\$Toby Speight– Toby Speight2022年05月07日 06:57:09 +00:00Commented May 7, 2022 at 6:57
-
\$\begingroup\$ And yes, a map of string to any numeric type will initialise the values to zero - you can depend on that. \$\endgroup\$Toby Speight– Toby Speight2022年05月07日 06:58:41 +00:00Commented May 7, 2022 at 6:58
-
\$\begingroup\$ Also I forgot to ask. Is
points.clear()
fine to just clear the memory or there is some reason why you removed it? \$\endgroup\$Marko Bošnjak– Marko Bošnjak2022年05月07日 15:28:48 +00:00Commented May 7, 2022 at 15:28 -
\$\begingroup\$ Yes, it's fine - but it doesn't achieve very much, because the peak usage is still going to be when the vector is created. I'm not sure that
clear()
will return any of the memory it was using anyway - I think that will remain as unused capacity, though that could depend on your implementation. \$\endgroup\$Toby Speight– Toby Speight2022年05月07日 17:51:56 +00:00Commented May 7, 2022 at 17:51