UPDATE: I have changed some of the code from feedback provided on #cpp_review
I created a little project that will basically move all header files into one (mostly for libs).
The project is small, only 2 classes and a main file for the cli options The projects github is here: https://github.com/AntonioCS/only_one_header
Here are the files:
main.cpp (please ignore the duplicate static const char
I am trying to figure out some issues)
#include <docopt.h>
//#include <iostream>
#include "HeaderManager.h"
static const char USAGE2[] =
R"(Usage: onlyoneheader [-ho <file>] [--output=<file>] [--remove_comments] [--no_separators] [--no_timestamp] [--silent] <path>
-h --help show this
-o <file> --output=<file> specify output file [default: ./default.h]
--remove_comments remove one line comments from the header files
--no_separators don't add // <filename> START|END separators
--no_timestamp don't add timestamp
--silent no info text
)";
static const char USAGE[] =
R"(Usage: onlyoneheader [-h] [--output=<file>] [--remove_comments] [--no_separators] [--no_timestamp] [--silent] <path>
-h --help show this
--output=<file> specify output file [default: ./allinone.h]
--remove_comments remove one line comments from the header files
--no_separators don't add // <filename> START|END separators
--no_timestamp don't add timestamp
--silent no info text
)";
int main(int argc, char** argv) {
std::map<std::string, docopt::value> args = docopt::docopt(
USAGE,
{ argv + 1, argv + argc },
true,
"Only One Header v0.1"
);
#if 0
for (auto const& arg : args) {
std::cout << arg.first << ": " << arg.second << std::endl;
}
#endif
OnlyOneHeader::HeaderManager headerManager{args["<path>"].asString()};
headerManager.setOptionAddSeparatorState(args["--no_separators"].asBool() == false);
headerManager.setOptionAddTimestampState(args["--no_timestamp"].asBool() == false);
headerManager.setOptionRemoveCommentsState(args["--remove_comments"].asBool());
headerManager.setOptionSilentState(args["--silent"].asBool());
headerManager.process();
const auto output_file_name{ args["--output"].isString() ? args["--output"].asString() : "all_in_one.h" };
headerManager.output(output_file_name);
}
#pragma once
#include <vector>
#include <string>
#include <unordered_map>
#include "HeaderFile.h"
namespace OnlyOneHeader {
class HeaderManager
{
std::vector<HeaderFile> m_all_hf{};
std::string m_dir_path{};
std::unordered_map<std::string, HeaderFile*> m_header_filename_connection;
std::unordered_map<std::string, uint16_t> m_global_includes;
std::string m_output_file{ "all_in_one.h" };
std::vector<std::string> m_unused_hf{};
double m_elapsed_process{};
double m_elapsed_ouput{};
struct {
bool add_separators{ true };
bool add_timestamp{ true };
bool remove_comments{ false };
bool silent{ false };
} m_options;
public:
HeaderManager(const std::string& path) : m_dir_path(path) {}
void process();
void output(const std::string& output_file);
void setOptionAddSeparatorState(bool state) { m_options.add_separators = state; }
void setOptionAddTimestampState(bool state) { m_options.add_timestamp = state; }
void setOptionRemoveCommentsState(bool state) { m_options.remove_comments = state; }
void setOptionSilentState(bool state) { m_options.silent = state; }
private:
void displayInfo() const;
void grabAllHeaderFiles();
void removeUnIncludedFiles();
void grabAllGlobalIncludesFromHeaderFile(const HeaderFile& hf);
void createFileNameHeaderFileConnection();
void order();
void decreaseAllIncluded(HeaderFile& hf);
};
}
#include <fstream>
#include "HeaderManager.h"
#include <chrono>
#include <iostream>
namespace OnlyOneHeader {
double getMsCount(const std::chrono::steady_clock::time_point& begin)
{
return std::chrono::duration_cast<std::chrono::milliseconds>(std::chrono::steady_clock::now() - begin).count() / 1000.0;
}
void HeaderManager::process()
{
std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
grabAllHeaderFiles();
removeUnIncludedFiles(); //this must come before createFileNameHeaderFileConnection()
createFileNameHeaderFileConnection();
order();
m_elapsed_process = getMsCount(begin);
}
void HeaderManager::output(const std::string& output_file)
{
m_output_file = output_file;
std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
std::fstream output{m_output_file, std::ios::out};
if (output.is_open()) {
output << "#pragma once\n\n";
if (m_options.add_timestamp) {
auto t = std::time(nullptr);
auto tm = *std::localtime(&t);
output << "////////////////////////////////////////\n";
output << "//AUTO GENERATED FILE - DO NOT EDIT\n";
output << "//Generated on " << std::put_time(&tm, "%Y-%m-%d at %H:%M:%S") << "\n";
output << "////////////////////////////////////////\n\n";
}
for (const auto& [globalInc, v] : m_global_includes) {
output << "#include <" << globalInc << ">\n";
}
output << '\n';
for (const auto& hf : m_all_hf) {
if (m_options.add_separators) output << "/*** " << hf.filename() << " START ***/\n";
output << hf.getContents();
if (m_options.add_separators) output << "/*** " << hf.filename() << " END ***/\n";
}
output.close();
m_elapsed_ouput = getMsCount(begin);
if (!m_options.silent) displayInfo();
}
else {
throw std::runtime_error{"Unable to open output file"};
}
}
void HeaderManager::grabAllHeaderFiles()
{
if (!fs::is_directory(m_dir_path)) {
throw std::runtime_error{ "Invalid directory: " + m_dir_path };
}
for (const auto& p : fs::directory_iterator(m_dir_path)) {
if (p.path().extension() == ".h" || p.path().extension() == ".hpp") {
auto hf = m_all_hf.emplace_back(p.path(), m_options.remove_comments);
if (hf.hasGlobalIncludes()) {
grabAllGlobalIncludesFromHeaderFile(hf);
}
}
}
}
void HeaderManager::removeUnIncludedFiles()
{
std::vector<HeaderFile> included_files;
for (const auto& hf : m_all_hf) {
if (!hf.hasLocalIncludes()) {
bool was_not_included{true};
//since this file doesn't include any other local file
//I'm looping all the other include files and checking them to see if they include this file
for (const auto& hf2 : m_all_hf) {
if (hf2.isInLocalIncludes(hf.filename())) {
included_files.push_back(hf);
was_not_included = false;
break;
}
}
if (was_not_included) {
m_unused_hf.emplace_back(hf.filename().data());
}
}
else {
included_files.push_back(hf);
}
}
m_all_hf = included_files;
}
void HeaderManager::grabAllGlobalIncludesFromHeaderFile(const HeaderFile& hf)
{
for (const auto& gi : hf.allGlobalIncludes()) {
m_global_includes[gi] += 1;
}
}
void HeaderManager::createFileNameHeaderFileConnection()
{
for (auto& hf : m_all_hf) {
m_header_filename_connection[hf.filename()] = &hf;
}
}
void HeaderManager::order()
{
for (auto& hf : m_all_hf) {
if (hf.hasLocalIncludes()) {
decreaseAllIncluded(hf);
}
}
std::sort(
m_all_hf.begin(),
m_all_hf.end(),
[](const HeaderFile& a, const HeaderFile& b)
{
return a.position_value < b.position_value;
}
);
}
void HeaderManager::decreaseAllIncluded(HeaderFile& hf)
{
constexpr std::size_t order_decrease_amount{100};
for (const auto& infile : hf.allLocalIncludes()) {
auto initv = m_header_filename_connection[infile]->position_value;
while (m_header_filename_connection[infile]->position_value >= hf.position_value) {
m_header_filename_connection[infile]->position_value -= order_decrease_amount;
}
if (
m_header_filename_connection[infile]->position_value != initv &&
m_header_filename_connection[infile]->hasLocalIncludes()
) {
decreaseAllIncluded(*m_header_filename_connection[infile]);
}
}
}
void HeaderManager::displayInfo() const
{
std::cout << "Time for procesing: " << m_elapsed_process << "ms\n";
std::cout << "Time for outputing: " << m_elapsed_ouput << "ms\n";
std::cout << "Total time: " << (m_elapsed_ouput + m_elapsed_process) << "ms\n";
std::cout << "Total global headers: " << m_global_includes.size() << '\n';
std::cout << "File - Times it was included\n";
for (const auto& [globalInc, v] : m_global_includes) {
std::cout << globalInc << " - " << v << '\n';
}
std::cout << "Total local headers: " << m_header_filename_connection.size() + m_unused_hf.size() << '\n';
std::cout << "Total headers not included: " << m_unused_hf.size() << '\n';
}
}
#pragma once
#include <filesystem>
#include <vector>
#include <string>
namespace OnlyOneHeader {
namespace fs = std::filesystem;
class HeaderFile
{
fs::path m_path{};
std::vector<std::string> m_local_includes{};
std::vector<std::string> m_global_includes{};
std::string m_filename{};
std::string m_contents{};
inline static const std::string m_bom_string{ "\xEF\xBB\xBF" };
bool m_remove_comments{ false };
public:
std::size_t position_value{ 9999 };
HeaderFile(const fs::path& p, bool remove_comments = false) : m_path(p), m_filename(p.filename().string()), m_remove_comments(remove_comments)
{
extractData();
}
[[nodiscard]] const std::vector<std::string>& allLocalIncludes() const
{
return m_local_includes;
}
[[nodiscard]] const std::vector<std::string>& allGlobalIncludes() const
{
return m_global_includes;
}
[[nodiscard]] bool isInLocalIncludes(const HeaderFile& hf) const;
[[nodiscard]] bool isInLocalIncludes(std::string_view file) const;
[[nodiscard]] std::string filename() const
{
return m_filename;
}
void extractData();
[[nodiscard]] std::string getContents() const
{
return m_contents;
}
[[nodiscard]] std::string path() const
{
return m_path.string();
}
[[nodiscard]] bool hasLocalIncludes() const
{
return !m_local_includes.empty();
}
[[nodiscard]] bool hasGlobalIncludes() const
{
return !m_global_includes.empty();
}
private:
void extractIncludeFilename(const std::string& line);
[[nodiscard]] bool startsWith(std::string_view str, std::string_view startsWithThis) const;
};
}
#include <fstream>
#include <algorithm> //std::sort
#include "HeaderFile.h"
namespace OnlyOneHeader {
enum class ExtractState
{
BOM_CHECK,
READ_CODE
};
bool HeaderFile::isInLocalIncludes(const HeaderFile& hf) const
{
return isInLocalIncludes(hf.filename());
}
bool HeaderFile::isInLocalIncludes(std::string_view file) const
{
return (std::find(m_local_includes.begin(), m_local_includes.end(), file) != m_local_includes.end());
}
void HeaderFile::extractData()
{
std::fstream f{path(), std::ios::in};
if (!f.is_open()) {
throw std::runtime_error{ "Unable to open file: " + path() };
}
ExtractState state{ ExtractState::BOM_CHECK };
std::string line{};
while (std::getline(f, line)) {
switch (state) {
case ExtractState::BOM_CHECK:
{
//detect BOM
if (startsWith(line, m_bom_string)) {
line.erase(0, m_bom_string.size()); //simple way to get rid of this thanks to Gaurav from cpplang slack
}
state = ExtractState::READ_CODE;
}
[[fallthrough]];
case ExtractState::READ_CODE:
{
if (startsWith(line, "#include ")) {
extractIncludeFilename(line);
}
else if (!startsWith(line, "#pragma once") && !line.empty()) {
if (m_remove_comments) {
auto it = std::find_if(line.begin(), line.end(), [](int c) { return !std::isspace(c); });
if (it != line.end() && *it == '/' && *(++it) == '/')
continue;
}
m_contents += line + '\n';
}
}
break;
}
}
}
void HeaderFile::extractIncludeFilename(const std::string& line)
{
auto f = line.find('"');
if (f != std::string::npos) {
auto f2 = line.find('"', f + 1);
if (f2 != std::string::npos) {
m_local_includes.push_back(line.substr(f + 1, f2 - f - 1));
}
}
else {
f = line.find('<');
if (f != std::string::npos) {
auto f2 = line.find('>');
if (f2 != std::string::npos) {
m_global_includes.push_back(line.substr(f + 1, f2 - f - 1));
}
}
}
}
bool HeaderFile::startsWith(std::string_view str, std::string_view startsWithThis) const
{
auto v = str.rfind(startsWithThis, 0);
return (v == 0);
}
}
1 Answer 1
This code is generally clean and easy to understand, which is good. Here are some things that may help you improve your code.
Make sure you have all required #include
s
The code uses sort
but doesn't #include <algorithm>
and main
uses map
but doesn't #include <map>
. For portability and immunity from changes to third party libraries, it's best to add includes for all of the functions your code uses, whether or not another library has already included them.
Add support for multiline comments
At the moment, if we have a header file which includes this:
// this is comment #1
/* This is comment #2 */
/*
* This is a multiline comment #3
*/
only the first comment is excluded if we specify --remove_comments
.
Use rational return values
Almost all of your member function return void
and take no parameters. This is highly suspect. For example, it might be useful for grabAllHeaderFiles
to report the number of files processed. Also, the displayInfo()
might benefit from being passed a std::ostream
rather than always printing to std::cout
.
Rethink class data members
Is it really necessary to have all of those data structures only to throw most of them away? For instance, m_unused_hf
could be replaced with a simple counter with no loss of functionality. Also it appears that the --remove_comments
is properly part of HeaderManager
, so why is it duplicated in every HeaderFile
? It would probably be better to pass it in as a parameter.
Rethink the exposed interface
Only HeaderManager
uses HeaderFile
so it might make sense to put its definition inside HeaderManager
. Also, the only thing that uses HeaderFile::path()
is HeaderFile::extractData()
. Instead of writing this:
std::fstream f{path(), std::ios::in};
I'd rewrite that as:
std::ifstream f{m_path};
Also, it seems to me that extractData
should be private
.
Consider a more efficient algorithm and data structure
The removeUnIncludedFiles()
is not particularly efficient. Instead of making \$n^2\$ comparisons, consider constructing a map
and keeping a count of all local include files, as is done for the global include files, would be much simpler. Another possibility would be to simply keep a global m-ary tree for all files. Try running your program on /usr/include
and see why reducing memory footprint might be a good idea. On my machine, use uses up the entire stack and crashes.
-
\$\begingroup\$ Thank you so much for you review. I have addressed some of your points. Regarding the
--remove_comments
, this is the 1st version, so I went with the easy and simple comments. I will update the code as soon as I commit everything. Thanks again. \$\endgroup\$AntonioCS– AntonioCS2020年06月10日 18:32:07 +00:00Commented Jun 10, 2020 at 18:32 -
1\$\begingroup\$ I'm glad you found it useful! You may already know this, but it may be helpful to read (or re-read) codereview.stackexchange.com/help/someone-answers \$\endgroup\$Edward– Edward2020年06月10日 18:35:04 +00:00Commented Jun 10, 2020 at 18:35