I am currently trying to write a Parser system for my project. There are different type of files to be parsed (file1, file2, file3):
file1 -> AData // stored in AData class using AParser class's parsing logic
file2 -> BData // stored in BData class using BParser class's parsing logic
file3 -> CData // stored in CData class using BParser class's parsing logic
Files could be binary or txt. Different files require different logic for parsing because of the way file are written.
I have used a factory pattern for this purpose. The Base
class is Parser
which is an abstract class.
#include <fstream>
#include <iostream>
#include <memory>
#include <string>
// Base class
template <class T>
class Parser {
public:
virtual void DoParsing(T&, std::ifstream& fs) = 0;
};
// Base class for data
struct Data {};
//////
struct AData : Data {
int data;
};
class AParser final : public Parser<AData> {
public:
void DoParsing(AData& data, std::ifstream& fs) {
// implementation goes here
}
};
///
struct BData : Data {
char* data;
};
class BParser final : public Parser<BData> {
public:
void DoParsing(BData& data, std::ifstream& fs) {
// implementation goes here
}
};
template <class T>
class IParsing {
public:
void action(std::shared_ptr<Parser<T>> p, T d, std::ifstream& fs) {
p->DoParsing(d, fs);
}
};
class FParsing {
public:
FParsing() {}
void HandleParsing(std::string type, Data& d, std::ifstream& fs) {
if (type == "AParsing") {
std::shared_ptr<IParsing<AData>> iparse =
std::make_shared<IParsing<AData>>();
iparse->action(std::make_shared<AParser>(), static_cast<AData&>(d),
fs);
} else if (type == "BParsing") {
// iparse->action(std::make_shared<BParser>(), fs);
std::shared_ptr<IParsing<BData>> iparse =
std::make_shared<IParsing<BData>>();
iparse->action(std::make_shared<BParser>(), static_cast<BData&>(d),
fs);
} else {
std::cout << "Need shape\n";
}
}
private:
};
int main() {
std::ifstream iFile("data.txt");
FParsing fparse;
//AData is passed by ref because
// it will be populated during parsing process
AData ad;
fparse.HandleParsing("AParsing", ad, iFile);
// BData
BData ab;
fparse.HandleParsing("BParsing", ab, iFile);
}
My questions are:
- Do you think this is a right approach for creating a parsing system?
- Is the factory pattern implementation correct?
- I just wanted to make sure I am not making things more complex than it should be.
- Are there other design patterns better than factory pattern for this purpose?
3 Answers 3
Design issues
- You are unnecessarily restricting the
Parser
interface by forcingfs
to be astd::ifstream&
whenstd::istream&
would work as well. AParser::DoParsing(AData&, std::ifstream&)
should be markedoverride
. Same forBParser::DoParsing(BData&, std::ifstream&)
.- I can't see any specific purpose for
IParsing
: Why not callAParser::DoParsing
orBParser::DoParsing
directly? - Besides, parameter
d
ofIParsing::action(std::shared_ptr<Parser<T>> p, T d, std::ifstream& fs)
should be of typeT&
. FParsing::HandleParsing(std::string, Data&, std::ifstream&)
doesn't use any class instance members. Maybe make itstatic
?- Also, parameter
type
from that function should probably be anenum
- no need to use strings for that. - Why allocate two
std::shared_ptr
in case a valid parser is found if they won't be shared and will be deallocated once the function returns? - What advantage do you hope to get from this extravagant implementation over overloading
operator>>(std::istream&, AData&)
andoperator>>(std::istream&, BData&)
instead?
-
\$\begingroup\$ Regarding
What advantage do you hope to get from this extravagant implementation over overloading operator>>(std::istream&, AData&) and operator>>(std::istream&, BData&) instead?
I thought of doing this way to make the design more object oriented and extendable. So when you say to overloadoperator>>
, you mean placeoperator>>
overloading ` in base classParser
? . \$\endgroup\$solti– solti2017年10月16日 02:12:58 +00:00Commented Oct 16, 2017 at 2:12 -
\$\begingroup\$ @solti: I meant placing those overloaded
std::istream& operator>>(std::istream&, AData&)
in the same scope asAData
, so you can simply writeAData a; std::cin >> a;
(or replacestd::cin
with a file stream if you like). No need for an overly complex architecture if the standard way is simple enough... \$\endgroup\$hoffmale– hoffmale2017年10月16日 02:30:51 +00:00Commented Oct 16, 2017 at 2:30 -
\$\begingroup\$ oh! ok make sense. One last question I had was, (I did't address or placed it in my code), How do I reduce code reusability? Because some of the parsing (not all but some) logic for both Adata and BData's operator>> might be same. How do you think I should handle this case? \$\endgroup\$solti– solti2017年10月16日 03:21:08 +00:00Commented Oct 16, 2017 at 3:21
-
\$\begingroup\$ @solti: I think you mean to increase reusability. Can't comment too much on this without seeing the code, but my first guess would be extract common operations into seperate functions. \$\endgroup\$hoffmale– hoffmale2017年10月16日 03:35:04 +00:00Commented Oct 16, 2017 at 3:35
-
\$\begingroup\$ so the separate functions should be in base class of AData and BData? \$\endgroup\$solti– solti2017年10月16日 04:28:34 +00:00Commented Oct 16, 2017 at 4:28
Instead of passing in the type of data to be parsed as well as a string we could allow our parser to create us a new type based on the type passed in alone with no need for the string.
namespace Factory
{
template<typename T>
T Create(std::istream& stream)
{
return std::move(T::Parse(stream));
}
}
This factory then just proxies onto the object we wish to create with our stream. Then each parse method can ensure the data format is valid and return us a newly created object if it is valid or otherwise throw some form of exception.
struct AData
{
int data;
static AData Parse(std::istream& stream)
{
// TODO check here for correct data etc.
int intVal;
stream >> intVal;
return AData { intVal };
}
};
To use this we can then do something like the following
auto adata = std::move(Factory::Create<AData>(ss));
The above moves the data from our factory after creation to prevent unneccessary copies.
The full code is then as follows:
#include <iostream>
#include <sstream>
struct AData
{
int data;
static AData Parse(std::istream& stream)
{
// TODO check here for correct data etc.
int intVal;
stream >> intVal;
return AData { intVal };
}
};
struct BData
{
std::string data;
static BData Parse(std::istream& stream)
{
// TODO check here for correct data etc.
return BData { std::string(std::istreambuf_iterator<char>(stream), {})};
}
};
namespace Factory
{
template<typename T>
T Create(std::istream& stream)
{
return std::move(T::Parse(stream));
}
}
int main()
{
std::stringstream ss;
ss << 1;
auto adata = std::move(Factory::Create<AData>(ss));
std::cout << adata.data << "\n";
std::stringstream ss2;
ss2 << "Hello World";
auto bdata = std::move(Factory::Create<BData>(ss2));
std::cout << bdata.data << "\n";
}
-
\$\begingroup\$ I like you design. But I have one question, How do avoid duplication of code that goes in
Parse
function? Because there might be some code inside Parse function of both AData and BData that might be similar. \$\endgroup\$solti– solti2017年10月16日 02:10:31 +00:00Commented Oct 16, 2017 at 2:10 -
\$\begingroup\$ Depends on the situation. If we have a shared base class then we could have another static function in a baseclass that takes a ref to the base class and the stream to populate the information for the base class \$\endgroup\$const_ref– const_ref2017年10月16日 09:00:24 +00:00Commented Oct 16, 2017 at 9:00
I think you are complicating the code too much.
As the parsing process depends only on the type of file to be loaded, which you know at compile time, I dont think inheritance is necessary.
If you are using templates, the different data classes (AData
, BData
, CData
) don't need to inherit from a base class.
If the Parser doesn't hold any state, why not make it a free function?
Putting it all together
// Data types
struct AData {
int data;
};
struct BData {
char* data;
};
// Not implemented, specialize for each data type supported
template <class DataType>
void parse(std::ifstream& ifstream, DataType& data);
template <>
void parse<AData>(std::ifstream& ifstream, AData& data) {
// parse AData
}
template <>
void parse<BData>(std::ifstream& ifstream, BData& data) {
// parse BData
}
int main() {
std::ifstream file;
AData ad;
parse(file, ad); // Will call parse specialization for AData
BData bd;
parse(file, bd); // Will call parse specialization for BData
}
Note that this approach would need additional work if the type of the data to be loaded has to be detected at runtime.
But if you know it beforehand, this approach is simple, elegant and effective.