What Design to choose for Parsing different files to populate different classes?

Question 1

I am currently working on project which requires parsing different types of files. The content of the files will populate the classes.

eg: file1 populate content of class1, file2 populate content of class2, etc.

My question is where should the parsing logic of the file go?

1st option : If I place the parsing logic inside the class then the class becomes really huge, because currently the class itself is ~400 lines and will grow in the future. The advantage of this way is things are really simple, I could use operator>>, etc.

2nd option : Create a new class that takes std::istream and class reference, do the parsing logic there and populate the class. The advantage of this way of doing is that if the class remains the same but formatting of the file changes,

eg: `file1 populate content of class1 or file1-v2 populate content of class1`

I have to use polymorphism to change the parsing logic and keep the design simple. Disadvantage of this option is, it's a little complex but I can live with it.

Which option do you think is better or is there any other better option?

Question 2

Classes should have a single responsibility. Presumably the responsibility of parsing a file to populate a class and the responsibilities of class itself are distinct and should not be combined. Therefore the 2nd option is the better design.

It's pretty common for file formats to change as a system develops, and by separating this concern now, you'll be in a good place if the file format or the class you're populating changes. It's also a design that allows greater flexibility--e.g. if you decide you want to populate your class from the network, a different file format, etc.

Question 3

make sense, thank you for pointing out single responsibility. I will go by option2. (just a side thought) My other thought was saying that parsing is part of operator>> so it should be part of the same class. How do you thing I should argue with that thought?

Question 4

It's often a bit vague in my opinion what a single responsibility represents, but it may be helpful to think of it as "There should never be more than one reason for a class to change". See softwareengineering.stackexchange.com/questions/17100/…. I'm not familiar enough with the >> operator or your implementation to give very useful advice, but if >> isn't part of the classes responsibility either then I think it should be extracted (perhaps the implementation of >> would go in the parser class you plan to create in the 2nd option)

Question 5

If the parsing is more complex than a few lines of code, it is probably better to put it into separate classes, lets call them Class1Parser, Class2Parser and so on. If those classes contain similar parts, you can either

create a common base class BaseParser for both where the reusable parts live
create a separate utility class ParserUtil for reusable functions
make the parser a generic class like Parser<T> where T is either Class1 or Class2
use a combination of the former options (like making only the ParserUtil or BaseParser generic)

Note that an inheritance hierarchy of the parsers now can be independent from the inheritance hierarchy of the parsed classes, this would not quite be possible by putting the parsing logic directly into Class1/Class2.

None of these options should hold you back from implementing an operator>> for Class1 or Class2, if you think you need it for ease of use. This operator can simply delegate its functionality to an object of type Class1Parser or Class2Parser.

If these approaches end up becoming more complex than putting the parser logic into Class1 or Class2, then there is something utterly wrong in your code. In essence, it should actually be the same logic. By putting the parsing logic into separate classes, it stays decoupled from other methods in those classes, which should reduce the overall complexity, not increase it.

Question 6

You might even want to consider adding some form of header to the file which would be the same size for each type. Then you can read in the header, pass it to a factory which will return an implementation of your parser interface. This of course would require all your classes to implement some interface which the parser's parsing function can return

Question 7

Firstly thank you for the detailed answer. Secondly when you say make the parser a generic class like Parser<T> where T is either Class1 or Class2 where do you think the reusable parts of code go if I choose this option? Did you mean it will go in Parser<T> class? I asked this because if Parser<T> is abstract class would this be a problem?

Question 8

isn't that what Template method pattern all about? (in regards to what vanta mula asked in comment above)

Question 9

@vantamula: I try not speculating too much about your code - especially not about classes for which I don't know anything more than names "Class1", "Class2" etc.

Samuel Samuel 9,2471 gold badge27 silver badges43 bronze badges · Accepted Answer · 2017-10-17 01:28:45Z

9

Classes should have a single responsibility. Presumably the responsibility of parsing a file to populate a class and the responsibilities of class itself are distinct and should not be combined. Therefore the 2nd option is the better design.

It's pretty common for file formats to change as a system develops, and by separating this concern now, you'll be in a good place if the file format or the class you're populating changes. It's also a design that allows greater flexibility--e.g. if you decide you want to populate your class from the network, a different file format, etc.

Share

Improve this answer

answered Oct 17, 2017 at 1:28

Samuel's user avatar

Samuel Samuel

9,2471 gold badge27 silver badges43 bronze badges

2

make sense, thank you for pointing out single responsibility. I will go by option2. (just a side thought) My other thought was saying that parsing is part of operator>> so it should be part of the same class. How do you thing I should argue with that thought?

vanta mula
– vanta mula

2017年10月17日 01:48:22 +00:00
Commented Oct 17, 2017 at 1:48
1

It's often a bit vague in my opinion what a single responsibility represents, but it may be helpful to think of it as "There should never be more than one reason for a class to change". See softwareengineering.stackexchange.com/questions/17100/…. I'm not familiar enough with the >> operator or your implementation to give very useful advice, but if >> isn't part of the classes responsibility either then I think it should be extracted (perhaps the implementation of >> would go in the parser class you plan to create in the 2nd option)

Samuel
– Samuel

2017年10月17日 02:46:29 +00:00
Commented Oct 17, 2017 at 2:46

Add a comment |

Stack Exchange Network

What Design to choose for Parsing different files to populate different classes?

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

What Design to choose for Parsing different files to populate different classes?

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions