This doubt is about Switch Statements from Chapter 3: Functions of the book named Clean Code
Here we have a function:
public Money calculatePay(Employee e)
throws InvalidEmployeeType {
switch (e.type) {
case COMMISSIONED:
return calculateCommissionedPay(e);
case HOURLY:
return calculateHourlyPay(e);
case SALARIED:
return calculateSalariedPay(e);
default:
throw new InvalidEmployeeType(e.type);
}
}
From my inexperienced point of view, I can see that the switch statement inside the function calculatePay
is only returning things based on the object Employee
. Isn't it doing the "One Thing" mentioned by Uncle Bob?
But Uncle Bob says:
There are several problems with this function. First, it's large, and when new employee types are added, it will grow. Second, it very clear does more than more thing. Third, it violates the Single Responsibility Principle(SRP) because there is more than one reason for it to change. Fourth, it violates the Open Closed Principle(OCP) because it must change whenever new types are added. But possibly the worst problem with this function is that there are an unlimited number of other functions that will have the same structure.
How does the switch statement do more than one thing?
-
18The worst part of this code snippet in my opinion is none of those things listed; it's that the concept of the function itself is confused! The units don't even match up -- $/hour, $/year, and $/project are super meaningfully different.Daniel Wagner– Daniel Wagner06/11/2021 02:31:34Commented Jun 11, 2021 at 2:31
-
2> But possibly the worst problem with this function is that there are an unlimited number of other functions that will have the same structure. Guilt by future association. How does Bob know this? Maybe this is the last release of an end-of-life codebase, in the middle of a startup that is going under.Kaz– Kaz06/11/2021 22:30:38Commented Jun 11, 2021 at 22:30
-
2I don't even think this is a large method, even by Uncle Bob's standards.Helena– Helena06/12/2021 13:09:48Commented Jun 12, 2021 at 13:09
-
It a) determines the method appropriate to calculate the result for the given employee, and b) it invokes that method?Hagen von Eitzen– Hagen von Eitzen06/13/2021 11:32:15Commented Jun 13, 2021 at 11:32
-
1IMO, with some exceptions, I don't think you can give out design advice by talking about 13 line snippets of code. I think for this sort of thing you have to take into account the state of the whole system, where you expect it to be later, and how to not break backwards compatibility. Also... LoC is a very weak metric, and worse, and Uncle Bob takes a million dollar concept ("more than one reason to change") and uses it to solve a 1 penny problem (remove a switch statement); if you can answer the first super hard question, you probably don't need him telling you to remove switch statements.jrh– jrh06/13/2021 14:12:55Commented Jun 13, 2021 at 14:12
7 Answers 7
Martin's concept of "do one thing" is overly ambiguous to the point I believe it does more harm than good.
In the passage Martin states that a switch does one thing per case and therefore by definition does N things. This implies that any single method call is "one thing". If you follow this thinking, you will quickly realize a program will never be able to do anything!
Martin has a different definition which is that a method does "one thing" when all operations are on the same abstraction level. But the cases here calculateCommissionedPay()
, calculateHourlyPay()
do seem to be on the same abstraction level, so this contradicts his general criticism of switch-statments as always doing N-things.
That said, there are reasons to avoid this particular use of a switch. The switch check a type-field and then selects the method to call based on that. The idiomatic way to do this in object oriented programming is to use subclasses and overloading. Employee
could have subclasses HourlyEmployee
, SalariedEmployee
etc., which override a calculatePay()
method. That way you could avoid the switch altogether and just call e.calculatePay()
.
But if the input really is an enum value as in the example, then you need a way to get the appropriate Employee
-subclass given this enum value. How do you do that? A switch
of course! So you end up with code something like this:
public Employee createEmployee(int employeeType)
throws InvalidEmployeeType {
switch (employeeType) {
case COMMISSIONED:
return new CommissionEmployee(e);
case HOURLY:
return new HourlyEmployee(e);
case SALARIED:
return new SalariedEmployee(e);
default:
throw new InvalidEmployeeType(e.type);
}
}
public Money calculatePay(int employeeType)
throws InvalidEmployeeType {
Employee e = createEmployee(e.type);
return e.calculatePay();
}
You will notice a few things:
- We still have a
switch
which allegedly does "N-things". - The switch will still have to grow when new employment types are added.
- We still have a Open/Closed violation, since adding a new subclass will require us to modify the switch, just as before.
But if there are multiple places in the code were we switch
on the employee type as in the first example, this refactoring is a clear improvement since we now only need one single switch and can use overloading in the other places.
If it is not clear from the above, I reject the premise that switch statements are bad and should be avoided in general. Sometimes it is the right tool for the job. But certain uses of switch is an anti-pattern, for example using switch as a substitute for overloading, when overloading would be more appropriate.
-
2If employees are stored in a database which a field for wage/salary classification, having a function that accepts an employee ID and generates an employee object without regard for the pay classification, and having the payment code use different logic based upon the contents of that field, doesn't seem any worse than requiring that the
getEmployee
function generate objects of different class types based upon the contents of that field.supercat– supercat06/10/2021 19:33:23Commented Jun 10, 2021 at 19:33 -
16You would probably want to do composition or interfacing for this, An employee would have a "Payscale" member which would be used to calculate pay. An employee may be subclassed, but not by "Hourly", "Salaried", "Commissionsed", instead many distinct employee types would impliment the "Employee" interface where they would define their own payscales if such a thing were so unstable between categories. Practically though, it really is composition you would use, and it is the payscale itself that would impliment the "Payscale" interface, and each employee would have a payscale as a member.Krupip– Krupip06/10/2021 19:36:50Commented Jun 10, 2021 at 19:36
-
2+1 because indeed, Martin says a coupe of pages away that "if a function does only those steps that are one level below the stated name of function, then the function is doing one thing." (see here) and this function seems ok according to this criteria. The argument about SRP seems also doubtful, especially in view of his own explanations here. But ok, for OCP he's rightChristophe– Christophe06/10/2021 21:44:36Commented Jun 10, 2021 at 21:44
-
3@T.E.D.: In what way is the switch still there? Overload resolution is not implemented via a switch under the hood, if that is what you mean.JacquesB– JacquesB06/10/2021 22:41:48Commented Jun 10, 2021 at 22:41
-
2For the OO solution, you're assuming that calculatePay belongs on the employee object. It could, but more likely pay isn't a concept of an Employee, and is a concept of a pay subsystem. In which case that logic in no way belongs on the employee class And it would require the instantiator of the class to know which type of employee to instantiate. Depending on the purpose of the program that may be ok, but I'd say more commonly it isn't.Gabe Sechan– Gabe Sechan06/11/2021 03:09:10Commented Jun 11, 2021 at 3:09
The calculatePay
in your question is doing indeed too much:
- It extracts the type of the employee.
- It calls
calculateCommissionedPay
in one case. - And it also calls
calculateHourlyPay
in another case. - It also knows about
calculateSalariedPay
. - And finally it handles the default case by constructing and throwing an exception.
The nice thing is that calculatePay
doesn't need to do all that. If the application was architected correctly, there wouldn't be any Employee.type
. Instead, there would be classes like ComissionedEmployee
, HourlyEmployee
and so on which implement the interface Employee
. The interface would have a calculatePay
method, and the different classes will provide the actual implementation of this method.
This is essentially what Replace conditional with polymorphism section in Refactoring by Martin Fowler is all about: relying on polymorphism instead of picking an execution path based on the type of an object.
This change would reduce calculatePay
to one line:
public Money calculatePay(Employee e) {
return e.calculatePay();
}
Then, it can be removed: the caller of this function can call Employee.calculatePay
method directly.
This is essentially the example you see in Clean code in Listing 3-5, on the next page.
-
35Clearly that is Bob's preferred solution, but in my view his five complaints about that function are rhetoric hyberbole and all boil down do more or less the same thing. Sure, when you can replace a
switch
with dynamic dispatch, yea! - but that isn't always possible, and in many cases a judiciousswitch
can absolutely be the right choice. (Usually when there aren't a lot of other places with the sameswitch
.)Kilian Foth– Kilian Foth06/10/2021 12:31:02Commented Jun 10, 2021 at 12:31 -
2@KilianFoth: agreed,
switch
can be a valid option in other cases (no pun intended). The question, however, is about the specific piece of code. Do you believe that the function is well-written, and that havingcalculateCommissionedPay
andcalculateHourlyPay
andcalculateSalariedPay
is a good choice?Arseni Mourzenko– Arseni Mourzenko06/10/2021 12:53:18Commented Jun 10, 2021 at 12:53 -
16It almost certainly isn't, since Bob introduced it in the context of refactoring, and using dynamic dispatch instead is possible. But you can't really decide this just from the example. "Doing too many things" is not what makes the code bad. I once wrote a 500-case switch (in the middle of a text-to-TeX converter), and any attempt to hide this behind a nice object-oriented hierarchy would only have turned a plodding-but-okay bit of code into a festering monstrosity. The point is really that the code is using
switch
where an alternative solution is clearly superior.Kilian Foth– Kilian Foth06/10/2021 13:17:01Commented Jun 10, 2021 at 13:17 -
3@KorayTugay: If employees of all type are stored in a common database or other such entity, and retrieved via common function, having a single class of employee and a switch in a payment calculation function seems cleaner than requiring that the code which generates an object based upon information in the database include special-case logic to handle different pay classifications.supercat– supercat06/10/2021 19:26:43Commented Jun 10, 2021 at 19:26
-
2@supercat: If you don't want to subclass
Employee
, you could create a separatePayType
class to handle the special-case logic, and have eachEmployee
object contain a reference to the appropriatePayType
.dan04– dan0406/10/2021 23:49:34Commented Jun 10, 2021 at 23:49
Everyone complains about Robert Martins "one thing". Let me try to demystify it.
calculatePay
is one thing. Just the name is enough to set us on the path of expecting one thing. When I look inside this I expect to find things that calculatePay
.
These seem like they're focused on our one thing: calculateCommissionedPay(e)
,
calculateHourlyPay(e)
, and calculateSalariedPay(e)
and indeed they help us get there but these are not focused on the whole of our one thing. They are smaller cases of it. They're details. We just want to calculate pay. We don't care why you're payed what you're payed. We just want to calculate it.
So what if e
had e.wages()
and e.bonus()
hanging off of it? Could we add them together here or are those evil details? Well when wouldn't you add them together? This isn't some weird part of calculatePay
. This is calculatePay
.
One thing isn't something that can't be decomposed. That's what's really wrong with the example Mr. Martin gives. It makes it seem like each function should only be one line. Which is nonsense. One thing means this code should tell me one simple understandable story without dragging me through weird cases and up and down through wildly different abstractions. Do the work it takes to make the function simple enough to look like it's only doing one thing. Even if e.wages()
is hiding tons of polymorphism, nasty switch cases, or 100 different levels of object oriented abstraction. At least here it's just e.wages()
that you can add to e.bonus()
, whatever that really is.
-
5"One thing isn't something that can't be decomposed", that's on the spot! The issue with "one thing" is that "thing" is ambiguous and it's rarely "one". The accounting point of view counting the operations gets us lost in details when doing functional decomposition, whereas he could simply speak about one goal.Christophe– Christophe06/10/2021 21:54:28Commented Jun 10, 2021 at 21:54
-
3If you were only allowed to call methods that did the whole of the one thing, you'd never be able to write any useful methods at all! The longest method you could write would be
Money calculatePay(Employee e) {return internalCalculatePay(e);}
but then how would you write internalCalculatePay? - you'd have the same problem!Stack Exchange Broke The Law– Stack Exchange Broke The Law06/11/2021 21:06:53Commented Jun 11, 2021 at 21:06 -
1@user253751 yes that's exactly what I'm pointing out. But Uncle Bob never said you couldn't decompose. What he did was give us an example that made people think that. Which is why we keep talking about this.candied_orange– candied_orange06/11/2021 21:09:30Commented Jun 11, 2021 at 21:09
-
Martin = Robert C. Martin = Uncle Bob (not Martin Fowler).Peter Mortensen– Peter Mortensen06/12/2021 19:37:06Commented Jun 12, 2021 at 19:37
Uncle Bob's criticisms are mostly valid, but his proposal to make payment calculation a virtual method tied to the employment type makes the code less flexible.
Having the pay calculation depend on a piece of data rather than on type, makes it easier to convert employees. For example, if a commissioned sales person gets promoted to a salaried management position, do you really want to construct a new employee and destroy the old one? Or would you rather change one field in the existing employee record.
Better Option 1 -- Composition
Instead of coupling the payment algorithm to the employee type, tie it to the employee's wage type. This still solves the problem with polymorphism, but it keys off of mutable data field rather than coupling it to an employee type (which is much less malleable).
class WageStrategy {
public:
virtual Money CalculatePay(const Employee &e) const = 0;
};
class SalaryWages : public WageStrategy {
public:
Money CalculatePay(const Employee &e) const override;
}
class HourlyWages : public WageStrategy {
public:
Money CalculatePay(const Employee &e) const override;
}
class CommissionWages : public WageStrategy {
public:
Money CalculatePay(const Employee &e) const override;
}
class Employee {
public:
Money CalculatePay() const {
return m_wages.CalculatePay(*this);
}
private:
WageStrategy * m_wages;
};
Better Option 2 -- Parameterization
Is it worth having independent implementations of wage calculations? If the company comes up with a new compensation plan (e.g., small base salary plus commissions), so you really want to create another type? Do you like having to search all over the code to know all the different ways wages can be computed? What if we just parameterized the business logic like this:
Money CalculatePay(const Employee &e) {
Money pay = 0;
pay += e.GetBaseSalary();
for (const auto &sale : e.GetSales()) {
pay += e.GetCommissionRate() * sale.GetTotal();
}
pay += e.GetHoursWorked() * e.GetHourlyRate();
return pay;
}
No switch statements. No polymorphism. Data-driven so you can effectively create new pay strategies (especially hybrid ones) without code changes--that's even better than open-closed. A single path of execution provides clarity and simple code coverage testing.
-
Totally agree, option 1 would be my solution of choice. One should favor composition over inheritance and having different approaches to solve the same problem (calculate salary) screams strategy pattern.Thomas Hilbert– Thomas Hilbert06/12/2021 00:13:22Commented Jun 12, 2021 at 0:13
-
Is the plus in
pay += e.GetBaseSalary();
just for consistency? Or is there some subtle design-principle involved? Maybe in case a line is added before that line?Cyclops– Cyclops06/13/2021 16:32:01Commented Jun 13, 2021 at 16:32 -
@Cyclops: Yes. It also allows the lines to be re-ordered without having to make subtle changes for correctness.Adrian McCarthy– Adrian McCarthy06/13/2021 20:18:37Commented Jun 13, 2021 at 20:18
While I generally don't agree with Martin on this point, here's one way of looking at the situation which might be useful:
"This piece of logic acts as a gatekeeper." In an out-of-the-way place where you might never think to look for it.
This piece of logic, "although it is not part of an Employee
object," is nonetheless cognizant of the employee-type, such that it determines which one of several "fairly beefy" subroutines are to be called.
So – if you're "chasing a bug," and you think that it must be located in one of these routines, there's one critical piece of information that you might be missing: "under what conditions are they actually called?" This piece of logic ties all of this to the value of e.type
, but it's external to the place where such logic properly belongs: "the employee
object."
Likewise, these various subroutines really ought to be methods of that same object, so that it's now possible to open "just one source-file ... the one which implements this object ... and to see everything that you need to know to understand it all, and to know that you have actually done so."
-
Martin = Robert C. Martin = Uncle Bob (not Martin Fowler).Peter Mortensen– Peter Mortensen06/12/2021 19:41:21Commented Jun 12, 2021 at 19:41
I understand where you're coming from, but in this particular case, you're focusing on the wrong thing. The main point he's trying to illustrate in this section is that this should be treated as a code smell (a strong indication of possible unwanted coupling) because there's likely a bunch of other functions in the system that have (literally or conceptually) the same switch statement in them. He lists, as a hypothetical example, isPayday(Employee e, Date date)
, and deliverPay(Employee e, Money pay)
, stating that there are other functions like that in the codebase. They all switch on employee.
This kind of coupling is extremely common in the real world.
And it's bad because you have to hunt down and change all of them if, say, you need to introduce a new case (or if you face any kind of change that requires the same kind of behavior to be repeated in all of these places). And nothing guarantees that these changes won't propagate outside of those functions. This is how you end up having to modify 20 different files because of a "simple" change.
The main point is to DRY out this (literal or conceptual) repetition; the book goes on to suggest to solve this by confining the switch statement to a single factory that returns a polymorphic result.
The "does more than one thing" is an offhand remark in this section of the book, probably best interpreted in the context of this statement at the start of the section: "By their nature, switch statements always do N things" - presumably because of their N cases.
So, sometimes it calculates a commissioned pay, sometimes a hourly pay, and sometimes a salaried pay. Building upon our discussion here (about the idea that implementations shouldn't be jumping levels of abstraction), while these are all at the same level of abstraction, they are conceptually distinct things: they aren't a group of functions that together achieve one overall task (they don't form steps of an overall operation), instead they are largely operationally unrelated, operating on a case-by-case basis1. As Daniel Wagner noticed in a comment to your question: "$/hour, $/year, and $/project are super meaningfully different"; these two problems are connected.
1 The concept of single responsibility is a rule of thumb intimately associated with the concept of cohesion. Cohesion is a measure of how closely related things in a module (function, class, component, ...) are. What you want is internally strongly cohesive, but externally loosely coupled modules (functions, classes, components, ...). If elements are not strongly related (and change for different reasons), as the codebase evolves, you (1) get internal coupling that makes changing each thing independently hard, but also (2) it's likely that elements that are related to each thing in that particular module were spread throughout several other modules (so, there's external coupling). The strongest form of cohesion is when there's an operational relationship - a number of functions that combine to accomplish a single overall task. Here you don't have that, you have logical relatedness instead (they all do the same sort of thing), and that's not a strong form of cohesion.
Quote from (a):
Another wording for the Single Responsibility Principle is:
Gather together the things that change for the same reasons. Separate those things that change for different reasons.
Sources:
(a) The Clean Code Blog: The Single Responsibility Principle,
(b) SRP: The Single Responsibility Principle (incidentally, this is the broken link from the book)
Depending on the language, you can't always fix all of the code smells, but sticking a switch statement into a factory turns this from a design that causes a proliferation of the same switch statement to something that converts a key into a specialized object focused only on one particular case, an object that you can pass around your application for later use by more self-contained components.
P.S. "Second, it very clearly does more than one thing." - I guess another takeaway from this is, if you're trying to explain something to other people (or if you're writing a book), never assume that what's clear to you is also obvious to everyone else :D
I agree whith those answers who point out that this specific case would be handled better by using inheritance.
But on the other hand you can't rely always on inheritance. I disagree with the idea that the switch statement violates the SRP. The switch statement acts as a dispatcher to the proper action when the options are not just black and white. I don't know what was written in the book, if it was a warning against placing a lot of code within the case block instead of a single call that would be understandable. But criticising the switch only because it has more than one case seems overly fanatic, that would rule out a lot of common practices that provided a lot of clean code for many years. What would the propoents of this interpretation think of the Scala style pattern matching?