How would you refactor nested IF Statements? [duplicate]

Question 1

I was cruising around the programming blogosphere when I happened upon this post about GOTO's:

http://giuliozambon.blogspot.com/2010/12/programmers-tabu.html

Here the writer talks about how "one must come to the conclusion that there are situations where GOTOs make for more readable and more maintainable code" and then goes on to show an example similar to this:

if (Check#1)
{
 CodeBlock#1
 if (Check#2)
 {
 CodeBlock#2
 if (Check#3)
 {
 CodeBlock#3
 if (Check#4)
 {
 CodeBlock#4
 if (Check#5)
 {
 CodeBlock#5
 if (Check#6)
 {
 CodeBlock#6
 if (Check#7)
 {
 CodeBlock#7
 }
 else
 {
 rest - of - the - program
 }
 }
 }
 }
 }
 }
}

The writer then proposes that using GOTO's would make this code much easier to read and maintain.

I personally can think of at least 3 different ways to flatten it out and make this code more readable without resorting to flow-breaking GOTO's. Here are my two favorites.

1 - Nested Small Functions. Take each if and its code block and turn it into a function. If the boolean check fails, just return. If it passes, then call the next function in the chain. (Boy, that sounds a lot like recursion, could you do it in a single loop with function pointers?)

2 - Sentinal Variable. To me this is the easyest. Just use a blnContinueProcessing variable and check to see if it is still true in your if check. Then if the check fails, set the variable to false.

How many different ways can this type of coding problem be refactored to reduce nesting and increase maintainability?

Question 2

Your code as posted checks for each variable and runs all code blocks up through the highest variable passed, and runs rest-of-the-program iff checks 1 through 6 succeed and 7 fails. Is that what you intended? Just checking.

Question 3

@Crazy...not really. The bottom line is that there is a lot of state being maintained and maintained poorly at that. No matter what is being checked this code still screams state machine.

Question 4

@Walter: I think codereview.SE would be more appropriate than SO.

Question 5

codinghorror.com/blog/2006/01/flattening-arrow-code.html

Question 6

Code blocks like this are quite common in parsers. I have never found a way to make it cleaner - every possible alternative has a large degree of suck-factor.

Question 7

It is really hard to tell without knowing how the different checks interact. Rigorous refactoring might be in order. Creating a topology of objects that execute the correct block depending on their type, also a strategy pattern or state pattern might do the trick.

Without knowing what to do best I would consider two possible simple refactorings that could be further refactored by extracting more methods.

The first one I don't realy like since I always like as litle exit points in a method (preferably one)

if (!Check#1)
{ 
 return;
}
CodeBlock#1
if (!Check#2)
{
 return;
}
CodeBlock#2
...

The second one remove's the multiple returns but also adds a lot of noise. (it basicly only removes the nesting)

bool stillValid = Check#1
if (stillValid)
{
 CodeBlock#1
}
stillValid = stillValid && Check#2
if (stillValid)
{
 CodeBlock#2
}
stillValid = stillValid && Check#3
if (stillValid)
{
 CodeBlock#3
}
...

This last one can be refactored nicely into functions and when you give them good names the result might be reasonable';

bool stillValid = DoCheck1AndCodeBlock1()
stillValid = stillValid && DoCheck2AndCodeBlock2()
stillValid = stillValid && DoCheck3AndCodeBlock3()
public bool DoCheck1AndCodeBlock1()
{
 bool returnValid = Check#1
 if (returnValid)
 {
 CodeBlock#1
 }
 return returnValid
}

All in all there are most likely way better options

Question 8

At this moment example1 exits when check#1 passes... should probably read if(!check#1){return;}codeblock#1;

Question 9

Readability of the first refactoring is unbeatable. It also help reduce unneeded indentation. I use is as much as I can.

Question 10

The first refactoring is not quite correct, you can get to check2 without check1 being true, and it seems that both have to be true.

Question 11

@Inca and @Marcelo Hernández Rishmawy corrected, thanks !

Question 12

That why I love Ruby: return if !check1

Question 13

That is called "Arrow Code" because of the shape of the code with proper indenting.

Jeff Atwood had a good blog post on Coding Horror about how to flatten out the arrows:

Flattening Arrow Code

Read the article for the full treatment, but here are the major points..

Replace conditions with guard clauses
Decompose conditional blocks into seperate functions
Convert negative checks into positive checks
Always opportunistically return as soon as possible from the function

Question 14

+1: Also convert positive checks into negative checks and use early returns (not all coding standards allow this.)

Question 15

Yeah, we called it 'skyscraper code' - if you turn the code sideways, it looks like high rise buildings...

Question 16

@MichaelDurrant also known as the pharaonic developer's pyramid.

Question 17

I know some people will argue that it's a goto, but return; is the obvious refactoring, i.e.

if (!Check#1)
{ 
 return;
}
CodeBlock#1
if (!Check#2)
{
 return;
}
CodeBlock#2
.
.
.
if (Check#7)
{
 CodeBlock#7
}
else
{
 rest - of - the - program
}

If it really is just a bunch of guard checks before running the code, then this works fine. If it's more complicated than that, then this will only make it a bit simpler, and you need one of the other solutions.

Question 18

This is arrow code flattening through early return statements. And these early return statements are usually error case detection conditions, until the very last piece of code, which is then the actual real business logic of this function. If these techniques were taught to programmers since they start with any O.O. language, the world would be a much better place to live.

Question 19

That spaghetti code seems like the perfect candidate for refactoring it into a state machine.

Question 20

Oh, a state machine ... I like that idea. In a way, the code is a Deterministic finite-state machine (DFA)

Question 21

This may already have been mentioned, but my "go-to" (pun intended) answer to the "arrowhead antipattern" shown here is to reverse the ifs. Test the opposite of the current conditions (easy enough with a not operator) and return from the method if that is true. No gotos (though pedantically speaking, a void return is little more than a jump to the line of calling code, with the trivial extra step of popping a frame off the stack).

Case in point, here's the original:

if (Check#1)
{
 CodeBlock#1
 if (Check#2)
 {
 CodeBlock#2
 if (Check#3)
 {
 CodeBlock#3
 if (Check#4)
 {
 CodeBlock#4
 if (Check#5)
 {
 CodeBlock#5
 if (Check#6)
 {
 CodeBlock#6
 if (Check#7)
 {
 CodeBlock#7
 }
 else
 {
 rest - of - the - program
 }
 }
 }
 }
 }
 }
}

Refactored:

if (!Check#1) return;
CodeBlock#1
if (!Check#2) return;
CodeBlock#2
if (!Check#3) return;
CodeBlock#3
if (!Check#4) return;
CodeBlock#4
if (!Check#5) return;
CodeBlock#5
if (!Check#6) return;
CodeBlock#6
if (Check#7)
 CodeBlock#7
else
{
 //rest of the program
}

At each if, we basically check to see if we should continue. It works exactly the same way as the original with only one level of nesting.

If there's something beyond the end of this snippet that should also be run, extract this code into its own method and call it from wherever this code currently lives, before proceeding to the code that would come after this snippet. The snippet itself is long enough, given sufficient actual LOC in each code block, to justify splitting out several more methods, but I digress.

Question 22

While I don't subscribe to the "only one exit point" philosophy, I believe functions should when practical be divided into three sections; only the middle section should have side-effects, and it should not have any returns. To my mind, a middle-section return is worse than a goto, since an outdented label the latter will all attention to the fact that there's a goto lurking about.

Question 23

"since an outdented label the latter will all attention to the fact that there's a goto lurking about." @supercat what does that mean?

Question 24

@reggaeguitar: Strike "the latter", or else said "Is worse than a goto/label combination, since an outdented label used by the latter"...

Question 25

@supercat: the question is: "lurking around" in a small or in a big sea? The function should not contain hundreds of lines, otherwise it should already have been refactored. Either refactored to other smaller functions, objects, or whatever other clean code design patterns.

Question 26

I still see var ret; do { ... break; ... break; ... } while(false); return ret; being coded around. When combined with if/else pyramids they produce huge functions, which no one will ever have the courage to refactor. When a technology break happens (e.g. for C#: .NET 6, microservices, etc.) absolutely nothing from the old code can be reused nor understood. Then you rely only on documentation, which is usually not updated. The "if pyramids" combined with the "only one return statement allowed" are the worst clean code anti-pattern ever, and the shortest path to a self-destructive system.

Question 27

Using an OO approach a composite pattern where leaf is simple condition and component a union of simple elements make this kind of code extensible and adaptable

Question 28

I have no idea what this means. What is a leaf in this contex? Union of what?

Question 29

Then I recommend reading more about design-patterns, it can only improve your skills: en.wikipedia.org/wiki/Composite_pattern

Question 30

If you've routinely got logic that actually requires this pyramid of if checks, you are probably (metaphorically) using a wrench to hammer nails. You'd be better served doing that kind of twisted, complicated logic in a language that supports that kind of twisted and complicated logic with better structures than linear if/else if/else-style constructs.

Languages that might be better-suited to this kind of structure could include SNOBOL4 (with its bizarre dual-GOTO style branching) or logic languages like Prolog and Mercury (with their unification and backtracking capabilities, not to mention their DCGs for rather succinct expression of complicated decisions).

Of course if this is not an option (because most programmers are, tragically, not polyglots) the ideas others have come up with are good like using various OOP structures or breaking up the clauses into functions or even, if you're desperate, and don't mind the fact that most people find them unreadable, using a state machine.

My solution, however, remains reaching for a language that permits you to express what logic you're trying to express (assuming this is commonplace in your code) in an easier, more readable fashion.

Question 31

Good luck finding other developers that know SNOBOL4, Prolog and Mercury

Question 32

From here:

Quite often when I'm looking at a set of pac-man ifs I find that if I just draw out something like a truth table of all the conditions involved I can work out a much better route to resolving the problem.

That way you can also assess whether there is a better method, how you might break it down further and ( and this is a big one with this kind of code ) whether there are any holes in the logic.

Having done that you can probably break it down into a couple of switch statements and a couple of methods and save the next poor mook who has to go through the code a whole lot of problems.

Question 33

Personally, I like wrapping these if statements into separate functions which return a bool if the function succeeds.

The structure looks as follows:


if (DoCheck1AndCodeBlock1() && 
 DoCheck2AndCodeBlock2() && 
 DoCheck3AndCodeBlock3()) 
{
 // ... you may perform the final operation here ....
}

The only drawback is that these functions will have to output data using attributes passed by reference or member variables.

Question 34

It depends a lot on the size of each code block and the total size, but I tend to split either by a straight "extract method" on a block, or by moving the unconditional parts into separate methods. But the caveats @KeesDijk mentioned apply. The former approach gives you a series of functions each like

if (Check#1)
{
 CodeBlock#1
 NextFunction
}

Which can work well but does lead to code bloat and the "method only used once" antipattern. So I generally prefer this approach:

if (CheckFunctionOne)
{
 MethodOneWithDescriptiveName
 if (CheckFunctionTwo)
 {
 MethodTwoWithDescriptiveName
 ....

With appropriate use of private variables and parameter passing it can be done. Having glanced at literate programming can help here.

Question 35

If you want a possible object orientated solution, the code looks like a canonical example for refactoring using the chain of responsibility design pattern.

KeesDijk KeesDijk 8,9684 gold badges37 silver badges41 bronze badges · Accepted Answer · 2011-02-14 21:06:12Z

It is really hard to tell without knowing how the different checks interact. Rigorous refactoring might be in order. Creating a topology of objects that execute the correct block depending on their type, also a strategy pattern or state pattern might do the trick.

Without knowing what to do best I would consider two possible simple refactorings that could be further refactored by extracting more methods.

The first one I don't realy like since I always like as litle exit points in a method (preferably one)

if (!Check#1)
{ 
 return;
}
CodeBlock#1
if (!Check#2)
{
 return;
}
CodeBlock#2
...

The second one remove's the multiple returns but also adds a lot of noise. (it basicly only removes the nesting)

bool stillValid = Check#1
if (stillValid)
{
 CodeBlock#1
}
stillValid = stillValid && Check#2
if (stillValid)
{
 CodeBlock#2
}
stillValid = stillValid && Check#3
if (stillValid)
{
 CodeBlock#3
}
...

This last one can be refactored nicely into functions and when you give them good names the result might be reasonable';

bool stillValid = DoCheck1AndCodeBlock1()
stillValid = stillValid && DoCheck2AndCodeBlock2()
stillValid = stillValid && DoCheck3AndCodeBlock3()
public bool DoCheck1AndCodeBlock1()
{
 bool returnValid = Check#1
 if (returnValid)
 {
 CodeBlock#1
 }
 return returnValid
}

All in all there are most likely way better options

At this moment example1 exits when check#1 passes... should probably read if(!check#1){return;}codeblock#1;
Readability of the first refactoring is unbeatable. It also help reduce unneeded indentation. I use is as much as I can.
The first refactoring is not quite correct, you can get to check2 without check1 being true, and it seems that both have to be true.

Stack Exchange Network

How would you refactor nested IF Statements? [duplicate]

11 Answers 11

Linked

Hot Network Questions

How would you refactor nested IF Statements? [duplicate]

11 Answers 11

Linked

Related

Hot Network Questions