Printing removed items using lambdas and streams

Question 1

I'm struggling to make my lambdas readable.

I've seen various approaches. Below are three different examples that all do the same thing.

Forgive the example: I suspect there are better actual solutions to this problem. I'm only concerned with readability (and ultimately supportability).

/** Output any items that are present in "existing" but not in "updated" */
private static void outputRemovedItems(List<String> existing, List<String> updated) {
 // #1 simple for-loop
 for (String item : existing) {
 if (!updated.contains(item)) {
 System.out.println("item " + item + " removed");
 }
 }
 // #2 fluid lambdas
 existing.stream()
 .filter(item -> !updated.contains(item))
 .forEach(item -> System.out.println("item " + item + " removed"));
 // #3 awful-looking middle-ground: functions split out
 Predicate<String> onlyRemovedItems = item -> !updated.contains(item);
 Consumer<String> outputItems = item -> System.out.println("item " + item + " removed");
 existing.stream().filter(onlyRemovedItems).forEach(outputItems);
}

Personally, I find the lambda approaches more difficult to read than the simple for-loop: they make me think too much, which detracts from the readability. So this is stopping me from using them in actual production code. The more complex the operation, the worse my lambdas seem to look!

Does anyone have any tips to make my lambdas more readable?

Or.. do people from a functional programming background actually find the lambdas above easier to read than the loop?

Question 2

Welcome to Code Review. I hope you get some great answers. I personally prefer #2, I thought a big point of lambdas was to focus on the "what" instead of the "how." filter and forEach are pretty intuitive, names, even without being familiar with lambdas one could guess that something is being filtered out, and then something is being done for each member remaining. Or in this case, not remaining given the NOT.

Question 3

For me approach number 2 is fine. if you finds it difficult to read, try to read it as a sentence not as code.

Question 4

Your "fluid" example is fine, well structured, the newlines are consistent, and is in line with code styles that I have seen and like. It is perhaps a bit too soon to say it follows "best practice", but it does look right.

// #2 fluid lambdas
existing.stream()
 .filter(item -> !updated.contains(item))
 .forEach(item -> System.out.println("item " + item + " removed"));

The "middle-ground" approach you have is immediately "identical" to the fluid approach, but in this context is less readable, even if functionally identical. It does have advantages when your function is, for example, a method parameter, or an injected constant (like a custom comparator, or something). Since the code is all localized to a single method, though, the need to separate the declaration and use of the function is simply not there.

My only concern with your fluid approach, and it is a small one, is that you reuse the item placeholder in two contexts. While it is logical, in this case, to use item, I find it is typically better to have unique variable names at each stage in the pipeline, perhaps:

// #2 fluid lambdas
existing.stream()
 .filter(item -> !updated.contains(item))
 .forEach(removed -> System.out.println("item " + removed + " removed"));

Additionally, using printf helps too:

 .forEach(removed -> System.out.printf("item %s removed\n", removed));

One lase comment, I have found that the more I use Java 8 features the easier it gets to establish the right mindset when reading the code. You may just find that it grows on you, and your statement "I find the lambda approaches more difficult to read than the simple for-loop" is no longer true. Also, consider this:

// #2 fluid lambdas
existing.stream()
 .parallel()
 .filter(item -> !updated.contains(item))
 .forEach(removed -> System.out.printf("item %s removed\n", removed));

and consider doing that in "the simple for loop".

Question 5

If you're using a parallel stream, you should use forEachOrdered for a System.out, since forEach would be called in multiple threads and without respect to the original ordering.

Question 6

@Hardcoded - that is true, I assume that the order of reporting is not significant, and, the printf is synchronized already, so is thread-safe. The bulk of the work will potentially be in the contains(...) call depending on the size of updated. Regardless, you are right, that using forEachOrdered will report the results in the same order as a sequential stream.

Question 7

One should emphasize that the perceived readability largely depends on how familiar the reader is with certain concepts. Most programmers know the good, old for loop that has basically been present since the time when programs have been delivered as punched cards. In contrast to that, someone who has been programming Lisp or Haskell in the past 20 years may consider even the most complex Java lambda as a "toy example".

The advantage of easier parallelizability of the lambdas has already been pointed out in the other answer. But the actual crux is the transition from external iteration to internal iteration: Leaving the decision of how to iterate a set of elements to the container implementation offers new degrees of freedom for various sorts of optimization (search for "Pipelined Execution", for example, in this tutorial).

This leads to your actual question. The example is a bit contrived, and may be too simple to really see the benefits of using stream operations with lambdas. This is also why the iterative version with the for loop is certainly appropriate here. What remains open is the question about how to keep things readable for the cases where lambdas are more appropriate.

The fluid version

// #2 fluid lambdas
existing.stream()
 .filter(item -> !updated.contains(item))
 .forEach(item -> System.out.println("item " + item + " removed"));

is the intended and most idiomatic way to write this. However, for more complex lambda expressions, I think there's nothing wrong with giving names to lambdas, probably in the form of local variables or static helper methods.

This might seem like a step towards the "awful looking middle ground" that you sketched. But in this case, it is only looking awful due to the code bloat that is caused here: You have to specify the types of the lambdas, and in fact, the type declaration causes more code than the actual lambda itself. So you should probably not extract a lambda like

Predicate<String> onlyRemovedItems = item -> !updated.contains(item);

But if the condition was more complicated, it could be beneficial: Imagine that you wanted to use such a more complex lambda as a "building block" at several places, in various other stream operations. Of course, you always could "inline" the whole lambda into one large, deeply nested lambda that solely resides at the call site. But this hinders readability, and extracting it would allow you to use it conveniently, probably via method references.

(An extreme example where I implemented two versions - once "inlined", and once with dedicated, easily readable names - can be seen in https://codereview.stackexchange.com/a/79082/41175 )

Question 8

Here's a useful link illustrating 'external vs internal iterations' too...

rolfl rolfl 98.1k17 gold badges219 silver badges419 bronze badges · Answer 1 · 2015-03-25 10:38:52Z

Your "fluid" example is fine, well structured, the newlines are consistent, and is in line with code styles that I have seen and like. It is perhaps a bit too soon to say it follows "best practice", but it does look right.

// #2 fluid lambdas
existing.stream()
 .filter(item -> !updated.contains(item))
 .forEach(item -> System.out.println("item " + item + " removed"));

The "middle-ground" approach you have is immediately "identical" to the fluid approach, but in this context is less readable, even if functionally identical. It does have advantages when your function is, for example, a method parameter, or an injected constant (like a custom comparator, or something). Since the code is all localized to a single method, though, the need to separate the declaration and use of the function is simply not there.

My only concern with your fluid approach, and it is a small one, is that you reuse the item placeholder in two contexts. While it is logical, in this case, to use item, I find it is typically better to have unique variable names at each stage in the pipeline, perhaps:

// #2 fluid lambdas
existing.stream()
 .filter(item -> !updated.contains(item))
 .forEach(removed -> System.out.println("item " + removed + " removed"));

Additionally, using printf helps too:

 .forEach(removed -> System.out.printf("item %s removed\n", removed));

One lase comment, I have found that the more I use Java 8 features the easier it gets to establish the right mindset when reading the code. You may just find that it grows on you, and your statement "I find the lambda approaches more difficult to read than the simple for-loop" is no longer true. Also, consider this:

// #2 fluid lambdas
existing.stream()
 .parallel()
 .filter(item -> !updated.contains(item))
 .forEach(removed -> System.out.printf("item %s removed\n", removed));

and consider doing that in "the simple for loop".

If you're using a parallel stream, you should use forEachOrdered for a System.out, since forEach would be called in multiple threads and without respect to the original ordering.
@Hardcoded - that is true, I assume that the order of reporting is not significant, and, the printf is synchronized already, so is thread-safe. The bulk of the work will potentially be in the contains(...) call depending on the size of updated. Regardless, you are right, that using forEachOrdered will report the results in the same order as a sequential stream.

Marco13 Marco13 8314 silver badges8 bronze badges · Answer 2 · 2015-03-25 14:32:47Z

One should emphasize that the perceived readability largely depends on how familiar the reader is with certain concepts. Most programmers know the good, old for loop that has basically been present since the time when programs have been delivered as punched cards. In contrast to that, someone who has been programming Lisp or Haskell in the past 20 years may consider even the most complex Java lambda as a "toy example".

The advantage of easier parallelizability of the lambdas has already been pointed out in the other answer. But the actual crux is the transition from external iteration to internal iteration: Leaving the decision of how to iterate a set of elements to the container implementation offers new degrees of freedom for various sorts of optimization (search for "Pipelined Execution", for example, in this tutorial).

This leads to your actual question. The example is a bit contrived, and may be too simple to really see the benefits of using stream operations with lambdas. This is also why the iterative version with the for loop is certainly appropriate here. What remains open is the question about how to keep things readable for the cases where lambdas are more appropriate.

The fluid version

// #2 fluid lambdas
existing.stream()
 .filter(item -> !updated.contains(item))
 .forEach(item -> System.out.println("item " + item + " removed"));

is the intended and most idiomatic way to write this. However, for more complex lambda expressions, I think there's nothing wrong with giving names to lambdas, probably in the form of local variables or static helper methods.

This might seem like a step towards the "awful looking middle ground" that you sketched. But in this case, it is only looking awful due to the code bloat that is caused here: You have to specify the types of the lambdas, and in fact, the type declaration causes more code than the actual lambda itself. So you should probably not extract a lambda like

Predicate<String> onlyRemovedItems = item -> !updated.contains(item);

But if the condition was more complicated, it could be beneficial: Imagine that you wanted to use such a more complex lambda as a "building block" at several places, in various other stream operations. Of course, you always could "inline" the whole lambda into one large, deeply nested lambda that solely resides at the call site. But this hinders readability, and extracting it would allow you to use it conveniently, probably via method references.

(An extreme example where I implemented two versions - once "inlined", and once with dedicated, easily readable names - can be seen in https://codereview.stackexchange.com/a/79082/41175 )

Here's a useful link illustrating 'external vs internal iterations' too...

Stack Exchange Network

Printing removed items using lambdas and streams

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Printing removed items using lambdas and streams

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions