I'm trying to learn the Java 8 Stream API, and while converting some existing code to practice, I ran into a problem. How can I convert the following code to stream style?
/*
 * input example:
 * [
 *   { "k1": {"kk1": 1,  "kk2": 2},  "k2": {"kk1": 3,  "kk2": 4} },
 *   { "k1": {"kk1": 10, "kk2": 20}, "k2": {"kk1": 30, "kk2": 40} }
 * ]
 * output:
 * { "k1": {"kk1": 11, "kk2": 22}, "k2": {"kk1": 33, "kk2": 44} }
 */
private static Map<String, Map<String, Long>> mergeMapsValue(List<Map<String, Map<String, Long>>> valueList) {
    Set<String> keys_1 = valueList.get(0).keySet();
    Set<String> keys_2 = valueList.get(0).entrySet().iterator().next().getValue().keySet();
    Map<String, Map<String, Long>> result = new HashMap<>();
    for (String k1 : keys_1) {
        result.put(k1, new HashMap<>());
        for (String k2 : keys_2) {
            long total = 0;
            for (Map<String, Map<String, Long>> mmap : valueList) {
                Map<String, Long> m = mmap.get(k1);
                if (m != null && m.get(k2) != null) {
                    total += m.get(k2);
                }
            }
            result.get(k1).put(k2, total);
        }
    }
    return result;
}
Comments:
- So all maps are the same, i.e. have the same keys at both levels? – Boris the Spider, Jun 28, 2016 at 9:19
- This is not a code translation service. You need to show us what you have already tried so we can tell you what you are doing wrong. – explv, Jun 28, 2016 at 9:32
- You should rethink your original approach first, i.e. what to iterate over in the outer loop and what in the inner loop. – Holger, Jun 28, 2016 at 9:46
- @Holger I know I should think first, but I totally can't find a way to finish it. – yunfan, Jun 28, 2016 at 10:06
3 Answers
The trick here is to collect the inner maps correctly. The workflow is:
- Flat-map the List<Map<String, Map<String, Long>>> into a stream of map entries, Stream<Map.Entry<String, Map<String, Long>>>.
- Group by the key of each of those entries and, for the values mapped to the same key, merge the inner maps together.

Collecting maps by merging them would ideally warrant a flatMapping collector, which unfortunately doesn't exist in Java 8, although it will exist in Java 9 (see JDK-8071600). For Java 8, it is possible to use the one provided by the StreamEx library (use MoreCollectors.flatMapping in the following code).
private static Map<String, Map<String, Long>> mergeMapsValue(List<Map<String, Map<String, Long>>> valueList) {
    return valueList.stream()
        .flatMap(e -> e.entrySet().stream())
        .collect(Collectors.groupingBy(
            Map.Entry::getKey,
            Collectors.flatMapping(
                e -> e.getValue().entrySet().stream(),
                Collectors.<Map.Entry<String,Long>,String,Long>toMap(Map.Entry::getKey, Map.Entry::getValue, Long::sum)
            )
        ));
}
Without using this convenient collector, we can still build our own with equivalent semantics:
private static Map<String, Map<String, Long>> mergeMapsValue2(List<Map<String, Map<String, Long>>> valueList) {
    return valueList.stream()
        .flatMap(e -> e.entrySet().stream())
        .collect(Collectors.groupingBy(
            Map.Entry::getKey,
            Collector.of(
                HashMap::new,
                (r, t) -> t.getValue().forEach((k, v) -> r.merge(k, v, Long::sum)),
                (r1, r2) -> { r2.forEach((k, v) -> r1.merge(k, v, Long::sum)); return r1; }
            )
        ));
}
4 Comments
- Long::sum - I keep forgetting that exists. I think your Stream approach is better overall as it doesn't produce intermediate List results - although both yours and mine are completely illegible. I would advocate for the foreach with Java 8 Map methods approach...
- The last function passed to the Collector is the "combiner" - this will only be used if the stream is run in parallel for sufficiently large datasets, and will combine the results from different threads.

As a starting point, converting to use computeIfAbsent and merge gives us the following:
private static <K1, K2> Map<K1, Map<K2, Long>> mergeMapsValue(List<Map<K1, Map<K2, Long>>> valueList) {
    final Map<K1, Map<K2, Long>> result = new HashMap<>();
    for (final Map<K1, Map<K2, Long>> map : valueList) {
        for (final Map.Entry<K1, Map<K2, Long>> sub : map.entrySet()) {
            for (final Map.Entry<K2, Long> subsub : sub.getValue().entrySet()) {
                result.computeIfAbsent(sub.getKey(), k1 -> new HashMap<>())
                      .merge(subsub.getKey(), subsub.getValue(), Long::sum);
            }
        }
    }
    return result;
}
This removes much of the logic from your inner loop.
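To make the behavior concrete, here is a minimal, self-contained sketch (class and helper names are illustrative, not from the original posts) that runs the computeIfAbsent/merge version against the sample data from the question:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeDemo {

    // Same merge as in the answer above: for every (k1, k2) pair found in any
    // map of the list, sum the innermost Long values into the result.
    static <K1, K2> Map<K1, Map<K2, Long>> mergeMapsValue(List<Map<K1, Map<K2, Long>>> valueList) {
        final Map<K1, Map<K2, Long>> result = new HashMap<>();
        for (final Map<K1, Map<K2, Long>> map : valueList) {
            for (final Map.Entry<K1, Map<K2, Long>> sub : map.entrySet()) {
                for (final Map.Entry<K2, Long> subsub : sub.getValue().entrySet()) {
                    result.computeIfAbsent(sub.getKey(), k1 -> new HashMap<>())
                          .merge(subsub.getKey(), subsub.getValue(), Long::sum);
                }
            }
        }
        return result;
    }

    // Builds the question's sample input.
    static List<Map<String, Map<String, Long>>> sampleInput() {
        Map<String, Map<String, Long>> first = new HashMap<>();
        first.put("k1", inner(1, 2));
        first.put("k2", inner(3, 4));
        Map<String, Map<String, Long>> second = new HashMap<>();
        second.put("k1", inner(10, 20));
        second.put("k2", inner(30, 40));
        return Arrays.asList(first, second);
    }

    static Map<String, Long> inner(long kk1, long kk2) {
        Map<String, Long> m = new HashMap<>();
        m.put("kk1", kk1);
        m.put("kk2", kk2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> merged = mergeMapsValue(sampleInput());
        System.out.println(merged.get("k1").get("kk1")); // 11
        System.out.println(merged.get("k2").get("kk2")); // 44
    }
}
```

This matches the expected output in the question: k1 becomes {kk1=11, kk2=22} and k2 becomes {kk1=33, kk2=44}.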
The code below is wrong; I leave it here for reference.
Converting to the Stream API is not going to make it neater, but let's give it a go.
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;
private static <K1, K2> Map<K1, Map<K2, Long>> mergeMapsValue(List<Map<K1, Map<K2, Long>>> valueList) {
    return valueList.stream()
            .flatMap(v -> v.entrySet().stream())
            .collect(groupingBy(Entry::getKey, collectingAndThen(mapping(Entry::getValue, toList()), l -> l.stream()
                    .reduce(new HashMap<>(), (l2, r2) -> {
                        r2.forEach((k, v) -> l2.merge(k, v, Long::sum));
                        return l2;
                    }))));
}
This is what I've managed to come up with - it's horrible. The problem is that with the foreach approach, you have a reference to each level of the iteration - this makes the logic simple. With the functional approach, you need to consider each folding operation separately.
How does it work?
We first stream() our List<Map<K1, Map<K2, Long>>>, giving a Stream<Map<K1, Map<K2, Long>>>. Next we flatMap each element, giving a Stream<Entry<K1, Map<K2, Long>>> - so we flatten the first dimension. But we cannot flatten further, as we need the K1 value.
So we then use collect(groupingBy) on the K1 value, giving us a Map<K1, SOMETHING> - what is something?
Well, first we use mapping(Entry::getValue, toList()) to give us a Map<K1, List<Map<K2, Long>>>. We then use collectingAndThen to take that List<Map<K2, Long>> and reduce it. Note that this means we produce an intermediate List, which is wasteful - you could get around this by using a custom Collector.
For this we use List.stream().reduce(a, b), where a is the initial value and b is the "fold" operation. a is set to new HashMap<>(), and b takes two values: either the initial value or the result of the previous application of the function, and the current item in the List. So, for each item in the List, we use Map.merge to combine the values.
I would say that this approach is more or less illegible - you won't be able to decipher it in a few hours time, let alone a few days.
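The reduce above is what makes the code wrong, not just ugly: reduce assumes its functions are non-mutating and associative, while this one mutates the identity HashMap. The Stream API's sanctioned way to fold into a mutable container is the three-argument collect(supplier, accumulator, combiner). A sketch of that form applied to merging inner maps (class and method names here are illustrative, not from the answer):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MutableReductionDemo {

    // collect(supplier, accumulator, combiner) creates a fresh container per
    // thread, so mutating it is safe even when the stream runs in parallel.
    static Map<String, Long> mergeInners(List<Map<String, Long>> inners) {
        return inners.stream().collect(
                HashMap::new,                                                 // supplier: fresh accumulator
                (acc, m) -> m.forEach((k, v) -> acc.merge(k, v, Long::sum)),  // accumulator: fold one map in
                (a, b) -> b.forEach((k, v) -> a.merge(k, v, Long::sum)));     // combiner: merge partial results
    }

    public static void main(String[] args) {
        Map<String, Long> m1 = new HashMap<>();
        m1.put("kk1", 1L);
        m1.put("kk2", 2L);
        Map<String, Long> m2 = new HashMap<>();
        m2.put("kk1", 10L);
        m2.put("kk2", 20L);

        Map<String, Long> merged = mergeInners(Arrays.asList(m1, m2));
        System.out.println(merged.get("kk1")); // 11
        System.out.println(merged.get("kk2")); // 22
    }
}
```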
6 Comments
- Don't use reduce when you are modifying the arguments of the function.
- The reduce modifies the map l2 by invoking merge on it for each mapping of r2.
- reduce is meant for immutable items, returning a result combining the two inputs. I mutate the LHS Map given as an input to the combiner - this breaks the contract.

I took the flatMap(e -> e.entrySet().stream()) part from Tunaki, but used a shorter variant for the collector:
Map<String, Integer> merged = maps.stream()
.flatMap(map -> map.entrySet().stream())
.collect(Collectors.toMap(
Map.Entry::getKey, Map.Entry::getValue, Integer::sum));
More elaborate example:
Map<String, Integer> a = new HashMap<String, Integer>() {{
put("a", 2);
put("b", 5);
}};
Map<String, Integer> b = new HashMap<String, Integer>() {{
put("a", 7);
}};
List<Map<String, Integer>> maps = Arrays.asList(a, b);
Map<String, Integer> merged = maps.stream()
.flatMap(map -> map.entrySet().stream())
.collect(Collectors.toMap(
Map.Entry::getKey, Map.Entry::getValue, Integer::sum));
assert merged.get("a") == 9;
assert merged.get("b") == 5;
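This collector merges a single level of maps; for the question's two-level structure, the same toMap pattern still works if the merge function itself merges the colliding inner maps. A sketch of that extension (class and helper names are illustrative, not from the answer):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NestedMergeDemo {

    static Map<String, Map<String, Long>> mergeMapsValue(List<Map<String, Map<String, Long>>> valueList) {
        return valueList.stream()
                .flatMap(map -> map.entrySet().stream())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> new HashMap<>(e.getValue()),   // copy, so the merge never mutates an input map
                        NestedMergeDemo::mergeInner));
    }

    // Merge function for colliding outer keys: sum the inner values key by key.
    static Map<String, Long> mergeInner(Map<String, Long> m1, Map<String, Long> m2) {
        m2.forEach((k, v) -> m1.merge(k, v, Long::sum));
        return m1;
    }

    static Map<String, Long> inner(long kk1, long kk2) {
        Map<String, Long> m = new HashMap<>();
        m.put("kk1", kk1);
        m.put("kk2", kk2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Long>> a = new HashMap<>();
        a.put("k1", inner(1, 2));
        a.put("k2", inner(3, 4));
        Map<String, Map<String, Long>> b = new HashMap<>();
        b.put("k1", inner(10, 20));
        b.put("k2", inner(30, 40));

        Map<String, Map<String, Long>> merged = mergeMapsValue(Arrays.asList(a, b));
        System.out.println(merged.get("k1").get("kk1")); // 11
        System.out.println(merged.get("k2").get("kk2")); // 44
    }
}
```

Copying the inner map in the value mapper keeps the input maps untouched, which the shorter one-level variant above gets for free because Integer values are immutable.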