I have a function that looks up in a list
the elements which share same values in order to group them.
As a fast summary, this should be the input/output:
[A, A', B, C, B', A''] --> [[A, A', A''], [B, B'], [C]]
I do not care about order. It could also produce:
[A, A', B, C, B', A''] --> [[B, B'], [A, A', A''], [C]]
This is my test code to see the output samples:
class A { private Long id; private String name; private Object param; A(Long id, String name, Object param) { this.id = id; this.name = name; this.param = param; } @Override public String toString() { return "{\tid:" + id + ",\tname:" + name + ",\tparam:" + param + "}"; } } public class ListsTest { private final A a1 = new A(1L, "A", 100); private final A a2 = new A(2L, "B", 200); private final A a3 = new A(1L, "A", 300); private final A a4 = new A(1L, "B", 400); @Test public void groupByIdAndName() throws IllegalAccessException, NoSuchFieldException { List<A> aList = List.of(a1, a2, a3, a4); System.out.println("groupByIdAndName"); System.out.println("Input: ---> " + aList); List<List<A>> aGroupedBy = Lists.groupByFields(aList, "id", "name"); System.out.println("Output: --> " + aGroupedBy); System.out.println("------------------------------------------------------------------------"); assertThat(aGroupedBy, is(List.of(List.of(a1, a3), List.of(a2), List.of(a4)))); } @Test public void groupById() throws IllegalAccessException, NoSuchFieldException { List<A> aList = List.of(a1, a2, a3, a4); System.out.println("groupById"); System.out.println("Input: ---> " + aList); List<List<A>> aGroupedBy = Lists.groupByFields(aList, "id"); System.out.println("Output: --> " + aGroupedBy); System.out.println("------------------------------------------------------------------------"); assertThat(aGroupedBy, is(List.of(List.of(a1, a3, a4), List.of(a2)))); } @Test public void groupByName() throws IllegalAccessException, NoSuchFieldException { List<A> aList = List.of(a1, a2, a3, a4); System.out.println("groupByName"); System.out.println("Input: ---> " + aList); List<List<A>> aGroupedBy = Lists.groupByFields(aList, "name"); System.out.println("Output: --> " + aGroupedBy); System.out.println("------------------------------------------------------------------------"); assertThat(aGroupedBy, is(List.of(List.of(a1, a3), List.of(a2, a4)))); } }
Which outputs:
groupById
Input: ---> [{ id:1, name:A, param:100}, { id:2, name:B, param:200}, { id:1, name:A, param:300}, { id:1, name:B, param:400}] Output: --> [[{ id:1, name:A, param:100}, { id:1, name:A, param:300}], [{ id:2, name:B, param:200}], [{ id:1, name:B, param:400}]]
Output: --> [[{ id:1, name:A, param:100}, { id:1, name:A, param:300}, { id:1, name:B, param:400}], [{ id:2, name:B, param:200}]]
groupByIdAndName
Input: ---> [{ id:1, name:A, param:100}, { id:2, name:B, param:200}, { id:1, name:A, param:300}, { id:1, name:B, param:400}]
Output: --> [[{ id:1, name:A, param:100}, { id:1, name:A, param:300}], [{ id:2, name:B, param:200}], [{ id:1, name:B, param:400}]]
groupByName
Input: ---> [{ id:1, name:A, param:100}, { id:2, name:B, param:200}, { id:1, name:A, param:300}, { id:1, name:B, param:400}]
Output: --> [[{ id:1, name:A, param:100}, { id:1, name:A, param:300}], [{ id:2, name:B, param:200}, { id:1, name:B, param:400}]]
The code I developed for it is the following:
public class Lists {
private static <T> Object getObjectFieldsHash(T obj, String... fields) throws NoSuchFieldException, IllegalAccessException {
List<Object> vals = new ArrayList<>();
for (String field: fields) {
Field f = obj.getClass().getDeclaredField(field);
f.setAccessible(true);
vals.add(f.get(obj));
}
return Objects.hash(vals);
}
public static <T> List<List<T>> groupByFields(List<T> objects, String... fields ) throws NoSuchFieldException, IllegalAccessException {
List<List<T>> result = new ArrayList<>(); // Is it possible to create same type of original List instead of always ArrayList?
Map<Object, Integer> indexes = new HashMap<>();
for (T obj: objects) {
Object hash = getObjectFieldsHash(obj, fields);
indexes.computeIfAbsent(hash, (_unused) -> indexes.size()); // How can I remove _unused variable?
Integer nextIndex = indexes.get(hash);
if (nextIndex >= result.size()) { // Maybe I could use ==. Does it improve anything?
result.add(new ArrayList<T>()); // Is it possible to create same type of original List instead of always ArrayList?
}
result.get(nextIndex).add(obj);
}
return result;
}
}
Is there any way to improve this?
I'm thinking of:
- I can pass as argument any subtype of
List
, but I'm always returningArrayList
s. - I was forced to declare the
_unused
variable in mylambda function
, or won't compile, producing error: "Cannot infer functional interface type". - Does using
==
instead of>=
to check if I already created a sublist provide me any kind of improvement? - I named this method
groupByFields
, but I feel like is the opposite of "flatMap" kind functions. Is there any concept which gives me a more standarized method name?
-
1\$\begingroup\$ Prefer method-references over reflection. \$\endgroup\$Rob Audenaerde– Rob Audenaerde2019年01月11日 10:44:34 +00:00Commented Jan 11, 2019 at 10:44
1 Answer 1
object
is easier to read than obj
. values
is easier to read than vals
. Avoid unnecessary abbreviations.
Don’t give things misleading names. You’re calling something a field
when it’s really a fieldName
.
Using a Map
in groupByFields
is a good idea, but you’re making it a lot harder than it needs to be by sticking everything in a list right away. Just keep the hash as the key and a set of objects as the value.
There’s a good chance this is a big performance sink because of how much reflection you’re doing. Rather than reflect on every instance of every object, just do the reflection work once.
Your getObjectsFieldHash
method is finding the has of a List of values, rather than on the actual values. Is that intentional?
To answer your questions,
Is it possible to create same type of original List instead of always ArrayList?
Yes, but why? This method should be a lot more generic - it can handle arbitrary collections unless you need your grouping function to be stable. You should be doing most of your work behind the Collections API interfaces. If you really need a specific concrete type yourself, it's preferable to make that type and dump the existing collection into it via a constructor. That will copy all the members over.
How can I remove _unused variable?
You can’t. computeIfAbsent
requires a function that takes in a key.
Maybe I could use ==. Does it improve anything?
It might affect performance in some trivial way, but you’re better off refactoring the code to be easier to read.
Is it possible to create same type of original List instead of always ArrayList?
See above
So, if you're OK with using what I think are more appropriate collections than lists, you can try:
public static <T> Collection<Set<T>> groupByFields(
final Class<T> objectClazz,
final Collection<T> objects,
final String... fieldNames)
throws NoSuchFieldException, IllegalAccessException {
final Field[] fields = fieldsFor(objectClazz, fieldNames);
final Map<Integer, Set<T>> groupings = new HashMap<>();
for (final T object : objects) {
final int hash = hashOf(object, fields);
groupings.computeIfAbsent(hash, k -> new HashSet<T>()).add(object);
}
return groupings.values();
}
If you feel strongly about lists, but don't have a really good reason to require specific list types, try:
public static <T> List<List<T>> groupByFields(
final List<T> objects,
final String... fieldNames)
throws NoSuchFieldException, IllegalAccessException {
final Field[] fields = fieldsFor(objects.get(0).getClass(), fieldNames);
final Map<Integer, List<T>> groupings = new HashMap<>();
for (final T object : objects) {
final int hash = hashOf(object, fields);
groupings.computeIfAbsent(hash, k -> new ArrayList<T>()).add(object);
}
return new ArrayList<>(groupings.values());
}
And if you feel really strongly about controlling the types of lists being used, you can use:
public static <T> List<List<T>> groupByFields(
final Class<? extends List<T>> listClazz,
final Class<? extends List<List<T>>> listOfListsClazz,
final List<T> objects,
final String... fieldNames)
throws NoSuchFieldException, IllegalAccessException, InstantiationException {
final Field[] fields = fieldsFor(objects.get(0).getClass(), fieldNames);
final Map<Integer, List<T>> groupings = new HashMap<>();
for (final T object : objects) {
final int hash = hashOf(object, fields);
groupings.computeIfAbsent(hash, k -> {
try {
return listClazz.newInstance();
} catch (IllegalAccessException | InstantiationException e) {
throw new IllegalStateException("Cannot instantiate a new instance of " + listClazz.getCanonicalName());
}
})
.add(object);
}
final List<List<T>> result = listOfListsClazz.newInstance();
result.addAll(groupings.values());
return result;
}
All three top-level methods use the same two helpers:
private static final <T> Field[] fieldsFor(final Class<T> clazz, final String[] fieldNames)
throws NoSuchFieldException {
final Field[] fields = new Field[fieldNames.length];
for (int i = 0; i < fieldNames.length; i++) {
final Field field = clazz.getDeclaredField(fieldNames[i]);
field.setAccessible(true);
fields[i] = field;
}
return fields;
}
private static final <T> Integer hashOf(final T object, final Field... fields)
throws IllegalAccessException {
final Object[] values = new Object[fields.length];
for (int i = 0; i < values.length; i++) {
values[i] = fields[i].get(object);
}
return Objects.hash(values);
}
Finally, note that I do agree with RobAu
that avoiding reflection would be better, but we don't have enough context to consider alternatives.
Explore related questions
See similar questions with these tags.