Skip to main content
Code Review

Return to Answer

replaced http://stackoverflow.com/ with https://stackoverflow.com/
Source Link

I'd use a simple List or ArrayList here. Vector is considered obsolete Vector is considered obsolete.

I'd use a simple List or ArrayList here. Vector is considered obsolete.

I'd use a simple List or ArrayList here. Vector is considered obsolete.

Source Link
palacsint
  • 30.4k
  • 9
  • 82
  • 157
  1. First of all, two things to read or search for: Cluster analysis (just a hint, I'm not an expert about that) and Linked: The New Science of Networks by Albert-László Barabási.

Barabási in his book shows that networks usually have some nodes which have a lot more connections than the others. The distribution is not the same in the real world as the sample shell script generates.

  1. The code is quite good, I like your variable and method names and separated methods. I wonder why haven't anyone reviewed it yet.

Vector<Set<String>> friendBuckets = bucketizeEmployees(employees);

I'd use a simple List or ArrayList here. Vector is considered obsolete.

  1. In the getConnectedFriends method the
Set<String> queuedFriends = new LinkedHashSet<String>();

could be a Queue. It has a poll method. As far as I tested it's faster than the currently used iterator-based remove.

1.

public class Main {

Main isn't a good class name. Everyone can have a Main. What's it purpose? Try to find a more descriptive name.

1.

static HashMap<String, Set<String>> friendShips; 

HashMap<...> reference types should be simply Map<...>. See: Effective Java, 2nd edition, Item 52: Refer to objects by their interfaces

Should I always use the private access modifier for class fields?; Item 13 of Effective Java 2nd Edition: Minimize the accessibility of classes and members.

1.

 static HashMap<String, Set<String>> friendShips; 

Instead of Map<String, Set<String>> you could use Guava's Multimap (doc, javadoc) which was designed exactly for that. It would reduce the size of the mapFriends method:

 public static void mapFriends(final String friendA, final String friendB,
 final Multimap<String, String> friendsShipMap) {
 friendsShipMap.put(friendA, friendB);
 }

So, it could be removed.

1.

public static Vector<Set<String>> bucketizeEmployees(Set<String> employees) {
 ...
}

This method calls getConnectedFriends(employee), which is the following:

private static Set<String> getConnectedFriends(String friend) {
 ...
}

It's confusing: what is the difference between an employee and friend? Are they the same?

1.

if (friendShips.get(friend) != null) {

The following is the same:

 if (friendShips.containsKey(friend)) {

A guard clause would be even better.

 if (!friendShips.containsKey(friend)) {
 return connectedFriends;
 }
if (!connectedFriends.contains(directFriend) && !queuedFriends.contains(directFriend)) {
 queuedFriends.add(directFriend);
}

The !queuedFriends.contains(directFriend) condition is unnecessary, it's a set which can't contain elements twice and adding an already added element to a LinkedHashSet doesn't modify anything. From the javadoc:

Note that insertion order is not affected if an element is re-inserted into the set.

  1. The following pattern occurs more than once in the code:

     if (map.containsKey(key)) {
     String value = map.get(key);
     ...
     }
     ...
    

It might be a microoptimization, but if you profile the code and the results shows it as a bottleneck the following structure is the same:

 String value = map.get(key); 
 if (value != null) {
 ...
 }
 ...
  1. A few guard clause would help to make getConnectedFriends more flatten:

     while (!queuedFriends.isEmpty()) {
     final String poppedFriend = getHeadElement(queuedFriends);
     connectedFriends.add(poppedFriend);
     if (!friendShips.containsKey(poppedFriend)) {
     continue;
     }
     for (final String directFriend: friendShips.get(poppedFriend)) {
     if (connectedFriends.contains(directFriend)) {
     continue;
     }
     queuedFriends.add(directFriend);
     }
     }
    
lang-java

AltStyle によって変換されたページ (->オリジナル) /