I have been writing a small compiler generator for which I need to solve the strongly connected component problem. As the Guava library contains, to my knowledge, no implementation for that problem, I have decided to write my own based on Kosaraju's algorithm.
For this, I have created a GraphUtils
class and a GraphTraverser<T>
to iterate over the graph.
import com.google.common.collect.ImmutableList;
import com.google.common.graph.ElementOrder;
import com.google.common.graph.Graph;
import com.google.common.graph.GraphBuilder;
import com.google.common.graph.Graphs;
import com.google.common.graph.MutableGraph;
public class GraphUtils {
private GraphUtils() {
}
/**
* Guarantees: the graph will be directed and forest-like without self loops.
*
* @param graph
* @return the SCC graph. each node contains all the nodes in the CC of the original graph
*/
public static <T> Graph<Set<T>> findStronglyConnectedComponents(Graph<T> graph) {
if (graph.nodes().isEmpty()) {
throw new IllegalArgumentException("Can't find components in an empty graph");
}
final MutableGraph<Set<T>> result = GraphBuilder.directed().allowsSelfLoops(false)
.nodeOrder(ElementOrder.insertion()).build();
// Kosaraju's algorithm
final Map<T, Set<T>> ccStore = new HashMap<>(graph.nodes().size());
// Step 1
final ImmutableList<T> topologicalOrder = GraphUtils.traverse(graph).postOrderTraversal(graph.nodes()).toList()
.reverse();
// Step 2
final Graph<T> transposeGraph = Graphs.transpose(graph);
// Step 3
for (T node : topologicalOrder) {
if (ccStore.keySet().contains(node)) {
continue;
}
final Set<T> connectedComponent = new HashSet<>();
final Set<T> hitExistingNodes = new HashSet<>();
GraphUtils.traverse(transposeGraph)
.postOrderTraversal(Collections.singleton(node), ccStore.keySet(), hitExistingNodes::add)
.forEach(connectedComponent::add);
result.addNode(connectedComponent);
hitExistingNodes.forEach(n -> {
// We encounterd a connection between connected components
Set<T> existingCC = ccStore.get(n);
result.putEdge(existingCC, connectedComponent);
});
connectedComponent.forEach(n -> {
ccStore.put(n, connectedComponent);
});
}
return result;
}
public static <T> GraphTraverser<T> traverse(Graph<T> graph) {
return new GraphTraverser<>(graph);
}
}
GraphTraverser<T>
import com.google.common.collect.AbstractIterator;
import com.google.common.collect.FluentIterable;
import com.google.common.graph.Graph;
public class GraphTraverser<T> {
private static final class PostOrderNode<T> {
public final T root;
public final Iterator<T> childIterator;
public PostOrderNode(T root, Iterator<T> childIterator) {
this.root = Objects.requireNonNull(root);
this.childIterator = Objects.requireNonNull(childIterator);
}
}
private final class PostOrderIterator extends AbstractIterator<T> {
private final ArrayDeque<PostOrderNode<T>> stack = new ArrayDeque<>();
private final Iterator<T> rootNodes;
private final Set<T> visitedSet;
private final Set<T> ignoredSet;
private final Consumer<T> ignoreNodeEncountered;
public PostOrderIterator(Collection<T> roots, Set<T> ignoredNodes, Consumer<T> ignoreNodeMet) {
this.rootNodes = roots.iterator();
this.visitedSet = new HashSet<>(graph.nodes().size());
this.ignoredSet = ignoredNodes;
this.ignoreNodeEncountered = ignoreNodeMet;
}
@Override
protected T computeNext() {
while (stack.isEmpty() && rootNodes.hasNext()) {
pushNodeIfUnvisited(rootNodes.next());
}
while (!stack.isEmpty()) {
PostOrderNode<T> top = stack.getLast();
if (top.childIterator.hasNext()) {
T child = top.childIterator.next();
pushNodeIfUnvisited(child);
} else {
stack.removeLast();
return top.root;
}
}
return endOfData();
}
private void pushNodeIfUnvisited(T t) {
if (ignoredSet.contains(t)) {
if (ignoreNodeEncountered != null) {
ignoreNodeEncountered.accept(t);
}
return;
}
if (!visitedSet.add(t)) {
return;
}
stack.addLast(expand(t));
}
private PostOrderNode<T> expand(T t) {
return new PostOrderNode<T>(t, graph.successors(t).iterator());
}
}
private final Graph<T> graph;
public GraphTraverser(Graph<T> graph) {
this.graph = Objects.requireNonNull(graph);
}
public FluentIterable<T> postOrderTraversal() {
return postOrderTraversal(graph.nodes());
}
public FluentIterable<T> postOrderTraversal(Collection<T> rootNodes) {
return postOrderTraversal(rootNodes, Collections.emptySet(), null);
}
/**
* Does post order traversal of the (directed) graph. When a node in ignoredNodes is encountered, ignoreNodeMet is
* called
*
* @param rootNodes
* the nodes to start traversal at
* @param ignoredNodes
* nodes that will be ignored, i.e. not recursively traversed
* @param ignoredNodeMet
* might be null for no callback
* @return
*/
public FluentIterable<T> postOrderTraversal(Collection<T> rootNodes, Set<T> ignoredNodes,
Consumer<T> ignoredNodeMet) {
return new FluentIterable<T>() {
@Override
public Iterator<T> iterator() {
return new PostOrderIterator(rootNodes, ignoredNodes, ignoredNodeMet);
}
};
}
}
Here's an example usage, with current, correct output:
MutableGraph<Integer> originalGraph = GraphBuilder.directed().expectedNodeCount(10).build();
originalGraph.putEdge(1, 0);
originalGraph.putEdge(2, 1);
originalGraph.putEdge(0, 2);
originalGraph.putEdge(0, 3);
originalGraph.putEdge(5, 3);
originalGraph.putEdge(3, 4);
System.out.println(originalGraph);
// isDirected: true, allowsSelfLoops: false, nodes: [1, 0, 2, 3, 5, 4], edges: [<1 -> 0>, <0 -> 2>, <0 -> 3>, <2 -> 1>, <3 -> 4>, <5 -> 3>]
Graph<Set<Integer>> sccGraph = GraphUtils.findStronglyConnectedComponents(originalGraph);
System.out.println(sccGraph);
// isDirected: true, allowsSelfLoops: false, nodes: [[5], [0, 1, 2], [3], [4]], edges: [<[5] -> [3]>, <[0, 1, 2] -> [3]>, <[3] -> [4]>]
I'm mostly interested in the design of GraphTraverser<T>
and the efficiency of the algorithm and the returned result. If you find any bugs, please point them out. Any comments on code style and readability are appreciated.
1 Answer 1
Ok, first experiments/test show that this does exactly what it says on the tin: it finds strongly connected (groups of) nodes, or -- in my non-expert parlance -- where the cycles are in a directed graph.
Cannot say much about performance, as the use case I am dealing with involves analysing small-ish graphs (a few dozen nodes at most). Just wondering why your solution doesn't use the Traverser that guava/google graph ships.
Also, noticed that you explicitly forbid self-loops, yet when I analyse a Graph with a self-loop in it, it all works wonderfully well (the self-loop is identified as a cycle Set of one).
In short, thanks very much for helping me out of a pickle
-
1\$\begingroup\$ The method guarantees, i.e. ensures no self-loops of the returned value, it's not a restriction on the argument. \$\endgroup\$WorldSEnder– WorldSEnder2019年01月29日 14:56:02 +00:00Commented Jan 29, 2019 at 14:56