I was recently presented with the following problem...
You are given a collection of singly-linked lists (SLLs). Return true if any of them share a common node (or "intersect"), or false otherwise.
Additional notes
- Please don’t use recursion.
- Assume the SLLs might be very large in length (in the millions).
- Stop traversing and return immediately if you detect a common node.
- If a cycle is detected, please throw an error.
- Aim to be as efficient as possible from a time complexity perspective.
- Implement this function with this signature:
DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool
Input:
- The first lines of the input will describe the singly-linked-lists in a directed acyclic graph (DAG) format. The graph description language is a similar idea to the GraphViz graph description language, see: https://en.wikipedia.org/wiki/DOT_(graph_description_language).
- Each node is described as a string token, which is a unique identifier for the node. So "a->b" means a DAG with node "a" directionally connected to node "b".
- As we are describing singly-linked-lists, take it as a given that the out degree of every node is either zero or one.
- After each of the edges has been described, then each subsequent line will include set of SLL starting nodes to traverse from.
Output:
- For each SLL print 'True' or 'False' based on the return value of your function
- true prints
True
- false prints
False
- On throwing an error print
Error Thrown!
So I must admit, not being particularly familiar with linked lists or graph data structures, I struggled with this exercise, however I did manage to complete a [probably quite naive] solution (coded in C#):
public static IEnumerable<string> Main(string[] lines)
{
var graph = new Dictionary<string, Node>();
foreach (var line in lines)
{
if (line.Contains(','))
{
string returnValue;
try
{
returnValue = LinkedListIntersection(line.Split(',').Select(v => v.Trim()), graph).ToString();
}
catch (InvalidOperationException ex)
when (ex.Message == "Cycle detected.")
{
returnValue = "Error Thrown!";
}
yield return returnValue;
}
else if (line.Contains("->"))
{
var splitStr = line.Split("->", StringSplitOptions.None);
var current = splitStr[0];
var next = splitStr[1];
// Check to see if the parent node already exists
if (!graph.TryGetValue(next, out var nextNode))
{
// If it doesn't, add it in so we can reference the object
nextNode = new Node(next, null);
graph.Add(next, nextNode);
}
// Check to see if the child node already exists
if (graph.TryGetValue(current, out var currentNode))
{
// If it does, update the existing object
currentNode.Next = nextNode;
}
else
{
graph.Add(current, new Node(current, nextNode));
}
}
}
}
private static bool LinkedListIntersection(IEnumerable<string> nodeGroup, IDictionary<string, Node> graph)
{
var allTraversedNodes = new HashSet<string>();
foreach (var value in nodeGroup)
{
var currentTraversedNodes = new HashSet<string>();
if (!graph.TryGetValue(value, out var node))
{
continue;
}
do
{
if (allTraversedNodes.Contains(node.Value))
{
return true;
}
// Don't follow cycles
if (currentTraversedNodes.Contains(node.Next?.Value))
{
throw new InvalidOperationException($"Cycle detected.");
}
allTraversedNodes.Add(node.Value);
currentTraversedNodes.Add(node.Value);
node = node.Next;
} while (node != null);
}
return false;
}
class Node
{
public string Value { get; }
public Node Next { get; set; }
public Node(string value, Node next)
{
Value = value;
Next = next;
}
}
As you can see from the code, I'm parsing the lines of data, generating Node
objects from the DAG values and storing them in a Dictionary<string, Node>
. When it comes time to check for intersections, it's as simple as performing a lookup operation on the dictionary to locate the initial Node
, traversing the linked list while using a set to record each Node
traversed while checking for any intersections. Pretty simple eh?
Some test cases, from which I've generated some unit tests, were also provided to prove the functionality, which for my solution all passed.
[Fact]
public void Test1()
{
var lines = new[]
{
"a->b",
"r->s",
"b->c",
"x->c",
"q->r",
"y->x",
"w->z",
"a, q, w",
"a, c, r",
"y, z, a, r",
"a, w"
};
var expectedResults = new[]
{
"False",
"True",
"True",
"False"
};
var results = Main(lines).ToArray();
results.Should().Equal(expectedResults);
}
[Fact]
public void Test2()
{
var lines = new[]
{
"A->B",
"G->B",
"B->C",
"C->D",
"D->E",
"H->J",
"J->F",
"A, G, E",
"H, A"
};
var expectedResults = new[]
{
"True",
"False"
};
var results = Main(lines).ToArray();
results.Should().Equal(expectedResults);
}
[Fact]
public void Test3()
{
var lines = new[]
{
"ABC->BCD",
"BCD->CDE",
"DEF->EFG",
"EFG->BCD",
"123->456",
"ABC, CDE",
"123, DEF",
"ABC, DEF"
};
var expectedResults = new[]
{
"True",
"False",
"True"
};
var results = Main(lines).ToArray();
results.Should().Equal(expectedResults);
}
[Fact]
public void Test4()
{
var lines = new[]
{
"X->Y",
"Y->X",
"A->B",
"B->C",
"X, A"
};
var expectedResults = new[]
{
"Error Thrown!"
};
var results = Main(lines).ToArray();
results.Should().Equal(expectedResults);
}
The first thing I want to understand, for my solution, is there a more suitable data structure to store and query a collection of singly-linked lists than the Dictionary<string, Node>
structure used? I wasn't particularly happy about duplicating the Value
property in both the Node
object and as the dictionary Key
- it feels like there should be better way to do this. What are my options?
The second question, and this is the big one, are there better solutions for this problem from a time complexity perspective?
I have been musing theoretical alternate approaches, the best of which involved pre-processing and caching all intersections during the graph building process, perhaps somehow keeping track of the heads of all the different linked lists and then traversing them once all the data had been loaded to generate some kind of adjacency matrix (a 2d array with a dictionary index lookup?) which records the intersections, and then performing a lookup on those intersections with every unique permutation of node group nodes. This (in theory at least) should be more efficient in terms of query performance, however I have absolutely no idea what that would look like or how to achieve it in practice. Does anybody out there have any better ideas?
2 Answers 2
Don't write, never commit/publish undocumented/uncommented code.
While not in the specification of input and output, the Additional notes are explicit about implementation of DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool
I don't see an equivalent of.
I don't see a need to store lists, singly linked or otherwise.
Disregarding cycles, "lists intersect" when belonging to the same connected component, disjoint sets of nodes.
Needing access to successor or representative id, just map id to id.
Wanting to detect cycles seems to limit union-find algorithms applicable.
-
\$\begingroup\$ "Don't write, never commit/publish undocumented/uncommented code." - I'm not sure what you mean by this, could you explain please? \$\endgroup\$James Law– James Law2023年02月24日 19:04:27 +00:00Commented Feb 24, 2023 at 19:04
-
\$\begingroup\$
DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool
(bearing in mind is pseudocode), is implemented asprivate static bool LinkedListIntersection(IEnumerable<string> nodeGroup, IDictionary<string, Node> graph)
- the boilerplate code provided in the video doesn't match the written description. Attention to detail is not the forte of the creator of this exercise. \$\endgroup\$James Law– James Law2023年02月24日 19:07:37 +00:00Commented Feb 24, 2023 at 19:07 -
\$\begingroup\$ "I don't see a need to store lists, singly linked or otherwise." - ok, how would you store the data? \$\endgroup\$James Law– James Law2023年02月24日 19:08:32 +00:00Commented Feb 24, 2023 at 19:08
-
\$\begingroup\$ "Disregarding cycles, "lists intersect" when belonging to the same connected component, disjoint sets of nodes." - how would you go about implementing what you've described in this scenario? Could you provide a code example please? \$\endgroup\$James Law– James Law2023年02月24日 19:23:18 +00:00Commented Feb 24, 2023 at 19:23
-
\$\begingroup\$ "Wanting to detect cycles seems to limit union-find algorithms applicable." - I don't think it's so much that we want to detect the cycles, but surely we have to be able to handle them somehow? Could you suggest an alternative? \$\endgroup\$James Law– James Law2023年02月24日 19:24:59 +00:00Commented Feb 24, 2023 at 19:24
Firstly, the problem statement calls for DoLinkedListsIntersect(Collection<SinglyLinkedList>) returns bool
, but that's not present in this code.
That's a problem in an interview situation - I'd be wanting to see that you can meet requirements that other code has on yours.
It seems to me that the auxiliary storage in allTraversedNodes
and currentTraversedNodes
could grow very large. I don't believe we need all that storage.
Instead, use the knowledge that if any node is common to two lists, they share the same tail node. So we just map each head node to the corresponding tail (this is a form of path compression), then test whether those tails are all distinct.
In pseudocode, we get something like
tails = map(find_tail, heads)
intersects = count(set(tails)) < count(set(heads))
Ensure that find_tail()
throws a suitable exception if it finds a cycle (using a standard hare-and-tortoise method), and take appropriate action if we catch that.
-
\$\begingroup\$
just one node per list - much less storage
refers to any single query? \$\endgroup\$greybeard– greybeard2023年02月24日 10:30:14 +00:00Commented Feb 24, 2023 at 10:30 -
\$\begingroup\$ Yes (I hadn't spotted that the problem statement has multiple queries - that forces down the temporary-modification version of this answer). The duration of that storage is just for that query, of course - we reclaim it for the next query. \$\endgroup\$Toby Speight– Toby Speight2023年02月24日 11:24:42 +00:00Commented Feb 24, 2023 at 11:24
-
1\$\begingroup\$
just map each head node to the corresponding tail
is what the path compression of some of the find algorithms union-find data structures are named for results in. \$\endgroup\$greybeard– greybeard2023年02月25日 06:42:46 +00:00Commented Feb 25, 2023 at 6:42 -
1\$\begingroup\$ "Firstly, the problem statement calls for
DoLinkedListsIntersect(Collection<SinglyLinkedList>
) returns bool, but that's not present in this code." - yes it is, implemented asprivate static bool LinkedListIntersection(IEnumerable<string> nodeGroup, IDictionary<string, Node> graph)
, derived from the boilerplate code provided by the creator of the solution as can be seen in the video at 3:07 - youtu.be/zCezJ8QkUL4?t=187. I'm really sorry about the quality of the question that's been put to me but sadly I have no control over that. \$\endgroup\$James Law– James Law2023年02月25日 12:25:30 +00:00Commented Feb 25, 2023 at 12:25
Explore related questions
See similar questions with these tags.
Some tests were also provided...
isn't originated by you, please put it into a block quote. \$\endgroup\$Test4
) being provided. As for the implementation, I applied the rationale that even if a cycle exists it doesn't necessarily invalidate all of the data in that set, as there could be lists within the collection that don't cycle and would produce a validtrue
/false
output when queried. \$\endgroup\$Dictionary<string, string>
could suffice but just doesn't feel right. You can skip a dictionary and go withHashSet<Node>
but yourNode
class would need to implementIEquatable<Node>
and overrideGetHashCode()
. \$\endgroup\$