Given an array of ordered pairs of values, what algorithm will find converse duplicates? [Converse meaning the same values, but in the opposite order.]
That is, given [ab,ac,ad,bc,bd,ca,db] is there an efficient way to find ca and db, being the converse duplicates for ac and bd?
The application is simple enough: the ordered pairs are edges in a directed graph and if there is a converse edge then a single double-ended edge is to be drawn rather than one edge in each direction. The values are strings, being node names.
It can be viewed as a lookup in a sparse array. Given coordinates (a,b), check whether (b,a) exists. However, common programming languages do not (appear to) provide sparse 2d arrays.
I have written a solution in ruby using hash-of-hash, but it's about 20 lines of awkward code and an unsatisfying outcome. Writing the same code in a language like C# or Java would be longer. A better solution is sought, in pseudocode or a description (steps) of the algorithm. Either way, I am seeking an answer that describes how to find the solution as well as the benefits and drawbacks of the particular algorithm.
I haven't attempted to define 'efficient' or 'better', and performance is not an overriding consideration for a drawing of a few hundred nodes.
The nodes are not sorted, so the default algorithm would be, for each pair, to form the converse and brute-force search the preceding half. A binary search would require a prior sort. A solution based on hash indexing should be much faster.
-
I don't fully understand the question. So you want to find a pair of pairs of values (x,y) and (z,t) such that z, t are the lexicographical inverse of x and y?InformedA– InformedA2014年07月02日 08:05:08 +00:00Commented Jul 2, 2014 at 8:05
-
2How fast do you want it to be? simply checking each edge for its converse with a binary search is O(n log n), which is usually considered efficient. So what complexity are you trying to achieve?Frank– Frank2014年07月02日 10:22:09 +00:00Commented Jul 2, 2014 at 10:22
-
@randomA: No, I want to find all pairs of pairs such as (x,y) and (y,x), and handle them differently (draw double-ended arrow).david.pfx– david.pfx2014年07月02日 14:01:40 +00:00Commented Jul 2, 2014 at 14:01
-
@Frank: See edit.david.pfx– david.pfx2014年07月02日 14:02:26 +00:00Commented Jul 2, 2014 at 14:02
-
2Create two hash values from each tuple. First hash is computed from an ordered-tuple hash function; Second hash is computed from the reversed tuple using the same ordered-tuple hash function. (This is actually a quite elementary algorithm design question.)rwong– rwong2014年07月02日 14:02:54 +00:00Commented Jul 2, 2014 at 14:02
4 Answers 4
Just collect the pairs in a set. (In Ruby, it will be a Set of two-element Arrays.)
let Set s = {}
for each pair [a,b]
if s contains [a,b]
// duplicate, do nothing
else if s contains [b,a] // converse duplicate
...
else
add [a,b] to S
If you are writing Ruby, it is already capable of using arrays
-
We must have different definitions of trivial. I do not see that f() performs any sorting function. I do not see explanations of the xxx-edge() functions. I do not see any output of converse duplicates. In short, I do not see a solution.david.pfx– david.pfx2014年10月02日 23:05:09 +00:00Commented Oct 2, 2014 at 23:05
This is the code for my ruby implementation. The output is in a format intended for Graphviz.
graph_edges = {}
fmap.keys.sort.each do |srckey|
forms = fmap[srckey]
fsrc = 1ドル if srckey =~ /---/
forms.keys.each do |fkey|
fdest = 1ドル if fkey =~ /---/
if fdest
if graph_edges[fdest] && graph_edges[fdest][fsrc]
graph_edges[fdest][fsrc] = :both
else
graph_edges[fsrc] ||= {}
graph_edges[fsrc][fdest] ||= :only
end
end
end if fsrc
end
fout3.puts "digraph {"
graph_edges.each do |fsrc,h|
h.each_pair do |fdest,dir|
dirs = " [dir = both]" if dir == :both
fout3.puts " #{fsrc} -> #{fdest}#{dirs};"
end
end
fout3.puts "}"
Hopefully there is something useful here. I shall be working on something better based on a tuple value hash.
-
1Programmers is about conceptual questions and answers are expected to explain things. Throwing code dumps instead of explanation is like copying code from IDE to whiteboard: it may look familiar and even sometimes be understandable, but it feels weird... just weird. Whiteboard doesn't have compilergnat– gnat2015年04月06日 07:04:44 +00:00Commented Apr 6, 2015 at 7:04
Here's a working solution with decent performance.
ordered = lambda x, y: (x, y) if x <= y else (y, x)
def makeArrows(edges):
"""
input: a sequence of (src, dst) pairs, maybe some reciprocal.
output: generator yielding (src, dst, is_double_ended) triples.
"""
undirected_edges = set()
for src, dst in edges:
edge = ordered(src, dst) # both 'ab' and 'ba' become ('a', 'b') here
if edge in undirected_edges:
# we found a reciprocal; there can only be one pair like 'ab' + 'ba',
# so we remove it and output as double-ended
undirected_edges.remove(edge)
yield (src, dst, True)
else:
# accumulate an edge and wait; a reciprocal might be ahead
undirected_edges.add(edge)
# all edges not removed by now are one-ended
for (src, dst) in undirected_edges:
yield (src, dst, False)
Now we can try it. Since Python strings are sequences, here I short-cut 'ab'
instead of ('a', 'b')
.
raw_edges = ['ab', 'ac', 'ad', 'bc', 'bd', 'ca', 'db']
>>> list(makeArrows(raw_edges))
[('c', 'a', True), ('d', 'b', True), ('b', 'c', False), ('a', 'b', False), ('a', 'd', False)]
Here's a way...in python
ordered = lambda x, y: (x, y) if x <= y else (y, x)
def make_hash(edges):
edge_hash = {}
for x,y in edges:
edge = ordered(x, y)
edge_hash[edge] = edge_hash.setdefault(edge, 0) + 1
return edge_hash
edges = ['ab', 'ac', 'ad', 'bc', 'bd', 'ca', 'db']
for edge, count in make_hash(edges).items():
print ''.join(edge), 'uni' if count == 1 else 'bi'
It creates a dictionary (hash) that is indexed by the sorted order of the edges. If there are two entries, we know it is has a converse duplicate. If only one, it is not. The dictionary is returned and the count of matching (in any order) entries.
-
3Programmers.SE encourages questions that explain the why behind the how. Your answer would be much stronger if you stated how your suggested algorithm addresses the OP's concerns and how this approach is superior to other approaches.user53019– user530192015年04月02日 01:14:42 +00:00Commented Apr 2, 2015 at 1:14