An algorithm for finding converse duplicates of ordered pairs

Question 1

Given an array of ordered pairs of values, what algorithm will find converse duplicates? [Converse meaning the same values, but in the opposite order.]

That is, given [ab,ac,ad,bc,bd,ca,db] is there an efficient way to find ca and db, being the converse duplicates for ac and bd?

The application is simple enough: the ordered pairs are edges in a directed graph and if there is a converse edge then a single double-ended edge is to be drawn rather than one edge in each direction. The values are strings, being node names.

It can be viewed as a lookup in a sparse array. Given coordinates (a,b), check whether (b,a) exists. However, common programming languages do not (appear to) provide sparse 2d arrays.

I have written a solution in ruby using hash-of-hash, but it's about 20 lines of awkward code and an unsatisfying outcome. Writing the same code in a language like C# or Java would be longer. A better solution is sought, in pseudocode or a description (steps) of the algorithm. Either way, I am seeking an answer that describes how to find the solution as well as the benefits and drawbacks of the particular algorithm.

I haven't attempted to define 'efficient' or 'better', and performance is not an overriding consideration for a drawing of a few hundred nodes.

The nodes are not sorted, so the default algorithm would be, for each pair, to form the converse and brute-force search the preceding half. A binary search would require a prior sort. A solution based on hash indexing should be much faster.

Question 2

I don't fully understand the question. So you want to find a pair of pairs of values (x,y) and (z,t) such that z, t are the lexicographical inverse of x and y?

Question 3

How fast do you want it to be? simply checking each edge for its converse with a binary search is O(n log n), which is usually considered efficient. So what complexity are you trying to achieve?

Question 4

@randomA: No, I want to find all pairs of pairs such as (x,y) and (y,x), and handle them differently (draw double-ended arrow).

Question 5

@Frank: See edit.

Question 6

Create two hash values from each tuple. First hash is computed from an ordered-tuple hash function; Second hash is computed from the reversed tuple using the same ordered-tuple hash function. (This is actually a quite elementary algorithm design question.)

Question 7

Just collect the pairs in a set. (In Ruby, it will be a Set of two-element Arrays.)

let Set s = {}
for each pair [a,b]
 if s contains [a,b]
 // duplicate, do nothing
 else if s contains [b,a] // converse duplicate
 ...
 else
 add [a,b] to S

If you are writing Ruby, it is already capable of using arrays

Question 8

We must have different definitions of trivial. I do not see that f() performs any sorting function. I do not see explanations of the xxx-edge() functions. I do not see any output of converse duplicates. In short, I do not see a solution.

Question 9

This is the code for my ruby implementation. The output is in a format intended for Graphviz.

 graph_edges = {}
 fmap.keys.sort.each do |srckey|
 forms = fmap[srckey]
 fsrc = 1ドル if srckey =~ /---/
 forms.keys.each do |fkey|
 fdest = 1ドル if fkey =~ /---/
 if fdest
 if graph_edges[fdest] && graph_edges[fdest][fsrc]
 graph_edges[fdest][fsrc] = :both
 else
 graph_edges[fsrc] ||= {}
 graph_edges[fsrc][fdest] ||= :only
 end
 end
 end if fsrc
 end
 fout3.puts "digraph {"
 graph_edges.each do |fsrc,h| 
 h.each_pair do |fdest,dir| 
 dirs = " [dir = both]" if dir == :both
 fout3.puts " #{fsrc} -> #{fdest}#{dirs};" 
 end
 end
 fout3.puts "}"

Hopefully there is something useful here. I shall be working on something better based on a tuple value hash.

Question 10

Programmers is about conceptual questions and answers are expected to explain things. Throwing code dumps instead of explanation is like copying code from IDE to whiteboard: it may look familiar and even sometimes be understandable, but it feels weird... just weird. Whiteboard doesn't have compiler

Question 11

Here's a working solution with decent performance.

ordered = lambda x, y: (x, y) if x <= y else (y, x)
def makeArrows(edges):
 """
 input: a sequence of (src, dst) pairs, maybe some reciprocal.
 output: generator yielding (src, dst, is_double_ended) triples.
 """
 undirected_edges = set()
 for src, dst in edges:
 edge = ordered(src, dst) # both 'ab' and 'ba' become ('a', 'b') here
 if edge in undirected_edges:
 # we found a reciprocal; there can only be one pair like 'ab' + 'ba',
 # so we remove it and output as double-ended
 undirected_edges.remove(edge)
 yield (src, dst, True)
 else:
 # accumulate an edge and wait; a reciprocal might be ahead
 undirected_edges.add(edge)
 # all edges not removed by now are one-ended
 for (src, dst) in undirected_edges:
 yield (src, dst, False)

Now we can try it. Since Python strings are sequences, here I short-cut 'ab' instead of ('a', 'b').

raw_edges = ['ab', 'ac', 'ad', 'bc', 'bd', 'ca', 'db']
>>> list(makeArrows(raw_edges))
[('c', 'a', True), ('d', 'b', True), ('b', 'c', False), ('a', 'b', False), ('a', 'd', False)]

Question 12

Here's a way...in python

ordered = lambda x, y: (x, y) if x <= y else (y, x)
def make_hash(edges):
 edge_hash = {}
 for x,y in edges:
 edge = ordered(x, y)
 edge_hash[edge] = edge_hash.setdefault(edge, 0) + 1
 return edge_hash
edges = ['ab', 'ac', 'ad', 'bc', 'bd', 'ca', 'db']
for edge, count in make_hash(edges).items():
 print ''.join(edge), 'uni' if count == 1 else 'bi'

It creates a dictionary (hash) that is indexed by the sorted order of the edges. If there are two entries, we know it is has a converse duplicate. If only one, it is not. The dictionary is returned and the count of matching (in any order) entries.

Question 13

Programmers.SE encourages questions that explain the why behind the how. Your answer would be much stronger if you stated how your suggested algorithm addresses the OP's concerns and how this approach is superior to other approaches.

kevin cline kevin cline 33.8k3 gold badges73 silver badges143 bronze badges · Answer 1 · 2014-10-02 22:00:43Z

1

Just collect the pairs in a set. (In Ruby, it will be a Set of two-element Arrays.)

let Set s = {}
for each pair [a,b]
 if s contains [a,b]
 // duplicate, do nothing
 else if s contains [b,a] // converse duplicate
 ...
 else
 add [a,b] to S

If you are writing Ruby, it is already capable of using arrays

Share

Improve this answer

edited Oct 3, 2014 at 15:48

answered Oct 2, 2014 at 22:00

kevin cline's user avatar

kevin cline kevin cline

33.8k3 gold badges73 silver badges143 bronze badges

1

We must have different definitions of trivial. I do not see that f() performs any sorting function. I do not see explanations of the xxx-edge() functions. I do not see any output of converse duplicates. In short, I do not see a solution.

david.pfx
– david.pfx

2014年10月02日 23:05:09 +00:00
Commented Oct 2, 2014 at 23:05

Add a comment |

david.pfx david.pfx 8,1972 gold badges23 silver badges45 bronze badges · Answer 2 · 2014-07-04 09:54:25Z

This is the code for my ruby implementation. The output is in a format intended for Graphviz.

 graph_edges = {}
 fmap.keys.sort.each do |srckey|
 forms = fmap[srckey]
 fsrc = 1ドル if srckey =~ /---/
 forms.keys.each do |fkey|
 fdest = 1ドル if fkey =~ /---/
 if fdest
 if graph_edges[fdest] && graph_edges[fdest][fsrc]
 graph_edges[fdest][fsrc] = :both
 else
 graph_edges[fsrc] ||= {}
 graph_edges[fsrc][fdest] ||= :only
 end
 end
 end if fsrc
 end
 fout3.puts "digraph {"
 graph_edges.each do |fsrc,h| 
 h.each_pair do |fdest,dir| 
 dirs = " [dir = both]" if dir == :both
 fout3.puts " #{fsrc} -> #{fdest}#{dirs};" 
 end
 end
 fout3.puts "}"

Hopefully there is something useful here. I shall be working on something better based on a tuple value hash.

Programmers is about conceptual questions and answers are expected to explain things. Throwing code dumps instead of explanation is like copying code from IDE to whiteboard: it may look familiar and even sometimes be understandable, but it feels weird... just weird. Whiteboard doesn't have compiler

9000 9000 24.3k4 gold badges53 silver badges80 bronze badges · Answer 3 · 2014-10-03 17:08:34Z

Here's a working solution with decent performance.

ordered = lambda x, y: (x, y) if x <= y else (y, x)
def makeArrows(edges):
 """
 input: a sequence of (src, dst) pairs, maybe some reciprocal.
 output: generator yielding (src, dst, is_double_ended) triples.
 """
 undirected_edges = set()
 for src, dst in edges:
 edge = ordered(src, dst) # both 'ab' and 'ba' become ('a', 'b') here
 if edge in undirected_edges:
 # we found a reciprocal; there can only be one pair like 'ab' + 'ba',
 # so we remove it and output as double-ended
 undirected_edges.remove(edge)
 yield (src, dst, True)
 else:
 # accumulate an edge and wait; a reciprocal might be ahead
 undirected_edges.add(edge)
 # all edges not removed by now are one-ended
 for (src, dst) in undirected_edges:
 yield (src, dst, False)

Now we can try it. Since Python strings are sequences, here I short-cut 'ab' instead of ('a', 'b').

raw_edges = ['ab', 'ac', 'ad', 'bc', 'bd', 'ca', 'db']
>>> list(makeArrows(raw_edges))
[('c', 'a', True), ('d', 'b', True), ('b', 'c', False), ('a', 'b', False), ('a', 'd', False)]

pcm pcm 101 · Answer 4 · 2015-04-01 23:03:14Z

Here's a way...in python

ordered = lambda x, y: (x, y) if x <= y else (y, x)
def make_hash(edges):
 edge_hash = {}
 for x,y in edges:
 edge = ordered(x, y)
 edge_hash[edge] = edge_hash.setdefault(edge, 0) + 1
 return edge_hash
edges = ['ab', 'ac', 'ad', 'bc', 'bd', 'ca', 'db']
for edge, count in make_hash(edges).items():
 print ''.join(edge), 'uni' if count == 1 else 'bi'

It creates a dictionary (hash) that is indexed by the sorted order of the edges. If there are two entries, we know it is has a converse duplicate. If only one, it is not. The dictionary is returned and the count of matching (in any order) entries.

Programmers.SE encourages questions that explain the why behind the how. Your answer would be much stronger if you stated how your suggested algorithm addresses the OP's concerns and how this approach is superior to other approaches.

Stack Exchange Network

An algorithm for finding converse duplicates of ordered pairs

4 Answers 4

Hot Network Questions

An algorithm for finding converse duplicates of ordered pairs

4 Answers 4

Related

Hot Network Questions