Title: Identifying equivalent lists
I am using lists as keys, and therefore want to identify equivalent ones.
For example,
[0, 2, 2, 1, 1, 2]
[4, 0, 0, 1, 1, 0]
are 'equivalent' in my book. I 'equivalify' the list by checking the list value with the highest count, and set that to 0, then the next highest count to 1, and so on and so forth. It's done with this code below, which does the job properly but is fairly slow - and I am wondering if there is a way of doing the same job yet I haven't seen it yet.
for key in preliminary_key_list:
key = list(key) # convert from immutable tuple to mutable list
i = 0
translation = {}
counts = [[j, key.count(j)] for j in set(key)]
counts_right = [j[1] for j in counts]
for count in set(key):
translation[counts[counts_right.index(max(counts_right))][0]] = i
counts_right[counts_right.index(max(counts_right))] = 0
i += 1
for i in range(len(key)):
key[i] = translation[key[i]]
2 Answers 2
for key in preliminary_key_list:
key = list(key) # convert from immutable tuple to mutable list
You don't need to do this. Rather then creating a modifiable list, just use the tuple and construct a new tuple at the end
i = 0
translation = {}
counts = [[j, key.count(j)] for j in set(key)]
This should be a list of tuples, not a list of lists.
counts_right = [j[1] for j in counts]
for count in set(key):
Use for i, count in enumerate(set(key)): to avoid having to manage i yourself.
You use set(key) twice, which will force python to do work twice. Also, there is no guarantee you'll get the same order both times.
translation[counts[counts_right.index(max(counts_right))][0]] = i
counts_right[counts_right.index(max(counts_right))] = 0
These lines are really expensive. max will scan the whole list checking for the highest value. .index will then rescan the list to find the index for the value.
Then you do the whole thing over again on the next line!
You'd do far better as @GlenRogers suggests to sort by the counts.
i += 1
for i in range(len(key)):
key[i] = translation[key[i]]
Here using a comprehension would be better (I think):
key = tuple(translation[item] for item in key)
Here is how I'd write the code:
ret = []
for key in preliminary_key_list:
counts = sorted(set(key), key = lambda k: -key.count(k) )
translation = { val: idx for idx, val in enumerate(counts) }
ret.append([translation[k] for k in key])
return ret
-
\$\begingroup\$ It's certainly a lot faster and it's concise as well. Makes me wonder how you get to that level where you know how to do it. Your explanation was good too. Thanks Winston \$\endgroup\$nebffa– nebffa2012年10月25日 11:31:47 +00:00Commented Oct 25, 2012 at 11:31
You can do things slightly differently - instead of manually sorting, you can use the standard library sort() function. Below I've modified things slightly for convenience, but using it, timeit originally gave 16.3 seconds for 10^6 iterations and now gives 10.8 seconds.
def equivalify( preliminary_key_list ):
ret = []
for key in preliminary_key_list:
key = list(key) # convert from immutable tuple to mutable list
translation = {}
counts = [[key.count(j), j] for j in set(key)]
counts.sort( reverse = True )
for idx, ( cnt, val ) in enumerate( counts ):
translation[ val ] = idx
for i,k in enumerate( key ):
key[i] = translation[ k ]
ret += [key]
return ret
[0, 1, 1, 2, 2, 3, 3, 3]? \$\endgroup\$