I have two lists with the same number of elements, all of them strings. Both lists contain the same set of strings, with no duplicates, but in a different order.
list_a = ['s1', 's2', 's3', 's4', 's5', ...]
list_b = ['s8', 's5', 's1', 's9', 's3', ...]
I need to go through each element in list_a and find the index in list_b that contains that same element. I can do this with two nested for loops but there has to be a better/more efficient way:
b_indexes = []
for elem_a in list_a:
    for indx_b, elem_b in enumerate(list_b):
        if elem_b == elem_a:
            b_indexes.append(indx_b)
            break
Are there duplicates? – TerryA, Oct 7, 2013 at 11:38
No duplicates, sorry. – Gabriel, Oct 7, 2013 at 11:39
4 Answers
If there are no duplicates, you can just use list.index():
list_a = ['s1', 's2', 's3', 's4', 's5']
list_b = ['s4', 's5', 's1', 's2', 's3']  # same strings as list_a, different order
print([list_b.index(i) for i in list_a])  # [2, 3, 4, 0, 1]
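You only need to write one for loop: since you've said that every string in list_a also appears in list_b, list.index() does the search for you, so there's no need to iterate through the second list yourself and test if elem_b == elem_a. Note that list.index() raises a ValueError if the element is not in the list.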
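As a rough sketch of the same one-loop idea with a guard for a missing element: the helper name indexes_in and the choice of None as a placeholder are illustrative assumptions, not part of the answer above.

```python
def indexes_in(list_a, list_b):
    """Return the index in list_b of each element of list_a (None if absent)."""
    result = []
    for elem in list_a:
        try:
            # list.index() scans list_b and returns the first matching position
            result.append(list_b.index(elem))
        except ValueError:
            # elem does not occur in list_b at all
            result.append(None)
    return result


list_a = ['s1', 's2', 's3']
list_b = ['s3', 's1', 's2']
print(indexes_in(list_a, list_b))  # [1, 2, 0]
```

If the lists really are permutations of each other, as in the question, the except branch is never taken and the plain list comprehension is enough.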
In functional style:
map(list_b.index, list_a)
In Python 2 this produces a list containing the index in list_b of each element in list_a. In Python 3, map() is lazy, so wrap the call in list() to get the same result.
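A minimal sketch of how this behaves on Python 3, using small sample lists chosen here for illustration (they are not taken from the answer):

```python
list_a = ['s1', 's2', 's3']
list_b = ['s3', 's1', 's2']

# On Python 3, map() returns a lazy map object; nothing is computed yet.
lazy_indexes = map(list_b.index, list_a)

# list() consumes the map object and materialises the indexes.
b_indexes = list(lazy_indexes)
print(b_indexes)  # [1, 2, 0]
```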
6 Comments
map() is probably a little bit faster, but a list comprehension is more readable and you can use conditionals and stuff (well, you can with map() by adding a custom function, but then it just goes a bit untidy)
In Python 3, map() will return a lazy iterator, while a list comprehension will return a list (although changing the [] to a () will make it a generator expression :)
I find map more readable than the list comprehension--mostly because each person thinks of their own special name for the value in the LC, and because the LC might have the extra conditionals you speak of--with map, you can see in the first three characters that there is no funny business going on. :)
This should give you a list of the indexes.
[list_b.index(elem) for elem in list_a]
An alternative approach to the index method is to build a dictionary of the locations in one pass instead of searching through the list each time. If the list is long enough, this should be faster, because it makes the process linear in the number of elements (on average) instead of quadratic. To be specific, instead of
def index_method(la, lb):
    return [lb.index(i) for i in la]
you could use
def dict_method(la, lb):
    where = {v: i for i, v in enumerate(lb)}
    return [where[i] for i in la]
This should be roughly comparable on small lists, albeit maybe a little slower:
>>> import random
>>> list_a = ['s{}'.format(i) for i in range(5)]
>>> list_b = list_a[:]
>>> random.shuffle(list_b)
>>> %timeit index_method(list_a, list_b)
1000000 loops, best of 3: 1.86 μs per loop
>>> %timeit dict_method(list_a, list_b)
1000000 loops, best of 3: 1.93 μs per loop
But it should be much faster on longer ones, and the difference will only grow:
>>> list_a = ['s{}'.format(i) for i in range(100)]
>>> list_b = list_a[:]
>>> random.shuffle(list_b)
>>> %timeit index_method(list_a, list_b)
10000 loops, best of 3: 140 μs per loop
>>> %timeit dict_method(list_a, list_b)
10000 loops, best of 3: 20.9 μs per loop