How to find sublists that are present in one list but not in another in Python? [duplicate]

Question 1

I need to compare two lists, each of which is a list of lists, and find the sublists that are present in one list but not in the other. The order of elements within a sublist does not matter, i.e. ['a','b'] = ['b','a']. The two lists are

List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]

The output list should be

out_list = [['T_2','T_3']]

Question 2

This has already been answered: stackoverflow.com/questions/21448310/…

Question 3

Are your sublists unique, i.e. are there no duplicates in your lists?

Question 4

For two element sublists, this should suffice:

[x for x in List_1 if x not in List_2 and x[::-1] not in List_2]

Code:

List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]
print([x for x in List_1 if x not in List_2 and x[::-1] not in List_2])

Question 5

That only reliably works for sublists of 2 items, though
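To illustrate the limitation, here is a small sketch with made-up three-element sublists (`list_a` and `list_b` are hypothetical, not from the question). The reversal check covers only two of the six possible orderings, while comparing sorted copies covers all of them:

```python
# Hypothetical three-element data, not from the question.
list_a = [['a', 'b', 'c'], ['d', 'e', 'f']]
list_b = [['b', 'a', 'c']]  # same elements as ['a', 'b', 'c'], different order

# The reversal check misses this ordering, so ['a', 'b', 'c'] is
# wrongly reported as absent from list_b.
by_reversal = [x for x in list_a if x not in list_b and x[::-1] not in list_b]

# Comparing sorted copies treats every ordering as equal.
sorted_b = [sorted(y) for y in list_b]
by_sorting = [x for x in list_a if sorted(x) not in sorted_b]

print(by_reversal)  # [['a', 'b', 'c'], ['d', 'e', 'f']]
print(by_sorting)   # [['d', 'e', 'f']]
```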

Question 6

@roganjosh, Yes. Already added to my answer. :)

Question 7

So I see :) I think we can do better, with set intersection but I'll need to flesh it out a bit in my head. This is going to have to scan the lists twice

Question 8

is this a 2n^2 solution (time complexity)?
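On the complexity question: each `x not in List_2` test scans List_2 linearly, and the comprehension performs two such tests per sublist of List_1, so the work grows roughly with the product of the two list lengths. A possible alternative, sketched here under the assumption that the elements are hashable (the name `seen` is illustrative), builds a lookup set once so each membership test takes constant time on average:

```python
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]

# Build the lookup once: a single pass over List_2.
seen = {frozenset(sub) for sub in List_2}

# A single pass over List_1 with average constant-time membership tests.
out_list = [sub for sub in List_1 if frozenset(sub) not in seen]
print(out_list)  # [['T_2', 'T_3']]
```

Like the comprehension it replaces, this only reports sublists of List_1 that are missing from List_2, and it keeps the original sublists in their original order.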

Question 9

Here's a slightly messy functional solution that uses sets and tuples along the way (sets because what you're trying to calculate is the symmetric difference, and tuples because, unlike lists, they're hashable and can be used as set elements):

List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]
f = lambda l : tuple(sorted(l))
out_list = list(map(list, set(map(f, List_1)).symmetric_difference(map(f, List_2))))
print(out_list)

Output:

[['T_2', 'T_3']]
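The hashability point can be seen in isolation with a short sketch (the variable names `pairs` and `canonical` are illustrative only): a list cannot be placed in a set, while a sorted tuple can, and sorting first gives both orderings the same canonical form.

```python
pairs = [['T_1', 'T_2'], ['T_2', 'T_1']]

try:
    set(pairs)  # lists are mutable and unhashable, so this raises TypeError
except TypeError as exc:
    print("lists cannot be set elements:", exc)

# Sorting first gives both orderings the same canonical tuple.
canonical = {tuple(sorted(p)) for p in pairs}
print(canonical)  # {('T_1', 'T_2')}
```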

Question 10

I'd say frozensets are more appropriate for such a task:

fs2 = set(map(frozenset,List_2))
out = set(map(frozenset,List_1)).symmetric_difference(fs2)
print(out) 
# {frozenset({'T_2', 'T_3'})}

The advantage of using frozensets here is that they can be hashed, hence you can simply map both lists and take the set.symmetric_difference.

If you want a nested list from the output, you can simply do:

list(map(list, out))

Note that the elements of some sublists might come back in a different order, though given the task that should not be a problem.
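Two side effects of the frozenset approach may be worth checking against your data. The sketch below uses hypothetical sublists `a` and `b` that are not in the question: frozensets discard repeated elements inside a sublist, whereas sorted tuples keep them, and sorting the recovered sublists is one way to get a repeatable ordering.

```python
# Hypothetical sublists with a repeated element, not from the question.
a = ['x', 'x', 'y']
b = ['x', 'y']

print(frozenset(a) == frozenset(b))          # True: repeats are discarded
print(tuple(sorted(a)) == tuple(sorted(b)))  # False: repeats are kept

# Sorting each recovered sublist, and the outer list, gives a repeatable result.
out = {frozenset({'T_2', 'T_3'})}
nested = sorted(sorted(fs) for fs in out)
print(nested)  # [['T_2', 'T_3']]
```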

Question 11

Hmm yeah, I think you're right @MrGeek

Question 12

The issue now is that you don't get a list output, so one way or another, it's gonna look like MrGeek's answer if you follow it through to its entirety

Question 13

Yes, my whole point here is that using sets or frozensets might simply be more appropriate @roganjosh. So I'm not sure what OP wants to do from here, but depending on the task this might be more useful

Question 14

Note that if order is interchangeable, as it seems, why not use sets, @roganjosh? But yeah, if the output has to be a nested list, I totally agree

Question 15

Yes yes, I know you aren't, I'm just explaining my point of view, agreed :) @roganjosh

Question 16

You can convert the lists to sets for equality comparison and use any() to keep only the items that don't exist in the second list:

List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]
out_list = [l1 for l1 in List_1 if not any(set(l1) == set(l2) for l2 in List_2)]
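Note that this comprehension looks in one direction only: it reports sublists of List_1 that are missing from List_2, whereas the set-based answers compute a symmetric difference. The sketch below uses a hypothetical helper `missing_from` and an invented extra pair in the second list to show the difference, and how running the check both ways recovers the symmetric result:

```python
first = [['T_1', 'T_2'], ['T_2', 'T_3']]
second = [['T_2', 'T_1'], ['T_4', 'T_5']]  # hypothetical extra pair

def missing_from(src, other):
    """Return sublists of src with no same-element match in other."""
    return [a for a in src if not any(set(a) == set(b) for b in other)]

one_way = missing_from(first, second)
both_ways = one_way + missing_from(second, first)

print(one_way)    # [['T_2', 'T_3']]
print(both_ways)  # [['T_2', 'T_3'], ['T_4', 'T_5']]
```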

To better understand the resource consumption and efficiency of each answer, I've run some tests. I hope they help in choosing the best one.

Results on data from question:

Olvin Roght's answer - 12.963876624000001;
yatu's answer - 8.218290244000002;
rusu_ro1's answer - 8.857162503000001;
MrGeek's answer - 11.631234766000002;
Austin's answer - 3.452045860999995;
GZ0's answer - 7.037438627.

Results on bigger data:

Olvin Roght's answer - 83.452110953;
yatu's answer - 0.1939603360000035;
rusu_ro1's answer - 0.24479892000000802;
MrGeek's answer - 0.32636319700000627;
Austin's answer - 5.052051797000004;
GZ0's answer - 0.20400504799999908.
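The benchmark code itself is not shown here, so the following is only a sketch of how such a comparison could be set up with the standard `timeit` module. The function names, data sizes and repeat count are all assumptions, and the numbers it prints will not match the figures above.

```python
import random
import timeit

def any_based(l1, l2):
    # Pairwise set comparison: work grows with len(l1) * len(l2).
    return [a for a in l1 if not any(set(a) == set(b) for b in l2)]

def frozenset_based(l1, l2):
    # Hash-based symmetric difference: roughly linear in the input size.
    diff = set(map(frozenset, l1)).symmetric_difference(map(frozenset, l2))
    return list(map(list, diff))

# Assumed test data: random pairs of labels; the sizes are arbitrary.
random.seed(0)
labels = [f"T_{i}" for i in range(200)]
big_1 = [random.sample(labels, 2) for _ in range(300)]
big_2 = [random.sample(labels, 2) for _ in range(300)]

for func in (any_based, frozenset_based):
    seconds = timeit.timeit(lambda: func(big_1, big_2), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```

The two functions do not return identical results in general (one is one-directional, the other symmetric), so a fair comparison would first make them agree on what they return.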

Question 17

What about testing with 1k, 10k and 1000k elements? I'm sure the ranking will not be the same; with 3 elements the tests are not so relevant

Question 18

@rusu_ro1, of course. I've added tests with bigger data.

Question 19

The timings are interesting but it's not a level playing field. You can see my comments under yatu's answer; it doesn't give back a list.

Question 20

Also, Austin's answer flat-out is not extensible. I suspect you have learned something from the timing of your own answer, though :)

Question 21

@roganjosh, I've updated the functions to give back the same results. As for my answer, it's no surprise to me. Actually, it's one of the reasons I added the tests, because my solution looks the most compact but is definitely not the most efficient ;)

Question 22

if you do not have duplicates in your lists you can use:

output = set(frozenset(e) for e in List_1).symmetric_difference({frozenset(e) for e in List_2})

output:

{frozenset({'T_2', 'T_3'})}

if you need a list of lists as output you can use:

[list(o) for o in output]

output:

[['T_2', 'T_3']]
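The "no duplicates" condition matters because sets collapse repeated sublists. If repeats should count, one possible approach, sketched here with hypothetical data (`with_dupes` and `other` are not from the question), is to compare `collections.Counter` objects instead of sets:

```python
from collections import Counter

# Hypothetical data: the pair T_1/T_2 occurs twice in the first list.
with_dupes = [['T_1', 'T_2'], ['T_2', 'T_1'], ['T_2', 'T_3']]
other = [['T_1', 'T_2']]

# Sets collapse the repeated pair, so the surplus copy goes unreported.
as_sets = {frozenset(e) for e in with_dupes} ^ {frozenset(e) for e in other}
print(sorted(map(sorted, as_sets)))  # [['T_2', 'T_3']]

# Counters keep multiplicities; subtraction drops non-positive counts.
c1 = Counter(frozenset(e) for e in with_dupes)
c2 = Counter(frozenset(e) for e in other)
surplus = (c1 - c2) + (c2 - c1)
print(sorted(sorted(fs) for fs in surplus.elements()))
# [['T_1', 'T_2'], ['T_2', 'T_3']]
```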

Question 23

Here is a one-liner variant of the frozenset solutions from @yatu and @rusu_ro1 for those who prefer a more concise syntax:

out = [*map(list,{*map(frozenset,List_1)}^{*map(frozenset,List_2)})]

If it is not required to convert the output into a nested list, just do

out = {*map(frozenset,List_1)}^{*map(frozenset,List_2)}

Meanwhile, one advantage of using the symmetric_difference method rather than the ^ operator is that the former can take any iterable as its argument. This avoids explicitly converting map(frozenset,List_2) into a set, which may save a little time.

out = [*map(list,{*map(frozenset,List_1)}.symmetric_difference(map(frozenset,List_2)))]
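The difference between the operator and the method can be checked directly. In this sketch (the names `left` and `lazy_right` are illustrative), the `^` operator rejects a map object with a TypeError, while `symmetric_difference` accepts any iterable:

```python
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]

left = set(map(frozenset, List_1))
lazy_right = map(frozenset, List_2)  # an iterator, not a set

try:
    left ^ lazy_right  # the operator requires a set on both sides
except TypeError as exc:
    print("operator form rejected:", exc)

# The method form accepts any iterable; a fresh map object is passed here.
out = left.symmetric_difference(map(frozenset, List_2))
print(sorted(map(sorted, out)))  # [['T_2', 'T_3']]
```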

Accepted Answer: Austin's two-element list-comprehension answer above · 2019-08-31 10:53:26Z
