How can I get a flat result from a list comprehension instead of a nested list?

Question 1

I have a list A, and a function f which takes an item of A and returns a list. I can use a list comprehension to convert everything in A like [f(a) for a in A], but this returns a list of lists. Suppose my input is [a1,a2,a3], resulting in [[b11,b12],[b21,b22],[b31,b32]].

How can I get the flattened list [b11,b12,b21,b22,b31,b32] instead? In other words, in Python, how can I get what is traditionally called flatmap in functional programming languages, or SelectMany in .NET?

(In the actual code, A is a list of directories, and f is os.listdir. I want to build a flat list of subdirectories.)

See also: "How do I make a flat list out of a list of lists?" for the more general problem of flattening a list of lists after it's been created.

Question 2

You can have nested iterations in a single list comprehension:

[filename for path in dirs for filename in os.listdir(path)]

which is equivalent (at least functionally) to:

filenames = []
for path in dirs:
    for filename in os.listdir(path):
        filenames.append(filename)
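The same pattern works for any function that returns a list, which is the flatmap the question asks about. A minimal sketch, where `flat_map_comprehension` and `neighbors` are hypothetical names used only for illustration:

```python
def flat_map_comprehension(f, items):
    # Outer loop first, inner loop second: the clauses appear in the same
    # order as the equivalent nested for loops.
    return [result for item in items for result in f(item)]


def neighbors(n):
    # Illustrative function that returns a list for each input.
    return [n - 1, n + 1]


print(flat_map_comprehension(neighbors, [10, 20, 30]))
# [9, 11, 19, 21, 29, 31]
```

Reading the comprehension left to right after the first expression gives the loop nesting, which may help with the "order of the terms" confusion.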

Question 3

Although clever, that is hard to understand and not very readable.

Question 4

Doesn't really answer the question as asked. This is rather a workaround for not encountering the problem in the first place. What if you already have a list of lists? For example, what if your list of lists is the result of the multiprocessing module's map function? Perhaps the itertools solution or the reduce solution is best.

Question 5

Dave31415: [ item for list in listoflists for item in list ]

Question 6

'readability' is a subjective judgment. I find this solution quite readable.

Question 7

I thought it was readable too, until I saw the order of the terms... :(

Question 8

>>> from functools import reduce # not needed on Python 2
>>> list_of_lists = [[1, 2],[3, 4, 5], [6]]
>>> reduce(list.__add__, list_of_lists)
[1, 2, 3, 4, 5, 6]

The itertools solution is more efficient, but this feels very pythonic.

Question 9

For a list of 1,000 sublists, this is 100 times slower than the itertools way, and the difference only gets worse as your list grows.
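If the reduce style is still preferred, one way to avoid the repeated copying is to fold with an in-place concatenation and an explicit empty starting list. This is a sketch of an alternative, not the approach from the answer above; `flatten_inplace` is a name chosen here for illustration.

```python
from functools import reduce
import operator


def flatten_inplace(list_of_lists):
    # operator.iconcat(a, b) performs a += b, so the accumulator list is
    # extended in place instead of being rebuilt at every step.
    # The fresh [] initializer keeps the input sublists unmodified.
    return reduce(operator.iconcat, list_of_lists, [])


list_of_lists = [[1, 2], [3, 4, 5], [6]]
print(flatten_inplace(list_of_lists))  # [1, 2, 3, 4, 5, 6]
print(list_of_lists)                   # [[1, 2], [3, 4, 5], [6]]
```

Without the `[]` initializer, the first sublist would become the accumulator and be mutated.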

Question 10

You can find a good answer in the itertools recipes:

import itertools

def flatten(list_of_lists):
    return list(itertools.chain.from_iterable(list_of_lists))

Question 11

The same approach can be used to define flatmap, as proposed by this answer and this external blog post

Question 12

The question asked for flatmap. Some implementations have been proposed, but they may unnecessarily create intermediate lists. Here is one implementation that is based on iterators.

import itertools

def flatmap(func, *iterable):
    return itertools.chain.from_iterable(map(func, *iterable))

In [148]: list(flatmap(os.listdir, ['c:/mfg','c:/Intel']))
Out[148]: ['SPEC.pdf', 'W7ADD64EN006.cdr', 'W7ADD64EN006.pdf', 'ExtremeGraphics', 'Logs']

In Python 2.x, use itertools.imap in place of map.
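One way to see that an iterator-based flatmap builds no intermediate list is to feed it an infinite input. The sketch below assumes Python 3, where map is lazy; `repeat_n` is a hypothetical helper used only for illustration.

```python
import itertools


def flatmap(func, *iterables):
    return itertools.chain.from_iterable(map(func, *iterables))


def repeat_n(n):
    # Illustrative function: returns a list containing n copies of n.
    return [n] * n


# itertools.count(1) never ends, so an eager implementation would not return.
first_ten = list(itertools.islice(flatmap(repeat_n, itertools.count(1)), 10))
print(first_ten)  # [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
```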

Question 13

You could just do the straightforward:

subs = []
for d in dirs:
 subs.extend(os.listdir(d))
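Note that os.listdir returns bare entry names, in arbitrary order, without the parent directory. A small sketch of the same loop that keeps usable paths, run against a temporary directory tree; the directory and file names are made up for the example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build two directories with known contents (illustrative names).
    layout = {"a": ["x.txt", "y.txt"], "b": ["z.txt"]}
    dirs = []
    for name, files in layout.items():
        d = os.path.join(root, name)
        os.mkdir(d)
        dirs.append(d)
        for filename in files:
            open(os.path.join(d, filename), "w").close()

    subs = []
    for d in dirs:
        # Sort the names for a stable order and join each with its parent
        # so every entry can be used on its own.
        subs.extend(os.path.join(d, name) for name in sorted(os.listdir(d)))

    # Paths relative to the temporary root, for display only.
    relative = [os.path.relpath(p, root) for p in subs]

print(relative)
```

list.extend accepts any iterable, so the generator expression is consumed directly without building a temporary list.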

Question 14

Yep, this is fine (though not quite as good as @Ants') so I'm giving it a +1 to honor its simplicity!

Question 15

You can concatenate lists using the normal addition operator:

>>> [1, 2] + [3, 4]
[1, 2, 3, 4]

The built-in function sum will add the numbers in a sequence and can optionally start from a specific value:

>>> sum(range(10), 100)
145

Combine the above to flatten a list of lists:

>>> sum([[1, 2], [3, 4]], [])
[1, 2, 3, 4]

You can now define your flatmap:

>>> def flatmap(f, seq):
...     return sum([f(s) for s in seq], [])
...
>>> flatmap(lambda n: list(range(n)), [1, 2, 3])  # f must return a list
[0, 0, 1, 0, 1, 2]

Edit: I just saw the critique in the comments for another answer and I guess it is correct that Python will needlessly build and garbage collect lots of smaller lists with this solution. So the best thing that can be said about it is that it is very simple and concise if you're used to functional programming :-)

Question 16

import itertools
x = [['b11', 'b12'], ['b21', 'b22'], ['b31']]
y = list(itertools.chain(*x))
print(y)

itertools will work from python2.3 and greater

Question 17

subs = []
# Python 2 only: in Python 3, map is lazy, so subs.extend would never be called.
map(subs.extend, (os.listdir(d) for d in dirs))

(but Ants's answer is better; +1 for him)

Question 18

Using reduce (or sum, which saves you many characters and an import;-) for this is just wrong -- you keep uselessly tossing away old lists to make a new one for each d. @Ants has the right answer (smart of @Steve to accept it!).

Question 19

You can't say in general that this is a bad solution. It depends on whether performance is even an issue. Simple is better unless there is a reason to optimize. That's why the reduce method could be best for many problems. For example you have a slow function that produces a list of a few hundred objects. You want to speed it up by using multiprocessing 'map' function. So you create 4 processes and use reduce to flat map them. In this case the reduce function is fine and very readable. That said, it's good that you point out why this can be suboptimal. But it is not always suboptimal.

Question 20

You could try itertools.chain(), like this:

import itertools
import os
dirs = ["c:\\usr", "c:\\temp"]
subs = list(itertools.chain(*[os.listdir(d) for d in dirs]))
print(subs)

itertools.chain() returns an iterator, hence the passing to list().
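One difference between `itertools.chain(*x)` and `itertools.chain.from_iterable(x)` is when the outer iterable is consumed: star-unpacking evaluates all of it up front, while `from_iterable` pulls one sublist at a time. A small sketch that records consumption; the generator `sublists` and the `consumed` log are illustrative names:

```python
import itertools

consumed = []


def sublists():
    # Generator that records each sublist index as it is produced.
    for i in range(3):
        consumed.append(i)
        yield [i, i * 10]


lazy = itertools.chain.from_iterable(sublists())
seen_before = list(consumed)        # nothing has been pulled yet
first = next(lazy)                  # pulls only the first sublist
seen_after_first = list(consumed)

consumed.clear()
eager = itertools.chain(*sublists())
seen_by_unpacking = list(consumed)  # unpacking ran the whole generator
flat = list(eager)

print(seen_before, first, seen_after_first)  # [] 0 [0]
print(seen_by_unpacking)                     # [0, 1, 2]
print(flat)                                  # [0, 0, 1, 10, 2, 20]
```

For a list that already exists in memory the two forms give the same result; the difference matters mainly for generators or very long outer sequences.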

Question 21

This is the simplest way to do it:

from functools import reduce  # needed on Python 3

def flatMap(array):
    return reduce(lambda a, b: a + b, array)

The 'a + b' refers to the concatenation of two lists.

Question 22

You can use pyxtension:

from pyxtension.streams import stream
stream([ [1,2,3], [4,5], [], [6] ]).flatMap() == range(7)

Question 23

Can this directly replace a list comprehension like [f(a) for a in A] (where f returns a list)? Or does it only flatten a list of lists after the fact?

Question 24

@KarlKnechtel - it actually does both: it can replace the list comprehension by applying a function f in this way: stream([ [1,2,3], [4,5], [], [6] ]).flatMap( f ) AND it returns a flattened list after that, with f applied over the elements of the flattened list

Question 25

As far as I can tell, the question is specifically about replacing a list comprehension, since otherwise it would be a duplicate of stackoverflow.com/questions/952914. Mind editing to show a more appropriate example?

Question 26

@KarlKnechtel Yes, you are right - I indeed missed the spec that f returns a list. I can't edit the answer, so I will post a new answer here: No - it can't directly replace that list comprehension, as [f(a) for a in A] (where f returns a list) is simply a mapping, which would be equivalent to stream(A).map( f ), while stream(A).flatMap( f ) would be equivalent to stream(A).map( f ).flatMap() - I hope this is slightly clearer.

Question 27

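Google brought me the following solution:

def flatten(l):
    if isinstance(l, list):
        # The [] start value makes sum concatenate lists instead of adding to 0.
        return sum(map(flatten, l), [])
    else:
        # Wrap a leaf in a list so it can be concatenated by the caller.
        return [l]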

Question 28

Would be a little better if it handled generator expressions too, and would be a lot better if you explained how to use it...
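A generator-based variant along those lines might look like the sketch below. It is not the code from the answer: `deep_flatten` is a hypothetical name, and treating strings and bytes as leaves is an assumption made here. It flattens arbitrarily deep nesting, which goes beyond the one-level flattening the question asks about.

```python
from collections.abc import Iterable


def deep_flatten(obj):
    """Yield the leaves of an arbitrarily nested iterable, depth first."""
    for item in obj:
        # Strings and bytes are iterable but are treated as leaves here;
        # otherwise a one-character string would recurse forever.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from deep_flatten(item)
        else:
            yield item


# Usage: pass any iterable, including one containing generator expressions.
nested = [1, [2, (3, 4)], (x for x in [5, [6]]), "ab"]
print(list(deep_flatten(nested)))  # [1, 2, 3, 4, 5, 6, 'ab']
```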

Question 29

This answer belongs on stackoverflow.com/questions/2158395 instead, but it would likely be a duplicate there.

Question 30

You can also use the flatten function using numpy:

import numpy as np
matrix = [[i+k for i in range(10)] for k in range(10)]
matrix_flat = np.array(matrix).flatten()

numpy documentation flatten

Question 31

Here are flatten and flat_map functions that can be applied to any iterable:

def flatten(iters):
    for it in iters:
        for elem in it:
            yield elem

def flat_map(fn, it):
    return flatten(map(fn, it))

Usage is simple:

for e in flat_map(range, [1, 2, 3]):
    print(e, end=" ")
# Output: 0 0 1 0 1 2

flatten can be written in a recursive manner as well:

def flatten(it):
    # `it` must be an iterator (e.g. iter(list_of_lists)), since next() is called on it.
    try:
        yield from next(it)
        yield from flatten(it)
    except StopIteration:
        pass

Question 32

Minor suggestion: print(*flat_map(range, [1, 2, 3]), sep=' ')

Question 33

I was looking for flatmap and found this question first. flatmap is basically a generalization of what the original question asks for. If you are looking for a concise way of defining flatmap for summable collections such as lists you can use

sum(map(f,xs),[])

It's only a little longer than just writing

flatmap(f,xs)

but also potentially less clear at first.

EDIT: as noted by an attentive commenter, the Python documentation advises against using sum() for concatenating sequences. For the examples below, assume that sum() is replaced by itertools.chain.

The sanest solution would be to have flatmap as a basic function inside the programming language but as long as it is not, you can still define it using a better or more concrete name:

# `function` must turn the element type of `xs` into a summable type.
# `function` must be defined for arguments constructed without parameters.
def aggregate(function, xs):
    return sum( map(function, xs), type(function( type(xs)() ))() )

# or only for lists
aggregate_list = lambda f,xs: sum(map(f,xs),[])

Strings are not summable unfortunately, it won't work for them.
You can do

assert( aggregate_list( lambda x: x * [x], [2,3,4] ) == [2,2,3,3,3,4,4,4,4] )

but you can't do

def get_index_in_alphabet(character):
    return (ord(character) & ~0x20) - ord('A')

assert(aggregate( lambda x: get_index_in_alphabet(x) * x, "abcd") == "bccddd")

For strings, you need to use

aggregate_string = lambda f,s: "".join(map(f,s)) # looks almost like sum(map(f,s),"")
assert( aggregate_string( lambda x: get_index_in_alphabet(x) * x, "abcd" ) == "bccddd" )

It's obviously a mess, requiring different function names and even syntax for different types. Hopefully, Python's type system will be improved in the future.

Question 34

even the documentation, from at least Python 2.7, recommends not to use sum for concatenation, so don't suggest it!

Question 35

@cards Thank you for letting us know. The documentation recommends using itertools.chain. I have no idea why Python makes us import such basic stuff. It's like requiring people to import a for-loop.

Question 36

If listA=[list1,list2,list3]
flattened_list=reduce(lambda x,y:x+y,listA)

This will do.

Question 37

This is a very inefficient solution if the sublists are large. The + operator between two lists is O(n+m)
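To make the cost concrete, the sketch below counts how many elements get written when folding with `+` versus extending in place. It is a simplified model under stated assumptions: it starts from an empty accumulator (reduce without an initializer starts from the first sublist) and ignores the occasional reallocation inside list.extend. The function names are made up for the example.

```python
def elements_copied_by_add(list_of_lists):
    # Each `acc + sub` allocates a brand-new list of len(acc) + len(sub).
    copied = 0
    acc = []
    for sub in list_of_lists:
        acc = acc + sub
        copied += len(acc)
    return copied


def elements_copied_by_extend(list_of_lists):
    # extend appends only the len(sub) new elements to the existing list.
    copied = 0
    acc = []
    for sub in list_of_lists:
        acc.extend(sub)
        copied += len(sub)
    return copied


data = [[0] * 10 for _ in range(1000)]
print(elements_copied_by_add(data))     # 5005000
print(elements_copied_by_extend(data))  # 10000
```

The first count grows quadratically with the number of sublists, the second linearly.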

Accepted answer: Ants Aasma, 2009-07-02 23:32:56Z (the nested list comprehension shown above).


