I have a list A, and a function f which takes an item of A and returns a list. I can use a list comprehension to convert everything in A like [f(a) for a in A], but this returns a list of lists. Suppose my input is [a1,a2,a3], resulting in [[b11,b12],[b21,b22],[b31,b32]].
How can I get the flattened list [b11,b12,b21,b22,b31,b32] instead? In other words, in Python, how can I get what is traditionally called flatmap in functional programming languages, or SelectMany in .NET?
(In the actual code, A is a list of directories, and f is os.listdir. I want to build a flat list of subdirectories.)
See also: How do I make a flat list out of a list of lists? for the more general problem of flattening a list of lists after it's been created.
16 Answers
You can have nested iterations in a single list comprehension:
[filename for path in dirs for filename in os.listdir(path)]
which is equivalent (at least functionally) to:
filenames = []
for path in dirs:
    for filename in os.listdir(path):
        filenames.append(filename)
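If the pattern is needed in more than one place, the nested comprehension can be wrapped in a small helper. This is a minimal sketch; the name flat_map is an assumption for illustration, not a Python built-in:

```python
def flat_map(f, items):
    """Apply f to each item and concatenate the resulting iterables into one list."""
    # The outer loop comes first, exactly as in the nested for-statement form.
    return [y for x in items for y in f(x)]


# Example with a function that returns a list for each input.
pairs = flat_map(lambda n: [n, n * 10], [1, 2, 3])
print(pairs)  # [1, 10, 2, 20, 3, 30]
```

With the directory example from the question, the call would be flat_map(os.listdir, dirs).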
16 Comments
[ item for list in listoflists for item in list ]
>>> from functools import reduce  # not needed on Python 2
>>> list_of_lists = [[1, 2],[3, 4, 5], [6]]
>>> reduce(list.__add__, list_of_lists)
[1, 2, 3, 4, 5, 6]
The itertools solution is more efficient, but this feels very pythonic.
1 Comment
itertools way and the difference only gets worse as your list grows.
You can find a good answer in the itertools recipes:
import itertools
def flatten(list_of_lists):
    return list(itertools.chain.from_iterable(list_of_lists))
The question asked for flatmap. Some of the proposed implementations may create intermediate lists unnecessarily. Here is one implementation that is based on iterators.
def flatmap(func, *iterable):
    return itertools.chain.from_iterable(map(func, *iterable))
In [148]: list(flatmap(os.listdir, ['c:/mfg','c:/Intel']))
Out[148]: ['SPEC.pdf', 'W7ADD64EN006.cdr', 'W7ADD64EN006.pdf', 'ExtremeGraphics', 'Logs']
In Python 2.x, use itertools.imap in place of map.
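Because the result is an iterator, nothing is computed until items are requested. The sketch below assumes Python 3 (where map is lazy) and uses a single-iterable variant of flatmap to show that it even works on an endless input:

```python
import itertools


def flatmap(func, iterable):
    """Lazily apply func to each element and chain the results."""
    return itertools.chain.from_iterable(map(func, iterable))


# itertools.count() never ends, so an eager flatmap would never return here.
lazy = flatmap(lambda n: [n] * n, itertools.count(1))
first_six = list(itertools.islice(lazy, 6))
print(first_six)  # [1, 2, 2, 3, 3, 3]
```

A list-building implementation would hang on the same input, which is the practical difference between the two styles.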
You could just do the straightforward:
subs = []
for d in dirs:
    subs.extend(os.listdir(d))
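Note that os.listdir returns bare names, and it returns files as well as directories. If the goal is a flat list of subdirectory paths, as the question mentions, the loop can join each name back onto its parent and filter with os.path.isdir. This is a sketch; the function name list_subdirs and the temporary directory layout are assumptions for illustration only:

```python
import os
import tempfile


def list_subdirs(dirs):
    """Return full paths of the immediate subdirectories of each directory in dirs."""
    subs = []
    for d in dirs:
        # os.listdir returns bare names, so join them back onto the parent.
        subs.extend(
            os.path.join(d, name)
            for name in os.listdir(d)
            if os.path.isdir(os.path.join(d, name))
        )
    return subs


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "x"))
    os.makedirs(os.path.join(root, "b", "y"))
    open(os.path.join(root, "a", "note.txt"), "w").close()
    found = list_subdirs([os.path.join(root, "a"), os.path.join(root, "b")])
    print([os.path.relpath(p, root) for p in found])
```

The regular file note.txt is skipped, and the results keep the order of the input directories.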
You can concatenate lists using the normal addition operator:
>>> [1, 2] + [3, 4]
[1, 2, 3, 4]
The built-in function sum will add the numbers in a sequence and can optionally start from a specific value:
>>> sum(range(10), 100)
145
Combine the above to flatten a list of lists:
>>> sum([[1, 2], [3, 4]], [])
[1, 2, 3, 4]
You can now define your flatmap:
>>> def flatmap(f, seq):
...     return sum([f(s) for s in seq], [])
...
>>> flatmap(range, [1,2,3])
[0, 0, 1, 0, 1, 2]
Edit: I just saw the critique in the comments for another answer and I guess it is correct that Python will needlessly build and garbage collect lots of smaller lists with this solution. So the best thing that can be said about it is that it is very simple and concise if you're used to functional programming :-)
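The cost mentioned in the edit can be observed directly. The following sketch compares sum with itertools.chain.from_iterable on a made-up workload (the sizes are arbitrary assumptions, and the timings depend on the machine, so none are quoted here):

```python
import itertools
import timeit

# Hypothetical workload: 1,000 small lists of 10 items each.
list_of_lists = [list(range(10)) for _ in range(1000)]


def flatten_sum(lists):
    # Each + builds a brand-new list, so earlier items are copied repeatedly.
    return sum(lists, [])


def flatten_chain(lists):
    # chain.from_iterable walks every item exactly once.
    return list(itertools.chain.from_iterable(lists))


# Both approaches produce the same result...
assert flatten_sum(list_of_lists) == flatten_chain(list_of_lists)

# ...but the timings diverge as the input grows (numbers vary by machine).
print("sum:  ", timeit.timeit(lambda: flatten_sum(list_of_lists), number=5))
print("chain:", timeit.timeit(lambda: flatten_chain(list_of_lists), number=5))
```

Increasing the number of inner lists should widen the gap, since the sum version recopies the accumulated result at every step.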
import itertools
x=[['b11','b12'],['b21','b22'],['b31']]
y=list(itertools.chain(*x))
print(y)
itertools is available in Python 2.3 and later.
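subs = []
# Python 2 only: map() is lazy in Python 3, so nothing is appended there
# unless the map object is consumed.
map(subs.extend, (os.listdir(d) for d in dirs))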
(but Ants's answer is better; +1 for him)
You could try itertools.chain(), like this:
import itertools
import os
dirs = ["c:\\usr", "c:\\temp"]
subs = list(itertools.chain(*[os.listdir(d) for d in dirs]))
print(subs)
itertools.chain() returns an iterator, hence the passing to list().
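One difference between itertools.chain(*x) and itertools.chain.from_iterable(x) is worth illustrating: star-unpacking has to consume the whole outer iterable up front, while from_iterable pulls from it on demand. In the sketch below, batches is a hypothetical generator that records when each inner list is produced:

```python
import itertools

consumed = []


def batches():
    """Hypothetical generator that records when each batch is produced."""
    for i in range(3):
        consumed.append(i)
        yield [i, i + 10]


# Star-unpacking has to exhaust the generator before chain() is even called.
eager = itertools.chain(*batches())
print(consumed)  # [0, 1, 2]

consumed.clear()

# from_iterable pulls one batch at a time, only when items are requested.
lazy = itertools.chain.from_iterable(batches())
first = next(lazy)
print(first, consumed)  # 0 [0]
```

For a list of directories the difference is negligible, but it matters when the outer iterable is large or expensive to produce.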
This is the simplest way to do it:
from functools import reduce  # built in on Python 2
def flatMap(array):
    return reduce(lambda a, b: a + b, array)
Here a + b concatenates two lists.
You can use pyxtension:
from pyxtension.streams import stream
list(stream([ [1,2,3], [4,5], [], [6] ]).flatMap()) == [1, 2, 3, 4, 5, 6]
4 Comments
[f(a) for a in A] (where f returns a list)? Or does it only flatten a list of lists after the fact?
f in this way: stream([ [1,2,3], [4,5], [], [6] ]).flatMap( f ) AND it returns a flattened list after that, with f applied over the elements of the flattened list
f returns a list. Can't edit the answer, so will post a new answer here: No, it can't directly replace that list comprehension, as [f(a) for a in A] (where f returns a list) is simply a mapping, which would be equivalent to stream(A).map( f ), while stream(A).flatMap( f ) would be equivalent to stream(A).map( f ).flatMap(). I hope this is slightly clearer.
Google led me to the following solution:
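def flatten(l):
    if isinstance(l, list):
        # Start from [] so that sum() concatenates lists instead of adding to 0.
        return sum(map(flatten, l), [])
    else:
        # Wrap a leaf in a list so it can be concatenated.
        return [l]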
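The same recursive idea can be written as a generator, which avoids building intermediate lists at every level. This is a sketch rather than the answer's own method; the name deep_flatten and the decision to treat only lists and tuples as containers are assumptions:

```python
def deep_flatten(obj):
    """Yield the leaves of arbitrarily nested lists and tuples, left to right."""
    if isinstance(obj, (list, tuple)):
        for item in obj:
            # Recurse into each element; leaves are yielded by the else branch.
            yield from deep_flatten(item)
    else:
        # Anything that is not a list or tuple (including str) counts as a leaf.
        yield obj


nested = [1, [2, [3, 4]], (5, "six"), []]
print(list(deep_flatten(nested)))  # [1, 2, 3, 4, 5, 'six']
```

Note that this flattens every level, which is more than flatmap does: flatmap removes exactly one level of nesting.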
You can also use the flatten function using numpy:
import numpy as np
matrix = [[i+k for i in range(10)] for k in range(10)]
matrix_flat = np.array(matrix).flatten()
Here are flatten and flat_map functions that can be applied to any iterable:
def flatten(iters):
    for it in iters:
        for elem in it:
            yield elem
def flat_map(fn, it):
    return flatten(map(fn, it))
Usage is simple:
for e in flat_map(range, [1, 2, 3]):
    print(e, end=" ")
# Output: 0 0 1 0 1 2
flatten can be written in a recursive manner as well:
def flatten(it):
    # `it` must be an iterator (for example a map object), because next() is called on it.
    try:
        yield from next(it)
        yield from flatten(it)
    except StopIteration:
        pass
1 Comment
print(*flat_map(range, [1, 2, 3]), sep=' ')
I was looking for flatmap and found this question first. flatmap is basically a generalization of what the original question asks for. If you are looking for a concise way of defining flatmap for summable collections such as lists you can use
sum(map(f,xs),[])
It's only a little longer than just writing
flatmap(f,xs)
but also potentially less clear at first.
EDIT: as an attentive commenter noted, the Python documentation advises against using sum() to concatenate sequences and suggests itertools.chain() instead. For that purpose, assume that sum() is replaced by itertools.chain in what follows.
The sanest solution would be to have flatmap as a basic function inside the programming language but as long as it is not, you can still define it using a better or more concrete name:
# `function` must turn the element type of `xs` into a summable type.
# `function` must be defined for arguments constructed without parameters.
def aggregate(function, xs):
    return sum( map(function, xs), type(function( type(xs)() ))() )
# or only for lists
aggregate_list = lambda f,xs: sum(map(f,xs),[])
Unfortunately, strings are not summable, so it won't work for them.
You can do
assert( aggregate_list( lambda x: x * [x], [2,3,4] ) == [2,2,3,3,3,4,4,4,4] )
but you can't do
def get_index_in_alphabet(character):
    return (ord(character) & ~0x20) - ord('A')
assert(aggregate( lambda x: get_index_in_alphabet(x) * x, "abcd") == "bccddd")
For strings, you need to use
aggregate_string = lambda f,s: "".join(map(f,s)) # looks almost like sum(map(f,s),"")
assert( aggregate_string( lambda x: get_index_in_alphabet(x) * x, "abcd" ) == "bccddd" )
It's obviously a mess, requiring different function names and even syntax for different types. Hopefully, Python's type system will be improved in the future.
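One way to sidestep the per-type variants is to have flat_map return a lazy iterator and let the caller choose the final container. This is a sketch under that assumption, using itertools.chain rather than sum; the function name flat_map is illustrative:

```python
from itertools import chain


def flat_map(f, xs):
    """Return a lazy iterator over the concatenated results of f."""
    return chain.from_iterable(map(f, xs))


# The caller picks the container: list() for lists, "".join() for strings.
as_list = list(flat_map(lambda x: x * [x], [2, 3, 4]))
as_text = "".join(flat_map(lambda c: c * 2, "abc"))

print(as_list)  # [2, 2, 3, 3, 3, 4, 4, 4, 4]
print(as_text)  # aabbcc
```

The mapping function stays the same for every type; only the final collection step differs.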
2 Comments
sum for concatenation, so don't suggest it!
itertools.chain. I have no idea why Python makes us import such basic stuff. It's like requiring people to import a for-loop.
If listA=[list1,list2,list3]
from functools import reduce  # needed on Python 3; built in on Python 2
flattened_list = reduce(lambda x, y: x + y, listA)
This will do.
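If reduce is preferred, the repeated copying done by x + y can be avoided by extending one accumulator in place. The sketch below swaps in operator.iconcat with an explicit empty-list start value; it is an alternative technique, not what the answer above does:

```python
from functools import reduce
import operator


def flatten(list_of_lists):
    """Concatenate the inner lists into one new list."""
    # iconcat(a, b) performs a += b, so the accumulator grows in place.
    # The fresh [] also keeps the input lists unmodified and handles empty input.
    return reduce(operator.iconcat, list_of_lists, [])


list_a = [[1, 2], [3, 4, 5], [], [6]]
print(flatten(list_a))  # [1, 2, 3, 4, 5, 6]
print(flatten([]))      # []
```

Without the initial [], reduce would raise a TypeError on an empty input and would mutate the first inner list.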
1 Comment
+ operator between two lists is O(n+m)