I need to generate all the partitions of a given integer.
I found this algorithm by Jerome Kelleher for which it is stated to be the most efficient one:
def accelAsc(n):
a = [0 for i in range(n + 1)]
k = 1
a[0] = 0
y = n - 1
while k != 0:
x = a[k - 1] + 1
k -= 1
while 2*x <= y:
a[k] = x
y -= x
k += 1
l = k + 1
while x <= y:
a[k] = x
a[l] = y
yield a[:k + 2]
x += 1
y -= 1
a[k] = x + y
y = x + y - 1
yield a[:k + 1]
reference: http://homepages.ed.ac.uk/jkellehe/partitions.php
By the way, it is not quite efficient. For an input like 40
it freezes nearly my whole system for few seconds before giving its output.
If it was a recursive algorithm I would try to decorate it with a caching function or something to improve its efficiency, but being like that I can't figure out what to do.
Do you have some suggestions about how to speed up this algorithm? Or can you suggest me another one, or a different approach to make another one from scratch?
-
2Just because it takes a few seconds to compute 40 it doesn't mean that is not efficient.Rik Poggi– Rik Poggi2012年04月20日 10:25:27 +00:00Commented Apr 20, 2012 at 10:25
-
5That algorithm doesn't yield compositions, it yields partitions. But that was a fortunate mistake: there are 549755813888 compositions of 40, which would stall anyone's computer.DSM– DSM2012年04月20日 10:27:29 +00:00Commented Apr 20, 2012 at 10:27
-
2Please edit your question, as it's confusing those who are actually looking for integer compositions.Tomek Kaftal– Tomek Kaftal2014年01月28日 10:59:41 +00:00Commented Jan 28, 2014 at 10:59
-
Reference page has moved to: jeromekelleher.net/generating-integer-partitions.htmlSteve Byrne– Steve Byrne2016年10月16日 04:38:41 +00:00Commented Oct 16, 2016 at 4:38
4 Answers 4
To generate compositions directly you can use the following algorithm:
def ruleGen(n, m, sigma):
"""
Generates all interpart restricted compositions of n with first part
>= m using restriction function sigma. See Kelleher 2006, 'Encoding
partitions as ascending compositions' chapters 3 and 4 for details.
"""
a = [0 for i in range(n + 1)]
k = 1
a[0] = m - 1
a[1] = n - m + 1
while k != 0:
x = a[k - 1] + 1
y = a[k] - 1
k -= 1
while sigma(x) <= y:
a[k] = x
x = sigma(x)
y -= x
k += 1
a[k] = x + y
yield a[:k + 1]
This algorithm is very general, and can generate partitions and compositions of many different types. For your case, use
ruleGen(n, 1, lambda x: 1)
to generate all unrestricted compositions. The third argument is known as the restriction function, and describes the type of composition/partition that you require. The method is efficient, as the amount of effort required to generate each composition is constant, when you average over all the compositions generated. If you would like to make it slightly faster in python then it's easy to replace the function sigma with 1.
It's worth noting here as well that for any constant amortised time algorithm, what you actually do with the generated objects will almost certainly dominate the cost of generating them. For example, if you store all the partitions in a list, then the time spent managing the memory for this big list will be far greater than the time spent generating the partitions.
Say, for some reason, you want to take the product of each partition. If you take a naive approach to this, then the processing involved is linear in the number of parts, whereas the cost of generation is constant. It's quite difficult to think of an application of a combinatorial generation algorithm in which the processing doesn't dominate the cost of generation. So, in practice, there'll be no measurable difference between using the simpler and more general ruleGen with sigma(x) = x and the specialised accelAsc.
-
How would I generate all compositions of cardinality k for non-negative integers?polarise– polarise2015年04月15日 11:59:06 +00:00Commented Apr 15, 2015 at 11:59
-
I found this, which is along the lines of what I'm interested in: people.sc.fsu.edu/~jburkardt/py_src/subset/comp_enum.pypolarise– polarise2015年04月15日 13:28:03 +00:00Commented Apr 15, 2015 at 13:28
If you are going to use this function repeatedly for the same inputs, it could still be worth caching the return values (if you are going to use it across separate runs, you could store the results in a file).
If you can't find a significantly faster algorithm, then it should be possible to speed this up by an order of magnitude or two by moving the code into a C extension (this is probably easiest using cython), or alternatively by using PyPy instead of CPython (PyPy has its downsides - it does not yet support Python 3, or some commonly-used libraries like numpy and scipy).
The reason for this is, since python is dynamically typed, the interpreter is probably spending most of its time checking the types of the variables - for all the interpreter knows, one of the operations could turn x
into a string, in which case expressions like x + y
would suddenly have very different meanings. Cython gets around this problem by allowing you to statically declare the variables as integers, while PyPy has a just-in-time compiler which minimises redundant type checks.
Testing with n=75 I get:
PyPy 1.8:
w:\>c:\pypy-1.8\pypy.exe pstst.py
1.04800009727 secs.
CPython 2.6:
w:\>python pstst.py
5.86199998856 secs.
Cython + mingw + gcc 4.6.2:
w:\pstst> python -c "import pstst;pstst.run()"
4.06399989128
I saw no difference with Psyco(?)
The run function:
def run():
import time
start = time.time()
for p in accelAsc(75):
pass
print time.time() - start, 'secs.'
If I change the definition of accelAsc for Cython to start with:
def accelAsc(int n):
cdef int x, y, k
# no more changes..
I get the Cython time down to 2.27 secs.
I'd say that your performance issue is somewhere else.
I didn't compare it with other approaches, but it does seem efficient to me:
import time
start = time.time()
partitions = list(accelAsc(40))
print('time: {:.5f} sec'.format(time.time() - start))
print('length:', len(partitions))
Gave:
time: 0.03636 sec
length: 37338
-
Don't time Python stuff like this, use the
timeit
module.Gareth Latty– Gareth Latty2012年04月24日 20:19:46 +00:00Commented Apr 24, 2012 at 20:19 -
@Lattyware: I don't time Python stuff like this, this is not meant to be a performance timing. I was showing the OP that I couldn't reproduce his "few seconds freeze".Rik Poggi– Rik Poggi2012年04月24日 20:29:38 +00:00Commented Apr 24, 2012 at 20:29
Explore related questions
See similar questions with these tags.