Skip to main content
Code Review

Return to Answer

typo
Source Link
Gareth Rees
  • 50.1k
  • 3
  • 130
  • 210

You updated your question to say that you are trying to simulate a Chinese restaurant process. I have soto say that I can't relate the code in your question to this process, which has nothing to do with numbers that sum to a particular total. So I'm puzzled! Maybe you can say more about how the code you posted relates to this process?

Partition Count
---------- ------
1234 249231
134, 2 83453
124, 3 82664
123, 4 83569
1, 234 83161
14, 23 42126
1413, 2,24 3 41891 41688
1312, 2434 4168841726
1314, 2, 43 4156941891
1213, 34 2, 4 4172641569
12, 3, 4 42022
1, 24, 3 41522
1, 23, 4 41719
1, 2, 34 41867
1, 2, 3, 4 41792

You updated your question to say that you are trying to simulate a Chinese restaurant process. I have so say that I can't relate the code in your question to this process, which has nothing to do with numbers that sum to a particular total. So I'm puzzled! Maybe you can say more about how the code you posted relates to this process?

Partition Count
---------- ------
1234 249231
134, 2 83453
124, 3 82664
123, 4 83569
1, 234 83161
14, 23 42126
14, 2, 3 41891
13, 24 41688
13, 2, 4 41569
12, 34  41726
12, 3, 4 42022
1, 24, 3 41522
1, 23, 4 41719
1, 2, 34 41867
1, 2, 3, 4 41792

You updated your question to say that you are trying to simulate a Chinese restaurant process. I have to say that I can't relate the code in your question to this process, which has nothing to do with numbers that sum to a particular total. So I'm puzzled! Maybe you can say more about how the code you posted relates to this process?

Partition Count
---------- ------
1234 249231
134, 2 83453
124, 3 82664
123, 4 83569
1, 234 83161
14, 23 42126
13, 24  41688
12, 34 41726
14, 2, 3 41891
13, 2, 4 41569
12, 3, 4 42022
1, 24, 3 41522
1, 23, 4 41719
1, 2, 34 41867
1, 2, 3, 4 41792
section about Chinese restaurant process
Source Link
Gareth Rees
  • 50.1k
  • 3
  • 130
  • 210

5. Chinese restaurant process

You updated your question to say that you are trying to simulate a Chinese restaurant process . I have so say that I can't relate the code in your question to this process, which has nothing to do with numbers that sum to a particular total. So I'm puzzled! Maybe you can say more about how the code you posted relates to this process?

Anyway, if you want to generate a sample from this distribution, the easiest thing to do is to simulate the process directly:

def chinese_restaurant_partition(n):
 """
 Return a random partition of the numbers from 1 to n generated by
 the Chinese restaurant process.
 """
 partition = [] # list of sets
 assignment = [] # set each number is assigned to
 for i in xrange(n):
 r = random.randrange(i + 1)
 if r < i:
 # assign i + 1 to existing block
 block = assignment[r]
 else:
 # create new block
 block = set()
 partition.append(block)
 block.add(i + 1)
 assignment.append(block)
 return partition

For example:

>>> chinese_restaurant_partition(4)
[set([1]), set([2, 4]), set([3])]

This distribution looks like this (again, after generating a million samples):

Partition Count
---------- ------
1234 249231
134, 2 83453
124, 3 82664
123, 4 83569
1, 234 83161
14, 23 42126
14, 2, 3 41891
13, 24 41688
13, 2, 4 41569
12, 34 41726
12, 3, 4 42022
1, 24, 3 41522
1, 23, 4 41719
1, 2, 34 41867
1, 2, 3, 4 41792

5. Chinese restaurant process

You updated your question to say that you are trying to simulate a Chinese restaurant process . I have so say that I can't relate the code in your question to this process, which has nothing to do with numbers that sum to a particular total. So I'm puzzled! Maybe you can say more about how the code you posted relates to this process?

Anyway, if you want to generate a sample from this distribution, the easiest thing to do is to simulate the process directly:

def chinese_restaurant_partition(n):
 """
 Return a random partition of the numbers from 1 to n generated by
 the Chinese restaurant process.
 """
 partition = [] # list of sets
 assignment = [] # set each number is assigned to
 for i in xrange(n):
 r = random.randrange(i + 1)
 if r < i:
 # assign i + 1 to existing block
 block = assignment[r]
 else:
 # create new block
 block = set()
 partition.append(block)
 block.add(i + 1)
 assignment.append(block)
 return partition

For example:

>>> chinese_restaurant_partition(4)
[set([1]), set([2, 4]), set([3])]

This distribution looks like this (again, after generating a million samples):

Partition Count
---------- ------
1234 249231
134, 2 83453
124, 3 82664
123, 4 83569
1, 234 83161
14, 23 42126
14, 2, 3 41891
13, 24 41688
13, 2, 4 41569
12, 34 41726
12, 3, 4 42022
1, 24, 3 41522
1, 23, 4 41719
1, 2, 34 41867
1, 2, 3, 4 41792
explain how to sample the ordered integer partitions uniformly
Source Link
Gareth Rees
  • 50.1k
  • 3
  • 130
  • 210

4. Sampling partitions uniformly

Supposing that you wanted to sample the space of ordered integer partitions uniformly, how could you implement it?

First we have to know how to count ordered integer partitions. And to do that, there's a neat combinatorial observation. Consider a group of objects waiting to be partitioned:

Four objects with three locations for partition

If there are n objects, there are n − 1 locations where you could insert a partition (indicated by the dotted lines). For example, if we insert partitions at the first and the third locations, we get the partition 1, 2, 1:

Four objects partitioned as 1, 2, 1

Each subset of locations corresponds to a different ordered partition, and each ordered partition corresponds to a different subset of locations. This means that we can choose an ordered partition uniformly by choosing a subset of locations uniformly. There are 2n − 1 subsets of the n − 1 locations (for example, 4 objects have 23 = 8 ordered partitions, as noted above), and we can choose one of these subsets uniformly by putting each location into the subset with probability 1⁄2.

Here's a revised function that implements the new algorithm:

def random_ints_with_sum(n):
 """
 Generate positive random integers summing to `n`, sampled
 uniformly from the ordered integer partitions of `n`.
 """
 p = 0
 for _ in xrange(n - 1):
 p += 1
 if random.randrange(2):
 yield p
 p = 0
 yield p + 1

And here's an example distribution after a million calls to the revised function:

Partition Count
----------- ------
4, 125042
3, 1 124848
2, 2 125189
2, 1, 1 125126
1, 3 124861
1, 2, 1 125152
1, 1, 2 124248
1, 1, 1, 1 125534

You'll see that this is very close to uniform.

Again, I should emphasize that since I don't know what your function is going to be used for, I don't know whether changing it so that it samples uniformly from the ordered integer partitions is the right thing to do. But if it is, now you know how to do it.

(Who would have thought there would be so much to write about a six-line function?)

4. Sampling partitions uniformly

Supposing that you wanted to sample the space of ordered integer partitions uniformly, how could you implement it?

First we have to know how to count ordered integer partitions. And to do that, there's a neat combinatorial observation. Consider a group of objects waiting to be partitioned:

Four objects with three locations for partition

If there are n objects, there are n − 1 locations where you could insert a partition (indicated by the dotted lines). For example, if we insert partitions at the first and the third locations, we get the partition 1, 2, 1:

Four objects partitioned as 1, 2, 1

Each subset of locations corresponds to a different ordered partition, and each ordered partition corresponds to a different subset of locations. This means that we can choose an ordered partition uniformly by choosing a subset of locations uniformly. There are 2n − 1 subsets of the n − 1 locations (for example, 4 objects have 23 = 8 ordered partitions, as noted above), and we can choose one of these subsets uniformly by putting each location into the subset with probability 1⁄2.

Here's a revised function that implements the new algorithm:

def random_ints_with_sum(n):
 """
 Generate positive random integers summing to `n`, sampled
 uniformly from the ordered integer partitions of `n`.
 """
 p = 0
 for _ in xrange(n - 1):
 p += 1
 if random.randrange(2):
 yield p
 p = 0
 yield p + 1

And here's an example distribution after a million calls to the revised function:

Partition Count
----------- ------
4, 125042
3, 1 124848
2, 2 125189
2, 1, 1 125126
1, 3 124861
1, 2, 1 125152
1, 1, 2 124248
1, 1, 1, 1 125534

You'll see that this is very close to uniform.

Again, I should emphasize that since I don't know what your function is going to be used for, I don't know whether changing it so that it samples uniformly from the ordered integer partitions is the right thing to do. But if it is, now you know how to do it.

(Who would have thought there would be so much to write about a six-line function?)

Source Link
Gareth Rees
  • 50.1k
  • 3
  • 130
  • 210
Loading
lang-py

AltStyle によって変換されたページ (->オリジナル) /