Password Attacker (Google apac test problem)

Question 1

Short Summary:

Given a \$M\$ possible characters all of which must be used at least once, find the number of possible combinations of length (exactly) \$N\$.

The first input contains the number of cases, followed by pairs of M, N. Output is modulo 1000000007(\10ドル^9+7\$).

Example Input:
4
1 1
3 4
5 5
15 15
Example Output:
Case #1: 1
Case #2: 36
Case #3: 120
Case #4: 674358851

The stupid solution is to simply generate all possible repeated_permutations and then exclude the ones which do not use at least M different characters (this of course is painfully slow).

def num_pw_slow(m,n)
 m.times.to_a.repeated_permutation(n).count do |row|
 m.times.map {|i| row.include? i}.all? 
 end
end

The real solution:

It's noticeable that the number of solutions which include all M characters is also: (The number of repeated permutations) \$ - \$ (the number of repeated permutations missing at least one character).

The number of repeated permutations is simply \$m^n\$. The second is a little more involved. Combinations are given by \$n!/ (n-k)!k!\$ where \$n=m\$ and \$k = m-1\$. Permutations are \$(m-1)^n\,ドル the tricky part is we need the set of permutations, since each of the \$(m-1)\$ possible combinations differs from another by a single character, its repeated_permutations will contain a duplicates for all permutations which exclude that character.

For example if m=3, n=4, there are \3ドル^4\$ repeated_permutations possible - with repeated permutations of ab, bc, ca (possible combinations of m, with length m-1) missing one required character when expanded to length n, the set operation over which needs to eliminate duplicates from repeated_permutations of a, b, c So the solution is then:

3**4 - (2**4*3 - 1**4*3)

This can be generalized for any m, n e.g.

m**n - (m-1)*num_combinations**n
#second term is the number of duplicates amongst repeated_permutations of (m-1)
+ (m-2)*num_combinations**n 
#third term is the number of duplicates in the second term
- (m-3)*num_combinations**n 
#fourth term is duplicates in the third term
+ (m-4)*num_combinations**n
#and so on

Which gives the following code in Ruby:

(m**n) - (1..m-1).reduce(0) { |sum, i| sum + (i % 2 == 0 ? -1 : 1)*combinations(m, m-i)*(m-i)**n }

Full Solution:

def factorial(num)
 (1..(num.zero? ? 1 : num)).reduce(:*)
end
def combinations(n, k)
 factorial(n) / (factorial(n-k)*factorial(k))
end
def num_pw(m, n)
 (m**n) - (1..m-1).reduce(0) { |sum, i| sum + (i % 2 == 0 ? -1 : 1)*combinations(m, m-i)*(m-i)**n }
end
num_cases = gets.chomp.to_i
num_cases.times do |num|
 m, n = gets.chomp.split.map(&:to_i)
 puts "Case ##{num+1}: #{num_pw(m, n) % (10**9+7)}"
end

This took me over three hours to figure out (well over the exam running time) and quite inconsistent with my times on other problems (which are usually close or slightly better than average) - so for whoever solved this in under 45 minutes or so, I'd be very much interested how. Math SE gives a concise solution with the use of Sterling's number (I'd never heard of it before and I'm not sure if this is expected knowledge or not?)

Question 2

Only a wild guess (I haven't tried anything) but maybe you could have came up with a solution involving dynamic programming, something like stackoverflow.com/questions/5133050/… except that you didn't know it was named that way.

Question 3

Yeah I think if you know what Stirling's number is it's easy; without knowing I found it quite difficult though.

Question 4

This kind of problem can be solved with dynamic programming. The key points are :

you care about the number of combinations and not the list of them
assuming you know how to compute it for some values, you can go forward and compute it for even more values.

Here is how it could go (not verified whatsoever).

Given the length n (n> 0) and the number of distinct characters m (m> 0), you know that :

nb_comb(n, 1) = 1 which corresponds to strings with a single character.
nb_comb(n, m) = 0 if 1 <= n < m which corresponds to the fact that the string is too short to use all characters
nb_comb(n, m) = m * nb_comb(n-1, m-1) + m * nb_comb(n-1, m) if 1 < m < n which corresponds to the fact that to build a string of n characters, you can consider that you have to build a string of length n-1 and to add one at the end. Basically, you can either :
- consider strings containing m-1 distinct characters and add the missing character : you have nb_comb(n-1, m-1) such strings for each of the m distinct characters. This contributes to m * nb_comb(n-1, m-1) combinations.
- consider string containing all m characters and add any of the already added character : you have nb_comb(n-1, m) such strings and m ways to add an already existing character at the end. This contributes to m * nb_comb(n-1, m) new combinations.

(You can check that strings have not been counted twice (either the last character is a duplicate from another or it is not a duplicate).)

Without knowing it corresponds to Stirling numbers, I managed to deduce the same equations as the one in that stackoverflow question.

Python recursive solution to check it works :

import itertools
def nb(n, m):
 #print(n, m, "?")
 assert n>0
 assert m>0
 if m == 1:
 res = 1
 elif n == 1:
 res = 0
 else:
 res = m * (nb(n-1, m-1) + nb(n-1, m))
 if False: # check results
 slow_res = sum(1 for p in itertools.product(range(m), repeat=n) if len(set(p)) == m)
 assert slow_res == res
 return res
tests = [
 (1, 1, 1),
 (2, 2, 2),
 (3, 3, 6),
 (4, 3, 36),
 (5, 5, 120),
 (15, 15, 674358851),
]
for n, m, r in tests:
 nb_ = nb(n, m) % (1000000007)
 assert nb_ == r, "(%s,%s,%s != %s)" % (n, m, r, nb_)

This still gets way too slow for bigger solutions. Having a look at the functions calls for the case 15-15, for instance, adding a call to print at the beginning of the fonction and running python my_script | sort -n | uniq -c | sort -n, you get :

 ...
 1001 (1, 11)
 1001 (1, 5)
 1287 (2, 10)
 1287 (2, 7)
 1716 (2, 8)
 1716 (2, 9)
 2002 (1, 10)
 2002 (1, 6)
 3003 (1, 7)
 3003 (1, 9)
 3432 (1, 8)

showing that the fonction gets called with the same arguments thousands of times.

A solution would be memoization but a better one is to go the other way, generating the different values.

A way to do so is :

def nb2(n, m):
 # nb(i, j) is in results[i-1][j-1]
 results = [[None for j in range(m)] for i in range(n)]
 for j in range(m):
 results[0][j] = 0
 for i in range(n):
 results[i][0] = 1
 for i in range(1, n):
 for j in range(1, m):
 results[i][j] = (j+1) * (results[i-1][j-1] + results[i-1][j])
 return results[n-1][m-1]

Question 5

Does this produce correct results? Before I had a solution I considered building the set of the required characters \$m!\$ and inserting additional ones, but I didn't see how you calculate the number of unique insertions vs. duplicate ones efficiently.

Question 6

I haven't tested yet :-/

Question 7

To clarify I was referring to your second point: consider string containing all m characters and add any of the already added character, it's obvious you can calculate the duplicates produced from say "ab".insert(0, "a") and "ab".insert(1, "a") but how do you check palindromic duplicates with such an approach? eg. "ab".insert(0, "b") is also a duplicate of "ba".insert(2, "b").

Question 8

I guess my explanation is not precise enough but I assumed adding at the end of the string.

Question 9

Then your initial string isn't \$m!\$ though - idk probably I would understand your approach better with an example solution or a formula.

SylvainD SylvainD 29.7k1 gold badge49 silver badges93 bronze badges · Answer 1 · 2015-05-22 12:43:23Z

This kind of problem can be solved with dynamic programming. The key points are :

you care about the number of combinations and not the list of them
assuming you know how to compute it for some values, you can go forward and compute it for even more values.

Here is how it could go (not verified whatsoever).

Given the length n (n> 0) and the number of distinct characters m (m> 0), you know that :

nb_comb(n, 1) = 1 which corresponds to strings with a single character.
nb_comb(n, m) = 0 if 1 <= n < m which corresponds to the fact that the string is too short to use all characters
nb_comb(n, m) = m * nb_comb(n-1, m-1) + m * nb_comb(n-1, m) if 1 < m < n which corresponds to the fact that to build a string of n characters, you can consider that you have to build a string of length n-1 and to add one at the end. Basically, you can either :
- consider strings containing m-1 distinct characters and add the missing character : you have nb_comb(n-1, m-1) such strings for each of the m distinct characters. This contributes to m * nb_comb(n-1, m-1) combinations.
- consider string containing all m characters and add any of the already added character : you have nb_comb(n-1, m) such strings and m ways to add an already existing character at the end. This contributes to m * nb_comb(n-1, m) new combinations.

(You can check that strings have not been counted twice (either the last character is a duplicate from another or it is not a duplicate).)

Without knowing it corresponds to Stirling numbers, I managed to deduce the same equations as the one in that stackoverflow question.

Python recursive solution to check it works :

import itertools
def nb(n, m):
 #print(n, m, "?")
 assert n>0
 assert m>0
 if m == 1:
 res = 1
 elif n == 1:
 res = 0
 else:
 res = m * (nb(n-1, m-1) + nb(n-1, m))
 if False: # check results
 slow_res = sum(1 for p in itertools.product(range(m), repeat=n) if len(set(p)) == m)
 assert slow_res == res
 return res
tests = [
 (1, 1, 1),
 (2, 2, 2),
 (3, 3, 6),
 (4, 3, 36),
 (5, 5, 120),
 (15, 15, 674358851),
]
for n, m, r in tests:
 nb_ = nb(n, m) % (1000000007)
 assert nb_ == r, "(%s,%s,%s != %s)" % (n, m, r, nb_)

This still gets way too slow for bigger solutions. Having a look at the functions calls for the case 15-15, for instance, adding a call to print at the beginning of the fonction and running python my_script | sort -n | uniq -c | sort -n, you get :

 ...
 1001 (1, 11)
 1001 (1, 5)
 1287 (2, 10)
 1287 (2, 7)
 1716 (2, 8)
 1716 (2, 9)
 2002 (1, 10)
 2002 (1, 6)
 3003 (1, 7)
 3003 (1, 9)
 3432 (1, 8)

showing that the fonction gets called with the same arguments thousands of times.

A solution would be memoization but a better one is to go the other way, generating the different values.

A way to do so is :

def nb2(n, m):
 # nb(i, j) is in results[i-1][j-1]
 results = [[None for j in range(m)] for i in range(n)]
 for j in range(m):
 results[0][j] = 0
 for i in range(n):
 results[i][0] = 1
 for i in range(1, n):
 for j in range(1, m):
 results[i][j] = (j+1) * (results[i-1][j-1] + results[i-1][j])
 return results[n-1][m-1]

Does this produce correct results? Before I had a solution I considered building the set of the required characters \$m!\$ and inserting additional ones, but I didn't see how you calculate the number of unique insertions vs. duplicate ones efficiently.
To clarify I was referring to your second point: consider string containing all m characters and add any of the already added character, it's obvious you can calculate the duplicates produced from say "ab".insert(0, "a") and "ab".insert(1, "a") but how do you check palindromic duplicates with such an approach? eg. "ab".insert(0, "b") is also a duplicate of "ba".insert(2, "b").
I guess my explanation is not precise enough but I assumed adding at the end of the string.
Then your initial string isn't \$m!\$ though - idk probably I would understand your approach better with an example solution or a formula.

Stack Exchange Network

Password Attacker (Google apac test problem)

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Password Attacker (Google apac test problem)

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions