6
\$\begingroup\$

Link to problem: link
Number of solvers: 27 out of 844
Solve rate (per person): 3.199%
Success rate (over all submissions): 1.78%

Project Euler+ on HackerRank is basically supposed to be a generalization of Project Euler problems, but in a programming contest-like format.

Project Euler+ #250 is as follows:

Find the number of nonempty subsets of \$\left\{1^1,2^2,3^3,...,n^n\right\}\$, the sum of whose elements are divisible by \$k\$. Print you answer modulo \10ドル^9\$.

Input

The only line of input contains integers \$n\$ and \$k\$ separated by a single space.

Constraints

  • \1ドル\le n\le10^{400}\$
  • \3ドル\le k\le50\$

There are a total of 56 testcases (from 0 to 55), with only testcase 0 and 1 given as samples (which I easily pass):

'''
Testcase 0:
IN : 6 50
OUT: 0
Testcase 1:
IN : 10 50
OUT: 21
'''

Here is my code currently:

# Python 3
def compute(n, k):
 '''
 With n,k integers, this function finds the number of non-empty
 subsets of {1^1,2^2,3^3,...,n^n} such that the sum of its elements
 is divisible by k. The answer is returned (mod 10^9) as an integer.
 
 Constraints: 1 <= n <= 10**400
 3 <= k <= 50
 '''
 array = [1] + [0]*(k-1)
 values = [0 if i%k==0 else pow(i,i,k) for i in range(1,n+1)]
 # ^ This somewhat speeds up execution time, but it still times out on the
 # exact same 32 test cases that it always does.
 # At least it gets the correct answers that it doesn't time out for faster :D
 # (remember that i==0(mod k) implies i**i==0(mod k) since k divides i, but
 # it is not true in general that i==j(mod k) implies i**i==j(mod k))
 for x in values:
 array = [(array[y]+array[(x+y)%k])%(10**9) for y in range(len(array))]
 # Oh, so this must be the thing that must be making it time out. This most
 # likely is optimized, so maybe I just need to figure out a way to more
 # efficiently take (i**i) mod k.
 return array[0]-1
n, k = map(int,input().split(" "))
print(compute(n, k))

The problem that I am currently facing is that I am getting on the following 32 test cases (as in it produces correct answers for the other 24):

'''
+--------------+--------------+--------------+--------------+
| Test Case 08 | Test Case 12 | Test Case 16 | Test Case 20 |
| Test Case 09 | Test Case 13 | Test Case 17 | Test Case 21 |
| Test Case 10 | Test Case 14 | Test Case 18 | Test Case 22 |
| Test Case 11 | Test Case 15 | Test Case 19 | Test Case 23 |
|--------------+--------------+--------------+--------------|
| Test Case 24 | Test Case 28 | Test Case 48 | Test Case 52 |
| Test Case 25 | Test Case 29 | Test Case 49 | Test Case 53 |
| Test Case 26 | Test Case 46 | Test Case 50 | Test Case 54 |
| Test Case 27 | Test Case 47 | Test Case 51 | Test Case 55 |
+--------------+--------------+--------------+--------------+
'''

I get that Python 3 is not the best choice of programming language for this type of problem since Python seems to be slow for this type of task. See this discussion post made 5 years ago.

However my question is:

How can I make my code more time efficient?

It seems like it should be possible since searching the submissions by language yields six Python3 submissions that do score 100/100, but I'm not sure what I'm not understanding about the method I need to use.

toolic
15.1k5 gold badges29 silver badges211 bronze badges
asked Feb 13 at 21:19
\$\endgroup\$
8
  • \$\begingroup\$ "i==j(mod k) [does not imply] i**i==j(mod k)'" what about j**i(mod k)? \$\endgroup\$ Commented Feb 14 at 9:25
  • \$\begingroup\$ Actually something that could help is the property that \$i^i\equiv(i+100)^{i+100}(\operatorname{mod}\;100)\$. Since \$k\$ only ranges from 3 to 50 (inclusive), shouldn't we only need to look at the last 2 digits of \$i^i\$? \$\endgroup\$ Commented Feb 14 at 15:20
  • \$\begingroup\$ @greybeard I can confirm that (where \$\%\$ represents modulo, and where \$n\ge2\$ and is an integer) \$(i^i\%n)\equiv((i\%n)^i)\%n\$ \$\endgroup\$ Commented Feb 14 at 16:07
  • \$\begingroup\$ "last 2 digits?" don't think so - "answer modulo 10**9". What does k have to do with that? \$\endgroup\$ Commented Feb 14 at 16:13
  • \$\begingroup\$ @greybeard Printing the answer modulo 10**9 is an additional requirement that was thrown in by the creator of Project Euler+ #250 to make sure the output for some testcases isn't too large. \$\endgroup\$ Commented Feb 14 at 16:18

2 Answers 2

5
\$\begingroup\$

Your iteration over range(1,n+1) is completely futile, as n can be \10ドル^{400}\$. Here's how I did it.

Your values have repetition. For example:

>>> [pow(i, i, 3) for i in range(1, 31)]
[1, 1, 0, 1, 2, 0, 1, 1, 0, 1, 2, 0, 1, 1, 0, 1, 2, 0, 1, 1, 0, 1, 2, 0, 1, 1, 0, 1, 2, 0]

That's 1, 1, 0, 1, 2, 0 repeated over and over again. Figure out the repetition, then compute how often each residue occurs. Above we have the pattern 1, 1, 0, 1, 2, 0 five times, so the residues 0, 1 and 2 occur 10, 15 and 5 times, respectively.

Your operation

array = [(array[y]+array[(x+y)%k])%(10**9) for y in range(len(array))]

for a residue x can be represented as a multiplication with a matrix. For example for k=5, x=2, and current array=[a0, ..., a4]:

 | 1 0 1 0 0 | | a0 |
 | 0 1 0 1 0 | | a1 |
array = | 0 0 1 0 1 | * | a2 |
 | 1 0 0 1 0 | | a3 |
 | 0 1 0 0 1 | | a4 |

The main diagonal of 1s reflects the array[y] in your list comprehension, and the other diagonal of 1s (rotated right by x=2 cells) reflects your array[(x+y)%k].

Now let's say you calculated that x=2 occurs 1000 times. Do you multiply with that matrix 1000 times? No. You use exponentiation by squaring to compute the 1000th power of the matrix, and multiply your array with that.

Actually the rows of the matrix power are all the same, just rotated, so you don't need full matrices, just one row. That's k times faster.

With this, I solved all but four test cases. Then, instead of computing the matrix powers for all k residues separately, I instead first combined (i.e., multiplied) the base matrices for residues that occur equally often. This way I need fewer matrix powers. And it was enough to get 100% acceptance.

answered Feb 16 at 16:04
\$\endgroup\$
6
  • \$\begingroup\$ Question is: you say that the array = [list comprehension] operation can be represented as multiplication of a matrix and a vector. Are there any resources (or any tips you have) for deriving this representation in the general case? I have been able to list all of the periods for \$i^i\% k\$ for all values of \3ドル\le k\le50\$ if it helps. \$\endgroup\$ Commented Feb 18 at 21:04
  • \$\begingroup\$ Actually, it's also possible to note something that interests me about this: If we were to expand the context to only prime numbers \$p_0,p_1,...p_i\$ as a value for \$k\,ドル all we have to do to compute its period is to use the fact that the period for a prime is \$f(p):=p(p-1)\$. (That's at least what I noticed for all primes less than 50, although this interpolation should be correct for all primes) \$\endgroup\$ Commented Feb 18 at 21:07
  • 1
    \$\begingroup\$ @CrSb0001 I added a bit about the matrix to the answer now. \$\endgroup\$ Commented Feb 18 at 23:06
  • \$\begingroup\$ I would love to see the code for this, or atleast some sample for how you combined the base matrices for equally frequent residues \$\endgroup\$ Commented Feb 19 at 13:06
  • 1
    \$\begingroup\$ @NaitikMundra I don't really want to spoil the challenge any more than I already have, but ok, that's a boring part, so here it is :-) \$\endgroup\$ Commented Feb 19 at 22:06
2
\$\begingroup\$

Documentation

The docstring is quite useful. It specifies the input types and ranges, and it describes what the function is doing.

Comments

These comments can be deleted since they merely describe the performance issue:

# ^ This somewhat speeds up execution time, but it still times out on the
# exact same 32 test cases that it always does.
# At least it gets the correct answers that it doesn't time out for faster :D

This comment is promising because it describes parts of your algorithm:

# (remember that i==0(mod k) implies i**i==0(mod k) since k divides i, but
# it is not true in general that i==j(mod k) implies i**i==j(mod k))

I recommend adding more detail to it.

Again, this comment can be deleted since it merely describes the performance issue:

 # Oh, so this must be the thing that must be making it time out. This most
 # likely is optimized, so maybe I just need to figure out a way to more
 # efficiently take (i**i) mod k.

Readability

The code is hard to read and understand because the code is quite dense.

I ran the black program to automatically reformat the code. It helps a little because it adds consistent spacing around operators.

Naming

array is not a very descriptive name for a variable. Give it a better name to show what it represents. values is a slightly better name, but the same advice applies.

Efficiency

There is no need to repeatedly calculate len(array) inside the loop. You can assign it to a value outside the loop:

nums = len(array)
for x in values:
 array = [(array[y] + array[(x + y) % k]) % (10**9) for y in range(nums)]

You could also try to factor the magic number outside of the loop:

nums = len(array)
BIG_NUM = 10**9
for x in values:
 array = [(array[y] + array[(x + y) % k]) % BIG_NUM for y in range(nums)]

I think your assessment is correct that the code in this for loop is probably taking the most time:

for x in values:

It is a nested loop.


The suggestions above likely won't solve your performance problem, but maybe they are a step towards a solution, at least until a mathematician posts an answer.

answered Feb 13 at 22:38
\$\endgroup\$
3
  • \$\begingroup\$ Sorry it took so long to respond. This sadly does not help with optimization, although thanks for the ideas! \$\endgroup\$ Commented Feb 14 at 17:15
  • \$\begingroup\$ Although what I have done (renaming array to offsets) is notice that offsets has length k, so we only need to do the nested loop for y in range(k). Also, we can use the property \$i^i\%k\equiv(i\%k)^i\%k\$ and write values=[pow((i+1)%k,i+1,k)for i in range(n)] (although NB: this has not helped me get less cases where I time out) \$\endgroup\$ Commented Feb 14 at 17:17
  • 1
    \$\begingroup\$ @CrSb0001: Great! Although I could not directly help with your main issue, sometimes minor suggestions can make things more obvious to the person who wrote the code. \$\endgroup\$ Commented Feb 14 at 17:19

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.