I have created a brainteaser, which I called the "impossible problem" (see this video I made). The aim is to create a valid 3-digit subtraction which uses every digit from 1 to 9 once. Here is an example of one of the solutions: 873-254=619.
After finding it hard to come up with answers I created this program. What it basically does is iterate through every possible 3-digit subtraction and if it finds one which fits the criteria, it prints it.
from math import trunc
def solver(number,number2, numberofdigits):
seq = str(number),str(number2), str(number - number2)
digits = "".join(seq)
goodChecks = 0
count= numberofdigits/3
for i in range(1,10):
if digits.count(str(i)) == count:
goodChecks += 1
if goodChecks == 9:
return digits
else:
return False
middlenumber =[]
successes = 0
num_of_digits = int(input("please enter a number of digits, which is a multiple of 3"))
if num_of_digits == 3:
minY = 381
maxY = 987
if num_of_digits == 6:
minY =246912
maxY = 998877
if num_of_digits == 3:
minX = 123
if num_of_digits == 6:
minX =123123
for y in range(minY, maxY+1):
numberlist = []
if y%100 == 0:
print(y)
for x in range(minX,trunc(y/2)):
digits = solver(y,x,num_of_digits)
if digits is not False:
successes += 2
print(digits)
numberlist.extend([x,y-x])
middlenumber.extend([x, y-x])
print("")
print("I found: ", successes, " successful solution to your brainteaser")
if successes < 20:
print("there were almost no solutions")
elif successes < 100:
print("there were not many solutions")
elif successes < 1000:
print("there were more than a hundred solutions it is definitely not impossible :)")
else:
print("that's a lot of successes")
print("There were ", len(middlenumber) - len(set(middlenumber)) , " duplicates, by the way :)")
My program worked quite well but then I decided could you do the same for 6 digits? For the 3 digit problem the program had 96 possiblities to iterate over. Which is a measly 531,441 iterations. However for the 6 digit there is 912 possiblities, which is a colossal 282,429,536,481 iterations. This is going to take my computer days to solve.
I have already tried optimizing my program but I can't figure out how to do it any faster.
-
1\$\begingroup\$ Welcome to Code Review! Good job on your first question. \$\endgroup\$SirPython– SirPython2016年03月24日 20:13:53 +00:00Commented Mar 24, 2016 at 20:13
-
2\$\begingroup\$ Thankyou for the edits, my question reads much better now. \$\endgroup\$Joe Clinton– Joe Clinton2016年03月24日 20:21:47 +00:00Commented Mar 24, 2016 at 20:21
-
3\$\begingroup\$ "do the same for 6 digits". What does this mean? I can't do it exactly the same as I will repeat digits. Is the idea to have 2 of each digit? \$\endgroup\$Justin– Justin2016年03月25日 02:12:50 +00:00Commented Mar 25, 2016 at 2:12
-
\$\begingroup\$ Yes, for 3 digits you would have numbers 1-9 once. For 6 digits numbers 1-9 twice. For 9 digits (supercomputer needed), numbers 1-9 three times. And so on ... Sorry for not explaining \$\endgroup\$Joe Clinton– Joe Clinton2016年03月25日 17:36:21 +00:00Commented Mar 25, 2016 at 17:36
3 Answers 3
first of all try use middlenumber as set. Than adding numbers no by
extends
but something asmiddlenumber.add(x < y - x and (x, y-x) or (y-x, x))
. You have strange duplicates numbernaive brute force algorithm use 9^6 checking as you mentioned
something better when you change to permutations (9!/3! checking). There example for simpler 3-digit version
from itertools import permutations def permutations_solver(): digits = set('987654321') numbers = (''.join(x) for x in permutations(digits, 6)) for x in numbers: a = int(x[0 : 3]) b = int(x[3 : 6]) c = a - b if b > c and c > 0 and set(str(c) + x) == digits: yield a, b, c print(len(list(permutations_solver())))
checking
b > c and c > 0 and set(str(c) + x) == digits
above still is very ineffectiveyou can try write iterator on digits, something as
""" set('abcdefgh') == set('123456789') and set('zxyr') == set('01') zxyr abc < def (enough a < d) abc z == 0 +def x + a + d == g + 10 * z ---- y + b + e == h + 10 * x ghi r + c + f == i + 10 * y zxy r == 0 // it looks like logic programming problem """ from itertools import product def digits_solver(): column = [(x, a, d, (x + a + d) % 10, (x + a + d) // 10) for x, a, d in product(range(2), range(1, 10), range(1, 10)) if len(set([a, d, (x + a + d) % 10])) == 3 and (x + a + d) % 10 != 0] first_column = [(x, a, d, g) for x, a, d, g, z in column if z == 0 and a < d] for x, a, d, g in first_column: second_column = [(y, b, e, h) for y, b, e, h, xx in column if xx == x and not set([a,d,g]) & set([b,e,h])] for y, b, e, h in second_column: third_column = [(r, c, f, i) for r, c, f, i, yy in column if yy == y and not set([a,d,g,b,e,h]) & set([c,f,i]) and r == 0 ] for _, c, f, i in third_column: yield (100 * g + 10 * h + i, 100 * d + 10 * e + f, 100 * a + 10 * b + c)
in above example
for
byfor
byfor
is bad idea, you can remove it by recursion. The code will be much less messyif you want try 6-digits number, you can change check unique digits (using
set
at examples) to multiset (at python you can useCounter
class from collections library to emulate it)
-
\$\begingroup\$ +1 just for the
digits_solver
implementation, which makes the code run 30 times faster on my computer \$\endgroup\$David Z– David Z2016年03月26日 14:26:50 +00:00Commented Mar 26, 2016 at 14:26 -
\$\begingroup\$ Hi, when you wrote this a few years ago I didn't understand it. But now that I've advanced far further in programming I do, and I can say it is a really nice solution. Thankyou. I've marked it as solved. \$\endgroup\$Joe Clinton– Joe Clinton2019年05月06日 15:56:47 +00:00Commented May 6, 2019 at 15:56
For another perspective in addition to the existing answers, here are my thoughts. I've made a github repository which goes through each of the changes I made step by step. I've linked to the corresponding commit for each of the points that follow. (There are a few relatively minor commits in the repository that I left out of this answer.)
Code clarity issues
It's a lot easier to understand your code if you put things in functions. Conventionally, a Python script has a
main
function that does everything, and it's invoked withif __name__ == '__main__': main()
So at a minimum, I'd suggest wrapping all your code (except for the
solver
function) in amain
function.It's good to separate the input/output from the calculation. So the
for
loop in the middle should probably be broken out into its own function. You should give it a name that clearly identifies what it does, likefind_solutions
.Don't declare
middlenumber
andsuccesses
all the way at the beginning of the code. Wait until you need them. In fact, if you're breaking out thefor
loop into its own function, you should put those variables in the function itself. Have the function return the number of solutions and number of duplicates, because that's all you need.solver
is not a great name. That function actually checks whether two numbers and their difference contain all the digits from 1 to 9 a specified number of times. So give it a name that reflects that. You can expand on the name by using a doc comment.Similarly,
number
andnumber2
aren't very conventional names, although in this case because you're just running a mathematical algorithm, there aren't much better choices. It's common to usea
andb
in this case, since the numbers are integers, orx
andy
if they were floats. And it probably makes more sense to pass the number of times each digit should appear as the argument, not the number of digits in each of the original numbers, because that way it's a lot easier to understand what the code does.middlenumber
is another name that doesn't make any sense. That variable actually holds entries representing the solutions you've found, so why not name itsolutions
accordingly?Get rid of variables you don't use anywhere, specifically
numberlist
.It's conventional to use
None
as the return value to indicate that a function doesn't have any meaningful result to return, so I'd suggest using that instead ofFalse
. Nothing wrong withFalse
, but it's a bit strange to have a function returning either a boolean or a string depending on the result of a calculation.Don't pass around more variables than you need to, unless it helps performance. In this case you don't need to return both the number of successes and the list of successes. You can find the number of successes by taking the length of the list.
The calculation of maxes and mins probably belongs as part of the solution-finding function. That helps you avoid passing around more variables than you need to; with this change you only need to pass one argument, the number of digits.
Speed issues
Since you indicated an interest in speeding up the program, the first thing to do is set a benchmark for how fast it runs. You can do this using the cProfile
module's run()
function:
if __name__ == '__main__':
import cProfile
cProfile.run('main()')
It gives you output like this:
1461380 function calls in 0.795 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.795 0.795 <string>:1(<module>)
1 0.033 0.033 0.795 0.795 impossibleproblem.py:17(find_solutions)
132781 0.582 0.000 0.762 0.000 impossibleproblem.py:3(check_for_all_digits)
1 0.000 0.000 0.795 0.795 impossibleproblem.py:36(main)
1 0.000 0.000 0.795 0.795 {built-in method exec}
3 0.000 0.000 0.000 0.000 {built-in method len}
4 0.000 0.000 0.000 0.000 {built-in method print}
607 0.000 0.000 0.000 0.000 {built-in method trunc}
1195029 0.166 0.000 0.166 0.000 {method 'count' of 'str' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
168 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
2 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
132781 0.014 0.000 0.014 0.000 {method 'join' of 'str' objects}
Most of the time is spent in check_for_all_digits()
(my renamed version of your solver()
function), and after that in str.count()
, so you'll want to focus your optimization efforts on those parts of the code. There are basically two approaches you can take:
You can try to improve the implementation of
check_for_all_digits()
, as suggested by lvc. There are a few ways to go about this:- Count all the digits at once using a
collections.Counter
, which is made for this task, instead of iterating over the string 9 times. - It turns out that
Counter
isn't particularly fast here, so you can use a list to make your own version that is optimized for counting digits. - Instead of using string parsing, you can convert the digits to character codes and use them as indices into the list.
All of these together get you a roughly 30% speedup in the runtime of the program. This is somewhat straightforward to generalize to the 6-digit case, but it will still take a long time.
1303505 function calls in 0.513 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 0.513 0.513 <string>:1(<module>) 199193 0.026 0.000 0.026 0.000 impossibleproblem.py:13(<genexpr>) 1 0.044 0.044 0.513 0.513 impossibleproblem.py:18(find_solutions) 1 0.000 0.000 0.513 0.513 impossibleproblem.py:37(main) 132781 0.383 0.000 0.468 0.000 impossibleproblem.py:4(check_for_all_digits) 71070 0.014 0.000 0.032 0.000 {built-in method all} 1 0.000 0.000 0.513 0.513 {built-in method exec} 3 0.000 0.000 0.000 0.000 {built-in method len} 899672 0.046 0.000 0.046 0.000 {built-in method ord} 4 0.000 0.000 0.000 0.000 {built-in method print} 607 0.000 0.000 0.000 0.000 {built-in method trunc} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 168 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects} 2 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
- Count all the digits at once using a
Alternatively, instead of checking whether a given combination of digits has all 9 digits in it, you can only iterate over permutations of the 9 digits which don't have repeats, and test whether two of the numbers formed this way add up to the third. This allows you to use the numeric arithmetic check written by vaeta, so you avoid string manipulation (which, based on the results from above, is the slowest part of the program) altogether. Using this method gets you something like a 97% speedup (the code runs 30 times faster).
896 function calls in 0.025 seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 0.025 0.025 <string>:1(<module>) 356 0.023 0.000 0.023 0.000 impossibleproblem.py:11(<listcomp>) 1 0.000 0.000 0.025 0.025 impossibleproblem.py:17(main) 169 0.000 0.000 0.025 0.000 impossibleproblem.py:3(digits_solver) 1 0.000 0.000 0.000 0.000 impossibleproblem.py:4(<listcomp>) 1 0.000 0.000 0.000 0.000 impossibleproblem.py:7(<listcomp>) 28 0.001 0.000 0.001 0.000 impossibleproblem.py:9(<listcomp>) 1 0.000 0.000 0.025 0.025 {built-in method exec} 163 0.000 0.000 0.000 0.000 {built-in method len} 4 0.000 0.000 0.000 0.000 {built-in method print} 168 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects} 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 2 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
However, this is a little trickier to generalize to the 6-digit case. It's an interesting challenge but my time is limited so I leave this as an exercise for the reader. ;-)
In solver
:
- A function name should usually be either a verb describing what the function does, or a noun describing its result. You have a noun describing what it does, and then not very well - it doesn't actually solve anything, it just checks a solution. Call it
check_digits
. A docstring also wouldn't hurt. number
andnumber2
aren't the best argument names, although it is hard to come up with better ones here - you could useminuend
andsubtrahend
, although those are slightly obscure and the difference between them probably doesn't matter here (whether the solution is positive or negative, it will have the same digits). Distinct one-letter names are probably best here, as long as you explain them in your docstring.- Your return values are either a string of all the digits in the equation, or
False
. Stick with bools and just return eitherTrue
orFalse
. - I think you can remove
numberofdigits
here. You already restrict the search space to numbers with the appropriate length, and so it should be sufficient to check that all of 1-9 appear the same number of times as each other. - Instead of searching the whole string with
str.count
multiple times, use aCounter
. You can also collapse the loop into a single generator expression; you could getgoodChecks
by callingsum
, but since you only care about whether they all check out,all
is a better fit.
So we would get:
from collections import Counter
def check_digits(a, b):
'''
Check if the equation a - b = c uses all the digits 1 through 9
an equal number of times
'''
count = Counter(str(a) + str(b) + str(a-b))
return all(count[str(i)] == count['1'] for i in range(1, 10))
Going down, middlenumber
is a .. very strange name for a list. It seems to contain solutions, although you put two numbers in it each success, which seems odd; but I'll assume that that makes sense, and say call it solutions
. You don't need the number successes
at all - the way you use it, it will always just be len(solutions)
.
The two if
statements could be combined, as in:
if num_of_digits == 3:
minX = 123
minY = 381
In your loop, you have a list called numberlist
which gets the same things as solutions
except its cleared for every y
. You don't actually use it, so you could just drop it completely. If you do want to keep track of which solutions go with which y
, you could change solutions
into a defaultdict(list)
, and then do:
solutions[y].extend([x, y-z])
That would mean the total number of solutions would be sum(len(v) for v in solutions.values())
, and the unique solutions would be set().union(*solutions.values())
(which is slightly unfortunate, but not too bad).
In the x
loop, comparing against False
is always odd, and with these changes you don't need to anymore - you can now do:
if check_digits(y, x):
solutions.extend([x, y-x])
print(y, x, y-x, sep='')
To get the same effect as returning the digits, but it might make more sense to print the equation, eg:
print(y, '-', x, '=', y-x)
To optimise this type of problem, the best way is usually to rethink your strategy. You currently generate most of the possible pairs of numbers of the appropriate length, calculate their difference, and then check if the equation satisfies your constraints. As you note, that is a massive search space. Consider the following type of strategy instead:
- Generate all the sets of three three-digit numbers from the digits 1 through 9 (
itertools.permutations
will help you here) - For each triple x,y,z , check if
x-y = z
.
This will reduce your search space, and also allow you to use ints everywhere, and not parse them into strings.
-
\$\begingroup\$ Thankyou so much this really helped. I have applied the changes but i found that in the check_digits function return all(count[i] == count[1] for i in range(1, 10)) should be changed to return all(count[str(i)] == 1 for i in range(1, 10)) \$\endgroup\$Joe Clinton– Joe Clinton2016年03月26日 09:26:58 +00:00Commented Mar 26, 2016 at 9:26
-
\$\begingroup\$ @Joe yes. my mistake. In fact, to test if all of them are equal (instead of just equal to 1, in order to support numbers of more than 3 digits), you would do
count[str(i)] == count['1']
. I've fixed this in my answer. \$\endgroup\$lvc– lvc2016年03月26日 09:55:23 +00:00Commented Mar 26, 2016 at 9:55
Explore related questions
See similar questions with these tags.