Compose valid subtractions using all digits from a set

Question 1

I have created a brainteaser, which I called the "impossible problem" (see this video I made). The aim is to create a valid 3-digit subtraction which uses every digit from 1 to 9 once. Here is an example of one of the solutions: 873-254=619.

After finding it hard to come up with answers I created this program. What it basically does is iterate through every possible 3-digit subtraction and if it finds one which fits the criteria, it prints it.

from math import trunc
def solver(number,number2, numberofdigits):
 seq = str(number),str(number2), str(number - number2)
 digits = "".join(seq)
 goodChecks = 0
 count= numberofdigits/3
 for i in range(1,10):
 if digits.count(str(i)) == count:
 goodChecks += 1
 if goodChecks == 9:
 return digits
 else:
 return False
middlenumber =[]
successes = 0
num_of_digits = int(input("please enter a number of digits, which is a multiple of 3"))
if num_of_digits == 3:
 minY = 381
 maxY = 987
if num_of_digits == 6:
 minY =246912
 maxY = 998877
if num_of_digits == 3:
 minX = 123
if num_of_digits == 6:
 minX =123123
for y in range(minY, maxY+1):
 numberlist = []
 if y%100 == 0:
 print(y)
 for x in range(minX,trunc(y/2)):
 digits = solver(y,x,num_of_digits)
 if digits is not False:
 successes += 2
 print(digits)
 numberlist.extend([x,y-x])
 middlenumber.extend([x, y-x])
print("")
print("I found: ", successes, " successful solution to your brainteaser")
if successes < 20:
 print("there were almost no solutions")
elif successes < 100:
 print("there were not many solutions")
elif successes < 1000:
 print("there were more than a hundred solutions it is definitely not impossible :)")
else:
 print("that's a lot of successes")
print("There were ", len(middlenumber) - len(set(middlenumber)) , " duplicates, by the way :)")

My program worked quite well but then I decided could you do the same for 6 digits? For the 3 digit problem the program had 9⁶ possiblities to iterate over. Which is a measly 531,441 iterations. However for the 6 digit there is 9¹² possiblities, which is a colossal 282,429,536,481 iterations. This is going to take my computer days to solve.

I have already tried optimizing my program but I can't figure out how to do it any faster.

Question 2

Welcome to Code Review! Good job on your first question.

Question 3

Thankyou for the edits, my question reads much better now.

Question 4

"do the same for 6 digits". What does this mean? I can't do it exactly the same as I will repeat digits. Is the idea to have 2 of each digit?

Question 5

Yes, for 3 digits you would have numbers 1-9 once. For 6 digits numbers 1-9 twice. For 9 digits (supercomputer needed), numbers 1-9 three times. And so on ... Sorry for not explaining

Question 6

first of all try use middlenumber as set. Than adding numbers no by extends but something as middlenumber.add(x < y - x and (x, y-x) or (y-x, x)). You have strange duplicates number
naive brute force algorithm use 9^6 checking as you mentioned

something better when you change to permutations (9!/3! checking). There example for simpler 3-digit version

from itertools import permutations
def permutations_solver():
 digits = set('987654321')
 numbers = (''.join(x) for x in permutations(digits, 6))
 for x in numbers:
 a = int(x[0 : 3])
 b = int(x[3 : 6])
 c = a - b
 if b > c and c > 0 and set(str(c) + x) == digits:
 yield a, b, c
print(len(list(permutations_solver())))

checking b > c and c > 0 and set(str(c) + x) == digits above still is very ineffective

you can try write iterator on digits, something as

""" set('abcdefgh') == set('123456789') and set('zxyr') == set('01')
zxyr abc < def (enough a < d)
 abc z == 0
+def x + a + d == g + 10 * z
---- y + b + e == h + 10 * x
 ghi r + c + f == i + 10 * y
 zxy r == 0 // it looks like logic programming problem
"""
from itertools import product
def digits_solver():
 column = [(x, a, d, (x + a + d) % 10, (x + a + d) // 10)
 for x, a, d in product(range(2), range(1, 10), range(1, 10))
 if len(set([a, d, (x + a + d) % 10])) == 3 and (x + a + d) % 10 != 0]
 first_column = [(x, a, d, g) for x, a, d, g, z in column if z == 0 and a < d]
 for x, a, d, g in first_column:
 second_column = [(y, b, e, h) for y, b, e, h, xx in column if xx == x and not set([a,d,g]) & set([b,e,h])]
 for y, b, e, h in second_column:
 third_column = [(r, c, f, i) for r, c, f, i, yy in column
 if yy == y and not set([a,d,g,b,e,h]) & set([c,f,i]) and r == 0 ]
 for _, c, f, i in third_column:
 yield (100 * g + 10 * h + i, 100 * d + 10 * e + f, 100 * a + 10 * b + c)

in above example for by for by for is bad idea, you can remove it by recursion. The code will be much less messy
if you want try 6-digits number, you can change check unique digits (using set at examples) to multiset (at python you can use Counter class from collections library to emulate it)

Question 7

+1 just for the digits_solver implementation, which makes the code run 30 times faster on my computer

Question 8

Hi, when you wrote this a few years ago I didn't understand it. But now that I've advanced far further in programming I do, and I can say it is a really nice solution. Thankyou. I've marked it as solved.

Question 9

For another perspective in addition to the existing answers, here are my thoughts. I've made a github repository which goes through each of the changes I made step by step. I've linked to the corresponding commit for each of the points that follow. (There are a few relatively minor commits in the repository that I left out of this answer.)

Code clarity issues

It's a lot easier to understand your code if you put things in functions. Conventionally, a Python script has a main function that does everything, and it's invoked with
```
if __name__ == '__main__':
 main()
```
So at a minimum, I'd suggest wrapping all your code (except for the solver function) in a main function.
It's good to separate the input/output from the calculation. So the for loop in the middle should probably be broken out into its own function. You should give it a name that clearly identifies what it does, like find_solutions.
Don't declare middlenumber and successes all the way at the beginning of the code. Wait until you need them. In fact, if you're breaking out the for loop into its own function, you should put those variables in the function itself. Have the function return the number of solutions and number of duplicates, because that's all you need.
solver is not a great name. That function actually checks whether two numbers and their difference contain all the digits from 1 to 9 a specified number of times. So give it a name that reflects that. You can expand on the name by using a doc comment.

Similarly, number and number2 aren't very conventional names, although in this case because you're just running a mathematical algorithm, there aren't much better choices. It's common to use a and b in this case, since the numbers are integers, or x and y if they were floats. And it probably makes more sense to pass the number of times each digit should appear as the argument, not the number of digits in each of the original numbers, because that way it's a lot easier to understand what the code does.

middlenumber is another name that doesn't make any sense. That variable actually holds entries representing the solutions you've found, so why not name it solutions accordingly?
Get rid of variables you don't use anywhere, specifically numberlist.
It's conventional to use None as the return value to indicate that a function doesn't have any meaningful result to return, so I'd suggest using that instead of False. Nothing wrong with False, but it's a bit strange to have a function returning either a boolean or a string depending on the result of a calculation.
Don't pass around more variables than you need to, unless it helps performance. In this case you don't need to return both the number of successes and the list of successes. You can find the number of successes by taking the length of the list.
The calculation of maxes and mins probably belongs as part of the solution-finding function. That helps you avoid passing around more variables than you need to; with this change you only need to pass one argument, the number of digits.

Speed issues

Since you indicated an interest in speeding up the program, the first thing to do is set a benchmark for how fast it runs. You can do this using the cProfile module's run() function:

if __name__ == '__main__':
 import cProfile
 cProfile.run('main()')

It gives you output like this:

 1461380 function calls in 0.795 seconds
 Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
 1 0.000 0.000 0.795 0.795 <string>:1(<module>)
 1 0.033 0.033 0.795 0.795 impossibleproblem.py:17(find_solutions)
 132781 0.582 0.000 0.762 0.000 impossibleproblem.py:3(check_for_all_digits)
 1 0.000 0.000 0.795 0.795 impossibleproblem.py:36(main)
 1 0.000 0.000 0.795 0.795 {built-in method exec}
 3 0.000 0.000 0.000 0.000 {built-in method len}
 4 0.000 0.000 0.000 0.000 {built-in method print}
 607 0.000 0.000 0.000 0.000 {built-in method trunc}
 1195029 0.166 0.000 0.166 0.000 {method 'count' of 'str' objects}
 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
 168 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
 2 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}
 132781 0.014 0.000 0.014 0.000 {method 'join' of 'str' objects}

Most of the time is spent in check_for_all_digits() (my renamed version of your solver() function), and after that in str.count(), so you'll want to focus your optimization efforts on those parts of the code. There are basically two approaches you can take:

You can try to improve the implementation of check_for_all_digits(), as suggested by lvc. There are a few ways to go about this:

Count all the digits at once using a collections.Counter, which is made for this task, instead of iterating over the string 9 times.
It turns out that Counter isn't particularly fast here, so you can use a list to make your own version that is optimized for counting digits.
Instead of using string parsing, you can convert the digits to character codes and use them as indices into the list.

All of these together get you a roughly 30% speedup in the runtime of the program. This is somewhat straightforward to generalize to the 6-digit case, but it will still take a long time.

 1303505 function calls in 0.513 seconds
 Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
 1 0.000 0.000 0.513 0.513 <string>:1(<module>)
 199193 0.026 0.000 0.026 0.000 impossibleproblem.py:13(<genexpr>)
 1 0.044 0.044 0.513 0.513 impossibleproblem.py:18(find_solutions)
 1 0.000 0.000 0.513 0.513 impossibleproblem.py:37(main)
 132781 0.383 0.000 0.468 0.000 impossibleproblem.py:4(check_for_all_digits)
 71070 0.014 0.000 0.032 0.000 {built-in method all}
 1 0.000 0.000 0.513 0.513 {built-in method exec}
 3 0.000 0.000 0.000 0.000 {built-in method len}
 899672 0.046 0.000 0.046 0.000 {built-in method ord}
 4 0.000 0.000 0.000 0.000 {built-in method print}
 607 0.000 0.000 0.000 0.000 {built-in method trunc}
 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
 168 0.000 0.000 0.000 0.000 {method 'extend' of 'list' objects}
 2 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}

Alternatively, instead of checking whether a given combination of digits has all 9 digits in it, you can only iterate over permutations of the 9 digits which don't have repeats, and test whether two of the numbers formed this way add up to the third. This allows you to use the numeric arithmetic check written by vaeta, so you avoid string manipulation (which, based on the results from above, is the slowest part of the program) altogether. Using this method gets you something like a 97% speedup (the code runs 30 times faster).

 896 function calls in 0.025 seconds
 Ordered by: standard name
 ncalls tottime percall cumtime percall filename:lineno(function)
 1 0.000 0.000 0.025 0.025 <string>:1(<module>)
 356 0.023 0.000 0.023 0.000 impossibleproblem.py:11(<listcomp>)
 1 0.000 0.000 0.025 0.025 impossibleproblem.py:17(main)
 169 0.000 0.000 0.025 0.000 impossibleproblem.py:3(digits_solver)
 1 0.000 0.000 0.000 0.000 impossibleproblem.py:4(<listcomp>)
 1 0.000 0.000 0.000 0.000 impossibleproblem.py:7(<listcomp>)
 28 0.001 0.000 0.001 0.000 impossibleproblem.py:9(<listcomp>)
 1 0.000 0.000 0.025 0.025 {built-in method exec}
 163 0.000 0.000 0.000 0.000 {built-in method len}
 4 0.000 0.000 0.000 0.000 {built-in method print}
 168 0.000 0.000 0.000 0.000 {method 'add' of 'set' objects}
 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
 2 0.000 0.000 0.000 0.000 {method 'format' of 'str' objects}

However, this is a little trickier to generalize to the 6-digit case. It's an interesting challenge but my time is limited so I leave this as an exercise for the reader. ;-)

Question 10

In solver:

A function name should usually be either a verb describing what the function does, or a noun describing its result. You have a noun describing what it does, and then not very well - it doesn't actually solve anything, it just checks a solution. Call it check_digits. A docstring also wouldn't hurt.
number and number2 aren't the best argument names, although it is hard to come up with better ones here - you could use minuend and subtrahend, although those are slightly obscure and the difference between them probably doesn't matter here (whether the solution is positive or negative, it will have the same digits). Distinct one-letter names are probably best here, as long as you explain them in your docstring.
Your return values are either a string of all the digits in the equation, or False. Stick with bools and just return either True or False.
I think you can remove numberofdigits here. You already restrict the search space to numbers with the appropriate length, and so it should be sufficient to check that all of 1-9 appear the same number of times as each other.
Instead of searching the whole string with str.count multiple times, use a Counter. You can also collapse the loop into a single generator expression; you could get goodChecks by calling sum, but since you only care about whether they all check out, all is a better fit.

So we would get:

from collections import Counter
def check_digits(a, b):
 '''
 Check if the equation a - b = c uses all the digits 1 through 9
 an equal number of times
 '''
 count = Counter(str(a) + str(b) + str(a-b))
 return all(count[str(i)] == count['1'] for i in range(1, 10))

Going down, middlenumber is a .. very strange name for a list. It seems to contain solutions, although you put two numbers in it each success, which seems odd; but I'll assume that that makes sense, and say call it solutions. You don't need the number successes at all - the way you use it, it will always just be len(solutions).

The two if statements could be combined, as in:

if num_of_digits == 3:
 minX = 123
 minY = 381

In your loop, you have a list called numberlist which gets the same things as solutions except its cleared for every y. You don't actually use it, so you could just drop it completely. If you do want to keep track of which solutions go with which y, you could change solutions into a defaultdict(list), and then do:

solutions[y].extend([x, y-z])

That would mean the total number of solutions would be sum(len(v) for v in solutions.values()), and the unique solutions would be set().union(*solutions.values()) (which is slightly unfortunate, but not too bad).

In the x loop, comparing against False is always odd, and with these changes you don't need to anymore - you can now do:

if check_digits(y, x):
 solutions.extend([x, y-x])
 print(y, x, y-x, sep='')

To get the same effect as returning the digits, but it might make more sense to print the equation, eg:

print(y, '-', x, '=', y-x)

To optimise this type of problem, the best way is usually to rethink your strategy. You currently generate most of the possible pairs of numbers of the appropriate length, calculate their difference, and then check if the equation satisfies your constraints. As you note, that is a massive search space. Consider the following type of strategy instead:

Generate all the sets of three three-digit numbers from the digits 1 through 9 (itertools.permutations will help you here)
For each triple x,y,z , check if x-y = z.

This will reduce your search space, and also allow you to use ints everywhere, and not parse them into strings.

Question 11

Thankyou so much this really helped. I have applied the changes but i found that in the check_digits function return all(count[i] == count[1] for i in range(1, 10)) should be changed to return all(count[str(i)] == 1 for i in range(1, 10))

Question 12

@Joe yes. my mistake. In fact, to test if all of them are equal (instead of just equal to 1, in order to support numbers of more than 3 digits), you would do count[str(i)] == count['1']. I've fixed this in my answer.

vaeta vaeta 8865 silver badges8 bronze badges · Accepted Answer · 2016-03-25 02:29:01Z

first of all try use middlenumber as set. Than adding numbers no by extends but something as middlenumber.add(x < y - x and (x, y-x) or (y-x, x)). You have strange duplicates number
naive brute force algorithm use 9^6 checking as you mentioned

something better when you change to permutations (9!/3! checking). There example for simpler 3-digit version

from itertools import permutations
def permutations_solver():
 digits = set('987654321')
 numbers = (''.join(x) for x in permutations(digits, 6))
 for x in numbers:
 a = int(x[0 : 3])
 b = int(x[3 : 6])
 c = a - b
 if b > c and c > 0 and set(str(c) + x) == digits:
 yield a, b, c
print(len(list(permutations_solver())))

checking b > c and c > 0 and set(str(c) + x) == digits above still is very ineffective

you can try write iterator on digits, something as

""" set('abcdefgh') == set('123456789') and set('zxyr') == set('01')
zxyr abc < def (enough a < d)
 abc z == 0
+def x + a + d == g + 10 * z
---- y + b + e == h + 10 * x
 ghi r + c + f == i + 10 * y
 zxy r == 0 // it looks like logic programming problem
"""
from itertools import product
def digits_solver():
 column = [(x, a, d, (x + a + d) % 10, (x + a + d) // 10)
 for x, a, d in product(range(2), range(1, 10), range(1, 10))
 if len(set([a, d, (x + a + d) % 10])) == 3 and (x + a + d) % 10 != 0]
 first_column = [(x, a, d, g) for x, a, d, g, z in column if z == 0 and a < d]
 for x, a, d, g in first_column:
 second_column = [(y, b, e, h) for y, b, e, h, xx in column if xx == x and not set([a,d,g]) & set([b,e,h])]
 for y, b, e, h in second_column:
 third_column = [(r, c, f, i) for r, c, f, i, yy in column
 if yy == y and not set([a,d,g,b,e,h]) & set([c,f,i]) and r == 0 ]
 for _, c, f, i in third_column:
 yield (100 * g + 10 * h + i, 100 * d + 10 * e + f, 100 * a + 10 * b + c)

in above example for by for by for is bad idea, you can remove it by recursion. The code will be much less messy
if you want try 6-digits number, you can change check unique digits (using set at examples) to multiset (at python you can use Counter class from collections library to emulate it)

+1 just for the digits_solver implementation, which makes the code run 30 times faster on my computer
Hi, when you wrote this a few years ago I didn't understand it. But now that I've advanced far further in programming I do, and I can say it is a really nice solution. Thankyou. I've marked it as solved.

Stack Exchange Network

Compose valid subtractions using all digits from a set

3 Answers 3

Code clarity issues

Speed issues

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Compose valid subtractions using all digits from a set

3 Answers 3

Code clarity issues

Speed issues

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions