Testing a Collatz Conjecture Conjecture (Python)

Question 1

I had an intuition that the maximal trajectory lengths for collatz conjecture numbers would be produced by numbers of the form (2**n)-1 so I wrote a program to test it (I was wrong). I thought I would put it up for review to catch any coding sins I committed, and see how others would improve my code.

Orginal Code

'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
 for key in dct.keys():
 if dct[key] == value:
 return key
 return None
def conjecture(text):
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
 result = {}
 bound = math.floor(math.log(len(traj),2))
 traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)
 return pprint.pformat(result)
print(conjecture(text))

"CollatzConjecture.txt" is a text file containing number trajectoryLength on separate lines. I wanted to find the number corresponding to the maximum trajectory length between successive powers of 2 in order to test my hypothesis.

Question 2

Can you add an example input textfile?

Question 3

Where is the key function actually used?

Question 4

@Ludisposed I'll link the actual text file I used: oeis.org/A006577/b006577.txt

Question 5

Now that you have improved code it might make sense to ask a new question with the updated code, linking back to this one, as stated in the link provided by Mast.

Question 6

@TobiAlafin: Hm, what a nice edit mess :) I'll let you figure it out, but if I were you, I would just roll it back to your first first revision (again) and then forget about this question. Instead spend the effort on making the follow-up question look good.

Question 7

def key(value, dct):
 for key in dct.keys():
 if dct[key] == value:
 return key
 return None

This is really inefficient and not how dictionaries should be used.

The point of a dict is O(1) lookup, this is essentially O(n) and it doesn't make sense
You should remove your blockcomments, and create a nice docstring. That will make the purpose of the code clearer
You don't close the file, a nice addition is using the with statement to automatically open/close files PEP343

Question 8

key(value, dct) gets the key corresponding to a value. dict.get() gets the value corresponding to a key. I'll edit my code to make it clearer.

Question 9

@TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine.

Question 10

Peter Cordes thank you. I had not thought of that, I'll apply that in the future.

Question 11

First of all, thumbs up on including a documentation block at the beginning.

When trying to run the segment, I get

Traceback (most recent call last):
 File "conjecture.py", line 28, in <module>
 print(conjecture(text))
 File "conjecture.py", line 19, in conjecture
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
 File "conjecture.py", line 19, in <dictcomp>
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
IndexError: list index out of range

Turns out that when I saved the output of the link, there were empty trailing lines. After fixing that, I could get it to run.

Furthermore, I see several issues with this implementation:

The conjecture function is doing a lot of the work, with several lines that are quite filled with operations, meaning it's hard to find out what it's doing.
A dictionary can contain a value multiple times, and in fact does. For instance, both 5 and 32 have a length of 5 (5 -> 16 -> ...) and (32 -> 16 -> ...). For longer lengths, there will be more ways in which it can be obtained.
Floating point arithmetic in a fully integral problem seems like an issue.
The file is opened, and then not immediately closed. For this it's not really an issue, but in general one should use a contextmanager to clean up used resources as quickly as possible.
The semantics of the output isn't really clear to me, but I'll probably figure out a bit during picking it apart.
It doesn't use a __main__ block, which would be beneficial for re-usable code.
The conjecture function returns a string, instead of a dictionary.

So let's start with your implementation:

'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
 for key in dct.keys():
 if dct[key] == value:
 return key
 return None
def conjecture(text):
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
 result = {}
 bound = math.floor(math.log(len(traj),2))
 traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)
 return pprint.pformat(result)
print(conjecture(text))

Use a `if name == 'main'` segment.

One of the best places to start with such a script (especially if you want to make it reusable!) is to place the "main" code (file-mangling) inside a main block:

... function definitions ...
if __name__ == '__main__':
 f = open("CollatzConjecture.txt")
 text = f.read()
 print(conjecture(text))

Move parsing the trajectories out of `conjecture`.

Your conjecture has nothing to do with the piece of text, but about the length of the trajectories. We shouldn't care how these lengths are generated, calculated or parsed inside the conjecture function. But the current implementation does.

By moving out the text to traj transformation, we make conjecture more general.

...
def conjecture(traj):
 result = {}
 bound = math.floor(math.log(len(traj),2))
 ...
if __name__ == '__main__':
 f = open("CollatzConjecture.txt")
 text = f.read()
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
 print(conjecture(traj))

And we can then split the parsing into a separate function, where we use a context manager.

...
def parse_lengths(filename):
 with open(filename) as fp:
 #Processes the text file into a dict mapping numbers to their trajectory length.
 return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
...
if __name__ == '__main__':
 traj = parse_lengths("CollatzConjecture.txt")
 print(conjecture(traj))

And similarly, move the pformat out of conjecture:

def conjecture(text):
 result = {}
 ...
 return result
if __name__ == '__main__':
 traj = parse_lengths("CollatzConjecture.txt")
 pprint.pprint(conjecture(traj))

Simplify `conjecture`.

Now that we're left with a conjecture method that does only things related to the conjecture, we can look at it a bit more:

def conjecture(traj):
 result = {}
 bound = math.floor(math.log(len(traj),2))
 traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)
 return result

Starting with this expression: int((math.log(st,2)-1)**2), what does it even do?! First it takes the 2-logarithm, substracts 1, and then squares it. At first I suspected that you wanted the previous power of two (and tested it). But, what actually happens is that 64 (2⁶) becomes 25 (5²), and also 2048 (2¹¹) becomes 100 (10²). My basic guess is that this is incorrect, and you mean 64 => 32, 128 => 64, and so forth. Assuming that's what you want, you can just divide by two.

def conjecture(traj):
 result = {}
 bound = math.floor(math.log(len(traj),2))
 traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[st // 2:st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)
 return result

Surprisingly, though, this does not alter the output.

Next up is the fact that I see slices being taken of dictionaries (after casting to a list). In older Python versions (before 3.6) that might cause surprising results, as dictionaries are not necessarily ordered by value, though in this case it does turn out to be that way it seems. It's better to explicitly filter out the values that you want.

Final:

In the end, I found myself settling on this.

'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import pprint
def key(value, dct):
 for key in dct.keys():
 if dct[key] == value:
 return key
 return None
def parse_lengths(filename):
 with open(filename) as fp:
 #Processes the text file into a dict mapping numbers to their trajectory length.
 return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
def conjecture(traj):
 result = {}
 traj_size = len(traj)
 lower, higher = 1, 2
 while True:
 if higher > traj_size:
 # We don't have all the data for this range available.
 break
 max_in_slice = max(length for num, length in traj.items() if lower <= num <= higher)
 result[higher] = key(max_in_slice, traj)
 # Prepare data for next iteration.
 lower, higher = lower * 2, higher * 2
 return result
if __name__ == '__main__':
 traj = parse_lengths('CollatzConjecture.txt')
 pprint.pprint(conjecture(traj))

I'm also not quite happy with this yet, as the "key" function is still bothering me as well (we could iterate over .items() instead of .keys() to simplify).

I hope this at least helps a bit.

Question 12

It's desirable to try to break one's code into small, more reusable segments. For example, it would be preferable (in my opinion) to create a small function to open the text file, extract its contents, and return the dict of trajectories you desire:

def load_trajectories(file_name):
 traj = {}
 with open(file_name, 'r') as f:
 for line in f.readlines():
 S = line.split(' ')
 num, traj_len = int(S[0]), int(S[1])
 traj[num] = traj_len
 return traj

Note here that I used the context manager with open(file_name, 'r') as f:, since in Python we need to first open a file, do things with it, and then close it. The context manager handles the opening and closing of the file for us, and within it we can access the file.

From my searching there doesn't seem to be a universally accepted way to find the "inverse" of a value in a dictionary, but I like yours for the most part. As one change, I would iterate over the key:value pairs to simplify things a bit:

def find_key(target_value, dct):
 for key, val in dct.items():
 if val == target_value:
 return key
 raise Exception('Value not found!')

With this, it really only remains to clean up the conjecture function. Here, we can pass the traj dictionary as an argument, as well as the (un-exponentiated) bound.

In the parlance of your code, note that st is always of the form 2**x, so when we compute math.log(st, 2), it always evaluates to x, so one line of your code reads (equivalently)

slce = list(traj2.items())[(x-1)**2:st]

which isn't the 'slicing into power of 2' that you desire. Moreover, it's not (necessarily) guaranteed that traj2.items() will be turned into a list in the way you desire, so it'll be better to be more explicit:

slce = [key, val for key, val in traj.items() if key in range(2**(x-1), 2**x)]

Together, along with splitting our imports onto different lines, adding a if __name__ == '__main__': guard, and some other minor reorganization yields the following code:

import math
import pprint
def find_key(target_value, dct):
 for key, val in dct.items():
 if val == target_value:
 return key
 raise Exception('Value not found!')
def load_trajectories(file_name):
 traj = {}
 with open(file_name, 'r') as f:
 for line in f.readlines():
 S = line.split(' ')
 num, traj_len = int(S[0]), int(S[1])
 traj[num] = traj_len
 return traj
def conjecture(traj):
 """Checks the conjecture that the maximum 'Collatz length' among the integers in [2**n, 2**(n+1)) is 2**n - 1.
 The conjecture is false.
 """
 bound = math.floor(math.log(len(traj),2))
 exp_bound = 2**bound
 traj = {key:val for key, val in traj.items() if key < exp_bound} # Only include numbers of interest.
 result = {}
 for n in range(1,bound+1):
 exp_n = 2**n
 slce = [key, val for key, val in traj.items() if key in range(2**(n-1), exp_n)]
 result[exp_n] = find_key(max([val for key, val in slce]), traj)
 return pprint(pformat(result))
if __name__ == "__main__":
 file_name = "CollatzConjecture.txt"
 print(conjecture(file_name))

Question 13

"I would iterate over the key:value pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.

Question 14

The existing answers cover most of the issues that I spotted. There's just one thing that I think it's useful to add:

 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)

This converts the dict into a list every time through the loop. The suggestions, which don't rely on undocumented behaviour, instead filter the dict each time through the loop. But you can instead invert the loop:

 best = [0] * (bound + 1)
 for key, value in traj2.items():
 x = int(math.log(key, 2)) + 1
 if value > best[x] or (value == best[x] and key < result[2**x]):
 best[x] = value
 result[2**x] = key

This gives straightforward linear running time and allows you explicit control over tie-breaking. (I'm not sure whether I've got the tie-breaking correct for your conjecture, but you can easily fix that).

Ludisposed Ludisposed 11.8k2 gold badges41 silver badges91 bronze badges · Answer 1 · 2019-01-28 13:44:13Z

10

\$\begingroup\$

def key(value, dct):
 for key in dct.keys():
 if dct[key] == value:
 return key
 return None

This is really inefficient and not how dictionaries should be used.

The point of a dict is O(1) lookup, this is essentially O(n) and it doesn't make sense
You should remove your blockcomments, and create a nice docstring. That will make the purpose of the code clearer
You don't close the file, a nice addition is using the with statement to automatically open/close files PEP343

Share

edited Jan 28, 2019 at 14:28

answered Jan 28, 2019 at 13:44

Ludisposed's user avatar

Ludisposed Ludisposed

11.8k2 gold badges41 silver badges91 bronze badges

\$\endgroup\$

3

\$\begingroup\$ key(value, dct) gets the key corresponding to a value. dict.get() gets the value corresponding to a key. I'll edit my code to make it clearer. \$\endgroup\$

Tobi Alafin
– Tobi Alafin

2019年01月28日 13:49:22 +00:00
Commented Jan 28, 2019 at 13:49
4

\$\begingroup\$ @TobiAlafin: If you need to map both ways (A -> B and B -> A), use two dictionaries to get O(1) lookups in each direction. A linear search O(n) in one direction vs. an O(1) in the other, doesn't make sense, unless that direction is almost never used, and you can't afford to double the storage cost by keeping a reverse dictionary. Or there are probably custom data structures that can index on both sets of data without duplicating the storage... like what a relational database would use for a table with 2 columns, and an index on each column. For short keys two dicts is totally fine. \$\endgroup\$

Peter Cordes
– Peter Cordes

2019年01月28日 20:18:50 +00:00
Commented Jan 28, 2019 at 20:18
\$\begingroup\$ Peter Cordes thank you. I had not thought of that, I'll apply that in the future. \$\endgroup\$

Tobi Alafin
– Tobi Alafin

2019年01月29日 01:40:00 +00:00
Commented Jan 29, 2019 at 1:40

Add a comment |

score 4 · Answer 2 · 2019-01-28 16:24:19Z

First of all, thumbs up on including a documentation block at the beginning.

When trying to run the segment, I get

Traceback (most recent call last):
 File "conjecture.py", line 28, in <module>
 print(conjecture(text))
 File "conjecture.py", line 19, in conjecture
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
 File "conjecture.py", line 19, in <dictcomp>
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
IndexError: list index out of range

Turns out that when I saved the output of the link, there were empty trailing lines. After fixing that, I could get it to run.

Furthermore, I see several issues with this implementation:

The conjecture function is doing a lot of the work, with several lines that are quite filled with operations, meaning it's hard to find out what it's doing.
A dictionary can contain a value multiple times, and in fact does. For instance, both 5 and 32 have a length of 5 (5 -> 16 -> ...) and (32 -> 16 -> ...). For longer lengths, there will be more ways in which it can be obtained.
Floating point arithmetic in a fully integral problem seems like an issue.
The file is opened, and then not immediately closed. For this it's not really an issue, but in general one should use a contextmanager to clean up used resources as quickly as possible.
The semantics of the output isn't really clear to me, but I'll probably figure out a bit during picking it apart.
It doesn't use a __main__ block, which would be beneficial for re-usable code.
The conjecture function returns a string, instead of a dictionary.

So let's start with your implementation:

'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import math, pprint
f = open("CollatzConjecture.txt")
text = f.read()
def key(value, dct):
 for key in dct.keys():
 if dct[key] == value:
 return key
 return None
def conjecture(text):
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
 result = {}
 bound = math.floor(math.log(len(traj),2))
 traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)
 return pprint.pformat(result)
print(conjecture(text))

Use a `if name == 'main'` segment.

One of the best places to start with such a script (especially if you want to make it reusable!) is to place the "main" code (file-mangling) inside a main block:

... function definitions ...
if __name__ == '__main__':
 f = open("CollatzConjecture.txt")
 text = f.read()
 print(conjecture(text))

Move parsing the trajectories out of `conjecture`.

Your conjecture has nothing to do with the piece of text, but about the length of the trajectories. We shouldn't care how these lengths are generated, calculated or parsed inside the conjecture function. But the current implementation does.

By moving out the text to traj transformation, we make conjecture more general.

...
def conjecture(traj):
 result = {}
 bound = math.floor(math.log(len(traj),2))
 ...
if __name__ == '__main__':
 f = open("CollatzConjecture.txt")
 text = f.read()
 traj = {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in text.split("\n")} #Processes the text file into a dict mapping numbers to their trajectory length.
 print(conjecture(traj))

And we can then split the parsing into a separate function, where we use a context manager.

...
def parse_lengths(filename):
 with open(filename) as fp:
 #Processes the text file into a dict mapping numbers to their trajectory length.
 return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
...
if __name__ == '__main__':
 traj = parse_lengths("CollatzConjecture.txt")
 print(conjecture(traj))

And similarly, move the pformat out of conjecture:

def conjecture(text):
 result = {}
 ...
 return result
if __name__ == '__main__':
 traj = parse_lengths("CollatzConjecture.txt")
 pprint.pprint(conjecture(traj))

Simplify `conjecture`.

Now that we're left with a conjecture method that does only things related to the conjecture, we can look at it a bit more:

def conjecture(traj):
 result = {}
 bound = math.floor(math.log(len(traj),2))
 traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)
 return result

Starting with this expression: int((math.log(st,2)-1)**2), what does it even do?! First it takes the 2-logarithm, substracts 1, and then squares it. At first I suspected that you wanted the previous power of two (and tested it). But, what actually happens is that 64 (2⁶) becomes 25 (5²), and also 2048 (2¹¹) becomes 100 (10²). My basic guess is that this is incorrect, and you mean 64 => 32, 128 => 64, and so forth. Assuming that's what you want, you can just divide by two.

def conjecture(traj):
 result = {}
 bound = math.floor(math.log(len(traj),2))
 traj2 = {num: traj[num] for num in list(traj.keys())[:2**bound+1]} #Slices the dictionary to receive the section of interest.
 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[st // 2:st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)
 return result

Surprisingly, though, this does not alter the output.

Next up is the fact that I see slices being taken of dictionaries (after casting to a list). In older Python versions (before 3.6) that might cause surprising results, as dictionaries are not necessarily ordered by value, though in this case it does turn out to be that way it seems. It's better to explicitly filter out the values that you want.

Final:

In the end, I found myself settling on this.

'''
* I had a conjecture that maximal trajectory lengths for the collatz conjecture were yielded by numbers of the form (2**n)-1. This program attempts to test it.
* "conjecture()" finds the number with the highest trajectory lengths in a given range. The ranges are of the form [2**n, 2**(n+1)).
'''
import pprint
def key(value, dct):
 for key in dct.keys():
 if dct[key] == value:
 return key
 return None
def parse_lengths(filename):
 with open(filename) as fp:
 #Processes the text file into a dict mapping numbers to their trajectory length.
 return {int(j.split(" ")[0]): int(j.split(" ")[1]) for j in fp}
def conjecture(traj):
 result = {}
 traj_size = len(traj)
 lower, higher = 1, 2
 while True:
 if higher > traj_size:
 # We don't have all the data for this range available.
 break
 max_in_slice = max(length for num, length in traj.items() if lower <= num <= higher)
 result[higher] = key(max_in_slice, traj)
 # Prepare data for next iteration.
 lower, higher = lower * 2, higher * 2
 return result
if __name__ == '__main__':
 traj = parse_lengths('CollatzConjecture.txt')
 pprint.pprint(conjecture(traj))

I'm also not quite happy with this yet, as the "key" function is still bothering me as well (we could iterate over .items() instead of .keys() to simplify).

I hope this at least helps a bit.

Michael Engen Michael Engen 2581 silver badge7 bronze badges · Answer 3 · 2019-01-28 15:51:35Z

It's desirable to try to break one's code into small, more reusable segments. For example, it would be preferable (in my opinion) to create a small function to open the text file, extract its contents, and return the dict of trajectories you desire:

def load_trajectories(file_name):
 traj = {}
 with open(file_name, 'r') as f:
 for line in f.readlines():
 S = line.split(' ')
 num, traj_len = int(S[0]), int(S[1])
 traj[num] = traj_len
 return traj

Note here that I used the context manager with open(file_name, 'r') as f:, since in Python we need to first open a file, do things with it, and then close it. The context manager handles the opening and closing of the file for us, and within it we can access the file.

From my searching there doesn't seem to be a universally accepted way to find the "inverse" of a value in a dictionary, but I like yours for the most part. As one change, I would iterate over the key:value pairs to simplify things a bit:

def find_key(target_value, dct):
 for key, val in dct.items():
 if val == target_value:
 return key
 raise Exception('Value not found!')

With this, it really only remains to clean up the conjecture function. Here, we can pass the traj dictionary as an argument, as well as the (un-exponentiated) bound.

In the parlance of your code, note that st is always of the form 2**x, so when we compute math.log(st, 2), it always evaluates to x, so one line of your code reads (equivalently)

slce = list(traj2.items())[(x-1)**2:st]

which isn't the 'slicing into power of 2' that you desire. Moreover, it's not (necessarily) guaranteed that traj2.items() will be turned into a list in the way you desire, so it'll be better to be more explicit:

slce = [key, val for key, val in traj.items() if key in range(2**(x-1), 2**x)]

Together, along with splitting our imports onto different lines, adding a if __name__ == '__main__': guard, and some other minor reorganization yields the following code:

import math
import pprint
def find_key(target_value, dct):
 for key, val in dct.items():
 if val == target_value:
 return key
 raise Exception('Value not found!')
def load_trajectories(file_name):
 traj = {}
 with open(file_name, 'r') as f:
 for line in f.readlines():
 S = line.split(' ')
 num, traj_len = int(S[0]), int(S[1])
 traj[num] = traj_len
 return traj
def conjecture(traj):
 """Checks the conjecture that the maximum 'Collatz length' among the integers in [2**n, 2**(n+1)) is 2**n - 1.
 The conjecture is false.
 """
 bound = math.floor(math.log(len(traj),2))
 exp_bound = 2**bound
 traj = {key:val for key, val in traj.items() if key < exp_bound} # Only include numbers of interest.
 result = {}
 for n in range(1,bound+1):
 exp_n = 2**n
 slce = [key, val for key, val in traj.items() if key in range(2**(n-1), exp_n)]
 result[exp_n] = find_key(max([val for key, val in slce]), traj)
 return pprint(pformat(result))
if __name__ == "__main__":
 file_name = "CollatzConjecture.txt"
 print(conjecture(file_name))

"I would iterate over the key:value pairs to simplify things a bit": it's not just a question of simplifying. It's also more efficient, because advancing the iterator doesn't involve any search so is as cheap as possible, whereas the lookup potentially has to resolve hash collisions.

Peter Taylor Peter Taylor 24.5k1 gold badge49 silver badges94 bronze badges · Answer 4 · 2019-01-29 09:07:39Z

The existing answers cover most of the issues that I spotted. There's just one thing that I think it's useful to add:

 for st in (2**x for x in range(1, bound+1)): #Generator for powers of 2.
 slce = list(traj2.items())[int((math.log(st,2)-1)**2):st] #Slices "traj2" into powers of 2.
 result[st] = key(max([n[1] for n in slce]), traj2)

This converts the dict into a list every time through the loop. The suggestions, which don't rely on undocumented behaviour, instead filter the dict each time through the loop. But you can instead invert the loop:

 best = [0] * (bound + 1)
 for key, value in traj2.items():
 x = int(math.log(key, 2)) + 1
 if value > best[x] or (value == best[x] and key < result[2**x]):
 best[x] = value
 result[2**x] = key

This gives straightforward linear running time and allows you explicit control over tie-breaking. (I'm not sure whether I've got the tie-breaking correct for your conjecture, but you can easily fix that).

Stack Exchange Network

Testing a Collatz Conjecture Conjecture (Python)

Orginal Code

4 Answers 4

Use a `if name == 'main'` segment.

Move parsing the trajectories out of `conjecture`.

Simplify `conjecture`.

Final:

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Testing a Collatz Conjecture Conjecture (Python)

Orginal Code

4 Answers 4

Use a if __name__ == '__main__' segment.

Move parsing the trajectories out of conjecture.

Simplify conjecture.

Final:

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Use a `if name == 'main'` segment.

Move parsing the trajectories out of `conjecture`.

Simplify `conjecture`.