Collatz conjecture plots (python)

Question 1

I have tried to improve my code with the suggestions I received in my previous question about the Collatz conjecture.

Here is the new code:

import matplotlib.pyplot as plt
import numpy as np

def collatz_algorithm(seed: int):
    """Apply the algorithm to a certain seed and stores the values after every iteration"""
    n = seed
    yield seed
    while n > 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = 3 * n + 1
        yield n

def collatz_values(seed: int):
    """Creates a list that contains the values of the seed after every iterations when applying the algorithm"""
    values = list(collatz_algorithm(seed))
    return values

def max_value(seed: int):
    """Gets the maximum value reached by the seed during the iterative process"""
    max_value = max(collatz_values(seed))
    return max_value

def colors(seeds: int):
    """Creates an array of numbers for the colormap based on the number of seeds"""
    colors = plt.cm.bwr(np.linspace(0, 1, seeds + 1))
    return colors

def collatz_plot(start_seed, last_seed, step = 1, alpha = 0.5, ax = 0):
    """Creates a plot indicating the number of iterations and the values reached for a certain set of seeds"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].plot(collatz_values(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

def collatz_scatter(start_seed, last_seed, step = 1, alpha = 0.5, ax = 1):
    """Creates a scatter plot indicating the max value reached by every seed in the set"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].scatter(i, max_value(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

def collatz_graphs (start_seed, last_seed, step = 1, alpha = 0.5):
    """Plots the 2 graphs on 2 different axis"""
    collatz_plot(start_seed, last_seed, step, alpha)
    collatz_scatter(start_seed, last_seed, step, alpha)

fig, axs = plt.subplots(2, 1)
collatz_graphs(1, 500)
plt.suptitle("Collatz conjecture", font = "Times new roman", size = 22)
axs[0].set_xlabel("n of iterations before stuck in a loop", font = "Times new roman")
axs[0].set_ylabel("Values after every iteration", font = "Times new roman")
axs[1].set_xlabel("Seed", font = "Times new roman")
axs[1].set_ylabel("Max value reached by the seed", font = "Times new roman")
plt.show()

It displays this graph: [image not included]

What do you think about it? Please point out every mistake. Also, I would like to know why this line works:

axs[ax].plot(collatz_values(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

Since there is only 1 argument for the plot function I think it creates a list of integers. What I cannot understand is why it doesn't put "collatz_values" on the x axis. Should I create another variable which keeps the count of iterations to make the plot function clearer?

Answer

"Since there is only 1 argument ... what I cannot understand is why it doesn't put collatz_values on the x axis."

That's just how matplotlib implemented it. If you provide only one array, it's treated as y, and x defaults to the indices 0, 1, ..., len(y) - 1. If you provide two arrays, they are treated as x and y.

"Should I create another variable which keeps the count of iterations to make the plot function clearer?"

That's your call, but it's probably worth getting used to the convention of plot(y) and plot(x, y) because you'll encounter it in lots of existing matplotlib code.
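To make the plot(y) versus plot(x, y) convention concrete, here is a minimal sketch that draws the same data both ways and compares the x values matplotlib stored for each line. The data is the Collatz sequence for seed 6, and the variable names are only illustrative. The "Agg" backend is an assumption made so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

y = [6, 3, 10, 5, 16, 8, 4, 2, 1]  # Collatz values for seed 6

fig, ax = plt.subplots()
(implicit_line,) = ax.plot(y)                  # only y given: x is filled in as the indices
(explicit_line,) = ax.plot(range(len(y)), y)   # same x values, spelled out explicitly

# Compare the x data that matplotlib stored for each line
implicit_x = list(implicit_line.get_xdata())
explicit_x = list(explicit_line.get_xdata())
print(implicit_x == explicit_x)
```

Both calls end up with the iteration index on the x axis, which is why the single-argument form works for this plot.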

Color bug (updated with example**)

Indexing i - 2 here is a bug:

color = colors(last_seed - start_seed)[i - 2]

It breaks whenever start_seed > 2, e.g.:

>>> collatz_graphs(3, 500)
# IndexError: index 498 is out of bounds for axis 0 with size 498

**Explanation by example:

for i in range(start_seed, last_seed + 1):
    color = colors(last_seed - start_seed)[i - 2]

Here the index i goes from start_seed through last_seed (the stop value last_seed + 1 is exclusive).

If we take start_seed=4 and last_seed=8 for example, then i would include 5 seeds:

$i \in \{4, 5, 6, 7, 8\}$

Now try to access a list of 5 colors using i-2. Notice how the higher indices simply don't exist because we didn't start at 0:

colors(...)[2] # i = 4 i-2 = 2
colors(...)[3] # i = 5 i-2 = 3
colors(...)[4] # i = 6 i-2 = 4
colors(...)[5] # i = 7 i-2 = 5 (no such index)
colors(...)[6] # i = 8 i-2 = 6 (no such index)

So to manually index the colors, we need to start at 0 with i-start_seed:

colors(...)[0] # i = 4 i-4 = 0
colors(...)[1] # i = 5 i-4 = 1
colors(...)[2] # i = 6 i-4 = 2
colors(...)[3] # i = 7 i-4 = 3
colors(...)[4] # i = 8 i-4 = 4
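The two tables above can be reproduced without matplotlib. This is a small sketch in which a plain list of strings stands in for the colors(...) result. The names palette and pick are hypothetical and exist only for this illustration.

```python
start_seed, last_seed = 4, 8
seeds = range(start_seed, last_seed + 1)              # 4, 5, 6, 7, 8
palette = [f"color{k}" for k in range(len(seeds))]    # hypothetical stand-in for colors(...)

def pick(offset):
    """Map each seed to palette[i - offset], recording any IndexError instead of raising."""
    picked = {}
    for i in seeds:
        try:
            picked[i] = palette[i - offset]
        except IndexError as err:
            picked[i] = f"IndexError: {err}"
    return picked

broken = pick(2)           # the original i - 2 indexing
fixed = pick(start_seed)   # i - start_seed always starts at index 0

for i in seeds:
    print(i, broken[i], fixed[i])
```

With the fixed offset every seed gets its own entry, while the i - 2 version runs off the end of the list for the last two seeds.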

But in practice, there's a more idiomatic approach for this.

What you're trying to do is retrieve one color per seed, so this is a classic use case for zip. It lets you couple the seeds and colors to avoid manual indexing:

s = range(start_seed, last_seed + 1, step)
c = colors(last_seed - start_seed)
for seed, color in zip(s, c):
    axs[ax].plot(collatz_values(seed), color=color, alpha=alpha)
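One detail worth knowing about zip is that it stops at the shorter input. The sketch below uses a plain list as a hypothetical stand-in for the colors(...) result to show what that means when step is larger than 1, and one possible way (slicing the palette with the same step) to spread the seeds over the full color range. The slicing is a suggestion of this sketch, not something the snippet above does.

```python
start_seed, last_seed, step = 4, 8, 2
seeds = range(start_seed, last_seed + 1, step)   # 4, 6, 8
palette = ["c0", "c1", "c2", "c3", "c4"]         # hypothetical stand-in: last_seed - start_seed + 1 colors

# zip stops at the shorter input, so only the first len(seeds) colors are used
pairs = list(zip(seeds, palette))

# slicing the palette with the same step pairs the seeds with colors from the whole range
spread = list(zip(seeds, palette[::step]))

print(pairs)
print(spread)
```

Either way there is no manual index arithmetic, so the out-of-bounds error cannot occur.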

Utility functions

I'm not sure of the value in these one-liner functions:

max_value(seed) can just be max(collatz_algorithm(seed))
collatz_values(seed) can just be list(collatz_algorithm(seed))

They don't really provide useful abstractions. In fact the original expressions are arguably clearer than the function calls.
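As a quick check that the inline expressions behave the same as the wrappers, here is a sketch with a compact version of the generator from the question. max() and list() both consume the generator directly, so no intermediate helper is needed.

```python
def collatz_algorithm(seed: int):
    """Yield the seed and every following Collatz value until 1 is reached."""
    n = seed
    yield n
    while n > 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n

values = list(collatz_algorithm(6))   # materialize the whole sequence
peak = max(collatz_algorithm(6))      # max() iterates the generator without building a list

print(values)   # [6, 3, 10, 5, 16, 8, 4, 2, 1]
print(peak)     # 16
```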

colors(seeds) in its current form is also debatable, but if the colormap were also parametrized, it might make more sense as a function:

def colors(seeds: int, colormap: str = "bwr"):
    """Creates a list of seeds + 1 RGBA colors sampled from the given colormap"""
    cmap = plt.get_cmap(colormap, seeds + 1)  # resample so each index below has its own entry
    colors = [cmap(seed) for seed in range(seeds + 1)]
    return colors

(You could also consider passing in all the seeds and generating len(seeds) colors.)
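A rough sketch of that alternative, assuming a hypothetical helper named colors_for that receives the seeds themselves: it samples the colormap at len(seeds) evenly spaced points, so the number of colors always matches the number of seeds, whatever the start or step.

```python
import matplotlib.pyplot as plt
import numpy as np

def colors_for(seeds, colormap="bwr"):
    """Hypothetical helper: one RGBA color per seed, evenly spaced over the colormap."""
    cmap = plt.get_cmap(colormap)
    # Calling a colormap with floats in [0, 1] returns an array of shape (len(seeds), 4)
    return cmap(np.linspace(0, 1, len(seeds)))

seeds = range(100, 501, 50)    # 100, 150, ..., 500
palette = colors_for(seeds)
print(palette.shape)
```

Because the result has exactly one row per seed, it can be passed to zip or to the c argument of scatter without any index bookkeeping.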

Plotting functions

def collatz_plot(start_seed, last_seed, step = 1, alpha = 0.5, ax = 0):
    """Creates a plot indicating the number of iterations and the values reached for a certain set of seeds"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].plot(collatz_values(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

def collatz_scatter(start_seed, last_seed, step = 1, alpha = 0.5, ax = 1):
    """Creates a scatter plot indicating the max value reached by every seed in the set"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].scatter(i, max_value(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

- Use the standard matplotlib function conventions:
  - Accept ax as an Axes object (whereas yours accepts an Axes index).
  - Accept **kwargs to support any and all plotting parameters (whereas yours only supports alpha).
  - Return the Axes object (whereas yours returns nothing).
- Rename the action functions to describe their purpose, preferably starting with verbs (e.g., use plot_max_vs_seed instead of collatz_scatter).
- Move the labeling code into the respective plotting function. I'd say the plot and labels are one abstract unit.
- Remove the spaces around = for default values of parameters without type hints, per PEP 8 (e.g., use step=1 instead of step = 1).
- Remove the loop for ax.scatter since x, y, c can all be array-like.

def plot_val_vs_iter(start_seed, last_seed, step=1, ax=None, **kwargs):
    """Create a line plot indicating the value reached per iteration by every seed in the set"""
    if ax is None:
        ax = plt.gca()
    s = range(start_seed, last_seed + 1, step)
    # using the modified color function; [::step] keeps one color per plotted seed
    c = colors(last_seed - start_seed, "winter")[::step]
    for seed, color in zip(s, c):
        ax.plot(collatz_values(seed), color=color, **kwargs)
    ax.set(xlabel="Iteration", ylabel="Value")
    return ax

def plot_max_vs_seed(start_seed, last_seed, step=1, ax=None, **kwargs):
    """Create a scatter plot indicating the max value reached by every seed in the set"""
    if ax is None:
        ax = plt.gca()
    x = range(start_seed, last_seed + 1, step)
    y = [max(collatz_algorithm(seed)) for seed in x]
    # using the modified color function; [::step] makes len(c) match len(x) when step > 1
    c = colors(last_seed - start_seed, "winter")[::step]
    ax.scatter(x=x, y=y, c=c, **kwargs)
    ax.set(xlabel="Seed", ylabel="Max value")
    return ax

Example usage:

start_seed = 100 # no IndexError anymore, even when starting from 100
last_seed = 500
fig, (ax_val, ax_max) = plt.subplots(2, 1)
ax_val = plot_val_vs_iter(start_seed, last_seed, ax=ax_val, alpha=0.75)
ax_max = plot_max_vs_seed(start_seed, last_seed, ax=ax_max, alpha=0.5)
...

[Image: figure output]

Comments

- Thanks for the detailed explanation. However, I don't know how to incorporate your parts of the code, since some variables (for example, seed) are still here. Also, why did numbers > 3 cause errors?
- @Lorenzo For the indexing bug, I added an explanation by example. For your question about the variables/seed, I don't understand what you're referring to.
- Never mind, I fixed the problem. Thank you so much for the help.

tdy · Accepted Answer · 2024-05-06 14:49:30Z

