I have tried to improve my code with the suggestions I received in my old question about the Collatz conjecture.
Here is the new code:
import matplotlib.pyplot as plt
import numpy as np

def collatz_algorithm(seed: int):
    """Apply the algorithm to a certain seed and stores the values after every iteration"""
    n = seed
    yield seed
    while n > 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = 3 * n + 1
        yield n

def collatz_values(seed: int):
    """Creates a list that contains the values of the seed after every iterations when applying the algorithm"""
    values = list(collatz_algorithm(seed))
    return values

def max_value(seed: int):
    """Gets the maximum value reached by the seed during the iterative process"""
    max_value = max(collatz_values(seed))
    return max_value

def colors(seeds: int):
    """Creates an array of numbers for the colormap based on the number of seeds"""
    colors = plt.cm.bwr(np.linspace(0, 1, seeds + 1))
    return colors

def collatz_plot(start_seed, last_seed, step = 1, alpha = 0.5, ax = 0):
    """Creates a plot indicating the number of iterations and the values reached for a certain set of seeds"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].plot(collatz_values(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

def collatz_scatter(start_seed, last_seed, step = 1, alpha = 0.5, ax = 1):
    """Creates a scatter plot indicating the max value reached by every seed in the set"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].scatter(i, max_value(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

def collatz_graphs (start_seed, last_seed, step = 1, alpha = 0.5):
    """Plots the 2 graphs on 2 different axis"""
    collatz_plot(start_seed, last_seed, step, alpha)
    collatz_scatter(start_seed, last_seed, step, alpha)

fig, axs = plt.subplots(2, 1)
collatz_graphs(1, 500)
plt.suptitle("Collatz conjecture", font = "Times new roman", size = 22)
axs[0].set_xlabel("n of iterations before stuck in a loop", font = "Times new roman")
axs[0].set_ylabel("Values after every iteration", font = "Times new roman")
axs[1].set_xlabel("Seed", font = "Times new roman")
axs[1].set_ylabel("Max value reached by the seed", font = "Times new roman")
plt.show()
It displays this graph: [image]
What do you think about it? Please point out every mistake. Also, I would like to know why this line works:
axs[ax].plot(collatz_values(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)
Since there is only 1 argument for the plot function I think it creates a list of integers. What I cannot understand is why it doesn't put "collatz_values" on the x axis. Should I create another variable which keeps the count of iterations to make the plot function clearer?
1 Answer
Since there is only 1 argument ... what I cannot understand is why it doesn't put collatz_values on the x axis.
That's just how matplotlib implemented it. If you provide only 1 array, it's assumed to be y, and x defaults to the indices 0, 1, ..., len(y) - 1. If you provide 2 arrays, they are assumed to be x and y.
Should I create another variable which keeps the count of iterations to make the plot function clearer?
That's your call, but it's probably worth getting used to the convention of plot(y) and plot(x, y) because you'll encounter it in lots of existing matplotlib code.
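As a rough, matplotlib-free sketch of that convention: the x values that plot(y) infers are simply the positions of the items in y. The names values and iterations below are chosen for illustration only, and the generator is a condensed version of the one in the question.

```python
def collatz_algorithm(seed: int):
    """Yield the seed and every following value until 1 is reached."""
    n = seed
    yield n
    while n > 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n

values = list(collatz_algorithm(6))     # what gets plotted on the y axis
iterations = list(range(len(values)))   # the x values implied by plot(values)

# With an Axes object `ax` (not created in this sketch), these two calls
# would draw the same line:
#   ax.plot(values)
#   ax.plot(iterations, values)
print(list(zip(iterations, values)))
```

Building iterations explicitly is only worthwhile if you need the x values for something else, such as a custom offset.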
Color bug (updated with example**)
Indexing i - 2 here is a bug:
color = colors(last_seed - start_seed)[i - 2]
It breaks whenever start_seed > 2, e.g.:
>>> collatz_graphs(3, 500)
# IndexError: index 498 is out of bounds for axis 0 with size 498
**Explanation by example:
for i in range(start_seed, last_seed + 1):
    color = colors(last_seed - start_seed)[i - 2]
Here the index i goes from start_seed to last_seed (the range end, last_seed + 1, is exclusive).
If we take start_seed=4 and last_seed=8 for example, then i would include 5 seeds:
$i \in \{4, 5, 6, 7, 8\}$
Now try to access a list of 5 colors using i-2. Notice how the higher indices simply don't exist because we didn't start at 0:
colors(...)[2] # i = 4 i-2 = 2
colors(...)[3] # i = 5 i-2 = 3
colors(...)[4] # i = 6 i-2 = 4
colors(...)[5] # i = 7 i-2 = 5 (no such index)
colors(...)[6] # i = 8 i-2 = 6 (no such index)
So to manually index the colors, we need to start at 0 with i-start_seed:
colors(...)[0] # i = 4 i-4 = 0
colors(...)[1] # i = 5 i-4 = 1
colors(...)[2] # i = 6 i-4 = 2
colors(...)[3] # i = 7 i-4 = 3
colors(...)[4] # i = 8 i-4 = 4
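The two indexing schemes can also be checked without matplotlib. In this sketch, palette is a hypothetical stand-in for the array returned by colors(last_seed - start_seed), and pick is a helper introduced only for illustration:

```python
start_seed, last_seed = 4, 8

# Stand-in for colors(last_seed - start_seed): one entry per seed, 5 in total.
palette = [f"color{k}" for k in range(last_seed - start_seed + 1)]

def pick(offset):
    """Look up palette[i - offset] for every seed; None marks an IndexError."""
    picked = []
    for i in range(start_seed, last_seed + 1):
        try:
            picked.append(palette[i - offset])
        except IndexError:
            picked.append(None)
    return picked

buggy = pick(2)           # the original i - 2
fixed = pick(start_seed)  # i - start_seed
print(buggy)
print(fixed)
```

With i - 2, the last two seeds fall off the end of the palette, whereas i - start_seed covers every seed exactly once.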
But in practice, there's a more idiomatic approach for this.
What you're trying to do is retrieve one color per seed, so this is a classic use case for zip. It lets you couple the seeds and colors to avoid manual indexing:
s = range(start_seed, last_seed + 1, step)
c = colors(last_seed - start_seed)
for seed, color in zip(s, c):
    axs[ax].plot(collatz_values(seed), color=color, alpha=alpha)
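To see the pairing in isolation, here is a small sketch that replaces the real RGBA colors with placeholder strings (palette is a hypothetical name, not something from the question):

```python
start_seed, last_seed, step = 4, 8, 1

seeds = range(start_seed, last_seed + 1, step)
# Hypothetical palette: one placeholder string per seed instead of RGBA values.
palette = [f"color{k}" for k in range(len(seeds))]

# zip walks both sequences in lockstep, so no index arithmetic is needed.
pairs = list(zip(seeds, palette))
print(pairs)

# Caveat: zip stops at the shorter input, so with a larger step only the
# first few palette entries are used.
short = list(zip(range(start_seed, last_seed + 1, 2), palette))
print(short)
```

If the full color range should be spread over fewer seeds, one option is to size the palette from len(seeds) rather than from last_seed - start_seed.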
Utility functions
I'm not sure of the value in these one-liner functions:
- max_value(seed) can just be max(collatz_algorithm(seed))
- collatz_values(seed) can just be list(collatz_algorithm(seed))
They don't really provide useful abstractions. In fact the original expressions are arguably clearer than the function calls.
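For instance, a minimal sketch of calling the generator directly (the generator is condensed from the question; values, peak and steps are illustrative names):

```python
def collatz_algorithm(seed: int):
    """Yield the seed and every following value until 1 is reached."""
    n = seed
    yield n
    while n > 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n

seed = 7
values = list(collatz_algorithm(seed))  # replaces collatz_values(seed)
peak = max(collatz_algorithm(seed))     # replaces max_value(seed)
steps = len(values) - 1                 # iterations needed to reach 1
print(values, peak, steps)
```

Note that max consumes the generator lazily, so it never builds the intermediate list that max_value creates through collatz_values.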
colors(seeds) in its current form is also debatable, but if the colormap were also parametrized, it might make more sense as a function:
def colors(seeds: int, colormap: str = "bwr"):
    """Creates an array of numbers for the given colormap based on the number of seeds"""
    # Resample to seeds + 1 entries so each index below maps to its own color
    cmap = plt.get_cmap(colormap, seeds + 1)
    colors = [cmap(seed) for seed in range(seeds + 1)]
    return colors
(You could also consider passing in all the seeds and generating len(seeds) colors.)
Plotting functions
def collatz_plot(start_seed, last_seed, step = 1, alpha = 0.5, ax = 0):
    """Creates a plot indicating the number of iterations and the values reached for a certain set of seeds"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].plot(collatz_values(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)

def collatz_scatter(start_seed, last_seed, step = 1, alpha = 0.5, ax = 1):
    """Creates a scatter plot indicating the max value reached by every seed in the set"""
    for i in range (start_seed, last_seed + 1, step):
        axs[ax].scatter(i, max_value(i), color = colors(last_seed - start_seed)[i - 2], alpha = alpha)
- Use the standard matplotlib function conventions:
  - Accept ax as an Axes object (whereas yours accepts an Axes index).
  - Accept **kwargs to support any and all plotting parameters (whereas yours only supports alpha).
  - Return the Axes object (whereas yours returns nothing).
- Rename the action functions to describe their purpose, preferably starting with verbs (e.g., use plot_max_vs_seed instead of collatz_scatter).
- Move the labeling code into the respective plotting function. I'd say the plot and labels are one abstract unit.
- Remove spaces around default non-type-hinted parameters per PEP 8 (e.g., use step=1 instead of step = 1).
- Remove the loop for ax.scatter since x, y, c can all be array-like.
def plot_val_vs_iter(start_seed, last_seed, step=1, ax=None, **kwargs):
    """Create a line plot indicating the value reached per iteration by every seed in the set"""
    if ax is None:
        ax = plt.gca()
    s = range(start_seed, last_seed + 1, step)
    # using the modified color function; len(s) - 1 gives one color per plotted seed
    c = colors(len(s) - 1, "winter")
    for seed, color in zip(s, c):
        ax.plot(collatz_values(seed), color=color, **kwargs)
    ax.set(xlabel="Iteration", ylabel="Value")
    return ax

def plot_max_vs_seed(start_seed, last_seed, step=1, ax=None, **kwargs):
    """Create a scatter plot indicating the max value reached by every seed in the set"""
    if ax is None:
        ax = plt.gca()
    x = range(start_seed, last_seed + 1, step)
    y = [max(collatz_algorithm(seed)) for seed in x]
    # using the modified color function; c must match x in length, also when step > 1
    c = colors(len(x) - 1, "winter")
    ax.scatter(x=x, y=y, c=c, **kwargs)
    ax.set(xlabel="Seed", ylabel="Max value")
    return ax
Example usage:
start_seed = 100 # no IndexError anymore, even when starting from 100
last_seed = 500
fig, (ax_val, ax_max) = plt.subplots(2, 1)
ax_val = plot_val_vs_iter(start_seed, last_seed, ax=ax_val, alpha=0.75)
ax_max = plot_max_vs_seed(start_seed, last_seed, ax=ax_max, alpha=0.5)
...
- Thanks for the detailed explanation. However, I don't know how to incorporate your parts of the code, since some variables (for example, seed) are still here. Also, why did numbers > 3 cause errors? – Lorenzo, May 9, 2024 at 20:14
- @Lorenzo For the indexing bug, I added an explanation by example. For your question about the variables/seed, I don't understand what you're referring to. – tdy, May 10, 2024 at 0:57
- Never mind, I fixed the problem. Thank you so much for the help. – Lorenzo, May 13, 2024 at 19:52