Here's my take on the Linear Regression exercise from John Zelle's Python book.
Open a window, let the user click on as many points as they want. Upon pressing the "Done" button the regression line of those points is drawn.
from graphics import *
def dots(win):
sumx = 0
sumy = 0
sumx2 = 0
sumxy = 0
count = 0
click = win.getMouse()
clickX = click.getX()
clickY = click.getY()
while (clickX < 31 or clickX > 39 or
clickY < 1 or clickY > 5):
sumx += clickX
sumy += clickY
sumx2 += clickX**2
sumxy += clickX*clickY
count += 1
Point(clickX, clickY).draw(win)
click = win.getMouse()
clickX = click.getX()
clickY = click.getY()
return sumx, sumy, sumx2, sumxy, count
def mean(sum, count):
mean = sum / count
return mean
def calcreg(sumx, sumy, sumx2, sumxy, count):
xleft = 0
xright = 40
m = (sumxy - count * mean(sumx, count) * mean(sumy, count)) / (sumx2 - count * mean(sumx, count)**2)
yleft = mean(sumy, count) + m * (xleft - mean(sumx, count))
yright = mean(sumy, count) + m * (xright - mean(sumx, count))
return Line(Point(xleft, yleft), Point(xright, yright))
def main():
win = GraphWin("Regression Line", 400, 400)
win.setCoords(0, 0, 40, 40)
Text(Point(35, 3), "DONE").draw(win)
Rectangle(Point(31, 1), Point(39 ,5)).draw(win)
sumx, sumy, sumx2, sumxy, count = dots(win)
try:
regline = calcreg(sumx, sumy, sumx2, sumxy, count)
regline.draw(win)
except ZeroDivisionError:
Text(Point(20, 20), "Enter at least 2 points!").draw(win)
win.getMouse()
main()
Any tips on what I should/could change here?
1 Answer 1
from <xx> import *
Prevent a *
import. You don't know what it will inject in your module namespace. Instead only import the parts you need:
from graphics import GraphWin, Line, Point, Rectangle, Text
The google style guide as linked by Mircea also says not to import the classes, but only modules and packages, but that is a matter of style and taste.
Magic numbers
There are a lot of magic numbers, scattered in your code. If ever you want to change the location of the button, you will need to adapt that in a few places, and will most likely miss some. To prevent such magic numbers, you can do something like this:
def main():
coord_max = 40
win = GraphWin("Regression Line", 10 * coord_max, 10 * coord_max)
win.setCoords(0, 0, coord_max, coord_max)
button_x, button_y = 35, 3
button_width, button_height = 8, 4
Text(Point(button_x, button_y), "DONE").draw(win)
button = Rectangle(
Point(button_x - button_width / 2, button_y - button_height / 2),
Point(button_x + button_width / 2, button_y + button_height / 2),
)
button.draw(win)
split functionality
your dots
method does a lot of things now. It:
- catches the click
- checks whether it it is the button
- draws the point
- does the summary calculations
Especially the last one should be factored to its own method. The easiest way to do this is let dots
just yield the x and y coordinate, and let another method take care of the regression.
Checking whether the click is on the button can also be done simpler. Instead of hardcoding the coordinates, simplest would be to pass in the button object, and ask that one for its bounding coordinates:
def dots(win, button):
while True:
click = win.getMouse()
x = click.getX()
y = click.getY()
if button.p1.x <= x <= button.p2.x and button.p1.y <= y <= button.p2.y:
break # button press
Point(x, y).draw(win)
yield x, y
Is to me a lot clearer and easier to understand.
If you don't know that p1 and p2 are bottomleft and topright, you can take that into account by doing:
def dots(win, button):
while True:
click = win.getMouse()
x = click.getX()
y = click.getY()
x1, x2 = sorted((button.p1.x, button.p2.x))
y1, y2 = sorted((button.p1.y, button.p2.y))
if x1 <= x <= x2 and y1 <= y <= y2: # button press
break
Point(x, y).draw(win)
yield x, y
summarize:
Instead of each summary metric getting its own variable, you can save them in a dict
(or a collections.Counter
more specifically)
def summarize(points):
summary = Counter()
for x, y in points:
summary["x"] += x
summary["y"] += y
summary["x2"] += x ** 2
summary["xy"] += x * y
summary["count"] += 1
summary["mean_x"] = summary["x"] / summary["count"]
summary["mean_y"] = summary["y"] / summary["count"]
return summary
regression:
Then pass this on to a method that calculates the regression line. This needs the maximum x
of the screen as an argument:
def regression_line(summary, x_max, x_min=0):
slope = (
summary["xy"]
- summary["count"] * summary["mean_x"] * summary["mean_y"]
) / (summary["x2"] - summary["count"] * summary["mean_x"] ** 2)
y_min = summary["mean_y"] + slope * (x_min - summary["mean_x"])
y_max = summary["mean_y"] + slope * (x_max - summary["mean_x"])
return Line(Point(x_min, y_min), Point(x_max, y_max))
rest of main
points = dots(win, button=button)
summary = summarize(points)
try:
regline = regression_line(summary, x_max=coord_max)
except ZeroDivisionError:
middle_point = Point(coord_max / 2, coord_max / 2)
Text(middle_point, "Enter at least 2 points!").draw(win)
else:
regline.draw(win)
win.getMouse()
main guard
If you put the calling of main()
behind if __name__ == "__main__":
, you can later import this module from somewhere else, and reuse some of the code
if __name__ == "__main__":
main()
-
\$\begingroup\$ Thanks for this in depth answer! Some things i did the way i did them just because i don't know more advanced methods, for example each metric having its own variable. I just don't know about summaries and dicts yet, it'll probably be further into the book. As for the * import: I don't usually do that. In this case the author of the book (who also made the module) said it's supposed to be that way for this module. I once tried "import graphics" but that didn't work. \$\endgroup\$Kevin– Kevin2019年02月14日 16:23:59 +00:00Commented Feb 14, 2019 at 16:23
Explore related questions
See similar questions with these tags.