I am writing an object-oriented shape detector for the first time in Python. I come from a Java background. Can you provide any coding advice?
https://www.pyimagesearch.com/2016/02/08/opencv-shape-detection/
@dataclass
class Coordinate:
x_value: int
y_value: int
@dataclass
class Rectangle:
coordinate_upper_left: Coordinate
width: float
height: float
from cv2 import cv2
from numpy.random import randint
import constants.shape_constants as shape_constants
from models.coordinate import Coordinate
from models.rectangle import Rectangle
class ShapeItem():
def __init__(self, curve_data):
# these items must run in sequence to be set correctly
self.curve_data = curve_data
self.drawing_item_id = uuid.uuid4()
self.approx_poly_dp = self.get_approx_poly()
self.vertices = self.get_vertices()
self.bounded_rectangle = self.get_bounded_rectangle()
self.center_coordinate = self.get_center_coordinate()
self.shape = self.get_shape()
def get_approx_poly(self):
perimeter_value = cv2.arcLength(self.curve_data, True)
return cv2.approxPolyDP(self.curve_data, 0.04 * perimeter_value, True)
def get_vertices(self):
vertices_list : list[Coordinate] = []
vertices_data_curve = self.approx_poly_dp.tolist()
for vertices_in_shape in [vertices_data_curve]:
for vertices_shape_two_brackets in vertices_in_shape:
for vertices_shape_one_bracket in vertices_shape_two_brackets:
vertices_item = Coordinate(vertices_shape_one_bracket[0], vertices_shape_one_bracket[1])
vertices_list.append(vertices_item)
return vertices_list
def get_bounded_rectangle(self):
(x_cv, y_cv, width_cv, height_cv) = cv2.boundingRect(self.approx_poly_dp)
bounded_rectangle = Rectangle(
coordinate_upper_left=Coordinate(x_cv, y_cv),
width=width_cv,
height=height_cv
)
return bounded_rectangle
def get_shape(self):
shape_value: str
if len(self.vertices) == 3:
shape_value = shape_constants.TRIANGLE
elif len(self.vertices) == 4:
shape_value = shape_constants.RECTANGLE
elif len(self.vertices) == 5:
shape_value = shape_constants.PENTAGON
else:
shape_value = shape_constants.ELLIPSE
return shape_value
def get_secondary_shape(self):
if self.shape == shape_constants.RECTANGLE and \
0.95 <= (self.bounded_rectangle.width / float(self.bounded_rectangle.height) <= 1.05):
return shape_constants.SQUARE
else:
return shape_constants.RECTANGLE
def get_center_coordinate(self):
moment_data = cv2.moments(self.curve_data)
center_x = int((moment_data["m10"] / moment_data["m00"]))
center_y = int((moment_data["m01"] / moment_data["m00"]))
center_coordinate = Coordinate(center_x, center_y)
return center_coordinate
def set_background_color(self):
random_color_index = randint(1, 10)
return random_color_index
2 Answers 2
Overview
The code layout is good, and you used meaningful names for classes, functions and variables. The names also adhere to the recommended style for Python.
Layout
All the import
lines should be at the top of the code.
I assume uuid
is from the missing line:
import uuid
Documentation
The PEP 8 style guide recommends adding docstrings for classes and functions.
The class docstring can summarize its purpose, and it can also include some example usage code.
DRY
In the get_shape
function, the expression len(self.vertices)
is repeated a few times. You could set this to a variable:
number_verts = len(self.vertices)
if number_verts == 3:
This simplifies the code, and it may be a tad more efficient.
Simpler
Consider this function:
def set_background_color(self):
random_color_index = randint(1, 10)
return random_color_index
Since it does not refer to self
in the function, it can be simplified
by removing self
from the argument list.
Also, the intermediate variable random_color_index
can be eliminated:
def set_background_color():
return randint(1, 10)
You could also simplify the return
in other functions, such as get_center_coordinate
.
a Too Big constructor
In any OO language, including java and python, this ctor looks like it's too ambitious, it's doing too much. The comment offers a big hint about that.
class ShapeItem():
def __init__(self, curve_data):
# these items must run in sequence to be set correctly
self.curve_data = curve_data
self.drawing_item_id = uuid.uuid4()
self.approx_poly_dp = self.get_approx_poly()
self.vertices = self.get_vertices()
self.bounded_rectangle = self.get_bounded_rectangle()
self.center_coordinate = self.get_center_coordinate()
self.shape = self.get_shape()
The most important thing missing here is a docstring which explains the class responsibilities.
Those first two items, storing curve data and a GUID, look fine, and could maybe even be handled by a @dataclass. The rest seems to be getting into one or more "result" objects.
A big part of why we choose to solve a problem in the OO style is we can establish class invariants early on, when the ctor finishes, and then we can rest assured that each method we call shall preserve those invariants. It simplifies analysis, and testing. But here, most of the work is performed before the ctor finishes.
Rather than self.this
and self.that
, consider defining
local this
, that
variables in a method or function.
good annotation
Thank you kindly for telling us the empty
vertices_list
is of list[Coordinate]
.
Empty lists are a bit of a rough edge for mypy --strict
these days, requiring a little extra annotation effort.
Consider just calling it vertices
, since the code
makes it very clear that it is a list.
mapping
Maybe the triangle enum is helpful? IDK, perhaps there's more
to this project than the OP excerpt.
But the repeated if
is definitely tedious.
Prefer to map with this dict:
{
3: shape_constants.TRIANGLE,
4: shape_constants.RECTANGLE,
5: shape_constants.PENTAGON,
}
naming
I'm skeptical that middle name is correct. I bet you intended shape_constants.QUADRILATERAL instead, given that there's no 90° promise being made.
long conjunct
nit: This is perfectly fine as-is:
if self.shape == shape_constants.RECTANGLE and \
0.95 <= (self.bounded_rectangle.width / float(self.bounded_rectangle.height) <= 1.05):
But PEP 8
encourages you to use "extra" parens when phrasing
a long conjunct such as that, to avoid the use of \
backwhacks.
if (m and
0.95 <= n <= 1.05):
And sometimes we define a temp var, such as n
,
so we have a pair of short statements rather than
a single statement spanning multiple lines.
distinct colors
random_color_index = randint(1, 10)
This is sampling with replacement.
The output may be more legible if you draw colors from an urn
without replacement.
Produce a list of sequential colors
,
then random.shuffle(colors)
.
Sequentially draw a color from that permutation.
Since the index might fall off the end, use %
modulo
to keep it in-bounds, in which case we're forced to resort to replacement.
Notice that this is a datastructure which has a tiny amount
of state. It would be a good candidate for breaking out
a new class
.