Automatic image cropping in Python 3

Question 1

I made a script to automate the cropping of spectrogram images I generated using Matlab.

I have 4 different types of images (Fixed height, but varying width) and they are cropped differently according to their type.

Here's an example of the input image (Type 2) (2462x256) enter image description here

Here's an example of the input image (Type 3) (34482x256) enter image description here

Image types 1 and 2 are simply cropped from the right edge to the desired dimensions (In this case 1600px), since the interesting part of the signal is on the right. I'm essentially removing the left side of the image until I have a 1600px wide image left.

Image types 3 and 4 are originally very long images, so I can crop multiple images out of each one, overlapping each by a fixed amount. (In this case, I'll crop a 1600px wide image starting at (0,0), save it, crop another 1600px wide image at (400,0) then at (800,0) and so on.)

Here's the first example after cropping. (1600x256) enter image description here

Here's the first two crops from the second example, you can see the overlap on the right. (1600x256) enter image description here enter image description here

As a beginner, I mostly want to know if I'm doing something wrong, that could be optimized or just done better.

#Packages
import cv2
import os
from imageio import imwrite, imread
#Defined parameters
#Input and output paths
path_directory_input = '/home/.../spectrograms/uncropped'
path_directory_output = '/home/.../spectrograms/cropped'
#Cropping parameters
image_height_final = 256
image_width_final = 1600
image_overlap = 400
crop_nb_maximum = 11
#Class example counters
class1,class2,class3,class4 = 0,0,0,0
class1_out,class2_out,class3_out,class4_out = 0,0,0,0
# Object slipping = 1
# Object slipping on surface = 2
# Robot movement = 3
# Robot movement with object = 4
#Iterate over all samples in the input directory
for path_image in os.listdir(path_directory_input):
 #Defines the current image path, output path and reads the image
 path_image_input = os.path.join(path_directory_input, path_image)
 path_image_output = os.path.join(path_directory_output, path_image)
 image_current = imread(path_image_input)
 #Parse the filename and determine the current class (determined by the 15th character)
 class_current = int(path_image[15])
 #Counts the number of input examples being treated
 if class_current == 1:
 class1 += 1
 if class_current == 2:
 class2 += 1
 if class_current == 3:
 class3 += 1
 if class_current == 4:
 class4 += 1
 #Get image dimensions
 image_height_current, image_width_current = image_current.shape[:2]
 #Changes the procedure depending on the current class
 if (class_current == 1) or (class_current == 2):
 print('Processing class: ', class_current)
 #Crops the image to target size (Format is Y1:Y2,X1:X2)
 image_current_cropped = image_current[0:image_height_final,
 (image_width_current-image_width_final):image_width_current]
 #Saves the new image in the output file
 imwrite(path_image_output,image_current_cropped)
 elif (class_current == 3) or (class_current == 4):
 print('Processing class: ', class_current)
 #Count how many crops can fit in the original
 crop_nb = int((image_width_current - image_width_final)/image_overlap)
 #Limit the crop number to arrive at equal class examples
 if crop_nb > crop_nb_maximum:
 if class_current == 3:
 crop_nb = crop_nb_maximum
 else:
 crop_nb = crop_nb_maximum * 2
 #Loop over that number
 for crop_current in range(0,crop_nb):
 #Counts the number of output examples
 if class_current == 3:
 class3_out += 1
 if class_current == 4:
 class4_out += 1
 #Crop the image multiple times with some overlap
 image_current_cropped = image_current[0:image_height_final,
 (crop_current * image_overlap):((crop_current * image_overlap) + image_width_final)]
 #Save the crop with a number appended
 path_image_output_new = path_image_output[:-4] #Removes the .png
 path_image_output_new = str.join('_',(path_image_output_new,str(crop_current))) #Appends the current crop number
 path_image_output_new = path_image_output_new + '.png' #Appends the .png at the end
 imwrite(path_image_output_new,image_current_cropped)
 else:
 #If the current class is not a valid selection (1-4)
 print('Something went wrong with the class selection: ',class_current)
#Prints the number of examples
print('Cropping is done. Here are the input example numbers:')
print('class1',class1)
print('class2',class2)
print('class3',class3)
print('class4',class4)
print('Here are the output example numbers')
print('class1',class1)
print('class2',class2)
print('class3',class3_out)
print('class4',class4_out)

Question 2

@GarethRees I added some clarification. Feel free to ask for more if it's stil unclear!

Question 3

This could lead to an error: crop_nb = crop_nb_maximum * 2. You've determined that crop_nb > crop_nb_maximum, but you don't know if it's larger than twice that value. How about crop_nb = min(crop_nb, crop_nb_maximum * 2)?

Question 4

I didn't want to answer this question because I'm not familiar enough with Python. But I can certainly give some pointers. I hope this answer encourages others to chime in too.

Avoid variable names with an index attached

It's always better to the make that index an actual index.

If instead of variable names class1, class2, etc. you use a list class = [0,0,0,0], some of your code will be simpler:

if class_current == 1:
 class1 += 1
if class_current == 2:
 class2 += 1
if class_current == 3:
 class3 += 1
if class_current == 4:
 class4 += 1

becomes

class[class_current] += 1

Other than that, I think your variable names are clear and make it so you don't need to add a comment explaining each one.

Document assumptions

In this bit of code:

 if crop_nb > crop_nb_maximum:
 if class_current == 3:
 crop_nb = crop_nb_maximum
 else:
 crop_nb = crop_nb_maximum * 2

you set crop_nb to twice some constant value. I presume that you do this because you know for sure crop_nb is larger than this constant, you are assuming that for class 4 images you always can crop at least 22 images. This might be true, but it is not clear from the code, and therefore it smells like a bug. So if this is assumption is actually true, add a comment.

I would probably write it like this:

if class_current == 3:
 crop_nb = min(crop_nb, crop_nb_maximum)
else:
 crop_nb = min(crop_nb, crop_nb_maximum * 2)

Simplify logic

In the inner loop for classes 3 and 4, you start like this:

for crop_current in range(0,crop_nb):
 #Counts the number of output examples
 if class_current == 3:
 class3_out += 1
 if class_current == 4:
 class4_out += 1
 # <snip>

It is simpler to instead write

class_out[class_current] += crop_nb
for crop_current in range(0,crop_nb):
 # <snip>

Question 5

All good ideas! You're right about the second point: I do know there will be more than 22 images, but I should use your notation anyways, since it covers that possible gap.

Question 6

In your last tip, I think +=crop_nb should be +=1 (this code is already inside a branch where class_current is 3 or 4)?

Question 7

@handle: No, I moved the +=1 out of the for loop that runs crop_nb times. The suggestion is to add crop_nb directly, rather than increment by 1 crop_nb times.

Question 8

Python won't let you name a variable class (because it's already a keyword) so it would need to be renamed. Something like class_count would be good since it's a list of counts of each kind of class.

Cris Luengo Cris Luengo 6,9811 gold badge14 silver badges37 bronze badges · Accepted Answer · 2018-04-10 04:26:57Z

I didn't want to answer this question because I'm not familiar enough with Python. But I can certainly give some pointers. I hope this answer encourages others to chime in too.

Avoid variable names with an index attached

It's always better to the make that index an actual index.

If instead of variable names class1, class2, etc. you use a list class = [0,0,0,0], some of your code will be simpler:

if class_current == 1:
 class1 += 1
if class_current == 2:
 class2 += 1
if class_current == 3:
 class3 += 1
if class_current == 4:
 class4 += 1

becomes

class[class_current] += 1

Other than that, I think your variable names are clear and make it so you don't need to add a comment explaining each one.

Document assumptions

In this bit of code:

 if crop_nb > crop_nb_maximum:
 if class_current == 3:
 crop_nb = crop_nb_maximum
 else:
 crop_nb = crop_nb_maximum * 2

you set crop_nb to twice some constant value. I presume that you do this because you know for sure crop_nb is larger than this constant, you are assuming that for class 4 images you always can crop at least 22 images. This might be true, but it is not clear from the code, and therefore it smells like a bug. So if this is assumption is actually true, add a comment.

I would probably write it like this:

if class_current == 3:
 crop_nb = min(crop_nb, crop_nb_maximum)
else:
 crop_nb = min(crop_nb, crop_nb_maximum * 2)

Simplify logic

In the inner loop for classes 3 and 4, you start like this:

for crop_current in range(0,crop_nb):
 #Counts the number of output examples
 if class_current == 3:
 class3_out += 1
 if class_current == 4:
 class4_out += 1
 # <snip>

It is simpler to instead write

class_out[class_current] += crop_nb
for crop_current in range(0,crop_nb):
 # <snip>

All good ideas! You're right about the second point: I do know there will be more than 22 images, but I should use your notation anyways, since it covers that possible gap.
In your last tip, I think +=crop_nb should be +=1 (this code is already inside a branch where class_current is 3 or 4)?
@handle: No, I moved the +=1 out of the for loop that runs crop_nb times. The suggestion is to add crop_nb directly, rather than increment by 1 crop_nb times.
Python won't let you name a variable class (because it's already a keyword) so it would need to be renamed. Something like class_count would be good since it's a list of counts of each kind of class.

Stack Exchange Network

Automatic image cropping in Python 3

1 Answer 1

Avoid variable names with an index attached

Document assumptions

Simplify logic

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Automatic image cropping in Python 3

1 Answer 1

Avoid variable names with an index attached

Document assumptions

Simplify logic

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions