5
\$\begingroup\$

I wrote a program for analyzing pictures. The problem is that processing is very slow, even for medium-sized images such as 800x800. I think the root of the problem is the for loop where I fill my NumPy arrays with values. The main task of the program is to count how many times each intensity occurs in each color channel and then plot the counts as histograms. For example, in the end we will be able to see how many times intensity 200 occurs in the red channel, and so on.

My code is probably very hard to read but I tried to add comments to make things easier.

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
#Load an image
fname = 'picture1.jpg'
image_grey = Image.open(fname).convert("L")
image_rgb = Image.open(fname).convert("RGB")
arrey_grey = np.asarray(image_grey)
arrey_rgb = np.asarray(image_rgb)
#Get image size
with Image.open('picture1.jpg') as img:
    width, height = img.size
size = width * height
#Plot the uploaded image both grey and rgb
plt.imshow(arrey_grey, cmap='gray')
plt.show()
plt.imshow(arrey_rgb)
plt.show()
#Define numpy arrey for each color channel
matrix_red = np.zeros((256, 4),dtype = object)
matrix_red[:,1] = int(0)
matrix_red[:,2:] = float(0)
matrix_green = np.zeros((256, 4),dtype = object)
matrix_green[:,1] = int(0)
matrix_green[:,2:] = float(0)
matrix_blue = np.zeros((256, 4),dtype = object)
matrix_blue[:,1] = int(0)
matrix_blue[:,2:] = float(0)
# Completing first column with 0-255
for i in range(256):
    matrix_red[i][0] = i
    matrix_green[i][0] = i
    matrix_blue[i][0] = i
# Counting intensity for each color channel
for i in range(width):
    Matrix_Width = arrey_rgb[i]
    for i in range(height):
        Matrix_Height = Matrix_Width[i]
        Red_Value = Matrix_Height[0]
        Green_Value = Matrix_Height[1]
        Blue_Value = Matrix_Height[2]
        for i in range(256):
            if (matrix_red[i][0] == Red_Value):
                matrix_red[i][1] = matrix_red[i][1] + 1
            if (matrix_green[i][0] == Green_Value):
                matrix_green[i][1] = matrix_green[i][1] + 1
            if (matrix_blue[i][0] == Blue_Value):
                matrix_blue[i][1] = matrix_blue[i][1] + 1
# Data for task ahead
Hx = 0
for i in range(256):
    matrix_red[i][2] = matrix_red[i][1] / size
    Hx = (matrix_red[i][2] + matrix_red[i][3]) + Hx
    matrix_red[i][3] = Hx
#Plotting results
Frequencie_Red = np.zeros((256, 1),dtype = object)
Frequencie_Red[:,0] = int(0)
Frequencie_Green = np.zeros((256, 1),dtype = object)
Frequencie_Green[:,0] = int(0)
Frequencie_Blue = np.zeros((256, 1),dtype = object)
Frequencie_Blue[:,0] = int(0)
Intensity = np.zeros((256, 1),dtype = object)
Intensity[:,0] = int(0)
for i in range(256):
    Frequencie_Red[i] = matrix_red[i][1]
    Frequencie_Green[i] = matrix_green[i][1]
    Frequencie_Blue[i] = matrix_blue[i][1]
for i in range(256):
    Intensity[i] = i
pos = Intensity
width = 1.0
ax = plt.axes()
ax.set_xticks(pos + (width / 2))
ax.set_xticklabels(Intensity)
plt.bar(pos, Frequencie_Red, width, color='r')
plt.show()
plt.bar(pos, Frequencie_Green, width, color='g')
plt.show()
plt.bar(pos, Frequencie_Blue, width, color='b')
plt.show() 

As was pointed out in the comments, I have added a test image and a capture of my results.

This is the image I'm using for my calculations: Test Image

The results look like this, where the red histogram is the red channel and so on:

*Small note: I have no idea why there is a black line under the first histogram. Results

Toby Speight
asked Nov 19, 2017 at 18:03
\$\endgroup\$
  • \$\begingroup\$ What's the point of using IPython's matplotlib inline magic if you call plt.show anyway? \$\endgroup\$ Commented Nov 19, 2017 at 18:42
  • \$\begingroup\$ @MathiasEttinger It serves no purpose. I added it in the beginning but didn't use it. My apologies for not removing it. \$\endgroup\$ Commented Nov 19, 2017 at 18:57
  • \$\begingroup\$ An example test image with the expected output would help reviewers and future readers. On a general note, I feel like you could use some loops to reduce repetition, but I am not sure. \$\endgroup\$ Commented Nov 19, 2017 at 19:43
  • \$\begingroup\$ For the red image, the lines are the ax xticks. They are so dense that all the numbers got merged together. Remove the 3 ax lines to get a more visually appealing result like the blue or green images. \$\endgroup\$ Commented Nov 19, 2017 at 22:10

1 Answer

5
\$\begingroup\$

Usually, when using for loops and numpy together, you’re probably doing it wrong.

Some things you could replace in your code:

  • Use np.arange(256) instead of np.zeros + for i in range(256);
  • Use slicing instead of manually extracting values out of an array;
  • Use comparisons and .sum() instead of manually counting the number of pixels of a certain color intensity ((array_rgb == 42).sum(axis=1).sum(axis=0))
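
The three replacements listed above can be sketched on a small synthetic array. This is only an illustration: `fake_rgb` is a made-up stand-in for the question's image array, and the intensity 42 is arbitrary.

```python
import numpy as np

# Hypothetical 4x5 RGB image with random 8-bit values (stand-in for arrey_rgb).
rng = np.random.default_rng(0)
fake_rgb = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# 1. np.arange builds the 0..255 column directly, no loop needed.
intensities = np.arange(256)

# 2. Slicing extracts a whole channel at once (shape (4, 5) here).
red_channel = fake_rgb[..., 0]

# 3. Comparison + sum counts the pixels of one intensity in every channel.
count_42_per_channel = (fake_rgb == 42).sum(axis=1).sum(axis=0)

# The same count for the red channel, done pixel by pixel for comparison.
slow_red_count = sum(1 for row in fake_rgb for pixel in row if pixel[0] == 42)
```

The vectorized count returns one value per channel, so `count_42_per_channel[0]` should agree with the pixel-by-pixel count for red.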

But all that is nothing compared to the use of numpy.histogram which does exactly what you want.
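
As a minimal sketch of what numpy.histogram returns for 8-bit data, assuming a tiny hand-written `channel` array in place of a real image channel:

```python
import numpy as np

# Hypothetical single channel: a 3x4 patch of 8-bit intensities.
channel = np.array([[0, 0, 200, 255],
                    [200, 17, 17, 17],
                    [255, 0, 200, 3]], dtype=np.uint8)

# 256 bins of width 1 over the range 0..256: bin k counts the pixels equal to k.
hist, bin_edges = np.histogram(channel, bins=256, range=(0, 256))

# np.bincount gives the same counts for non-negative integer data.
counts = np.bincount(channel.ravel(), minlength=256)
```

Here `hist[200]` is the number of pixels with intensity 200, and `bin_edges` has one more entry than `hist`, which is why plotting code typically uses `bin_edges[:-1]` as the x positions.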


You also open the same image twice without closing it and yet, you use a with statement with your third open. Always use a with statement to better manage your resources.


I would use several plt.figure() calls and a single plt.show() to get all 5 plots shown at once and be able to compare them. It will also make the whole thing more responsive.


Lastly, I would put all the code in a function that can be parametrized with the image name:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt


def histogram(image_name):
    with Image.open(image_name) as image_file:
        array_grey = np.asarray(image_file.convert('L'))
        array_rgb = np.asarray(image_file.convert('RGB'))

    plt.figure()
    plt.imshow(array_grey, cmap='gray')
    plt.figure()
    plt.imshow(array_rgb)

    for i, color in enumerate('rgb'):
        hist, bins = np.histogram(array_rgb[..., i], bins=256, range=(0, 256))
        plt.figure()
        plt.bar(bins[:-1], hist, color=color)

    plt.show()


if __name__ == '__main__':
    histogram('picture1.jpg')

(Using the ... syntax from this answer)

Or you could keep the 3 different histograms in variables for further processing:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt


def build_histograms(image_array):
    for i in range(image_array.shape[-1]):
        hist, _ = np.histogram(image_array[..., i], bins=256, range=(0, 256))
        yield hist


def histogram(image_name):
    with Image.open(image_name) as image_file:
        array_grey = np.asarray(image_file.convert('L'))
        array_rgb = np.asarray(image_file.convert('RGB'))

    plt.figure()
    plt.imshow(array_grey, cmap='gray')
    plt.figure()
    plt.imshow(array_rgb)

    red_frequencies, green_frequencies, blue_frequencies = build_histograms(array_rgb)
    grey_frequencies, = build_histograms(array_grey[..., np.newaxis])
    x_axis = np.arange(256)

    plt.figure()
    plt.bar(x_axis, red_frequencies, 1.0, color='r')
    plt.figure()
    plt.bar(x_axis, green_frequencies, 1.0, color='g')
    plt.figure()
    plt.bar(x_axis, blue_frequencies, 1.0, color='b')
    plt.figure()
    plt.bar(x_axis, grey_frequencies, 1.0, color='k')

    plt.show()


if __name__ == '__main__':
    histogram('picture1.jpg')
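
As one possible form of such further processing, the relative and cumulative frequencies that the question computes in its "Data for task ahead" loop could be derived from a histogram without a loop. This is a sketch under assumptions: `channel` is a made-up stand-in for `array_rgb[..., 0]`, and the cumulative column is read as a plain running sum, which is what the question's loop appears to compute.

```python
import numpy as np

# Made-up 2x4 channel standing in for array_rgb[..., 0].
channel = np.array([[10, 10, 20, 30],
                    [10, 20, 255, 255]], dtype=np.uint8)
hist, _ = np.histogram(channel, bins=256, range=(0, 256))

# Relative frequency of each intensity (column 2 of matrix_red in the question).
relative = hist / channel.size

# Running total of the relative frequencies (column 3 of matrix_red).
cumulative = np.cumsum(relative)
```

Dividing by `channel.size` plays the role of the question's `size = width * height`, and the last entry of `cumulative` should come out as 1 up to floating-point rounding.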
answered Nov 19, 2017 at 22:06
\$\endgroup\$
  • \$\begingroup\$ I have two questions. First, why are there white stripes in my plot? Second, the variable hist stores how many times each intensity shows up in our array. How could I save these values before we switch to the next color, so that in the end we would have 3 arrays, one for each color channel? \$\endgroup\$ Commented Nov 20, 2017 at 11:09
  • \$\begingroup\$ @RebelInc The white stripes come from the fact that the bar width, by default, is not 1.0. Change line 19 (the plt.bar call) to plt.bar(bins[:-1], hist, 1.0, color=color) to remove them. \$\endgroup\$ Commented Nov 20, 2017 at 11:16
