"Histogram of Oriented Gradients" (HOG) feature detector for computer vision

Question 1

Here is a function that I've only slightly modified from its original context, found here.

Before mentioning anything else, it should be noted that I'm desperately trying to optimize this code for speed. It presently takes about 5.25 seconds to execute and it appears as though the bottleneck is happening in the for-loop.

In a nutshell, this function expects the user to have SimpleCV installed and expects, at a minimum, to be passed a SimpleCV.Image instance.

Does anybody have some clever approach for speeding things up? Ideally I'd be able to run this on a real-time webcam feed at 30 frames-per-second, but I'm not getting my hopes up.

from itertools import product
from math import floor, pi
import numpy as np
import cv2 # opencv 2
def findHOGFeatures(img, n_divs=6, n_bins=6):
 """
 **SUMMARY**
 Get HOG(Histogram of Oriented Gradients) features from the image.
 **PARAMETERS**
 * *img* - SimpleCV.Image instance
 * *n_divs* - the number of divisions(cells).
 * *n_divs* - the number of orientation bins.
 **RETURNS**
 Returns the HOG vector in a numpy array
 """
 # Size of HOG vector
 n_HOG = n_divs * n_divs * n_bins
 # Initialize output HOG vector
 # HOG = [0.0]*n_HOG
 HOG = np.zeros((n_HOG, 1))
 # Apply sobel on image to find x and y orientations of the image
 Icv = img.getNumpyCv2()
 Ix = cv2.Sobel(Icv, ddepth=cv.CV_32F, dx=1, dy=0, ksize=3)
 Iy = cv2.Sobel(Icv, ddepth=cv.CV_32F, dx=0, dy=1, ksize=3)
 Ix = Ix.transpose(1, 0, 2)
 Iy = Iy.transpose(1, 0, 2)
 cellx = img.width / n_divs # width of each cell(division)
 celly = img.height / n_divs # height of each cell(division)
 #Area of image
 img_area = img.height * img.width
 #Range of each bin
 BIN_RANGE = (2 * pi) / n_bins
 # m = 0
 angles = np.arctan2(Iy, Ix)
 magnit = ((Ix ** 2) + (Iy ** 2)) ** 0.5
 it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
 for m, n, i, j in it:
 # grad value
 grad = magnit[m * cellx + i, n * celly + j][0]
 # normalized grad value
 norm_grad = grad / img_area
 # Orientation Angle
 angle = angles[m*cellx + i, n*celly+j][0]
 # (-pi,pi) to (0, 2*pi)
 if angle < 0:
 angle += 2 * pi
 nth_bin = floor(float(angle/BIN_RANGE))
 HOG[((m * n_divs + n) * n_bins + int(nth_bin))] += norm_grad
 return HOG.transpose()

Question 2

Can you add the missing import statements so that this is runnable, please?

Question 3

@GarethRees, done, sorry about that! Naturally, you'll still need to import SimpleCV and create an Image instance to pass to the function. I suggest doing so as follows: img = SimpleCV.Image('lenna'). Also note, that doing from SimpleCV import * will handle all imports correctly.

Question 4

Hello. :) You need to profile your code to know where the bottleneck is, and to see whether you can improve things or not. Maybe you're spending a lot of time in SimpleCV or OpenCV. Unfortunately, "it looks like the for loop is slow" doesn't help much.

Question 5

@QuentinPradet, I did in fact profile using iPython's %prun magic -- I should have mentioned that. The reason I point directly to the for-loop iteration is because the function responsible for the most cumulative execution time is math.floor, which is called exactly once per loop, and the number of calls to said function matches the number of loop iterations. I'll try GarethRees' vectorization approach and post a more detailed profiler output if needed.

Question 6

As you indicated in the question, you need to vectorize the for loop:

it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
for m, n, i, j in it:
 # grad value
 grad = magnit[m * cellx + i, n * celly + j][0]
 # normalized grad value
 norm_grad = grad / img_area
 # Orientation Angle
 angle = angles[m*cellx + i, n*celly+j][0]
 # (-pi,pi) to (0, 2*pi)
 if angle < 0:
 angle += 2 * pi
 nth_bin = floor(float(angle/BIN_RANGE))
 HOG[((m * n_divs + n) * n_bins + int(nth_bin))] += norm_grad

If you look at what this is doing, you're effectively labelling every pixel in the magnit array with a number below n_HOG, and then summing the normalized values for the pixels with each label.

Operations on labelled regions of images are jobs for the scipy.ndimage.measurements module. Here we can use scipy.ndimage.measurements.sum:

bins = (angles[...,0] % (2 * pi) / BIN_RANGE).astype(int)
x, y = np.mgrid[:width, :height]
x = x * n_divs // width
y = y * n_divs // height
labels = (x * n_divs + y) * n_bins + bins
index = np.arange(n_HOG)
HOG = scipy.ndimage.measurements.sum(magnit[...,0], labels, index)
return HOG / img_area

Notes:

I've used % (2 * pi) to get the angles in the range [0, 2π). An alternative that's more like your code would be angles[angles < 0] += 2 * pi but using the modulus is shorter and, I think, clearer.
I postponed the division by img_area until after the summation, because it looks to me as though in the common case n_HOG is much less than img_area and so it's cheaper to do the division later when there are fewer items. (This means that the results differ very slightly from your code, so bear that in mind when checking.)
I measure the vectorized version as being about 60 times faster than your for loop, but it's still not going to be fast enough to run at 30 fps!
I've written angles[...,0] and magnit[...,0] here in order to drop the third axis. But I think it would make more sense if you dropped this axis earlier, before computing angles and magnit, by writing Ix = Ix[...,0] or just Ix = Ix.reshape((height, width)) if you know that the last axis has length 1.

Update

Based on comments, it looks as if you are using Python 2.7, where the division operator / takes the floor of the result if both arguments are integers. So I've changed the code above to use:

x = x * n_divs // width
y = y * n_divs // height

which is portable between Python 2 and Python 3, and simpler than my first attempt:

x = (x / width * n_divs).astype(int)
y = (y / height * n_divs).astype(int)

Question 7

Fantastic! Thank you so much! I'll give this a whirl later on today and come back with questions if I have any.

Question 8

I've tried running your example and I'm getting zero values for all but the first 6 cells of the output array. Would you mind checking my implementation? I'm not quite sure what the problem could be. The vectorized approach is implemented in the function named findHOGFeaturesVect: paste.ubuntu.com/7004183

Question 9

Are you using Python 2? If so, see the revised answer.

Question 10

@blz can you copy the final working optimized code as an answer here?

Gareth Rees Gareth Rees 50.1k3 gold badges130 silver badges210 bronze badges · Accepted Answer · 2014-02-27 01:28:00Z

As you indicated in the question, you need to vectorize the for loop:

it = product(xrange(n_divs), xrange(n_divs), xrange(cellx), xrange(celly))
for m, n, i, j in it:
 # grad value
 grad = magnit[m * cellx + i, n * celly + j][0]
 # normalized grad value
 norm_grad = grad / img_area
 # Orientation Angle
 angle = angles[m*cellx + i, n*celly+j][0]
 # (-pi,pi) to (0, 2*pi)
 if angle < 0:
 angle += 2 * pi
 nth_bin = floor(float(angle/BIN_RANGE))
 HOG[((m * n_divs + n) * n_bins + int(nth_bin))] += norm_grad

If you look at what this is doing, you're effectively labelling every pixel in the magnit array with a number below n_HOG, and then summing the normalized values for the pixels with each label.

Operations on labelled regions of images are jobs for the scipy.ndimage.measurements module. Here we can use scipy.ndimage.measurements.sum:

bins = (angles[...,0] % (2 * pi) / BIN_RANGE).astype(int)
x, y = np.mgrid[:width, :height]
x = x * n_divs // width
y = y * n_divs // height
labels = (x * n_divs + y) * n_bins + bins
index = np.arange(n_HOG)
HOG = scipy.ndimage.measurements.sum(magnit[...,0], labels, index)
return HOG / img_area

Notes:

I've used % (2 * pi) to get the angles in the range [0, 2π). An alternative that's more like your code would be angles[angles < 0] += 2 * pi but using the modulus is shorter and, I think, clearer.
I postponed the division by img_area until after the summation, because it looks to me as though in the common case n_HOG is much less than img_area and so it's cheaper to do the division later when there are fewer items. (This means that the results differ very slightly from your code, so bear that in mind when checking.)
I measure the vectorized version as being about 60 times faster than your for loop, but it's still not going to be fast enough to run at 30 fps!
I've written angles[...,0] and magnit[...,0] here in order to drop the third axis. But I think it would make more sense if you dropped this axis earlier, before computing angles and magnit, by writing Ix = Ix[...,0] or just Ix = Ix.reshape((height, width)) if you know that the last axis has length 1.

Update

Based on comments, it looks as if you are using Python 2.7, where the division operator / takes the floor of the result if both arguments are integers. So I've changed the code above to use:

x = x * n_divs // width
y = y * n_divs // height

which is portable between Python 2 and Python 3, and simpler than my first attempt:

x = (x / width * n_divs).astype(int)
y = (y / height * n_divs).astype(int)

Fantastic! Thank you so much! I'll give this a whirl later on today and come back with questions if I have any.
I've tried running your example and I'm getting zero values for all but the first 6 cells of the output array. Would you mind checking my implementation? I'm not quite sure what the problem could be. The vectorized approach is implemented in the function named findHOGFeaturesVect: paste.ubuntu.com/7004183
@blz can you copy the final working optimized code as an answer here?

Stack Exchange Network

"Histogram of Oriented Gradients" (HOG) feature detector for computer vision

1 Answer 1

Update

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

"Histogram of Oriented Gradients" (HOG) feature detector for computer vision

1 Answer 1

Update

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions