This is a follow-up question to Image Processing Median Filter in C. Within the same tiny BMP image read/write framework, a Sobel edge detection function has been implemented.
The experimental implementation
The basic structures:
typedef struct RGB {
    unsigned char R, G, B;
} RGB;

typedef struct BMPIMAGE {
    char FILENAME[MAX_PATH];
    unsigned int XSIZE;
    unsigned int YSIZE;
    unsigned char FILLINGBYTE;
    unsigned char *IMAGE_DATA;
} BMPIMAGE;

typedef struct RGBIMAGE {
    unsigned int XSIZE;
    unsigned int YSIZE;
    RGB *IMAGE_DATA;
} RGBIMAGE;
The SobelEdge function implementation:

RGB *SobelEdge(const int xsize, const int ysize, const RGB * const image)
{
    RGB *output;
    output = malloc(xsize * ysize * sizeof(RGB));
    if (output == NULL) {
        printf("Memory allocation error!");
        return NULL;
    }
    for (long long int i = 0; i < (xsize * ysize); i++) {
        if ((i < xsize) || ((i % xsize) == 0) || (((i + 1) % xsize) == 0) || (i >= (xsize * (ysize - 1)))) {
            output[i].R = image[i].R;
            output[i].G = image[i].G;
            output[i].B = image[i].B;
        } else {
            int Gx = 0, Gy = 0;
            Gx = (image[i + xsize - 1].R * (-1) + image[i + xsize].R * 0 + image[i + xsize + 1].R * 1 +
                  image[i - 1].R * (-2)         + image[i].R * 0         + image[i + 1].R * 2 +
                  image[i - xsize - 1].R * (-1) + image[i - xsize].R * 0 + image[i - xsize + 1].R * 1);
            Gy = (image[i + xsize - 1].R * (-1) + image[i + xsize].R * (-2) + image[i + xsize + 1].R * (-1) +
                  image[i - 1].R * 0            + image[i].R * 0            + image[i + 1].R * 0 +
                  image[i - xsize - 1].R * 1    + image[i - xsize].R * 2    + image[i - xsize + 1].R * 1);
            long double magnitude, direction;
            magnitude = sqrt(pow(Gx, 2) + pow(Gy, 2));
            if (Gx > 0) {
                direction = atan((long double)Gy / (long double)Gx) * (180 / M_PI) + (long double)90;
            } else if (Gx < 0) {
                direction = atan((long double)Gy / (long double)Gx) * (180 / M_PI) + (long double)270;
            } else if ((Gx == 0) && (Gy > 0)) {
                direction = 90;
            } else if ((Gx == 0) && (Gy < 0)) {
                direction = -90;
            } else if ((Gx == 0) && (Gy == 0)) {
                direction = 0;
            }
            output[i].R = (magnitude > 255) ? 255 : (magnitude < 0) ? 0 : (unsigned char)magnitude;

            Gx = (image[i + xsize - 1].G * (-1) + image[i + xsize].G * 0 + image[i + xsize + 1].G * 1 +
                  image[i - 1].G * (-2)         + image[i].G * 0         + image[i + 1].G * 2 +
                  image[i - xsize - 1].G * (-1) + image[i - xsize].G * 0 + image[i - xsize + 1].G * 1);
            Gy = (image[i + xsize - 1].G * (-1) + image[i + xsize].G * (-2) + image[i + xsize + 1].G * (-1) +
                  image[i - 1].G * 0            + image[i].G * 0            + image[i + 1].G * 0 +
                  image[i - xsize - 1].G * 1    + image[i - xsize].G * 2    + image[i - xsize + 1].G * 1);
            magnitude = sqrt(pow(Gx, 2) + pow(Gy, 2));
            if (Gx > 0) {
                direction = atan((long double)Gy / (long double)Gx) * (180 / M_PI) + (long double)90;
            } else if (Gx < 0) {
                direction = atan((long double)Gy / (long double)Gx) * (180 / M_PI) + (long double)270;
            } else if ((Gx == 0) && (Gy > 0)) {
                direction = 90;
            } else if ((Gx == 0) && (Gy < 0)) {
                direction = -90;
            } else if ((Gx == 0) && (Gy == 0)) {
                direction = 0;
            }
            output[i].G = (magnitude > 255) ? 255 : (magnitude < 0) ? 0 : (unsigned char)magnitude;

            Gx = (image[i + xsize - 1].B * (-1) + image[i + xsize].B * 0 + image[i + xsize + 1].B * 1 +
                  image[i - 1].B * (-2)         + image[i].B * 0         + image[i + 1].B * 2 +
                  image[i - xsize - 1].B * (-1) + image[i - xsize].B * 0 + image[i - xsize + 1].B * 1);
            Gy = (image[i + xsize - 1].B * (-1) + image[i + xsize].B * (-2) + image[i + xsize + 1].B * (-1) +
                  image[i - 1].B * 0            + image[i].B * 0            + image[i + 1].B * 0 +
                  image[i - xsize - 1].B * 1    + image[i - xsize].B * 2    + image[i - xsize + 1].B * 1);
            magnitude = sqrt(pow(Gx, 2) + pow(Gy, 2));
            if (Gx > 0) {
                direction = atan((long double)Gy / (long double)Gx) * (180 / M_PI) + (long double)90;
            } else if (Gx < 0) {
                direction = atan((long double)Gy / (long double)Gx) * (180 / M_PI) + (long double)270;
            } else if ((Gx == 0) && (Gy > 0)) {
                direction = 90;
            } else if ((Gx == 0) && (Gy < 0)) {
                direction = -90;
            } else if ((Gx == 0) && (Gy == 0)) {
                direction = 0;
            }
            output[i].B = (magnitude > 255) ? 255 : (magnitude < 0) ? 0 : (unsigned char)magnitude;
        }
    }
    return output;
}
Input Image:
Output Image:
All suggestions are welcome.
The summary information:

Which question is it a follow-up to?
Image Processing Median Filter in C.

What changes have been made to the code since the last question?
Besides creating the MedianFilter33 function, the function for Sobel edge detection is the main part here.

Why is a new review being asked for?
If there is any possible improvement, please let me know.
Comment (user555045, Jun 10, 2021 at 20:01): Where do the directions go? You're computing them, but it seems like they're forgotten about.
3 Answers
Avoid code duplication
Whenever you repeat some pattern multiple times, try to find some way to remove the repetition. I see you are doing several 3x3 convolutions. Consider writing a function that does this:
static int convolve(const struct RGB *pixels, int w, int color, const int *coeffs) {
    const struct {
        unsigned char v[3];
    } *p = (void *)pixels;
    return p[-w - 1].v[color] * coeffs[0] + p[-w].v[color] * coeffs[1] + p[-w + 1].v[color] * coeffs[2]
         + p[-1].v[color]     * coeffs[3] + p[0].v[color]  * coeffs[4] + p[1].v[color]     * coeffs[5]
         + p[w - 1].v[color]  * coeffs[6] + p[w].v[color]  * coeffs[7] + p[w + 1].v[color] * coeffs[8];
}
And then use it like so:
Gx = convolve(image + i, xsize, 0, (int[9]){-1, 0, 1, -2, 0, 2, -1, 0, 1});
Gy = convolve(image + i, xsize, 0, (int[9]){-1, -2, -1, 0, 0, 0, 1, 2, 1});
Most compilers, when optimizations are enabled, will be able to inline this and generate the same assembly as before.
Also write a function to calculate the magnitude. Then you're left with three repetitions, one for each of red, green and blue. If RGB would store an array instead of three separate variables, then you could just wrap things in a loop. See also the little cheat I did in convolve() to make it work for any of the three colors.
Avoid expensive math operations
Your calculation of the magnitude is quite expensive; it converts ints to doubles, then does square and square root operations (note that you could have just used hypot()), then the result is stored in a long double for no good reason, and after clamping it is converted back to an unsigned char.
float precision is all you need here, so I would write:
float magnitude = hypotf(Gx, Gy);
output[i].R = magnitude > 255 ? 255 : magnitude < 0 ? 0 : magnitude;
Also, why are you calculating direction when it is never used? The compiler will optimize it away, but it's still unnecessary code. (Also note that you could've used atan2() to get the direction in one go, without having to worry about quadrants.)
Use size_t for sizes, counts and indices
You use int for some dimensions, but long long int for the index i. An int might not be large enough, since it's only guaranteed to be at least 16 bits, but a long long int is guaranteed to be at least 64 bits, so it is overkill on 32-bit systems.
Always use size_t for sizes, counts and indices, as it is guaranteed to be big enough to handle everything that can be stored in memory. Exceptions to this are when dealing with file sizes (use the same type as that used by functions that return file sizes), or when you are sure your data structures will have a limited size and you need to limit how much memory you spend on storing sizes.
Going faster
For small convolutions like the one you are doing, the naive method could be the fastest. However, do notice that your convolutions themselves have a pattern in them that could be exploited. For example, you could for every pixel calculate the convolutions with {1, 2, 1}, once horizontally and once vertically, and then you just need to subtract two of those 1x3 convolutions to get the final 3x3 convolution. If you do this in a way where you don't have to store the intermediate results for the whole image, but just keep a sliding window that fits in your processor's L2 cache, it might be faster.
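The separable trick described above could look roughly like this for a single grayscale channel. The function name sobel_separable is made up for this sketch, and for clarity the intermediate sums cover the whole image; the sliding-window version mentioned above would keep only three rows of them. The sign conventions may differ from the original code, but the magnitudes are unaffected:

```c
#include <stdlib.h>

/* Sketch: Sobel via separable 1x3 convolutions.
 * sv = vertical {1,2,1} smoothing sums, sh = horizontal {1,2,1} sums.
 * Each 3x3 Sobel response is then just a difference of two 1x3 sums. */
static void sobel_separable(const unsigned char *in, int w, int h,
                            int *gx, int *gy)
{
    int *sv = malloc(sizeof *sv * w * h);
    int *sh = malloc(sizeof *sh * w * h);
    if (!sv || !sh) { free(sv); free(sh); return; }

    for (int y = 1; y < h - 1; y++)
        for (int x = 0; x < w; x++)
            sv[y*w + x] = in[(y-1)*w + x] + 2*in[y*w + x] + in[(y+1)*w + x];
    for (int y = 0; y < h; y++)
        for (int x = 1; x < w - 1; x++)
            sh[y*w + x] = in[y*w + x - 1] + 2*in[y*w + x] + in[y*w + x + 1];

    for (int y = 1; y < h - 1; y++)
        for (int x = 1; x < w - 1; x++) {
            gx[y*w + x] = sv[y*w + x + 1] - sv[y*w + x - 1];
            gy[y*w + x] = sh[(y+1)*w + x] - sh[(y-1)*w + x];
        }
    free(sv); free(sh);
}
```

Each intermediate sum is reused by three output pixels, which is where the savings come from once the working set fits in cache.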
atan() is good for half a circle, atan2() for a full circle
atan() returns a value in [-pi/2 ... pi/2]. For a full-circle result in [-pi ... pi], use atan2() (and drop the long double, which provides no benefit here). Code can then skip the rest of the if() ... else chain.
// direction = atan((long double)Gy / (long double)Gx) * (180 / M_PI) + (long double)90;
double direction = atan2(Gy, Gx) * (180 / M_PI) + 90;
Avoid FP math when integer math will do
// long double magnitude = sqrt(pow(Gx, 2) + pow(Gy, 2));
int magnitude = isqrt(1LL*Gx*Gx + 1LL*Gy*Gy);
Easy enough to find/code a good int isqrt(long long).
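One possible way to code it is a binary search on the result; isqrt_ll is an illustrative name for this sketch, not a standard function:

```c
/* Sketch of an integer square root: returns floor(sqrt(n)) for n >= 0,
 * via binary search on the answer. No floating point involved. */
static long long isqrt_ll(long long n)
{
    if (n < 2)
        return n < 0 ? -1 : n;          /* -1 flags invalid input */
    long long lo = 1, hi = 3037000499LL; /* floor(sqrt(LLONG_MAX)) */
    while (lo < hi) {
        long long mid = lo + (hi - lo + 1) / 2;
        if (mid <= n / mid)   /* equivalent to mid*mid <= n, no overflow */
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}
```

For the Sobel magnitudes here (at most 2 * 1020^2), the search terminates in a few dozen iterations; a production version could narrow the initial bounds or use Newton's method.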
Avoid "late" wide types
Here, with long double magnitude = sqrt(...) and int xsize, int ysize, ... for (long long int i = 0; i < (xsize * ysize), OP has twice used a wider type to store a narrow computation.
Saving a double into a long double does not make for a more precise calculation. Saving an int into a long does not afford more integer computation range.
int vs. size_t math
See prior review.
// output = malloc(xsize * ysize * sizeof(RGB));
output = malloc(sizeof *output * xsize * ysize);
Credit
The photo is the decades-old classic Lenna and deserves citation.
Fixed-size types
Rather than unsigned for a struct that stores a well-known file format, use uintN_t types for greater portability.
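A minimal sketch of what that could look like for the question's BMPIMAGE struct; the MAX_PATH fallback value of 260 is an assumption, since the OP's framework header is not shown:

```c
#include <stdint.h>

#ifndef MAX_PATH
#define MAX_PATH 260   /* assumed; normally supplied by the OP's framework */
#endif

/* Fixed-width version of the struct: field sizes no longer depend on
 * the platform's definition of unsigned int / unsigned char. */
typedef struct BMPIMAGE {
    char      FILENAME[MAX_PATH];
    uint32_t  XSIZE;         /* was unsigned int  */
    uint32_t  YSIZE;         /* was unsigned int  */
    uint8_t   FILLINGBYTE;   /* was unsigned char */
    uint8_t  *IMAGE_DATA;
} BMPIMAGE;
```

This matters most when the struct's fields are read from or written to a byte-exact file format such as BMP.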
There is one thing I thought of when reading your code that hasn’t yet been mentioned in the other two reviews:
for (long long int i = 0; i < (xsize * ysize); i++)
{
    if ((i < xsize) || ((i % xsize) == 0) || (((i + 1) % xsize) == 0) || (i >= (xsize * (ysize - 1))))
You are looping over the linear index, then computing the x and y coordinates to check for out-of-bounds access. It would be more efficient to loop over x and y, then compute the index from those. A multiplication is cheaper than a division (and the modulo operator comes for free with the division, I believe, on modern processors).
If you are looping over the coordinates, you can even loop over the first and last image lines separately, and so avoid the condition within the loop altogether:
// loop over first line
// loop over all but first and last line
//     handle first pixel on line
//     loop over all but first and last pixel
//     handle last pixel on line
// loop over last line
Avoiding the condition inside the loop will make your code significantly faster.
For the pixels along the outer boundary of the image, you copy the input values. The correct thing to do would be to estimate the gradient using fewer pixels (e.g. by using a [1,-1] kernel instead of a [1,0,-1] kernel, or by guessing a value for the pixels just outside the image boundary). Alternatively, fill in zeros, which is a more useful default than whatever pixel values were there in the input image.