I am using OpenCV 2.4.10 to implement a project that segments the head from a video sequence captured by a camera. First, I detect the head region, and then I apply a segmentation method to that ROI. For high segmentation accuracy, I chose the GrabCut method. However, it is very slow: I only achieve about 2 frames/second, even though I downsample the image first.
I have two questions:
Is there a faster method than GrabCut with similar accuracy? Alternatively, is there another way to segment the head region?
Could you look at my code and suggest ways to make it faster?
#include <iostream>
#include <string>
#include <time.h>
//include opencv core
#include "opencv2\core\core.hpp"
#include "opencv2\contrib\contrib.hpp"
#include "opencv2\highgui\highgui.hpp"
#include "opencv2\objdetect\objdetect.hpp"
#include "opencv2\opencv.hpp"
//file handling
#include <fstream>
#include <sstream>
using namespace std;
using namespace cv;
//Functions
int VideoDisplay();
Mat GrabCut(Mat image);
const unsigned int BORDER = 5;
const unsigned int BORDER2 = BORDER + BORDER;
int main()
{
int value=VideoDisplay();
system("pause");
return 0;
}
Mat GrabCut(Mat image)
{
clock_t tStart_all = clock();
cv::Mat result; // segmentation result (4 possible values)
cv::Mat bgModel,fgModel; // the models (internally used)
// downsample the image
cv::Mat downsampled;
cv::pyrDown(image, downsampled, cv::Size(image.cols/2, image.rows/2));
cv::Rect rectangle(BORDER,BORDER,downsampled.cols-BORDER2,downsampled.rows-BORDER2);
clock_t tStart = clock();
// GrabCut segmentation
cv::grabCut(downsampled, // input image
result, // segmentation result
rectangle,// rectangle containing foreground
bgModel,fgModel, // models
1, // number of iterations
cv::GC_INIT_WITH_RECT); // use rectangle
printf("Time taken by GrabCut with downsampled image: %f s\n", (clock() - tStart)/(double)CLOCKS_PER_SEC);
// Get the pixels marked as likely foreground
cv::compare(result,cv::GC_PR_FGD,result,cv::CMP_EQ);
// upsample the resulting mask
cv::Mat resultUp;
cv::pyrUp(result, resultUp, cv::Size(result.cols*2, result.rows*2));
// Generate output image
cv::Mat foreground(image.size(),CV_8UC3,cv::Scalar(255,255,255));
image.copyTo(foreground,resultUp); // bg pixels not copied
return foreground;
}
int VideoDisplay(){
cout << "start recognizing..." << endl;
//lbpcascades/lbpcascade_frontalface.xml
string classifier = "C:/opencv/sources/data/haarcascades/haarcascade_frontalface_default.xml";
CascadeClassifier face_cascade;
string window = "Capture - face detection";
if (!face_cascade.load(classifier)){
cout << " Error loading file" << endl;
return -1;
}
VideoCapture cap(0);
//VideoCapture cap("C:/Users/lsf-admin/Pictures/Camera Roll/video000.mp4");
if (!cap.isOpened())
{
cout << "exit" << endl;
return -1;
}
//double fps = cap.get(CV_CAP_PROP_FPS);
//cout << " Frames per seconds " << fps << endl;
namedWindow(window, 1);
long count = 0;
int fps=0;
//Start and end times
time_t start,end;
//Start the clock
time(&start);
int counter=0;
while (true)
{
vector<Rect> faces;
Mat frame;
Mat graySacleFrame;
Mat original;
cap >> frame;
time(&end);
++counter;
double sec=difftime(end,start);
fps=counter/sec;
if (!frame.empty()){
//clone from original frame
original = frame.clone();
//convert image to gray scale and equalize
cvtColor(original, graySacleFrame, CV_BGR2GRAY);
//equalizeHist(graySacleFrame, graySacleFrame);
//detect face in gray image
face_cascade.detectMultiScale(graySacleFrame, faces, 1.1, 3, 0, cv::Size(90, 90));
//number of faces detected
//cout << faces.size() << " faces detected" << endl;
std::string frameset = std::to_string(fps);
std::string faceset = std::to_string(faces.size());
int width = 0, height = 0;
cv::Mat seg_grabcut;
//region of interest
for (int i = 0; i < faces.size(); i++)
{
//region of interest
Rect face_i = faces[i];
////crop the roi from grya image
//Mat face = graySacleFrame(face_i);
Mat crop = original(face_i);
////resizing the cropped image to suit to database image sizes
Mat face_resized;
cv::resize(crop, face_resized, Size(512,512), 1.0, 1.0, INTER_CUBIC);
//drawing green rectagle in recognize face
rectangle(original, face_i, CV_RGB(0, 255, 0), 1);
if(!face_resized.empty())
{
seg_grabcut=GrabCut(face_resized);
if (!seg_grabcut.empty())
{
imshow("segmented result", seg_grabcut);
}
}
}
putText(original, "Frames/Second: " + frameset, Point(30, 60), CV_FONT_HERSHEY_COMPLEX_SMALL, 1.0, CV_RGB(0, 255, 0), 1.0);
//display to the winodw
cv::imshow(window, original);
//cout << "model infor " << model->getDouble("threshold") << endl;
}
if (waitKey(30) >= 0) break;
}
}
1 Answer
Correctness
You have a subtle bug in your code. When down- and upsampling your image with pyrDown() and pyrUp(), you compute the image size with integer division. This loses a pixel whenever an image dimension is odd. You can fix this by storing the full image size in a variable:
const auto fullSize = image.size();
and using that as the dstsize argument for pyrUp().
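The arithmetic behind the lost pixel can be checked without OpenCV. The following is a minimal sketch, assuming a hypothetical Size2 struct as a stand-in for cv::Size; halved(), doubled() and roundTripPreservesSize() are illustrative names, not OpenCV functions.

```cpp
#include <cassert>

// Hypothetical stand-in for cv::Size, so the sketch compiles without OpenCV.
struct Size2 {
    int width;
    int height;
};

// Size the question's code passes to pyrDown(): each dimension halved
// with integer division, which truncates odd values.
Size2 halved(Size2 full) {
    assert(full.width > 0 && full.height > 0);
    return Size2{full.width / 2, full.height / 2};
}

// Size the question's code passes to pyrUp(): the halved size doubled.
Size2 doubled(Size2 half) {
    return Size2{half.width * 2, half.height * 2};
}

// True when halving and then doubling reproduces the original size,
// which only happens when both dimensions are even.
bool roundTripPreservesSize(Size2 full) {
    const Size2 back = doubled(halved(full));
    return back.width == full.width && back.height == full.height;
}
```

For a 513x481 input, the round trip gives 512x480, so the upsampled mask no longer matches the image it is meant to mask. Passing the stored full size as the target avoids the mismatch.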
Compiler errors
VideoDisplay() is declared to return an int, but there is a code path (after the loop exits via break) which does not return anything. Depending on the compiler, this is reported as an error or only as a warning, but either way flowing off the end of a non-void function is undefined behavior. Since it looks like you are using the return value to report whether an error occurred, you should add a return 0; at the end of the function to indicate success. You should also return value; in main() to signal the presence of an error.
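A rough sketch of the control flow this describes follows. The names runVideoDisplay() and runProgram() are hypothetical, and the two boolean flags stand in for the real cascade-loading and camera-opening checks so that the example runs without a camera or OpenCV.

```cpp
#include <cassert>

// Hypothetical stand-in for VideoDisplay(): the flags replace the real
// face_cascade.load() and cap.isOpened() checks.
int runVideoDisplay(bool cascadeLoaded, bool cameraOpened) {
    if (!cascadeLoaded) {
        return -1;  // same error code the question uses
    }
    if (!cameraOpened) {
        return -1;
    }
    // ... the capture loop would run here and eventually break ...
    return 0;  // success: every path now returns a value
}

// Hypothetical equivalent of main(): forward the status instead of
// discarding it and always returning 0.
int runProgram(bool cascadeLoaded, bool cameraOpened) {
    const int value = runVideoDisplay(cascadeLoaded, cameraOpened);
    assert(value == 0 || value == -1);  // the only statuses used here
    return value;
}
```

With this shape, a failed cascade load or camera open becomes a non-zero exit status that a calling script can detect.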
Performance
You ask about a better method than GrabCut for segmentation. GrabCut does graph-based segmentation, and is the only algorithm of this type in OpenCV. Suggesting an alternative algorithm is probably beyond the scope of Code Review.
Unnecessary copy
The line in your code
original = frame.clone();
copies frame, but you never use frame again. This copy is unnecessary, and costs you a slight bit of performance. You can remove the declaration of original and replace its usages with frame with no problems.
Replace pyrUp and pyrDown for sampling
Both functions do a Gaussian smoothing of the image during resizing, which may be unnecessary extra work. You can use resize() unless you have a good reason to prefer pyrUp. It's less expensive computationally, and lets you easily downsample by a factor other than 2. A factor of 4 gives a better frame rate; I get about 10 fps. It's still not real-time, but it's a decent speed increase for only a modest drop in image quality.
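To illustrate why an arbitrary factor is convenient, here is a sketch of the size arithmetic and of scaling a mask back up to an explicit target size. It is plain C++ standing in for the two cv::resize() calls, not OpenCV code: Mask, downsampledSize() and upsampleNearest() are hypothetical names, and nearest-neighbour sampling is an assumption chosen because it keeps a binary mask binary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-channel image: row-major pixels, one byte each.
struct Mask {
    int width;
    int height;
    std::vector<unsigned char> pixels;
};

// Dimension after shrinking by `factor`, never smaller than one pixel.
int downsampledSize(int fullSize, int factor) {
    assert(fullSize > 0 && factor > 0);
    const int reduced = fullSize / factor;
    return reduced > 0 ? reduced : 1;
}

// Nearest-neighbour upsampling to an explicit target size. Passing the
// original image size means the result always matches the image.
Mask upsampleNearest(const Mask& src, int dstWidth, int dstHeight) {
    assert(dstWidth > 0 && dstHeight > 0);
    Mask dst{dstWidth, dstHeight,
             std::vector<unsigned char>(
                 static_cast<std::size_t>(dstWidth) * dstHeight)};
    for (int y = 0; y < dstHeight; ++y) {
        const int sy = y * src.height / dstHeight;  // source row
        for (int x = 0; x < dstWidth; ++x) {
            const int sx = x * src.width / dstWidth;  // source column
            dst.pixels[static_cast<std::size_t>(y) * dstWidth + x] =
                src.pixels[static_cast<std::size_t>(sy) * src.width + sx];
        }
    }
    return dst;
}
```

With a factor of 4, the 512x512 face crop from the question shrinks to 128x128 before segmentation, and the resulting mask is scaled straight back to 512x512.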
Code Style
There are a number of issues with your code style. First, some quick notes:
Your use of whitespace and brace placement is inconsistent. Pick one style and stick with it.
Pay attention to warnings: you have a number of unused variables which clutter your code. Remove them.
Reduce the scope of your variables as much as you can. That means declaring them as close to where they are used as possible. This makes your program easier to reason about, since there are fewer variables floating around in each scope.
Declare variables const any time you want a value that cannot change. This makes your code easier to reason about, since there are fewer values which can change.
Avoid using namespace std. I would extend this advice to namespace cv as well. Sure, it saves you some typing, but the increase in clarity is worth it.
Since you're using C++11, you can use a range-based for loop to simplify your code. for (int i = 0; i < faces.size(); i++) becomes for (const Rect& face_i : faces), which states your intent better.
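Several of the notes above can be seen together in a small sketch. The Rect struct below is a hypothetical stand-in for cv::Rect, and largestFace() is an illustrative helper that is not part of the original code; the point is the range-based loop, the const qualifiers, the narrow variable scope and the absence of using namespace.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Rect so the sketch compiles without OpenCV.
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Returns the largest detection by area, or nullptr if there are none.
const Rect* largestFace(const std::vector<Rect>& faces) {
    const Rect* best = nullptr;
    for (const Rect& face : faces) {
        assert(face.width >= 0 && face.height >= 0);
        const int area = face.width * face.height;
        if (best == nullptr || area > best->width * best->height) {
            best = &face;
        }
    }
    return best;
}
```

The range-based loop also sidesteps the signed/unsigned comparison between int i and faces.size() in the original loop.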
Waiting for input
You use system("pause") to pause the program. This isn't portable, since it's Windows-specific. You can use getchar() instead to achieve the same effect while maintaining portability.
Include paths
Many of your OpenCV includes are redundant. <opencv2/opencv.hpp> automatically includes the other OpenCV headers, so you can remove them. Also, your includes should be in angle brackets, not quotes. Using quotes in your includes indicates that the files are local to your project. Others who have OpenCV installed may need to adjust your includes to get the code to compile if you use quotes.
-
Compiler errors: on int f(int n){if(n > 0) return n;} MSVC issues just a warning 'not all control paths return a value', although int fun(){} causes an error 'must return a value'. – CiaPan, Jul 11, 2018 at 7:45
-
Please find a header name in the first line of the Include paths section and surround it with back-quotes to make it rendered as a piece of source code. Without those quotes the name in angle brackets goes to a generated web page 'as is' and becomes an unrecognized HTML tag. As a result it disappears from the rendered page, filtered either by the StackExchange software or by my web browser. – CiaPan, Jul 11, 2018 at 21:15
-
I tried to fix it myself, but my attempt has been rejected with #*! comments like 'This edit does not make the post even a little bit easier to read (...) Changes are either completely superfluous or actively harm readability' (making the subject of the sentence visible isn't an improvement, it harms readability!) and 'This edit deviates from the original intent of the post' (so reviewers know you mischievously made your post harder to understand!). – CiaPan, Jul 11, 2018 at 21:15