We have some video analysis software written in C# (.NET) that uses OpenCV via the Emgu.CV wrappers. The video frames come from a GigE Vision camera (not a normal capture device); they are analysed, graphically annotated, and then encoded to a video file.
Previously we used the OpenCV VideoWriter class to encode the video. However, VideoWriter uses Video for Windows codecs and often corrupts the indexing of the output file.
After much searching I have yet to find another .NET implementation of encoding frames to H.264 video, so I decided to write my own. The code below is based on the Media Foundation C++ Sink Writer tutorial and implemented in .NET using the MediaFoundation.NET wrapper.
The main changes I have made are:
- Everything is in a single thread, due to problems accessing the `WriteFrame` method from other threads. I believe this is due to interacting with the underlying COM object, but I've no experience with that.
- New frames are passed to the thread using a `BlockingCollection`.
- `IDisposable` was implemented to make sure `Stop()` is called.
Some questions:
- Is the thread implementation using `CancellationTokenSource` appropriate?
- Is `BlockingCollection` the best way to pass the frames in?
- Is it possible to reuse the `IMFMediaBuffer` and `IMFSample` objects? If so, should I do this? Will it improve efficiency?
- Is the implementation of `IDisposable` correct?
Code:
class MFVideoEncoder : IDisposable
{
private int videoBitRate = 800000;
const int VIDEO_FPS = 30;
const int BYTES_PER_PIXEL = 3;
const long TICKS_PER_SECOND = 10 * 1000 * 1000;
const long VIDEO_FRAME_DURATION = TICKS_PER_SECOND / VIDEO_FPS;
public bool HasStarted = false;
private IMFSinkWriter sinkWriter;
private int streamIndex = 0;
private int frameSizeBytes = 0;
private long frames = 0;
private int videoWidth = 0;
private int videoHeight = 0;
private string outputFile = "//output.mp4";
private CancellationTokenSource encodeTaskCTS;
private Thread encodeThread;
BlockingCollection<Emgu.CV.Mat> FrameQueue = new BlockingCollection<Emgu.CV.Mat>();
public MFVideoEncoder()
{
}
public void Start(String outputFile, int width, int height, int bitRate)
{
this.videoWidth = width;
this.videoHeight = height;
this.outputFile = outputFile;
this.videoBitRate = bitRate;
frames = 0;
frameSizeBytes = BYTES_PER_PIXEL * videoWidth * videoHeight;
HasStarted = false;
encodeTaskCTS?.Dispose();
encodeTaskCTS = new CancellationTokenSource();
var token = encodeTaskCTS.Token;
encodeThread = new Thread(() => EncodeTask(token));
encodeThread.Priority = ThreadPriority.Highest;
//encodeThread.SetApartmentState(ApartmentState.STA);
encodeThread.Start();
}
public void Start(String outputFile, int width, int height, double compressionFactor)
{
int bitRate = (int) (VIDEO_FPS * width * height * BYTES_PER_PIXEL / compressionFactor);
Console.WriteLine("# Bit rate: {0}", bitRate);
Start(outputFile, width, height, bitRate);
}
public void Stop()
{
if (HasStarted)
{
encodeTaskCTS.Cancel();
}
}
public void AddFrame(Mat frame)
{
Mat flippedFrame = new Mat(frame.Size, frame.Depth, frame.NumberOfChannels);
CvInvoke.Flip(frame, flippedFrame, Emgu.CV.CvEnum.FlipType.Vertical);
FrameQueue.TryAdd(flippedFrame);
}
private void EncodeTask(CancellationToken token)
{
Mat frame;
// Start up
int hr = MFExtern.MFStartup(0x00020070, MFStartup.Full);
if (Succeeded(hr))
{
hr = InitializeSinkWriter(outputFile, videoWidth, videoHeight);
}
HasStarted = Succeeded(hr);
// Check encoder running
if (!HasStarted)
{
Console.WriteLine("! Encode thread didn't start");
return;
}
//Write frames
var exit = false;
while (!exit)
{
try
{
token.ThrowIfCancellationRequested();
if (FrameQueue.TryTake(out frame, 200))
{
WriteFrame(frame);
}
}
catch (Exception ex)
{
Console.WriteLine("! Thread exit: " + ex.Message);
exit = true;
}
}
//Clean up
sinkWriter.Finalize_();
COMBase.SafeRelease(sinkWriter);
MFExtern.MFShutdown();
}
private int InitializeSinkWriter(String outputFile, int videoWidth, int videoHeight)
{
IMFMediaType mediaTypeIn = null;
IMFMediaType mediaTypeOut = null;
IMFAttributes attributes = null;
int hr = 0;
if (Succeeded(hr)) hr = MFExtern.MFCreateAttributes(out attributes, 1);
if (Succeeded(hr)) hr = attributes.SetUINT32(MFAttributesClsid.MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS, 1);
//if (Succeeded(hr)) hr = attributes.SetUINT32(MFAttributesClsid.MF_SINK_WRITER_DISABLE_THROTTLING, 1);
if (Succeeded(hr)) hr = attributes.SetUINT32(MFAttributesClsid.MF_LOW_LATENCY, 1);
// Create the sink writer
if (Succeeded(hr)) hr = MFExtern.MFCreateSinkWriterFromURL(outputFile, null, attributes, out sinkWriter);
// Create the output type
if (Succeeded(hr)) hr = MFExtern.MFCreateMediaType(out mediaTypeOut);
if (Succeeded(hr)) hr = mediaTypeOut.SetGUID(MFAttributesClsid.MF_MT_MAJOR_TYPE, MFMediaType.Video);
if (Succeeded(hr)) hr = mediaTypeOut.SetGUID(MFAttributesClsid.MF_MT_SUBTYPE, MFMediaType.H264);
if (Succeeded(hr)) hr = mediaTypeOut.SetUINT32(MFAttributesClsid.MF_MT_AVG_BITRATE, videoBitRate);
if (Succeeded(hr)) hr = mediaTypeOut.SetUINT32(MFAttributesClsid.MF_MT_INTERLACE_MODE, (int) MFVideoInterlaceMode.Progressive);
if (Succeeded(hr)) hr = MFExtern.MFSetAttributeSize(mediaTypeOut, MFAttributesClsid.MF_MT_FRAME_SIZE, videoWidth, videoHeight);
if (Succeeded(hr)) hr = MFExtern.MFSetAttributeRatio(mediaTypeOut, MFAttributesClsid.MF_MT_FRAME_RATE, VIDEO_FPS, 1);
if (Succeeded(hr)) hr = MFExtern.MFSetAttributeRatio(mediaTypeOut, MFAttributesClsid.MF_MT_PIXEL_ASPECT_RATIO, 1, 1);
if (Succeeded(hr)) hr = sinkWriter.AddStream(mediaTypeOut, out streamIndex);
// Create the input type
if (Succeeded(hr)) hr = MFExtern.MFCreateMediaType(out mediaTypeIn);
if (Succeeded(hr)) hr = mediaTypeIn.SetGUID(MFAttributesClsid.MF_MT_MAJOR_TYPE, MFMediaType.Video);
if (Succeeded(hr)) hr = mediaTypeIn.SetGUID(MFAttributesClsid.MF_MT_SUBTYPE, MFMediaType.RGB24);
if (Succeeded(hr)) hr = mediaTypeIn.SetUINT32(MFAttributesClsid.MF_MT_INTERLACE_MODE, (int)MFVideoInterlaceMode.Progressive);
if (Succeeded(hr)) hr = MFExtern.MFSetAttributeSize(mediaTypeIn, MFAttributesClsid.MF_MT_FRAME_SIZE, videoWidth, videoHeight);
if (Succeeded(hr)) hr = MFExtern.MFSetAttributeRatio(mediaTypeIn, MFAttributesClsid.MF_MT_FRAME_RATE, VIDEO_FPS, 1);
if (Succeeded(hr)) hr = MFExtern.MFSetAttributeRatio(mediaTypeIn, MFAttributesClsid.MF_MT_PIXEL_ASPECT_RATIO, 1, 1);
if (Succeeded(hr)) hr = sinkWriter.SetInputMediaType(streamIndex, mediaTypeIn, null);
// Start accepting data
if (Succeeded(hr)) hr = sinkWriter.BeginWriting();
COMBase.SafeRelease(mediaTypeIn);
COMBase.SafeRelease(mediaTypeOut);
return hr;
}
private int WriteFrame(Mat frame)
{
if (!HasStarted) return -1;
IMFSample sample = null;
IMFMediaBuffer buffer = null;
IntPtr data = new IntPtr();
int bufferMaxLength;
int bufferCurrentLength;
int hr = MFExtern.MFCreateMemoryBuffer(frameSizeBytes, out buffer);
if (Succeeded(hr)) hr = buffer.Lock(out data, out bufferMaxLength, out bufferCurrentLength);
if (Succeeded(hr))
{
using (AutoPinner ap = new AutoPinner(frame.Data))
{
hr = MFExtern.MFCopyImage(data, videoWidth * BYTES_PER_PIXEL, frame.DataPointer, videoWidth * BYTES_PER_PIXEL, videoWidth * BYTES_PER_PIXEL, videoHeight);
}
}
if (Succeeded(hr)) hr = buffer.Unlock();
if (Succeeded(hr)) hr = buffer.SetCurrentLength(frameSizeBytes);
if (Succeeded(hr)) hr = MFExtern.MFCreateSample(out sample);
if (Succeeded(hr)) hr = sample.AddBuffer(buffer);
if (Succeeded(hr)) hr = sample.SetSampleTime(TICKS_PER_SECOND * frames / VIDEO_FPS);
if (Succeeded(hr)) hr = sample.SetSampleDuration(VIDEO_FRAME_DURATION);
if (Succeeded(hr)) hr = sinkWriter.WriteSample(streamIndex, sample);
if (Succeeded(hr)) frames++;
COMBase.SafeRelease(sample);
COMBase.SafeRelease(buffer);
return hr;
}
private bool Succeeded(int hr)
{
return hr >= 0;
}
#region IDisposable Support
private bool disposedValue = false;
protected virtual void Dispose(bool disposing)
{
if (!disposedValue)
{
if (disposing)
{
if (HasStarted)
{
Stop();
}
}
disposedValue = true;
}
}
public void Dispose()
{
Dispose(true);
}
#endregion
}
1 Answer
Well I got this code working, so kudos for writing code that works. I don't have a lot of experience with Media Foundation or EmguCV, but I did want to share a few observations on your implementation.
It appears to be a feature of this implementation that `Start` and `Stop` can be called more than once, possibly to produce several output files over the lifecycle of a single `MFVideoEncoder` instance. However, this is weird and probably not completely implemented.
- What happens when you call `Start` a second time? Your `encodeTaskCTS` is disposed, which is nice, but it is not cancelled first, and you have also handed that cancellation token to another thread which is probably still alive, so there's a resource which may be referenced after disposal.
- Several fields, such as `HasStarted` and `frames`, are written in both `encodeThread` and the calling thread. They're not set up for cross-thread access (e.g. a safe order of operations with locking), so this is almost certainly not going to work correctly unless `Start` is called once and only once over the `MFVideoEncoder` lifecycle.
- `encodeThread` will be set to the reference to the newly created thread, so the previous thread will be GC'd at some unpredictable point in the future.
- `FrameQueue` doesn't get cleared, so your subsequent output files will probably contain frames from previous batches.
- Recommendation: It doesn't appear to be particularly expensive to instantiate new `MFVideoEncoder`s, so I'd say refactor it to enforce one output file per lifecycle. The simplest diff would be to make `Start` and `Stop` private, and call `Start` from your constructor. Then everything will be a lot safer.
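To make the "one output file per lifecycle" idea concrete, here is a minimal sketch. The class name `SingleUseEncoder` and the `encodeLoop` delegate are hypothetical stand-ins (the delegate plays the role of `EncodeTask`), and no Media Foundation calls are involved. The point is only that the thread and the `CancellationTokenSource` are created once in the constructor, and that `Dispose` cancels, joins and only then disposes.

```csharp
using System;
using System.Threading;

// Hypothetical single-use wrapper: one instance corresponds to one output file.
public sealed class SingleUseEncoder : IDisposable
{
    private readonly Thread worker;
    private readonly CancellationTokenSource cts = new CancellationTokenSource();
    private int disposed; // 0 = live, 1 = disposed

    public string OutputFile { get; }

    // encodeLoop is an assumed stand-in for the body of EncodeTask.
    public SingleUseEncoder(string outputFile, Action<CancellationToken> encodeLoop)
    {
        OutputFile = outputFile ?? throw new ArgumentNullException(nameof(outputFile));
        if (encodeLoop == null) throw new ArgumentNullException(nameof(encodeLoop));

        CancellationToken token = cts.Token;
        worker = new Thread(() => encodeLoop(token));
        worker.Start();
    }

    public void Dispose()
    {
        // Make Dispose idempotent, even when called from several threads.
        if (Interlocked.Exchange(ref disposed, 1) != 0) return;

        cts.Cancel();   // ask the loop to stop first
        worker.Join();  // wait until the thread no longer uses the token
        cts.Dispose();  // only now is it safe to dispose the source
    }
}
```

Because nothing can be restarted, there is no second `Start` that could dispose a token still in use or leave stale frames behind.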
I think a `BlockingCollection` is perfectly fine to move frames onto the encoding thread.
- You can actually take advantage of it more than you are already. For example, you could use `IsCompleted` in your `while` loop in `EncodeTask`, instead of doing `token.ThrowIfCancellationRequested()`. This would have the extra benefit of draining the frame queue when `Stop` is called, instead of instantaneously halting frame processing as it does now.
- It's possible to call `AddFrame` after `Stop`, currently. But if `Stop` calls `FrameQueue.CompleteAdding()`, then this will be prevented automatically.
- `AddFrame` returns `void`, but `FrameQueue.TryAdd` reports whether adding the frame was successful or not. So your API consumers don't know whether their frames will be processed. If your scenario legitimately allows for frames not to be added for processing, then that sounds important enough to report back to the caller.
- Recommendation: Use more of `BlockingCollection`. Also, either change the return type of `AddFrame` to `bool` and return the result of `FrameQueue.TryAdd`, or use more of `BlockingCollection` and model add failures as exceptions thrown by `BlockingCollection`.
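The points above could look roughly like this. `FramePump<T>` is a hypothetical name, the bounded capacity of 64 is an arbitrary assumption, and `writeFrame` stands in for `WriteFrame`. It is a sketch of the queue handling only, not a drop-in replacement.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical frame queue wrapper illustrating CompleteAdding / IsCompleted.
public sealed class FramePump<T>
{
    // Capacity of 64 is an assumption; a bound makes TryAdd fail instead of growing forever.
    private readonly BlockingCollection<T> queue = new BlockingCollection<T>(64);

    // Returns false when the frame was not queued (queue full, or Stop already called).
    public bool AddFrame(T frame)
    {
        if (queue.IsAddingCompleted) return false;
        try
        {
            return queue.TryAdd(frame);
        }
        catch (InvalidOperationException)
        {
            // Stop() raced with us and completed adding between the check and TryAdd.
            return false;
        }
    }

    // No further frames are accepted, but queued frames are still written.
    public void Stop() => queue.CompleteAdding();

    // Consumer loop: exits only once adding is complete AND the queue is empty.
    public void Run(Action<T> writeFrame)
    {
        while (!queue.IsCompleted)
        {
            if (queue.TryTake(out T frame, 200))
            {
                writeFrame(frame);
            }
        }
    }
}
```

With this shape, `Stop` followed by the loop finishing means every accepted frame has been handed to `writeFrame`, and callers of `AddFrame` can tell when a frame was dropped.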
In `EncodeTask`, what should happen if `MFStartup` doesn't succeed? Right now everything proceeds as if there were no error, except there's no output. That's kind of weird.
- Recommendation: since `MFStartup` is critical to success, I'd throw an exception if it can't succeed.
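One caveat is that an exception thrown on the encode thread will not reach whoever called `Start`. A possible way around that, sketched here with a hypothetical helper `WorkerStartup.StartAndWait`, is to have the caller block until the worker reports whether initialization succeeded. The `initialize` delegate stands in for the `MFStartup` and `InitializeSinkWriter` calls, and `body` for the frame loop.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WorkerStartup
{
    // Runs initialize() and then body() on a new thread. The caller blocks until
    // initialize() has finished; if it threw, the exception is rethrown here.
    public static Thread StartAndWait(Action initialize, Action body)
    {
        var ready = new TaskCompletionSource<bool>();

        var thread = new Thread(() =>
        {
            try
            {
                initialize();
            }
            catch (Exception ex)
            {
                ready.SetException(ex); // hand the failure to the caller
                return;                 // never enter the frame loop
            }

            ready.SetResult(true);
            body();
        });
        thread.Start();

        // GetResult() rethrows the original exception rather than an AggregateException.
        ready.Task.GetAwaiter().GetResult();
        return thread;
    }
}
```

This also means that by the time `StartAndWait` returns, the worker is known to be initialized, so a caller never observes a half-started encoder.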
What happens if you call `Stop` instantly after `Start`? It's possible the thread won't have started up, called `MFStartup` and set `HasStarted` in time, and then `Stop` won't cancel the CTS, which means that once the thread does get going, your frame write loop will wait forever. Probably not what you want.
In `EncodeTask`, you're catching `Exception` in your loop. But the only exception you're expecting is `OperationCanceledException`, right?
- Recommendation: Only catch the exceptions you're expecting, and let the rest bubble.
You're pretty diligent about checking whether your Media Foundation calls succeeded, which is good. However, there are a couple of cases where you're expecting `out` vars to exist even if the call wasn't successful. For example, in `WriteFrame`, what happens if `MFExtern.MFCreateSample(out sample);` fails? The adjacent calls will be skipped due to all the `if (Succeeded(hr))`s, but what will `COMBase.SafeRelease(sample);` do if `sample` is unassigned?
- Recommendation: ensure either that uses of `out` variables can survive working with unassigned vars, or guard those uses more cautiously.
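One way to guard such uses, sketched with hypothetical names (`Guarded.UseAndRelease`, the `Factory<T>` delegate), is to initialise the variable to `null`, release in a `finally` block, and skip the release when nothing was created. The `create`, `use` and `release` delegates are stand-ins for calls such as `MFCreateSample`, the sample setup, and the release helper; how the real release helper treats `null` is not assumed here.

```csharp
using System;

public static class Guarded
{
    // Stand-in for a factory that reports failure through an HRESULT and an out parameter.
    public delegate int Factory<T>(out T value) where T : class;

    // Creates a resource, uses it, and releases it only if creation produced one.
    public static int UseAndRelease<T>(Factory<T> create, Func<T, int> use, Action<T> release)
        where T : class
    {
        T resource = null;
        try
        {
            int hr = create(out resource);
            if (hr < 0) return hr; // creation failed: nothing to use
            return use(resource);
        }
        finally
        {
            // Explicit guard: never hand an unassigned (null) resource to release().
            if (resource != null) release(resource);
        }
    }
}
```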
All the `if (Succeeded(hr))`s sure are awkward. I get not wanting to throw exceptions in hot code paths, but if everything is working normally, won't these calls all succeed?
- Recommendation: investigate swapping the `if (Succeeded(hr))`s for something like `ThrowUnlessOK(hr)`, which you could wrap your MF calls with, to reduce the number of conditions and the noise associated with checking each return value. I can't say for sure that this would end up being better, but it's worth investigating.
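A minimal version of such a helper might look like the following. The class name `Hr` is made up for this sketch; `Marshal.ThrowExceptionForHR` is the standard .NET interop call that converts a failure HRESULT into the matching exception.

```csharp
using System;
using System.Runtime.InteropServices;

public static class Hr
{
    // Throws for failure HRESULTs (negative values); success codes pass through silently.
    public static void ThrowUnlessOK(int hr)
    {
        if (hr < 0)
        {
            Marshal.ThrowExceptionForHR(hr);
        }
    }
}

// Intended usage inside WriteFrame (not compiled here, shown for shape only):
//   Hr.ThrowUnlessOK(MFExtern.MFCreateSample(out sample));
//   Hr.ThrowUnlessOK(sample.AddBuffer(buffer));
```

Each call then either succeeds or throws at the exact point of failure, which removes the chain of conditions at the cost of needing `try`/`finally` around the release calls.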
`MediaFoundation.NET`, `EmguCV`? For example, what is `AutoPinner`?
`VideoWriter`. With the answer below, IIRC the first point was important.