
I am trying to get the holoscan example "bring your own model"

https://docs.nvidia.com/holoscan/sdk-user-guide/examples/byom.html

to run, translating it from Python into C++.

One necessary component is the operator FormatConverterOp, which seems to do some preliminary conversions before the more sophisticated AI operators take over.

So I create one:

shared_ptr<holoscan::ops::FormatConverterOp> op_formatter =
 make_operator<holoscan::ops::FormatConverterOp>("preprocessor", from_config("preprocessor"));

The config referenced in the parameters lives in a YAML config file:

[..]
preprocessor:    # << FormatConverterOp
  out_tensor_name: source_video
  out_dtype: "float32"
  resize_width: 256
  resize_height: 256
  pool: host_allocator   # << I added this line myself.
[..]

The parameter pool is my own addition: it appears that the operator in C++ needs a separate pool of memory to work on. From what I read online, a holoscan::UnboundedAllocator should be fine for starters.

Let's create one:

shared_ptr<UnboundedAllocator> uba(
 new UnboundedAllocator("host_allocator", component));

"host_allocator" is the name for the thing, matching the string from above config yaml. I guess, this is needed, so that the operator may find it later on by that handle.

But how do I define the component? Inline help tells me it is of type nvidia::gxf::UnboundedAllocator*. So, naively:

shared_ptr<nvidia::gxf::UnboundedAllocator> component(
 new nvidia::gxf::UnboundedAllocator());

However, using this pointer causes a GXF_CONTEXT_INVALID error at runtime. Here:

https://docs.nvidia.com/metropolis/deepstream/dev-guide/graphtools-docs/docs/text/GXF_Core_C_APIs.html

Nvidia tells the world that this signifies an "invalid context".

Playing with the API, I went as far as extending the example code to:

#include <memory>
#include <iostream>
#include <holoscan/core/resources/gxf/unbounded_allocator.hpp>
#include <gxf/core/gxf.h>

using std::shared_ptr;
using namespace holoscan;

// global
shared_ptr<nvidia::gxf::UnboundedAllocator> component;
gxf_context_t context;
shared_ptr<holoscan::UnboundedAllocator> uba;

void Ap_UsModel::compose()
{
  component = shared_ptr<nvidia::gxf::UnboundedAllocator>(new nvidia::gxf::UnboundedAllocator());
  context = component->context();
  gxf_result_t err = GxfContextCreate(&context);
  std::cout << "After err '" << err << "': Context ptr: '" << (&context) << "'." << std::endl;
  uba = shared_ptr<UnboundedAllocator>(new UnboundedAllocator("host_allocator", component.get()));
  [..]
  shared_ptr<holoscan::ops::FormatConverterOp> op_formatter =
      make_operator<holoscan::ops::FormatConverterOp>("preprocessor",
                                                      from_config("preprocessor"));
  [..]
}

To no avail. err == 0 and &context looks valid, but I still get GXF_CONTEXT_INVALID.

I am afraid I do not understand enough to find help in the Nvidia API documentation. Has anybody here ever succeeded in getting such a FormatConverterOp to run, and could you point me to how to create an adequate context?

P.S.: Here is the whole error message:

[info] [fragment.cpp:778] Loading extensions from configs...
[info] [gxf_executor.cpp:329] Creating context
After err '0': Context ptr: '0x641469c12aa0'.
[error] [gxf_resource.cpp:49] GXF call 'GxfComponentType(gxf_context_, gxf_cid_,
 &gxf_tid_)' in line 49 of file /workspace/holoscan-sdk/src/core/gxf/gxf_resource.cpp
 failed with 'GXF_CONTEXT_INVALID' (12)

P.P.S.: Using merely an Allocator (as opposed to an UnboundedAllocator), I get as far as the promising-looking

Unable to convert argument type 'PSt10shared_ptrIN8holoscan9AllocatorEE' to parameter type 'St10shared_ptrIN8holoscan9AllocatorEE' for 'pool'

I guess this reduces the problem to "how do I get rid of that prefixed 'P'?". Experimenting with taking addresses and dereferencing has so far yielded nothing but plainer errors.

The code associated with the above error:

[..]
this->allocator = std::shared_ptr<Allocator>(new Allocator());
holoscan::ArgList al = from_config("preprocessor");
// Adding this specific Allocator as 'pool' parameter directly.
al.add(holoscan::Arg("pool", &allocator));
shared_ptr<holoscan::ops::FormatConverterOp> op_formatter =
 make_operator<holoscan::ops::FormatConverterOp>("preprocessor", al);
[..]
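
For what it is worth, the 'P' presumably denotes a pointer type in the mangled name, i.e. Arg("pool", &allocator) wraps a std::shared_ptr<Allocator>* while the parameter wants the shared_ptr itself. Passing the member by value at least makes the two type strings match (whether a bare Allocator then satisfies the operator is another matter); a minimal sketch of that variation, reusing the allocator member from above:

// Variation on the snippet above: hand the shared_ptr over by value, not by address,
// so the Arg carries std::shared_ptr<holoscan::Allocator> (no leading 'P' in the mangled name).
holoscan::ArgList al = from_config("preprocessor");
al.add(holoscan::Arg("pool", this->allocator));
shared_ptr<holoscan::ops::FormatConverterOp> op_formatter =
    make_operator<holoscan::ops::FormatConverterOp>("preprocessor", al);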
asked Aug 27, 2025 at 11:52

1 Answer

Found my own answer, and got the program to run in C++ on Ubuntu 22.04.

The cause of the trouble was the way the memory block that the "Allocator" draws from had been created.

Three of the operators in the example https://docs.nvidia.com/holoscan/sdk-user-guide/examples/byom.html draw upon such a memory block.

A working allocator can be created like this:

const size_t sz_block = 3145728;
// Params: name, storage type (1 = CUDA device memory), block size in bytes, block count.
shared_ptr<holoscan::BlockMemoryPool> p =
    this->make_resource<holoscan::BlockMemoryPool>("my_pool", 1, sz_block, 16);
holoscan::Arg arg_pool("pool", p);
holoscan::Arg arg_allocator("allocator", p);
holoscan::Arg arg_post("allocator", p);

These named arguments can then be passed to their respective operators. The names are mandatory as given: the FormatConverterOp expects a "pool", while the InferenceOp and the SegmentationPostprocessorOp each expect an "allocator".

While I am still not entirely sure whether the following code is good practice, it works as expected:

using std::shared_ptr;
using holoscan::Application;
using holoscan::Operator;
using holoscan::Allocator;

// Member implementation from "class Ap_UsModel : public holoscan::Application"
void Ap_UsModel::compose()
{
  const size_t sz_block = 3145728;
  // Params: name, storage type (1 = CUDA device memory), block size in bytes, block count.
  shared_ptr<holoscan::BlockMemoryPool> p =
      this->make_resource<holoscan::BlockMemoryPool>("my_pool", 1, sz_block, 16);
  holoscan::Arg arg_pool("pool", p);
  holoscan::Arg arg_allocator("allocator", p);
  holoscan::Arg arg_post("allocator", p);

  shared_ptr<holoscan::ops::VideoStreamReplayerOp> op_replayer =
      make_operator<holoscan::ops::VideoStreamReplayerOp>("replayer",
                                                          from_config("replayer"));

  shared_ptr<holoscan::ops::FormatConverterOp> op_formatter =
      make_operator<holoscan::ops::FormatConverterOp>("preprocessor",
                                                      from_config("preprocessor"));
  op_formatter->add_arg(arg_pool);

  shared_ptr<holoscan::ops::InferenceOp> op_inference =
      make_operator<holoscan::ops::InferenceOp>("inference",
                                                from_config("inference"));
  op_inference->add_arg(arg_allocator);

  shared_ptr<holoscan::ops::SegmentationPostprocessorOp> op_seg =
      make_operator<holoscan::ops::SegmentationPostprocessorOp>("postprocessor",
                                                                from_config("postprocessor"));
  op_seg->add_arg(arg_post);

  shared_ptr<holoscan::ops::HolovizOp> op_visualizer =
      make_operator<holoscan::ops::HolovizOp>("holoviz", from_config("holoviz"));

  // Direct posting of the input images to display.
  add_flow(op_replayer, op_visualizer, {{"output", "receivers"}});

  // Doing AI segmentation and overlaying it onto the above image.
  add_flow(op_replayer, op_formatter, {{"output", "source_video"}});
  add_flow(op_formatter, op_inference, {{"tensor", "receivers"}});
  add_flow(op_inference, op_seg, {{"transmitter", "in_tensor"}});
  add_flow(op_seg, op_visualizer, {{"out_tensor", "receivers"}});
}
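
Not strictly necessary, but the same pool can presumably also be handed over at construction time instead of via add_arg(), since make_operator() forwards additional Arg values after from_config(); a sketch reusing the BlockMemoryPool p from above:

// Sketch: pass the pool as an extra constructor argument (assuming the variadic
// make_operator() overload), instead of calling add_arg() afterwards.
shared_ptr<holoscan::ops::FormatConverterOp> op_formatter =
    make_operator<holoscan::ops::FormatConverterOp>("preprocessor",
                                                    from_config("preprocessor"),
                                                    holoscan::Arg("pool", p));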

The .yaml config file is almost identical to the original (resolution reduced to 256x256); note that the preprocessor section needs no pool entry, since the pool is supplied as an Arg in the code above:

replayer:
  directory: "./model/video"      # Path to gxf entity video data
  basename: "ultrasound_256x256"  # Look for <basename>.gxf_{entities|index}
  frame_rate: 0    # Frame rate to replay (default: 0, follow frame rate in timestamps)
  repeat: false    # Loop video? (default: false)
  realtime: true   # Play in realtime, based on frame_rate/timestamps (default: true)
  count: 0         # Number of frames to read (default: 0, no frame count restriction)
preprocessor:      # << FormatConverterOp
  out_tensor_name: source_video
  out_dtype: "float32"
  resize_width: 256
  resize_height: 256
inference:
  backend: "trt"
  model_path_map: {"byom_model": "./model/model/us_unet_256x256_nhwc.onnx"}
  pre_processor_map:
    "byom_model": ["source_video"]
  inference_map:
    "byom_model": ["output"]
postprocessor:
  in_tensor_name: output
  network_output_type: softmax
  data_format: nchw
holoviz:
  width: 256   # width of window size
  height: 256  # height of window size
  color_lut: [
    [0.65, 0.81, 0.89, 0.10],
    [0.20, 0.63, 0.17, 0.70]
  ]

Using all this in a basic program matching the Nvidia examples preceding this one, I was able to produce the screenshot below of an actual segmentation animation, which is what I wanted to achieve here:

(Screenshot: segmentation example in C++)
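
For completeness, the driver program around the application can stay as minimal as in the preceding Holoscan examples; a sketch (the header and yaml filenames are only assumptions, adjust them to your project):

#include <holoscan/holoscan.hpp>
#include "ap_us_model.hpp"  // hypothetical header declaring the Ap_UsModel application above

int main()
{
  auto app = holoscan::make_application<Ap_UsModel>();
  app->config("./ap_us_model.yaml");  // the .yaml config shown above (hypothetical filename)
  app->run();
  return 0;
}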

answered Sep 1, 2025 at 14:36