Microsoft Introduces Serverless GPUs on Azure Container Apps in Public Preview

Dec 31, 2024 2 min read

At the recent Microsoft Ignite conference, the company announced the public preview of Azure Container Apps with serverless GPUs powered by NVIDIA. This feature allows customers to utilize NVIDIA A100 GPUs and NVIDIA T4 GPUs in a serverless environment, providing scaling and flexibility for real-time custom model inferencing and other machine-learning tasks.

Azure Container Apps is a fully managed serverless container service that lets developers deploy, run, and scale containerized applications without managing infrastructure. With serverless GPUs, they can run GPU-powered applications on the same terms and benefit from scale-to-zero: resources scale dynamically with demand, which reduces idle costs. They also get per-second billing for GPU usage, data governance that keeps information within container boundaries, a choice of NVIDIA A100 and T4 GPUs, and a managed serverless platform for deploying their own AI models.
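To make the effect of per-second billing combined with scale-to-zero concrete, the arithmetic can be sketched as below. This is an illustration only: the rate is a made-up placeholder and not an Azure price, and the names `Burst`, `per_second_cost`, and `always_on_cost` are hypothetical helpers rather than part of any Azure SDK.

```python
from dataclasses import dataclass

# Hypothetical rate, chosen only to make the arithmetic easy to follow.
# It is NOT an Azure price; consult the official pricing page for real figures.
HYPOTHETICAL_RATE_PER_GPU_SECOND = 0.001


@dataclass
class Burst:
    """One period during which GPU replicas are actively running."""
    active_seconds: int
    replicas: int = 1


def per_second_cost(bursts, rate=HYPOTHETICAL_RATE_PER_GPU_SECOND):
    """Cost when billing accrues only while replicas run (scale-to-zero)."""
    gpu_seconds = sum(b.active_seconds * b.replicas for b in bursts)
    return gpu_seconds * rate


def always_on_cost(window_seconds, replicas=1,
                   rate=HYPOTHETICAL_RATE_PER_GPU_SECOND):
    """Cost of keeping replicas provisioned for the whole window."""
    return window_seconds * replicas * rate


if __name__ == "__main__":
    day = 24 * 60 * 60
    # Three inference bursts in one day; the second one scales out to 2 replicas.
    bursts = [Burst(600), Burst(1200, replicas=2), Burst(300)]
    print(f"scale-to-zero: {per_second_cost(bursts):.2f}")
    print(f"always-on:     {always_on_cost(day):.2f}")
```

Under these assumptions the bursty workload is charged for its active GPU-seconds only, while an always-provisioned replica is charged for the full day regardless of traffic. The real difference depends on actual rates, cold-start behavior, and traffic shape.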

According to the company, Azure’s serverless GPUs are well suited to use cases such as real-time AI inferencing, machine learning model deployments, and high-performance computing tasks, and the platform integrates with existing Azure workflows.

(Source: Azure Blogs on Apps blog post)

During an Ignite session on Azure Functions Flex Consumption and GPUs, Simon Jakesch, principal product manager for Azure Container Apps at Microsoft, said:

Anyone who has used serverless or in combination with Azure Container Apps has found it to be extremely powerful. This technology brings the same power to GPU use, making GPUs easily accessible.

Microsoft is not the only provider of GPU capabilities for accelerating workloads such as real-time AI inferencing and machine learning model deployments; others include Modal, RunPod, Replicate, Baseten, Koyeb, and Fal. In addition, Google Cloud Run supports NVIDIA L4 GPUs for real-time AI inferencing.

Lars Wurm, a platform leader in Core Infrastructure at Inter Ikea, posted on LinkedIn:

With the introduction of serverless GPUs using Azure Container Apps, several new workloads and usage scenarios are enabled, shaping the offering into a one-stop shop for container workloads. This is particularly beneficial when workloads do not rely on committed ACA instances.

And in an NVIDIA corporate blog post, Dave Salvator wrote:

Serverless GPUs allow development teams to focus more on innovation and less on infrastructure management. With per-second billing and scale-to-zero capabilities, customers pay only for the compute they use, helping ensure resource utilization is both economical and efficient. NVIDIA is also working with Microsoft to bring NVIDIA NIM microservices to serverless NVIDIA GPUs in Azure to optimize AI model performance.

Serverless GPUs are available in a select set of Azure regions during the public preview. More information, including documentation, tutorials, and pricing details, is available from Azure.

About the Author

Steef-Jan Wiggers
