IBM Cloud Code Engine Serverless Fleets with GPUs for High-Performance AI and Parallel Computing
Oct 16, 2025
IBM Cloud Code Engine, the company's fully managed, strategic serverless platform, has introduced Serverless Fleets with integrated GPU support. With this new capability, the company directly addresses the challenge of running large-scale, compute-intensive workloads such as enterprise AI, generative AI, machine learning, and complex simulations on a simplified, pay-as-you-go serverless model.
Historically, as noted in academic papers, including a recent Cornell University paper, serverless technology has struggled to efficiently support these demanding, parallel workloads, which often require thousands or millions of tasks to execute simultaneously on specialized hardware. With Serverless Fleets, IBM aims to bridge this gap by offering high-performance computing resources without the operational complexity of managing dedicated infrastructure.
Michael Behrendt, CTO Serverless and IBM Distinguished Engineer, commented in a LinkedIn post:
The architecture of this capability was informed and driven a lot by running large real-world workloads with 100,000s of processors. It is built in such a robust way that it can run these workloads with essentially zero SRE staff.
Serverless Fleets simplifies how data scientists and developers execute compute-intensive tasks by providing a single endpoint for submitting a large number of batch jobs. In a blog post, IBM mentions that Code Engine then automatically handles the infrastructure orchestration:
- The service automatically provisions the necessary compute resources, including virtual machines (VMs) and serverless Graphics Processing Units (GPUs), such as the NVIDIA L40, to run multiple tasks simultaneously.
- Serverless Fleets is designed for run-to-completion tasks that scale elastically: the system determines the optimal number of worker instances needed and deploys them to handle the parallel execution efficiently.
- Once the workloads are complete, the resources are automatically removed, so users are charged only for what they consume during execution.
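The lifecycle described in the list above (size a pool of workers, run every task to completion in parallel, then release the workers) can be illustrated with a small local analogy. The sketch below is not the Code Engine API: it uses Python's standard `concurrent.futures` thread pool in place of provisioned VMs or GPUs, and the names `plan_workers`, `run_fleet`, `tasks_per_worker`, and `max_workers` are hypothetical, chosen only for illustration. The sizing rule is likewise an assumption, since the article does not say how the service picks the number of worker instances.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def plan_workers(num_tasks: int, tasks_per_worker: int, max_workers: int) -> int:
    """Hypothetical sizing rule: enough workers to cover all tasks, up to a cap."""
    if num_tasks <= 0:
        return 0
    return min(max_workers, math.ceil(num_tasks / tasks_per_worker))


def run_fleet(tasks, handler, tasks_per_worker=4, max_workers=8):
    """Run every task to completion on a temporary pool, then release the pool."""
    workers = plan_workers(len(tasks), tasks_per_worker, max_workers)
    if workers == 0:
        return []
    # The pool exists only for the duration of the batch. Leaving the
    # `with` block shuts the workers down, a local stand-in for the
    # automatic removal of resources once the workload is complete.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the submitted tasks.
        return list(pool.map(handler, tasks))


def square(x: int) -> int:
    """Placeholder for a compute-intensive, run-to-completion task."""
    return x * x


if __name__ == "__main__":
    print(run_fleet(list(range(10)), square))
```

In this analogy the caller only hands over a batch of tasks and a handler; how many workers run, and when they are torn down, is decided inside `run_fleet`, which mirrors the division of responsibility the article attributes to the managed service.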
With Serverless Fleets, IBM enters a space where other hyperscalers already compete: AWS offers AWS Fargate for running containers on serverless compute (often paired with EKS or ECS for orchestration), and Azure provides serverless GPUs in Container Apps. IBM, however, is emphasizing a unified environment: a single platform for web apps, functions, and now massive, GPU-accelerated batch jobs.
Where competitors may require developers to stitch together multiple services (e.g., a serverless runtime, a container service, and a batch orchestrator), Serverless Fleets aims to fully manage the provisioning and elastic scaling of GPU-backed virtual machines from a single endpoint. This reduces the complexity and operational overhead often associated with running elastic, GPU-intensive workloads in the cloud. In a Medium blog post, Luke Roy concluded:
Whether you're working on media processing, AI inference, or scientific workloads, IBM Cloud Code Engine Serverless Fleets provides a robust and developer-friendly solution.
The company stated in a blog post that, in today's competitive landscape, enterprises across industries need to deliver services quickly and conveniently while prioritizing security, resilience, and cost savings.