Prebuilt containers for inference and explanation
Vertex AI provides prebuilt Docker container images for serving inferences and explanations from trained model artifacts. These containers, which are organized by machine learning (ML) framework and framework version, provide HTTP inference servers that you can use to serve inferences with minimal configuration. In many cases, using a prebuilt container is simpler than creating your own custom container for inference.
This document lists the prebuilt containers for inferences and explanations, and it describes how to use them with model artifacts that you created using Vertex AI's custom training functionality or model artifacts that you created outside of Vertex AI.
Support policy and schedule
Vertex AI supports each framework version based on a schedule to minimize security vulnerabilities. Review the Support policy schedule to understand the implications of the end-of-support and end-of-availability dates.
Available container images
Each of the following container images is available in several Artifact Registry repositories, which store data in various locations. You can use any of the URIs for an image when you perform custom training; each provides the same container image. If you use the Google Cloud console to create a Model resource, the console selects the URI that best matches the location where you are using Vertex AI, which reduces latency.
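The same image is published under a URI for each multi-regional repository. As an illustration only, the repository layout can be sketched as follows; the naming pattern and the `prebuilt_container_uri` helper here are assumptions for demonstration, so confirm the exact URI in the tables below before using it.

```python
# Sketch: build an illustrative prebuilt-container URI for a given
# multi-region and framework. The "{multiregion}-docker.pkg.dev/vertex-ai/
# prediction/..." layout is an assumption; check the tables below for
# the images that actually exist.

def prebuilt_container_uri(multiregion: str, framework: str,
                           version: str, accelerator: str = "cpu") -> str:
    """Return an illustrative prebuilt serving-container URI.

    multiregion: "us", "europe", or "asia"
    framework:   for example "tf2", "sklearn", "xgboost", "pytorch"
    version:     framework version, for example "2.15"
    accelerator: "cpu" or "gpu"
    """
    if multiregion not in ("us", "europe", "asia"):
        raise ValueError(f"unknown multi-region: {multiregion}")
    image = f"{framework}-{accelerator}.{version.replace('.', '-')}"
    return f"{multiregion}-docker.pkg.dev/vertex-ai/prediction/{image}:latest"

print(prebuilt_container_uri("us", "tf2", "2.15"))
# us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-15:latest
```

Because every multi-regional URI refers to the same image, choosing the repository closest to where you run Vertex AI only affects pull latency, not behavior.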
TensorFlow
Available TensorFlow container images
| ML framework version | Supported accelerators (and CUDA version, if applicable) | End of patch and support date | End of availability | Supported images |
|---|---|---|---|---|
| 2.15 | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 2.15 | GPU (CUDA 12.x) | Jan 14, 2026 | Jan 14, 2027 | |
| 2.14 | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 2.14 | GPU (CUDA 12.x) | Jan 14, 2026 | Jan 14, 2027 | |
| 2.13 | CPU only | Nov 28, 2024 | Nov 28, 2025 | |
| 2.13 | GPU (CUDA 12.x) | Nov 28, 2024 | Nov 28, 2025 | |
| 2.12 | CPU only | Jun 30, 2024 | Jun 30, 2025 | |
| 2.12 | GPU (CUDA 11.x) | Jun 30, 2024 | Jun 30, 2025 | |
| 2.11 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.11 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.10 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.10 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.9 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.9 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.8 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.8 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.7 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.7 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.6 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.6 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.5 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.5 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.4 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.4 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.3 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.3 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.2 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.2 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.1 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.1 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 1.15 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 1.15 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
Optimized TensorFlow runtime
The following container images use the optimized TensorFlow runtime. For more information, see Use the optimized TensorFlow runtime.
Available optimized TensorFlow runtime container images
| ML framework version | Supported accelerators (and CUDA version, if applicable) | End of patch and support date | End of availability | Supported images |
|---|---|---|---|---|
| nightly | CPU only | Not applicable | Not applicable | |
| nightly | GPU (CUDA 12.x) | Not applicable | Not applicable | |
| nightly | Cloud TPU | Not applicable | Not applicable | |
| 2.17 | CPU only | Jul 11, 2024 | Jul 11, 2025 | |
| 2.17 | GPU (CUDA 12.x) | Jul 11, 2024 | Jul 11, 2025 | |
| 2.17 | Cloud TPU | Jul 11, 2024 | Jul 11, 2025 | |
| 2.16 | CPU only | Apr 26, 2024 | Apr 26, 2025 | |
| 2.16 | GPU (CUDA 12.x) | Apr 26, 2024 | Apr 26, 2025 | |
| 2.16 | Cloud TPU | Apr 26, 2024 | Apr 26, 2025 | |
| 2.15 | CPU only | Aug 15, 2024 | Aug 15, 2025 | |
| 2.15 | GPU (CUDA 12.x) | Aug 15, 2024 | Aug 15, 2025 | |
| 2.15 | Cloud TPU | Aug 15, 2024 | Aug 15, 2025 | |
| 2.14 | CPU only | Aug 15, 2024 | Aug 15, 2025 | |
| 2.14 | GPU (CUDA 12.x) | Aug 15, 2024 | Aug 15, 2025 | |
| 2.13 | CPU only | Aug 15, 2024 | Aug 15, 2025 | |
| 2.13 | GPU (CUDA 11.x) | Aug 15, 2024 | Aug 15, 2025 | |
| 2.12 | CPU only | May 15, 2024 | May 15, 2025 | |
| 2.12 | GPU (CUDA 11.x) | May 15, 2024 | May 15, 2025 | |
| 2.11 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.11 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.10 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.10 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.9 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.9 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
| 2.8 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 2.8 | GPU (CUDA 11.x) | Nov 15, 2023 | Nov 15, 2024 | |
PyTorch
Available PyTorch container images
| ML framework version | Supported accelerators (and CUDA version, if applicable) | End of patch and support date | End of availability | Supported images |
|---|---|---|---|---|
| 2.4 (Python 3.9) | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 2.4 (Python 3.9) | GPU (CUDA 12.x) | Jan 14, 2026 | Jan 14, 2027 | |
| 2.4 (Python 3.9) | Cloud TPU | Jan 14, 2026 | Jan 14, 2027 | |
| 2.3 (Python 3.9) | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 2.3 (Python 3.9) | GPU (CUDA 12.x) | Jan 14, 2026 | Jan 14, 2027 | |
| 2.3 (Python 3.9) | Cloud TPU | Jan 14, 2026 | Jan 14, 2027 | |
| 2.2 (Python 3.9) | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 2.2 (Python 3.9) | GPU (CUDA 12.x) | Jan 14, 2026 | Jan 14, 2027 | |
| 2.2 (Python 3.9) | Cloud TPU | Jan 14, 2026 | Jan 14, 2027 | |
| 2.1 (Python 3.9) | CPU only | Dec 1, 2024 | Dec 1, 2025 | |
| 2.1 (Python 3.9) | GPU (CUDA 12.x) | Dec 1, 2024 | Dec 1, 2025 | |
| 2.1 (Python 3.9) | Cloud TPU | Dec 1, 2024 | Dec 1, 2025 | |
| 2.0 (Python 3.9) | CPU only | Jul 27, 2024 | Jul 27, 2025 | |
| 2.0 (Python 3.9) | GPU (CUDA 11.x) | Jul 27, 2024 | Jul 27, 2025 | |
| 1.13 (Python 3.8) | CPU only | May 15, 2024 | May 15, 2025 | |
| 1.13 (Python 3.8) | GPU (CUDA 11.x) | May 15, 2024 | May 15, 2025 | |
| 1.12 | CPU only | May 15, 2024 | May 15, 2025 | |
| 1.12 | GPU (CUDA 11.x) | May 15, 2024 | May 15, 2025 | |
| 1.11 | CPU only | May 15, 2024 | May 15, 2025 | |
| 1.11 | GPU (CUDA 11.x) | May 15, 2024 | May 15, 2025 | |
scikit-learn
Available scikit-learn container images
| ML framework version | Supported accelerators (and CUDA version, if applicable) | End of patch and support date | End of availability | Supported images |
|---|---|---|---|---|
| 1.5 (Python 3.10) | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 1.4 (Python 3.10) | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 1.3 (Python 3.10) | CPU only | Nov 28, 2024 | Nov 28, 2025 | |
| 1.2 (Python 3.10) | CPU only | Jun 30, 2024 | Jun 30, 2025 | |
| 1.0 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 0.24 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 0.23 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 0.22 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 0.20 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
XGBoost
Available XGBoost container images
| ML framework version | Supported accelerators (and CUDA version, if applicable) | End of patch and support date | End of availability | Supported images |
|---|---|---|---|---|
| 2.1 (Python 3.10) | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 2.0 (Python 3.10) | CPU only | Jan 14, 2026 | Jan 14, 2027 | |
| 1.7 (Python 3.10) | CPU only | Jun 30, 2024 | Dec 30, 2025 | |
| 1.6 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 1.5 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 1.4 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 1.3 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 1.2 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 1.1 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 0.90 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
| 0.82 | CPU only | Nov 15, 2023 | Nov 15, 2024 | |
Use a prebuilt container
You can specify a prebuilt container for inference when you create a custom TrainingPipeline resource that uploads a Model, or when you import model artifacts as a Model.
To use one of these prebuilt containers, you must save your model as one or more model artifacts that comply with the requirements of the prebuilt container. For more information, see Export model artifacts for inference.
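For example, the scikit-learn and XGBoost containers look for a single artifact file with a fixed name at the root of the artifact directory, while the TensorFlow containers expect a SavedModel; see Export model artifacts for inference for the authoritative requirements. The following is a minimal sketch of exporting a pickled model under the scikit-learn container's expected `model.pkl` filename; the `TinyModel` class is a stand-in for a fitted estimator, not a real one.

```python
import os
import pickle
import tempfile

class TinyModel:
    """Stand-in for a trained estimator. A real export would pickle a
    fitted scikit-learn model instead of this toy class."""
    def predict(self, instances):
        return [sum(row) for row in instances]

# The scikit-learn prebuilt container loads its artifact from a file
# named exactly "model.pkl" (or "model.joblib") in the artifact directory.
artifact_dir = tempfile.mkdtemp()
artifact_path = os.path.join(artifact_dir, "model.pkl")
with open(artifact_path, "wb") as f:
    pickle.dump(TinyModel(), f)

# Round-trip check: the container unpickles the artifact the same way.
with open(artifact_path, "rb") as f:
    restored = pickle.load(f)
print(restored.predict([[1, 2], [3, 4]]))  # [3, 7]
```

In practice you would copy the artifact directory to Cloud Storage and pass its `gs://` URI as the artifact URI, together with the prebuilt container's image URI, when you import the Model.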
The following notebooks demonstrate how to use a prebuilt container to serve inferences.
| What do you want to do? | Notebook |
|---|---|
| Train and serve a TensorFlow model using a prebuilt container | Custom training and online inference |
| Serve a PyTorch model using a prebuilt container | Serving PyTorch image models with prebuilt containers on Vertex AI |
| Serve a Stable Diffusion model using a prebuilt container | Deploy and host a Stable Diffusion model on Vertex AI |
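Whichever framework you use, a deployed Model backed by a prebuilt container accepts online inference requests in Vertex AI's standard JSON body, with one entry per input under an `instances` key. A minimal sketch of constructing such a body follows; the feature values are made up.

```python
import json

# Vertex AI prebuilt serving containers expect a request body of the
# form {"instances": [...]}, one list entry per input instance.
instances = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # made-up feature rows
body = json.dumps({"instances": instances})
print(body)
```

The server's response places the corresponding outputs under a `predictions` key, one entry per instance.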
What's next
- Learn how to deploy a model to an endpoint to serve inferences.