AWS Announces General Availability of EC2 P5e Instances, Powered by NVIDIA H200 Tensor Core GPUs

Sep 18, 2024 2 min read

Amazon Web Services (AWS) has officially launched the Amazon EC2 P5e instances, powered by NVIDIA H200 Tensor Core GPUs, to enhance its computing infrastructure for AI, machine learning, and high-performance computing (HPC) applications.

According to the company, the EC2 P5e instances bring significant improvements in performance, cost-efficiency, and scalability over their predecessors, the EC2 P5 instances, which were already known for their powerful computing capabilities.

The P5e instances are equipped with eight NVIDIA H200 GPUs, which offer more GPU memory and higher memory bandwidth than the GPUs in the P5 instances. They support up to 3,200 Gbps of networking using second-generation Elastic Fabric Adapter (EFA) technology and are deployed in Amazon EC2 UltraClusters for large-scale, low-latency processing.

(Source: AWS Machine Learning blog post)

Organizations can use the P5e instances for a variety of advanced use cases, including training and inference for large language models (LLMs) such as OpenAI’s GPT or Google’s BERT, and high-performance simulations such as weather forecasting, genomics research, and fluid dynamics modeling.

The authors of an AWS Machine Learning blog post on the EC2 P5e instances write:

The higher memory bandwidth of the H200 GPUs in the P5e instances allows the GPU to fetch and process data from memory more quickly. This reduces inference latency, which is critical for real-time applications like conversational AI systems where users expect near-instant responses. The higher memory bandwidth enables higher throughput, allowing the GPU to process more inferences per second.
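The bandwidth argument in the quote can be illustrated with a rough back-of-the-envelope sketch. It assumes that generating one token is memory-bound, so that every model weight must be read from GPU memory once per token and the time to do so sets a lower bound on latency. The bandwidth figures are NVIDIA's published peak values for the H100 SXM (3.35 TB/s) and the H200 (4.8 TB/s); the 13-billion-parameter FP16 model and all function names are hypothetical and chosen only for illustration. Real latency also depends on batch size, KV-cache traffic, kernels, and interconnect, so these numbers are a floor, not a prediction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    mem_bandwidth_tb_s: float  # peak memory bandwidth in TB/s (vendor spec)


# Published peak figures; sustained bandwidth in practice is lower.
H100 = Gpu("H100 SXM", 3.35)
H200 = Gpu("H200", 4.8)


def min_decode_latency_ms(param_count: float, bytes_per_param: int, gpu: Gpu) -> float:
    """Lower bound on per-token latency if all weights are read once per token."""
    weight_bytes = param_count * bytes_per_param
    bandwidth_bytes_per_s = gpu.mem_bandwidth_tb_s * 1e12
    return weight_bytes / bandwidth_bytes_per_s * 1e3


def max_tokens_per_second(param_count: float, bytes_per_param: int, gpu: Gpu) -> float:
    """Upper bound on single-stream throughput implied by the latency bound."""
    return 1e3 / min_decode_latency_ms(param_count, bytes_per_param, gpu)


if __name__ == "__main__":
    params = 13e9        # hypothetical 13B-parameter model
    bytes_per_param = 2  # FP16 weights
    for gpu in (H100, H200):
        latency = min_decode_latency_ms(params, bytes_per_param, gpu)
        rate = max_tokens_per_second(params, bytes_per_param, gpu)
        print(f"{gpu.name}: >= {latency:.2f} ms/token, <= {rate:.0f} tokens/s")
```

Under this simplified model, the improvement in both latency and throughput is simply the ratio of the two bandwidth figures, roughly 1.4x, regardless of model size.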

When launching P5e instances, users can use AWS Deep Learning AMIs (DLAMI) as the base image. DLAMI provides ML practitioners and researchers with the infrastructure and tools to quickly build scalable, secure, distributed ML applications in pre-configured environments. Users can also run containerized applications on P5e instances using AWS Deep Learning Containers with libraries designed for Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS).

Azure and Google Cloud offer instances comparable to the AWS EC2 P5e instances, also designed for high-performance computing (HPC) and AI/ML workloads. Azure provides NDv5 series virtual machines equipped with NVIDIA Tensor Core GPUs, while Google Cloud offers A3 instances powered by NVIDIA GPUs.

Sanjay Siboo, a director of cloud solutions at Tata Communications, tweeted:

GPUs have become increasingly important for several large software firms, such as AWS, Google, and OpenAI, as the demand for generative AI continues to grow steadily.

Currently, P5e instances in the p5e.48xlarge size are available in the US East (Ohio) AWS region through EC2 Capacity Blocks for ML.
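As a minimal sketch of what looking for such capacity might look like, the snippet below queries Capacity Block offerings with boto3's EC2 `describe_capacity_block_offerings` call. The helper names, the 14-day search window, and the example values are assumptions made for illustration; `us-east-2` is the region code for US East (Ohio). Running the query requires the third-party `boto3` package and valid AWS credentials, and it only lists offerings rather than purchasing anything.

```python
from datetime import datetime, timedelta, timezone


def build_offering_query(instance_count: int, duration_hours: int,
                         earliest_start: datetime,
                         search_window_days: int = 14) -> dict:
    """Build request parameters for a Capacity Block offering search.

    The search window length is an illustrative assumption, not an AWS default.
    """
    if instance_count < 1 or duration_hours < 1:
        raise ValueError("instance_count and duration_hours must be positive")
    return {
        "InstanceType": "p5e.48xlarge",
        "InstanceCount": instance_count,
        "CapacityDurationHours": duration_hours,
        "StartDateRange": earliest_start,
        "EndDateRange": earliest_start + timedelta(days=search_window_days),
    }


def find_offerings(query: dict, region: str = "us-east-2") -> list:
    """Return matching offerings; needs boto3 and AWS credentials."""
    import boto3  # imported here so the helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_capacity_block_offerings(**query)
    return response["CapacityBlockOfferings"]


if __name__ == "__main__":
    start = datetime.now(timezone.utc) + timedelta(days=1)
    query = build_offering_query(instance_count=1, duration_hours=24,
                                 earliest_start=start)
    for offering in find_offerings(query):
        print(offering["CapacityBlockOfferingId"],
              offering["AvailabilityZone"],
              offering["StartDate"],
              offering["UpfrontFee"])
```

Keeping the parameter construction separate from the API call makes the request easy to inspect or unit test without touching an AWS account.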

About the Author

Steef-Jan Wiggers
