
Advanced Autoscaling Helps Companies Reduce AWS Costs by 70%

Aug 31, 2025 3 min read


The next generation of Kubernetes autoscaling techniques and tools is enabling organisations to make substantial cost savings in their cloud infrastructure. Svetlana Burninova recently used Karpenter to build a multi-architecture EKS cluster and achieved a 70% reduction in cost whilst also improving performance.

In an article on Hackernoon, Burninova explains how her techniques also reduced pod scheduling latency from three minutes to 20 seconds.

"After switching to Karpenter with about 70% spot instance usage, our monthly compute costs dropped by 70%. That's a significant reduction that freed up substantial budget for new features and infrastructure improvements."

Burninova's implementation involved replacing the traditional Kubernetes Cluster Autoscaler with Karpenter, and moving to a multi-architecture setup with both AMD64 and ARM64 instances. The change also resulted in better resource utilisation, with Karpenter's right-sizing capabilities helping to increase average CPU utilisation from 25% on fixed nodes to 70%.
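As a rough illustration of this kind of setup (not Burninova's actual configuration), a Karpenter v1 NodePool can be allowed to consider both architectures when provisioning nodes; the names and the EC2NodeClass reference below are hypothetical:

```yaml
# Illustrative Karpenter v1 NodePool allowing both x86 and Graviton nodes
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: multi-arch                     # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]   # let Karpenter pick either architecture
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                  # assumes an EC2NodeClass named "default" exists
```

With the architecture requirement left open like this, Karpenter chooses whichever instance type is cheapest and available that satisfies a pending pod's constraints.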

Cloud optimisation platform nOps have also written about the benefits of using Karpenter for autoscaling. In a post on their site, they explain that Karpenter functions as "an open-source, flexible, and high-performance Kubernetes cluster autoscaler, offering advanced scheduling and scaling capabilities". Unlike traditional cluster autoscalers that operate with fixed node groups, Karpenter examines pending pods and provisions the most cost-effective instances to meet the specific resource requirements. Karpenter also recently released version 1.0, a milestone that includes better stability and new functionality for disruption budgets and node consolidation.

Burninova summarises: "We're running a more resilient, cost-effective infrastructure that scales intelligently. The substantial cost savings alone paid for the engineering time I spent on this migration within the first month."

Burninova's cost optimisation came from two primary strategies: price optimisation and efficiency optimisation. Price optimisation involves maximising discounts through AWS pricing models, including Reserved Instances, Savings Plans, and Spot Instances, which can offer discounts of up to 90%, albeit at the risk of two-minute termination notices. Efficiency optimisation focuses on reducing waste through better resource utilisation and more granular scaling decisions.
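Both strategies map onto NodePool settings in Karpenter v1. A hedged sketch (illustrative names and values, not taken from the article) might combine spot capacity with consolidation of underutilised nodes:

```yaml
# Illustrative NodePool combining price optimisation (spot capacity)
# with efficiency optimisation (consolidating underutilised nodes)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: cost-optimised                   # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # spot where possible, on-demand as fallback
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                    # assumes an existing EC2NodeClass
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # reclaim wasted capacity
    consolidateAfter: 1m                 # example value; tune for workload churn
```

The `disruption` block is what gives Karpenter its right-sizing behaviour: empty or underutilised nodes are drained and replaced with cheaper or better-fitting capacity.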

[Diagram: Burninova's EKS architecture]

Moving some workloads to ARM64 Graviton instances saved approximately 20% of costs when compared to equivalent x86 instances, and also showed a performance improvement, with an example image processing service running 15% faster on Graviton hardware. However, Burninova points out that making this change requires careful checks of application compatibility, and that node pools need to be properly configured with appropriate taints to prevent incompatible workloads from being scheduled on ARM64 nodes.
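That guard rail is standard Kubernetes taint/toleration configuration. As a sketch (illustrative labels and image names, not Burninova's actual manifests), an ARM64-only NodePool can carry a taint that only verified-compatible workloads tolerate:

```yaml
# Illustrative ARM64-only NodePool with a guarding taint
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: graviton                # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
      taints:
        - key: arch
          value: arm64
          effect: NoSchedule    # keeps incompatible pods off these nodes
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
---
# A workload that has been checked for ARM64 compatibility opts in
apiVersion: v1
kind: Pod
metadata:
  name: image-processor         # hypothetical workload
spec:
  nodeSelector:
    kubernetes.io/arch: arm64
  tolerations:
    - key: arch
      operator: Equal
      value: arm64
      effect: NoSchedule
  containers:
    - name: app
      image: example.com/image-processor:arm64  # placeholder multi-arch image
```

Pods without the toleration are simply never scheduled onto the tainted Graviton nodes, so an incompatible x86-only binary cannot land there by accident.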

AWS has recently introduced another evolution in this space with Amazon EKS Auto Mode, launched in November 2024. In a post for AWS community builders, developer Rodrigo Fernandes portrays EKS Auto Mode as a simplified natural evolution of Karpenter. Fernandes goes on to explain how EKS Auto Mode abstracts infrastructure management by automatically provisioning and removing nodes based on pod demand rather than traditional CPU and memory metrics.

[Diagram: Shared Responsibility Model with EKS Auto Mode]

Auto Mode attempts to be efficient with costs by scaling clusters intelligently based on pending pods, optimising spot instance usage, and eliminating idle nodes. It does this by considering pod resource requirements, instance pricing, availability zone distribution, and architecture compatibility. Fernandes suggests that early implementations have reduced management time by up to 80% and cut infrastructure expenses by 60-70%. However, there are limitations for organisations that need custom AMIs, specialised hardware such as GPU instances, or granular configuration control for compliance environments.

The added complexity of these tools brings extra responsibility around observability and security, with Fernandes encouraging engineers to keep an eye on metrics such as node creation and termination rates, pod scheduling efficiency, and node utilisation percentages. Tools such as Kubecost can give detailed visibility into costs per namespace and the effectiveness of spot vs. on-demand ratios. Security best practices include using IAM Roles for Service Accounts (IRSA) to eliminate hardcoded credentials, proper subnet tagging for resource discovery, and carefully configuring disruption budgets to maintain application availability during scaling events.
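Two of those practices are plain Kubernetes/EKS configuration. A minimal sketch with hypothetical names and a placeholder role ARN:

```yaml
# PodDisruptionBudget: limits voluntary evictions while Karpenter
# consolidates or replaces nodes
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                 # hypothetical name
spec:
  minAvailable: 2               # keep at least two replicas up during node churn
  selector:
    matchLabels:
      app: web
---
# IRSA: bind an IAM role to a ServiceAccount instead of hardcoding credentials
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa                  # hypothetical name
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/app-role  # placeholder ARN
```

Pods using `app-sa` then receive temporary AWS credentials for that role via the EKS pod identity webhook, with no secrets baked into images or manifests.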

All of the approaches detailed here aim to reduce over-provisioned resources and slow responses to scaling needs. Modern tooling gives engineers near-instantaneous provisioning of the most appropriate resources for their workloads, but this requires careful planning, proper configuration of resource requests and limits, and ongoing monitoring.
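Right-sizing decisions of the kind Karpenter makes depend on pods declaring accurate requests, since that is what the scheduler bin-packs against. A minimal illustrative container spec (names and values are examples, not recommendations):

```yaml
# Illustrative Deployment with explicit resource requests and limits
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                     # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: example.com/api:latest  # placeholder image
          resources:
            requests:
              cpu: 250m         # what Karpenter sizes nodes against
              memory: 256Mi
            limits:
              memory: 512Mi     # cap to protect co-located pods
```

Requests set too high recreate the over-provisioning problem; set too low, they cause CPU throttling or OOM kills, so they should be tuned against observed utilisation.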

About the Author

Matt Saunders


