
Advanced Autoscaling Helps Companies Reduce AWS Costs by 70%

Aug 31, 2025 3 min read


The next generation of Kubernetes autoscaling techniques and tools is enabling organisations to make substantial cost savings in their cloud infrastructure. Svetlana Burninova recently used Karpenter to build a multi-architecture EKS cluster and achieved a 70% reduction in cost whilst also improving performance.

In an article on Hackernoon, Burninova explains how her techniques also reduced pod scheduling latency from three minutes to 20 seconds.

"After switching to Karpenter with about 70% spot instance usage, our monthly compute costs dropped by 70%. That's a significant reduction that freed up substantial budget for new features and infrastructure improvements."

Burninova's implementation involved replacing the traditional Kubernetes Cluster Autoscaler with Karpenter, and moving to a multi-architecture setup with both AMD64 and ARM64 instances. The change also resulted in better resource utilisation, with Karpenter's right-sizing capabilities helping to increase average CPU utilisation from 25% on fixed nodes to 70%.
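As a rough illustration of this kind of setup (not Burninova's actual configuration), a Karpenter v1 NodePool can be allowed to consider both architectures when provisioning nodes; the names and the EC2NodeClass reference below are hypothetical:

```yaml
# Illustrative Karpenter v1 NodePool allowing both x86 and Graviton nodes
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: multi-arch                     # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]   # let Karpenter pick either architecture
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                  # assumes an EC2NodeClass named "default" exists
```

With the architecture requirement left open like this, Karpenter chooses whichever instance type is cheapest and available that satisfies a pending pod's constraints.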

Cloud optimisation platform nOps have also written about the benefits of using Karpenter for autoscaling. In a post on their site, they explain that Karpenter functions as "an open-source, flexible, and high-performance Kubernetes cluster autoscaler, offering advanced scheduling and scaling capabilities". Unlike traditional cluster autoscalers that operate with fixed node groups, Karpenter examines pending pods and provisions the most cost-effective instances to meet the specific resource requirements. Karpenter also recently released version 1.0, a milestone that includes better stability and new functionality for disruption budgets and node consolidation.

Burninova summarises: "We're running a more resilient, cost-effective infrastructure that scales intelligently. The substantial cost savings alone paid for the engineering time I spent on this migration within the first month."

Burninova's cost optimisation came from two primary strategies: price optimisation and efficiency optimisation. Price optimisation involves maximising discounts through AWS pricing models, including Reserved Instances, Savings Plans, and Spot Instances, which can offer discounts of up to 90%, albeit at the risk of two-minute termination notices. Efficiency optimisation focuses on reducing waste through better resource utilisation and more granular scaling decisions.
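Both strategies map onto NodePool settings in Karpenter v1. A hedged sketch (illustrative names and values, not taken from the article) might combine spot capacity with consolidation of underutilised nodes:

```yaml
# Illustrative NodePool combining price optimisation (spot capacity)
# with efficiency optimisation (consolidating underutilised nodes)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: cost-optimised                   # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # spot where possible, on-demand as fallback
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                    # assumes an existing EC2NodeClass
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # reclaim wasted capacity
    consolidateAfter: 1m                 # example value; tune for workload churn
```

The `disruption` block is what gives Karpenter its right-sizing behaviour: empty or underutilised nodes are drained and replaced with cheaper or better-fitting capacity.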

[Diagram: Burninova's EKS architecture]

Moving some workloads to ARM64 Graviton instances saved approximately 20% of costs when compared to equivalent x86 instances, and also showed a performance improvement, with an example image processing service running 15% faster on Graviton hardware. However, Burninova points out that making this change requires careful checks of application compatibility, and that node pools need to be properly configured with appropriate taints to prevent incompatible workloads from being scheduled on ARM64 nodes.
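That guard rail is standard Kubernetes taint/toleration configuration. As a sketch (illustrative labels and image names, not Burninova's actual manifests), an ARM64-only NodePool can carry a taint that only verified-compatible workloads tolerate:

```yaml
# Illustrative ARM64-only NodePool with a guarding taint
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: graviton                # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]
      taints:
        - key: arch
          value: arm64
          effect: NoSchedule    # keeps incompatible pods off these nodes
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
---
# A workload that has been checked for ARM64 compatibility opts in
apiVersion: v1
kind: Pod
metadata:
  name: image-processor         # hypothetical workload
spec:
  nodeSelector:
    kubernetes.io/arch: arm64
  tolerations:
    - key: arch
      operator: Equal
      value: arm64
      effect: NoSchedule
  containers:
    - name: app
      image: example.com/image-processor:arm64  # placeholder multi-arch image
```

Pods without the toleration are simply never scheduled onto the tainted Graviton nodes, so an incompatible x86-only binary cannot land there by accident.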

AWS has recently introduced another evolution in this space with Amazon EKS Auto Mode, launched in November 2024. In a post for AWS community builders, developer Rodrigo Fernandes portrays EKS Auto Mode as a simplified natural evolution of Karpenter. Fernandes goes on to explain how EKS Auto Mode abstracts infrastructure management by automatically provisioning and removing nodes based on pod demand rather than traditional CPU and memory metrics.

[Diagram: Shared Responsibility Model with EKS Auto Mode]

Auto Mode attempts to be efficient with costs by scaling clusters intelligently based on pending pods, optimising spot instance usage, and eliminating idle nodes. It does this by considering pod resource requirements, instance pricing, availability zone distribution, and architecture compatibility. Fernandes suggests that early implementations have reduced management time by up to 80% and cut infrastructure expenses by 60-70%. However, there are limitations for organisations that need custom AMIs, specialised hardware such as GPU instances, or granular configuration control for compliance environments.

The added complexity of these tools brings extra responsibility around observability and security, with Fernandes encouraging engineers to keep an eye on metrics such as node creation and termination rates, pod scheduling efficiency, and node utilisation percentages. Tools such as Kubecost can give detailed visibility into costs per namespace and the effectiveness of spot vs. on-demand ratios. Security best practices include using IAM Roles for Service Accounts (IRSA) to eliminate hardcoded credentials, proper subnet tagging for resource discovery, and carefully configuring disruption budgets to maintain application availability during scaling events.
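Two of those practices are plain Kubernetes/EKS configuration. A minimal sketch with hypothetical names and a placeholder role ARN:

```yaml
# PodDisruptionBudget: limits voluntary evictions while Karpenter
# consolidates or replaces nodes
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                 # hypothetical name
spec:
  minAvailable: 2               # keep at least two replicas up during node churn
  selector:
    matchLabels:
      app: web
---
# IRSA: bind an IAM role to a ServiceAccount instead of hardcoding credentials
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa                  # hypothetical name
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/app-role  # placeholder ARN
```

Pods using `app-sa` then receive temporary AWS credentials for that role via the EKS pod identity webhook, with no secrets baked into images or manifests.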

All of the approaches detailed here aim to reduce over-provisioned resources and slow responses to scaling needs. Modern tooling gives engineers near-instantaneous provisioning of the most appropriate resources for their workloads, but this requires careful planning, proper configuration of resource requests and limits, and ongoing monitoring.
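Right-sizing decisions of the kind Karpenter makes depend on pods declaring accurate requests, since that is what the scheduler bin-packs against. A minimal illustrative container spec (names and values are examples, not recommendations):

```yaml
# Illustrative Deployment with explicit resource requests and limits
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                     # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: example.com/api:latest  # placeholder image
          resources:
            requests:
              cpu: 250m         # what Karpenter sizes nodes against
              memory: 256Mi
            limits:
              memory: 512Mi     # cap to protect co-located pods
```

Requests set too high recreate the over-provisioning problem; set too low, they cause CPU throttling or OOM kills, so they should be tuned against observed utilisation.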

About the Author

Matt Saunders


