Performance & Scalability Content on InfoQ

Articles

DevOps

Analyzing Apache Kafka Stretch Clusters: WAN Disruptions, Failure Scenarios, and DR Strategies

This article analyzes the dynamics of Apache Kafka stretch clusters: how WAN disruptions and other failure scenarios affect them, and which Disaster Recovery (DR) strategies are effective. The aim is high availability and data integrity across multi-region deployments, with operational resilience that safeguards vital services against service level agreement (SLA) violations.

Srikanth Daggumalli, Nishchai Jayanna Manjula
on Jun 20, 2025

Cloud

Designing Resilient Event-Driven Systems at Scale

Learn how to design resilient event-driven systems that scale. Explore key patterns like shuffle sharding and decoupling queues to handle load spikes and failures. Understand common pitfalls like over-relying on retries and neglecting observability for robust, scalable architectures.

Rajesh Kumar Pandey
on May 30, 2025

Architecture & Design

Transforming Legacy Healthcare Systems: a Journey to Cloud-Native Architecture

Discover how Livi navigated the complexities of transitioning MJog, a legacy healthcare system, to a cloud-native architecture, sharing valuable insights for successful tech modernization. Our experience illustrates that transitioning from legacy systems to cloud-based microservices is not a one-time project, but an ongoing journey.

Leander Vanderbijl
on Nov 18, 2024

Architecture & Design

How Netflix Ensures Highly-Reliable Online Stateful Systems

Building reliable stateful services at scale isn’t a matter of building reliability into the servers, the clients, or the APIs in isolation. By combining smart and meaningful choices for each of these three components, we can build massively scalable, SLO-compliant stateful services at Netflix.

Joseph Lynch
on May 14, 2024

Cloud

Magic Pocket: Dropbox’s Exabyte-Scale Blob Storage System

A horizontally scalable exabyte-scale blob storage system which operates out of multiple regions, Magic Pocket is used to store all of Dropbox’s data. Adopting SMR technology and erasure codes, the system has extremely high durability guarantees but is cheaper than operating in the cloud.

Facundo Agriel
on May 15, 2023

AI, ML & Data Engineering

Design Pattern Proposal for Autoscaling Stateful Systems

In this article, Rogerio Robetti discusses the challenges of auto-scaling stateful storage systems and proposes an opinionated design for automatically scaling up (vertically) and scaling out (horizontally) from a single node to several nodes in a cluster, with minimal configuration and operator intervention.

Rogerio Robetti
on Jan 25, 2023

Cloud

A Recipe to Migrate and Scale Monoliths in the Cloud

In this article, I want to present a simple cloud architecture that can allow an organization to take monolithic applications to the cloud incrementally without a dramatic change in the architecture. We will discuss the minimal requirements and basic components to take advantage of the scalability of the cloud.

Luciano Mammino
on May 13, 2022

DevOps

Using the Plan-Do-Check-Act Framework to Produce Performant and Highly Available Systems

The PDCA (plan-do-check-act) framework can be used to outline the performance, availability, and monitoring practices that help teams deliver performant and highly available applications. These practices span infrastructure design and setup, application architecture and design, coding, performance testing, and application monitoring.

Kulkarni Girish
on Jun 09, 2021

Development

Donkey: a Highly-Performant HTTP Stack for Clojure

Donkey is the product of our quest for a highly performant Clojure HTTP stack, built to scale with the rapid growth we have been experiencing at AppsFlyer and to save us computing costs. In this article, we’ll briefly outline the use case for a library like Donkey and present our benchmarks. Finally, we will discuss Clojure and immutability, and some of our design decisions.

Yaron Elyashiv
on Jan 12, 2021

Cloud

Four Techniques Serverless Platforms Use to Balance Performance and Cost

Two aspects have been key to the rapid adoption of serverless computing: the performance and the cost model. This article looks at those aspects, the tradeoffs between them, and the opportunities ahead.

Erwin van Eyk
on Feb 13, 2019

AI, ML & Data Engineering

Scaling a Distributed Stream Processor in a Containerized Environment

This article presents our experience of scaling a distributed stream processor in Kubernetes. A stream processor should support maintaining the optimal level of parallelism. However, adding more resources incurs additional cost and does not guarantee performance improvements. Instead, the stream processor should identify the resources it requires and scale accordingly.

Miyuru Dayarathna, Sarangan Janakan
on Jan 13, 2019

AI, ML & Data Engineering

Columnar Databases and Vectorization

In this article, author Siddharth Teotia discusses the Dremio database, which is based on Apache Arrow and offers vectorization capabilities.

Siddharth Teotia
on May 27, 2018
