Articles

DevOps

How Causal Reasoning Addresses the Limitations of LLMs in Observability

Large language models excel at converting observability telemetry into clear summaries but struggle with accurate root cause analysis in distributed systems. LLMs often hallucinate explanations and confuse symptoms with causes. This article explains how causal reasoning models with Bayesian inference offer more reliable incident diagnosis.

Dhairya Dalal
on Sep 02, 2025

Mobile

Evaluating Kotlin Multiplatform: Benefits and Trade-Offs in Cross-Platform Development

Kotlin Multiplatform (KMP) is emerging as an alternative for cross-platform development, offering a path to share code without sacrificing the performance and feel of a native application. KMP comes with its own set of trade-offs, which this article examines in depth. While it focuses primarily on Android and iOS, KMP can be used to build desktop, web, and server-side applications, all from the same shared logic.

Rachit Jain
on Sep 01, 2025

AI, ML & Data Engineering

MCP: the Universal Connector for Building Smarter, Modular AI Agents

In this article, the authors discuss Model Context Protocol (MCP), an open standard designed to connect AI agents with tools and data they need. They also talk about how MCP empowers agent development, and its adoption in leading open-source frameworks.

Sanjay Surendranath Girija, Lakshit Arora, Shashank Kapoor
on Aug 29, 2025

Architecture & Design

The Virtual Think Tank: Using LLMs to Gain a Multitude of Perspectives

The virtual think tank leverages LLMs to simulate diverse stakeholder and expert perspectives, enabling architects to explore trade-offs, challenge biases, and refine decisions. By prompting personas of real industry experts, the method fosters rich, contextual debates, offering a scalable, low-cost alternative to a traditional think tank.

Avraham Poupko
on Aug 28, 2025

Cloud

Ransomware-Resilient Storage: the New Frontline Defense in a High-Stakes Cyber Battle

Cybersecurity has evolved, with ransomware now primarily targeting data storage and backups. To combat this, modern defense strategies focus on making storage systems more resilient. Key tactics include using immutable storage that prevents data from being altered or deleted, employing AI-powered detection, and implementing air-gapping to create isolated, tamper-proof recovery points.

Arjun Mullick
on Aug 25, 2025

AI, ML & Data Engineering

The Missing Layer in AI Infrastructure: Aggregating Agentic Traffic

In this article, author Eyal Solomon discusses AI Gateways, the outbound proxy servers that intercept and manage AI-agent-initiated traffic in real time to enforce policies and provide central management.

Eyal Solomon
on Aug 22, 2025

Cloud

Zero-Downtime Critical Cloud Infrastructure Upgrades at Scale

Engineers can avoid common pitfalls in large-scale infrastructure upgrades by studying others' experiences. The article presents lessons learned at large companies such as eBay and Snowflake, offering solutions for legacy systems, performance validation, and rollback planning. It emphasizes systematic preparation and clear communication to handle challenges and ensure zero-downtime upgrades at scale.

Kiran Bhat
on Aug 18, 2025

Java

Infusing AI into Your Java applications

Equip yourself with the basic AI knowledge and skills you need to start building intelligent and responsive Enterprise Java applications. With the help of our simple chatbot application for booking interplanetary space trips, see how Java frameworks like LangChain4j with Quarkus make it easy and efficient to interact with LLMs and create satisfying applications for end-users.

Don Bourne, Michal Broz, Laura Cowen, Daniel Oh, Kevin Dubois
on Aug 15, 2025

Architecture & Design

One Network: Cloud-Agnostic Service and Policy-Oriented Network Architecture

Unifying software infrastructure shortens development time and makes large, distributed systems easier to control through clear policies. In this QCon SF 2024 presentation, Anna Berenberg shared lessons learned and achievements from building One Network, addressing complex infrastructure layers, open-source integration, and uniform policy enforcement for improved reliability and security.

Anna Berenberg
on Aug 12, 2025

Cloud

Sandbox as a Service: Building an Automated AWS Sandbox Framework

This article outlines an automated AWS Sandbox Framework to provide secure, cost-controlled environments for innovation. It leverages AWS services like Control Tower and open-source tools to automate provisioning, enforce security policies, manage resource lifecycles, and optimize costs through automated cleanup and governance.

Gaurav Mittal
on Aug 11, 2025

Architecture & Design

Keep the Terminal Relevant: Patterns for AI Agent Driven CLIs

Well-designed CLIs are crucial in the agentic AI era, serving both human users and autonomous agents with precision and reliability. Treat CLI output formats as stable API contracts and prioritize adoption of MCP for agent integration from day one.

Sriram Madapusi Vasudevan
on Aug 08, 2025

Cloud

Backend FinOps: Engineering Cost-Efficient Microservices in the Cloud

Backend FinOps integrates financial discipline into microservices, crucial for cutting cloud costs. Challenges such as resource fragmentation and cold starts underscore the need for intelligent design, effective language choice, robust tagging, and automation. Implementing FinOps via IaC, CI/CD checks, and dynamic autoscaling (e.g., Karpenter) ensures sustained efficiency.

Vivek Arora
on Aug 06, 2025
