AI, ML & Data Engineering

Databricks Open Sources Delta Lake to Make Data Lakes More Reliable

May 20, 2019 1 min read

Alex Giamas

Databricks recently announced that it is open sourcing Delta Lake, its proprietary storage layer, to bring ACID transactions to Apache Spark and big data workloads. Databricks was founded by the creators of Apache Spark, and Delta Lake is already in use at several companies, including McGraw Hill, McAfee, Upwork and Booz Allen Hamilton.

Delta Lake addresses the heterogeneous data problem that data lakes often have. When data is ingested from multiple pipelines, engineers have to enforce data integrity manually across all data sources. Delta Lake brings ACID transactions to the data lake and applies the strongest isolation level, serializability.
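To make the idea of serializable writes concrete, here is a minimal sketch of optimistic concurrency over an ordered commit log. It is an assumption made for illustration, not a description taken from the announcement: the names `TransactionLog`, `CommitConflict`, and the change strings are hypothetical, and the toy keeps everything in memory.

```python
import threading


class CommitConflict(Exception):
    """Raised when another writer committed first."""


class TransactionLog:
    """Toy ordered log: a commit succeeds only against the version it read."""

    def __init__(self):
        self._entries = []
        self._lock = threading.Lock()

    @property
    def version(self):
        """Number of commits recorded so far."""
        return len(self._entries)

    def commit(self, read_version, change):
        """Append `change` atomically if nobody committed since `read_version`."""
        with self._lock:
            if read_version != len(self._entries):
                raise CommitConflict(
                    f"read version {read_version}, log is at {len(self._entries)}"
                )
            self._entries.append(change)
            return len(self._entries)


log = TransactionLog()
seen_by_a = log.version  # writer A reads the log at version 0
seen_by_b = log.version  # writer B reads the log at version 0 as well

log.commit(seen_by_a, "A: add file part-0")  # A commits first

try:
    log.commit(seen_by_b, "B: add file part-1")  # B's view is stale
    b_committed = True
except CommitConflict:
    b_committed = False  # B has to re-read the log and retry
```

Because every successful commit lands at exactly one position in the log, the result is equivalent to the writers having run one after another, which is what serializability asks for.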

Delta Lake provides time travel, the ability to retrieve any earlier version of a file, which is useful for GDPR and other audit-related requests. File metadata is stored using exactly the same process as the data, enabling the same level of processing and feature richness.
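The time travel idea can be illustrated with a small, self-contained sketch. This is a toy model under stated assumptions, not Delta Lake's API: `VersionedTable`, `commit`, and `read` are hypothetical names, and the table only supports appends.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedTable:
    """Toy append-only commit log; each commit produces a new table version."""

    _commits: list = field(default_factory=list)

    def commit(self, rows):
        """Record a batch of appended rows and return the new version number."""
        self._commits.append(list(rows))
        return len(self._commits) - 1

    def read(self, version=None):
        """Replay commits up to `version` (the latest when omitted)."""
        if not self._commits:
            return []
        latest = len(self._commits) - 1
        if version is None:
            version = latest
        if not 0 <= version <= latest:
            raise ValueError(f"unknown version {version}")
        rows = []
        for batch in self._commits[: version + 1]:
            rows.extend(batch)
        return rows


table = VersionedTable()
v0 = table.commit([{"id": 1, "name": "ada"}])
v1 = table.commit([{"id": 2, "name": "grace"}])

snapshot_v0 = table.read(version=v0)  # state as of the first commit
snapshot_latest = table.read()        # current state
```

Since earlier commits are never overwritten, an auditor can ask what the table looked like at any past version without needing a separate backup.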

Delta Lake provides schema enforcement. Data types and the presence of fields can be checked and enforced, which helps keep the data clean. Schema changes, on the other hand, don’t require DDL and can be applied automatically.
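As a rough illustration of what schema enforcement on write means, the sketch below checks field presence and types before accepting a batch. It is a simplified stand-in rather than Delta Lake's implementation: `EXPECTED_SCHEMA`, `SchemaMismatch`, `validate`, and `write_batch` are hypothetical names, and automatic schema changes are not modeled.

```python
EXPECTED_SCHEMA = {"id": int, "name": str}  # hypothetical column -> type mapping


class SchemaMismatch(Exception):
    """Raised when a record does not match the expected schema."""


def validate(record, schema=EXPECTED_SCHEMA):
    """Return the record unchanged if it matches the schema, else raise."""
    missing = schema.keys() - record.keys()
    if missing:
        raise SchemaMismatch(f"missing fields: {sorted(missing)}")
    extra = record.keys() - schema.keys()
    if extra:
        raise SchemaMismatch(f"unexpected fields: {sorted(extra)}")
    for column, expected_type in schema.items():
        if not isinstance(record[column], expected_type):
            raise SchemaMismatch(
                f"{column}: expected {expected_type.__name__}, "
                f"got {type(record[column]).__name__}"
            )
    return record


def write_batch(table, batch, schema=EXPECTED_SCHEMA):
    """Validate the whole batch first, so one bad record rejects the entire write."""
    validated = [validate(record, schema) for record in batch]
    table.extend(validated)
    return len(validated)
```

Validating the full batch before touching the table means a malformed record from one pipeline cannot leave a half-written batch behind.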

Delta Lake is deployed on top of the existing data lake. It is compatible with both batch and streaming data and can be plugged into an existing Spark job as a new data source. Data is stored in the familiar Apache Parquet format.

Delta Lake is also compatible with MLflow, Databricks' newest open source platform, which was launched last year. The code is available on GitHub.
