Google Propeller Squeezes Extra Performance from Large-Scale LLVM Binaries
Mar 23, 2020 2 min read
Google Propeller is able to improve the performance of LLVM binaries by relinking and optimizing them based on a profile of their behaviour at runtime. Propeller can bring 2-9% improvements on key performance benchmarks for binaries that were previously highly optimized by LLVM, say Google engineers.
Propeller's aim is to create a new binary from the one you profiled, after applying a number of transformations to its layout. In short, to take advantage of Propeller you compile your program with a specific flag, run it to gather metrics about its behaviour and performance at runtime, and then feed that data into Propeller, which transforms the binary layout to extract the maximum performance out of it. This is what is called profile-guided optimization, not to be confused with dynamic optimization, such as that provided by Dynamo and similar tools, which optimize a system at runtime.
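The sketch below illustrates what such a profile-then-relink cycle could look like in practice, driven from a small Python script. The flag and tool names (-fbasic-block-sections, perf, create_llvm_prof) follow those described in the Propeller RFC, but the exact spellings, options, and file names here are assumptions rather than a verified recipe:

```python
# Hypothetical sketch of a Propeller-style build/profile/rebuild cycle.
# Flag and tool names follow the Propeller RFC; exact options are assumptions.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Build with basic-block labels so profile samples can later be
#    mapped back to individual basic blocks.
run(["clang++", "-O2", "-fbasic-block-sections=labels", "-o", "app", "app.cc"])

# 2. Run the binary under perf, sampling last-branch records (LBR).
run(["perf", "record", "-e", "cycles:u", "-j", "any,u",
     "-o", "perf.data", "--", "./app"])

# 3. Convert the raw samples into Propeller layout directives.
run(["create_llvm_prof", "--format=propeller", "--binary=app",
     "--profile=perf.data", "--out=cluster.txt"])

# 4. Rebuild, letting compiler and linker reorder basic blocks
#    according to the profile-derived cluster file.
run(["clang++", "-O2", "-fbasic-block-sections=list=cluster.txt",
     "-o", "app.propeller", "app.cc"])
```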
Initially announced on the LLVM-dev mailing list in September 2019, Propeller is inspired by Facebook BOLT, another recent LLVM-based profile-guided relinker, but it uses a different approach that lends itself to distributed build systems and scales better with binary size.
In particular, Google engineers found that while BOLT grants significant runtime performance gains, its monolithic, single-step architecture causes memory and time requirements to explode for binaries with a ~300MB segment size. While 300MB binaries are admittedly large and few developers may need to optimize systems of that size, Propeller seems to provide a couple of additional benefits over BOLT. Specifically, Propeller is designed around a two-step optimization process, where the first step can be distributed across parallel workers, as the sketch below illustrates. In contrast, BOLT is implemented as a single process that works directly on the input binary in one go.
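As a purely conceptual illustration (this is not Propeller's implementation), the following Python sketch shows why a two-step design scales: the per-unit layout work fans out across workers, while the final link-time step runs once on the pre-computed results:

```python
# Conceptual illustration only: step one is embarrassingly parallel per
# compilation unit; step two is a single, cheap link-time merge.
from concurrent.futures import ProcessPoolExecutor

def optimize_unit(item):
    """Compute a basic-block ordering for one object file from its slice
    of the whole-program profile (placeholder hot-first heuristic)."""
    unit, blocks = item
    return unit, sorted(blocks, key=lambda b: -b["count"])

def link(layouts):
    """Single sequential step: merge per-unit layouts into one
    whole-binary section order (placeholder logic)."""
    ordered = sorted(layouts, key=lambda t: t[0])
    return [b["name"] for _, blocks in ordered for b in blocks]

# Toy per-unit profile data standing in for real perf samples.
profiles = {
    "a.o": [{"name": "a.cold", "count": 3}, {"name": "a.hot", "count": 900}],
    "b.o": [{"name": "b.cold", "count": 1}, {"name": "b.hot", "count": 500}],
}

if __name__ == "__main__":
    # Step 1 fans out across workers (or a distributed build farm).
    with ProcessPoolExecutor() as pool:
        layouts = list(pool.map(optimize_unit, profiles.items()))
    # Step 2 runs once on the already-computed layouts.
    print(link(layouts))  # ['a.hot', 'a.cold', 'b.hot', 'b.cold']
```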
Additionally, Propeller should cope much better than BOLT with incremental builds: when small source changes do not modify the profile information significantly, Propeller may do a better job at identifying those parts of the binary that need to be re-processed.
To put things in perspective, LLVM, like other major compilers, is already able to do both link-time optimization (LTO) and profile-guided optimization (PGO). What tools like BOLT or Propeller do to further improve on LLVM's optimizations is make a more aggressive use of profiled data, favouring exact profile data over approximations.
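For reference, a conventional instrumentation-based PGO cycle with LLVM looks roughly like the sketch below. The clang and llvm-profdata flags shown are standard, while the file names and workload arguments are illustrative assumptions:

```python
# Baseline LLVM instrumentation-based PGO (plus ThinLTO), for comparison
# with post-link tools such as BOLT and Propeller.
import subprocess

def run(cmd):
    subprocess.run(cmd, check=True)

# 1. Instrumented build: the binary emits profile counters when it exits.
run(["clang++", "-O2", "-fprofile-instr-generate", "-o", "app", "app.cc"])

# 2. Exercise the binary on a representative workload (hypothetical flag).
run(["./app", "--bench"])

# 3. Merge the raw counter file into an indexed profile.
run(["llvm-profdata", "merge", "-output=app.profdata", "default.profraw"])

# 4. Optimized rebuild guided by the counter profile, with ThinLTO
#    enabled for cross-module optimization.
run(["clang++", "-O2", "-flto=thin", "-fprofile-instr-use=app.profdata",
     "-o", "app.pgo", "app.cc"])
```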
If you are interested in the details, Google has made available an RFC thoroughly documenting how Propeller works (PDF) and how it can be used, starting with pulling its source code from GitHub and compiling it.
EDITORIAL NOTE: In a first version of this article, we incorrectly overstated BOLT's inability to cope with code changes, stating that even minor changes to the profile would invalidate it altogether and require the whole optimization process to start over. Although this statement was based on information from Google's Propeller RFC at the time of this writing, it was inaccurate. BOLT is indeed able to deal with code changes to some extent, and the original sentence was therefore removed. Thanks to Google engineer Sriraman Tallam for this clarification.