Inside Uber’s Query Architecture: Simplifying Layers and Improving Observability

Nov 06, 2025

Uber has redesigned its Apache Pinot query architecture to simplify execution, support richer SQL, and improve predictability for internal analytics workloads. The previous system, Neutrino, which layered Presto on top of Pinot, is being replaced by Cellar, a lightweight proxy that relies on Pinot’s Multi-Stage Engine (MSE) Lite Mode. The redesign aims to reduce complexity, enforce execution limits, and provide stronger isolation between tenants.

Previously, Neutrino ran as a stateless microservice combining Presto coordinator and worker processes. User-submitted PrestoSQL queries were partially pushed down to Pinot as PinotSQL, while the remaining query logic executed within Neutrino. Each query included default or user-defined limits to reduce the risk of full-table scans. Despite these safeguards, the layered architecture created complex semantics, made query plans harder to interpret, and limited isolation for tenants sharing the same proxy.

Uber's Neutrino query architecture (Source: Uber Blog Post)
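The default-limit safeguard described above can be illustrated with a minimal sketch. This is not Neutrino's implementation: the function name, the default value, and the regex-based check are all assumptions made for illustration, and a real proxy would parse the SQL rather than pattern-match it.

```python
import re

# Hypothetical default; the article does not state the value Uber used.
DEFAULT_LIMIT = 10_000

# Crude check for a trailing "LIMIT n" clause. It ignores subqueries,
# comments, and OFFSET syntax; a real proxy would use a SQL parser.
_TRAILING_LIMIT = re.compile(r"\blimit\s+\d+\s*$", re.IGNORECASE)


def with_default_limit(sql: str, default_limit: int = DEFAULT_LIMIT) -> str:
    """Return the query unchanged if it already ends with LIMIT n,
    otherwise append a default limit to cap the number of rows returned."""
    stripped = sql.strip().rstrip(";").rstrip()
    if _TRAILING_LIMIT.search(stripped):
        return stripped
    return f"{stripped} LIMIT {default_limit}"
```

A user-defined limit is kept as written, while a query without one gets the default appended before it is pushed down.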

Uber’s Apache Pinot tables can reach hundreds of terabytes and billions of records, with query rates ranging from single digits to thousands of QPS. At this scale, multi-stage queries can easily exceed resource budgets or latency expectations. Pinot 1.4 introduces the Multi-Stage Engine Lite Mode, which enforces configurable leaf-stage record limits and uses a scatter-gather pattern: leaf stages run on Pinot servers while the remaining operators execute on brokers, which keeps performance predictable for complex queries.
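To make the scatter-gather pattern with a leaf-stage record limit concrete, here is a toy, in-memory sketch. It does not use Pinot's code or APIs: `leaf_stage`, `broker_query`, and the row layout are hypothetical names chosen for illustration, and the "servers" are plain Python lists queried sequentially.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def leaf_stage(rows: List[Row], predicate: Callable[[Row], bool],
               leaf_limit: int) -> List[Row]:
    """Simulates the leaf stage on one server: filter local rows and
    stop as soon as leaf_limit matching rows have been collected."""
    out: List[Row] = []
    for row in rows:
        if len(out) >= leaf_limit:
            break
        if predicate(row):
            out.append(row)
    return out


def broker_query(servers: List[List[Row]], predicate: Callable[[Row], bool],
                 group_key: str, leaf_limit: int) -> Tuple[Dict[object, int], int]:
    """Simulates the broker: scatter the leaf stage to every server,
    gather the bounded outputs, then run the aggregation locally."""
    gathered: List[Row] = []
    for local_rows in servers:  # scatter (sequential in this toy model)
        gathered.extend(leaf_stage(local_rows, predicate, leaf_limit))
    counts = Counter(row[group_key] for row in gathered)  # broker-side GROUP BY
    return dict(counts), len(gathered)
```

In this toy model the broker never receives more than `leaf_limit` rows per server, which bounds the work of the later stages. The trade-off is that a query hitting the limit aggregates over a truncated input, which is why it helps to have the limit visible to the user.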

The new architecture introduces Cellar, a lightweight proxy that forwards queries directly to Pinot brokers. For basic workloads, Pinot's single-stage query engine handles execution, and for advanced SQL features, Uber uses the Multi-Stage Engine in Lite Mode. MSE Lite Mode enforces configurable maximum record limits at the leaf stage to prevent excessive resource usage and surfaces these limits in the explain plan for transparency. Scatter-gather execution remains, with leaf stages on data nodes and aggregation on brokers, while supporting joins and window functions under controlled conditions. Uber also added monitoring and logging enhancements to MSE Lite Mode, enabling engineering teams to track query performance and troubleshoot high-latency requests more efficiently.

High-level Cellar query architecture (Source: Uber Blog Post)

Cellar also includes a direct-connection mode that allows tenants to bypass the proxy and connect directly to Pinot brokers. Uber has also integrated a time series plugin supporting M3QL through Cellar. The rebuilt architecture powers internal analytics workloads such as tracing, log search, and segmentation. As of publication, Cellar handles roughly 20% of Neutrino's prior query volume, with plans to expand adoption and phase out Neutrino.

Cellar direct mode connection for complete isolation (Source: Uber Blog Post)

Uber also provides official client libraries for its Java and Go monorepos to simplify interaction with Cellar. The clients handle Pinot’s response format, support partial results with warnings, enforce timeouts and retries, and emit metrics for latency, query success, and warnings. A Grafana dashboard gives new users operational visibility out of the box.
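The client behaviors listed above (timeouts, retries, partial results with warnings, and metrics) can be sketched as follows. Uber's clients are written in Java and Go and are not public, so this Python sketch is purely illustrative: `SketchClient`, `QueryResult`, the transport signature, and the response keys `rows` and `warnings` are assumptions, not Cellar's or Pinot's actual interface.

```python
import time
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical transport: (sql, timeout_seconds) -> response dict with
# "rows" and an optional "warnings" list. Not Pinot's real wire format.
Transport = Callable[[str, float], Dict[str, object]]


@dataclass
class QueryResult:
    rows: List[list]
    warnings: List[str]

    @property
    def partial(self) -> bool:
        # Assumption: any warning means the result may be incomplete.
        return bool(self.warnings)


@dataclass
class SketchClient:
    transport: Transport
    timeout_s: float = 1.0
    max_attempts: int = 3
    metrics: Counter = field(default_factory=Counter)
    latencies_s: List[float] = field(default_factory=list)

    def query(self, sql: str) -> QueryResult:
        last_error = None
        for _ in range(self.max_attempts):
            started = time.monotonic()
            try:
                response = self.transport(sql, self.timeout_s)
            except TimeoutError as exc:
                self.metrics["timeout"] += 1
                last_error = exc
                continue  # retry on timeout
            finally:
                # Record latency for every attempt, successful or not.
                self.latencies_s.append(time.monotonic() - started)
            warnings = list(response.get("warnings", []))
            self.metrics["success"] += 1
            self.metrics["warning"] += len(warnings)
            # Return rows together with warnings instead of raising.
            return QueryResult(rows=list(response.get("rows", [])), warnings=warnings)
        self.metrics["failure"] += 1
        raise RuntimeError(f"query failed after {self.max_attempts} attempts") from last_error
```

Returning warnings alongside rows lets the caller decide whether a partial answer is acceptable, while the counters and latency samples are the kind of data a dashboard could be built on.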

According to Uber’s engineering team, the redesign reflects the evolution of OLAP systems to support high QPS and sub-second latencies while maintaining isolation and predictability. They plan to release MSE Lite Mode to users later this year and to continue improving it.

About the Author

Leela Kumili
