references/joins.mdx
58 additions & 1 deletion
@@ -571,7 +571,64 @@ There are a few situations where Lightdash doesn't currently handle inflated met
- **Metrics that reference joined dimensions**
- **Complex sequential joins**, e.g. one-to-many, one-to-many, and then a many-to-one
- **Custom dimensions** sometimes cause duplicate rows where they shouldn't
- **Rolledup metrics** when used with joins that create fanouts ([see below](#rolledup-metrics))
- **Intentional fanouts** where the fanout is desired for business logic ([see below](#intentional-fanouts))

#### Rolledup metrics

Rolledup metrics are pre-aggregated metrics that have already been calculated at a specific granularity in your data warehouse. When these metrics are used in queries whose joins create fanouts, they become inflated because each pre-aggregated value is repeated on every fanned-out row.

##### Example: Orders and payments analysis

Consider a scenario where you have an orders table with `total_amount` and a payments table with payment methods. For each order, you can have multiple payments with different methods:

If you select `payment_method` and the average of the orders' `total_amount`, the results are wrong because `total_amount` is a rolledup metric that can't be split by payment method:

| payment_method | avg_order_total |
| :------------- | :-------------- |
| cash           | 100.00          |
| card           | 175.00          |

This is incorrect because:

- The cash average should reflect that cash was only used for part of order 1001 (\$100), but the query shows \$100 as if cash paid for the entire order.
- The card average shows \$175 (the average of \$100 and \$250), but this doesn't represent the actual relationship between card payments and order totals.

The issue is that `total_amount` is a rolledup metric at the order level. When orders are joined with payments, the value is duplicated across payment methods, which makes it impossible to correctly analyze the relationship between payment methods and order totals.
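
The duplication can be reproduced outside the warehouse with a minimal Python sketch. The order totals (\$100 for order 1001 and \$250 for a second order) are taken from the results above. The second order's ID `1002` and the exact payment rows are assumptions, chosen only so that the join reproduces the averages in the table.

```python
from collections import defaultdict

# Order-level (rolledup) totals. Order 1001 and the totals 100 and 250 come
# from the example; the ID 1002 is a hypothetical placeholder.
orders = {1001: 100.00, 1002: 250.00}

# Assumed payment rows: one row per (order_id, payment_method).
# Payment amounts are left out because only the shape of the join matters here.
payments = [(1001, "cash"), (1001, "card"), (1002, "card")]


def avg_order_total_by_method(orders, payments):
    """Join payments to orders, then average total_amount per payment method."""
    totals_by_method = defaultdict(list)
    for order_id, method in payments:
        # The order-level total is copied onto every payment row (the fanout).
        totals_by_method[method].append(orders[order_id])
    return {m: sum(v) / len(v) for m, v in totals_by_method.items()}


def joined_sum(orders, payments):
    """Sum total_amount over the joined rows; fanned-out orders count twice."""
    return sum(orders[order_id] for order_id, _ in payments)


result = avg_order_total_by_method(orders, payments)
print(result)                        # per-method averages over the joined rows
print(joined_sum(orders, payments))  # inflated sum over the joined rows
print(sum(orders.values()))          # sum over the orders table alone
```

In this sketch order 1001 contributes its full total to both the cash and the card group, so summing `total_amount` over the joined rows exceeds the sum taken from the orders table alone.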
610
+
611
+
##### Best practices for rolledup metrics
612
+
613
+
To avoid issues with rolledup metrics in joins that create fanouts:
614
+
615
+
**Avoid rolledup metrics when possible:** Instead of using pre-aggregated values, use the underlying detail-level data. For example, instead of using a pre-calculated `total_amount` at the order level, use individual payment amounts that can be properly aggregated.
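
As a rough illustration of aggregating detail-level rows instead, the Python sketch below sums individual payment amounts per method. All payment amounts and the order ID `1002` are hypothetical: they are chosen so that each order's payments add up to an order total of \$100 or \$250, and they do not come from the example data.

```python
from collections import defaultdict

# Hypothetical detail-level payment rows: (order_id, payment_method, amount).
# The amounts are illustrative assumptions only.
payments = [
    (1001, "cash", 40.00),
    (1001, "card", 60.00),
    (1002, "card", 250.00),
]


def amount_by_method(payments):
    """Sum payment amounts per method; each payment row is counted exactly once."""
    sums = defaultdict(float)
    for _order_id, method, amount in payments:
        sums[method] += amount
    return dict(sums)


by_method = amount_by_method(payments)
print(by_method)                # amounts split by payment method
print(sum(by_method.values()))  # grand total across all payments
```

Because every amount lives at the payment grain, splitting by payment method neither duplicates nor drops any value, and the per-method sums add up to the same grand total as the payments themselves.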

**Name rolledup metrics clearly:** If you must use rolledup metrics, give them descriptive names that indicate their pre-aggregated nature and limitations.

**Provide clear descriptions:** Always include detailed descriptions in your dbt model's YAML that explain the metric's granularity and any limitations when used with joins.

```yaml
models:
  - name: orders
    columns:
      - name: total_amount
        meta:
          metrics:
            total_order_amount:
              type: sum
              description: 'Pre-aggregated total amount per order. Cannot be meaningfully split by payment method or other transaction-level dimensions.'
```