File: references/joins.mdx
Lines changed: 121 additions & 6 deletions
@@ -3,6 +3,9 @@ title: "Joins reference"
3
3
description: "Joins let you connect different models to each other so that you can explore more than one model at the same time in Lightdash and see how different parts of your data relate to each other."
4
4
sidebarTitle: "Joins"
5
5
---
6
+
<Info>
7
+
**Performance Best Practice:** For optimal query performance, we recommend using wide tables wherever possible and minimising joins in the BI layer. While we offer advanced features like fanout protection to help with complex relationships, handling data transformations and complex logic directly in your SQL models will generally yield better performance than relying heavily on joins at query time. Consider pre-joining related data during your data modeling process rather than joining tables on-the-fly in dashboards and reports.
8
+
</Info>
6
9
7
10
## Adding joins in your models
8
11
@@ -155,7 +158,7 @@ A full join returns all rows when there is a match in either the left or right t
155
158
156
159
You can define the relationship between tables in your joins to help Lightdash show warnings and generate the appropriate SQL. This is especially useful for preventing the SQL fanout issues described in the [SQL fanouts](#sql-fanouts) section.
157
160
158
-
To define a relationship, add the `relationship` field to your join configuration:
161
+
To define a relationship, add the `relationship` field to your join configuration.
159
162
160
163
```yaml
161
164
models:
@@ -167,13 +170,125 @@ models:
167
170
sql_on: ${users.user_id} = ${orders.user_id}
168
171
relationship: one-to-many
169
172
```
173
+
<Warning>
174
+
Make sure that you consider the direction of the join when defining the relationship. If you incorrectly define the join relationship, your results will be affected by fanouts.
175
+
</Warning>
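To make the fanout risk concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` and `orders` tables mirror the join above, but the `credit` column and all of the sample rows are assumptions made up for illustration. It shows how summing a column from the "one" side of a one-to-many join inflates the total when nothing corrects for the duplicated rows.

```python
import sqlite3

# Hypothetical sample data: user 1 has three orders, user 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (user_id integer primary key, credit integer);
    create table orders (order_id integer primary key, user_id integer);
    insert into users values (1, 100), (2, 50);
    insert into orders values (10, 1), (11, 1), (12, 1), (13, 2);
""")

# Correct total: each user is counted once.
true_total = conn.execute("select sum(credit) from users").fetchone()[0]

# Naive total after a one-to-many join: each user row is repeated
# once per matching order, so its credit is summed multiple times.
fanned_out_total = conn.execute("""
    select sum(users.credit)
    from users
    left join orders on users.user_id = orders.user_id
""").fetchone()[0]

print(true_total)        # 150
print(fanned_out_total)  # 350
```

User 1's credit is counted three times (once per order), which is the kind of inflation a correctly declared `relationship` is meant to help guard against.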
176
+
177
+
##### The following join relationships are supported:
178
+
179
+
- `one-to-many` - Starting table has 1 record, joined table has many matches
179
+
- `many-to-one` - Starting table has many records, joined table has 1 match
180
+
- `one-to-one` - Starting table has 1 record, joined table has 1 match
181
+
- `many-to-many` - Multiple records in the starting table match multiple records in the joined table
183
+
184
+
<Accordion title="Helpful Steps for Determining Join Relationships">
185
+
#### Step 1: Identify your starting table
186
+
Which table are you joining FROM? Direction matters: `Accounts` joining to `Users` (one-to-many) is completely different from `Users` joining to `Accounts` (many-to-one), even though it's the same data.
187
+
188
+
#### Step 2: Count the expected matches and name the join relationship
189
+
For any record in your starting table, ask: "How many matching records will I find in the table I'm joining to and vice versa?" Refer to the supported join relationships listed above.
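One way to answer that question is to count matches per join key on each side. The sketch below uses Python's `sqlite3` with made-up `accounts` and `users` rows. The helpers `max_matches` and `infer_relationship` are hypothetical names, not part of Lightdash, and they assume a join on a single column with the same name in both tables. The result only reflects the data you sample, so treat it as a sanity check rather than proof.

```python
import sqlite3

# Hypothetical sample data: two accounts, three users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table accounts (account_id integer primary key);
    create table users (user_id integer primary key, account_id integer);
    insert into accounts values (1), (2);
    insert into users values (10, 1), (11, 1), (12, 2);
""")

def max_matches(conn, table, key):
    """Largest number of rows in `table` that share one non-null `key` value."""
    # Identifiers are interpolated directly: only pass trusted names.
    query = (
        f"select coalesce(max(n), 0) from ("
        f"select count(*) as n from {table} "
        f"where {key} is not null group by {key}) as counts"
    )
    return conn.execute(query).fetchone()[0]

def infer_relationship(conn, start_table, joined_table, key):
    """Name the relationship from start_table to joined_table on `key`."""
    start_side = "one" if max_matches(conn, start_table, key) <= 1 else "many"
    joined_side = "one" if max_matches(conn, joined_table, key) <= 1 else "many"
    return f"{start_side}-to-{joined_side}"

print(infer_relationship(conn, "accounts", "users", "account_id"))  # one-to-many
print(infer_relationship(conn, "users", "accounts", "account_id"))  # many-to-one
```

Swapping the starting table flips the answer, which is why Step 1 comes first.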
190
+
191
+
The examples below detail some more complex join relationships:
192
+
193
+
##### Chained Join Example
194
+
Don't try to figure out `Accounts` → `Users` → `Tracks` all at once. Analyze each join separately:
The above setup will consider both `Accounts` and `Users` as being susceptible to fanouts and these would be handled accordingly. When you chain two one-to-many relationships, you get a one-to-many relationship from your starting table to your final table (`Accounts` can have many `Tracks`).
219
+
220
+
Note that if you wanted to join `Users` and `Accounts` onto `Tracks`, with `Tracks` as the starting model, the direction of the relationships would look different:
221
+
222
+
The `tracks.yml` model would look like this:
223
+
```yaml
224
+
version: 2
225
+
models:
226
+
- name: tracks
227
+
meta:
228
+
primary_key: account_id
229
+
description: List of all customer and prospective customer Accounts pulled from our CRM
We want to see all Accounts and all Deals, but we only want to see Users (and their associated event tracks) for accounts that have at least one Deal in the 'Won' stage.
242
+
243
+
This requires a complex join that involves 4 different tables.
244
+
245
+
• First: `Accounts` → `Deals` (one-to-many)
245
+
• Next: `Accounts` and `Deals` → `Users` (many-to-many) - each `Account` + `Deal` combination can be associated with many `Users`, and each `User` can be associated with multiple `Deals`.
246
+
• Then: `Users` → `Tracks` (one-to-many)
248
+
249
+
A normal SQL join that does not account for fanouts would look like this:
250
+
```sql
251
+
select
252
+
*
253
+
from
254
+
accounts
255
+
left join deals on
256
+
accounts.account_id = deals.account_id
257
+
left join users on
258
+
accounts.account_id = users.account_id and deals.stage = 'Won'
259
+
left join tracks on
260
+
users.user_id = tracks.user_id
261
+
```
262
+
263
+
And the `accounts.yml` would look like this:
264
+
```yaml
265
+
models:
266
+
- name: accounts
267
+
meta:
268
+
primary_key: account_id
269
+
description: List of all customer and prospective customer Accounts pulled from our CRM
sql_on: ${accounts.account_id} = ${users.account_id} and ${deals.stage} = 'Won'
278
+
type: left
279
+
- join: tracks
280
+
relationship: one-to-many
281
+
sql_on: ${users.user_id} = ${tracks.user_id}
282
+
type: left
283
+
```
284
+
In this case, the fanout protection logic will consider metrics from all models to be susceptible to fanouts.
170
285
171
-
Supported values:
286
+
#### Step 3: Check for conditional joins
287
+
Look for any AND conditions in your join logic (like `and ${deals.stage} = 'Won'`). These can change your relationship from what you'd expect - a typical one-to-many might become many-to-many when you add conditions.
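As a rough illustration of this effect, the sketch below runs the same join shape as the `deals.stage = 'Won'` example in Python's `sqlite3`, using made-up rows (one account, two 'Won' deals, one 'Lost' deal, two users). On its own, `Accounts` to `Users` is one-to-many, but once the `Users` join also depends on `Deals`, each user is matched once per 'Won' deal.

```python
import sqlite3

# Hypothetical sample data for a single account.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table accounts (account_id integer primary key);
    create table deals (deal_id integer primary key, account_id integer, stage text);
    create table users (user_id integer primary key, account_id integer);
    insert into accounts values (1);
    insert into deals values (1, 1, 'Won'), (2, 1, 'Won'), (3, 1, 'Lost');
    insert into users values (100, 1), (101, 1);
""")

# Count how many joined rows each user ends up in.
rows_per_user = dict(conn.execute("""
    select users.user_id, count(*)
    from accounts
    left join deals on accounts.account_id = deals.account_id
    left join users on accounts.account_id = users.account_id
        and deals.stage = 'Won'
    where users.user_id is not null
    group by users.user_id
""").fetchall())

print(sorted(rows_per_user.items()))  # [(100, 2), (101, 2)]
```

Each user appears twice, once for every 'Won' deal on the account, so the join onto `Users` behaves as many-to-many here rather than one-to-many.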
172
288
173
-
- `one-to-many`
174
-
- `many-to-one`
175
-
- `one-to-one`
176
-
- `many-to-many`
289
+
#### Step 4: Validate with sample data
290
+
Pick one record from your starting table and manually trace through the joins. Count how many final records you get - this helps catch relationship mistakes before they cause problems.
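The manual trace can also be scripted. The following sketch, again using Python's `sqlite3` with made-up rows, follows one account through the chained `Accounts` → `Users` → `Tracks` join and reports how many users it matches and how many final rows it produces. `trace_account` is a hypothetical helper written for this example, not a Lightdash feature.

```python
import sqlite3

# Hypothetical sample data: account 1 has two users with 3 and 2 tracks;
# account 2 has no users at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table accounts (account_id integer primary key);
    create table users (user_id integer primary key, account_id integer);
    create table tracks (track_id integer primary key, user_id integer);
    insert into accounts values (1), (2);
    insert into users values (10, 1), (11, 1);
    insert into tracks values (100, 10), (101, 10), (102, 10), (103, 11), (104, 11);
""")

def trace_account(conn, account_id):
    """Return (matching users, final joined rows) for one starting record."""
    users_matched = conn.execute(
        "select count(*) from users where account_id = ?", (account_id,)
    ).fetchone()[0]
    final_rows = conn.execute("""
        select count(*)
        from accounts
        left join users on accounts.account_id = users.account_id
        left join tracks on users.user_id = tracks.user_id
        where accounts.account_id = ?
    """, (account_id,)).fetchone()[0]
    return users_matched, final_rows

print(trace_account(conn, 1))  # (2, 5)
print(trace_account(conn, 2))  # (0, 1)
```

One account row turning into five joined rows confirms the one-to-many chain. The account with no users still yields a single row because both joins are left joins.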