description: "Joins let you connect different models to each other so that you can explore more than one model at the same time in Lightdash and see how different parts of your data relate to each other."
sidebarTitle: "Joins"
---
<Info>
**Performance Best Practice:** For optimal query performance, we recommend using wide tables wherever possible and minimising joins in the BI layer. While we offer advanced features like fanout protection to help with complex relationships, handling data transformations and complex logic directly in your SQL models will generally yield better performance than relying heavily on joins at query time. Consider pre-joining related data during your data modeling process rather than joining tables on-the-fly in dashboards and reports.
</Info>
## Adding joins in your models
@@ -155,7 +158,7 @@ A full join returns all rows when there is a match in either the left or right t
You can define the relationship between tables in your joins to help Lightdash show warnings and generate the appropriate SQL. This is especially useful for preventing the SQL fanout issues described in the [SQL fanouts](#sql-fanouts) section.
To define a relationship, add the `relationship` field to your join configuration.
```yaml
models:
@@ -167,13 +170,125 @@ models:
sql_on: ${users.user_id} = ${orders.user_id}
relationship: one-to-many
```
<Warning>
Make sure that you consider the direction of the join when defining the relationship. If you define the join relationship incorrectly, your results will be affected by fanouts.
</Warning>
##### The following join relationships are supported:
- `one-to-many` - Starting table has 1 record, joined table has many matches
- `many-to-one` - Starting table has many records, joined table has 1 match
- `one-to-one` - Starting table has 1 record, joined table has 1 match
- `many-to-many` - Multiple records in the starting table match multiple records in the joined table
<Accordion title="Helpful Steps for Determining Join Relationships">
#### Step 1: Identify your starting table
Which table are you joining FROM? Direction matters: `Accounts` joining to `Users` (one-to-many) is completely different from `Users` joining to `Accounts` (many-to-one), even though it's the same data.
#### Step 2: Count the expected matches and name the join relationship
For any record in your starting table, ask: "How many matching records will I find in the table I'm joining to?" Then ask the same question in the reverse direction. Refer to the supported join relationships listed above.
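One way to answer that question is to check whether the join key is unique on each side. The query below is a rough sketch, assuming the `accounts` and `users` tables are joined on `account_id` as in the examples on this page. It is something you would run yourself against your warehouse, not something Lightdash runs for you.

```sql
-- Compare the number of non-null join keys with the number of distinct keys.
-- If the two counts match, the key is unique on that side of the join.
select
  'accounts' as table_name,
  count(account_id) as key_count,
  count(distinct account_id) as distinct_keys
from accounts
union all
select
  'users' as table_name,
  count(account_id) as key_count,
  count(distinct account_id) as distinct_keys
from users;
```

If the key is unique in `accounts` but repeated in `users`, then `Accounts` → `Users` is one-to-many. Unique on both sides suggests one-to-one, and repeated on both sides suggests many-to-many.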
The examples below detail some more complex join relationships:
##### Chained Join Example
Don't try to figure out `Accounts` → `Users` → `Tracks` all at once. Analyze each join separately:
The above setup will consider both `Accounts` and `Users` as being susceptible to fanouts and these would be handled accordingly. When you chain two one-to-many relationships, you get a one-to-many relationship from your starting table to your final table (`Accounts` can have many `Tracks`).
Note that if you wanted to join `Users` and `Accounts` onto `Tracks`, with `Tracks` as the starting model, the direction of the relationships would look different:
The `tracks.yml` model would look like this:
```yaml
version: 2
models:
- name: tracks
meta:
primary_key: account_id
description: List of all customer and prospective customer Accounts pulled from our CRM
```
We want to see all Accounts and all Deals, but we only want to see Users (and their associated event tracks) for accounts that have at least one Deal in the 'Won' stage.
This requires a complex join that involves 4 different tables.
• First: `Accounts` → `Deals` (one-to-many)
• Next: `Accounts` and `Deals` → `Users` (many-to-many) - each `Account` + `Deal` combination can be associated with many `Users`, and each `User` can be associated with multiple `Deals`.
• Then: `Users` → `Tracks` (one-to-many)
A normal SQL join that does not account for fanouts would look like this:
```sql
select
  *
from accounts
left join deals
  on accounts.account_id = deals.account_id
left join users
  on accounts.account_id = users.account_id
  and deals.stage = 'Won'
left join tracks
  on users.user_id = tracks.user_id
```
And the `accounts.yml` would look like this:
```yaml
models:
- name: accounts
meta:
primary_key: account_id
description: List of all customer and prospective customer Accounts pulled from our CRM
sql_on: ${accounts.account_id} = ${users.account_id} and ${deals.stage} = 'Won'
type: left
- join: tracks
relationship: one-to-many
sql_on: ${users.user_id} = ${tracks.user_id}
type: left
```
In this case, the fanout protection logic will consider metrics from all models to be susceptible to fanouts.
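To see why an unprotected metric is susceptible, here is a small self-contained sketch with made-up data. The `seats` column and all values are hypothetical, and the `with ... as (values ...)` syntax is PostgreSQL/SQLite style. The query only illustrates the inflation; it does not show how Lightdash's fanout protection is implemented.

```sql
-- Hypothetical sample data: account 1 has two users, account 2 has one.
with accounts (account_id, seats) as (
  values (1, 10), (2, 5)
),
users (user_id, account_id) as (
  values (101, 1), (102, 1), (103, 2)
)
select
  -- Sum over the accounts table alone: 10 + 5 = 15
  (select sum(seats) from accounts) as true_seats,
  -- Sum after the one-to-many join: account 1 is repeated, 10 + 10 + 5 = 25
  sum(accounts.seats) as seats_after_join
from accounts
left join users
  on accounts.account_id = users.account_id;
```

The joined total is larger because each account row is repeated once per matching user before the sum is taken.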
#### Step 3: Check for conditional joins
Look for any AND conditions in your join logic (like `and ${deals.stage} = 'Won'`). These can change your relationship from what you'd expect - a typical one-to-many might become many-to-many when you add conditions.
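To check whether a condition like `${deals.stage} = 'Won'` changes the relationship, you can count how many rows on the conditioned table satisfy it per join key. This is a minimal sketch, assuming the `deals` table with the `account_id` and `stage` columns used in the example above.

```sql
-- Accounts with more than one 'Won' deal.
-- With the conditional join above, each user of such an account
-- is repeated once per 'Won' deal.
select
  account_id,
  count(*) as won_deals
from deals
where stage = 'Won'
group by account_id
having count(*) > 1;
```

If this returns any rows, the join to `Users` behaves as many-to-many rather than one-to-many.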
#### Step 4: Validate with sample data
Pick one record from your starting table and manually trace through the joins. Count how many final records you get - this helps catch relationship mistakes before they cause problems.
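The tracing step can be sketched as a query that follows one starting record through the chained joins. It assumes the `accounts`, `users`, and `tracks` tables and join keys from the chained join example; the account id `1` is a placeholder to replace with a real value from your data.

```sql
-- Follow a single account through Accounts -> Users -> Tracks.
select
  accounts.account_id,
  count(distinct users.user_id) as distinct_users,
  count(*) as joined_rows
from accounts
left join users
  on accounts.account_id = users.account_id
left join tracks
  on users.user_id = tracks.user_id
where accounts.account_id = 1  -- placeholder sample id
group by accounts.account_id;
```

If `joined_rows` is greater than 1, the starting record is repeated by the joins. If `joined_rows` is also greater than `distinct_users`, the second join multiplies rows as well.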
</Accordion>
## Always join a table