I was recently querying our internal database inventory tool for a list of servers, instances and databases, and was adding the corresponding status to each server, instance and database.
Relationship Diagram
Server ˂-- 1 : n --˃ Instance ˂-- 1 : n --˃ Database
˄ ˄ ˄
| | |
| 1 : 1 |
| | |
| ˅ |
+-- 1 : 1 --˃ Status ˂-- 1 : 1 --+
Read as:
...a server can have multiple instances
...an instance can have multiple databases
...a server, an instance and a database can have a status
Setup
Status Table
CREATE TABLE [Status]
(
StatusID int,
StatusName char(20),
);
Status Data
INSERT INTO [Status] (StatusID, StatusName)
VALUES
(1,'Productive'),
(2,'Prod ACC'),
(3,'Prod APP'),
(4,'Test'),
(5,'Test ACC'),
(6,'Test APP'),
(7,'OFFLINE'),
(8,'Reserved'),
(9,'Decommisioned');
Server Table
CREATE TABLE [Server]
(
ServerID int,
ServerName char(20),
ServerStatusID int
);
Server Data
INSERT INTO [Server] (ServerID, ServerName, ServerStatusID)
VALUES
(1,'FirstServer',1),
(2,'SecondServer',2),
(3,'ThirdServer',5),
(4,'FourthServer',8),
(5,'FifthServer',8);
Instance Table
CREATE TABLE [Instance]
(
InstanceID int,
ServerID int,
InstanceName char(30),
InstanceStatusID int
);
Instance Data
INSERT INTO [Instance]
(InstanceID, ServerID, InstanceName, InstanceStatusID)
VALUES
(1,1,'GENERAL',1),
(2,1,'TAXES',1),
(3,2,'GENERAL',9),
(4,2,'SOCIAL',2),
(5,3,'GENERAL',5),
(6,3,'FBI',8),
(7,5,'COMINGSOON',8);
Database Table
CREATE TABLE [Database]
(
DatabaseID int,
InstanceID int,
DatabaseName char(30),
DatabaseStatusID int
);
Database Data
INSERT INTO [Database]
(DatabaseID, InstanceID, DatabaseName, DatabaseStatusID)
VALUES
(1,1,'master',1),
(2,1,'model',1),
(3,1,'msdb',1),
(4,1,'UserDB1',1),
(5,2,'master',1),
(6,2,'model',1),
(7,2,'msdb',1),
(8,2,'TaxesDB',1),
(9,4,'master',2),
(10,4,'model',2),
(11,4,'msdb',2),
(12,4,'HealthCareDB',2),
(13,5,'master',5),
(14,5,'model',5),
(15,5,'msdb',5),
(16,5,'GeneralUserDB',5),
(17,6,'master',8),
(18,6,'model',8),
(19,6,'msdb',8),
(20,6,'CriminalDB',8);
SELECT Statement without Status table involved
The initial SELECT statement involved simply joining the three tables: server, instance, database and was as follows:
-- Simple SELECT to get all information on Servers, Instances and Databases
-- The status of the server, instance or database is not returned
SELECT
ServerName,
InstanceName,
DatabaseName
FROM [Server] as srv
LEFT JOIN [Instance] as ins
ON srv.ServerID = ins.ServerID
LEFT JOIN [Database] as dbs
ON ins.InstanceID = dbs.InstanceID;
Results of 1. Statement
PLEASE OBSERVE THAT...
- there is a server without an instance and database
- there is an instances without a database
ServerName | InstanceName | DatabaseName |
---|---|---|
FirstServer | GENERAL | master |
FirstServer | GENERAL | model |
FirstServer | GENERAL | msdb |
FirstServer | GENERAL | UserDB1 |
FirstServer | TAXES | master |
FirstServer | TAXES | model |
FirstServer | TAXES | msdb |
FirstServer | TAXES | TaxesDB |
SecondServer | GENERAL | null |
SecondServer | SOCIAL | master |
SecondServer | SOCIAL | model |
SecondServer | SOCIAL | msdb |
SecondServer | SOCIAL | HealthCareDB |
ThirdServer | GENERAL | master |
ThirdServer | GENERAL | model |
ThirdServer | GENERAL | msdb |
ThirdServer | GENERAL | GeneralUserDB |
ThirdServer | FBI | master |
ThirdServer | FBI | model |
ThirdServer | FBI | msdb |
ThirdServer | FBI | CriminalDB |
FourthServer | null | null |
FifthServer | COMINGSOON | null |
SELECT Statement involving Status table
In the next statement I decide to add the status to each element (server, instance, database) and JOIN
ed each table with the Status
table as follows:
-- Advanced SELECT to get all information on Servers, Instances and Databases
-- including their status
SELECT
ServerName,
srvst.StatusName,
InstanceName,
insst.StatusName,
DatabaseName,
dbsst.StatusName
FROM [Server] as srv
JOIN [Status] as srvst
ON srv.ServerStatusID = srvst.StatusID
LEFT JOIN [Instance] as ins
ON srv.ServerID = ins.ServerID
JOIN [Status] as insst
ON ins.InstanceStatusID = insst.StatusID
LEFT JOIN [Database] as dbs
ON ins.InstanceID = dbs.InstanceID
JOIN [Status] as dbsst
ON dbs.DatabaseStatusID = dbsst.StatusID
;
Results of 2. Statement
To my surprise the server without an instance and database and the server with an instance but without a database were no longer listed:
ServerName | StatusName | InstanceName | StatusName | DatabaseName | StatusName |
---|---|---|---|---|---|
FirstServer | Productive | GENERAL | Productive | master | Productive |
FirstServer | Productive | GENERAL | Productive | model | Productive |
FirstServer | Productive | GENERAL | Productive | msdb | Productive |
FirstServer | Productive | GENERAL | Productive | UserDB1 | Productive |
FirstServer | Productive | TAXES | Productive | master | Productive |
FirstServer | Productive | TAXES | Productive | model | Productive |
FirstServer | Productive | TAXES | Productive | msdb | Productive |
FirstServer | Productive | TAXES | Productive | TaxesDB | Productive |
SecondServer | Prod ACC | SOCIAL | Prod ACC | master | Prod ACC |
SecondServer | Prod ACC | SOCIAL | Prod ACC | model | Prod ACC |
SecondServer | Prod ACC | SOCIAL | Prod ACC | msdb | Prod ACC |
SecondServer | Prod ACC | SOCIAL | Prod ACC | HealthCareDB | Prod ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | master | Test ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | model | Test ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | msdb | Test ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | GeneralUserDB | Test ACC |
ThirdServer | Test ACC | FBI | Reserved | master | Reserved |
ThirdServer | Test ACC | FBI | Reserved | model | Reserved |
ThirdServer | Test ACC | FBI | Reserved | msdb | Reserved |
ThirdServer | Test ACC | FBI | Reserved | CriminalDB | Reserved |
Findings / Solution
After checking various options with a trial and error approach I found out that the JOIN
on the Status
table had to be changed to a LEFT JOIN
to allow for the statement to display the server without an instance or a database, and to display the instance without a database:
-- Advanced SELECT to get all information on Servers, Instances and Databases
-- including their status
SELECT
ServerName,
srvst.StatusName,
InstanceName,
insst.StatusName,
DatabaseName,
dbsst.StatusName
FROM [Server] as srv
LEFT JOIN [Status] as srvst
ON srv.ServerStatusID = srvst.StatusID
LEFT JOIN [Instance] as ins
ON srv.ServerID = ins.ServerID
LEFT JOIN [Status] as insst
ON ins.InstanceStatusID = insst.StatusID
LEFT JOIN [Database] as dbs
ON ins.InstanceID = dbs.InstanceID
LEFT JOIN [Status] as dbsst
ON dbs.DatabaseStatusID = dbsst.StatusID;
Results of 3. Statement
ServerName | StatusName | InstanceName | StatusName | DatabaseName | StatusName |
---|---|---|---|---|---|
FirstServer | Productive | GENERAL | Productive | master | Productive |
FirstServer | Productive | GENERAL | Productive | model | Productive |
FirstServer | Productive | GENERAL | Productive | msdb | Productive |
FirstServer | Productive | GENERAL | Productive | UserDB1 | Productive |
FirstServer | Productive | TAXES | Productive | master | Productive |
FirstServer | Productive | TAXES | Productive | model | Productive |
FirstServer | Productive | TAXES | Productive | msdb | Productive |
FirstServer | Productive | TAXES | Productive | TaxesDB | Productive |
SecondServer | Prod ACC | GENERAL | Decommisioned | null | null |
SecondServer | Prod ACC | SOCIAL | Prod ACC | master | Prod ACC |
SecondServer | Prod ACC | SOCIAL | Prod ACC | model | Prod ACC |
SecondServer | Prod ACC | SOCIAL | Prod ACC | msdb | Prod ACC |
SecondServer | Prod ACC | SOCIAL | Prod ACC | HealthCareDB | Prod ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | master | Test ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | model | Test ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | msdb | Test ACC |
ThirdServer | Test ACC | GENERAL | Test ACC | GeneralUserDB | Test ACC |
ThirdServer | Test ACC | FBI | Reserved | master | Reserved |
ThirdServer | Test ACC | FBI | Reserved | model | Reserved |
ThirdServer | Test ACC | FBI | Reserved | msdb | Reserved |
ThirdServer | Test ACC | FBI | Reserved | CriminalDB | Reserved |
FourthServer | Reserved | null | null | null | null |
FifthServer | Reserved | COMINGSOON | Reserved | null | null |
Reference Material
Here a link to the db<>fiddle to reproduce my findings.
Question
Why does SQL Server require a LEFT JOIN
on the Status
table for child items that do not exist and for the query to display these items?
-
1Is it true that using INNER JOIN after any OUTER JOIN will essentially invalidate the effects of OUTER JOIN?philipxy– philipxy2024年04月13日 01:10:13 +00:00Commented Apr 13, 2024 at 1:10
3 Answers 3
You need to nest your joins. Otherwise what is happening is that it's expecting each individual join clause to return a result, but it can't if the previous one didn't return anything.
Essentially, you want the server to take the result of the inner-join of Instance
and Status
, and left-join all of that back to Server
.
SELECT
ServerName,
srvst.StatusName,
InstanceName,
insst.StatusName,
DatabaseName,
dbsst.StatusName
FROM Server as srv
JOIN Status as srvst
ON srv.ServerStatusID = srvst.StatusID
LEFT JOIN (
Instance as ins
JOIN Status as insst
ON ins.InstanceStatusID = insst.StatusID
)
ON srv.ServerID = ins.ServerID
LEFT JOIN (
Database as dbs
JOIN Status as dbsst
ON dbs.DatabaseStatusID = dbsst.StatusID
)
ON ins.InstanceID = dbs.InstanceID;
In SQL Server, the parenthesis are not essential, they are purely for readability. The key is putting JOIN table2 ON ...
inside another join, so it comes out to LEFT JOIN table1 JOIN table2 ON ... ON ...
ie the ON
clauses are nested.
This is the exact equivalent of using derived subqueries:
SELECT
ServerName,
srvst.StatusName,
InstanceName,
ins.StatusName,
DatabaseName,
dbs.StatusName
FROM Server as srv
JOIN Status as srvst
ON srv.ServerStatusID = srvst.StatusID
LEFT JOIN (
SELECT ins.*, insst.StatusName
FROM Instance as ins
JOIN Status as insst
ON ins.InstanceStatusID = insst.StatusID
) ins
ON srv.ServerID = ins.ServerID
LEFT JOIN (
SELECT dbs.*, dbsst.StatusName
Database as dbs
JOIN Status as dbsst
ON dbs.DatabaseStatusID = dbsst.StatusID
) dbs
ON ins.InstanceID = dbs.InstanceID;
Because the join with status table for the child items is happening on basis of ins.InstanceStatusID, dbs.DatabaseStatusID which turns out to be NULL when the joins prior to it takes place in some rows.
When you do an inner join later it shows rows matched in both tables and to get the rows with null values as well you will have to use left join.
To visualize the same :
SELECT
ServerName,
srvst.StatusName,
InstanceName,
DatabaseName,
ins.InstanceStatusID,
dbs.DatabaseStatusID
FROM [Server] as srv
LEFT JOIN [Status] as srvst
ON srv.ServerStatusID = srvst.StatusID
LEFT JOIN [Instance] as ins
ON srv.ServerID = ins.ServerID
LEFT JOIN [Database] as dbs
ON ins.InstanceID = dbs.InstanceID
SELECT TMP.* , insst.StatusName,
dbsst.StatusName FROM (SELECT
ServerName,
srvst.StatusName,
InstanceName,
DatabaseName,
ins.InstanceStatusID,
dbs.DatabaseStatusID
FROM [Server] as srv
LEFT JOIN [Status] as srvst
ON srv.ServerStatusID = srvst.StatusID
LEFT JOIN [Instance] as ins
ON srv.ServerID = ins.ServerID
LEFT JOIN [Database] as dbs
ON ins.InstanceID = dbs.InstanceID ) AS TMP
/* Here if left join is not used values which are null are skipped and only values which are in both tables are returned */
JOIN [Status] as insst
ON TMP.InstanceStatusID = insst.StatusID
JOIN [Status] as dbsst
ON TMP.DatabaseStatusID = dbsst.StatusID ;
Which after using Left join pulls all records from left result set and the matched values from both tables joined to the resultset
SELECT TMP.* , insst.StatusName,
dbsst.StatusName FROM (SELECT
ServerName,
srvst.StatusName,
InstanceName,
DatabaseName,
ins.InstanceStatusID,
dbs.DatabaseStatusID
FROM [Server] as srv
LEFT JOIN [Status] as srvst
ON srv.ServerStatusID = srvst.StatusID
LEFT JOIN [Instance] as ins
ON srv.ServerID = ins.ServerID
LEFT JOIN [Database] as dbs
ON ins.InstanceID = dbs.InstanceID ) AS TMP
LEFT JOIN [Status] as insst
ON TMP.InstanceStatusID = insst.StatusID
LEFT JOIN [Status] as dbsst
ON TMP.DatabaseStatusID = dbsst.StatusID ;
Why does SQL Server require a
LEFT JOIN
on the Status table for child items that do not exist and for the query to display these items?
It doesn't, in general. It is just one of the ways to express your requirement that produces the results you are after.
Your query tries to (inner) join from a null-extended row (created by an outer join) to the Status table using a join predicate that looks for an equality match on StatusID values.
This predicate does not return true because StatusID is null on one side, there is no matching null in the Status table, and the equality predicate would reject null matches anyway. Since the join predicate fails and you have specified an inner join, no result row is produced.
For clarity, the inner join to Status could produce a result with a different join predicate.
For example, there could be a null StatusID in the Status table (something your schema allows) and the join predicate could use IS NOT DISTINCT FROM
or an equivalent test where nulls match. You could also use COALESCE
or ISNULL
in the join predicate to select a particular status (as a default) without adding any new status rows.
There are many possibilities; the point is you could write a join predicate that would produce a match. The one you have does not.
It is possible you thought SQL Server would skip the join to Status for any rows previously null-extended by an outer join, but things don't work that way. An outer join is not an 'optional' join with subsequent short-circuiting. Perhaps some people have a mental image of them functioning that way.
Making any following joins 'outer' as a way to fix the output is close to using 'magic'—a recipe that works but isn't understood.
Why does it work here? Because, with a left outer join, the null-extended row is preserved despite there being no match in Status according to the join predicate. The missing status columns are populated with null, as usual.
An alternative solution
People do seem to find left outer joins more intuitive to work with, but there's no particular reason to prefer them otherwise. You don't have to start at the top of the hierarchy and 'join down' to rows that may or may not be present.
Let's see what happens with your schema if we instead start at the bottom of the hierarchy (the [Database] table) and work up. The first join to Status is straightforward:
SELECT
DBS.DatabaseName,
DBSST.StatusName
FROM dbo.[Database] AS DBS
JOIN dbo.[Status] AS DBSST
ON DBSST.StatusID = DBS.DatabaseStatusID;
Moving up the hierarchy, we now need to add rows from the Instance table. We can't use a left outer join or inner join because there may be an instance without a database. The natural solution is to use a right join, with the associated inner join to Status:
SELECT
INS.InstanceName,
INSST.StatusName,
DBS.DatabaseName,
DBSST.StatusName
FROM dbo.[Database] AS DBS
JOIN dbo.[Status] AS DBSST
ON DBSST.StatusID = DBS.DatabaseStatusID
RIGHT JOIN dbo.Instance AS INS
ON INS.InstanceID = DBS.InstanceID
JOIN dbo.[Status] AS INSST
ON INSST.StatusID = INS.InstanceStatusID;
There is no need for 'nested' joins here. We quite naturally add in any instances without a database, adding null-extended database rows as necessary. All instances are accounted for, so the inner join to Status is perfectly correct.
We follow the same approach to add in rows from the Server table, along with its status:
SELECT
SRV.ServerName,
SVS.StatusName,
INS.InstanceName,
INSST.StatusName,
DBS.DatabaseName,
DBSST.StatusName
FROM dbo.[Database] AS DBS
JOIN dbo.[Status] AS DBSST
ON DBSST.StatusID = DBS.DatabaseStatusID
RIGHT JOIN dbo.Instance AS INS
ON INS.InstanceID = DBS.InstanceID
JOIN dbo.[Status] AS INSST
ON INSST.StatusID = INS.InstanceStatusID
RIGHT JOIN dbo.[Server] AS SRV
ON SRV.ServerID = INS.ServerID
JOIN dbo.[Status] AS SVS
ON SVS.StatusID = SRV.ServerStatusID;
This produces the results you want, without invoking any magic.
There are any number of ways to write a correct query specification for your requirement. I show a RIGHT JOIN
alternative mostly because people seem somewhat scared by right outer joins or consider them redundant.
Explore related questions
See similar questions with these tags.