7

Below is an example of the type of data I get (collected by different users):

name surname
Moe Momo
Moe Momo
Jack JAJA
Jack Jacky

I would like to find when two users have collected different surnames for the same name.

The output I'm trying to get is:

name surname
Moe Momo
Jack NULL

I would like to see the surname if all users have collected the same one, and NULL if there are differences.

I tried searching the internet, but I'm not able to properly describe what I'm searching for.

I tried a query using CASE, but with no success.

Paul White
asked Jun 27, 2022 at 15:37
2
  • What would you consider the right output to be if one of the surname values in the original table is NULL? Commented Jun 28, 2022 at 10:59
  • @PaulWhite It should be NULL, as different values would be present for the same name. Thank you! Commented Jun 28, 2022 at 13:22

5 Answers

6

This can be solved using COUNT(DISTINCT ...). Group the results by name. Count distinct last names per first name. If the count differs from 1, show the last name as a null, otherwise show the actual last name e.g. using MAX, like this:

SELECT
 name
, surname = CASE COUNT(DISTINCT surname) WHEN 1 THEN MAX(surname) END
FROM
 dbo.People
GROUP BY
 name
;

You have to apply an aggregate function to surname because the grouping is by name only. Since you only show it when the distinct count is 1, it should not matter much which instance you pick, since they are all the same. MIN would work as well.
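
To try the query above without an existing dbo.People table, here is a minimal sketch. It assumes SQL Server 2016 or later (for DROP TABLE IF EXISTS) and uses a hypothetical temp table, #People, loaded with the sample rows from the question:

```sql
-- Hypothetical temp table standing in for dbo.People (an assumption for this demo).
DROP TABLE IF EXISTS #People;

CREATE TABLE #People
(
    name    varchar(255) NOT NULL,
    surname varchar(255) NULL
);

INSERT #People (name, surname)
VALUES
    ('Moe',  'Momo'),
    ('Moe',  'Momo'),
    ('Jack', 'JAJA'),
    ('Jack', 'Jacky');

-- One row per name; the surname is kept only when exactly one distinct value exists.
SELECT
    name,
    surname = CASE COUNT(DISTINCT surname) WHEN 1 THEN MAX(surname) END
FROM #People
GROUP BY name
ORDER BY name;

-- Result:
-- Jack  NULL
-- Moe   Momo
```

Keep in mind that COUNT(DISTINCT surname) ignores NULLs, so a name recorded once with a surname and once with NULL would still show the surname.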

answered Jun 27, 2022 at 23:15
0
5

You can just group by first_name, and compare the MIN with the MAX and see if they are the same.

SELECT
 n.first_name,
 CASE WHEN MIN(n.last_name) = MAX(n.last_name) THEN MIN(n.last_name) END AS last_name
FROM #names n
GROUP BY
 n.first_name;

db<>fiddle
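
The asker clarified in the comments that a NULL surname should count as a difference. MIN and MAX skip NULLs, so the query above would return the non-NULL value in that case. One possible variant, sketched here on the assumption that last_name is nullable and that SQL Server 2016 or later is available, adds a check that no row in the group is NULL. The 'Paul' rows are made-up test data:

```sql
-- Demo setup; the 'Paul' rows are illustrative additions, not from the question.
DROP TABLE IF EXISTS #names;

CREATE TABLE #names
(
    first_name varchar(255) NOT NULL,
    last_name  varchar(255) NULL
);

INSERT #names (first_name, last_name)
VALUES
    ('Moe',  'Momo'),
    ('Moe',  'Momo'),
    ('Jack', 'JAJA'),
    ('Jack', 'Jacky'),
    ('Paul', 'White'),
    ('Paul', NULL);

SELECT
    n.first_name,
    CASE
        WHEN MIN(n.last_name) = MAX(n.last_name)  -- all non-NULL values agree
         AND COUNT(*) = COUNT(n.last_name)        -- and no row has a NULL last_name
        THEN MIN(n.last_name)
    END AS last_name
FROM #names n
GROUP BY
    n.first_name
ORDER BY
    n.first_name;

-- Result:
-- Jack  NULL
-- Moe   Momo
-- Paul  NULL
```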

answered Jun 27, 2022 at 23:11
2
  • Functionally equivalent to checking COUNT(DISTINCT x), but standard SQL and will probably perform much better! Commented Jun 29, 2022 at 7:26
  • Indeed, this does not require a sort over first_name, last_name, only over first_name. Commented Jun 29, 2022 at 9:13
3

Another solution would be to use a window function:

with cte as 
( select n.*, 
 count(*) over (partition by first_name, last_name) as cnt
from #names n
) select distinct first_name, 
 case when cnt > 1 then last_name else NULL end as last_name
 from cte ;

The partition by first_name, last_name clause counts the rows for each first_name/last_name combination. If cnt > 1, the same first name was recorded with the same last name more than once.

Demo

Note: if a first name appears twice with one last name and once with a different last name, the query returns two rows for that first name, one with the repeated last name and one with NULL. For example, with this data:

Moe | Momo
Moe | Momo
Jack | JAJA
Jack | Jacky
Jack | Jacky

The result would be :

first_name last_name
 Jack null
 Jack Jacky
 Moe Momo

https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=1f604e29be13490818f5e1875e8719eb
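
If that two-row result is not wanted, a different window-function formulation is possible. The sketch below is not the query from this answer: it compares MIN and MAX of last_name over the whole first_name partition, using the five rows from the note above. It assumes SQL Server 2016 or later, and the column aliases min_last and max_last are made up for illustration:

```sql
-- Demo setup using the five rows from the note above.
DROP TABLE IF EXISTS #names;

CREATE TABLE #names
(
    first_name varchar(255) NOT NULL,
    last_name  varchar(255) NULL
);

INSERT #names (first_name, last_name)
VALUES
    ('Moe',  'Momo'),
    ('Moe',  'Momo'),
    ('Jack', 'JAJA'),
    ('Jack', 'Jacky'),
    ('Jack', 'Jacky');

WITH cte AS
(
    SELECT
        n.first_name,
        n.last_name,
        -- Smallest and largest last_name seen for this first_name.
        MIN(n.last_name) OVER (PARTITION BY n.first_name) AS min_last,
        MAX(n.last_name) OVER (PARTITION BY n.first_name) AS max_last
    FROM #names n
)
SELECT DISTINCT
    first_name,
    -- Keep the last name only when every row in the partition agrees.
    CASE WHEN min_last = max_last THEN last_name END AS last_name
FROM cte
ORDER BY first_name;

-- Result:
-- Jack  NULL
-- Moe   Momo
```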

answered Jun 27, 2022 at 17:57
0
2

There are more than a few ways to do this, but to move forward with your CASE statement approach, you just need to include some aggregates, similar to the following:

SELECT first_name, CASE WHEN repeats > 1 THEN last_name ELSE NULL END as last_name
FROM
(
 SELECT first_name, last_name, COUNT(*) as repeats
 FROM #names
 GROUP BY first_name, last_name
) t
GROUP BY first_name, CASE WHEN repeats > 1 THEN last_name ELSE NULL END

Here's the full dbfiddle.uk for reference

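There are probably more efficient approaches as well, but this returns the expected results for the sample data. Note that it relies on an agreed last name appearing in more than one row: a first name that occurs in only a single row will get NULL.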
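
A variation on the same derived-table idea, offered only as a sketch, counts the distinct first_name/last_name groups per first name instead of counting repeated rows. Because GROUP BY places NULLs in their own group, a NULL last_name is treated as a difference. It assumes SQL Server 2016 or later, and the 'Ann' and 'Paul' rows are made-up test data:

```sql
-- Demo setup; 'Ann' and 'Paul' rows are illustrative additions.
DROP TABLE IF EXISTS #names;

CREATE TABLE #names
(
    first_name varchar(255) NOT NULL,
    last_name  varchar(255) NULL
);

INSERT #names (first_name, last_name)
VALUES
    ('Moe',  'Momo'),
    ('Moe',  'Momo'),
    ('Jack', 'JAJA'),
    ('Jack', 'Jacky'),
    ('Ann',  'Lee'),
    ('Paul', 'White'),
    ('Paul', NULL);

SELECT
    t.first_name,
    -- Exactly one group means every row for this first name had the same last name.
    CASE WHEN COUNT(*) = 1 THEN MAX(t.last_name) END AS last_name
FROM
(
    -- One row per distinct first_name/last_name pair (NULL forms its own group).
    SELECT first_name, last_name
    FROM #names
    GROUP BY first_name, last_name
) t
GROUP BY t.first_name
ORDER BY t.first_name;

-- Result:
-- Ann   Lee
-- Jack  NULL
-- Moe   Momo
-- Paul  NULL
```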

answered Jun 27, 2022 at 16:10
2

This is a modification of Andriy's answer. It considers a NULL in the second column to be distinct:

CREATE TABLE #names
(
 first_name varchar(255) NOT NULL,
 last_name varchar(255) NULL
);
INSERT #names
 (first_name, last_name)
VALUES
 ('Moe', 'Momo'),
 ('Moe', 'Momo'),
 ('Jack', 'JAJA'),
 ('Jack', 'Jacky'),
 ('Paul', 'White'),
 ('Paul', NULL);
SELECT
 N.first_name,
 last_name =
  CASE
   WHEN COUNT_BIG(DISTINCT N.last_name) = 1 -- one non-null value
    AND COUNT_BIG(*) = COUNT_BIG(N.last_name) -- no nulls
   THEN MAX(N.last_name)
   ELSE NULL
  END
FROM #names AS N
GROUP BY N.first_name;

first_name last_name
Jack NULL
Moe Momo
Paul NULL

db<>fiddle online demo

answered Jun 28, 2022 at 14:30
