
In MS Access 2016 I want to run a query that is equivalent to the Excel 'Remove Duplicates' function.

Excel's Remove Duplicates removes duplicates but always leaves one record. For example, suppose we have 4 records where Specimen = BB000127 AND ADDON = NO (last column below) and we apply Remove Duplicates to the Specimen and ADDON columns:

(image omitted)

Excel's Remove Duplicates always returns at least one record:

(image omitted)

When I run the SQL query below in MS Access 2016, it removes ALL records where duplicates are found:

 SELECT 
 ReportX.[Specimen], ReportX.[ADDON], ReportX.[Dept], ReportX.[Location], ReportX.[MRN], 
 ReportX.[Solcited_Status], ReportX.[FillerOrderNumber], ReportX.[PlaceGroupNumber], ReportX.[OrderType], 
 ReportX.[Investigation], ReportX.[Rejected], ReportX.[Mis_Un_labelled], ReportX.[No_samp_recd], 
 ReportX.[Amendedreport], ReportX.[OrderedBy], ReportX.[RespClinican], ReportX.[Collecteddate], 
 ReportX.[CollecteddateTime], ReportX.[messagedat]
FROM 
 ReportX
WHERE 
 ReportX.[Specimen] NOT IN (
 SELECT 
 [Specimen] 
 FROM 
 [ReportX] AS Tmp 
 GROUP BY 
 [Specimen], [ADDON] 
 HAVING 
 Count(*) >= 1 
 AND 
 [ADDON] = [ReportX].[ADDON]
 )
ORDER BY 
 ReportX.[Specimen], ReportX.[ADDON];

How can I modify the above query to return at least one record where a duplicate/duplicates exist?

I've tried using FIRST in a subquery to select the first record where duplicates occur, and then a JOIN from the subquery back to the main table, but duplicates are still being returned. Any advice appreciated:

SELECT 
 SubQ.[Specimen], 
 SubQ.[ADDON], 
 ReportX.[Dept], 
 ReportX.[Location], 
 ReportX.[MRN], 
 ReportX.[Solcited_Status], 
 ReportX.[FillerOrderNumber], 
 ReportX.[PlaceGroupNumber], 
 ReportX.[OrderType], 
 ReportX.[Investigation], 
 ReportX.[Rejected], 
 ReportX.[Mis_Un_labelled], 
 ReportX.[No_samp_recd], 
 ReportX.[Amendedreport], 
 ReportX.[OrderedBy], 
 ReportX.[RespClinican], 
 ReportX.[Collecteddate], 
 ReportX.[CollecteddateTime], 
 ReportX.[messagedat]
FROM 
 (SELECT 
 FIRST(ReportX.[Specimen]) AS [Specimen], 
 FIRST(ReportX.[ADDON]) AS [ADDON]
 FROM 
 ReportX
 GROUP BY 
 ReportX.[Specimen], ReportX.[ADDON]) AS SubQ
INNER JOIN 
 ReportX ON SubQ.[Specimen] = ReportX.[Specimen] AND SubQ.[ADDON] = ReportX.[ADDON];

UPDATE: Sample data with duplicates in the 'Specimen' and 'ADDON' columns, as requested. There doesn't appear to be an option to attach a txt file, so it's below, exported using Windows encoding.

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Specimen | Dept | Location | MRN | Solcited_Status | FillerOrderNumber | PlaceGroupNumber | OrderType | Investigation | ADDON | Rejected | Mis_Un_labelled | No_samp_recd | Amendedreport | OrderedBy | RespClinican | hl7message | Collecteddate | CollecteddateTime | messagedat |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| HH014977V | Haematology | ED | 1184842 | 000376998^ILAB | |20240503HH014977 | 000376998^ILAB | SOLICTED | FBCD^FBC with | NO | N/A | N/A | N/A | NA | | DUMMY | MSH|^~\&|ILAB|LAB | 03/05/2024 | 202405031917 | 03/05/2024 |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| HH014978B | Haematology | ED | 1145832 | 000376985^ILAB | |20240503HH014978 | 000376985^ILAB | SOLICTED | FBCD^FBC with | NO | N/A | N/A | N/A | NA | | DUMMY | MSH|^~\&|ILAB|LAB | 03/05/2024 | 202405031926 | 03/05/2024 |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| HH014976J | Haematology | ED | 586273 | 000377002^ILAB | |20240503HH014976 | 000377002^ILAB | SOLICTED | FBCD^FBC with | NO | N/A | N/A | N/A | NA | | DUMMY | MSH|^~\&|ILAB|LAB | 03/05/2024 | 202405031931 | 03/05/2024 |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| BB047821M | Biochemistry | ED | 505432 | 000377000^ILAB | |20240503BB047821 | 000377000^ILAB | SOLICTED | CO2^Total CO2^B | NO | N/A | N/A | N/A | NA | | DUMMY | MSH|^~\&|ILAB|LAB | 03/05/2024 | 202405031921 | 03/05/2024 |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| BB047821M | Biochemistry | ED | 505432 | 000377000^ILAB | |20240503BB047821 | 000377000^ILAB | SOLICTED | CRP^C Reactive | NO | N/A | N/A | N/A | NA | | DUMMY | MSH|^~\&|ILAB|LAB | 03/05/2024 | 202405031921 | 03/05/2024 |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| BB047821M | Biochemistry | ED | 505432 | 000377000^ILAB | |20240503BB047821 | 000377000^ILAB | SOLICTED | UE^Urea and | NO | N/A | N/A | N/A | NA | | DUMMY | MSH|^~\&|ILAB|LAB | 03/05/2024 | 202405031921 | 03/05/2024 |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gord Thompson
asked May 3, 2024 at 20:46
  • 1
    The idea of deleting "all but one" involves some type of comparison. In other databases you might use a window function for this but no can do in Access so you will probably have to use an "Exists" type of query. Do you care which record remains behind (first one, last one ... and if so how would you define first or last?) Commented May 3, 2024 at 21:24
  • 1
    Sounds very much like this: stackoverflow.com/questions/44330524/… Commented May 3, 2024 at 21:25
  • 3
    SELECT Specimen, ADDON, First(Dept), First(Location), ... FROM ReportX GROUP BY Specimen, ADDON will leave one row per duplicate and is the easiest solution. Commented May 3, 2024 at 22:44
  • 1
    Review allenbrowne.com/subquery-01.html#TopN. Provide sample data as formatted text table, not image. Commented May 4, 2024 at 17:19
  • 1
    Does this answer your question? Top n records per group sql in access Commented May 4, 2024 at 17:26

1 Answer


As I see it, you don't need a subquery.

Just GROUP BY the columns that define a unique record, and apply an aggregate function (First) to all other columns.

SELECT 
 Specimen, 
 ADDON, 
 First(Dept), 
 First(Location),
 First(MRN), 
 First(Solcited_Status),
 ... 
FROM ReportX 
GROUP BY Specimen, ADDON

This will return one record per Specimen + ADDON.
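To see the grouping behaviour outside Access, here is a minimal sketch using Python's built-in sqlite3 module. It rests on several assumptions: the table is a trimmed, hypothetical four-column stand-in for ReportX with simplified values, and because SQLite has no First() aggregate, MIN() is used in its place. MIN() picks the smallest value per column, whereas Access's First() returns an arbitrary one, so the values differ but the row count per group is the same.

```python
import sqlite3

# Hypothetical, trimmed stand-in for ReportX (not the real table definition).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ReportX (Specimen TEXT, ADDON TEXT, Dept TEXT, Investigation TEXT)"
)
conn.executemany(
    "INSERT INTO ReportX VALUES (?, ?, ?, ?)",
    [
        ("HH014977V", "NO", "Haematology", "FBCD"),
        ("BB047821M", "NO", "Biochemistry", "CO2"),
        ("BB047821M", "NO", "Biochemistry", "CRP"),
        ("BB047821M", "NO", "Biochemistry", "UE"),
    ],
)

# Group on the columns that define a duplicate; aggregate everything else.
# MIN() stands in for Access's First(), which SQLite does not provide.
rows = conn.execute(
    """
    SELECT Specimen, ADDON, MIN(Dept) AS Dept, MIN(Investigation) AS Investigation
    FROM ReportX
    GROUP BY Specimen, ADDON
    ORDER BY Specimen, ADDON
    """
).fetchall()

for row in rows:
    print(row)
```

The three BB047821M rows collapse into one, and the single HH014977V row passes through unchanged.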

answered May 5, 2024 at 18:44

5 Comments

This looks correct. The docs for FIRST and LAST are horrible, because "These functions return the value of a specified field in the first or last record, respectively, of the result set returned by a query" is plain wrong, because then every line in the result set would have the same value, namely the one shown in the first result row.
They also say "If the query does not include an ORDER BY clause, the values returned by these functions will be arbitrary", but I don't see how an ORDER BY could possibly influence the resulting value, as ORDER BY happens last in the query, and you cannot order by values that are not in GROUP BY. With this mess of a documentation, I don't even see it guaranteed that all FIRST values apply necessarily to the same source row, but it seems rather likely still.
@ThorstenKettner: You are correct. Access SQL has several strange quirks, the FIRST() and LAST() functions are among them - they are not standard SQL. I can't remember when I have used them productively (if ever), but for this specific use case "just give me any one record per group" they do the job. Normally one would use a Window function, but Access doesn't have that. -- Here is another example
@Andre many thanks, that works. Can you explain the mechanism? I don't quite follow the logic.
The answer I linked in the comment above yours explains it pretty well I think. Generally, GROUP BY and aggregate functions are an important part of SQL, and you will find lots of pages that can explain it better than I ever could. Random examples @dancingbush
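If it matters that all kept values come from the same source row, one possible alternative is a correlated subquery on a unique key. The sketch below assumes a hypothetical unique ID column (an AutoNumber in Access terms), which the question does not show, and runs against SQLite through Python's sqlite3 module rather than Access itself. The table is a trimmed stand-in with simplified values.

```python
import sqlite3

# Hypothetical stand-in for ReportX with an assumed unique ID column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ReportX ("
    "ID INTEGER PRIMARY KEY, Specimen TEXT, ADDON TEXT, Investigation TEXT)"
)
conn.executemany(
    "INSERT INTO ReportX (Specimen, ADDON, Investigation) VALUES (?, ?, ?)",
    [
        ("HH014977V", "NO", "FBCD"),
        ("BB047821M", "NO", "CO2"),
        ("BB047821M", "NO", "CRP"),
        ("BB047821M", "NO", "UE"),
    ],
)

# Keep only the row whose ID is the lowest within its (Specimen, ADDON) group,
# so every column in a kept row comes from one and the same source record.
kept = conn.execute(
    """
    SELECT r.ID, r.Specimen, r.ADDON, r.Investigation
    FROM ReportX AS r
    WHERE r.ID = (
        SELECT MIN(t.ID)
        FROM ReportX AS t
        WHERE t.Specimen = r.Specimen AND t.ADDON = r.ADDON
    )
    ORDER BY r.ID
    """
).fetchall()

for row in kept:
    print(row)
```

Using MIN(ID) makes "first" well defined as the earliest-inserted row. Swapping in MAX(ID) would keep the latest one instead.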
