I have several tables that I have joined together to get raw data.

Some information about my query in the most basic form:

  • Table1 has a column [FacilityID] that is a unique identifier in that table (as well as several other columns, that are unimportant to the question).
  • Table2 has a column [FacilityID] that matches Table1.[FacilityID], but not all of the records from Table1.[FacilityID] are found in Table2.[FacilityID]
  • Because of this, Table1.[FacilityID] is left-joined to Table2.[FacilityID].
  • Table2 has additional columns [ActivityCode], [ActivityDate], [ActivityEval?], and [ActivityOutcome]
  • Table2.[ActivityOutcome] is only populated if Table2.[ActivityEval?] = "Yes"
  • Table2.[ActivityCode] contains several string values, all nonnumeric.
  • For simplicity and purposes of this example, Table2.[ActivityCode] contains values "EVALA", "EVALB", "EVALC", "EVALD", "EVALE", "EVALF", "EVALG", "EVALH", "EVALI", "EVALJ", "NOEVALA", "NOEVALB", "NOEVALC", and "NOEVALD" (the real codes are not in that order and do not share the same prefixes or suffixes)

Right now my query pulls Table1.[FacilityID], Table2.[ActivityCode], and Table2.[ActivityOutcome], where Table2.[ActivityDate] is between a begin date and an end date and Table2.[ActivityEval?] = "Yes".

My problem lies within the codes listed in Table2.[ActivityCode].

  • Codes: EVALA, EVALB, EVALF, EVALG, EVALH, EVALI, and EVALJ can all stand on their own and are counted as a single event
  • Codes: EVALC, EVALD, and EVALE cannot be counted as a single event, as they would only happen with at least 1 code from above.
  • If EVALA, EVALB, and EVALC are all entered within the timeframe set by the query, then an EVAL3 event has taken place.
  • If EVALA, EVALB, EVALC, EVALD, and EVALE are all entered within the timeframe set by the query, then an EVAL5 event has taken place.

I need a calculated field in my query to list out either an EVAL3 or EVAL5 event, or list out the individual codes from Table2.[ActivityCode]. This is to help our data entry folks quickly check if all their codes are entered into the system for credit. (We've been having a hard time getting consistent data entry with thousands of data points entered every year, by hundreds of people.)

I cannot, for the life of me, find how to complete this task out on the google.

Can anyone help me? Will I just end up with a Crosstab query that I have to filter in Excel? I really want this to happen all in one fell swoop, so I don't spend a "billion" hours each time the data is pulled refiltering in Excel.

I'm not the best with SQL or VBA, but I definitely try my damnedest to understand. I've tried using the Expression Builder and Concatrelated, but I am not getting the results that I want. That's why I'm finally turning here.

It's an Access db, so yes, it runs on SQL. I don't necessarily need a single row per facility, unless the EVAL3 or EVAL5 criteria are met and there are no other codes. Say that for one FacilityID EVAL3 is true, and they also have an EVALD but not an EVALE: I want one row for EVAL3 and another for EVALD. Once an EVALE is added for the facility, I only want one row for the facility, showing EVAL5. Any other EVAL codes should be listed out on separate rows.

Also, the ActivityCode may not be unique for the FacilityID in the query. There may be times that an EVALA occurs as part of an EVAL3 or EVAL5 event, but later in the timespan pulled for the query (usually a year) another EVALA occurs because of new information about or a change at the Facility. In this case the new EVALA is acceptable and should be counted in the activities for that facility, but not counted as part of the EVAL3 or EVAL5 event.

Paul White
asked Apr 6, 2016 at 5:45
  • Do you need a single row per [FacilityID] or all [ActivityCode] per id? Is [ActivityCode] unique within id? Commented Apr 6, 2016 at 6:09

1 Answer

The solution I came up with involves creating a calculated column on table2 called the_level, which gives the level for each row depending on its ActivityCode.

The second step is in your query, where you join table1 and table2: there you need to identify whether or not the item was entered within the timeframe set by the query, and change the_level accordingly.

the_level is created as PERSISTED. You did not mention indexes or performance, but you did mention a "billion" hours, so you will need some indexes there.
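As a rough illustration of the kind of index meant here, the statement below is only a sketch in T-SQL. The index name IX_table2_ActivityDate is made up, and the choice of key and included columns assumes the query filters on ActivityDate and reads the other columns, as the query in the question does. It also assumes the the_level column from the script further down has already been added. Check the actual execution plan before settling on it.

```sql
-- Hypothetical index: key on the date filter, other columns carried as includes
-- so the query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_table2_ActivityDate
    ON table2 (ActivityDate)
    INCLUDE (ActivityCode, ActivityOutcome, [ActivityEval?], the_level);
```

Every extra index adds work on insert and update, so on a heavily written table it is worth measuring before and after.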

Below is an example of a calculated column built with CASE and made NOT NULL.

I have, in the past, created calculated columns that reference a different table, but in my environment the performance was not the best. Somehow it slowed writes, though I did have an index on the calculated column. I am not stating this as a rule, but I would monitor it before doing it on a busy, frequently updated table.

 create table table2(
 [ActivityCode] varchar(10) not null,
 [ActivityDate] datetime,
 [ActivityOutcome] int,
 [ActivityEval?] char(3) not null default('Yes')
 )
 go
 -- add the facilityID, as a primary key at least for the exercise
 alter table table2
 add [FacilityID] int not null identity(1,1) primary key clustered
 go
 --insert the activity codes 
 insert into table2 ([ActivityCode])
 select 'EVALA' 
union all select 'EVALB'
union all select 'EVALC'
union all select 'EVALD'
union all select 'EVALE'
union all select 'EVALF'
union all select 'EVALG'
union all select 'EVALH'
union all select 'EVALI'
union all select 'EVALJ'
union all select 'NOEVALA'
union all select 'NOEVALB' 
union all select 'NOEVALC' 
union all select 'NOEVALD' 
 go
 -- have a look at the table
 select * from table2
 -- test the calculation of the the_level column
 -- exact matches are used because LIKE '%EVALA%' would also match 'NOEVALA'
 select [ActivityCode],
 the_level =
 CASE WHEN ActivityCode IN ('EVALA', 'EVALB', 'EVALF', 'EVALG',
                            'EVALH', 'EVALI', 'EVALJ') THEN 1
      WHEN ActivityCode IN ('EVALC', 'EVALD', 'EVALE') THEN 2
      ELSE 0
 END
 FROM TABLE2
 ----------------------------------------------------
 -- ADDED the_level as a persisted column to table2
 -- (exact matches, so 'NOEVALA' etc. fall through to level 0)
 ----------------------------------------------------
 alter table table2
 add the_level AS ISNULL(
 CASE WHEN ActivityCode IN ('EVALA', 'EVALB', 'EVALF', 'EVALG',
                            'EVALH', 'EVALI', 'EVALJ') THEN 1
      WHEN ActivityCode IN ('EVALC', 'EVALD', 'EVALE') THEN 2
      ELSE 0
 END, 0) PERSISTED
 go
 -- after this, do something like this in your query
 -- the outer CASE on the date range repeats the WHERE clause,
 -- so it may not need to exist
 -- (ActivityDate is assumed to be the date the item was entered)
 -- you will need to add your business logic to the query below
 -- but at least you have the level calculated
 select T1.[FacilityID],
 T2.[ActivityCode],
 T2.[ActivityOutcome],
 the_FINAL_level = CASE
 WHEN T2.[ActivityDate] BETWEEN '2016-04-01' AND '2016-04-07' THEN
     CASE WHEN T2.the_level = 1 THEN T2.the_level + 1
          WHEN T2.the_level = 2 THEN T2.the_level + 1
          ELSE T2.the_level -- ADD BUSINESS LOGIC HERE
     END
 ELSE 0
 END
 FROM TABLE1 T1
 INNER JOIN TABLE2 T2
 ON T1.[FacilityID] = T2.[FacilityID]
 WHERE T2.[ActivityDate] BETWEEN '2016-04-01' AND '2016-04-07'
 AND T2.[ActivityEval?] = 'Yes'
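
One possible way to fill in the business logic is sketched below. It is T-SQL like the rest of this answer, not Access SQL: Access has no CTEs or ROW_NUMBER, so it would need rewriting there. It assumes the table shape described in the question (many Table2 rows per FacilityID, ActivityCode never NULL), not the one-row-per-facility demo table above. The variables @BeginDate and @EndDate and the CTE names InRange, Flags, and Combined are made up for the example. The idea is to number repeated codes per facility so that only the earliest occurrence can be absorbed into an EVAL3 or EVAL5 row, and to list every other row individually.

```sql
DECLARE @BeginDate datetime = '2016-01-01',
        @EndDate   datetime = '2016-12-31';

WITH InRange AS (
    -- rows in the reporting window; rn = 1 marks the earliest occurrence
    -- of each code for a facility
    SELECT T2.FacilityID, T2.ActivityCode,
           rn = ROW_NUMBER() OVER (PARTITION BY T2.FacilityID, T2.ActivityCode
                                   ORDER BY T2.ActivityDate)
    FROM Table2 AS T2
    WHERE T2.ActivityDate BETWEEN @BeginDate AND @EndDate
      AND T2.[ActivityEval?] = 'Yes'
),
Flags AS (
    -- how many distinct component codes each facility has in the window
    SELECT FacilityID,
           n3 = COUNT(DISTINCT CASE WHEN ActivityCode IN ('EVALA','EVALB','EVALC')
                                    THEN ActivityCode END),
           n5 = COUNT(DISTINCT CASE WHEN ActivityCode IN ('EVALA','EVALB','EVALC','EVALD','EVALE')
                                    THEN ActivityCode END)
    FROM InRange
    GROUP BY FacilityID
),
Combined AS (
    -- one row per facility: the combined event it qualifies for, if any
    SELECT FacilityID,
           EventCode = CASE WHEN n5 = 5 THEN 'EVAL5'
                            WHEN n3 = 3 THEN 'EVAL3'
                            ELSE 'NONE' END
    FROM Flags
)
-- one row per combined event
SELECT FacilityID, EventCode
FROM Combined
WHERE EventCode <> 'NONE'
UNION ALL
-- every row that was not absorbed into a combined event
SELECT r.FacilityID, r.ActivityCode
FROM InRange AS r
INNER JOIN Combined AS c ON c.FacilityID = r.FacilityID
WHERE r.rn > 1
   OR c.EventCode = 'NONE'
   OR (c.EventCode = 'EVAL3' AND r.ActivityCode NOT IN ('EVALA','EVALB','EVALC'))
   OR (c.EventCode = 'EVAL5' AND r.ActivityCode NOT IN ('EVALA','EVALB','EVALC','EVALD','EVALE'))
ORDER BY FacilityID, EventCode;
```

With this shape, a facility with EVALA, EVALB, EVALC, and EVALD gets one EVAL3 row and one EVALD row, and a second EVALA later in the window shows up as its own row. Whether the earliest occurrence is the right one to absorb is a business decision this sketch simply assumes.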
answered Apr 7, 2016 at 11:37
