I am processing an unformatted CSV, New-AmbSYS-to-2018-Jan.csv, from NHS England Ambulance Quality Indicators, by reading its contents into an array, and carrying out a number of steps that lead to a final output array format.
The file
There are an initial 79 columns in the file and, depending on the month, a variable number of rows. The total number of rows will never be enormous.
Header row
First 5 columns:
| Year | Month | Region | Org Code | Org Name |
|------|-------|--------|----------|----------|
Rest of columns - A further 74 columns labeled A0
to A73
. These are codes for indicators.
I am not going to show you all of those columns for obvious reasons. The dataset is publicly available and I have included the direct download link and the website link.
Process:
Reduce number of columns and re-order
Function
GetData
grabs all the data from the sheet (it will always be the first sheet), ignoring the header row (optional argument), and loads into an arraydataArray
.Function
ReduceColumnsArray
takesdataArray
, grabs the columns of interest and orders them, as per the specification in arraycolumnsToKeepArray
. I tried to apply what I learnt from the answers to my prior question on CR.
The new smaller array, also called dataArray
, is now composed of 21 columns.
The column ordering is as follows (1 sample row included. Header added to aid with later column mapping description):
| Year | Month | Region | Org Code | Org Name | A8 | A10 | A11 | A12 | A24 | A30 | A33 | A36 | A25 | A31 | A34 | A37 | A26 | A32 | A35 | A38 |
|------|-------|--------|----------|------------------------------------|------|-------|-------|------|---------|----------|----------|----------|-----|------|------|------|-----|------|------|------|
| 2018 | 1 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 8469 | 55555 | 21759 | 2743 | 3639835 | 68077967 | 80385509 | 11249217 | 430 | 1225 | 3694 | 4101 | 705 | 2528 | 8738 | 8344 |
Unpivot columns in array and populate output array
Output format
The output array, finalArray
, has the following output format; with 4 output rows for each single input row (in dataArray
):
| Date | Year | Month | STP | Region | Org Code | Org Name | Category | Count of Incidents | Total (hours) | Mean (hh:mm:ss) | 90th centile (hh:mm:ss) |
|------|------|-------|-----|--------|----------|----------|----------|--------------------|---------------|-----------------|-------------------------|
| | | | | | | | | A8 | A24 | A25 | A26 |
| | | | | | | | | A10 | A30 | A31 | A32 |
| | | | | | | | | A11 | A33 | A34 | A35 |
| | | | | | | | | A12 | A36 | A37 | A38 |
Observations
There are additional columns in the output,
Date
(1),STP
(4) andCategory
(8).From column 6 of input,
dataArray
, onwards, every 4 columns unpivot into 4 rows in one column such that, for the remaining 16 columns (6:21), there are 4 columns with 4 rows (for every one input row) generated.E.g. If you look at
A8
,A10
,A11
andA12
, they "unpivot", I use that term loosely, into theCount of Incidents
column. It will be the actual corresponding values i.e.8469 | 55555 | 21759 | 2743
for the sample row, but I have used the headers for illustrative purposes.
Arranging the data
Function
GetFinalArray
takesdataArray
as an argument and performs the assignment ofdataArray
values tofinalArray
.Dimensioning
finalArray
is defined by the fact UBound(finalArray,2) is known,12
; and that the UBound(finalArray,1) will be, due to the unpivot, 4 * the number of rows read in i.e. 4 * UBound(dataArray,1).
The additional output columns
Column 1 (Date): Function
GetDate
is used to populate this.Column 4 (STP): This uses a placeholder,
vbNullString
, at present.Column 8 (Category): This is populated by the
GetCategory
function. This is based on the input rows being ordered such that rows in always run in sequence 1,2,3,4, which map to "Cat1: Life Threatening" , "Cat2: Emergency", "Cat3: Urgent" , "Cat4: Less Urgent". Themod
function is applied to the current row to determine this i.e. if current row in arraymod
4, 4 for the unpivot, is0
thenCat4
, if 1 thenCat1
etc.
Input to output mapping
The individual row/column mapping from dataArray
to finalArray
i.e. input to output, using a single row example is:
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|-----|-----|---|-----|-----|-----|---|-----|------|------|------|
| 1 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,6 | 1,10 | 1,14 | 1,18 |
| 2 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,7 | 1,11 | 1,15 | 1,19 |
| 3 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,8 | 1,12 | 1,16 | 1,20 |
| 4 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,9 | 1,13 | 1,17 | 1,21 |
Where, as shown, in matching picture below, the red are the finalArray
(row,column) positions and the inner black, "coordinates", are the corresponding values to assign from dataArray
.
To get the "coordinates" I use the fact that the row part of the coordinate increments 1 for every 4 rows of the outer array.
So row 5 would be:
| 5 | | 2,1 | 2,2 | | 2,3 | 2,4 | 2,5 | | 2,6 | 2,10 | 2,14 | 2,18 |
Pattern to get output column for current row of finalArr
is implemented by function GetFinalColumn
, and has the following logic:
| | finalColumn |
|-------------------------|----|----|----|----|
|MOD of outer row number | 9 | 10 | 11 | 12 | <input value
|-------------------------|----|----|----|----|
| 1 | 6 | 10 | 14 | 18 | >output
| 2 | 7 | 11 | 15 | 19 | >output
| 3 | 8 | 12 | 16 | 20 | >output
| 0 | 9 | 13 | 17 | 21 | >output
So, if I am currently assigning to the outer array (1,9) then 1 mod 4 is 1; and the intersection above, of 1 and 9, is 6. So I use (1,6) as my coordinates.
Module MainSub
Option Explicit
Public Sub Main()
Dim wb As Workbook
Dim ws As Worksheet
Set wb = ThisWorkbook
Set ws = wb.Sheets(1)
Dim dataArray()
dataArray = ReduceColumnsArray(GetData(ws, 1))
Dim finalArray As Variant
finalArray = GetFinalArray(dataArray)
Dim destinationRange As Range
Set destinationRange = wb.Worksheets("Output").Range("A2")
destinationRange.Resize(UBound(finalArray, 1), UBound(finalArray, 2)) = finalArray
End Sub
Module FunctionsModule
Option Explicit
Option Base 1
Public Function GetFinalArray(ByVal dataArray As Variant) As Variant 'to convert to function
Dim i As Long
Dim j As Long
Dim finalArray()
Dim totalOutputRows As Long
Dim numberRowsInDataSet As Long
numberRowsInDataSet = UBound(dataArray, 1)
totalOutputRows = 4 * numberRowsInDataSet 'scale to account for columns becoming rows
ReDim finalArray(1 To totalOutputRows, 1 To 12)
Dim outputCol As Long
outputCol = 0
Dim sourceRow As Long
For i = LBound(finalArray, 1) To UBound(finalArray, 1)
If i Mod 4 = 1 Then sourceRow = sourceRow + 1
finalArray(i, 1) = GetDate(dataArray(sourceRow, 1), dataArray(sourceRow, 2)) 'this needs formatting to generate output 'yyyy-mm-dd' ;adding -01 as dd
finalArray(i, 2) = dataArray(sourceRow, 1)
finalArray(i, 3) = dataArray(sourceRow, 2)
finalArray(i, 4) = vbNullString 'STP - TODO Add function call in to return this value
finalArray(i, 5) = dataArray(sourceRow, 3)
finalArray(i, 6) = dataArray(sourceRow, 4)
finalArray(i, 7) = dataArray(sourceRow, 5)
finalArray(i, 8) = GetCategory(i) 'Category
Dim n As Long
For n = 9 To 12
finalArray(i, n) = dataArray(sourceRow, GetFinalColumn(n, i)) 'finalColumn
Next n
Next i
GetFinalArray = finalArray
End Function
Public Function ReduceColumnsArray(ByVal dataArray As Variant) As Variant
'From the initial 79 columns keep only 21 columns as specified in columnsToKeepArray
Const ColsToKeep As Long = 21
Dim columnsToKeepArray()
columnsToKeepArray = Array(1, 2, 3, 4, 5, 14, 16, 17, 18, 19, 25, 28, 31, 20, 26, 29, 32, 21, 27, 30, 33)
Dim i As Long
Dim j As Long
Dim tempArr()
ReDim tempArr(1 To UBound(dataArray, 1), 1 To ColsToKeep)
For i = LBound(dataArray, 1) To UBound(dataArray, 1)
For j = LBound(columnsToKeepArray) To UBound(columnsToKeepArray)
tempArr(i, j) = dataArray(i, columnsToKeepArray(j)) 'map the required columns to the output array
Next j
Next i
ReduceColumnsArray = tempArr
End Function
Public Function GetData(ByVal ws As Worksheet, Optional ByVal offsetRows As Long = 0) As Variant
Dim dataArray()
Dim startRange As Range
Set startRange = ws.Range("A1").CurrentRegion
With startRange
dataArray = .Offset(offsetRows, 0).Resize(.Rows.Count - offsetRows, .Columns.Count).Value2
End With
GetData = dataArray
End Function
Public Function GetDate(ByVal yearValue As String, ByVal monthValue As String) As Date '? or string
GetDate = CDate(Format$(yearValue & "-" & monthValue & "-" & "01", "yyyy-mm-dd"))
End Function
Public Function GetCategory(ByVal i As Long) As String
Select Case i Mod 4 'I know number will not exceed range for mod
Case 1
GetCategory = "Cat1: Life Threatening"
Case 2
GetCategory = "Cat2: Emergency"
Case 3
GetCategory = "Cat3: Urgent"
Case 0
GetCategory = "Cat4: Less Urgent"
Case Else
GetCategory = "Unknown"
End Select
End Function
Public Function GetFinalColumn(ByVal finalColumn As Long, ByVal i As Long) As Long
Dim testVal As Long
Dim n As Long
testVal = i Mod 4 'i will always be a positive integer. It will not exceed the max for mod.
Select Case testVal
Case 0
n = 3
Case 1
n = 0
Case 2
n = 1
Case 3
n = 2
End Select
Select Case finalColumn
Case 9
GetFinalColumn = 6 + n
Case 10
GetFinalColumn = 10 + n
Case 11
GetFinalColumn = 14 + n
Case 12
GetFinalColumn = 18 + n
End Select
End Function
Test dataset:
This represents a sample of rows from dataArray
as produced in the sub main
by:
dataArray = ReduceColumnsArray(GetData(ws, 1))
Dataset -
| Year | Month | Region | Org Code | Org Name | A8 | A10 | A11 | A12 | A24 | A30 | A33 | A36 | A25 | A31 | A34 | A37 | A26 | A32 | A35 | A38 |
|------|-------|--------|----------|------------------------------------|------|-------|-------|------|---------|----------|----------|----------|-----|------|------|------|-----|------|-------|-------|
| 2017 | 8 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 2017 | 9 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 2017 | 10 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 2017 | 11 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 7658 | 52321 | 22062 | 2586 | 3248966 | 57927986 | 75274310 | 11186014 | 424 | 1107 | 3412 | 4326 | 688 | 2192 | 8100 | 8934 |
| 2017 | 12 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 8551 | 56491 | 21108 | 2748 | 3800787 | 81948038 | 94600478 | 13564482 | 444 | 1451 | 4482 | 4936 | 724 | 3071 | 10736 | 10309 |
| 2018 | 1 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 8469 | 55555 | 21759 | 2743 | 3639835 | 68077967 | 80385509 | 11249217 | 430 | 1225 | 3694 | 4101 | 705 | 2528 | 8738 | 8344 |
Output from test dataset
| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | 7658 | 3248966 | 424 | 688 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | 52321 | 57927986 | 1107 | 2192 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | 22062 | 75274310 | 3412 | 8100 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | 2586 | 11186014 | 4326 | 8934 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | 8551 | 3800787 | 444 | 724 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | 56491 | 81948038 | 1451 | 3071 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | 21108 | 94600478 | 4482 | 10736 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | 2748 | 13564482 | 4936 | 10309 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | 8469 | 3639835 | 430 | 705 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | 55555 | 68077967 | 1225 | 2528 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | 21759 | 80385509 | 3694 | 8738 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | 2743 | 11249217 | 4101 | 8344 |
2 Answers 2
Output sheet
First things first - you aren't checking that your output sheet exists and will error if it doesn't.
You would end up checking for the sheet, creating if it doesn't exist, assigning it to a variable and clearing it in case it had data -
Sub Example()
If Not SheetExists("Output") Then CreateSheet ActiveWorkbook, "Output"
Dim outputSheet As Worksheet
Set outputSheet = ActiveWorkbook.Sheets("Output")
outputSheet.UsedRange.ClearContents
End Sub
Function SheetExists(sheetName As String, Optional targetBook As Workbook) As Boolean
Dim targetSheet As Worksheet
If targetBook Is Nothing Then Set targetBook = ActiveWorkbook
On Error Resume Next
Set targetSheet = targetBook.Sheets(sheetName)
On Error GoTo 0
SheetExists = Not targetSheet Is Nothing
End Function
Sub CreateSheet(ByVal targetBook As Workbook, ByVal sheetName As String)
With targetBook
.Sheets.Add(After:=.Sheets(.Sheets.Count)).Name = sheetName
End With
End Sub
You could also use a sheet's CodeName
property - View Properties window (F4) and the (Name)
field (the one at the top) can be used as the worksheet name. This way you can avoid Sheets("mySheet")
and instead just use mySheet
. But, that would be creating a need to set the CodeName
of the sheet after you create it, so I don't know how useful it would really be.
Variable names
Give those variables some meaningful names! It's much easier to tell what is going on when, for example wb
is targetBook
and outputCol
is targetOutputColumn
. Characters are free, so what's the harm in spelling "column" instead of "col"? I see
wb
ws
outputCol
n
And why not give this a type?
Dim tempArr()
Your use of i
and j
is acceptable, but why not give the reader some context like targetFirstDimension
etc?
Standard VBA naming conventions have camelCase
for local variables and PascalCase
for other variables and names. Usually constants use 'UPPER_SNAKE_STYLE` (and it's what I use) but you could just use all caps, whatever you do make sure it's clearly a constant rather than just another variable.
Base 1
I know I brought Option Base 1
up, but you should probably avoid changing the default base because it can get confusing to read arrays defined starting at "1" if you expect them to start at "0". Overall, I think you did a pretty good job of explaining the arrays, so maybe get rid of the Option and just change your iterations e.g.
For i = LBound(finalArray, 1) + 1 To UBound(finalArray, 1) + 1
But, since your arrays pulled from the sheet starts with 1 instead of 0, I don't think you actually need to change the iteration, or even have the Option because the arrays are arr(1 to x)
which clearly starts at 1. The only array with a problem is columnsToKeepArray
-
For j = (LBound(columnsToKeepArray) + 1) To (UBound(columnsToKeepArray) + 1)
tempArr(i, j) = dataArray(i, columnsToKeepArray(j - 1))
Next j
Just replace that and delete the Option Base 1
.
Original Data
The way you're getting the default data is a little bit sketchy. Also, setting your target book to ThisWorkbook
makes it so you need to always use the macro's book. You can't have any error handling when you just make the assumption that your target sheet will be the first sheet.
Why not save the macro in a certain book and have it target the downloaded data .csv? You could always download it to the same location and if you do the analysis the month after -
Dim targetMonth As String
targetMonth = StrConv(Left(MonthName(Month(Now) - 1), 3), vbProperCase)
Dim fileName As String
fileName = "PATH/TO/New-AmbSYS-to-2018-" & targetMonth & ".csv"
Of course, you need to know if the file is there with something like
Sub FileExists()
Const PATH_TO As String = "C:\Users\USER\Downloads\New-AmbSYS-to-2018-"
Dim targetMonth As String
targetMonth = StrConv(Left(MonthName(Month(Now) - 1), 3), vbProperCase)
Dim fileName As String
fileName = PATH_TO & targetMonth & ".csv"
If Dir(fileName) <> "" Then MsgBox "file exists"
End Sub
Except probably a function where you pass Now
and return Boolean
to determine if you need to exit the sub.
Populating the first array
I don't understand the offsetRows
argument, so
Set startRange = ws.Range("A1").CurrentRegion
'startRange.Select
With startRange
dataArray = .Offset(offsetRows, 0).Resize(.Rows.Count - offsetRows, .Columns.Count).Value2
End With
I added the .Select to see what is happening. I don't see the need for the .Offset
dataArray = targetSheet.Range("A1").CurrentRegion
This works just as well, but again the offsetRows
argument is a mystery to me.
GetDate
Public Function GetDate(ByVal yearValue As String, ByVal monthValue As String) As Date '? or string
GetDate = CDate(Format$(yearValue & "-" & monthValue & "-" & "01", "yyyy-mm-dd"))
End Function
Yeah, I think returning a string might be better. What happens if the value passed can't be a date?
GetDate(vbNullString, vbNullString)
Error. CDate error. You need to handle that error. e.g.
Dim myDate As String
Dim monthValue As String
monthValue = "2"
Dim yearValue As String
yearValue = "2018"
myDate = GetDate(yearValue, monthValue)
If Not IsDate(myDate) Then 'handle that error!
Public Function GetDate(ByVal yearValue As String, ByVal monthValue As String) As String
GetDate = (Format$(yearValue & "-" & monthValue & "-" & "01", "yyyy-mm-dd"))
End Function
GetCatagory
Not a bad function. I might make your comment a little more clear, already it is telling me why instead of how - awesome! Maybe move that to the top of the function with a better description of why.
GetFinalColumn
Hm. How does this work?
For n = 9 To 12
finalArray(i, n) = dataArray(sourceRow, GetFinalColumn(n, i)) 'finalColumn
Next n
Your variables in the argument and in the function aren't telling me much. It all seems arbitrary at first look. You're basing what column to use on what row is being used, correct? And you do n = 9 to 12
for every i
? This seems pretty round-about. You've already done i Mod 4
in GetCategory
.
For every 4 rows, change the final column by 4 starting at 6
Right? So the essentially Category
determines the column?
Dim sourceRow As Long
Dim categoryByRow As Long
For i = LBound(finalArray, 1) To UBound(finalArray, 1)
categoryByRow = i Mod 4
You moved the mod up before GetCategory
, so you can just pass that and remove the Mod
from that function. You can also pass the new variable to GetLastColumn
. You also pass n
to GetLastColumn
, turn it into lastColumn
and then create a new n
. That's confusing
Dim finalColumn As Long
For finalColumn = 9 To 12
finalArray(i, finalColumn) = dataArray(sourceRow, GetFinalColumn(finalColumn, categoryByRow)) 'finalColumn
Next
Public Function GetCategory(ByVal categoryByRow As Long) As String
Select Case categoryByRow
Case 1
GetCategory = "Cat1: Life Threatening"
Case 2
GetCategory = "Cat2: Emergency"
Case 3
GetCategory = "Cat3: Urgent"
Case 0
GetCategory = "Cat4: Less Urgent"
Case Else
GetCategory = "Unknown"
End Select
End Function
Public Function GetFinalColumn(ByVal finalColumn As Long, ByVal categoryByRow As Long) As Long
Select Case categoryByRow
Case 0
categoryByRow = 3
Case 1
categoryByRow = 0
Case 2
categoryByRow = 1
Case 3
categoryByRow = 2
End Select
Select Case finalColumn
Case 9
GetFinalColumn = 6 + categoryByRow
Case 10
GetFinalColumn = 10 + categoryByRow
Case 11
GetFinalColumn = 14 + categoryByRow
Case 12
GetFinalColumn = 18 + categoryByRow
End Select
End Function
Still, this feels pretty clunky. Why does the i mod 4
give a non-result?
categoryByRow = categoryByRow - 1
If categoryByRow < 0 Then categoryByRow = 3
And your finalColumn
also give a non-result? I think there must be a way to refactor that, but I can't figure out the relationship of {9-6,10-10,11-14,12-18}. Someone better at math might be able to.
Public Function GetFinalColumn(ByVal finalColumn As Long, ByVal categoryByRow As Long) As Long
categoryByRow = categoryByRow - 1
If categoryByRow < 0 Then categoryByRow = 3
Select Case finalColumn
Case 9
GetFinalColumn = 6 + categoryByRow
Case 10
GetFinalColumn = 10 + categoryByRow
Case 11
GetFinalColumn = 14 + categoryByRow
Case 12
GetFinalColumn = 18 + categoryByRow
End Select
End Function
-
1\$\begingroup\$ I'm not surprised. Excellent. I just knew, if you saw this, I'd get hammered over the Option Base. Very good point about the fact I could have changed columnsToKeepArray. I will enjoy digesting your feedback. Really appreciate the effort and time you have put in. \$\endgroup\$QHarr– QHarr2018年02月16日 05:47:44 +00:00Commented Feb 16, 2018 at 5:47
What follows is more of a rewrite of the OP's code then a review. The main reason for the rewrite is that the original code maps the original data to new array and then uses another map to write the data to the final array. In my opinion, mapping the data directly from the original array is more efficient, easier to read, and to modify.
The reason that the code is difficult to implement is that each row of original data contains 4 records with a lot of repeat data. Instead of creating a complex formula to map out the columns of the non-repeating data, I simply repeated the columns of the repeat data in my column map. After that it was a simply matter of adding extra counters to track the current row and column.
Private Function getColumn(c As Long) As Long
getColumn = Choose(c, _
0, 1, 2, 0, 3, 4, 5, 0, 14, 19, 20, 21, _
0, 1, 2, 0, 3, 4, 5, 0, 16, 25, 26, 27, _
0, 1, 2, 0, 3, 4, 5, 0, 17, 28, 29, 30, _
0, 1, 2, 0, 3, 4, 5, 0, 18, 31, 32, 33)
End Function
Note: Zero's represent calculated columns that do not map to the data.
Private Const COLUMN_COUNT As Long = 12
Private Const OUTPUT_SHEET_NAME = "Output"
Public Sub Main()
Dim results() As Variant
results = GetData(ThisWorkbook.Worksheets(1))
results = getPivotData(results)
With getOutputWorksheet
.Range("A2").Resize(UBound(results), UBound(results, 2)).Value = results
End With
End Sub
Private Function getPivotData(ByRef data As Variant)
Dim c As Long, c2 As Long, i As Long, r As Long, r2 As Long
Dim results() As Variant
ReDim results(1 To UBound(data) * 4, 1 To COLUMN_COUNT)
r2 = 1
For r = 1 To UBound(data, 1)
c2 = 1
For i = 1 To 4
For c = 1 To COLUMN_COUNT
If c = 1 Then
results(r2, 1) = DateSerial(data(r, 1), data(r, 2), 1) 'Date
ElseIf c = 4 Then
'STP - TODO Add function call in to return this value
ElseIf c = 8 Then
results(r2, 8) = getCategory(r2)
Else 'Non Repeated Data
results(r2, c) = data(r, getColumn(c2))
End If
'results(r2, c) = getColumn(c2)
c2 = c2 + 1
Next
r2 = r2 + 1
Next
Next
getPivotData = results
End Function
Private Function getData(ws As Worksheet) As Variant()
Dim Target As Range
Set Target = Intersect(ws.UsedRange, ws.UsedRange.Offset(1))
If Not Target Is Nothing Then getData = Target.Value2
End Function
Private Function getColumn(c As Long) As Long
getColumn = Choose(c, _
0, 1, 2, 0, 3, 4, 5, 0, 14, 19, 20, 21, _
0, 1, 2, 0, 3, 4, 5, 0, 16, 25, 26, 27, _
0, 1, 2, 0, 3, 4, 5, 0, 17, 28, 29, 30, _
0, 1, 2, 0, 3, 4, 5, 0, 18, 31, 32, 33)
End Function
Private Function getOutputWorksheet() As Worksheet
Dim ws As Worksheet
On Error Resume Next
Set ws = ThisWorkbook.Worksheets(OUTPUT_SHEET_NAME)
On Error GoTo 0
If ws Is Nothing Then
Set ws = ThisWorkbook.Worksheets.Add
ws.Name = OUTPUT_SHEET_NAME
Else
ws.Cells.Clear
End If
With ws
.Range("A1").Resize(1, COLUMN_COUNT).Value = Array("Date ", "Year", "Month", "STP", "Region", "Org Code", "Org Name", "Category", "Count of Incidents", "otal (hours)", "Mean (hh:mm:ss)", "90th centile (hh:mm:ss)")
.Columns("A:A").NumberFormat = "yyyy-mm-dd"
End With
Set getOutputWorksheet = ws
End Function
Private Function getCategory(ByVal c As Long) As String
getCategory = Array("Cat4: Less Urgent", "Cat1: Life Threatening", "Cat2: Emergency", "Cat3: Urgent")(c Mod 4)
End Function
Note: I purposely did away with the getDate
function because dates should be stored as numbers. The date format should be displayed by formatting the ranges preferably formatting the entire Column. The OP implemented this due to have different regional setting on his workstations.
I also shortened the getData
and getCategory
to demonstrate some alternate techniques.
TODO: Create Error handlers and add comments.
Addendum: DateSerial Overflow Issue
This is in response to a comment by the OP.
The OP ran into an issue while implementing the code on his complete Dataset. data(r, 1) = 42743
was causing an DateSerial()
to throw an Overflow Error
. I am not sure if the actual Dataset has a mix of Years and Dates or if his Test Dataset has Years and the actual Dataset has Dates formatted yyyy
.
If the actual data is Dates formatted as years use:
results(r2, 1) = DateSerial(Year(data(r, 1)), data(r, 2), 1)
If the values are mixed use:
If IsDate(data(r, 1)) Then data(r, 1) = Year(data(r, 1))
results(r2, 1) = DateSerial(data(r, 1), data(r, 2), 1)
-
\$\begingroup\$ I will enjoy reading this. I can see you have re-written getCategory the same way I did after you had made your very first comment. \$\endgroup\$QHarr– QHarr2018年02月18日 06:14:46 +00:00Commented Feb 18, 2018 at 6:14
-
\$\begingroup\$ If I could give you another plus one for that getData function I would. \$\endgroup\$QHarr– QHarr2018年02月18日 09:02:46 +00:00Commented Feb 18, 2018 at 9:02
-
\$\begingroup\$ I just rewrote
getData
to avoid a possible error. But the basic premise remains the same. Thanks \$\endgroup\$user109261– user1092612018年02月18日 15:57:51 +00:00Commented Feb 18, 2018 at 15:57 -
\$\begingroup\$ With the full dataset I am getting runtime error 6 - overflow on results(r2, 1) = DateSerial(data(r, 1), data(r, 2), 1) 'Date \$\endgroup\$QHarr– QHarr2018年02月22日 14:59:33 +00:00Commented Feb 22, 2018 at 14:59
-
\$\begingroup\$ data(r, 1) = 42743 and data(r,2) = 2017 and r = 67 and r2 = 265. I noticed a similar problem at this exact same number (r2 of 265) with my code and am beginning to wonder if my logic is flawed somewhere. I had run everything up until now with a smaller dataset than the full. \$\endgroup\$QHarr– QHarr2018年02月22日 15:02:57 +00:00Commented Feb 22, 2018 at 15:02
Select Case
inGetFinalColumn
is not needed. Usen = (i - 1) Mod 4
instead. \$\endgroup\$