2
\$\begingroup\$

I am processing an unformatted CSV, New-AmbSYS-to-2018-Jan.csv, from NHS England Ambulance Quality Indicators, by reading its contents into an array, and carrying out a number of steps that lead to a final output array format.

The file

There are an initial 79 columns in the file and, depending on the month, a variable number of rows. The total number of rows will never be enormous.

Header row

First 5 columns:

| Year | Month | Region | Org Code | Org Name |
|------|-------|--------|----------|----------|

Rest of columns - A further 74 columns labeled A0 to A73. These are codes for indicators.

I am not going to show you all of those columns for obvious reasons. The dataset is publicly available and I have included the direct download link and the website link.

Process:

Reduce number of columns and re-order

  1. Function GetData grabs all the data from the sheet (it will always be the first sheet), ignoring the header row (optional argument), and loads into an array dataArray.

  2. Function ReduceColumnsArray takes dataArray, grabs the columns of interest and orders them, as per the specification in array columnsToKeepArray. I tried to apply what I learnt from the answers to my prior question on CR.

The new smaller array, also called dataArray, is now composed of 21 columns.

The column ordering is as follows (1 sample row included. Header added to aid with later column mapping description):

| Year | Month | Region | Org Code | Org Name | A8 | A10 | A11 | A12 | A24 | A30 | A33 | A36 | A25 | A31 | A34 | A37 | A26 | A32 | A35 | A38 |
|------|-------|--------|----------|------------------------------------|------|-------|-------|------|---------|----------|----------|----------|-----|------|------|------|-----|------|------|------|
| 2018 | 1 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 8469 | 55555 | 21759 | 2743 | 3639835 | 68077967 | 80385509 | 11249217 | 430 | 1225 | 3694 | 4101 | 705 | 2528 | 8738 | 8344 |

Unpivot columns in array and populate output array

Output format

The output array, finalArray, has the following output format; with 4 output rows for each single input row (in dataArray):

| Date | Year | Month | STP | Region | Org Code | Org Name | Category | Count of Incidents | Total (hours) | Mean (hh:mm:ss) | 90th centile (hh:mm:ss) |
|------|------|-------|-----|--------|----------|----------|----------|--------------------|---------------|-----------------|-------------------------|
| | | | | | | | | A8 | A24 | A25 | A26 |
| | | | | | | | | A10 | A30 | A31 | A32 |
| | | | | | | | | A11 | A33 | A34 | A35 |
| | | | | | | | | A12 | A36 | A37 | A38 |

Output to sheet from Array2

Observations

  1. There are additional columns in the output, Date (1), STP (4) and Category (8).

  2. From column 6 of input, dataArray, onwards, every 4 columns unpivot into 4 rows in one column such that, for the remaining 16 columns (6:21), there are 4 columns with 4 rows (for every one input row) generated.

    E.g. If you look at A8, A10, A11 andA12, they "unpivot", I use that term loosely, into the Count of Incidents column. It will be the actual corresponding values i.e. 8469 | 55555 | 21759 | 2743 for the sample row, but I have used the headers for illustrative purposes.

Arranging the data

  1. Function GetFinalArray takes dataArray as an argument and performs the assignment of dataArray values to finalArray.

  2. Dimensioning finalArray is defined by the fact UBound(finalArray,2) is known, 12; and that the UBound(finalArray,1) will be, due to the unpivot, 4 * the number of rows read in i.e. 4 * UBound(dataArray,1).

The additional output columns

  1. Column 1 (Date): Function GetDate is used to populate this.

  2. Column 4 (STP): This uses a placeholder, vbNullString, at present.

  3. Column 8 (Category): This is populated by the GetCategory function. This is based on the input rows being ordered such that rows in always run in sequence 1,2,3,4, which map to "Cat1: Life Threatening" , "Cat2: Emergency", "Cat3: Urgent" , "Cat4: Less Urgent". The mod function is applied to the current row to determine this i.e. if current row in array mod 4, 4 for the unpivot, is 0 then Cat4, if 1 then Cat1 etc.

Input to output mapping

The individual row/column mapping from dataArray to finalArray i.e. input to output, using a single row example is:

| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|-----|-----|---|-----|-----|-----|---|-----|------|------|------|
| 1 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,6 | 1,10 | 1,14 | 1,18 |
| 2 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,7 | 1,11 | 1,15 | 1,19 |
| 3 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,8 | 1,12 | 1,16 | 1,20 |
| 4 | | 1,1 | 1,2 | | 1,3 | 1,4 | 1,5 | | 1,9 | 1,13 | 1,17 | 1,21 |

Where, as shown, in matching picture below, the red are the finalArray (row,column) positions and the inner black, "coordinates", are the corresponding values to assign from dataArray.

Array1 to Array2 mapping

To get the "coordinates" I use the fact that the row part of the coordinate increments 1 for every 4 rows of the outer array.

So row 5 would be:

| 5 | | 2,1 | 2,2 | | 2,3 | 2,4 | 2,5 | | 2,6 | 2,10 | 2,14 | 2,18 |

Pattern to get output column for current row of finalArr is implemented by function GetFinalColumn, and has the following logic:

| | finalColumn |
|-------------------------|----|----|----|----|
|MOD of outer row number | 9 | 10 | 11 | 12 | <input value
|-------------------------|----|----|----|----|
| 1 | 6 | 10 | 14 | 18 | >output
| 2 | 7 | 11 | 15 | 19 | >output
| 3 | 8 | 12 | 16 | 20 | >output
| 0 | 9 | 13 | 17 | 21 | >output

So, if I am currently assigning to the outer array (1,9) then 1 mod 4 is 1; and the intersection above, of 1 and 9, is 6. So I use (1,6) as my coordinates.

Module MainSub

Option Explicit
Public Sub Main()
 Dim wb As Workbook
 Dim ws As Worksheet
 
 Set wb = ThisWorkbook
 Set ws = wb.Sheets(1)
 Dim dataArray()
 dataArray = ReduceColumnsArray(GetData(ws, 1))
 
 Dim finalArray As Variant
 finalArray = GetFinalArray(dataArray)
 
 Dim destinationRange As Range
 Set destinationRange = wb.Worksheets("Output").Range("A2")
 
 destinationRange.Resize(UBound(finalArray, 1), UBound(finalArray, 2)) = finalArray
End Sub

Module FunctionsModule

Option Explicit
Option Base 1
Public Function GetFinalArray(ByVal dataArray As Variant) As Variant 'to convert to function
 Dim i As Long
 Dim j As Long
 Dim finalArray()
 Dim totalOutputRows As Long
 Dim numberRowsInDataSet As Long
 
 numberRowsInDataSet = UBound(dataArray, 1)
 totalOutputRows = 4 * numberRowsInDataSet 'scale to account for columns becoming rows
 
 ReDim finalArray(1 To totalOutputRows, 1 To 12)
 
 Dim outputCol As Long
 outputCol = 0
 Dim sourceRow As Long
 
 For i = LBound(finalArray, 1) To UBound(finalArray, 1)
 If i Mod 4 = 1 Then sourceRow = sourceRow + 1
 
 finalArray(i, 1) = GetDate(dataArray(sourceRow, 1), dataArray(sourceRow, 2)) 'this needs formatting to generate output 'yyyy-mm-dd' ;adding -01 as dd
 
 finalArray(i, 2) = dataArray(sourceRow, 1)
 
 finalArray(i, 3) = dataArray(sourceRow, 2)
 
 finalArray(i, 4) = vbNullString 'STP - TODO Add function call in to return this value
 
 finalArray(i, 5) = dataArray(sourceRow, 3)
 
 finalArray(i, 6) = dataArray(sourceRow, 4)
 
 finalArray(i, 7) = dataArray(sourceRow, 5)
 
 finalArray(i, 8) = GetCategory(i) 'Category
 
 Dim n As Long
 
 For n = 9 To 12
 finalArray(i, n) = dataArray(sourceRow, GetFinalColumn(n, i)) 'finalColumn
 Next n
 
 Next i
 GetFinalArray = finalArray
End Function
Public Function ReduceColumnsArray(ByVal dataArray As Variant) As Variant
 'From the initial 79 columns keep only 21 columns as specified in columnsToKeepArray
 Const ColsToKeep As Long = 21
 Dim columnsToKeepArray()
 columnsToKeepArray = Array(1, 2, 3, 4, 5, 14, 16, 17, 18, 19, 25, 28, 31, 20, 26, 29, 32, 21, 27, 30, 33)
 Dim i As Long
 Dim j As Long
 Dim tempArr()
 ReDim tempArr(1 To UBound(dataArray, 1), 1 To ColsToKeep)
 
 For i = LBound(dataArray, 1) To UBound(dataArray, 1)
 
 For j = LBound(columnsToKeepArray) To UBound(columnsToKeepArray)
 tempArr(i, j) = dataArray(i, columnsToKeepArray(j)) 'map the required columns to the output array
 Next j
 Next i
 ReduceColumnsArray = tempArr
End Function
Public Function GetData(ByVal ws As Worksheet, Optional ByVal offsetRows As Long = 0) As Variant
 Dim dataArray()
 Dim startRange As Range
 
 Set startRange = ws.Range("A1").CurrentRegion
 
 With startRange
 dataArray = .Offset(offsetRows, 0).Resize(.Rows.Count - offsetRows, .Columns.Count).Value2
 End With
 GetData = dataArray
End Function
 
Public Function GetDate(ByVal yearValue As String, ByVal monthValue As String) As Date '? or string
 GetDate = CDate(Format$(yearValue & "-" & monthValue & "-" & "01", "yyyy-mm-dd"))
End Function
Public Function GetCategory(ByVal i As Long) As String
 Select Case i Mod 4 'I know number will not exceed range for mod
 Case 1
 GetCategory = "Cat1: Life Threatening"
 Case 2
 GetCategory = "Cat2: Emergency"
 Case 3
 GetCategory = "Cat3: Urgent"
 Case 0
 GetCategory = "Cat4: Less Urgent"
 Case Else
 GetCategory = "Unknown"
 End Select
End Function
Public Function GetFinalColumn(ByVal finalColumn As Long, ByVal i As Long) As Long
 Dim testVal As Long
 Dim n As Long
 testVal = i Mod 4 'i will always be a positive integer. It will not exceed the max for mod.
 Select Case testVal
 
 Case 0
 n = 3
 Case 1
 n = 0
 Case 2
 n = 1
 Case 3
 n = 2
 
 End Select
 Select Case finalColumn
 Case 9
 GetFinalColumn = 6 + n
 Case 10
 GetFinalColumn = 10 + n
 Case 11
 GetFinalColumn = 14 + n
 Case 12
 GetFinalColumn = 18 + n
 End Select
 
End Function

Test dataset:

This represents a sample of rows from dataArray as produced in the sub main by:

dataArray = ReduceColumnsArray(GetData(ws, 1))

Dataset -

| Year | Month | Region | Org Code | Org Name | A8 | A10 | A11 | A12 | A24 | A30 | A33 | A36 | A25 | A31 | A34 | A37 | A26 | A32 | A35 | A38 |
|------|-------|--------|----------|------------------------------------|------|-------|-------|------|---------|----------|----------|----------|-----|------|------|------|-----|------|-------|-------|
| 2017 | 8 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 2017 | 9 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 2017 | 10 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 2017 | 11 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 7658 | 52321 | 22062 | 2586 | 3248966 | 57927986 | 75274310 | 11186014 | 424 | 1107 | 3412 | 4326 | 688 | 2192 | 8100 | 8934 |
| 2017 | 12 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 8551 | 56491 | 21108 | 2748 | 3800787 | 81948038 | 94600478 | 13564482 | 444 | 1451 | 4482 | 4936 | 724 | 3071 | 10736 | 10309 |
| 2018 | 1 | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | 8469 | 55555 | 21759 | 2743 | 3639835 | 68077967 | 80385509 | 11249217 | 430 | 1225 | 3694 | 4101 | 705 | 2528 | 8738 | 8344 |

Output from test dataset

| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/08/2017 | 2017 | 8 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/09/2017 | 2017 | 9 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/10/2017 | 2017 | 10 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | . | . | . | . |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | 7658 | 3248966 | 424 | 688 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | 52321 | 57927986 | 1107 | 2192 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | 22062 | 75274310 | 3412 | 8100 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/11/2017 | 2017 | 11 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | 2586 | 11186014 | 4326 | 8934 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | 8551 | 3800787 | 444 | 724 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | 56491 | 81948038 | 1451 | 3071 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | 21108 | 94600478 | 4482 | 10736 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/12/2017 | 2017 | 12 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | 2748 | 13564482 | 4936 | 10309 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat1: Life Threatening | 8469 | 3639835 | 430 | 705 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat2: Emergency | 55555 | 68077967 | 1225 | 2528 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat3: Urgent | 21759 | 80385509 | 3694 | 8738 |
|------------|------|----|---|-----|-----|------------------------------------|------------------------|-------|----------|------|-------|
| 01/01/2018 | 2018 | 1 | | Y56 | RRU | LONDON AMBULANCE SERVICE NHS TRUST | Cat4: Less Urgent | 2743 | 11249217 | 4101 | 8344 |
asked Feb 15, 2018 at 13:10
\$\endgroup\$
4
  • 1
    \$\begingroup\$ The first Select Case in GetFinalColumn is not needed. Use n = (i - 1) Mod 4 instead. \$\endgroup\$ Commented Feb 16, 2018 at 21:14
  • \$\begingroup\$ Thank you. That is one of the two really really ugly functions in my opinion. Love to take some shears to it. \$\endgroup\$ Commented Feb 16, 2018 at 23:41
  • \$\begingroup\$ Which means I can then say GetFinalColumn = 4 * finalColumn + n -30. Reducing body of function to 3 lines. Though should add a test that finalColumn between 9 and 12 inclusive. \$\endgroup\$ Commented Feb 17, 2018 at 14:32
  • \$\begingroup\$ I had looked at that but didn't find the pattern. Nice job. I hope that you enjoy my post. Sorry that I didn't add many comments. Hopefully, the code reads itself. \$\endgroup\$ Commented Feb 18, 2018 at 3:01

2 Answers 2

2
\$\begingroup\$

Output sheet

First things first - you aren't checking that your output sheet exists and will error if it doesn't.

SheetExists

CreateSheet

You would end up checking for the sheet, creating if it doesn't exist, assigning it to a variable and clearing it in case it had data -

Sub Example()
 If Not SheetExists("Output") Then CreateSheet ActiveWorkbook, "Output"
 Dim outputSheet As Worksheet
 Set outputSheet = ActiveWorkbook.Sheets("Output")
 outputSheet.UsedRange.ClearContents
End Sub
 Function SheetExists(sheetName As String, Optional targetBook As Workbook) As Boolean
 Dim targetSheet As Worksheet
 If targetBook Is Nothing Then Set targetBook = ActiveWorkbook
 On Error Resume Next
 Set targetSheet = targetBook.Sheets(sheetName)
 On Error GoTo 0
 SheetExists = Not targetSheet Is Nothing
 End Function
Sub CreateSheet(ByVal targetBook As Workbook, ByVal sheetName As String)
 With targetBook
 .Sheets.Add(After:=.Sheets(.Sheets.Count)).Name = sheetName
 End With
End Sub

You could also use a sheet's CodeName property - View Properties window (F4) and the (Name) field (the one at the top) can be used as the worksheet name. This way you can avoid Sheets("mySheet") and instead just use mySheet. But, that would be creating a need to set the CodeName of the sheet after you create it, so I don't know how useful it would really be.


Variable names

Give those variables some meaningful names! It's much easier to tell what is going on when, for example wb is targetBook and outputCol is targetOutputColumn. Characters are free, so what's the harm in spelling "column" instead of "col"? I see

wb
ws
outputCol
n

And why not give this a type?

Dim tempArr()

Your use of i and j is acceptable, but why not give the reader some context like targetFirstDimension etc?

Standard VBA naming conventions have camelCase for local variables and PascalCase for other variables and names. Usually constants use 'UPPER_SNAKE_STYLE` (and it's what I use) but you could just use all caps, whatever you do make sure it's clearly a constant rather than just another variable.


Base 1

I know I brought Option Base 1 up, but you should probably avoid changing the default base because it can get confusing to read arrays defined starting at "1" if you expect them to start at "0". Overall, I think you did a pretty good job of explaining the arrays, so maybe get rid of the Option and just change your iterations e.g.

 For i = LBound(finalArray, 1) + 1 To UBound(finalArray, 1) + 1

But, since your arrays pulled from the sheet starts with 1 instead of 0, I don't think you actually need to change the iteration, or even have the Option because the arrays are arr(1 to x) which clearly starts at 1. The only array with a problem is columnsToKeepArray -

 For j = (LBound(columnsToKeepArray) + 1) To (UBound(columnsToKeepArray) + 1)
 tempArr(i, j) = dataArray(i, columnsToKeepArray(j - 1)) 
 Next j

Just replace that and delete the Option Base 1.


Original Data

The way you're getting the default data is a little bit sketchy. Also, setting your target book to ThisWorkbook makes it so you need to always use the macro's book. You can't have any error handling when you just make the assumption that your target sheet will be the first sheet.

Why not save the macro in a certain book and have it target the downloaded data .csv? You could always download it to the same location and if you do the analysis the month after -

Dim targetMonth As String
targetMonth = StrConv(Left(MonthName(Month(Now) - 1), 3), vbProperCase)
Dim fileName As String
fileName = "PATH/TO/New-AmbSYS-to-2018-" & targetMonth & ".csv"

Of course, you need to know if the file is there with something like

Sub FileExists()
 Const PATH_TO As String = "C:\Users\USER\Downloads\New-AmbSYS-to-2018-"
 Dim targetMonth As String
 targetMonth = StrConv(Left(MonthName(Month(Now) - 1), 3), vbProperCase)
 Dim fileName As String
 fileName = PATH_TO & targetMonth & ".csv"
 If Dir(fileName) <> "" Then MsgBox "file exists"
End Sub

Except probably a function where you pass Now and return Boolean to determine if you need to exit the sub.


Populating the first array

I don't understand the offsetRows argument, so

Set startRange = ws.Range("A1").CurrentRegion
 'startRange.Select
 With startRange
 dataArray = .Offset(offsetRows, 0).Resize(.Rows.Count - offsetRows, .Columns.Count).Value2
 End With

I added the .Select to see what is happening. I don't see the need for the .Offset

 dataArray = targetSheet.Range("A1").CurrentRegion

This works just as well, but again the offsetRows argument is a mystery to me.


GetDate

Public Function GetDate(ByVal yearValue As String, ByVal monthValue As String) As Date '? or string
 GetDate = CDate(Format$(yearValue & "-" & monthValue & "-" & "01", "yyyy-mm-dd"))
End Function

Yeah, I think returning a string might be better. What happens if the value passed can't be a date?

GetDate(vbNullString, vbNullString)

Error. CDate error. You need to handle that error. e.g.

Dim myDate As String
Dim monthValue As String
monthValue = "2"
Dim yearValue As String
yearValue = "2018"
myDate = GetDate(yearValue, monthValue)
If Not IsDate(myDate) Then 'handle that error!
Public Function GetDate(ByVal yearValue As String, ByVal monthValue As String) As String
 GetDate = (Format$(yearValue & "-" & monthValue & "-" & "01", "yyyy-mm-dd"))
End Function

GetCatagory

Not a bad function. I might make your comment a little more clear, already it is telling me why instead of how - awesome! Maybe move that to the top of the function with a better description of why.


GetFinalColumn

Hm. How does this work?

For n = 9 To 12
 finalArray(i, n) = dataArray(sourceRow, GetFinalColumn(n, i)) 'finalColumn
Next n

Your variables in the argument and in the function aren't telling me much. It all seems arbitrary at first look. You're basing what column to use on what row is being used, correct? And you do n = 9 to 12 for every i? This seems pretty round-about. You've already done i Mod 4 in GetCategory.

For every 4 rows, change the final column by 4 starting at 6

Right? So the essentially Category determines the column?

Dim sourceRow As Long
Dim categoryByRow As Long
For i = LBound(finalArray, 1) To UBound(finalArray, 1)
 categoryByRow = i Mod 4

You moved the mod up before GetCategory, so you can just pass that and remove the Mod from that function. You can also pass the new variable to GetLastColumn. You also pass n to GetLastColumn, turn it into lastColumn and then create a new n. That's confusing

Dim finalColumn As Long
 For finalColumn = 9 To 12
 finalArray(i, finalColumn) = dataArray(sourceRow, GetFinalColumn(finalColumn, categoryByRow)) 'finalColumn
 Next 
Public Function GetCategory(ByVal categoryByRow As Long) As String
 Select Case categoryByRow
 Case 1
 GetCategory = "Cat1: Life Threatening"
 Case 2
 GetCategory = "Cat2: Emergency"
 Case 3
 GetCategory = "Cat3: Urgent"
 Case 0
 GetCategory = "Cat4: Less Urgent"
 Case Else
 GetCategory = "Unknown"
 End Select
End Function
Public Function GetFinalColumn(ByVal finalColumn As Long, ByVal categoryByRow As Long) As Long
 Select Case categoryByRow
 Case 0
 categoryByRow = 3
 Case 1
 categoryByRow = 0
 Case 2
 categoryByRow = 1
 Case 3
 categoryByRow = 2
 End Select
 Select Case finalColumn
 Case 9
 GetFinalColumn = 6 + categoryByRow
 Case 10
 GetFinalColumn = 10 + categoryByRow
 Case 11
 GetFinalColumn = 14 + categoryByRow
 Case 12
 GetFinalColumn = 18 + categoryByRow
 End Select
End Function

Still, this feels pretty clunky. Why does the i mod 4 give a non-result?

categoryByRow = categoryByRow - 1
If categoryByRow < 0 Then categoryByRow = 3

And your finalColumn also give a non-result? I think there must be a way to refactor that, but I can't figure out the relationship of {9-6,10-10,11-14,12-18}. Someone better at math might be able to.

Public Function GetFinalColumn(ByVal finalColumn As Long, ByVal categoryByRow As Long) As Long
 
 categoryByRow = categoryByRow - 1
 If categoryByRow < 0 Then categoryByRow = 3
 
 Select Case finalColumn
 Case 9
 GetFinalColumn = 6 + categoryByRow
 Case 10
 GetFinalColumn = 10 + categoryByRow
 Case 11
 GetFinalColumn = 14 + categoryByRow
 Case 12
 GetFinalColumn = 18 + categoryByRow
 End Select
End Function
answered Feb 15, 2018 at 21:50
\$\endgroup\$
1
  • 1
    \$\begingroup\$ I'm not surprised. Excellent. I just knew, if you saw this, I'd get hammered over the Option Base. Very good point about the fact I could have changed columnsToKeepArray. I will enjoy digesting your feedback. Really appreciate the effort and time you have put in. \$\endgroup\$ Commented Feb 16, 2018 at 5:47
1
\$\begingroup\$

What follows is more of a rewrite of the OP's code then a review. The main reason for the rewrite is that the original code maps the original data to new array and then uses another map to write the data to the final array. In my opinion, mapping the data directly from the original array is more efficient, easier to read, and to modify.

The reason that the code is difficult to implement is that each row of original data contains 4 records with a lot of repeat data. Instead of creating a complex formula to map out the columns of the non-repeating data, I simply repeated the columns of the repeat data in my column map. After that it was a simply matter of adding extra counters to track the current row and column.

Private Function getColumn(c As Long) As Long
 getColumn = Choose(c, _
 0, 1, 2, 0, 3, 4, 5, 0, 14, 19, 20, 21, _
 0, 1, 2, 0, 3, 4, 5, 0, 16, 25, 26, 27, _
 0, 1, 2, 0, 3, 4, 5, 0, 17, 28, 29, 30, _
 0, 1, 2, 0, 3, 4, 5, 0, 18, 31, 32, 33)
End Function

Note: Zero's represent calculated columns that do not map to the data.


Private Const COLUMN_COUNT As Long = 12
Private Const OUTPUT_SHEET_NAME = "Output"
Public Sub Main()
 Dim results() As Variant
 results = GetData(ThisWorkbook.Worksheets(1))
 results = getPivotData(results)
 With getOutputWorksheet
 .Range("A2").Resize(UBound(results), UBound(results, 2)).Value = results
 End With
End Sub
Private Function getPivotData(ByRef data As Variant)
 Dim c As Long, c2 As Long, i As Long, r As Long, r2 As Long
 Dim results() As Variant
 ReDim results(1 To UBound(data) * 4, 1 To COLUMN_COUNT)
 r2 = 1
 For r = 1 To UBound(data, 1)
 c2 = 1
 For i = 1 To 4
 For c = 1 To COLUMN_COUNT
 If c = 1 Then
 results(r2, 1) = DateSerial(data(r, 1), data(r, 2), 1) 'Date
 ElseIf c = 4 Then
 'STP - TODO Add function call in to return this value
 ElseIf c = 8 Then
 results(r2, 8) = getCategory(r2)
 Else 'Non Repeated Data
 results(r2, c) = data(r, getColumn(c2))
 End If
 'results(r2, c) = getColumn(c2)
 c2 = c2 + 1
 Next
 r2 = r2 + 1
 Next
 Next
 getPivotData = results
End Function
Private Function getData(ws As Worksheet) As Variant()
 Dim Target As Range
 Set Target = Intersect(ws.UsedRange, ws.UsedRange.Offset(1))
 If Not Target Is Nothing Then getData = Target.Value2
End Function
Private Function getColumn(c As Long) As Long
 getColumn = Choose(c, _
 0, 1, 2, 0, 3, 4, 5, 0, 14, 19, 20, 21, _
 0, 1, 2, 0, 3, 4, 5, 0, 16, 25, 26, 27, _
 0, 1, 2, 0, 3, 4, 5, 0, 17, 28, 29, 30, _
 0, 1, 2, 0, 3, 4, 5, 0, 18, 31, 32, 33)
End Function
Private Function getOutputWorksheet() As Worksheet
 Dim ws As Worksheet
 On Error Resume Next
 Set ws = ThisWorkbook.Worksheets(OUTPUT_SHEET_NAME)
 On Error GoTo 0
 If ws Is Nothing Then
 Set ws = ThisWorkbook.Worksheets.Add
 ws.Name = OUTPUT_SHEET_NAME
 Else
 ws.Cells.Clear
 End If
 With ws
 .Range("A1").Resize(1, COLUMN_COUNT).Value = Array("Date ", "Year", "Month", "STP", "Region", "Org Code", "Org Name", "Category", "Count of Incidents", "otal (hours)", "Mean (hh:mm:ss)", "90th centile (hh:mm:ss)")
 .Columns("A:A").NumberFormat = "yyyy-mm-dd"
 End With
 Set getOutputWorksheet = ws
End Function
Private Function getCategory(ByVal c As Long) As String
 getCategory = Array("Cat4: Less Urgent", "Cat1: Life Threatening", "Cat2: Emergency", "Cat3: Urgent")(c Mod 4)
End Function

Note: I purposely did away with the getDate function because dates should be stored as numbers. The date format should be displayed by formatting the ranges preferably formatting the entire Column. The OP implemented this due to have different regional setting on his workstations.

I also shortened the getData and getCategory to demonstrate some alternate techniques.

TODO: Create Error handlers and add comments.

Addendum: DateSerial Overflow Issue

This is in response to a comment by the OP.

The OP ran into an issue while implementing the code on his complete Dataset. data(r, 1) = 42743 was causing an DateSerial() to throw an Overflow Error. I am not sure if the actual Dataset has a mix of Years and Dates or if his Test Dataset has Years and the actual Dataset has Dates formatted yyyy.

If the actual data is Dates formatted as years use:

results(r2, 1) = DateSerial(Year(data(r, 1)), data(r, 2), 1)

If the values are mixed use:

If IsDate(data(r, 1)) Then data(r, 1) = Year(data(r, 1))
results(r2, 1) = DateSerial(data(r, 1), data(r, 2), 1)
answered Feb 18, 2018 at 2:59
\$\endgroup\$
6
  • \$\begingroup\$ I will enjoy reading this. I can see you have re-written getCategory the same way I did after you had made your very first comment. \$\endgroup\$ Commented Feb 18, 2018 at 6:14
  • \$\begingroup\$ If I could give you another plus one for that getData function I would. \$\endgroup\$ Commented Feb 18, 2018 at 9:02
  • \$\begingroup\$ I just rewrote getData to avoid a possible error. But the basic premise remains the same. Thanks \$\endgroup\$ Commented Feb 18, 2018 at 15:57
  • \$\begingroup\$ With the full dataset I am getting runtime error 6 - overflow on results(r2, 1) = DateSerial(data(r, 1), data(r, 2), 1) 'Date \$\endgroup\$ Commented Feb 22, 2018 at 14:59
  • \$\begingroup\$ data(r, 1) = 42743 and data(r,2) = 2017 and r = 67 and r2 = 265. I noticed a similar problem at this exact same number (r2 of 265) with my code and am beginning to wonder if my logic is flawed somewhere. I had run everything up until now with a smaller dataset than the full. \$\endgroup\$ Commented Feb 22, 2018 at 15:02

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.