1
\$\begingroup\$

I have two large worksheets that I need to consolidate select data into one worksheet. Both worksheets contain about 80K+ rows, the output is expected to be in that range as well. At this point in time, the code works, but it is extremely slow. I don't actually know how long it actually takes to run on the full data set. I have let is run for 8+ hours without having it finish.

The procedure uses a for each loop to move through all the rows of the primary worksheet wsICD10 and selects all rows of concern (based on the criteria in the if statement) and logs the LOS_Group for use later in the VBA autofilter of the wsDUNST15 worksheet. I use two nested for loops to exhaustively extract all the data from the wsDUNST15. Once all my variables are full I write them to my new workbook and move on to the next record of interest in my wsICD10 with the for each loop.

My guess is the combination of nested loops and auto-filtering the large worksheets is an inefficient way to complete my task. But I'm unaware of another way to select a worksheet row using multiple criteria.

Public Sub ICD10DataSet()
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Create ICD-10-CM/PCS data set
'
'ICD10_Type | ageCategory | DiagnosisProcedure_Category | ICD10_Code | LOS_Group | AVG_Stay
'
'
' This will use Truven October, 2017 data files
'
' By Trevor Pye
'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Dim wb As Workbook
Dim wbData As Workbook
Dim ICD10_Data As Worksheet
Dim wsICD10 As Worksheet
Dim wsDUNST15 As Worksheet
Dim ICD_Cell As Range
Dim ICD_Range As Range
Dim ICD10_Type As String
Dim AgeCat As Integer
Dim diagProced_Cat As Integer
Dim ICD10_Code As String
Dim LOS_Group As String
Dim AVG_Stay As Double
Dim startTime As Double
Application.ScreenUpdating = False
startTime = Time
Set wb = Workbooks("LOS_WorkBookICD-10_201710.xlsm")
Set wbData = Workbooks.Add
Set wsICD10 = wb.Worksheets("ICD10Full")
Set wsDUNST15 = wb.Worksheets("DUNST15")
Set wsOUNST15 = wb.Worksheets("OUNST15")
Set ICD_Range = wsICD10.Range("A2", wsICD10.Range("A1048576").End(xlUp))
Set ICD_Cell = wsICD10.Range("A2")
Set ICD10_Data = wbData.Sheets(1)
ICD10_Data.Range("A1") = Format(startTime, "HH:MM:SS")
'---- Start Header with block -------
With ICD10_Data
 .Cells(2, 1).Value = "ICD10_Type"
 .Cells(2, 2).Value = "ageCategory"
 .Cells(2, 3).Value = "DiagnosisProcedure_Category"
 .Cells(2, 4).Value = "ICD10_Code"
 .Cells(2, 5).Value = "LOS_Group"
 .Cells(2, 6).Value = "AVG_Stay"
End With
'---- Header Header with block -------
t = 3 '<--- starting row number
' define ICD_Type, ICD10_Code, LOS_Group
'-----Start ICD10 Code loop ---- 
For Each ICD_Cell In ICD_Range
 If ICD_Cell.Offset(, 4) = "@" Or ICD_Cell.Offset(-1, 2) = ICD_Cell.Offset(0, 2) Or ICD_Cell.Offset(, 12) = "" Then
 GoTo SkipCodeBlock '<--- Row not of interest, skip to next row
 Else
 ICD10_Code = ICD_Cell.Offset(0, 2).Value 
 LOS_Group = ICD_Cell.Offset(, 12) 
 ICD10_Type = ICD_Cell.Value 
 End If
'-------Start of nested diagnosis Loops ---------- 
 If ICD10_Type = "D" Then
 For i = 1 To 5
 AgeCat = i ' <--- setting the Age category integer value
 For j = 1 To 4
 Select Case j ' <--- setting the diagnosis integer value
 Case 1
 diagProced_Cat = 0
 Case 2
 diagProced_Cat = 1
 Case 3
 diagProced_Cat = 3
 Case Else
 diagProced_Cat = 4
 End Select
 With wsDUNST15.UsedRange '<-- retrieving the row of interest
 .AutoFilter field:=4, Criteria1:=LOS_Group
 .AutoFilter field:=5, Criteria1:=diagProced_Cat
 .AutoFilter field:=6, Criteria1:=AgeCat
 End With
 AVG_Stay = 
(wsDUNST15.Range(wsDUNST15.Range("H1048576").End(xlUp), 
"H2").SpecialCells(xlCellTypeVisible).Value)/10
 wsDUNST15.ShowAllData ' <---Resetting filters
 With ICD10_Data ' log Results
 finalRow = .Cells(Rows.Count, 1).End(xlUp).Row + 1
 .Cells(t, 1).Value = ICD10_Type
 .Cells(t, 2).Value = AgeCat
 .Cells(t, 3).Value = diagProced_Cat
 .Cells(t, 4).Value = ICD10_Code
 .Cells(t, 5).Value = "'" & LOS_Group
 .Cells(t, 6).Value = AVG_Stay
 End With
 t = t + 1
 Next j
 Next i
 Else
 GoTo complete
 End If
'-------End of nested diagnosis Loops ------- 
SkipCodeBlock:
Next ICD_Cell
'------End ICD10 Code loop -----
Application.ScreenUpdating = True
complete:
endTime = Time
ICD10_Data.Range("b1") = Format(endTime, "HH:MM:SS")
MsgBox "Procedure Complete!"
End Sub
Sᴀᴍ Onᴇᴌᴀ
29.5k16 gold badges45 silver badges201 bronze badges
asked Dec 15, 2017 at 19:57
\$\endgroup\$
3
  • \$\begingroup\$ In order to get a quality answer you should post some sample data and sample output and any Worksheet formulas that you may be using. I think that you mentioned a result set of +10K before the question was edited. That implies autofiltering the data +10k times. If that is the case give up on this approach. It work probably take less than a minute to process all the data in arrays. \$\endgroup\$ Commented Dec 17, 2017 at 8:31
  • \$\begingroup\$ I will add sample data later today. \$\endgroup\$ Commented Dec 18, 2017 at 16:38
  • \$\begingroup\$ The Dataset I'm working with is too big for excel, it exceeds the 1,048,576 row max. I probably looking for a Database solution thanks for your help. \$\endgroup\$ Commented Dec 19, 2017 at 21:25

2 Answers 2

1
\$\begingroup\$

I didn't have any data to work with so it likely needs debugged, but the general idea is to use sorting and cell formulas to do more of the heavy lifting than AutoFilter:

'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Create ICD-10-CM/PCS data set
'
'ICD10_Type | ageCategory | DiagnosisProcedure_Category | ICD10_Code | LOS_Group | AVG_Stay
'
'
' This will use Truven October, 2017 data files
'
' By Trevor Pye
'
''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
Public Sub ICD10DataSet()
' Dim ICD10_Type As String
' Dim AgeCat As Integer
' Dim diagProced_Cat As Integer
' Dim ICD10_Code As String
' Dim LOS_Group As String
' Dim AVG_Stay As Double
 Dim startTime As Double, endTime As Double
 Application.ScreenUpdating = False
 startTime = Time
 Debug.Print "Started: " & Format(startTime, "HH:MM:SS")
 Dim wbData As Workbook, ICD10_Data As Worksheet
 Set wbData = Workbooks.Add
 Set ICD10_Data = wbData.Sheets(1)
 Dim wb As Workbook, wsICD10 As Worksheet, wsDUNST15 As Worksheet
 Set wb = Workbooks("LOS_WorkBookICD-10_201710.xlsm")
 Set wsICD10 = wb.Worksheets("ICD10Full")
 Set wsDUNST15 = wb.Worksheets("DUNST15")
' Set wsOUNST15 = wb.Worksheets("OUNST15")
' Dim ICD_Range As Range
' Set ICD_Range = wsICD10.Range("A2", wsICD10.Range("A1048576").End(xlUp))
 '---- Copy ICD10 data
 wsICD10.UsedRange.Columns(1).Copy ICD10_Data.Range("A1")
 wsICD10.UsedRange.Columns(3).Copy ICD10_Data.Range("D1")
 wsICD10.UsedRange.Columns(13).Copy ICD10_Data.Range("E1")
 '---- Filter ICD10 data
 Dim ToRange As Range, ICD_Cell As Range
 Set ToRange = ICD10_Data.UsedRange.Columns(2).Offset(1, 0)
 Set ToRange = ToRange.Resize(ToRange.Rows.Count - 1, 1)
 Set ICD_Cell = wsICD10.Range("A2")
 ToRange.Formula = "=OR(" & _
 ICD_Cell.Offset(, 4).Address(False, True, xlA1, True) & "=""@""," & _
 ICD_Cell.Offset(-1, 2).Address(False, True, xlA1, True) & "=" & ICD_Cell.Offset(0, 2).Address(False, True, xlA1, True) & "," & _
 ICD_Cell.Offset(, 12).Address(False, True, xlA1, True) & "=""""," & _
 ICD_Cell.Address(False, True, xlA1, True) & "<>""D""" & _
 ")"
 ToRange.Copy
 ToRange.PasteSpecial xlPasteValues
 Application.CutCopyMode = False
 ICD10_Data.Sort.SortFields.Clear
 ICD10_Data.Sort.SortFields.Add Key:=ICD10_Data.Range("B1"), _
 SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
 With ICD10_Data.Sort
 .SetRange ICD10_Data.UsedRange
 .Header = xlYes
 .MatchCase = False
 .Orientation = xlTopToBottom
 .Apply
 End With
 Dim f As Range
 Set f = ToRange.Find(What:="true", LookIn:=xlValues, lookat:=xlWhole, SearchOrder:=xlByColumns)
 If Not f Is Nothing Then
 ICD10_Data.Range(ICD10_Data.Cells(f.Row, 1), ICD10_Data.Cells(ICD10_Data.UsedRange.Rows.Count, 1)).EntireRow.Delete xlShiftUp
 End If
 ICD10_Data.Columns(2).ClearContents
 '---- Start Header with block -------
 With ICD10_Data
 .Cells(1, 1).Value = "ICD10_Type"
 .Cells(1, 2).Value = "ageCategory"
 .Cells(1, 3).Value = "DiagnosisProcedure_Category"
 .Cells(1, 4).Value = "ICD10_Code"
 .Cells(1, 5).Value = "LOS_Group"
 .Cells(1, 6).Value = "AVG_Stay"
 End With
 '---- Header Header with block -------
 '---- Additional Filter
' ICD10_Data.UsedRange.RemoveDuplicates Columns:=Array(1, 4, 5), Header:=xlYes
 Dim DupeRange As Range
 Set DupeRange = ICD10_Data.UsedRange
 Set DupeRange = DupeRange.Offset(1, 0).Resize(DupeRange.Rows.Count - 1, DupeRange.Columns.Count)
 Set ToRange = DupeRange.Columns(1).Cells(DupeRange.Rows.Count).Offset(1, 0)
 '---- Expand ICD10 data by AgeCat 1 to 5
 'ICD10_Data.Range("B2") = [1..5]
 Dim i As Long
 For i = 1 To 5
 DupeRange.Columns(2).Value = i
 If i < 5 Then
 DupeRange.Copy ToRange
 Set ToRange = ToRange.Offset(DupeRange.Rows.Count, 0)
 End If
 Next i
 '---- Expand ICD10 data by diagProced_Cat {0,1,3,4}
 'ICD10_Data.Range("C2") = [0,1,3,4]
 Dim diagProced_Cat As Variant
 diagProced_Cat = Array(0, 1, 3, 4)
 Set DupeRange = ICD10_Data.UsedRange
 Set DupeRange = DupeRange.Offset(1, 0).Resize(DupeRange.Rows.Count - 1, DupeRange.Columns.Count)
 Set ToRange = DupeRange.Columns(1).Cells(DupeRange.Rows.Count).Offset(1, 0)
 For i = LBound(diagProced_Cat) To UBound(diagProced_Cat)
 DupeRange.Columns(3).Value = diagProced_Cat(i)
 If i < UBound(diagProced_Cat) Then
 DupeRange.Copy ToRange
 Set ToRange = ToRange.Offset(DupeRange.Rows.Count, 0)
 End If
 Next i
 '---- Calculate AVG_Stay via cell formula
 Dim LOS_GroupAddr As String, diagProced_CatAddr As String, AgeCatAddr As String, StayAddr As String
 LOS_GroupAddr = wsDUNST15.UsedRange.Columns(4).Address(True, True, xlA1, True)
 diagProced_CatAddr = wsDUNST15.UsedRange.Columns(5).Address(True, True, xlA1, True)
 AgeCatAddr = wsDUNST15.UsedRange.Columns(6).Address(True, True, xlA1, True)
 StayAddr = wsDUNST15.UsedRange.Columns(8).Address(True, True, xlA1, True)
 Set ToRange = ICD10_Data.UsedRange.Columns(6).Offset(1, 0)
 Set ToRange = ToRange.Resize(ToRange.Rows.Count - 1, 1)
 ToRange.Formula = "=SUMPRODUCT((" & LOS_GroupAddr & "=$E2)*(" & diagProced_CatAddr & "=$C2)*(" & AgeCatAddr & "=$B2)*(" & StayAddr & "))/10"
 ToRange.Copy
 ToRange.PasteSpecial xlPasteValues
 Application.CutCopyMode = False
 ICD10_Data.Rows(1).Insert xlShiftDown
 ICD10_Data.Range("A1") = Format(startTime, "HH:MM:SS")
 endTime = Time
 ICD10_Data.Range("B1") = Format(endTime, "HH:MM:SS")
 Debug.Print "Finished: " & Format(endTime, "HH:MM:SS")
 Application.ScreenUpdating = True
'complete:
 MsgBox "Procedure Complete!"
End Sub
answered Dec 16, 2017 at 9:09
\$\endgroup\$
1
  • \$\begingroup\$ Looks promising, there are some bugs that I will work through and will publish the new version when complete. \$\endgroup\$ Commented Dec 18, 2017 at 16:37
0
\$\begingroup\$

SQL version based on comment:

I'm probably looking for a database solution

Since you're in Excel I'm guessing you'd be able to try a MS-Access solution

  1. Import ICD10Full and DUNST15 as tables
  2. Manually Create the two Tables [AgeCats] and [DiagProcs]:
AgeCats: DiagProcs:
 ageCategory DiagnosisProcedure_Category
 ----------- ---------------------------
 1 0
 2 1
 3 3
 4 4
 5
  1. Create query [010 Filter ICD10Full], Note: ColumnE is referred to in your code If ICD_Cell.Offset(, 4) = "@" but never given a name
SELECT distinct ICD10_Type, ICD10_Code, LOS_Group
FROM [ICD10Full]
WHERE
 ColumnE <> '@'
 AND LOS_Group <> ''
 AND ICD10_Type = 'D';

4. Create Query [020 ICD10Reduced]

SELECT ICD10_Type,ageCategory,DiagnosisProcedure_Category,ICD10_Code,LOS_Group
FROM [010 Filter ICD10Full],[AgeCats],[DiagProcs];

5. Create Query [030 ICD10 Summary], Note: I've left it as LEFT JOIN to match your original code, but you'll probably want INNER JOIN to reduce the nothingness, also ColumnH is never given a name in your code wsDUNST15.Range("H1048576")

SELECT
 ICD10.ICD10_Type
 ,ICD10.ageCategory
 ,ICD10.DiagnosisProcedure_Category
 ,ICD10.ICD10_Code
 ,ICD10.LOS_Group
 ,Avg(DUNS15.ColumnH) AS AVG_Stay
FROM [020 ICD10Reduced] AS ICD10 LEFT JOIN [DUNS15] ON
 ICD10.geCategory = DUNS15.ageCategory
 AND ICD10.DiagnosisProcedure_Category = DUNS15.DiagnosisProcedure_Category
 AND ICD10.LOS_Group = DUNS15.LOS_Group
GROUP BY
 ICD10.ICD10_Type
 ,ICD10.ageCategory
 ,ICD10.DiagnosisProcedure_Category
 ,ICD10.ICD10_Code
 ,ICD10.LOS_Group;

Other databases (SQL-Server, Oracle, MySQL, PostgreSQL, ...) would have different formatting of the SQL code, but the steps would more or less remain the same.

answered Dec 28, 2017 at 23:22
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.