I have a repetitive data reduction where I have a list of ID's that I want to find in a longer list. I have a method, but it is slower than I'd like. How can can I speed this up?
The process: After each contiguous section of ID's I want to insert a blank row (I'll do other things with the blank row but that's not important).If I have a unique list containing 10, 20, 30, 40. I want to insert a row below each of the contiguous sections of these numbers for example.
As in,
ID | -- |
---|---|
10 | |
10 | |
10 | *Insert below here |
15 | |
15 | *DO NOT INSERT HERE |
20 | |
20 | |
20 | *Insert below here |
30 | |
30 | *Insert below here |
10 | |
10 | |
10 | *Insert below here |
40 | *Insert below here |
I have created a way to do this, but for >40,000 lines of data and about 20 "ID's of interest" which can go for a thousand rows a piece it is not particularly efficient. Would auto filtering potentially improve efficiency or what about other methods? Storing data in array rather than using Find?
Here is an example of the module that will run on data formatted like the screenshot and workbook attached.
Option Explicit
Public Sub SummaryDataPopulate()
Dim DataWs As Worksheet: Set DataWs = ActiveSheet
Dim IDNum As Long
Dim IDCurRow As Long
Dim IDofInterestRows As Long
Dim StartRow As Long
Dim EndRow As Long
Dim Counter As Long: Counter = 0
Const IDCol As Long = 3
Application.ScreenUpdating = False
With DataWs
IDofInterestRows = .Columns("A").Cells.Find("*", SearchOrder:=xlByRows, LookIn:=xlValues, SearchDirection:=xlPrevious).Row
For IDCurRow = 3 To IDofInterestRows
IDNum = DataWs.Range("A" & IDCurRow).Value
StartRow = .Columns(IDCol).Find(IDNum).Row 'could update this in future to handle repeated ID's, currently finds first ID
EndRow = find_end_row(IDNum, StartRow, IDCol, DataWs)
'insert row & color
.Rows(EndRow + 1).Insert
.Rows(EndRow + 1).Interior.ThemeColor = xlThemeColorAccent6
'check now if there is another section
Do While ID_rerun_check(IDNum, EndRow, IDCol, DataWs) 'true if another section of same ID
StartRow = .Columns(IDCol).Find(What:=IDNum, After:=.Cells(EndRow + 2, IDCol)).Row
EndRow = find_end_row(IDNum, StartRow, IDCol, DataWs) 'new end row
'insert row & Color
.Rows(EndRow + 1).Insert
.Rows(EndRow + 1).Interior.ThemeColor = xlThemeColorAccent6
Loop
Next IDCurRow
End With
Application.ScreenUpdating = True
End Sub
Private Function ID_rerun_check(ByVal IDNum As Long, ByVal PreviousEndRow As Long, ByVal IDCol As Long, ByRef data_ws As Worksheet) As Boolean
Dim FoundRow As Long
FoundRow = data_ws.Columns(IDCol).Find(What:=IDNum, After:=data_ws.Cells(PreviousEndRow + 1, IDCol)).Row
ID_rerun_check = Not (FoundRow <= PreviousEndRow) 'false if the find has looped, 'true if find finds another section
End Function
Private Function find_end_row(ByVal IDNum As Long, ByVal StartRow As Long, ByVal IDCol As Long, ByRef data_ws As Worksheet) As Long
Dim ii As Long
Dim EndRow As Long
Dim DataRows As Long
DataRows = data_ws.Columns("C").Cells.Find("*", SearchOrder:=xlByRows, LookIn:=xlValues, SearchDirection:=xlPrevious).Row
For ii = StartRow To DataRows
EndRow = data_ws.Columns(IDCol).Find(What:=IDNum, After:=data_ws.Cells(ii, IDCol)).Row
If EndRow - 1 <> ii Then 'check if cycle back to top or if jump in value
Exit For 'exits for loop
End If
Next ii
find_end_row = ii
End Function
Here is a datafile: Sheet1 has the simple data, Sheet3 has data more realistic https://docs.google.com/spreadsheets/d/1ftCcHfiw7Sujg5L791gyn877bqmhaENy/edit?usp=sharing&ouid=111738563125877088842&rtpof=true&sd=true
1 Answer 1
If I understand you correctly....
Before & after running the sub
enter image description here ===> enter image description here
The list is 10,20,30,40,50,60,70,80,90,100.
The yellow fill just to show that it won't insert row because the value is not consecutive to the next row (only one cell).
I test the time spent by multiplying the data in "before running the sub" until cell A155649
(around 155k rows) with 10 item to compare (10 to 100) it takes 7 second something, quite a very long time to wait which I hope it's still OK for you to wait 7 second something.
Sub test()
Dim arrList: Dim arr: Dim itm: Dim el
Dim rg As Range: Dim addr As String
mulai = Timer
Application.ScreenUpdating = False
arrList = Array(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
Set rg = Range("A2", Range("A2").End(xlDown))
For Each el In arrList
With rg
.Replace el, True, xlWhole, , False, , False, False
arr = Split(.SpecialCells(xlConstants, xlLogical).Address, ",")
.Replace True, el, xlWhole, , False, , False, False
End With
For Each itm In arr
If InStr(itm, ":") Then addr = addr & "," & Split(itm, ":")(1)
Next
If addr <> "" Then
addr = Right(addr, Len(addr) - 1)
Range(addr).Offset(1, 0).EntireRow.Insert Shift:=xlDown
addr = ""
End If
Next
Debug.Print Timer - mulai '(7.375 / 7.4375 / 7.242188)
End Sub
Basically it just collect all the last consecutive address of each looped element (10,20,30 etc), offset one row below then insert entire row.
With the data until A155649
I thought I can just collect all the looped element in one string address, offset it one row below and insert entire row all at once, but I don't know why it gave me error 1004 at the insert code line. So that's why I break the insert row process to each looped element.
Please note that the code assume each list (10,20,30...100) exist in column A. It will throw error if one of the list does not exist in column A. Need to add checking code to avoid the error, something like if not .find(el) is nothing then
Not sure though if the code can be applied to your situation or not.
I want to insert a row below each of the contiguous sections of these numbers
I wonder why in your sample ID 40, you put "*Insert below here". Please CMIIW.
DM_INSERT2
UDF function in my array repo. Or, if not needed as an UDF then use theInsertRowsAtValChange
that works in any application supporting VBA. These methods also handle top and bottom rows besides the rows where value changes. There's a Demo workbook that showcases the UDF. As they are they won't handle special cases like in your example with 15 but you can add a helper column to the array and run the method on that instead. E.g. round down to multiple of ten and then just insert blank rows at value change \$\endgroup\$InsertRowsAtValChange
is FAST. The only change I had to make on my end is where you checkIf currentKey <> previousKey then
I added a nestedIf
statement:If IsNumeric(Application.Match(arr(i-1,columnIndex), inRng, 0)) Then
whereinRng
is a range variable storing the ID's of interest. They already exist printed to a sheet in a known location and it seems to work well with the Match function which is fast enough. I don't love the nest but it helps to avoid arr(i-1) when i<LBound(arr) \$\endgroup\$