the below code used to:
Concatenate the values on a specific column "N" depend on the value of column "A" then delete the remaining rows.
It works, but with range of 30k rows the macro takes a long time to finish (14 seconds) on powerful PC.
Edit:
the bottle neck is on this line .SpecialCells(xlCellTypeConstants).EntireRow.Delete
(it takes 13.5 seconds) from the overall code time ( 14 seconds).
I tried to replace it with VBA AutoFilter
, but the same issue.
This is updated screenshot: of current values and the current result,
My goal is to do all processing on arrays or dictionary to achieve the fastest speed.
I have office 2016 on my work.
Option Explicit
Option Compare Text
Sub Concatenate_column_N_values_Delete_remaining_Rows()
Dim t: t = Timer
Const sep As String = vbLf
Dim arrKeys, arrVals, arrFlags, rngRows As Range, key, currKey, s As String
Dim ub As Long, n As Long, ws As Worksheet, rngVals As Range, i As Long
Set ws = ActiveSheet
Set rngRows = ws.Range("A2:A" & ws.Cells(ws.Rows.Count, "A").End(xlUp).Row) 'Column Contains WO
Set rngVals = rngRows.EntireRow.Columns("N") 'Column contains pure string
Application.ScreenUpdating = False
arrKeys = rngRows.Value2
ub = UBound(arrKeys, 1)
arrVals = rngVals.Value2
ReDim arrFlags(1 To UBound(arrKeys, 1), 1 To 1)
currKey = Chr(0) 'non-existing key...
For i = ub To 1 Step -1 'looping from bottom up
key = arrKeys(i, 1) 'this row's key
If key <> currKey Then 'different key from row below?
If i < ub Then arrVals(i + 1, 1) = s 'populate the collected info for any previous key
s = arrVals(i, 1) 'collect this row's "N" value
currKey = key 'set as current key
Else
If i < ub Then
arrFlags(i + 1, 1) = "x" 'flag for deletion
n = n + 1
End If
s = arrVals(i, 1) & sep & s 'concatenate the "N" value
End If
Next i
arrVals(1, 1) = s 'populate the last (first) row...
rngVals.Value = arrVals 'drop the concatenated values
If n > 0 Then 'any rows to delete?
With rngRows.Offset(0, 100) 'use any empty column
.Value = arrFlags
.SpecialCells(xlCellTypeConstants).EntireRow.Delete
End With
End If
Application.ScreenUpdating = True
Debug.Print "Concatenate_column_N_values_Delete_remaining_Rows, in " & Round(Timer - t, 2) & " sec"
End Sub
-
1\$\begingroup\$ You can do this almost instantly with powerquery groupby learn.microsoft.com/en-us/power-query/group-by. Alternatively using the new array helper functions UNIQUE & BYROW should be pretty speedy for that amount of data (not that I've got anything against VBA, I just don't think it's great for data manipulation) \$\endgroup\$Greedo– Greedo2023年04月17日 17:41:23 +00:00Commented Apr 17, 2023 at 17:41
2 Answers 2
As mentioned, in Excel 2016 you should have access to the Power Query Editor through Data
tab -> From Table/Range
Then combining the text as you have done and loading to a new table should be easy - this is the entire code:
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Concat = Table.Group(Source, {"WO"}, {{"TEXT", each Text.Combine([TEXT]," "), type text}})
The first line loads "Table1" from your workbook. The second line Group
s the table rows based on {"WO"}
column, and aggregates each
group of rows using the Text.Combine
function on the [TEXT]
column of the table, with spaces as separators.
To generate that you follow these steps:
- Click inside the table with the data
Data Tab -> From Table/Range
to launch PowerQuery (PQ)Home -> GroupBy
- Choose the WO column in the first dropdown, and for the aggregation function choose anything, we'll overwrite this
- Click okay and now in the formula bar you should see
= Table.Group(#"Changed Type", {"WO"} ...
, replace that with= Table.Group(#"Changed Type", {"WO"}, {{"TEXT", each Text.Combine([TEXT]," "), type text}})
. Hit enter- Optional - In the steps pane on the right you can right click delete the
Changed Type
step as it is not really needed. You can also rename the steps.
- Optional - In the steps pane on the right you can right click delete the
- Now in the top left hit the close and load 💾 icon and your PQ will load to a new tab by default. Every time your source data changes, go
Data -> Refresh All
to re-run the PQ.- Optional - You can use the dropdown under the save icon to
Close and Load To
a different location, e.g. a table next to the source data
- Optional - You can use the dropdown under the save icon to
Justification
Like VBA
- It is built into Excel and the powerquery travels embedded in your workbook so you won't lose it
- It can dynamically switch to pulling data from different sources
This is an improvement over VBA because
- It is simpler (much less code) and easier to understand, adjust and modify as a result
- It should be very fast, powerquery can run multithreaded and generally is optimised for data manipulation
- You will find it supports built in features VBA does not, if you want to build on this data manipulation in future
- The editor for powerquery is graphical - so you don't type the code but instead click buttons to insert it for you, making it simpler to learn and less error prone
Update
To join with newlines use Text.Combine([TEXT],"#(cr)#(lf)")
instead of Text.Combine([TEXT]," ")
, or use Lines.ToText([TEXT])
but this adds a trailing newline.
-
\$\begingroup\$ I tried, but I got a space between each text and I need each text to be on a new line like my provided photo. I replaced
" "
withvbCrLf
but it did not work ,= Table.Group(#"Changed Type", {"WO"}, {{"Urlname", each Text.Combine([Urlname], vbCrLf), type text}})
\$\endgroup\$Leedo– Leedo2023年04月19日 06:30:57 +00:00Commented Apr 19, 2023 at 6:30 -
\$\begingroup\$ @Leedo sorry different spot that. You can use
Lines.ToText([Urlname])
orText.Combine([Urlname], "#(lf)")
instead I believe stackoverflow.com/q/63959297/6609896 \$\endgroup\$Greedo– Greedo2023年04月19日 08:40:52 +00:00Commented Apr 19, 2023 at 8:40 -
1\$\begingroup\$ @Leedo see update. please let me know the performance \$\endgroup\$Greedo– Greedo2023年04月19日 09:01:11 +00:00Commented Apr 19, 2023 at 9:01
-
\$\begingroup\$ I tried and it works, up-voted. \$\endgroup\$Leedo– Leedo2023年04月19日 11:40:38 +00:00Commented Apr 19, 2023 at 11:40
The Bottle Neck:
the bottle neck is on this line .SpecialCells(xlCellTypeConstants).EntireRow.Delete
(it takes 13.5 seconds) from the code overall time ( 14 seconds).
Reason:
It is tuned out that deletion of a lot of non-continuous rows takes a lot of time to finish, even after using Application optimizations (ScreenUpdateing.False
,...)
Answer:
I tried another approach by sort the values (rows) which need to be deleted and then set this rows to a range and then delete that range ,
I measured (Sorting values + Deletion of that range) and it toke 0.12 sec to finish (significantly faster).
I replaced .SpecialCells(xlCellTypeConstants).EntireRow.Delete
with the below code.
Sub Sort_vlaues_x_and_Delete()
Dim ws As Worksheet: Set ws = ActiveSheet
Dim rng As Range, lastR As Long, lastC As Long, lastcol As String
lastR = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row 'Last Row number on coulmn A
lastC = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column + 1 'next Last Column on Row 1
lastcol = Split(Cells(1, lastC).Address(True, False), "$")(0) 'Last Column Letter
Set rng = ws.Range(ws.Cells(1, 1), ws.Cells(lastR, lastC))
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
'--- Sort values x
ws.Sort.SortFields.Clear
ws.Sort.SortFields.Add key:=Range(lastcol & "2:" & lastcol & lastR), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
With ws.Sort
.SetRange Range("A1:" & lastcol & lastR)
.Header = xlYes: .MatchCase = False
.Orientation = xlTopToBottom: .SortMethod = xlPinYin
.Apply
End With
'--- Delete range of values x
lastR = ws.Cells(ws.Rows.Count, lastcol).End(xlUp).Row
Set rng = ws.Range(ws.Cells(2, 1), ws.Cells(lastR, lastC))
rng.Rows.EntireRow.Delete
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub