My code does exactly what I want it to. However, being relatively new to VBA, I feel it could be a lot more efficient - namely I think I have overused loops and worksheet functions which are slowing it down. At the moment it takes around 3 minutes for ~15k rows of data.
Currently it's more of a combination of separate steps joined together so it doesn't flow nicely, rather for each steps it iterates through every row which, while it gets the job done, is frustratingly inefficient.
At the moment I am trying to remove the loops perhaps using Range objects instead, but I would really appreciate any pointers in the right direction.
Sub RunDataClean_Click()
With Sheets("Data")
'ensures code only loops through rows with data and not full worksheet
If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
endrow = .Cells.Find(What:="*", _
After:=.Range("A4"), _
Lookat:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row
Else
endrow = 4
End If
End With
Application.ScreenUpdating = False
Dim i As Long
'Checks another sheet to see if we have the cleaned customer name on file
For i = 5 To endrow
'does a vlookup in CDM file
Acc = Application.Cells(i, 5)
Cname = Application.Cells(i, 4)
Acname = Application.VLookup(Acc, Sheet3.Range("D2:F315104"), 3, False)
If IsError(Acname) Then
Cells(i, 32).Value = ""
Else
Cells(i, 32).Value = Acname
End If
Map = Application.VLookup(Acc, Sheet3.Range("C2:F315104"), 4, False)
If IsEmpty(Cells(i, 32)) Then
If IsError(Map) Then
Cells(i, 32).Value = ""
Else
Cells(i, 32).Value = Map
End If
End If
FXid = Application.VLookup(Acc, Sheet3.Range("B2:F315104"), 5, False)
If IsEmpty(Cells(i, 32)) Then
If IsError(FXid) Then
Cells(i, 32).Value = ""
Else
Cells(i, 32).Value = FXid
End If
End If
FXP = Application.VLookup(Cname, Sheet3.Range("A2:F315104"), 6, False)
If IsEmpty(Cells(i, 32)) Then
If IsError(FXP) Then
Cells(i, 32).Value = ""
Else
Cells(i, 32).Value = FXP
End If
End If
LkpName = Application.VLookup(Cname, Sheet3.Range("F2:F315104"), 1, False)
If IsEmpty(Cells(i, 32)) Then
If IsError(LkpName) Then
Cells(i, 32).Value = ""
Else
Cells(i, 32).Value = LkpName
End If
End If
If IsEmpty(Cells(i, 32)) Then
Cells(i, 32).Value = Cells(i, 4).Value
End If
Next i
For i = 5 To endrow
Cells(i, 28).Value = Cells(i, 3).Value & Cells(i, 5).Value
Length = Len(Cells(i, 28))
Cells(i, 29).Value = Length
Cells(i, 31).Value = Cells(i, 4).Value
'does a vlookup in CDM file (CDM)
Acc = Application.Cells(i, 28)
BP = Application.VLookup(Acc, Sheet3.Range("E2:G315104"), 3, False)
If IsError(BP) Then
Cells(i, 30).Value = ""
Else
Cells(i, 30).Value = BP
End If
'assigns B or P based on payment details (Business_Personal)
If Cells(i, 12).Value = "N" Then
Cells(i, 24).Value = "B"
ElseIf Cells(i, 30).Value = "Business" Then
Cells(i, 24).Value = "B"
ElseIf Cells(i, 30).Value = "Personal" Then
Cells(i, 24).Value = "P"
ElseIf Cells(i, 12).Value = "Y" Then
Cells(i, 24).Value = "P"
ElseIf InStr(1, Cells(i, 32), "LTD") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "LIMITED") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "MANAGE") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "BUSINESS") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "CONSULT") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "INTERNATIONAL") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "T/A") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "TECH") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "CLUB") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "OIL") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "SERVICE") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "SOLICITOR") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf InStr(1, Cells(i, 32), "CORP") <> 0 Then
Cells(i, 24).Value = "B"
ElseIf Left(Cells(i, 5).Value, 3) = "999" Then
Cells(i, 24).Value = "P"
End If
Next i
'Week_Of_Year
For i = 5 To endrow
WeekNo = Application.Cells(i, 1)
WeekNumba = Application.WeekNum(WeekNo)
Cells(i, 21).Value = WeekNumba
Next i
'Deal_Channel concatenation
For i = 5 To endrow
Cells(i, 22).Value = Cells(i, 6).Value & Cells(i, 13).Value & Cells(i, 17).Value
Next i
'Deal_Source_System
For i = 5 To endrow
DealSS = Application.Cells(i, 22)
Deal_Source = Application.VLookup(DealSS, Sheet4.Range("F2:H354"), 3, False)
If IsError(Deal_Source) Then
Cells(i, 23).Value = "#N/A"
Else
Cells(i, 23).Value = Deal_Source
End If
Next i
'Reporting_Quarter (only worked for type double)
'does a lookup in calendar tab to return reporting quarter - could move this to Access
For i = 5 To endrow
qdate = Cells(i, 1)
qlkp = Application.VLookup(CDbl(qdate), Sheet5.Range("A1:C500"), 3, False)
Cells(i, 26).Value = qlkp
Next i
'copies any #N/A deal channel to lookup tables and then sets deal source to map
lastrow = Sheet4.Cells(Rows.Count, "F").End(xlUp).Row + 1
With Sheet1.Range("W5:W" & endrow)
Set DS = .Find(What:="#N/A", LookIn:=xlValues)
If Not DS Is Nothing Then
firstAddress = DS.Address
Do
DS.Offset(, -1).Copy
Sheet3.Range("F" & lastrow).PasteSpecial xlPasteValues
DS.Value = "Map"
Set DS = .FindNext(DS)
lastrow = lastrow + 1
Loop While Not DS Is Nothing
End If
End With
Application.ScreenUpdating = True
End Sub
1 Answer 1
Code that's hard to read, is code that's hard to modify without breaking. Consistent indentation helps with that:
For i = 5 To endrow qdate = Cells(i, 1) qlkp = Application.VLookup(CDbl(qdate), Sheet5.Range("A1:C500"), 3, False) Cells(i, 26).Value = qlkp Next i
For i = 5 To endrow
qdate = Cells(i, 1)
qlkp = Application.VLookup(CDbl(qdate), Sheet5.Range("A1:C500"), 3, False)
Cells(i, 26).Value = qlkp
Next i
That's already better!
The first thing I would do would be to indent the entire project in a single click with Rubberduck, and then review the inspection results:
Rubberduck inspection results
Undeclared variables are a huge red flag: Option Explicit
isn't specified and VBA will happily compile any typos and carry on running the code in an erroneous logical state, by declaring the new identifier on-the-spot as an implicit Variant
. Using disemvoweled, abbreviated, and otherwise unpronounceable names makes it even easier for this to happen, and harder for the bugs it introduces to be found.
Since this code is in the code-behind of a UserForm
, there are a lot of implicit ActiveSheet
references, and this is making the code extremely frail, prone to blow up with error 1004, or to work off the wrong sheet (although, not Select
ing and Activate
ing any sheets and toggling ScreenUpdating
off does minimize the odds of that happening, albeit indirectly).
There's a Range.Find
call at the top of the procedure that assumes there is data on the Sheets("Data")
worksheet. In the event where that sheet would be empty, the chained .Row
member call would raise error 91.
Acc = Application.Cells(i, 5) Cname = Application.Cells(i, 4)
These instructions are invoking worksheet members off Application
: it's equivalent to ActiveSheet.Cells
, or simply Cells
. Just reading the code isn't sufficient to understand what sheet that is expected to be active, and thus all these unqualified Cells
calls are very ambiguous, at least for a reader that doesn't know what they're looking at.
Barring a few false positives, everything Rubberduck picks up is essentially a low-hanging fruit that should be addressed before deving into the more substantial stuff:
- Implicit
ActiveSheet
andActiveWorkbook
references, should be qualified with a specificWorksheet
orWorkbook
object, or explicitly referenceActiveSheet
/ActiveWorkbook
, to clarify the intent of the code. I believe the intent is not to work off whatever workbook/sheet is currently active though. - Avoid Systems Hungarian Notation prefixing. It's harmful, and brings no value.
- Don't make event handler procedures
Public
, implicitly or not. Event handlers arePrivate
by default, and they should remain that way: they are meant to be invoked by VBA, not user code. - Use string-typed functions where possible, e.g.
Left
takes and returns aVariant
, butLeft$
takes and returns an actualString
: that's a rather insignificant (to an extent) point performance-wise, but using explicit types should be preferred overVariant
(and the slight run-time overhead usingVariant
incurs).
Since a UserForm
is involved, I encourage you to read this answer and the article it links to (I wrote both). The crux being, the last thing you want is to have a form that manipulates a worksheet directly, inside some button's Click
handler. A first step towards a thorough refactoring would to turn the click handler into something like this:
Private Sub RunDataClean_Click()
Macros.RunDataClean
End Sub
...and then move the entire body of the procedure into a Public Sub RunDataClean
procedure in some Macros
module, but that's just a first step.
Performance-wise, it's hard to justify all that VBA code to do work that looks very much like it could be done using standard worksheet formulas.
But one thing strikes me:
For i = 5 To endrow
This line appears 6 times in the procedure, so the macro is iterating every single one of these 15K rows, ...6 times. Remove all but the first For i = 5 To endrow
and all but the last Next i
, and you will likely instantly slash 83% of the work being done.
-
\$\begingroup\$ Thanks so much Mathieu I really appreciate the help! I'll get working on tidying it up like you have suggested \$\endgroup\$edev– edev2019年04月18日 13:38:45 +00:00Commented Apr 18, 2019 at 13:38
Click
event handler for some button. What kind of module is it written in? Whether it's a worksheet module or a standard module will make a significant difference in how reliable this code is. \$\endgroup\$