7
\$\begingroup\$

I have this Excel/VBA code and here is what it does:

  1. I have 3 sheets, where first sheet has two columns that will be used (A,C).
  2. 2nd Sheet is just the raw data file that will be imported from external source.
  3. 3rd sheet is the final output.
  4. Column A in the first sheet is "Raw Column Headings" where the column headers from raw data table are copied, transposed, and pasted. Third column in the first sheet is the final headers that I need the raw ones to be changed to. Then, I will run a macro which will copy the final headers into the first row of second sheet and then copy/paste the entire columns with the certain headers specified in the code to the final output.

The problem is that this process takes about 20~30 seconds per file and I have so many files to be processed. Can anyone take a look at the code and tell me how it can be done faster?

Option Explicit
Private Function GetHeaders() As Collection
 Dim result As New Collection
 With result
 .Add "Account_ID"
 .Add "Claim_ID"
 .Add "Account_Name"
 .Add "Claim_Type"
 .Add "Coverage"
 .Add "Claim_Level"
 .Add "Claim_Count"
 .Add "File_Date"
 .Add "File_Year"
 .Add "Resolution_Date"
 .Add "Resolution_Year"
 .Add "Claim_Status"
 .Add "Indemnity_Paid"
 .Add "Disease_Category"
 .Add "State_Filed"
 .Add "First_Exposure_Date"
 .Add "Last_Exposure_Date"
 .Add "Claimant_Employee"
 .Add "Claimant_DOB"
 .Add "Claimant_Deceased"
 .Add "Claimant_Name"
 .Add "Claimant_DOD"
 .Add "Claimant_Diagnosis_Date"
 .Add "Product_Type"
 .Add "Product_Line"
 .Add "Company/Entity/PC"
 .Add "Plaintiff_Law_Firm"
 .Add "Asbestos_Type"
 .Add "Evaluation_Date"
 .Add "Tier"
 .Add "Data_Source"
 .Add "Data_Source_Category"
 .Add "Jurisdiction/County"
 .Add "Settlement_Demand"
 .Add "Jury_Verdict"
 .Add "Exposure_Site"
 .Add "National_Defendant_Firm"
 .Add "Local_Defendant_Firm"
 .Add "Expense_Amount"
 .Add "NCC_Expense_Amount"
 .Add "Non_NCC_Expense_Amount"
 End With
 Set GetHeaders = result
End Function
Private Function FindHeaderRange(ByVal ws As Worksheet, ByVal header As String) As Range
 Set FindHeaderRange = ws.Cells.Find(header, , , xlWhole)
End Function
Private Function BuildMessage(ByVal currentMessage As String, ByVal ws As Worksheet, ByVal header As String) As String
 BuildMessage = currentMessage & vbLf & header & Space(1) & ws.Name
End Function
Public Sub ProjectionTemplateFormat()
On Error GoTo ExitSub
 Sheets(1).Range("C2", Cells(Rows.Count, "C").End(xlUp)).Copy
 Sheets(2).Range("A1").PasteSpecial transpose:=True
 Range("A1").ClearOutline
 Dim headers As Collection
 Set headers = GetHeaders
 Dim msg As String
 Dim wsImport As Worksheet, wsMain As Worksheet
 Set wsImport = ThisWorkbook.Sheets(2)
 Set wsMain = ThisWorkbook.Sheets(3)
 Application.ScreenUpdating = False
 Dim header As Variant
 Dim source As Range
 Dim dest As Range
 For Each header In headers
 Set source = FindHeaderRange(wsImport, header)
 If source Is Nothing Then
 msg = BuildMessage(msg, wsImport, header)
 Else
 Set dest = FindHeaderRange(wsMain, header)
 If dest Is Nothing Then
 msg = BuildMessage(msg, wsMain, header)
 Else
 wsImport.Range(source.Offset(1), wsImport.Cells(Rows.Count, source.Column).End(xlUp)).Copy _
 wsMain.Cells(Rows.Count, dest.Column).End(xlUp)(2)
 End If
 End If
 Next
With wsMain
 .Columns("A:AO").AutoFit
 .Cells.ClearFormats
 .Rows(1).Font.Bold = True
 .Cells.Font.Name = "Georgia"
 .Cells.Font.Color = RGB(0, 0, 225)
 .Cells.Resize(wsMain.Rows.Count - 1).Offset(1).Interior.Color = RGB(216, 228, 188)
End With
'Apply Style
Dim ws As Worksheet
 For Each ws In Worksheets
 ws.Select
 ActiveWindow.Zoom = 85
 Next ws
Dim rng As Range
Set rng = wsMain.Cells
With rng.Borders
 .LineStyle = xlDot
 .Weight = xlThin
End With
Dim cell As Range
 For Each cell In Range("$A2ドル:" & Range("$A2ドル").SpecialCells(xlLastCell).Address)
 If Len(cell) > 0 Then cell = UCase(cell)
 Next cell
ExitSub:
 Application.ScreenUpdating = True
End Sub
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Jun 4, 2015 at 14:52
\$\endgroup\$
1
  • \$\begingroup\$ How big is the data set? \$\endgroup\$ Commented Jun 4, 2015 at 17:07

3 Answers 3

2
\$\begingroup\$

Previous answer is getting a bit long, so here is the rest of the code and the flow I'd suggest:

.

Option Explicit
Public Sub mainProcess()
 ProjectionTemplateFormat
End Sub
Private Function GetHeaders() As Collection
 '...
End Function
Private Sub ProjectionTemplateFormat()
 GetHeaders
 '...
 applyFormat ThisWorkbook.Worksheets(1)
End Sub
Private Sub applyFormat(ByRef sh As Worksheet)
 With sh.UsedRange
 .Columns.AutoFit
 .ClearFormats
 With .Font
 .Name = "Georgia"
 .Color = RGB(0, 0, 225)
 End With
 With .Borders
 .LineStyle = xlDot
 .Weight = xlThin
 End With
 .Interior.Color = RGB(216, 228, 188)
 With .Rows(1)
 .Font.Bold = True
 .ColorIndex = xlAutomatic
 End With
 End With
 Dim ws As Worksheet
 For Each ws In Worksheets
 ws.Zoom = 85
 Next
 allUpper1 sh 'or allUpper2 sh
End Sub
Private Sub allUpper1(ByRef sh As Worksheet) 'FOR loop, with VBA UCase()
 Dim arr As Variant, i As Long, j As Long
 If WorksheetFunction.CountA(sh.UsedRange) > 0 Then
 arr = sh.UsedRange 'one interaction with the sheet
 For i = 2 To UBound(arr, 1) 'each "row"
 For j = 1 To UBound(arr, 2) 'each "col"
 arr(i, j) = UCase(arr(i, j))
 Next
 Next
 sh.UsedRange = arr 'second interaction with the sheet
 End If
End Sub
Private Sub allUpper2(ByRef sh As Worksheet) 'No loop, with Excel UPPER()
 Const FIRST_ROW As Long = 2
 Dim lRow As Long
 Dim lCol As Long
 Dim usedRng As Range
 Dim tempRng As Range
 If WorksheetFunction.CountA(sh.UsedRange) > 0 Then
 Set usedRng = sh.UsedRange
 With usedRng
 lRow = .Rows.Count
 lCol = .Columns.Count
 End With
 'remove header row from working range
 Set usedRng = usedRng.Offset(1, 0).Resize(lRow - 1, lCol)
 'offset cell: 2 collumns to the right of 1st cell in used range
 Set tempRng = sh.Cells(FIRST_ROW, lCol + 2)
 With tempRng
 'apply formula to offset cell: UPPER(A2)
 .Formula = "=Upper(" & sh.Cells(FIRST_ROW, 1).Address(0, 0) & ")"
 'fill down the column
 .AutoFill Destination:=sh.Range(tempRng, .Offset(lRow, 0)), Type:=xlFillDefault
 'fill right all rows
 sh.Range(tempRng, .Offset(lRow, 0)).AutoFill Destination:=sh.Range(tempRng, .Offset(lRow, lCol)), Type:=xlFillDefault
 With sh.Range(tempRng, .Offset(lRow, lCol))
 usedRng.Value2 = .Value2 'copy upper case values back
 .EntireColumn.Delete 'remove temp range
 End With
 End With
 End If
End Sub

.

As a performance reference - measurement between UCase(), and UPPER() methods:

FOR loop method: 6.61 seconds
Excel formula method: 15.29 seconds

With 100,000 rows, and 26 columns

Most text cells contain "Test Cell 1", "Test Cell 2", "Test Cell 3",..., and 11 cells with numbers, 5 in the first few rows and 6 in the last rows

answered Jun 9, 2015 at 17:32
\$\endgroup\$
2
  • 1
    \$\begingroup\$ This really would have been better in the other answer... :-/ Too late now: this one got accepted, and the other one has the votes. It's a mess that could have been avoided. Next time, please use just one answer if they are really about the same line of logic. \$\endgroup\$ Commented Sep 23, 2015 at 14:00
  • \$\begingroup\$ @janos - Noted and thanks (this was my first post on CR, and I won't make this mistake again) \$\endgroup\$ Commented Sep 23, 2015 at 15:01
8
\$\begingroup\$

Your code has tremendously improved since the first time I saw it - good job!


This particular line is hard to parse:

wsImport.Range(source.Offset(1), wsImport.Cells(Rows.Count, source.Column).End(xlUp)).Copy _
 wsMain.Cells(Rows.Count, dest.Column).End(xlUp)(2)

Literally: it's crashing the parser!

You could introduce a local variable here:

Dim target As Range
Set target = wsMain.Cells(Rows.Count, dest.Column).End(xlUp)
wsImport.Range(source.Offset(1), wsImport.Cells(Rows.Count, source.Column).End(xlUp)) _
 .Copy target(RowIndex:=2)

Notice how the line continuation is placed so that no instruction is split; doesn't care about line continuations, but it's much easier for the human eye to see what function calls return the arguments for which procedure if you don't split an instruction between the name of a procedure and its arguments - vertically lining up .Range and .Copy also make it clearer that .Copy operates on the result of .Range.


Dim wsImport As Worksheet, wsMain As Worksheet

Is this really buying you anything? Multiple declarations on a single line make it harder to locate declarations at a glance. Compare to:

Dim wsImport As Worksheet
Dim wsMain As Worksheet

My eye sees Dim, my brain sees "variable declaration here" - two Dims, two variables. And I read the variable name at pretty much the exact same millisecond as the one I notice the Dim statement, because I don't need to mentally scroll horizontally and locate the comma. Two variables isn't too bad, but more than that would be problematic. Better avoid multiple declarations on a single line.


There's a redundant reference to wsMain in this With block:

With wsMain
 .Columns("A:AO").AutoFit
 .Cells.ClearFormats
 .Rows(1).Font.Bold = True
 .Cells.Font.Name = "Georgia"
 .Cells.Font.Color = RGB(0, 0, 225)
 .Cells.Resize(wsMain.Rows.Count - 1).Offset(1).Interior.Color = RGB(216, 228, 188)
End With

See it? Right here:

.Cells.Resize(wsMain.Rows.Count - 1).Offset(1).Interior.Color = RGB(216, 228, 188)

Could be

.Cells.Resize(.Rows.Count - 1).Offset(1).Interior.Color = RGB(216, 228, 188)

The extraneous empty lines before End With should be removed, too.


Your indentation isn't consistent.

Dim ws As Worksheet
 For Each ws In Worksheets
 ws.Select
 ActiveWindow.Zoom = 85
 Next ws

The only thing that should be at the same indentation level as Public Sub/End Sub, is line labels (which the VBE forces to start at column 1 anyway).

answered Jun 4, 2015 at 16:05
\$\endgroup\$
4
\$\begingroup\$

@Mat's Mug made good suggestions (all), but in particular: redundant or implicit references:

Sheets(1).Range("C2", Cells(Rows.Count, "C")

Try to always be explicit as possible - the above line has one (semi) explicit reference Sheets(1) and 3 implicit ones:

  • Rows.Count implies "ActiveSheet.Cells.Rows.Count"
  • Cells(Rows.Count, "C") = ActiveSheet.Cells(Rows.Count, "C")
  • Sheets(1) should be ThisWorkbook.Worksheets(1)

All explicit:

Sheets(1).Range("C2", Sheets(1).Cells(Sheets(1).Cells.Rows.Count, "C")

Or

With Sheets(1)
 .Range("C2", .Cells(.Cells.Rows.Count, "C")
End With

As Mat pointed out: Sheets(1) is as also an implicit reference to ThisWorkbook.Worksheets(1)

So the complete reference is

With ThisWorkbook.Worksheets(1)
 .Range("C2", .Cells(.Cells.Rows.Count, "C")
End With

A few suggestions that might improve performance:

  1. In function FindHeaderRange(), replace the Find method with Match.

    From

    Set FindHeaderRange = ws.Cells.Find(header, , , xlWhole)
    ...
    Set source = FindHeaderRange(wsImport, header)
    If source Is Nothing Then...
    

    To

    FindHeaderRange = Application.Match(header, ws.UsedRange.Rows(1), 0)
    

    FindHeaderRange()'s return type will change from Range to Variant - Match returns a Double if it finds something and error if not, so checking the result changes to:

    If Not IsError(FindHeaderRange) Then ...
    

    (if ok, FindHeaderRange will be 3, 4, 99, etc because we're searching only Row 1)

    Match is significantly faster than Find.

  2. In function BuildMessage(): replace Space(1) with " ".

    Space(1) is a function call not worth calling for just one space.

  3. Collections (and dictionaries) are great in reducing the amount of code, and retrieving items, but are quite slow at loading data.

    When using the "headers" collection you're not reducing code too much, compared to "headers" as an array; arrays are very fast in both loading and retrieving; more complex code can get convoluted and hard to follow with multidimensional arrays, but in your case a 1-dimensional array would not change the structure: For Each element in Collection = For Each Element in array.

  4. Your code is very well organized, modularized, and easy to read - great work! However, there is a price to pay for over-modularizing.

    Function are meant to isolate blocks of code for easier maintenance and to make logical distinction between different functionalities, but calls to functions can get expensive, especially when they're very frequent and have a small number of lines; in-line code is much faster to execute than the extra memory navigation to the function called.

    Your 2 small (and neat) functions FindHeaderRange() and BuildMessage() can be executed directly, without negatively impacting readability.

Another important performance improvement:

The most time consuming line in your For loop:

wsImport.Range(source.Offset(1), wsImport.Cells(Rows.Count, source.Column).End(xlUp)).Copy _
 wsMain.Cells(Rows.Count, dest.Column).End(xlUp)(2)

This is meant to copy and paste the used range part of the current column.

Finding current column used range can be optimized (and explicit):

With wsImport
 Set fromCol = .Range( source.Offset(1), .Cells(.UsedRange.Rows.Count + 1, source.Column).End(xlUp))
End With
With wsMain
 Set toCol = .Cells(.UsedRange.Rows.Count + 1, dest.Column).End(xlUp)(2)
End With
fromCol.Copy toCol

Do you need to dynamically find the used range in wsMain in order to remove previous data?

If so, it would be faster to delete the contents of the entire column:

wsMain.UsedRange.Column(dest.Column).Value2 = vbNullString

Then, if all columns in wsImport are the same size you can get the last used row before the For loop (just once) and use that value inside the loop.

Edit: tested version of the code:

.

Option Explicit
Public Sub projectionTemplateFormat()
 Dim t1 As Double, t2 As Double
 xlSpeed True
 t1 = Timer
 mainProcess
 t2 = Timer
 xlSpeed False
 MsgBox "Duration: " & t2 - t1 & " seconds"
End Sub
Private Sub mainProcess()
 Const SPACE_DELIM As String = " "
 Dim wsIndex As Worksheet
 Dim wsImport As Worksheet 'Raw
 Dim wsFinal As Worksheet
 Dim importHeaderRng As Range
 Dim importColRng As Range
 Dim importHeaderFound As Variant
 Dim importLastRow As Long
 Dim finalHeaderRng As Range
 Dim finalColRng As Range
 Dim finalHeaderRow As Variant
 Dim finalHeaderFound As Variant
 Dim indexHeaderCol As Range
 Dim header As Variant 'Each item in the FOR loop
 Dim msg As String
 Set wsIndex = aIndex 'This is the Code Name; top-left pane: aIndex (Index)
 Set wsImport = bImport 'Direct reference to Code Name: bImport.Range("A1")
 Set wsFinal = cFinal 'Reference using Sheets collection: ThisWorkbook.Worksheets("Final")
 With wsImport.UsedRange
 Set importHeaderRng = .Rows(1) 'Import - Headers
 importLastRow = .Rows.Count 'Import - Total Rows
 End With
 With wsFinal.UsedRange
 finalHeaderRow = .Rows(1) 'Final - Headers (as Array)
 Set finalHeaderRng = .Rows(1) 'Final - Headers (as Range)
 End With
 With wsIndex.UsedRange 'Transpose col 3 from Index (without the header), as column names in Import
 Set indexHeaderCol = .Columns(3).Offset(1, 0).Resize(.Rows.Count - 1, 1)
 wsImport.Range(wsImport.Cells(1, 1), wsImport.Cells(1, .Rows.Count - 1)).Value2 = Application.Transpose(indexHeaderCol)
 End With
 If Len(bImport.Cells(2, 2).Value2) > 0 Then 'if Import sheet is not empty (excluding header row)
 With Application
 For Each header In finalHeaderRow 'Loop through all headers in Final
 If Len(Trim(header)) > 0 Then 'If the Final heade is not empty
 importHeaderFound = .Match(header, importHeaderRng, 0) 'Find header in Import sheet
 If IsError(importHeaderFound) Then
 msg = msg & vbLf & header & SPACE_DELIM & wsImport.Name 'Import doesn't have current header
 Else
 finalHeaderFound = .Match(header, finalHeaderRng, 0) 'Find header in Final sheet
 With wsImport
 Set importColRng = .UsedRange.Columns(importHeaderFound).Offset(1, 0).Resize(.UsedRange.Rows.Count - 1, 1)
 End With
 With wsFinal
 Set finalColRng = .Range(.Cells(2, finalHeaderFound), .Cells(importLastRow, finalHeaderFound))
 finalColRng.Value2 = vbNullString 'Delete previous values (entire column)
 End With
 finalColRng.Value2 = importColRng.Value2 'Copy Import data in Final columns
 End If
 End If
 Next header
 End With
 allUpper wsFinal
 'wsFinal.UsedRange.AutoFilter
 applyFormat wsFinal.Range(wsFinal.Cells(1, 1), wsFinal.Cells(importLastRow, wsFinal.UsedRange.Columns.Count))
 Dim ws As Worksheet
 For Each ws In Worksheets
 ws.Activate
 ActiveWindow.Zoom = 85
 ws.Cells(2, 2).Activate
 ActiveWindow.FreezePanes = True
 ws.Cells(1, 1).Activate
 Next
 Else
 MsgBox "Missing raw data (Sheet 2 - 'Import')", vbInformation, " Missing Raw Data"
 End If
End Sub
Private Sub applyFormat(ByRef rng As Range)
 With rng
 .ClearFormats
 With .Font
 .Name = "Georgia"
 .Color = RGB(0, 0, 225)
 End With
 .Interior.Color = RGB(216, 228, 188)
 With .Rows(1)
 .Font.Bold = True
 .Interior.ColorIndex = xlAutomatic
 End With
 With .Borders
 .LineStyle = xlDot 'xlContinuous
 .ColorIndex = xlAutomatic
 .Weight = xlThin
 End With
 End With
 refit rng
End Sub
Private Sub allUpper(ByRef sh As Worksheet)
 Dim arr As Variant, i As Long, j As Long
 If WorksheetFunction.CountA(sh.UsedRange) > 0 Then
 arr = sh.UsedRange
 For i = 2 To UBound(arr, 1) 'each "row"
 For j = 1 To UBound(arr, 2) 'each "col"
 arr(i, j) = UCase(RTrim(Replace(arr(i, j), Chr(10), vbNullString)))
 Next
 Next
 sh.UsedRange = arr
 End If
End Sub
Public Sub allImportTrim()
 Dim arr As Variant, i As Long, j As Long, sh As Worksheet
 Set sh = bImport
 If WorksheetFunction.CountA(sh.UsedRange) > 0 Then
 arr = sh.UsedRange
 For i = 2 To UBound(arr, 1) 'each "row"
 For j = 1 To UBound(arr, 2) 'each "col"
 arr(i, j) = RTrim(Replace(arr(i, j), Chr(10), vbNullString))
 Next
 Next
 sh.UsedRange = arr
 End If
 refit sh.UsedRange
End Sub
Private Sub refit(ByRef rng As Range)
 With rng
 .WrapText = False
 .HorizontalAlignment = xlGeneral
 .VerticalAlignment = xlCenter
 .Columns.EntireColumn.AutoFit
 .Rows.EntireRow.AutoFit
 End With
End Sub
answered Jun 6, 2015 at 14:11
\$\endgroup\$
16
  • \$\begingroup\$ Nice answer. Welcome to Code Review! We'll have to disagree about the call to Space though. I think it offers readability over the literal. \$\endgroup\$ Commented Jun 6, 2015 at 22:55
  • \$\begingroup\$ Welcome to CR! That is a beautiful answer, I hope you stick around! \$\endgroup\$ Commented Jun 6, 2015 at 23:03
  • \$\begingroup\$ Ahh. I just noticed that you used A constant. Nice choice! \$\endgroup\$ Commented Jun 6, 2015 at 23:29
  • \$\begingroup\$ @RubberDuck and Mat - thank you both (or all 3 - don't know who else upvoted) - I've already noticed your work and it's very impressive ! \$\endgroup\$ Commented Jun 7, 2015 at 0:12
  • 2
    \$\begingroup\$ I count Sheet1(1) as an implicit reference to ThisWorkbook.Sheets, which is probably intended to be ThisWorkbook.Worksheets. \$\endgroup\$ Commented Jun 7, 2015 at 20:42

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.