1
\$\begingroup\$

The code is doing everything that I need it to. It's taking much too long though upwards of a couple minutes to complete the macro. Does anyone know how I code optimize this VBA code to run quicker? I'm new to VBA and I'm not quite sure how to proceed. This code will be running on roughly 35,000 lines of data.

Public Sub matchRow()
 Dim dumpSheet, referencesheet, outputSheet As Worksheet
 Dim startRow, outputRow, tempDumpRow, tempActiveRow, referenceRowCount, finishedreferenceIndex As Integer
 Dim finishedreference() As String
 Dim isExist As Boolean
 'Set sheets
 Set dumpSheet = Sheets("Dump")
 Set referencesheet = Sheets("Active Directory")
 Set outputSheet = Sheets("Output")
 'Set start row of each sheet for data
 startRow = 5
 outputRow = 5
 'Get row count from Active Depository sheet
 referenceRowCount = referencesheet.Range("B5:D5").End(xlDown).Row
 'Set index
 finishedreferenceIndex = 5
 'Re-define array
 ReDim finishedreference(5 To referenceRowCount - 1)
 'Set the start row
 tempDumpRow = startRow
 'Here I looped with OR state, you can modify it to AND start if you want
 Do While dumpSheet.Range("B" & tempDumpRow) <> "" Or dumpSheet.Range("C" & tempDumpRow) <> "" Or dumpSheet.Range("D" & tempDumpRow) <> ""
 'Reset exist flag
 isExist = False
 'loop all row in Active Depository sheet
 For tempActiveRow = 5 To referenceRowCount Step 1
 'If row is not finished for checking.
 If UBound(Filter(finishedreference, tempActiveRow)) < 0 Then
 'If all cell are equal
 If dumpSheet.Range("B" & tempDumpRow) = referencesheet.Range("B" & tempActiveRow) And _
 dumpSheet.Range("C" & tempDumpRow) = referencesheet.Range("C" & tempActiveRow) And _
 dumpSheet.Range("D" & tempDumpRow) = referencesheet.Range("D" & tempActiveRow) Then
 'Set true to exist flag
 isExist = True
 'Store finished row
 finishedreference(finishedreferenceIndex) = tempActiveRow
 finishedreferenceIndex = finishedreferenceIndex + 1
 'exit looping
 Exit For
 End If
 End If
 Next tempActiveRow
 'Show result
 outputSheet.Range("B" & outputRow) = dumpSheet.Range("B" & tempDumpRow)
 outputSheet.Range("C" & outputRow) = dumpSheet.Range("C" & tempDumpRow)
 outputSheet.Range("D" & outputRow) = dumpSheet.Range("D" & tempDumpRow)
 If isExist Then
 outputSheet.Range("E" & outputRow) = ""
 Else
 outputSheet.Range("E" & outputRow) = "Item found in ""Dump"" but not in ""Active Directory"""
 End If
 'increase output row
 outputRow = outputRow + 1
 'go next row
 tempDumpRow = tempDumpRow + 1
 Loop
 'loop all row in Active Depository sheet
 For tempActiveRow = 5 To referenceRowCount Step 1
 'If row is not finished for checking.
 If UBound(Filter(finishedreference, tempActiveRow)) < 0 Then
 'Show result
 outputSheet.Range("B" & outputRow) = referencesheet.Range("B" & tempActiveRow)
 outputSheet.Range("C" & outputRow) = referencesheet.Range("C" & tempActiveRow)
 outputSheet.Range("D" & outputRow) = referencesheet.Range("D" & tempActiveRow)
 outputSheet.Range("E" & outputRow) = "Item found in ""Active Directory"" but not in ""Dump"""
 'increase output row
 outputRow = outputRow + 1
 End If
 Next tempActiveRow
End Sub
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Nov 26, 2019 at 18:24
\$\endgroup\$
6
  • 1
    \$\begingroup\$ Could you add detail to explain what your code does? \$\endgroup\$ Commented Nov 26, 2019 at 19:23
  • 1
    \$\begingroup\$ My immediate suggestion is to look through previous Code Reviews for VBA and pick out some constant themes - apply those themes to your code and then resubmit for review. \$\endgroup\$ Commented Nov 26, 2019 at 20:22
  • 1
    \$\begingroup\$ Key themes to look for are Option Explicit, declaring variables on one line (hint: Dim dumpSheet, referencesheet, outputSheet As Worksheet does not do what you think it does), explicitly using default actions (e.g. Range(x).Value="" instead of Range(x) = "") and declaring variables near where you use them. \$\endgroup\$ Commented Nov 26, 2019 at 20:26
  • 1
    \$\begingroup\$ Reading the previous reviews you will see many examples of using arrays to improve performance. The reason I mention all this is because while there are many people here happy to review code and provide advice, we are not a free refactoring or rewriting service. Your main question "Someone suggested loading data as variant arrays as opposed to ranges. I'm stuck." implies that you are looking for specific help, not a review. There are also many examples of using arrays instead of ranges on StackOverflow. Remember to filter searches with '[vba]' \$\endgroup\$ Commented Nov 26, 2019 at 20:29
  • 1
    \$\begingroup\$ Using SQL is great for these type of set based comparisons. You can use ADO in Excel if you wish, although it plays nicer in an actual database. I'd recommend checking that out though! \$\endgroup\$ Commented Nov 26, 2019 at 21:55

1 Answer 1

1
\$\begingroup\$

This is the sample dataset I created. The OP's code suggests that the Active Directory tab has an extra row.

Sample Dataset

Always Use Worksheets CodeNames Whenever Possible

Referencing worksheets by their code names will avoid naming conflicts while working with multiple workbooks and changing the worksheet name will not break any code.

  • Sheets("Active Directory") -> wsActiveDirectory
  • Sheets("Dump")-> wsDump
  • Sheets("Output") -> wsOutput

Use Constants for Magic Numbers

Using constants for values that should only be set once will make your code easier to read and maintain. Constants will also throw an error if you try to change their values.

Before

startRow = 5
outputRow = 5

After

Const startRow As Long = 5, outputRow As Long = 5

Matching Lists

Dictionaries are optimised for fast lookups. Using a Scripting.Dictionary will match the values will easily make the code run 100 times faster.

The trick is to create a composite key for all fields. Note: make sure to use a delimiter.

1;Towney;Research and Development

Private Function getKey(ByVal rowIndex As Long, ByRef Target As Range) As String
 getKey = Target.Cells(rowIndex, 1) & ";" & Target.Cells(rowIndex, 2) & ";" & Target.Cells(rowIndex, 3)
End Function

SQL Solution

As Ryan Wildry stated "Using SQL is great for these type of set based comparisons." but this can be a little tricky. The way I did it is I pasted my datasets into an Access Database as tables and use the Query Designer to help me write the code.

SELECT Dump.ID, Dump.Name, Dump.Department, "Item found in ""Dump"" but not in ""Active Directory""" AS [Found]
FROM Dump
WHERE (((Exists (SELECT NULL
 FROM [Active Directory]
 WHERE ([Active Directory].ID = Dump.ID) AND ([Active Directory].Name = Dump.Name) AND ([Active Directory].Department = Dump.Department)
))=False));
UNION ALL SELECT [Active Directory].ID, [Active Directory].Name, [Active Directory].Department, "Item found in ""Active Directory"" but not in ""[Active Directory]""" AS [Found]
FROM [Active Directory]
WHERE (((Exists (SELECT NULL
 FROM [Dump]
 WHERE ([Active Directory].ID = Dump.ID) AND ([Active Directory].Name = Dump.Name) AND ([Active Directory].Department = Dump.Department)
))=False));

I then aliased the tables to make it easier to replace the table names with the Excel Table Definition.

 SELECT t1.Name, t1.ID, t1.Department, 'Item found in "Dump" but not in "Active Directory"' AS [Found]
 FROM [Dump$B4:E23] As t1
 WHERE (((Exists (SELECT NULL
 FROM [Active Directory] As t2
 WHERE (t2.ID = t1.ID) And (t2.Name = t1.Name) And (t2.Department = t1.Department)
 ))=False))
 UNION ALL
 SELECT t1.Name, t1.ID, t1.Department, 'Item found in "Active Directory" but not in "Dump"' AS [Found]
 FROM [Active Directory$B4:E20] As t1
 WHERE (((Exists (SELECT NULL
 FROM [Dump] As t2
 WHERE (t2.ID = t1.ID) And (t2.Name = t1.Name) And (t2.Department = t1.Department)
 ))=False))

Now that I had the SQL working, I replaced the messages and created a single Select statement that I could modify to handle both selecting record in Dump and not in Active Directory or selecting records in Active Directory that are not in Dump.

SELECT t1.ID, t1.Name, t1.Department, "Message" AS [Found]
FROM [Dump] As t1
WHERE (((Exists (SELECT NULL
 FROM [Active Directory] As t2
 WHERE (t2.ID = t1.ID) AND (t2.Name = t1.Name) AND (t2.Department = t1.Department)
))=False));

Sub FindUnmatchedRowsCopyFromRecordset()

Create a recordset and use Range.CopyFromRecordset to transfer the records.

Sample SQl:

SELECT t1.Name, t1.ID, t1.Department, 'Item found in "Dump" but not in "Active Directory"' AS [Found]
FROM [Dump$B4:E23] As t1
WHERE (((Exists (SELECT NULL
 FROM [Active Directory$B4:E20] As t2
 WHERE (t2.ID = t1.ID) And (t2.Name = t1.Name) And (t2.Department = t1.Department)
))=False))
UNION ALL
SELECT t1.Name, t1.ID, t1.Department, 'Item found in "Active Directory" but not in "Dump"' AS [Found]
FROM [Active Directory$B4:E20] As t1
WHERE (((Exists (SELECT NULL
 FROM [Dump$B4:E23] As t2
 WHERE (t2.ID = t1.ID) And (t2.Name = t1.Name) And (t2.Department = t1.Department)
))=False))

Sub FindUnmatchedRowsAppend()

This is a slightly more complicated technique that appends the records directly to the Output tab.

Sample SQl:

INSERT INTO [Output$B4:E4] SELECT t3.* FROM (SELECT t1.Name, t1.ID, t1.Department, 'Item found in "Dump" but not in "Active Directory"' AS [Found]
FROM [Dump$B4:E23] As t1
WHERE (((Exists (SELECT NULL
 FROM [Active Directory$B4:E20] As t2
 WHERE (t2.ID = t1.ID) And (t2.Name = t1.Name) And (t2.Department = t1.Department)
))=False))
UNION ALL
SELECT t1.Name, t1.ID, t1.Department, 'Item found in "Active Directory" but not in "Dump"' AS [Found]
FROM [Active Directory$B4:E20] As t1
WHERE (((Exists (SELECT NULL
 FROM [Dump$B4:E23] As t2
 WHERE (t2.ID = t1.ID) And (t2.Name = t1.Name) And (t2.Department = t1.Department)
))=False))) as t3

Code

Option Explicit
Sub FindUnmatchedRowsAppend()
 Dim conn As Object
 Set conn = getThisWorkbookConnection
 conn.Open
 DeleteOutputResults
 Dim OutputDef As String
 OutputDef = getTableDefinition(wsOutput)
 Dim SQL As String
 SQL = "INSERT INTO " & OutputDef & " SELECT t3.* FROM (" & getOutputResultQuery & ") as t3"
 conn.Execute SQL
 conn.Close
End Sub
Public Sub FindUnmatchedRowsCopyFromRecordset()
 Dim conn As Object
 Set conn = getThisWorkbookConnection
 conn.Open
 Dim SQL As String
 SQL = getOutputResultQuery
 Dim rs As Object
 Set rs = conn.Execute(SQL)
 DeleteOutputResults
 wsOutput.Range("B5").CopyFromRecordset rs
 conn.Close
End Sub
Private Function getOutputResultQuery() As String
 Dim ActiveDirectoryDef As String
 ActiveDirectoryDef = getTableDefinition(wsActiveDirectory)
 Dim DumpDef As String
 DumpDef = getTableDefinition(wsDump)
 Const BaseSQl As String = "SELECT t1.Name, t1.ID, t1.Department, '@Message' AS [Found]" & vbNewLine & _
 "FROM [xlTable1] As t1" & vbNewLine & _
 "WHERE (((Exists (SELECT NULL" & vbNewLine & _
 " FROM [xlTable2] As t2" & vbNewLine & _
 " WHERE (t2.ID = t1.ID) And (t2.Name = t1.Name) And (t2.Department = t1.Department)" & vbNewLine & _
 "))=False))"
 Dim SelectDump As String
 SelectDump = Replace(BaseSQl, "[xlTable1]", DumpDef)
 SelectDump = Replace(SelectDump, "[xlTable2]", ActiveDirectoryDef)
 SelectDump = Replace(SelectDump, "@Message", "Item found in ""Dump"" but not in ""Active Directory""")
 Dim SelectAD As String
 SelectAD = Replace(BaseSQl, "[xlTable1]", ActiveDirectoryDef)
 SelectAD = Replace(SelectAD, "[xlTable2]", DumpDef)
 SelectAD = Replace(SelectAD, "@Message", "Item found in ""Active Directory"" but not in ""Dump""")
 Dim SQL As String
 SQL = SelectDump & vbNewLine & "UNION ALL" & vbNewLine & SelectAD
 getOutputResultQuery = SQL
End Function
Private Sub DeleteOutputResults()
 Dim Target As Range
 With wsOutput
 Set Target = .Range("B4:E4", .Cells(.Rows.Count, "B").End(xlUp))
 Target.Offset(1).ClearContents
 End With
End Sub
Private Function getTableDefinition(ws As Worksheet) As String
 Dim Target As Range
 Select Case ws.Name
 Case wsActiveDirectory.Name, wsDump.Name
 With ws
 Set Target = .Range("B4:E4", .Cells(.Rows.Count, "B").End(xlUp))
 End With
 If ws Is wsActiveDirectory Then
 Rem Remove Summary Row
 Set Target = Target.Resize(Target.Rows.Count - 1)
 End If
 Case wsOutput.Name
 With ws
 Set Target = .Range("B4:E4", .Cells(.Rows.Count, "B").End(xlUp))
 End With
 End Select
 getTableDefinition = getTableDefinitionFromRange(Target)
End Function
Private Function getThisWorkbookConnection() As Object
 Dim conn As Object
 Set conn = CreateObject("ADODB.Connection")
 With conn
 .Provider = "Microsoft.ACE.OLEDB.12.0"
 .ConnectionString = "Data Source=" & ThisWorkbook.FullName & ";" & _
 "Extended Properties=""Excel 12.0 Xml;HDR=YES"";"
 End With
 Set getThisWorkbookConnection = conn
End Function
Private Function getTableDefinitionFromRange(Target As Range) As String
 Dim SheetName As String
 SheetName = Target.Parent.Name
 Dim Address As String
 Address = Target.Address(RowAbsolute:=False, ColumnAbsolute:=False)
 getTableDefinitionFromRange = "[" & SheetName & "$" & Address & "]"
End Function

Download Link

ADDump.xlsm

answered Nov 27, 2019 at 22:00
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.