5
\$\begingroup\$

This code is a result of a lot of help from this community. What does it do? It compares two excel sheets cell by cell and copies the differences into another excel file.

I’m trying to modify the Code so it does the following:

It goes to "Name" and then searches for that name is the other file. If it doesn’t find the file it just copies that entire row including all the columns to the report file (file that is created) In case it does find the name, then it compares all the columns of that row and if something different it shows the difference in report with color red.

The pictures below show what I'm trying to achieve with this code

Sheet1 enter image description here

Sheet2

enter image description here

Report enter image description here

Sub Compare2WorkSheets(ws1 As Worksheet, ws2 As Worksheet)
 Dim ws1row As Long, ws2row As Long, ws1col As Integer, ws2col As Integer
 Dim maxrow As Long, maxcol As Integer, colval1 As String, colval2 As String
 Dim Report As Workbook, difference As Long
 Dim row As Long, col As Integer
 Dim Arr1 As Variant, Arr2 As Variant, Arr3 As Variant, Rng As Range
 Dim tm As Double
 tm = Timer
 'Application.ScreenUpdating = False
 'Application.Calculation = xlCalculationManual
 'Application.EnableEvents = False
 With ws1.UsedRange
 ws1row = .Rows.Count
 ws1col = .Columns.Count
 End With
 With ws2.UsedRange
 ws2row = .Rows.Count
 ws2col = .Columns.Count
 End With
 maxrow = ws1row
 maxcol = ws1col
 If maxrow < ws2row Then maxrow = ws2row
 If maxcol < ws2col Then maxcol = ws2col
 Debug.Print maxrow, maxcol
 Arr1 = ws1.Range(ws1.Cells(1, 1), ws1.Cells(maxrow, maxcol)).Formula
 Arr2 = ws2.Range(ws2.Cells(1, 1), ws2.Cells(maxrow, maxcol)).Formula
 ReDim Arr3(1 To UBound(Arr1, 1), 1 To UBound(Arr1, 2))
 difference = 0
 For col = 1 To maxcol
 For row = 1 To maxrow
 If Arr1(row, col) <> Arr2(row, col) Then
 difference = difference + 1
 Arr3(row, col) = Arr1(row, col) & "<> " & Arr2(row, col)
 Else
 Arr3(row, col) = ""
 End If
 Next row
 Next col
 Debug.Print " Calc secs " & Timer - tm
 If difference > 0 Then
 Set Report = Workbooks.Add
 With Report.ActiveSheet
 .Range("A1").Resize(UBound(Arr3, 1), UBound(Arr3, 2)).Value = Arr3
 .Columns("A:B").ColumnWidth = 25
 Set Rng = .Range(Report.ActiveSheet.Cells(1, 1), Report.ActiveSheet.Cells(UBound(Arr3, 1), UBound(Arr3, 2)))
 End With
 With Rng
 .FormatConditions.Add Type:=xlCellValue, Operator:=xlGreater, Formula1:="=""""" '""""""""
 .FormatConditions(Selection.FormatConditions.Count).SetFirstPriority
 With .FormatConditions(1)
 .Interior.Color = 255
 .Font.Bold = True
 .Font.ColorIndex = 2
 End With
 End With
 Debug.Print "Report Generated secs " & Timer - tm
 End If
 'Set Report = Nothing
 'Application.ScreenUpdating = True
 'Application.Calculation = xlCalculationAutomatic
 'Application.EnableEvents = True
 MsgBox difference & " cells contain different data! ", vbInformation, "Comparing Two Worksheets"
End Sub

How do I run this code?

I add the code into a module and then create a button and do the following.

The first picture shows how I call the sub to compare the two sheets

enter image description here

The second pictures shows how I open up another file and compare two Excel files.

enter image description here

asked Aug 17, 2019 at 21:19
\$\endgroup\$
8
  • \$\begingroup\$ It is possible if the names are unique only. if the rows in question are as high as in your last question then it seems that is unlikely. Please comment back. \$\endgroup\$ Commented Aug 18, 2019 at 23:48
  • \$\begingroup\$ @Ahmed AU I want to compare by name( column H). Find the sheet which has more rows and start from the first row (Lenovo700), then loop through column H of sheet2 and find which row has Lenovo700 and then compare the first row Sheet1 with the row where we located the same NAME in Sheet2 \$\endgroup\$ Commented Aug 19, 2019 at 6:15
  • \$\begingroup\$ @ Ahmed AU as far I understand we should only modify the loops but can't seem to get it right \$\endgroup\$ Commented Aug 19, 2019 at 7:25
  • 1
    \$\begingroup\$ Your logic seems clear that you want to find a Name and compare rows, but @AhmedAU question is still important: what happens if you the same name appearing many times in both sheets? For example, "Lenovo700" could appear 12 times on Sheet1 and 23 times on Sheet2. How can you decide if any row is unique if a name is repeated like that? One possibility is that a combination of Name and Country may create a unique entry. Your other post indicated that you could have over 10,000 rows, so it seems unlikely that there will be 10,000 unique names (though it's possible, if that is your situation). \$\endgroup\$ Commented Aug 19, 2019 at 13:43
  • \$\begingroup\$ @PeterT Thanks for explaining my point better than me. \$\endgroup\$ Commented Aug 19, 2019 at 13:47

1 Answer 1

3
\$\begingroup\$

This answer assumes that all Names in the data are unique. There is no provision in this example to handle duplicate Names except to issue a note in the debug output.

This answer will involve Dictionaries. Please review this website for complete information on how and why they are an efficient way to store unique data. The short answer is that you can create a large dictionary by looking at a unique "key", which is a string that uniquely represents some data that you want to track. In your case, you've asserted that all of the Names are unique. Dictionaries exist for speedy access to any single entry -- no looping through 200k entries to find the one you want. Use your unique key string and you have near-instant access to the data associated with that key.

For your situation, my example builds two Dictionaries, one for each set of data. The keys are the Names. The values associated with each key (Name) is the row number on which each Name is used. The row numbers will play a major role later on.

As with the other Code Review for your previous version of code, I'll reiterate:

  1. Identify your function parameters ByRef or ByVal
  2. Declare your variables as close to their first use as possible.

As an example:

Dim arr1 As Variant
Dim arr2 As Variant
arr1 = BuildDataArrays(ws1, startRow:=2)
arr2 = BuildDataArrays(ws2, startRow:=2)

You'll see that there is a call to a BuildDataArrays function. That brings me to

  1. Functional Isolation. When your routine starts getting very long, that is the perfect time to begin breaking parts of the logic out into separate functions/subs. This is especially useful if you have repetitive logic where only the variable is different. This is the case for BuildDataArrays.

Give this function a worksheet and optionally the starting row or column and it determines the range of available data, returning a memory-based array. Breaking out code into separate routines is very helpful because it makes your main logic easier to follow.

Private Function BuildDataArrays(ByRef ws As Worksheet, _
 Optional ByVal startRow As Long = 1, _
 Optional ByVal startCol As Long = 1) As Variant
 '--- assumes all rows and columns are valid data
 Dim lastRow As Long
 Dim lastCol As Long
 Dim dataArea As Range
 Dim data As Variant
 With ws
 lastRow = .Cells(.Rows.Count, startRow).End(xlUp).row
 lastCol = .Cells(startCol, .Columns.Count).End(xlToLeft).Column
 Set dataArea = .Cells(startRow, startCol).Resize(lastRow - startRow + 1, _
 lastCol - startCol + 1)
 data = dataArea
 End With
 BuildDataArrays = data
End Function
  1. Use Dictionaries to collect your data. As with the previous point, this is a perfect opportunity to isolate the logic in a separate function.

The BuildDataDictionary function will accept your memory-based array and use the selected column of data as a unique key (currently defaulted to column "I").

Private Function BuildDataDictionary(ByRef data As Variant, _
 Optional ByVal keyColumn As Long = 8) As Dictionary
 Dim row As Long
 Dim name As String
 Dim names As Dictionary
 Set names = New Dictionary
 For row = LBound(data, 1) To UBound(data, 1)
 name = Trim$(data(row, keyColumn))
 If Len(name) > 0 Then
 If Not names.Exists(name) Then
 '--- add the new name to the dictionary and save the row number
 names.Add name, row
 Else
 '--- if you get here, it means that the Name is NOT unique
 ' and you'll have to change your logic, or change the name
 Debug.Print "ERROR: Duplicate name detected on " & _
 " on row " & row & ": '" & name & "'"
 End If
 End If
 Next row
 Set BuildDataDictionary = names
End Function

Next we'll build the resulting report data. According to your description, the report will consist of all data rows (each with a unique Name), with any differences noted in the data itself. In your original post, you are assuming that the larger of the two row counts for the sheets will be your output array. This isn't true.

Consider that, by definition, all of the data from Sheet1 is unique (because each row's Name is unique). That means if you have 10 rows of data on Sheet1, your output data will have at least ten rows. It's possible that your data on Sheet2 also has ten rows of data, and only one of those rows repeats a Name on Sheet1. So your resulting report of data will have 19 rows.

Dim totalRows As Long
totalRows = ws1Names.Count
'--- now add on the number of unique rows from the other sheet
Dim name As Variant
For Each name In ws2Names
 If Not ws1Names.Exists(name) Then
 '--- name is unique
 totalRows = totalRows + 1
 Else
 '--- name is not unique
 End If
Next name
Debug.Print "There are " & totalRows & " unique Names between the sheets"
'--- now build a correctly sized output array
' ASSUMES both arrays have the same number of columns!!
Dim reportData As Variant
ReDim reportData(1 To totalRows, 1 To UBound(arr1, 2))

Now that we're about to generate the report data, we have to consider how to make note of any errors encountered. For this I'm using a Collection, which is a simple way to generate a running list of items. In this case, for each difference I'm adding a string that notes the row and column of each difference in the data arrays. I can use this later on to highlight the difference cells.

'--- and create an object to list which cells are different
Dim diffCells As Collection
Set diffCells = New Collection

After that, we simply move the data over to the report array, making note of any differences.

'--- we know that all Names are unique in sheet1, so move the all that
' data from sheet1 into the report array
Dim row As Long
Dim col As Long
Dim ws1row As Long
Dim ws2row As Long
row = 1
For Each name In ws1Names
 If ws2Names.Exists(name) Then
 '--- this row will have a difference because the Names match!
 ' so get the rows for each sheet that match the name
 ws1row = ws1Names(name)
 ws2row = ws2Names(name)
 For col = 1 To UBound(reportData, 2)
 If arr1(ws1row, col) = arr2(ws2row, col) Then
 reportData(row, col) = arr1(ws1row, col)
 Else
 '--- note the different values in the cell and add the
 ' row and column to the difference list
 reportData(row, col) = arr1(ws1row, col) & " <> " & _
 arr2(ws2row, col)
 diffCells.Add CLng(row) & "," & CLng(col)
 End If
 Next col
 Else
 '--- this is a unique row, so a straight copy of all columns
 For col = 1 To UBound(reportData, 2)
 reportData(row, col) = arr1(row, col)
 Next col
 End If
 row = row + 1
Next name
'--- the remaining data are the unique rows that exist in sheet2
' the "row" variable count is continued in this loop
For Each name In ws2Names
 If Not ws1Names.Exists(name) Then
 '--- this is a unique row, so a straight copy of all columns
 ws2row = ws2Names(name)
 For col = 1 To UBound(reportData, 2)
 reportData(row, col) = arr2(ws2row, col)
 Next col
 row = row + 1
 End If
Next name

The final step is to output the report data. In my example, I am not creating a new workbook, but only creating a new worksheet. You can un-comment some code lines there to change it back for your purposes.

If diffCells.Count > 0 Then
 Dim report As Workbook
 Dim reportWS As Worksheet
 'Set report = Workbooks.Add 'un-comment to report to a new workbook
 'Set reportWS = report.ActiveSheet 'un-comment to report to a new workbook
 Set reportWS = ThisWorkbook.Sheets.Add 'comment to report to a new workbook
 '--- copy the resulting report to the worksheet
 Dim reportArea As Range
 Set reportArea = reportWS.Range("A1").Resize(UBound(reportData, 1), UBound(reportData, 2))
 With reportArea
 .Value = reportData
 .Columns("A:B").ColumnWidth = 25
 '--- now highlight the cells that are different
 Dim rowcol As Variant
 Dim parts() As String
 For Each rowcol In diffCells
 parts = Split(rowcol, ",")
 With .Cells(CLng(parts(0)), CLng(parts(1)))
 .Font.Bold = True
 .Font.ColorIndex = 3
 End With
 Next rowcol
 End With
 Debug.Print "Report Generated secs " & Timer - tm
End If

Here is the whole code module in one block:

Option Explicit
Sub test1()
 Compare2WorkSheets Sheet1, Sheet2
End Sub
Sub Compare2WorkSheets(ByRef ws1 As Worksheet, ByRef ws2 As Worksheet)
 Dim tm As Double
 tm = Timer
 'Application.ScreenUpdating = False
 'Application.Calculation = xlCalculationManual
 'Application.EnableEvents = False
 '--- establish the data in the arrays, skip the header row
 Dim arr1 As Variant
 Dim arr2 As Variant
 arr1 = BuildDataArrays(ws1, startRow:=2)
 arr2 = BuildDataArrays(ws2, startRow:=2)
 '--- buidl a dictionary of Names for each worksheet
 Dim ws1Names As Dictionary
 Dim ws2Names As Dictionary
 Set ws1Names = BuildDataDictionary(arr1)
 Set ws2Names = BuildDataDictionary(arr2)
 '--- we don't know how many rows the report will be, so compare
 ' names between the two sheets to find out. it's basically
 ' the sum of the number of unique names between the sheets
 Dim totalRows As Long
 totalRows = ws1Names.Count
 '--- now add on the number of unique rows from the other sheet
 Dim name As Variant
 For Each name In ws2Names
 If Not ws1Names.Exists(name) Then
 '--- name is unique
 totalRows = totalRows + 1
 Else
 '--- name is not unique
 End If
 Next name
 Debug.Print "There are " & totalRows & " unique Names between the sheets"
 '--- now build a correctly sized output array
 ' ASSUMES both arrays have the same number of columns!!
 Dim reportData As Variant
 ReDim reportData(1 To totalRows, 1 To UBound(arr1, 2))
 '--- and create an object to list which cells are different
 Dim diffCells As Collection
 Set diffCells = New Collection
 '--- we know that all Names are unique in sheet1, so move the all that
 ' data from sheet1 into the report array
 Dim row As Long
 Dim col As Long
 Dim ws1row As Long
 Dim ws2row As Long
 row = 1
 For Each name In ws1Names
 If ws2Names.Exists(name) Then
 '--- this row will have a difference because the Names match!
 ' so get the rows for each sheet that match the name
 ws1row = ws1Names(name)
 ws2row = ws2Names(name)
 For col = 1 To UBound(reportData, 2)
 If arr1(ws1row, col) = arr2(ws2row, col) Then
 reportData(row, col) = arr1(ws1row, col)
 Else
 '--- note the different values in the cell and add the
 ' row and column to the difference list
 reportData(row, col) = arr1(ws1row, col) & " <> " & _
 arr2(ws2row, col)
 diffCells.Add CLng(row) & "," & CLng(col)
 End If
 Next col
 Else
 '--- this is a unique row, so a straight copy of all columns
 For col = 1 To UBound(reportData, 2)
 reportData(row, col) = arr1(row, col)
 Next col
 End If
 row = row + 1
 Next name
 '--- the remaining data are the unique rows that exist in sheet2
 ' the "row" variable count is continued in this loop
 For Each name In ws2Names
 If Not ws1Names.Exists(name) Then
 '--- this is a unique row, so a straight copy of all columns
 ws2row = ws2Names(name)
 For col = 1 To UBound(reportData, 2)
 reportData(row, col) = arr2(ws2row, col)
 Next col
 row = row + 1
 End If
 Next name
 Debug.Print " Calc secs " & Timer - tm
 If diffCells.Count > 0 Then
 Dim report As Workbook
 Dim reportWS As Worksheet
 'Set report = Workbooks.Add 'un-comment to report to a new workbook
 'Set reportWS = report.ActiveSheet 'un-comment to report to a new workbook
 Set reportWS = ThisWorkbook.Sheets.Add 'comment to report to a new workbook
 '--- copy the resulting report to the worksheet
 Dim reportArea As Range
 Set reportArea = reportWS.Range("A1").Resize(UBound(reportData, 1), UBound(reportData, 2))
 With reportArea
 .Value = reportData
 .Columns("A:B").ColumnWidth = 25
 '--- now highlight the cells that are different
 Dim rowcol As Variant
 Dim parts() As String
 For Each rowcol In diffCells
 parts = Split(rowcol, ",")
 With .Cells(CLng(parts(0)), CLng(parts(1)))
 .Font.Bold = True
 .Font.ColorIndex = 3
 End With
 Next rowcol
 End With
 Debug.Print "Report Generated secs " & Timer - tm
 End If
 'Application.ScreenUpdating = True
 'Application.Calculation = xlCalculationAutomatic
 'Application.EnableEvents = True
 If diffCells.Count > 0 Then
 Debug.Print diffCells.Count & " cells contain different data!"
 Else
 Debug.Print "No differences found between the sheets."
 End If
End Sub
Private Function BuildDataArrays(ByRef ws As Worksheet, _
 Optional ByVal startRow As Long = 1, _
 Optional ByVal startCol As Long = 1) As Variant
 '--- assumes all rows and columns are valid data
 Dim lastRow As Long
 Dim lastCol As Long
 Dim dataArea As Range
 Dim data As Variant
 With ws
 lastRow = .Cells(.Rows.Count, startRow).End(xlUp).row
 lastCol = .Cells(startCol, .Columns.Count).End(xlToLeft).Column
 Set dataArea = .Cells(startRow, startCol).Resize(lastRow - startRow + 1, _
 lastCol - startCol + 1)
 data = dataArea
 End With
 BuildDataArrays = data
End Function
Private Function BuildDataDictionary(ByRef data As Variant, _
 Optional ByVal keyColumn As Long = 8) As Dictionary
 Dim row As Long
 Dim name As String
 Dim names As Dictionary
 Set names = New Dictionary
 For row = LBound(data, 1) To UBound(data, 1)
 name = Trim$(data(row, keyColumn))
 If Len(name) > 0 Then
 If Not names.Exists(name) Then
 '--- add the new name to the dictionary and save the row number
 names.Add name, row
 Else
 '--- if you get here, it means that the Name is NOT unique
 ' and you'll have to change your logic, or change the name
 Debug.Print "ERROR: Duplicate name detected on " & _
 " on row " & row & ": '" & name & "'"
 End If
 End If
 Next row
 Set BuildDataDictionary = names
End Function

EDIT: added an example on how to call the routine from a button click

It seems that you're adding an ActiveX command button to your worksheet. In this case, the CommandButton1_Click() method will be executed in the Sheet1 module. Take the code above with the Compare2WorkSheets routine and paste it into a regular code module. Then, in your sheet1 module, fix up your button-click code like this:

Option Explicit
Private Sub CommandButton1_Click()
 Dim myWorkbook1 As Workbook
 Dim myWorkbook2 As Workbook
 '--- if Sheet1 is contained in the workbook where the code is running, use this
 Set myWorkbook1 = ThisWorkbook
 '--- if Sheet1 is in a different -- but already open -- workbook, use this
 Set myWorkbook1 = Workbooks("the-already-open-workbook-filename.xlsx")
 '--- if Sheet1 is in a different -- but unopened -- workbook, use this
 Set myWorkbook1 = Workbooks.Open("the-workbook-filename-to-open.xlsx")
 '--- you can make the same decisions for setting myWorkbook2
 Set myWorkbook2 = Workbooks.Open("C:\Temp\testreport1.xlsx")
 Compare2WorkSheets myWorkbook1.Worksheets("Sheet1"), myWorkbook2.Worksheets("Sheet1")
 myWorkbook1.Close
 myWorkbook2.Close
End Sub
answered Aug 19, 2019 at 21:25
\$\endgroup\$
16
  • \$\begingroup\$ thank you for this incredible answer . I read the description and learned a lot. Until now I wasn't able to make it run. I usually just use a button and call the sub. I added the part on how I run the code in question above. Until now I don't know how to do that with your code \$\endgroup\$ Commented Aug 20, 2019 at 17:57
  • \$\begingroup\$ @MiriamList - there are a couple issues: 1) you're trying to execute your VBA code from a non-macro workbook named File1.xlsx', if this is a temporary workbook for testing, that's okay but you can't save your macro code in it, and 2) in your parameter Workbooks("testcompare1.xlsm").Worksheets("Sheet1"), this assumes that the workbook named "testcompare1.xlsm" is already open in Excel. The call will fail if its not open. To fix this, create another variable (such as myWorkbook2) and set it using Workbooks.Open, then use that as a parameter. It should work then. \$\endgroup\$ Commented Aug 20, 2019 at 19:42
  • \$\begingroup\$ yes the file was just and example to show you. I just wanted to ask for some help on how to run your code from above or how to add the sub and the functions from your code to the button. \$\endgroup\$ Commented Aug 20, 2019 at 19:56
  • \$\begingroup\$ @MiriamList - please see if the added code answers your question \$\endgroup\$ Commented Aug 20, 2019 at 20:10
  • \$\begingroup\$ I followed your code. so I added a module and then put the active button on the first sheet and I wanted to compare sheet1 with sheet2 of the same excel file. When I run the code I'm getting Error while compiling custom type not defined at Private Function BuildDataDictionary(ByRef data As Variant, _ Optional ByVal keyColumn As Long = 8) As Dictionary \$\endgroup\$ Commented Aug 21, 2019 at 7:54

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.