This loops through folder to open files and get important info from the columns of names "HOLDER" and "CUTTING TOOL" and printing all the info to one excel document, masterfile.
It utilizes a dictionary and collection. I am very new to VBA So I do not know if I have stayed consistent with my naming of sheets and files and using the code to make the program work as fast as possible.
Any tips on how to streamline the code more?
Option Explicit
Sub LoopThroughDirectory()
Const ROW_HEADER As Long = 10
Dim objFSO As Object
Dim objFolder As Object
Dim objFile As Object
Dim MyFolder As String
Dim StartSht As Worksheet, ws As Worksheet
Dim WB As Workbook
Dim i As Integer
Dim LastRow As Integer, erow As Integer
Dim Height As Integer
Dim RowLast As Long
Dim f As String
Dim dict As Object
Dim hc As Range, hc1 As Range, hc2 As Range, hc3 As Range, d As Range
Set StartSht = Workbooks("masterfile.xlsm").Sheets("Sheet1")
'turn screen updating off - makes program faster
Application.ScreenUpdating = False
'location of the folder in which the desired TDS files are
MyFolder = "C:\Users\trembos\Documents\TDS\progress\"
'find the headers on the sheet
Set hc1 = HeaderCell(StartSht.Range("B1"), "HOLDER")
Set hc2 = HeaderCell(StartSht.Range("C1"), "CUTTING TOOL")
'create an instance of the FileSystemObject
Set objFSO = CreateObject("Scripting.FileSystemObject")
'get the folder object
Set objFolder = objFSO.GetFolder(MyFolder)
i = 2
'loop through directory file and print names
'(1)
For Each objFile In objFolder.Files
If LCase(Right(objFile.Name, 3)) = "xls" Or LCase(Left(Right(objFile.Name, 4), 3)) = "xls" Then
'(2)
'Open folder and file name, do not update links
Set WB = Workbooks.Open(fileName:=MyFolder & objFile.Name, UpdateLinks:=0)
Set ws = WB.ActiveSheet
'(3)
'find CUTTING TOOL on the source sheet
Set hc = HeaderCell(ws.Cells(ROW_HEADER, 1), "CUTTING TOOL")
If Not hc Is Nothing Then
Set dict = GetValues(hc.Offset(1, 0))
If dict.count > 0 Then
Set d = StartSht.Cells(Rows.count, hc2.Column).End(xlUp).Offset(1, 0)
'add the values to the masterfile, column 3
d.Resize(dict.count, 1).Value = Application.Transpose(dict.items)
End If
Else
'header not found on source worksheet
End If
'(4)
'find HOLDER on the source sheet
Set hc3 = HeaderCell(ws.Cells(ROW_HEADER, 1), "HOLDER")
If Not hc3 Is Nothing Then
Set dict = GetValues(hc3.Offset(1, 0))
If dict.count > 0 Then
Set d = StartSht.Cells(Rows.count, hc1.Column).End(xlUp).Offset(1, 0)
'add the values to the master list, column 2
d.Resize(dict.count, 1).Value = Application.Transpose(dict.items)
End If
Else
'header not found on source worksheet
End If
'(5)
With WB
'print TDS information
For Each ws In .Worksheets
'print the file name to Column 1
StartSht.Cells(i, 1) = objFile.Name
'print TDS name from J1 cell to Column 4
With ws
.Range("J1").Copy StartSht.Cells(i, 4)
End With
i = GetLastRowInSheet(StartSht) + 1
'move to next file
Next ws
'(6)
'close, do not save any changes to the opened files
.Close SaveChanges:=False
End With
End If
'move to next file
Next objFile
'turn screen updating back on
Application.ScreenUpdating = True
ActiveWindow.ScrollRow = 1
'(7)
End Sub
'(8)
'get all unique column values starting at cell c
Function GetValues(ch As Range) As Object
Dim dict As Object, rng As Range, c As Range, v
Set dict = CreateObject("scripting.dictionary")
For Each c In ch.Parent.Range(ch, ch.Parent.Cells(Rows.count, ch.Column).End(xlUp)).Cells
v = Trim(c.Value)
If Len(v) > 0 And Not dict.exists(v) Then
dict.Add c.Address, v
End If
Next c
Set GetValues = dict
End Function
'(9)
'find a header on a row: returns Nothing if not found
Function HeaderCell(rng As Range, sHeader As String) As Range
Dim rv As Range, c As Range
For Each c In rng.Parent.Range(rng, rng.Parent.Cells(rng.Row, Columns.count).End(xlToLeft)).Cells
If Trim(c.Value) = sHeader Then
Set rv = c
Exit For
End If
Next c
Set HeaderCell = rv
End Function
'(10)
Function GetLastRowInColumn(theWorksheet As Worksheet, col As String)
With theWorksheet
GetLastRowInColumn = .Range(col & .Rows.count).End(xlUp).Row
End With
End Function
'(11)
Function GetLastRowInSheet(theWorksheet As Worksheet)
Dim ret
With theWorksheet
If Application.WorksheetFunction.CountA(.Cells) <> 0 Then
ret = .Cells.Find(What:="*", _
After:=.Range("A1"), _
Lookat:=xlPart, _
LookIn:=xlFormulas, _
SearchOrder:=xlByRows, _
SearchDirection:=xlPrevious, _
MatchCase:=False).Row
Else
ret = 1
End If
End With
GetLastRowInSheet = ret
End Function
-
\$\begingroup\$ Can you provide small generic test files that would allow me to run the code? \$\endgroup\$paul bica– paul bica2015年06月08日 23:25:49 +00:00Commented Jun 8, 2015 at 23:25
-
\$\begingroup\$ I can do that! Do you know what would be the best way to make those files available to you? @paulbica \$\endgroup\$Taylor– Taylor2015年06月09日 11:31:06 +00:00Commented Jun 9, 2015 at 11:31
-
\$\begingroup\$ You can send them to [email protected] (I'm not sure if there are better ways) \$\endgroup\$paul bica– paul bica2015年06月09日 12:36:56 +00:00Commented Jun 9, 2015 at 12:36
1 Answer 1
I've got the following points to make which are mostly about style and readability rather than the speed of the code.
- Why not use early binding for the Microsoft Scripting Runtime? This will let declare variables such as
FileSystemObject
and asDictionary
rather than asObject
. - Some of your functions do not state the return type, e.g.
GetLastRowInSheet
andGetLastRowInColumn
should both return a Long I guess andGetValues
should return a Dictionary. - Variable declaration. Personally, I always declare each variable on a separate line but if you want to condense them, then I'd suggest grouping them by type. In
GetValues
you mix variable types which makes it harder than it should to check the type of a variable. - Variable naming. Perhaps consider using more meaningful variable names.
Dim d As Range
doesn't really convey the meaning. In the
HeaderCell
function you loop through cells looking at their value. If you use theRange.Find
method it will probably be much quicker.Function HeaderCell(rng As Range, sHeader As String) As Range Set HeaderCell = rng.Parent.Range(rng, rng.Parent.Cells(rng.Row, Columns.Count).End(xlToLeft)) _ .Find(What:=sHeader, LookIn:=xlValues, LookAt:=xlPartial, MatchCase:=True) End Function
In the
GetLastRowInSheet
function you use the WorksheetFunction and Find methods. The former is quite slow. Consider using theUsedRange
property of the Worksheet object.Function GetLastRowInSheet(theWorksheet As Worksheet) As Long With theWorksheet.UsedRange GetLastRowInSheet = (.Row + .Rows.Count) End With End Function
In your section (5) you've got
With WB ... End With
but I think it doesn't add much but increases the level of indentation.
-
\$\begingroup\$ Be careful with
UsedRange
, it can sometimes return surprising results. ++ Nice review. \$\endgroup\$RubberDuck– RubberDuck2015年06月10日 13:54:53 +00:00Commented Jun 10, 2015 at 13:54 -
\$\begingroup\$ @RubberDuck - I see you changed my paragraph numbering from "N)" to "N." I'm curious why? Thanks \$\endgroup\$ChipsLetten– ChipsLetten2015年06月10日 14:04:01 +00:00Commented Jun 10, 2015 at 14:04
-
\$\begingroup\$ It makes it an actual markdown numbered list, which in turn gets transformed into an HTML ordered list. Just nicer formatting. It makes your answer more readable. \$\endgroup\$RubberDuck– RubberDuck2015年06月10日 14:05:52 +00:00Commented Jun 10, 2015 at 14:05