8
\$\begingroup\$

I have found Scripting.FileSystemObject to be slow and unstable. Calling file_object.Name repeatedly has caused my code to crash excel on multiple instances. Therefore I made a module where all of my file System helper functions reside. I called it os after python's os module as I tried to emulate it as much as possible. Admittedly, I have not emulated it enough and much is written from scratch.

While I haven't been able to run extensive comparisons against it and FileSystemObject. I have found that it is significantly faster. It varies from operation to operation but the range is 4-10 times faster. However, I do suspect that FileSystemObject may out-perform os in larger data-sets.

There are some few questionable design calls. Nothing is object oriented. Functional programming is preferred whenever possible. Almost all variables are strings (file paths) or collections of strings. The exceptions are the file system operations like FileCopy, Move, Rename etc that return True if succeeded or False if failed as Error Handling is a horrid mess of GoTos in VB6. Also they are hard set to never overwrite files; only Remove deletes anything.

I am looking for more functions, features and improvements for existing functions.

Option Explicit
' Constants
' ------
Public Const EXTSEP As String = "."
Public Const PARDIR As String = ".."
Public Const CURDIR As String = "."
Public Const SEP As String = "\"
Public Const PATHSEP As String = ";"
Private Const ALLPAT As String = "*"

Path Manipulations

These functions would be in os.path. They have no interaction with the file system at all and are primarily helper functions.

''
' Returns the base name of a path, either the lowest folder or file
' Note! that `suffix` will be removed from the end regardless if its an actual filename
' extension or not.
Function BaseName(ByVal file_path As String, Optional suffix As String) As String
 Dim fsplit As Variant
 fsplit = Split(file_path, sep)
 BaseName = fsplit(UBound(fsplit))
 If suffix <> "" Then
 Dim base_length As Integer
 base_length = Len(BaseName) - Len(suffix)
 BaseName = left$(BaseName, base_length) & _
 Replace$(BaseName, suffix, "", base_length + 1)
 End If
End Function
''
' Returns the path of the parent folder. This is the opposite of `BaseName`.
Function RootName(ByVal file_path As String) As String
 Dim fsplit As Variant
 fsplit = Split(file_path, sep)
 If UBound(fsplit) = 0 Then
 RootName = ""
 Else
 Dim i As Integer
 For i = LBound(fsplit) To UBound(fsplit) - 1
 RootName = RootName & fsplit(i) & SEP
 Next i
 RootName = left$(RootName, Len(RootName) - 1)
 End If
End Function
Function ParentDir(ByVal file_path As String, _
 ByVal parent_height As Integer) As String
 ParentDir = file_path
 Dim i As Integer
 For i = 1 To parent_height
 ParentDir = RootName(ParentDir)
 Next i
End Function
''
' Returns the file extension of the file.
Function ext(ByVal file_path As String) As String
 Dim base_name As String
 base_name = BaseName(file_path)
 If InStr(base_name, EXTSEP) Then
 Dim fsplit As Variant
 fsplit = Split(base_name, EXTSEP)
 ext = EXTSEP & fsplit(UBound(fsplit))
 End If
End Function
Function ChangeExt(ByVal file_path As String, ByVal new_ext As String) As String
 Dim current_ext As String
 current_ext = ext(file_path)
 ChangeExt = left$(file_path, Len(file_path) - Len(current_ext)) & _
 EXTSEP & Replace$(new_ext, EXTSEP, "", 1, 1)
End Function
Private Function RTrimSep(ByVal file_path As String) As String
 If right$(file_path, 1) = sep Then
 RTrimSep = left$(file_path, Len(file_path) - 1)
 Else
 RTrimSep = file_path
 End If
End Function
''
' safely join two strings to form a path
Function pJoin(ByVal root_path As String, ByVal file_path As String) As String
 pJoin = RTrimSep(root_path) & SEP & file_path
End Function
Function Append(ByVal file_path As String, ByVal to_append As String) As String
 Dim file_ext As String
 file_ext = ext(file_path)
 Append = pJoin(RootName(file_path), _
 BaseName(file_path, suffix:=file_ext) & _
 to_append & file_ext)
End Function
Function Prepend(ByVal file_path As String, ByVal to_prepend As String) As String
 Prepend = pJoin(RootName(file_path), to_prepend & BaseName(file_path))
End Function

Introspect File System

The following functions read the file system but make no changes.

''
' returns true if the file exists else false.
' You can give a pat as f to see if anything matches the pat
Function Exists(ByVal file_path As String, _
 Optional vbType As Integer = vbDirectory) As Boolean
 If file_path <> "" Then Exists = (Dir$(RTrimSep(file_path), vbType) <> "")
End Function
Function FileExists(ByVal file_path As String)
 FileExists = Exists(file_path, vbNormal)
End Function
''
' vbDirectory option still includes files.
' FML
Function FolderExists(ByVal file_path As String)
 FolderExists = Exists(file_path, vbDirectory) And Not Exists(file_path, vbNormal)
End Function
''
' returns a collection of strings that are paths of subitems in root which
' match pat.
Function SubItems(ByVal root As String, Optional pat As String = ALLPAT, _
 Optional vbType As Integer = vbDirectory) As Collection
 Set SubItems = New Collection
 Dim sub_item As String
 sub_item = Dir$(pJoin(root, pat), vbType)
 While sub_item <> ""
 SubItems.Add (pJoin(root, sub_item))
 sub_item = Dir$()
 Wend
End Function
Function SubFiles(ByVal root As String, _
 Optional pat As String = ALLPAT) As Collection
 Set SubFiles = SubItems(root, pat, vbNormal)
End Function
Function SubFolders(ByVal root As String, Optional pat As String = "", _
 Optional skipDots As Boolean = True) As Collection
 Set SubFolders = SubItems(root, pat, vbDirectory)
 If skipDots And SubFolders.count > 0 Then
 Dim dot As String, dotdot As String
 dot = pJoin(root, CURDIR)
 dotdot = pJoin(root, PARDIR)
 Do While SubFolders.Item(1) = dot Or SubFolders.Item(1) = dotdot
 SubFolders.Remove (1)
 If SubFolders.count = 0 Then Exit Do
 Loop
 End If
 Dim i As Integer
 For i = SubFolders.count To 1 Step -1
 If FileExists(SubFolders.Item(i)) Then
 SubFolders.Remove (i)
 End If
 Next i
End Function
''
' recursive search
Sub sWalk(ByVal root As String, ByRef collec As Collection, _
 Optional pat As String = "*", Optional vbType As Integer = vbNormal)
 Dim file_path As Variant
 For Each file_path In SubItems(root, pat, vbType)
 collec.Add file_path
 Next file_path
 Dim folder_path As Variant
 For Each folder_path In SubFolders(root)
 sWalk folder_path, collec, pat, vbType
 Next folder_path
End Sub
Function fWalk(ByVal root As String, Optional pat As String = "*", _
 Optional vbType As Integer = vbNormal) As Collection
 Set fWalk = New Collection
 sWalk root, fWalk, pat, vbType
End Function

File System Operations

The following functions are the only functions that actually change the file system. Note that only Remove deletes any data. The rest will return False if they would remove or overwrite any data, or if they cannot perform the operation for any other reason.

Function Move(ByVal file_path As String, ByVal new_path As String, _
 Optional create_parent As Boolean = False) As Boolean
 Move = Exists(file_path) And Not Exists(new_path)
 If Move Then
 If create_parent Then CreateRootPath new_path
 Name file_path As new_path
 End If
End Function
Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean
 Debug.Assert BaseName(new_name) = new_name
 new_name = pJoin(RootName(file_path), new_name)
 Rename = Exists(file_path) And Not Exists(new_name)
 If Rename Then Name file_path As new_name
End Function
Function Remove(file_path As String) As Boolean
 Remove = FileExists(file_path)
 If Remove Then Kill file_path
End Function
''
' $ mkdir -p
Private Sub CreateRootPath(file_path As String)
 If Not Exists(RootName(file_path)) Then
 MakeDir RootName(file_path), create_parent:=True
 End If
End Sub
Function MakeDir(folder_path As String, _
 Optional create_parent As Boolean = False) As Boolean
 MakeDir = Not Exists(folder_path)
 If MakeDir Then
 If create_parent Then CreateRootPath folder_path
 MkDir folder_path
 End If
End Function
Function CopyFile(file_path As String, dest_path As String, _
 Optional create_parent As Boolean = False) As Boolean
 CopyFile = Exists(file_path) And Not Exists(dest_path)
 If CopyFile Then
 If create_parent Then CreateRootPath dest_path
 FileCopy file_path, dest_path
 End If
End Function

Error Handling

I implemented the Error Handling as suggested. I noticed I could remove almost all of my guards because the VB methods would throw errors anyways. Except for FileCopy which would overwrite existing files.

Function Move(ByVal src_path As String, ByVal dest_path As String, _
 Optional create_parent As Boolean = False) As Boolean
 Dim check As Boolean
 On Error GoTo ErrHandler
 If right$(dest_path, 1) = SEP Or FolderExists(dest_path) Then
 ' Move the file to a folder. Note this cannot be used with the
 ' create_parent option.
 dest_path = pJoin(dest_path, BaseName(src_path))
 End If
 If create_parent Then CreateRootPath dest_path
 Name src_path As dest_path
 check = Exists(dest_path)
CleanExit:
 Move = check
 Exit Function
ErrHandler:
 Err.clear
 Debug.Assert (Not check)
 Resume CleanExit
End Function
Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean
 Debug.Assert BaseName(new_name) = new_name
 Rename = Move(file_path, pJoin(RootName(file_path), new_name))
End Function
Function Remove(file_path As String) As Boolean
 Dim check As Boolean
 On Error GoTo ErrHandler
 Kill file_path
 check = (Not FileExists(file_path))
CleanExit:
 Remove = check
 Exit Function
ErrHandler:
 Err.clear
 Debug.Assert (Not check)
 Resume CleanExit
End Function
Function MakeDir(folder_path As String, _
 Optional create_parent As Boolean = False) As Boolean
 Dim check As Boolean
 On Error GoTo ErrHandler
 If create_parent Then CreateRootPath folder_path
 MkDir folder_path
 check = FolderExists(folder_path)
CleanExit:
 MakeDir = check
 Exit Function
ErrHandler:
 Err.clear
 Debug.Assert (Not check)
 Resume CleanExit
End Function
Function CopyFile(src_path As String, dest_path As String, _
 Optional create_parent As Boolean = False) As Boolean
 Dim check As Boolean
 On Error GoTo ErrHandler
 If FileExists(dest_path) Then GoTo CleanExit:
 If create_parent Then CreateRootPath dest_path
 FileCopy src_path, dest_path
 check = FileExists(dest_path)
CleanExit:
 CopyFile = check
 Exit Function
ErrHandler:
 Err.clear
 Debug.Assert (Not check)
 Resume CleanExit
End Function
Private Sub CreateRootPath(path As String)
 Dim parent_folder As String
 parent_folder = RootName(path)
 If Not FolderExists(parent_folder) Then
 MakeDir parent_folder, create_parent:=True
 End If
End Sub
asked Jun 2, 2014 at 19:59
\$\endgroup\$
1

3 Answers 3

7
\$\begingroup\$

Just one quick little comment, about this function... which turns out being applicable to the whole:

Function Remove(file_path As String) As Boolean
 Remove = FileExists(file_path)
 If Remove Then Kill file_path
End Function

You're returning True if FileExists(file_path) returns True, not if the Kill call succeeded. And what happens if Kill blows up, your method blows up with it, and your client code is probably simply expecting the call to return False if the Kill somehow can't happen right now, whatever the reason is. What the client wants, is to delete the file.

I'd avoid underscores in identifiers, and keep them for event handler procedures and interface implementations. Also I don't like using the function's return value as a variable in a function's logic.

Consider this:

Function Remove(path As String) As Boolean
Const method As String = "Remove"
On Error GoTo ErrHandler
 Dim exists As Boolean
 Dim check As Boolean
 exists = FileExists(path)
 If exists Then 
 Kill path 
 check = Not FileExists(path)
 End If
CleanExit:
 Remove = check
 Exit Function
ErrHandler:
' Err.Raise Err.Number, TypeName(Me) & "." & method, Err.Description
 Err.Clear
 Debug.Assert (Not check)
 Resume CleanExit
End Function

If I un-comment the Err.Raise call, I'll bubble up the error to the client code like your code does now - but you can take control of that, and add something meaningful to the error's Source property, so the client code can know in which method the error was raised from. In this case though, I think it's best to handle any errors that may be raised in the process, and simply return True or False - your client code likely is only expecting a False if anything failed.

Notice how exists does not have the same meaning as check and Remove:

  • exists is the condition we need to fulfill in order to perform the Kill call
  • check defaults to False, and can only be True if the file exists and the Kill call didn't blow up.
  • Remove is the return value, accessed/assigned once.

In case of an error - any error, code jumps to Err.Clear and then Resume CleanExit jumps to Remove = check which is False at that point, and then Exit Function: client code sees nothing but a False return value.

In the debugger, the Debug.Assert will break whenever check is True, just prior to returning that value. Use Debug.Assert to verify assumptions the code is making - in this case, that check is False.

answered Jun 3, 2014 at 2:00
\$\endgroup\$
6
  • 1
    \$\begingroup\$ It would be even better to explicitly set check = false after clearing the error. \$\endgroup\$ Commented Jun 3, 2014 at 2:10
  • \$\begingroup\$ Depends how you see it: I find it more explicit to see check only assigned once... implicitly.... right. lol, indeed, it would be more explicit :) \$\endgroup\$ Commented Jun 3, 2014 at 2:17
  • 1
    \$\begingroup\$ Added Debug.Assert as a counterpoint ;) \$\endgroup\$ Commented Jun 3, 2014 at 2:34
  • \$\begingroup\$ Love the Debug.Assert, but I'm not sure how client code will react to it. I'll have to research that later. \$\endgroup\$ Commented Jun 3, 2014 at 12:40
  • 1
    \$\begingroup\$ Debug.Assert (not check) will never be false. FileExists would need to throw and error after returning True which AFAIK is impossible. \$\endgroup\$ Commented Jun 3, 2014 at 17:07
9
\$\begingroup\$

Once you've fixed the things that could potentially crash your code and added some proper error handling you should go back and take a look at some style & readability issues.

  1. Remove the underscores from your variable names. Style conventions say to use camelCase. file_path should be filePath, etc.
  2. I like that you created constants for the different separators, but give them better names. Unless you're coding in a plain text editor, intellisense will help you and there's no reason to abbreviate.

    Public Const EXTSEPERATOR As String = "."
    Public Const PARENTDIR As String = ".."
    Public Const CURRENTDIR As String = "."
    Public Const FWDSLASH As String = "\"
    Public Const PATHSEPERATOR As String = ";"
    Private Const WILDCARD As String = "*"
    
  3. On the topic of better names, what is fsplit()? It took me way too long to figure out that it's an array that stores the separate parts of a file path.

  4. I recently had it pointed out to me that it's better to use Not = vbNullString than <> "" and I'll pass that advice along to you.
  5. Functions should have verb-noun type names.

    Function SubFiles(ByVal root As String, _
     Optional pat As String = ALLPAT) As Collection
     Set SubFiles = SubItems(root, pat, vbNormal)
    End Function
    

    Should be getSubFiles() or returnSubFiles, but you could keep your naming if...

  6. You considered an object oriented approach. Instead of returning a collection of SubItems or SubFiles from a function, you could store them a a property of a parent class. I haven't thought out how I would refactor your code into classes, but any client code might benefit from it. (Of course, this would still require a private getSubFiles() routine though.)

  7. Think of Mr. Maintainer; add some comments to non obvious logic.

    Dim base_length As Integer
    base_length = Len(BaseName) - Len(suffix)
    BaseName = left$(BaseName, base_length) & _
     Replace$(BaseName, suffix, "", base_length + 1)
    

    Could I spend a few minutes figuring out what exactly you're doing here? Sure, but it'd be a heck of a lot nicer if you wrote one sentence telling me what you're doing.

All that said, I still think it's a pretty cool little module of functions. Fun idea.

answered Jun 3, 2014 at 12:32
\$\endgroup\$
7
\$\begingroup\$

Just a thing I wanted to get finished as the other answers didn't mention it yet:

Spacing:

Be consistent in your usage of newlines:

Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean
 Debug.Assert BaseName(new_name) = new_name
 new_name = pJoin(RootName(file_path), new_name)
 Rename = Exists(file_path) And Not Exists(new_name)
 If Rename Then Name file_path As new_name
End Function

compared to:

Private Function RTrimSep(ByVal file_path As String) As String
 If right$(file_path, 1) = sep Then
 RTrimSep = left$(file_path, Len(file_path) - 1)
 Else
 RTrimSep = file_path
 End If
End Function

I would love to see a consistent spacing behavior in that code ;)
IMO it does not matter which of the two you choose (though personally I prefer the latter one), but you should definitely stick with one way to do it.

It's rather confusing for me to have so much newlines as I use them to separate logically independent actions:

Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean 
 Debug.Assert BaseName(new_name) = new_name
 new_name = pJoin(RootName(file_path), new_name)
 Rename = Exists(file_path) And Not Exists(new_name)
 If Rename Then 
 Name file_path As new_name
End Function 
answered Jun 3, 2014 at 16:00
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.