I have found Scripting.FileSystemObject to be slow and unstable. Calling file_object.Name
repeatedly has caused my code to crash excel on multiple instances. Therefore I made a module where all of my file System helper functions reside. I called it os
after python's os
module as I tried to emulate it as much as possible. Admittedly, I have not emulated it enough and much is written from scratch.
While I haven't been able to run extensive comparisons against it and FileSystemObject. I have found that it is significantly faster. It varies from operation to operation but the range is 4-10 times faster. However, I do suspect that FileSystemObject may out-perform os
in larger data-sets.
There are some few questionable design calls. Nothing is object oriented. Functional programming is preferred whenever possible. Almost all variables are strings (file paths) or collections of strings. The exceptions are the file system operations like FileCopy
, Move
, Rename
etc that return True
if succeeded or False
if failed as Error Handling is a horrid mess of GoTo
s in VB6. Also they are hard set to never overwrite files; only Remove
deletes anything.
I am looking for more functions, features and improvements for existing functions.
Option Explicit
' Constants
' ------
Public Const EXTSEP As String = "."
Public Const PARDIR As String = ".."
Public Const CURDIR As String = "."
Public Const SEP As String = "\"
Public Const PATHSEP As String = ";"
Private Const ALLPAT As String = "*"
Path Manipulations
These functions would be in os.path
. They have no interaction with the file system at all and are primarily helper functions.
''
' Returns the base name of a path, either the lowest folder or file
' Note! that `suffix` will be removed from the end regardless if its an actual filename
' extension or not.
Function BaseName(ByVal file_path As String, Optional suffix As String) As String
Dim fsplit As Variant
fsplit = Split(file_path, sep)
BaseName = fsplit(UBound(fsplit))
If suffix <> "" Then
Dim base_length As Integer
base_length = Len(BaseName) - Len(suffix)
BaseName = left$(BaseName, base_length) & _
Replace$(BaseName, suffix, "", base_length + 1)
End If
End Function
''
' Returns the path of the parent folder. This is the opposite of `BaseName`.
Function RootName(ByVal file_path As String) As String
Dim fsplit As Variant
fsplit = Split(file_path, sep)
If UBound(fsplit) = 0 Then
RootName = ""
Else
Dim i As Integer
For i = LBound(fsplit) To UBound(fsplit) - 1
RootName = RootName & fsplit(i) & SEP
Next i
RootName = left$(RootName, Len(RootName) - 1)
End If
End Function
Function ParentDir(ByVal file_path As String, _
ByVal parent_height As Integer) As String
ParentDir = file_path
Dim i As Integer
For i = 1 To parent_height
ParentDir = RootName(ParentDir)
Next i
End Function
''
' Returns the file extension of the file.
Function ext(ByVal file_path As String) As String
Dim base_name As String
base_name = BaseName(file_path)
If InStr(base_name, EXTSEP) Then
Dim fsplit As Variant
fsplit = Split(base_name, EXTSEP)
ext = EXTSEP & fsplit(UBound(fsplit))
End If
End Function
Function ChangeExt(ByVal file_path As String, ByVal new_ext As String) As String
Dim current_ext As String
current_ext = ext(file_path)
ChangeExt = left$(file_path, Len(file_path) - Len(current_ext)) & _
EXTSEP & Replace$(new_ext, EXTSEP, "", 1, 1)
End Function
Private Function RTrimSep(ByVal file_path As String) As String
If right$(file_path, 1) = sep Then
RTrimSep = left$(file_path, Len(file_path) - 1)
Else
RTrimSep = file_path
End If
End Function
''
' safely join two strings to form a path
Function pJoin(ByVal root_path As String, ByVal file_path As String) As String
pJoin = RTrimSep(root_path) & SEP & file_path
End Function
Function Append(ByVal file_path As String, ByVal to_append As String) As String
Dim file_ext As String
file_ext = ext(file_path)
Append = pJoin(RootName(file_path), _
BaseName(file_path, suffix:=file_ext) & _
to_append & file_ext)
End Function
Function Prepend(ByVal file_path As String, ByVal to_prepend As String) As String
Prepend = pJoin(RootName(file_path), to_prepend & BaseName(file_path))
End Function
Introspect File System
The following functions read the file system but make no changes.
''
' returns true if the file exists else false.
' You can give a pat as f to see if anything matches the pat
Function Exists(ByVal file_path As String, _
Optional vbType As Integer = vbDirectory) As Boolean
If file_path <> "" Then Exists = (Dir$(RTrimSep(file_path), vbType) <> "")
End Function
Function FileExists(ByVal file_path As String)
FileExists = Exists(file_path, vbNormal)
End Function
''
' vbDirectory option still includes files.
' FML
Function FolderExists(ByVal file_path As String)
FolderExists = Exists(file_path, vbDirectory) And Not Exists(file_path, vbNormal)
End Function
''
' returns a collection of strings that are paths of subitems in root which
' match pat.
Function SubItems(ByVal root As String, Optional pat As String = ALLPAT, _
Optional vbType As Integer = vbDirectory) As Collection
Set SubItems = New Collection
Dim sub_item As String
sub_item = Dir$(pJoin(root, pat), vbType)
While sub_item <> ""
SubItems.Add (pJoin(root, sub_item))
sub_item = Dir$()
Wend
End Function
Function SubFiles(ByVal root As String, _
Optional pat As String = ALLPAT) As Collection
Set SubFiles = SubItems(root, pat, vbNormal)
End Function
Function SubFolders(ByVal root As String, Optional pat As String = "", _
Optional skipDots As Boolean = True) As Collection
Set SubFolders = SubItems(root, pat, vbDirectory)
If skipDots And SubFolders.count > 0 Then
Dim dot As String, dotdot As String
dot = pJoin(root, CURDIR)
dotdot = pJoin(root, PARDIR)
Do While SubFolders.Item(1) = dot Or SubFolders.Item(1) = dotdot
SubFolders.Remove (1)
If SubFolders.count = 0 Then Exit Do
Loop
End If
Dim i As Integer
For i = SubFolders.count To 1 Step -1
If FileExists(SubFolders.Item(i)) Then
SubFolders.Remove (i)
End If
Next i
End Function
''
' recursive search
Sub sWalk(ByVal root As String, ByRef collec As Collection, _
Optional pat As String = "*", Optional vbType As Integer = vbNormal)
Dim file_path As Variant
For Each file_path In SubItems(root, pat, vbType)
collec.Add file_path
Next file_path
Dim folder_path As Variant
For Each folder_path In SubFolders(root)
sWalk folder_path, collec, pat, vbType
Next folder_path
End Sub
Function fWalk(ByVal root As String, Optional pat As String = "*", _
Optional vbType As Integer = vbNormal) As Collection
Set fWalk = New Collection
sWalk root, fWalk, pat, vbType
End Function
File System Operations
The following functions are the only functions that actually change the file system. Note that only Remove
deletes any data. The rest will return False
if they would remove or overwrite any data, or if they cannot perform the operation for any other reason.
Function Move(ByVal file_path As String, ByVal new_path As String, _
Optional create_parent As Boolean = False) As Boolean
Move = Exists(file_path) And Not Exists(new_path)
If Move Then
If create_parent Then CreateRootPath new_path
Name file_path As new_path
End If
End Function
Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean
Debug.Assert BaseName(new_name) = new_name
new_name = pJoin(RootName(file_path), new_name)
Rename = Exists(file_path) And Not Exists(new_name)
If Rename Then Name file_path As new_name
End Function
Function Remove(file_path As String) As Boolean
Remove = FileExists(file_path)
If Remove Then Kill file_path
End Function
''
' $ mkdir -p
Private Sub CreateRootPath(file_path As String)
If Not Exists(RootName(file_path)) Then
MakeDir RootName(file_path), create_parent:=True
End If
End Sub
Function MakeDir(folder_path As String, _
Optional create_parent As Boolean = False) As Boolean
MakeDir = Not Exists(folder_path)
If MakeDir Then
If create_parent Then CreateRootPath folder_path
MkDir folder_path
End If
End Function
Function CopyFile(file_path As String, dest_path As String, _
Optional create_parent As Boolean = False) As Boolean
CopyFile = Exists(file_path) And Not Exists(dest_path)
If CopyFile Then
If create_parent Then CreateRootPath dest_path
FileCopy file_path, dest_path
End If
End Function
Error Handling
I implemented the Error Handling as suggested. I noticed I could remove almost all of my guards because the VB methods would throw errors anyways. Except for FileCopy
which would overwrite existing files.
Function Move(ByVal src_path As String, ByVal dest_path As String, _
Optional create_parent As Boolean = False) As Boolean
Dim check As Boolean
On Error GoTo ErrHandler
If right$(dest_path, 1) = SEP Or FolderExists(dest_path) Then
' Move the file to a folder. Note this cannot be used with the
' create_parent option.
dest_path = pJoin(dest_path, BaseName(src_path))
End If
If create_parent Then CreateRootPath dest_path
Name src_path As dest_path
check = Exists(dest_path)
CleanExit:
Move = check
Exit Function
ErrHandler:
Err.clear
Debug.Assert (Not check)
Resume CleanExit
End Function
Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean
Debug.Assert BaseName(new_name) = new_name
Rename = Move(file_path, pJoin(RootName(file_path), new_name))
End Function
Function Remove(file_path As String) As Boolean
Dim check As Boolean
On Error GoTo ErrHandler
Kill file_path
check = (Not FileExists(file_path))
CleanExit:
Remove = check
Exit Function
ErrHandler:
Err.clear
Debug.Assert (Not check)
Resume CleanExit
End Function
Function MakeDir(folder_path As String, _
Optional create_parent As Boolean = False) As Boolean
Dim check As Boolean
On Error GoTo ErrHandler
If create_parent Then CreateRootPath folder_path
MkDir folder_path
check = FolderExists(folder_path)
CleanExit:
MakeDir = check
Exit Function
ErrHandler:
Err.clear
Debug.Assert (Not check)
Resume CleanExit
End Function
Function CopyFile(src_path As String, dest_path As String, _
Optional create_parent As Boolean = False) As Boolean
Dim check As Boolean
On Error GoTo ErrHandler
If FileExists(dest_path) Then GoTo CleanExit:
If create_parent Then CreateRootPath dest_path
FileCopy src_path, dest_path
check = FileExists(dest_path)
CleanExit:
CopyFile = check
Exit Function
ErrHandler:
Err.clear
Debug.Assert (Not check)
Resume CleanExit
End Function
Private Sub CreateRootPath(path As String)
Dim parent_folder As String
parent_folder = RootName(path)
If Not FolderExists(parent_folder) Then
MakeDir parent_folder, create_parent:=True
End If
End Sub
-
\$\begingroup\$ You might be interested in reviewing this somewhat related code \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年06月03日 04:09:31 +00:00Commented Jun 3, 2014 at 4:09
3 Answers 3
Just one quick little comment, about this function... which turns out being applicable to the whole:
Function Remove(file_path As String) As Boolean Remove = FileExists(file_path) If Remove Then Kill file_path End Function
You're returning True
if FileExists(file_path)
returns True
, not if the Kill
call succeeded. And what happens if Kill
blows up, your method blows up with it, and your client code is probably simply expecting the call to return False
if the Kill
somehow can't happen right now, whatever the reason is. What the client wants, is to delete the file.
I'd avoid underscores in identifiers, and keep them for event handler procedures and interface implementations. Also I don't like using the function's return value as a variable in a function's logic.
Consider this:
Function Remove(path As String) As Boolean
Const method As String = "Remove"
On Error GoTo ErrHandler
Dim exists As Boolean
Dim check As Boolean
exists = FileExists(path)
If exists Then
Kill path
check = Not FileExists(path)
End If
CleanExit:
Remove = check
Exit Function
ErrHandler:
' Err.Raise Err.Number, TypeName(Me) & "." & method, Err.Description
Err.Clear
Debug.Assert (Not check)
Resume CleanExit
End Function
If I un-comment the Err.Raise
call, I'll bubble up the error to the client code like your code does now - but you can take control of that, and add something meaningful to the error's Source
property, so the client code can know in which method the error was raised from. In this case though, I think it's best to handle any errors that may be raised in the process, and simply return True
or False
- your client code likely is only expecting a False
if anything failed.
Notice how exists
does not have the same meaning as check
and Remove
:
exists
is the condition we need to fulfill in order to perform theKill
callcheck
defaults toFalse
, and can only beTrue
if the fileexists
and theKill
call didn't blow up.Remove
is the return value, accessed/assigned once.
In case of an error - any error, code jumps to Err.Clear
and then Resume CleanExit
jumps to Remove = check
which is False
at that point, and then Exit Function
: client code sees nothing but a False
return value.
In the debugger, the Debug.Assert
will break whenever check
is True
, just prior to returning that value. Use Debug.Assert
to verify assumptions the code is making - in this case, that check
is False
.
-
1\$\begingroup\$ It would be even better to explicitly set check = false after clearing the error. \$\endgroup\$RubberDuck– RubberDuck2014年06月03日 02:10:49 +00:00Commented Jun 3, 2014 at 2:10
-
\$\begingroup\$ Depends how you see it: I find it more explicit to see
check
only assigned once... implicitly.... right. lol, indeed, it would be more explicit :) \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年06月03日 02:17:45 +00:00Commented Jun 3, 2014 at 2:17 -
1\$\begingroup\$ Added
Debug.Assert
as a counterpoint ;) \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年06月03日 02:34:50 +00:00Commented Jun 3, 2014 at 2:34 -
\$\begingroup\$ Love the
Debug.Assert
, but I'm not sure how client code will react to it. I'll have to research that later. \$\endgroup\$RubberDuck– RubberDuck2014年06月03日 12:40:13 +00:00Commented Jun 3, 2014 at 12:40 -
1\$\begingroup\$
Debug.Assert (not check)
will never be false.FileExists
would need to throw and error after returningTrue
which AFAIK is impossible. \$\endgroup\$cheezsteak– cheezsteak2014年06月03日 17:07:20 +00:00Commented Jun 3, 2014 at 17:07
Once you've fixed the things that could potentially crash your code and added some proper error handling you should go back and take a look at some style & readability issues.
- Remove the underscores from your variable names. Style conventions say to use camelCase.
file_path
should befilePath
, etc. I like that you created constants for the different separators, but give them better names. Unless you're coding in a plain text editor, intellisense will help you and there's no reason to abbreviate.
Public Const EXTSEPERATOR As String = "." Public Const PARENTDIR As String = ".." Public Const CURRENTDIR As String = "." Public Const FWDSLASH As String = "\" Public Const PATHSEPERATOR As String = ";" Private Const WILDCARD As String = "*"
On the topic of better names, what is
fsplit()
? It took me way too long to figure out that it's an array that stores the separate parts of a file path.- I recently had it pointed out to me that it's better to use
Not = vbNullString
than<> ""
and I'll pass that advice along to you. Functions should have verb-noun type names.
Function SubFiles(ByVal root As String, _ Optional pat As String = ALLPAT) As Collection Set SubFiles = SubItems(root, pat, vbNormal) End Function
Should be
getSubFiles()
orreturnSubFiles
, but you could keep your naming if...You considered an object oriented approach. Instead of returning a collection of
SubItems
orSubFiles
from a function, you could store them a a property of a parent class. I haven't thought out how I would refactor your code into classes, but any client code might benefit from it. (Of course, this would still require a privategetSubFiles()
routine though.)Think of Mr. Maintainer; add some comments to non obvious logic.
Dim base_length As Integer base_length = Len(BaseName) - Len(suffix) BaseName = left$(BaseName, base_length) & _ Replace$(BaseName, suffix, "", base_length + 1)
Could I spend a few minutes figuring out what exactly you're doing here? Sure, but it'd be a heck of a lot nicer if you wrote one sentence telling me what you're doing.
All that said, I still think it's a pretty cool little module of functions. Fun idea.
Just a thing I wanted to get finished as the other answers didn't mention it yet:
Spacing:
Be consistent in your usage of newlines:
Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean Debug.Assert BaseName(new_name) = new_name new_name = pJoin(RootName(file_path), new_name) Rename = Exists(file_path) And Not Exists(new_name) If Rename Then Name file_path As new_name End Function
compared to:
Private Function RTrimSep(ByVal file_path As String) As String If right$(file_path, 1) = sep Then RTrimSep = left$(file_path, Len(file_path) - 1) Else RTrimSep = file_path End If End Function
I would love to see a consistent spacing behavior in that code ;)
IMO it does not matter which of the two you choose (though personally I prefer the latter one), but you should definitely stick with one way to do it.
It's rather confusing for me to have so much newlines as I use them to separate logically independent actions:
Function Rename(ByVal file_path As String, ByVal new_name As String) As Boolean
Debug.Assert BaseName(new_name) = new_name
new_name = pJoin(RootName(file_path), new_name)
Rename = Exists(file_path) And Not Exists(new_name)
If Rename Then
Name file_path As new_name
End Function