I need help making this code run faster. I am using it to import a CSV file into Excel. It works, but it is very slow. The file is almost 20MB.
{Sub OpenTextFile()
Dim FilePath As String
Dim linitem As Variant
FilePath = "filepath.txt"
Open FilePath For Input As #1
row_number = 0
Do
Line Input #1, LineFromFile
LineItems = Split(LineFromFile, "|")
ActiveCell.Offset(row_number, 30).Value = LineItems(0)
ActiveCell.Offset(row_number, 29).Value = LineItems(1)
ActiveCell.Offset(row_number, 28).Value = LineItems(2)
ActiveCell.Offset(row_number, 27).Value = LineItems(3)
ActiveCell.Offset(row_number, 26).Value = LineItems(4)
ActiveCell.Offset(row_number, 25).Value = LineItems(5)
ActiveCell.Offset(row_number, 24).Value = LineItems(6)
ActiveCell.Offset(row_number, 23).Value = LineItems(7)
ActiveCell.Offset(row_number, 22).Value = LineItems(8)
ActiveCell.Offset(row_number, 21).Value = LineItems(9)
ActiveCell.Offset(row_number, 20).Value = LineItems(10)
ActiveCell.Offset(row_number, 19).Value = LineItems(11)
ActiveCell.Offset(row_number, 18).Value = LineItems(12)
ActiveCell.Offset(row_number, 17).Value = LineItems(13)
ActiveCell.Offset(row_number, 16).Value = LineItems(14)
ActiveCell.Offset(row_number, 15).Value = LineItems(15)
ActiveCell.Offset(row_number, 14).Value = LineItems(16)
ActiveCell.Offset(row_number, 13).Value = LineItems(17)
ActiveCell.Offset(row_number, 12).Value = LineItems(18)
ActiveCell.Offset(row_number, 11).Value = LineItems(19)
ActiveCell.Offset(row_number, 10).Value = LineItems(20)
ActiveCell.Offset(row_number, 9).Value = LineItems(21)
ActiveCell.Offset(row_number, 8).Value = LineItems(22)
ActiveCell.Offset(row_number, 7).Value = LineItems(23)
ActiveCell.Offset(row_number, 6).Value = LineItems(24)
ActiveCell.Offset(row_number, 5).Value = LineItems(25)
ActiveCell.Offset(row_number, 4).Value = LineItems(26)
ActiveCell.Offset(row_number, 3).Value = LineItems(27)
ActiveCell.Offset(row_number, 2).Value = LineItems(28)
ActiveCell.Offset(row_number, 1).Value = LineItems(29)
ActiveCell.Offset(row_number, 0).Value = LineItems(30)
row_number = row_number + 1
Loop Until EOF(1)
Close #1
End Sub
}
5 Answers 5
Not sure about the "faster" thing but you can replace the 31 lines of assignments with a 3 line for loop:
For r As Integer = 0 To 30
ActiveCell.Offset(row_number, 30 - r).Value = LineItems(r)
Next
Programming is about automating repetitive tasks, this should also apply to the code you write.
-
\$\begingroup\$ True, i am a beginner. I tried a loop but i could not write one that works. Grateful for your help \$\endgroup\$user25830– user258302013年11月03日 16:41:41 +00:00Commented Nov 3, 2013 at 16:41
-
2\$\begingroup\$ I would hesitate to tag this as an answer. If your new here as well, you might want to wait awhile for answers to your actual question: How to make the code faster... vs this honestly good answer to an unasked question: How to make the code cleaner. \$\endgroup\$WernerCD– WernerCD2013年11月03日 17:35:20 +00:00Commented Nov 3, 2013 at 17:35
-
\$\begingroup\$ I have to agree with @WernerCD, I don't really think this solves your problem. Although to be fair codereview is not about solving programing problems, it's about reviewing existing code ;) \$\endgroup\$ChrisWue– ChrisWue2013年11月03日 19:02:26 +00:00Commented Nov 3, 2013 at 19:02
-
\$\begingroup\$ Point well taken. \$\endgroup\$user25830– user258302013年11月03日 23:25:09 +00:00Commented Nov 3, 2013 at 23:25
If I were to rewrite your code, I'd do it like this:
Option Explicit
Step one, always be explicit. Nobody likes seeing a variable being used that they never saw declared anywhere.
Public Sub ImportTextFile()
Step two, name the procedure after what it's doing, and make it do that.
Dim filePath As String
filePath = "filepath.txt"
Dim fileNumber As Long
fileNumber = FreeFile
FreeFile
will give you a file number that's ok to use. Hard-coding a file number is never a good idea, even if you're only using one file.
Dim inputData As String
Dim lineCount As Long
On Error Goto ErrHandler
ToggleWaitMode True, "Importing text file..."
On Error Goto ErrHandler
means if anything wrong happens (file not found, more data in the file than can fit on the worksheet, or whatever), we'll skip straight to an error-handling subroutine where we'll take good care of closing our file and toggle "wait mode" back off.
Open filePath For Input As #fileNumber
Do Until EOF(fileNumber)
Line Input #fileNumber, inputData
lineCount = lineCount + 1
WriteLineContentToActiveSheet inputData, lineCount
Loop
Close #fileNumber
So the ImportTextFile
procedure does nothing but reading the file's content. WriteLineContentToActiveSheet
will do the worksheet-writing part.
ErrHandler:
ToggleWaitMode
If Err.Number <> 0 Then
Close 'closes any file left open
MsgBox Err.Description
Err.Clear
End If
End sub
The code under this label will execute even if no error occurred, this is why we're checking for Err.Number
being other than 0. But regardless of whether there's an error or not, we always want to ToggleWaitMode
.
Private Sub ToggleWaitMode(Optional ByVal wait As Boolean = False, Optional ByVal statusBarMessage As String = vbNullString)
Application.ScreenUpdating = True 'ensure status message gets displayed
Application.StatusBar = statusBarMessage
Application.Cursor = IIf(wait, xlWait, xlDefault)
Application.ScreenUpdating = Not wait
Application.EnableEvents = Not Wait
End Sub
This small procedure takes two optional parameters. If none are specified, "wait mode" is toggled off - this means the status bar message is reset, the mouse cursor goes back to default, Excel resumes redrawing the screen and fires events whenever something happens. "Wait mode" is just all that reversed :)
Private Sub WriteLineContentToActiveSheet(ByVal inputData As String, ByVal targetRow As Long)
Dim lineFields() As String
lineFields = Split(inputData, "|")
Dim fieldCount As Integer
fieldCount = UBound(lineFields)
Dim currentField As Integer
Dim targetColumn As Integer
Dim fieldValue As String
With ActiveSheet
For currentField = 0 To fieldCount
fieldValue = lineFields(currentField)
targetColumn = fieldCount - currentField + 1
.Cells(targetRow, targetColumn).Value = fieldValue
Next
End With
End Sub
This small procedure takes a single line of data, splits it into as many fields as it has, and writes each field value into a cell of the active sheet. Using UBound
we're no longer tied to a specific number of fields - if the file format changes, the procedure should still work. And if it doesn't, the procedure that called it will cleanly handle the error for us.
Local variables currentField
, targetColumn
and fieldValue
are not necessary - they could all be inlined, but having them makes code easier to follow and allows you to place breakpoints to validate their values as you run the code line-by-line with F8 when you're debugging.
Couple observations:
- You need to indent your code. Indentation makes it much easier to read, for yourself and anyone looking at it. Give that Tab button some lovin'!
- Procedures define a scope, anything under it should be indented.
- Within a scope, code blocks should also be indented - this includes
If
blocks,With
blocks and any code that runs inside a loop.
OpenTextFile
is a bad name for a procedure that actually imports a text file.- Be consistent with your naming: chose a convention, and stick to it.
- You're reading a file, that's an I/O operation and by Murphy's Law this is going to fail one day or another. Make sure your code properly handles any error that could happen while processing the file, you don't want to leave a file handle opened by mistake.
- Importing a 20mb text file will take a while, even with screen updating turned off. Tell your user you're working, give'em a hourglass cursor, and you can even update the statusbar message every couple thousand lines read/written (just call
ToggleWaitMode True
with a new status message) so they know it's not frozen.
-
\$\begingroup\$ Thank you, i am here to learn. I will take the advice / critique and learn from it. Thank you for taking the time to look at my "code". \$\endgroup\$user25830– user258302013年11月03日 16:38:34 +00:00Commented Nov 3, 2013 at 16:38
-
\$\begingroup\$ Thanks retailcoder, i appreciate your help. I learned a lot. I tried your code, and i am getting a "Array expected" @ Ubound(inputData). It just freeses \$\endgroup\$user25830– user258302013年11月03日 20:36:33 +00:00Commented Nov 3, 2013 at 20:36
-
\$\begingroup\$ oops my bad, typo - should read
UBound(lineFields)
... fixed now. \$\endgroup\$Mathieu Guindon– Mathieu Guindon2013年11月03日 21:30:00 +00:00Commented Nov 3, 2013 at 21:30 -
\$\begingroup\$ UBound stands for "upper bound" and returns the upper boundary of an array. That's the number of fields we have in
LineFields
after theSplit
, and so it represents the number of fields we have - hencefieldCount
. \$\endgroup\$Mathieu Guindon– Mathieu Guindon2013年11月03日 21:40:38 +00:00Commented Nov 3, 2013 at 21:40 -
\$\begingroup\$ Thanks, i tried it. it works but not faster. It gets stuck in imporitng the file. It is a large file so it might be that it will take time no matter what. Thanks a lot for your help, i appreciate it. God Bless \$\endgroup\$user25830– user258302013年11月03日 23:16:20 +00:00Commented Nov 3, 2013 at 23:16
You can also use the With
statement this way:
With ActiveCell
For r As Integer = 0 To 30
.Offset(row_number, 30 - r).Value = LineItems(r)
Next
End With
Thus, Excel doesn't have to call the ActiveCell
object at each loop.
-
\$\begingroup\$ It's still calling it - the
with
block is just sort-of hiding it, hence the readability debate around that keyword :) \$\endgroup\$Mathieu Guindon– Mathieu Guindon2013年11月03日 17:34:06 +00:00Commented Nov 3, 2013 at 17:34 -
\$\begingroup\$ @retailcoder: are you sure it calls it in each line, or does it call it once and maintain a reference to it? Given that ActiveCell is a function, it should do the later, which should be a bit faster. \$\endgroup\$jmoreno– jmoreno2013年11月03日 19:35:52 +00:00Commented Nov 3, 2013 at 19:35
-
\$\begingroup\$ After verification, it seems you're right about this. Sorry! \$\endgroup\$Mathieu Guindon– Mathieu Guindon2013年11月03日 19:52:38 +00:00Commented Nov 3, 2013 at 19:52
-
2\$\begingroup\$ @Jmax: Assuming that VBA and VB.net do the same thing, that should be faster. See this SO answer: stackoverflow.com/a/18288030/234954 \$\endgroup\$jmoreno– jmoreno2013年11月04日 04:03:54 +00:00Commented Nov 4, 2013 at 4:03
-
\$\begingroup\$ Yup. +1 and I'm stealing this, I really thought
With
was merely syntactic sugar. Thanks @jmoreno for that link. \$\endgroup\$Mathieu Guindon– Mathieu Guindon2013年11月04日 13:41:52 +00:00Commented Nov 4, 2013 at 13:41
One of the slowest parts of Excel VBA is writing to the grid. Whenever you have a lot of data to write you should put that data in an array and write it all at once. Here's how I would rewrite your code to do that.
Sub OpenTextFile()
Dim sFilePath As String
Dim sInput As String
Dim vaLines As Variant
Dim vaLineItems As Variant
Dim aNew() As Variant
Dim i As Long, j As Long
Dim lCnt As Long
sFilePath = "filepath.txt"
Open sFilePath For Input As #1
'Read in whole file at one, then close it
sInput = Input$(LOF(1), 1)
Close #1
'Split the lines and the first line to set up aNew
vaLines = Split(sInput, vbNewLine)
vaLineItems = Split(vaLines(0), "|")
ReDim aNew(LBound(vaLines) To UBound(vaLines), LBound(vaLineItems) To UBound(vaLineItems))
'Loop through the lines
For i = LBound(vaLines) To UBound(vaLines)
vaLineItems = Split(vaLines(i), "|")
'Loop through the items, add to aNew in reverse
For j = LBound(vaLineItems) To UBound(vaLineItems)
aNew(lCnt, UBound(vaLineItems) - j) = vaLineItems(j)
Next j
lCnt = lCnt + 1
Next i
'Write to the range all at once
Sheet1.Range("A1").Resize(UBound(aNew, 1) + 1, UBound(aNew, 2) + 1).Value = aNew
End Sub
aNew is a poorly named variable that will hold the array that gets written to the sheet. By reading in the whole text file at once, you can redimension aNew to the right number of rows. Then by reading in the first line and splitting it, you get the number of columns for aNew (assuming they're all the same number of columns).
The loops simply fill aNew with the info in the order we want it. Finally, the array is assigned to the Value property.
An alternative approach would be to code the import of the file as a separate sheet. Then select used range and copy to where those values have to go.
This way you will let Excel do the hard part.
Application.ScreenUpdating
toFalse
and then back toTrue
once you're done with the file? \$\endgroup\$