2
\$\begingroup\$

I am new to stackoverflow and a newbie to VBA coding. At my work, we are supplied with shipment data in the form of Ms Word which is not very useful. I have found a way to transfer the data using VBA and have a code that is fully functional. However, the data set contains hundreds of thousands of records. I tried running a month's worth of data with 200k records and it took 5 days. Just wondering if there is anything in my code that I could be improved to speed up the process. I've tried turning off screen updates, events, calculations but it didn't do much. Thanks in advance for your help.

Sub Word_to_Excel()
Dim FName As String, FD As FileDialog
Dim wdApp As Object
Dim wdDoc As Object
Dim WDR, WDCheck, ShipmentID As Object
Dim ExR As Range
Dim file
Dim Path As String
Dim ImportDate As Object
Dim ImportValue As String
Dim ShipmentIDcheck As String
Dim objResult
 Set objShell = CreateObject("WScript.Shell")
 Set ExR = Selection ' current location in Excel Sheet
 ' Select Folder containing word documents
 Set FD = Application.FileDialog(msoFileDialogFolderPicker)
 FD.Show
 FName = FD.SelectedItems(1)
 file = Dir(FName & "\*.docx")
 Set wdApp = CreateObject("Word.Application")
 ' Open word document in the folder, run macro, close it and open the next word document until there are none left
 Do While file <> ""
 wdApp.Documents.Open Filename:=FName & "\" & file
 wdApp.ActiveWindow.ActivePane.View.Type = 1
 wdApp.Visible = True
 ' Once the word doc is open, go to beginning of document and search for CTY/SITE/SORT:
 wdApp.Selection.HomeKey Unit:=6
 wdApp.Selection.Find.ClearFormatting
 wdApp.Selection.Find.Execute "CTY/SITE/SORT:"
 Set WDCheck = wdApp.Selection
 ' If "CTY/SITE/SORT:" is found, then look for Shipment ID
 Do While WDCheck = "CTY/SITE/SORT:"
 ' Find first shipment
 wdApp.Selection.HomeKey Unit:=5
 wdApp.Selection.MoveDown Unit:=5, Count:=11
 wdApp.Selection.MoveRight Unit:=1, Count:=1
 wdApp.Selection.MoveRight Unit:=1, Count:=11, Extend:=1
 Set ShipmentID = wdApp.Selection
 ShipmentIDcheck = Replace(ShipmentID, " ", "")
 ' Transfer information from Word to Excel for a Shipment ID and go to the next one.
 ' Shipment ID should be a string that is 11 characters long
 ' If Shipment ID no longer exist, go to next page by searching for the next CTY/SITE/SORT:
 Do While Len(Trim(ShipmentIDcheck)) = 11
 i = i + 1
 ExR(i, 1) = file
 ExR(i, 2) = ShipmentIDcheck
 ' Consignee Name
 wdApp.Selection.MoveUp Unit:=5, Count:=1
 wdApp.Selection.MoveRight Unit:=1, Count:=12
 wdApp.Selection.MoveRight Unit:=1, Count:=23, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 3) = Trim(WDR)
 ' Importer Name
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=23, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 8) = Trim(WDR)
 ' Shipper Name
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=23, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 13) = Trim(WDR)
 ' Quantity
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=10, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 19) = Trim(WDR)
 ' Weight
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=12, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 20) = Trim(WDR)
 ' Value
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=12, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 21) = Trim(WDR)
 ' Broker
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=11, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 23) = Trim(WDR)
 ' Consignee Street
 wdApp.Selection.HomeKey Unit:=5
 wdApp.Selection.MoveDown Unit:=5, Count:=1
 wdApp.Selection.MoveRight Unit:=1, Count:=13
 wdApp.Selection.MoveRight Unit:=1, Count:=23, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 4) = Trim(WDR)
 ' Importer Street
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=23, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 9) = Trim(WDR)
 ' Shipper Street
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=23, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 14) = Trim(WDR)
 ' Description
 wdApp.Selection.MoveRight Unit:=1, Count:=8
 wdApp.Selection.MoveRight Unit:=1, Count:=40, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 18) = Trim(WDR)
 ' Consignee City
 wdApp.Selection.HomeKey Unit:=5
 wdApp.Selection.MoveDown Unit:=5, Count:=1
 wdApp.Selection.MoveRight Unit:=1, Count:=13
 wdApp.Selection.MoveRight Unit:=1, Count:=13, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 5) = Trim(WDR)
 ' Consignee Province
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=2, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 6) = Trim(WDR)
 ' Consignee Postal
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=6, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 7) = Trim(WDR)
 ' Importer City
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=13, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 10) = Trim(WDR)
 ' Importer Province
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=2, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 11) = Trim(WDR)
 ' Importer Postal
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=6, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 12) = Trim(WDR)
 ' Shipper City
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=13, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 15) = Trim(WDR)
 ' Shipper Province
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=2, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 16) = Trim(WDR)
 ' Shipper Postal
 wdApp.Selection.MoveRight Unit:=1, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=6, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 17) = Trim(WDR)
 ' Country of Origin
 wdApp.Selection.MoveRight Unit:=1, Count:=29
 wdApp.Selection.MoveRight Unit:=1, Count:=21, Extend:=1
 Set WDR = wdApp.Selection
 ExR(i, 22) = Trim(WDR)
 wdApp.Selection.HomeKey Unit:=5
 wdApp.Selection.MoveDown Unit:=5, Count:=2
 wdApp.Selection.MoveRight Unit:=1, Count:=1
 wdApp.Selection.MoveRight Unit:=1, Count:=11, Extend:=1
 Set ShipmentID = wdApp.Selection
 ' Remove spaces from selection. Selection is then used to check if it is a shipment ID.
 ' If it is, then data for that shipment ID is transferred. If not, macro will go to the next page in the Word Doc.
 ShipmentIDcheck = Replace(ShipmentID, " ", "")
 ActiveCell.Offset(1).Select
 Loop
 'Simulate keyboard press "NUMLOCK" to prevent screen from locking
 objResult = objShell.SendKeys("{NUMLOCK}")
 wdApp.Selection.HomeKey Unit:=5
 wdApp.Selection.Find.ClearFormatting
 wdApp.Selection.Find.Execute "CTY/SITE/SORT:"
 Set WDCheck = wdApp.Selection
 Loop
 wdApp.ActiveDocument.Close SaveChanges:=wdDoNotSaveChanges
 ActiveWorkbook.Save
 file = Dir()
 Loop
 wdApp.Quit
 MsgBox "Data extraction completed at:" & vbNewLine & Format(Now, "mmmm d, yyyy hh:mm AM/PM")
End Sub

This is how the data is formatted in Ms Word. There are multiple word documents containing pages and pages of this dataset per day. Number of shipments per page varies. But the format are the same throughout. There are no tables in the word documents, just text separated by spaces. CTY/SITE/SORT: is unique to every page and I used it as an anchor point. if the macro finds it, then it goes down 11 lines and takes the first shipment ID and the other information. It then checks for the next shipment ID. If it is not there, then it goes to the next page and repeats the process.

REPORT NUM : ABC1234 OPERATIONS SYSTEM PAGE NUM: 2 
 CTY/SITE/SORT: CA 00123 SUMMARY CARGO RUN TIME: 07:33:43 
 SORT DATE : INBOUND - SCAN RUN DATE: 01AUG19 
 OPER ID : ABC123 
 MVMT: 12345678 MVMT DT: 01AUG19 MAWB: PROD TYP: DTY TYP: IMP CTY: EXP CTY: BL TYP: 
 COURIER REMISSION MANIFEST EXPORT SITE: US 12345 
 GCCN ID: EXPECTED SHPTS: EXPECTED PKGS: EXPECTED WEIGHT: 
 CUSTOMS NUM CONSIGNEE NAME IMPORTER NAME SHIPPER NAME CSA QTY WGT(LBS) VALUE BROKER 
 SHIPMENT ID DESCRIPTION (CAD) CTRY OF ORIGIN
 JOHN SMITH ABC COMPANY XYZ COMPANY 1 1.1 1000.00 UNCONSIGNED
 ABC12345678 123 MAIN STREET 345 RANDOM ROAD UNIVERSITY OF WASHINGTO BICYCLE PARTS 
 VANCOUVER BC V1A1A1 VANCOUVER BC V2B1B2 SEATTLE WA 981234 US 
 JOHN SMITH ABC COMPANY XYZ COMPANY 1 1.1 1000.00 UNCONSIGNED
 ABC12345678 123 MAIN STREET 345 RANDOM ROAD UNIVERSITY OF WASHINGTO BICYCLE PARTS 
 VANCOUVER BC V1A1A1 VANCOUVER BC V2B1B2 SEATTLE WA 981234 US 
 JOHN SMITH ABC COMPANY XYZ COMPANY 1 1.1 1000.00 UNCONSIGNED
 ABC12345678 123 MAIN STREET 345 RANDOM ROAD UNIVERSITY OF WASHINGTO BICYCLE PARTS 
 VANCOUVER BC V1A1A1 VANCOUVER BC V2B1B2 SEATTLE WA 981234 US 
 JOHN SMITH ABC COMPANY XYZ COMPANY 1 1.1 1000.00 UNCONSIGNED
 ABC12345678 123 MAIN STREET 345 RANDOM ROAD UNIVERSITY OF WASHINGTO BICYCLE PARTS 
 VANCOUVER BC V1A1A1 VANCOUVER BC V2B1B2 SEATTLE WA 981234 US 
 TOTAL FOR DUTY TYPE COURIER REMISSION 
 TOTAL SHIPMENTS: 4 
 TOTAL PACKAGES : 4 
 TOTAL WEIGHT : 70.9 LBS 
 TOTAL VALUES : 4000.00 
* * * 
asked Jan 14, 2020 at 22:04
\$\endgroup\$
4
  • 1
    \$\begingroup\$ Is the word document laid out with tables? If so I'd consider plain copy-pasting the Word tables into a spreadsheet, and try processing that instead. \$\endgroup\$ Commented Jan 15, 2020 at 14:14
  • \$\begingroup\$ no tables. it is laid out into columns and separated by spaces. \$\endgroup\$ Commented Jan 15, 2020 at 15:42
  • 2
    \$\begingroup\$ Then I try text import. Anything but line-by-line, column-by-column processing ;-) \$\endgroup\$ Commented Jan 15, 2020 at 15:56
  • \$\begingroup\$ the-macro-recorder-curse can be cured! And 200K data should be stored in a database! \$\endgroup\$ Commented Jan 22, 2020 at 1:41

1 Answer 1

5
\$\begingroup\$

Declarations

My first comment is Option explicit. Every. Single. Time.

Your first line of code is :

Set objShell = CreateObject("WScript.Shell")

Why? objshell is not declared or used. And while on the matter of declarations:

Dim WDR, WDCheck, ShipmentID As Object

declares WDR and WDCheck as Variant, not Object.

Youi are writing a utility tool - using early binding instead of late binding will improve the code. (Dim wdApp as Word.Application : Set wdApp = New Word.Application, assuming you are running this from Excel).

Macro recorder

To me it is obvious you used the macro recorder and then simply copied the code to get what you wanted. In order to improve your code, look at each step that has been recorded (a couple of lines each time) to work out what is really happening

You open a word document, but do not assign that open document to the declared variable wdDoc. Which should be declared as Word.Document not Object. Hint: Word.Application.Documents.Open returns a Document.

Once you start looking at the recorded code and making sensible changes, you will stop working with the nebulous Selection and start working directly with defined objects that you can control better.

Approach

A good approach is to first clean the input data. This can be as simple as identifying the block of text to be imported, copying that to an intermediate work area (perhaps a temporary word document, or a work area in your excel file) and then setting up the data format to suit your next step (the direct import).

What I have inferred from your code is that each column is separated by multiple spaces to create a nicely formatted output. So you can have two approaches here:

  • replace those spaces with a known delimiter
  • use the fixed column widths to do a text input into Excel (noted by @MathieuGuindon in the comments)

The first approach is useful if there is no consistency between the documents. The second is useful if there is this consistency.

Either way, at the end of these steps you have a consistent form of input data that you can now directly import into Excel.

I have deliberately not included any example code. The initial steps of refactoring the macro-recorded code is great learning experience for yourself and is something that will give you good insight into how you can improve your own code.

Looking forwards to the seeing the refactored code as a new article here for further review!

answered Jan 15, 2020 at 19:41
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.