4
\$\begingroup\$

I wrote some code to extract the information from a table, but it takes an extremely long time.

The table is in the format of a calendar. I need the information on an Excel sheet with column 1 as the day number and column 2 as the accommodation type. On the accommodation type, I only want the "Camp" types and the number of available units to show. I’m sure there will be a better way to loop through the table. Any help will be appreciated.

Here is my code:

Dim driver As New WebDriver, i As Integer, mysheet As Worksheet 
Dim ele As WebElement 
Set driver = New EdgeDriver 
Application.ScreenUpdating = False 
driver.Start "edge" 
driver.Get "http://www.sanparks.org/reservations/accommodation/calendar-month/park/26/camp/41/date/2022-07-01"
Application.Wait Now + TimeValue("00:00:05")
Set mysheet = Sheets("Sheet1")
i = 7
eRow = Sheet1.Cells(Rows.Count, 3).End(xlUp).Offset(7, 0).Row
For Each ele In driver.FindElementsByClass("day")
If ele.FindElementByCss("span").Text <> "" Then
 mysheet.Cells(i, 2).Value = ele.FindElementByCss("span").Text
 On Error Resume Next
 If Mid(ele.FindElementByClass("bold-name").Text, 1, 4) = "Camp" Then 'AccommodationType
 mysheet.Cells(i, 3).Value = ele.FindElementByClass("bold-name").Text & " " & ele.FindElementByClass("unit-numbers").Text
 
 End If
 On Error GoTo 0
 i = i + 1
End If
Next ele
Greedo
2,6252 gold badges15 silver badges36 bronze badges
asked Mar 6, 2022 at 21:19
\$\endgroup\$
1

2 Answers 2

4
\$\begingroup\$

If you aren't locked into using Selenium, you can accomplish something similar to this using a web request. Web Requests should be faster as they don't need to worry about rendering any items on screen, they just return data.

I looked at the web traffic, and noticed this site uses an API to provide the data to the front end. I'm using this API, to return the data as JSON, parsing that, then returning that to an Excel range. This takes about 2 seconds on my machine.

In order to get this to work, you'll need a copy of this --> https://github.com/VBA-tools/VBA-JSON in your project. Follow the directions on the project page to get that setup.

Option Explicit
Public Sub ScrapeCampsites()
 Dim URL As String
 URL = "https://www.sanparks.org/includes/SANParksApp/API/v1/bookings/accommodation/getAvailabilityAccommodationMonthList.php?resort=10&month=7&year=2022"
 
 Dim response As String
 
 With CreateObject("MSXML2.XMLHTTP.6.0")
 .Open "GET", URL, False
 .send
 response = .responseText
 End With
 
 'Include https://github.com/VBA-tools/VBA-JSON into your project
 Dim Json As Object: Set Json = JsonConverter.ParseJson(response)
 Dim Accomodations As Object: Set Accomodations = Json("DATA")
 Dim Accomodation As Variant
 Dim Availabilities As Variant
 Dim Availability As Variant
 Dim Results As Variant
 
 ReDim Results(1 To 2, 1 To 50000)
 
 Dim i As Long
 
 For Each Accomodation In Accomodations
 
 If Mid$(Accomodation("accommodationtypedesc"), 1, 4) = "Camp" Then
 Availabilities = Accomodation("availabilities").Items
 
 For Each Availability In Availabilities
 i = i + 1
 Results(1, i) = Availability("availableDate")
 Results(2, i) = Availability("available")
 Next
 
 End If
 
 Next
 
 ReDim Preserve Results(1 To 2, i)
 
 Dim mySheet As Worksheet
 Set mySheet = Sheets("Sheet1")
 
 mySheet.Range("A1:B1").Value = Array("Date", "Available")
 mySheet.Range("A2:B" & UBound(Results, 2) + 1) = Application.WorksheetFunction.Transpose(Results)
End Sub
answered Mar 7, 2022 at 21:18
\$\endgroup\$
7
  • \$\begingroup\$ Out of interest, how did you inspect the traffic, what was the process? \$\endgroup\$ Commented Mar 7, 2022 at 22:04
  • \$\begingroup\$ I used the network tab in Chrome to via the web requests back and forth on the page. I noticed one of the endpoints had API in the URI, so I took a peek at what was included in the response. It was the JSON that feeds into the calendar on the page. \$\endgroup\$ Commented Mar 8, 2022 at 13:32
  • \$\begingroup\$ Greedo - I had my vba code checking the availability on IE and not through a webdriver. This took about 2 seconds to complete. \$\endgroup\$ Commented Mar 11, 2022 at 4:50
  • \$\begingroup\$ Ryan Wildry - Thank you Ryan for having a look. I have followed the instructions and got the following error: KeyNotFoundError - Dictionary key not found: STATUS The error is on the following line: "json_ParseObject.Item(json_Key) = json_ParseValue(json_String, json_Index)" \$\endgroup\$ Commented Mar 11, 2022 at 4:54
  • \$\begingroup\$ @Nelj This is a weird error, I don't see STATUS listed in the keys for code above. It might be related to requiring a reference to Microsoft Scripting Runtime the json parser uses this data structure to deserialize the data. \$\endgroup\$ Commented Mar 11, 2022 at 13:57
2
\$\begingroup\$

I notice a couple of things:

  1. I found reference that says "Webdriver Get will wait until the page has fully loaded before returning the control" so we don't need to wait 5 seconds.

  2. FindElementsByClass is an expensive operation. We should never call it multiple times if we can avoid it. I've implemented storing the results in a variable to cut the calls in half per loop.

Try this out and let me know if it's any faster:

Application.ScreenUpdating = False
Dim driver As WebDriver
Set driver = New EdgeDriver
driver.Start "edge"
driver.Get "http://www.sanparks.org/reservations/accommodation/calendar-month/park/26/camp/41/date/2022-07-01"
Dim mysheet As Worksheet
Set mysheet = Sheets("Sheet1")
Dim i As Integer
i = 7
eRow = mysheet.Cells(Rows.Count, 3).End(xlUp).Offset(7, 0).Row
Dim dayClass As Variant
dayClass = driver.FindElementsByClass("day")
Dim ele As WebElement
For Each ele In dayClass
 Dim spanText As String
 spanText = ele.FindElementByCss("span").Text
 If spanText <> "" Then
 mysheet.Cells(i, 2).Value = spanText
 Dim boldText As String
 boldText = ele.FindElementByClass("bold-name").Text
 
 On Error Resume Next
 If Mid(boldText, 1, 4) = "Camp" Then 'AccommodationType
 mysheet.Cells(i, 3).Value = boldText & " " & ele.FindElementByClass("unit-numbers").Text
 
 End If
 On Error GoTo 0
 
 i = i + 1
 End If
Next ele
```
answered Mar 7, 2022 at 16:38
\$\endgroup\$
2
  • \$\begingroup\$ Thank you for having a look at my problem and the advice. I have tried your solution. It is working, but still takes a long time to complete. I had the following code when the website used to be available on IE and I didn't have to work through a web driver (this is only the web scraping part): \$\endgroup\$ Commented Mar 12, 2022 at 5:26
  • \$\begingroup\$ eRow = Sheet1.Cells(Rows.Count, 3).End(xlUp).Offset(7, 0).Row For Each ele In .document.all Select Case ele.className Case "day" RowCount = RowCount + 1 sht.Range("B" & RowCount) = ele.innerText Case "units" If Mid(ele.NextSibling.innerText, 1, 4) = StandType Then sht.Range("C" & RowCount) = ele.NextSibling.innerText & " - " & ele.innerText End If End Select Next ele \$\endgroup\$ Commented Mar 13, 2022 at 7:06

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.