7
\$\begingroup\$

If you work with a database that you only have access through a web interface, this one is for you. You probably don't have access to any programming languages other than VBA through MS Office, which I 95% guarantee you do have.

Since we will be controlling an external instance of Internet Explorer, you will need to reference Mircosoft Internet Controls for this to work. One could hack this to work with Firefox or Chrome with extensions like Selenium.

Usage

This is meant to be placed in an add-in so we can create an instance from the pseudo-constructors.

Dim navi As Gator
Set navi = Gator.Home("https://www.google.com/", php:=False) 

Navigating to a url will raise an error if the browser doesn't navigate to the requested URL. The php:=False option ignores trailing php code added. You can also grab an existing instance of IE using the window title.

Set navi = Gator.FromWindow("Google")

It's nifty if you don't want to re-login to a database or catch a popup window.

There are currently two ways to access HTML elements on the page.

  1. By id or name
  2. Using HtmlFields object

By id is the recommended method if available. You find the id or name of the element and use

With Gator.Home("http://codereview.stackexchange.com/")
 .GetElByID("nav-questions").Click
End With

If the element does not have an ID you can specify the tag of the element and any other properties of the sought element

With Gator.Home("http://codereview.stackexchange.com/questions/69009/vba-clickbot-featuring-ajax-waiting-and-element-searching")
 .FindEl(HtmlFields.Make("a", innerText:="up vote", _
 title:="This question shows research effort; it is useful and clear")).Click
 .FindEl(HtmlFields.Make("a", innerText:="add a comment")).Click
 .FindEl(HtmlFields.Make("textarea")).value = "WOW! This is super neat!"
 .FindEl(HtmlFields.Make("input", value:="Add Comment")).Click
End With

I highly recommend you give that a try

Notice that we aren't required to wait for pages or elements to load. That is the primary purpose of Gator. It will attempt to access the element until it is found or it times out and raises an error. Both methods have an optional timeOut parameter which is default 5 seconds.

Because of the brute force waiting, the code is extremely inefficient. FindEl will perform a linear search on all elements of the given tag name at least 3 times before a successful return. I am looking for ways to reduce that. Still, the inefficiency is negligible in comparison to IE's operations.

Begin Code Dump


HtmlFields

VERSION 1.0 CLASS
BEGIN
 MultiUse = -1 'True
END
Attribute VB_Name = "HtmlFields"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Option Explicit
' Private Members
' ---------------
''
' Dictionary of `property => result`
Private criteria As Object
Private Sub Class_Initialize()
 Set criteria = CreateObject("Scripting.Dictionary")
End Sub

Constructor

I would prefer to allow users to specify the property name of the element searching for instead of restricting them to the one's I have provided. It would be more flexible for the client and less code for me. However, I don't know how I should implement the association between the property and the expected value without inline dictionary definitions which could almost negate this class.

Public Function Make(ByVal tagName As String, _
 Optional ByVal className As String = "", _
 Optional ByVal value As String = "", _
 Optional ByVal innerText As String = "", _
 Optional ByVal innerHTML As String = "", _
 Optional ByVal typeName As String = "", _
 Optional ByVal title As String = "", _
 Optional ByVal href As String = "", _
 Optional ByVal style As String = "") As HtmlFields
 Set Make = New HtmlFields
 With Make
 If tagName = "" Then
 Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
 End If
 .Add "tagName", UCase(tagName)
 .Add "className", className
 .Add "value", value
 .Add "innerText", innerText
 .Add "innerHTML", innerHTML
 .Add "type", typeName
 .Add "title", title
 .Add "href", href
 .Add "style", style
 End With
End Function

Friend Methods

Soley used for constructor

''
' If value is blank then the option wasn't provided.
Friend Sub Add(ByVal key As String, ByVal value As String)
 If value <> "" Then criteria.Add key, value
End Sub
Friend Property Get Count() As Long
 Count = criteria.Count
End Property

Public Properties

Public Function GetTags(ByVal htmlDoc As Object) As Object
 Set GetTags = htmlDoc.getelementsbytagname(criteria("tagName"))
End Function
Public Function IsMatch(ByVal element As Object) As Boolean
 Dim pass As Boolean
 Dim field As Variant
 For Each field In criteria.Keys
 pass = (CallByName(element, field, VbGet) = criteria(field))
 If Not pass Then Exit For
 Next field
 IsMatch = pass
End Function

Gator

VERSION 1.0 CLASS
BEGIN
 MultiUse = -1 'True
END
Attribute VB_Name = "Gator"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Option Explicit
' Required .dlls
' `Microsoft Internet Controls`
'
' # Members
' ## Private Members
Private pBrowser As Object
Enum GatorErrors
 wrong_location = 360
 failed_el_get
End Enum

Public Browser Properties

Public Property Let Visible(ByVal state As Boolean)
 pBrowser.Visible = state
End Property

Friend Properties

' For constructors
'
Friend Property Set Browser(ByRef other As InternetExplorer)
 Set pBrowser = other
End Property
Friend Property Get Browser() As InternetExplorer
 Set Browser = pBrowser
End Property

Constructors

Public Function FromWindow(ByVal title As String) As Gator
 WaitFor "PopupExists", title, 30
 Dim navi As New Gator
 Set navi.Browser = GetWindowFromTitle(title)
 Set FromWindow = navi
End Function
Public Function Home(ByVal homeURL As String) As Gator
 Dim navi As New Gator
 navi.UrlGo homeURL
 Set Home = navi
End Function

Navigation

Public Sub UrlGo(ByVal destination As String, Optional php As Boolean = False)
 If pBrowser Is Nothing Then Set pBrowser = New InternetExplorer
 pBrowser.Visible = True
 pBrowser.navigate destination
 Wait
 Dim location As String
 If php Then
 location = pBrowser.LocationURL
 Else
 location = Split(pBrowser.LocationURL, "?")(0)
 End If
 If location <> destination Then
 pBrowser.Visible = True
 Err.Raise GatorErrors.wrong_location, "UrlGo", _
 "Hey! Listen! Browser didn't navigate to: " & destination
 End If
End Sub

Public Routines

Public Sub Quit()
 pBrowser.Quit
End Sub
Public Sub Wait()
 While pBrowser.readyState <> READYSTATE_COMPLETE Or pBrowser.Busy
 DoEvents
 Wend
End Sub
Public Sub WaitFor(ByVal existsName As String, ByVal arg0 As Variant, _
 ByVal timeOut As Integer)
 Dim start As Single
 start = timer
 Do Until CallByName(Me, existsName, VbMethod, arg0) _
 Or start + timeOut < timer
 DoEvents
 Loop
 If Not CallByName(Me, existsName, VbMethod, arg0) Then
 pBrowser.Visible = True
 Err.Raise GatorErrors.failed_el_get, "WaitFor", _
 "Hey!, Listen! " & existsName & " failed to find requested element"
 End If
End Sub

Public Accessors

' ### byID
'
' The best way to do it by far. If the website is properly built and every
' element has some kind of ID then this is the way to go.
' You can wait for elements to exist if you know the ID
Public Function GetElByID(ByVal id As String, Optional ByVal timeOut As Long = 5) As Object
 WaitFor "IdExists", id, timeOut
 Set GetElByID = pBrowser.document.getelementbyID(id)
End Function
Public Function IdExists(ByVal id As String) As Boolean
 On Error Resume Next
 Dim el As Object
 Set el = pBrowser.document.getelementbyID(id)
 IdExists = Not (el Is Nothing)
End Function
'
' ### By Filter
'
Public Function FindEl(ByVal fields As HtmlFields, Optional ByVal timeOut As Long = 5) As Object
 WaitFor "CanFindElement", fields, timeOut
 Set FindEl = FieldSearch(fields)
End Function
Public Function CanFindElement(ByVal fields As HtmlFields) As Boolean
 CanFindElement = Not (FieldSearch(fields) Is Nothing)
End Function
### Popup
Public Function PopupExists(ByVal title As String) As Boolean
 PopupExists = Not (GetWindowFromTitle(title) Is Nothing)
End Function

Private Accessors

Private Function GetDoc() As Object
 Wait
 Set GetDoc = pBrowser.document
End Function
Private Function FieldSearch(ByRef fields As HtmlFields) As Object
 Dim element As Object
 For Each element In fields.GetTags(GetDoc)
 If fields.IsMatch(element) Then
 Set FieldSearch = element
 Exit Function
 End If
 Next element
End Function

Popup Catching

''
' heavily refactored from StackExchange
' http://stackoverflow.com/questions/14446951/how-to-intercept-and-manipulate-a-internet-explorer-popup-with-vba
' You will need to reference `Microsoft Internet Controls`
Private Function GetWindowFromTitle(ByVal sTitle As String, _
 Optional ByVal bCaseSensitive As Boolean = False, _
 Optional ByVal bExact As Boolean = False) As SHDocVw.InternetExplorer
 Dim objShellWindows As New SHDocVw.ShellWindows
 Dim window As SHDocVw.InternetExplorer
 For Each window In objShellWindows
 If hasWindowTitle(window, sTitle, bCaseSensitive, bExact) Then
 Set GetWindowFromTitle = window
 Exit Function
 End If
 Next
 Set GetWindowFromTitle = Nothing
End Function
Private Function hasWindowTitle(win As SHDocVw.InternetExplorer, ByVal title As String, _
 ByVal bCaseSensitive As Boolean, ByVal bExact As Boolean) As Boolean
 On Error GoTo Nope
 Dim pass As Boolean
 pass = (typeName(win.document) = "HTMLDocument")
 If Not pass Then GoTo CleanExit
 Dim docTitle As String
 docTitle = win.document.title
 If Not bCaseSensitive Then
 docTitle = LCase(docTitle)
 title = LCase(title)
 End If
 If bExact Then
 pass = (docTitle = title)
 Else
 pass = InStr(1, docTitle, title) <> 0
 End If
CleanExit:
 hasWindowTitle = pass
Exit Function
Nope:
 pass = False
 Resume CleanExit
End Function
RubberDuck
31.1k6 gold badges73 silver badges176 bronze badges
asked Nov 5, 2014 at 19:34
\$\endgroup\$

2 Answers 2

4
\$\begingroup\$

I have to say that this is pretty damn impressive. Really. I'm not going to go through this line by line; my general impression is that it looks pretty readable, procedures are of decent length, it's clear what's going on.

I wonder why the late binding here:

Private criteria As Object
Private Sub Class_Initialize()
 Set criteria = CreateObject("Scripting.Dictionary")
End Sub

With a reference to the Scripting library, it could look like this:

Private criteria As New Scripting.Dictionary

And nothing in the Initialize handler. But this whole binding point is moot, I'll get back to why.

There's this Make function:

Public Function Make(...)

The function is being assigned a new instance of HtmlFields, and then that instance is being modified:

Set Make = New HtmlFields
With Make
 If tagName = "" Then
 Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
 End If
 .Add "tagName", UCase(tagName)
 .Add "className", className
 .Add "value", value
 .Add "innerText", innerText
 .Add "innerHTML", innerHTML
 .Add "type", typeName
 .Add "title", title
 .Add "href", href
 .Add "style", style
End With

I like the usage you've made of the With statement, but not how you're using the function's handle as a simple object variable; you and I know that it can be used like that, but it's always better not to. I'd much rather like to see this:

Set result = New HtmlFields
With result
 If tagName = "" Then
 Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
 End If
 .Add "tagName", UCase(tagName)
 .Add "className", className
 .Add "value", value
 .Add "innerText", innerText
 .Add "innerHTML", innerHTML
 .Add "type", typeName
 .Add "title", title
 .Add "href", href
 .Add "style", style
End With
Set Make = result

It's not only a matter of personal preference: treating the function's identifier as a variable can lead to unexpected issues if you're not careful, and almost always creates a little surprise to the reader.

I would prefer to allow users to specify the property name of the element searching for instead of restricting them to the one's I have provided.

Oh yeah. Ok. Ready for a little bit of ?

Attribute VB_Name = "HtmlField"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Option Explicit
Private Type THtmlField
 FieldName As String
 FieldValue As String
End Type
Private this As THtmlField
Public Function Make(ByVal field As String, ByVal value As String) As HtmlField
 Dim result As New HtmlField
 result.FieldName = field
 result.FieldValue = value
 Set Make = result
End Function
'default property:
Public Property Get FieldName() As String
 Attribute FieldName.VB_UserMemId = 0
 FieldName = this.FieldName
End Property
Friend Property Let FieldName(ByVal value As String)
 If value = vbNullString Then
 Err.Raise 720, "HtmlField.FieldName (Property Let)", "Invalid argument: [value] cannot be set to an empty string."
 End If
 this.FieldName = value
End Property
Public Property Get FieldValue As String
 FieldValue = this.FieldValue
End Property
Friend Property Let FieldValue(ByVal value As String)
 this.FieldValue = value
End Property

So we have this sort-of-public-interface-immutable type with a default instance, that encapsulates its own validation rule(s). With the first recommended modification, this means we can now remove all these optional parameters and have the Make function look like this:

Public Function Make(ByVal tagName As String, ) As HtmlFields
 Set Make = New HtmlFields
 With Make
 If tagName = "" Then
 Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
 End If
 .Add "tagName", UCase(tagName)
 .Add "className", className
 .Add "value", value
 .Add "innerText", innerText
 .Add "innerHTML", innerHTML
 .Add "type", typeName
 .Add "title", title
 .Add "href", href
 .Add "style", style
 End With
End Function

And now we don't need this anymore, because value is never going to be blank:

''
' If value is blank then the option wasn't provided.
Friend Sub Add(ByVal key As String, ByVal value As String)
 If value <> "" Then criteria.Add key, value
End Sub

Actually, the HtmlField abstraction implements the key/value concept; there's no need to use a Scripting.Dictionary anymore, because instead of this:

Public Function IsMatch(ByVal element As Object) As Boolean
 Dim pass As Boolean
 Dim field As Variant
 For Each field In criteria.Keys
 pass = (CallByName(element, field, VbGet) = criteria(field))
 If Not pass Then Exit For
 Next field
 IsMatch = pass
End Function

We could have that:

Public Function IsMatch(ByVal element As Object) As Boolean
 Dim pass As Boolean
 Dim condition As HtmlField
 For Each condition In criteria
 pass = (CallByName(element, condition.FieldName, VbGet) = condition.FieldValue)
 If Not pass Then Exit For
 Next
 IsMatch = pass
End Function

And then criteria can be a simple everyday Collection instance, and whether you late-bind or reference the Scripting library is moot as promised.

answered Nov 13, 2014 at 4:17
\$\endgroup\$
1
  • \$\begingroup\$ Can you provide an example usage of the new structure? My understanding is that HtmlFeilds would now have an ParamArray constructor that accepts HtmlField objects. \$\endgroup\$ Commented Nov 13, 2014 at 14:16
4
\$\begingroup\$

It seems the other answer didn't touch the Gator class, so I'll focus on that.


' Required .dlls
' `Microsoft Internet Controls`

Well done right there. All too often there is no documentation about what references a piece of code needs to work. Someone will thank you for that someday without a doubt.

Public Property Let Visible(ByVal state As Boolean)
 pBrowser.Visible = state
End Property

There's nothing wrong with this code, but it might be nice to expose a getter too. That way the client code could take action based on whether or not the browser is visible. There is such a thing as too much symmetry, but I definitely don't see any harm in exposing it. It always feels a little weird when you can set a property, but not inspect it.

' For constructors
'
Friend Property Set Browser(ByRef other As InternetExplorer)
 Set pBrowser = other
End Property
Friend Property Get Browser() As InternetExplorer
 Set Browser = pBrowser
End Property
Public Function FromWindow(ByVal title As String) As Gator
 WaitFor "PopupExists", title, 30
 Dim navi As New Gator
 Set navi.Browser = GetWindowFromTitle(title)
 Set FromWindow = navi
End Function
Public Function Home(ByVal homeURL As String) As Gator
 Dim navi As New Gator
 navi.UrlGo homeURL
 Set Home = navi
End Function

I find this to be a beautiful use of the underutilized Friend scope. You wouldn't want to expose the property outside of the project, and you don't. Fantastic!

 location = Split(pBrowser.LocationURL, "?")(0)

That is smart and clean way to get a substring, but has potential to confuse a rookie. Consider adding a comment.

Public Sub Wait()
 While pBrowser.readyState <> READYSTATE_COMPLETE Or pBrowser.Busy
 DoEvents
 Wend
End Sub

This is really picky, but I'd like to see a set of parenthesis here. It just clears up the intent.

While (pBrowser.readyState <> READYSTATE_COMPLETE) Or pBrowser.Busy
 DoEvents
Wend
Public Sub WaitFor(ByVal existsName As String, ByVal arg0 As Variant, _
 ByVal timeOut As Integer)
 Dim start As Single
 start = timer
 Do Until CallByName(Me, existsName, VbMethod, arg0) _
 Or start + timeOut < timer
 DoEvents
 Loop
 If Not CallByName(Me, existsName, VbMethod, arg0) Then
 pBrowser.Visible = True
 Err.Raise GatorErrors.failed_el_get, "WaitFor", _
 "Hey!, Listen! " & existsName & " failed to find requested element"
 End If
End Sub

arg0 is a variant here because you need to pass an HTMLFields object to it in one place. Again, consider a comment explaining why you declared it as a variant. Also, I don't much like the name arg0, but I'm drawing a blank on something better at the moment.

The other thing I'm seeing here is that you raise the same error if the element is not found as you do if it simply times out. I would raise two different errors depending on the situation if you can. It makes for a friendlier API to do so.

Public Function IdExists(ByVal id As String) As Boolean
 On Error Resume Next
 Dim el As Object
 Set el = pBrowser.document.getelementbyID(id)
 IdExists = Not (el Is Nothing)
End Function

You're a good programmer, so I won't question that you need to use On Error Resume Next here, but you definitely should add a comment here explaining why you're doing it. As it is, I don't have to domain knowledge to understand why you're doing this (and neither will the person maintaining the code someday).

Private Function GetDoc() As Object
 Wait
 Set GetDoc = pBrowser.document
End Function
Private Function FieldSearch(ByRef fields As HtmlFields) As Object
 Dim element As Object
 For Each element In fields.GetTags(GetDoc)
 If fields.IsMatch(element) Then
 Set FieldSearch = element
 Exit Function
 End If
 Next element
End Function

You're already aware that this is a linear and slow search method, but as you pointed out, it shouldn't affect overall performance much. What I wanted to mention is that there's a lot of Object here. You're already referencing the library, so you should go ahead and strongly type these if you can.

' heavily refactored from StackExchange
' http://stackoverflow.com/questions/14446951/how-to-intercept-and-manipulate-a-internet-explorer-popup-with-vba
' You will need to reference `Microsoft Internet Controls`
Private Function GetWindowFromTitle(ByVal sTitle As String, _
 Optional ByVal bCaseSensitive As Boolean = False, _
 Optional ByVal bExact As Boolean = False) As SHDocVw.InternetExplorer
 Dim objShellWindows As New SHDocVw.ShellWindows
 Dim window As SHDocVw.InternetExplorer
 For Each window In objShellWindows
 If hasWindowTitle(window, sTitle, bCaseSensitive, bExact) Then
 Set GetWindowFromTitle = window
 Exit Function
 End If
 Next
 Set GetWindowFromTitle = Nothing
End Function

Thanks for the proper attribution!, but OP's hungarian is showing. Burn it. It mars your otherwise beautiful naming.


That's all I've got. This is pretty darn impeccable looking code as far as I see. Pretty cool project too in my opinion.

answered Nov 13, 2014 at 17:42
\$\endgroup\$
3
  • \$\begingroup\$ IdExists should say On Error GoTo 0 immediately after the line that expects the error, to restore error handling. Otherwise it may be hiding problems, because On Error Resume Next turns off error handling altogether, even beyond the procedure's scope (until another On Error statement is encountered). \$\endgroup\$ Commented Nov 13, 2014 at 17:58
  • 1
    \$\begingroup\$ @Mat'sMug I might just use On Error GoTo Nope as I usually do. It's more lines but it will be self documenting. \$\endgroup\$ Commented Nov 13, 2014 at 18:04
  • 1
    \$\begingroup\$ @ptwales nevermind \$\endgroup\$ Commented Nov 13, 2014 at 18:17

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.