If you work with a database that you only have access through a web interface, this one is for you. You probably don't have access to any programming languages other than VBA through MS Office, which I 95% guarantee you do have.
Since we will be controlling an external instance of Internet Explorer, you will need to reference Mircosoft Internet Controls
for this to work. One could hack this to work with Firefox or Chrome with extensions like Selenium.
Usage
This is meant to be placed in an add-in so we can create an instance from the pseudo-constructors.
Dim navi As Gator
Set navi = Gator.Home("https://www.google.com/", php:=False)
Navigating to a url will raise an error if the browser doesn't navigate to the requested URL. The php:=False
option ignores trailing php
code added. You can also grab an existing instance of IE using the window title.
Set navi = Gator.FromWindow("Google")
It's nifty if you don't want to re-login to a database or catch a popup window.
There are currently two ways to access HTML elements on the page.
- By id or name
- Using
HtmlFields
object
By id is the recommended method if available. You find the id
or name
of the element and use
With Gator.Home("http://codereview.stackexchange.com/")
.GetElByID("nav-questions").Click
End With
If the element does not have an ID you can specify the tag of the element and any other properties of the sought element
With Gator.Home("http://codereview.stackexchange.com/questions/69009/vba-clickbot-featuring-ajax-waiting-and-element-searching")
.FindEl(HtmlFields.Make("a", innerText:="up vote", _
title:="This question shows research effort; it is useful and clear")).Click
.FindEl(HtmlFields.Make("a", innerText:="add a comment")).Click
.FindEl(HtmlFields.Make("textarea")).value = "WOW! This is super neat!"
.FindEl(HtmlFields.Make("input", value:="Add Comment")).Click
End With
I highly recommend you give that a try
Notice that we aren't required to wait for pages or elements to load. That is the primary purpose of Gator
. It will attempt to access the element until it is found or it times out and raises an error. Both methods have an optional timeOut
parameter which is default 5 seconds.
Because of the brute force waiting, the code is extremely inefficient. FindEl
will perform a linear search on all elements of the given tag name at least 3 times before a successful return. I am looking for ways to reduce that. Still, the inefficiency is negligible in comparison to IE's operations.
Begin Code Dump
HtmlFields
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
END
Attribute VB_Name = "HtmlFields"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Option Explicit
' Private Members
' ---------------
''
' Dictionary of `property => result`
Private criteria As Object
Private Sub Class_Initialize()
Set criteria = CreateObject("Scripting.Dictionary")
End Sub
Constructor
I would prefer to allow users to specify the property name of the element searching for instead of restricting them to the one's I have provided. It would be more flexible for the client and less code for me. However, I don't know how I should implement the association between the property and the expected value without inline dictionary definitions which could almost negate this class.
Public Function Make(ByVal tagName As String, _
Optional ByVal className As String = "", _
Optional ByVal value As String = "", _
Optional ByVal innerText As String = "", _
Optional ByVal innerHTML As String = "", _
Optional ByVal typeName As String = "", _
Optional ByVal title As String = "", _
Optional ByVal href As String = "", _
Optional ByVal style As String = "") As HtmlFields
Set Make = New HtmlFields
With Make
If tagName = "" Then
Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
End If
.Add "tagName", UCase(tagName)
.Add "className", className
.Add "value", value
.Add "innerText", innerText
.Add "innerHTML", innerHTML
.Add "type", typeName
.Add "title", title
.Add "href", href
.Add "style", style
End With
End Function
Friend Methods
Soley used for constructor
''
' If value is blank then the option wasn't provided.
Friend Sub Add(ByVal key As String, ByVal value As String)
If value <> "" Then criteria.Add key, value
End Sub
Friend Property Get Count() As Long
Count = criteria.Count
End Property
Public Properties
Public Function GetTags(ByVal htmlDoc As Object) As Object
Set GetTags = htmlDoc.getelementsbytagname(criteria("tagName"))
End Function
Public Function IsMatch(ByVal element As Object) As Boolean
Dim pass As Boolean
Dim field As Variant
For Each field In criteria.Keys
pass = (CallByName(element, field, VbGet) = criteria(field))
If Not pass Then Exit For
Next field
IsMatch = pass
End Function
Gator
VERSION 1.0 CLASS
BEGIN
MultiUse = -1 'True
END
Attribute VB_Name = "Gator"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Option Explicit
' Required .dlls
' `Microsoft Internet Controls`
'
' # Members
' ## Private Members
Private pBrowser As Object
Enum GatorErrors
wrong_location = 360
failed_el_get
End Enum
Public Browser Properties
Public Property Let Visible(ByVal state As Boolean)
pBrowser.Visible = state
End Property
Friend Properties
' For constructors
'
Friend Property Set Browser(ByRef other As InternetExplorer)
Set pBrowser = other
End Property
Friend Property Get Browser() As InternetExplorer
Set Browser = pBrowser
End Property
Constructors
Public Function FromWindow(ByVal title As String) As Gator
WaitFor "PopupExists", title, 30
Dim navi As New Gator
Set navi.Browser = GetWindowFromTitle(title)
Set FromWindow = navi
End Function
Public Function Home(ByVal homeURL As String) As Gator
Dim navi As New Gator
navi.UrlGo homeURL
Set Home = navi
End Function
Navigation
Public Sub UrlGo(ByVal destination As String, Optional php As Boolean = False)
If pBrowser Is Nothing Then Set pBrowser = New InternetExplorer
pBrowser.Visible = True
pBrowser.navigate destination
Wait
Dim location As String
If php Then
location = pBrowser.LocationURL
Else
location = Split(pBrowser.LocationURL, "?")(0)
End If
If location <> destination Then
pBrowser.Visible = True
Err.Raise GatorErrors.wrong_location, "UrlGo", _
"Hey! Listen! Browser didn't navigate to: " & destination
End If
End Sub
Public Routines
Public Sub Quit()
pBrowser.Quit
End Sub
Public Sub Wait()
While pBrowser.readyState <> READYSTATE_COMPLETE Or pBrowser.Busy
DoEvents
Wend
End Sub
Public Sub WaitFor(ByVal existsName As String, ByVal arg0 As Variant, _
ByVal timeOut As Integer)
Dim start As Single
start = timer
Do Until CallByName(Me, existsName, VbMethod, arg0) _
Or start + timeOut < timer
DoEvents
Loop
If Not CallByName(Me, existsName, VbMethod, arg0) Then
pBrowser.Visible = True
Err.Raise GatorErrors.failed_el_get, "WaitFor", _
"Hey!, Listen! " & existsName & " failed to find requested element"
End If
End Sub
Public Accessors
' ### byID
'
' The best way to do it by far. If the website is properly built and every
' element has some kind of ID then this is the way to go.
' You can wait for elements to exist if you know the ID
Public Function GetElByID(ByVal id As String, Optional ByVal timeOut As Long = 5) As Object
WaitFor "IdExists", id, timeOut
Set GetElByID = pBrowser.document.getelementbyID(id)
End Function
Public Function IdExists(ByVal id As String) As Boolean
On Error Resume Next
Dim el As Object
Set el = pBrowser.document.getelementbyID(id)
IdExists = Not (el Is Nothing)
End Function
'
' ### By Filter
'
Public Function FindEl(ByVal fields As HtmlFields, Optional ByVal timeOut As Long = 5) As Object
WaitFor "CanFindElement", fields, timeOut
Set FindEl = FieldSearch(fields)
End Function
Public Function CanFindElement(ByVal fields As HtmlFields) As Boolean
CanFindElement = Not (FieldSearch(fields) Is Nothing)
End Function
### Popup
Public Function PopupExists(ByVal title As String) As Boolean
PopupExists = Not (GetWindowFromTitle(title) Is Nothing)
End Function
Private Accessors
Private Function GetDoc() As Object
Wait
Set GetDoc = pBrowser.document
End Function
Private Function FieldSearch(ByRef fields As HtmlFields) As Object
Dim element As Object
For Each element In fields.GetTags(GetDoc)
If fields.IsMatch(element) Then
Set FieldSearch = element
Exit Function
End If
Next element
End Function
Popup Catching
''
' heavily refactored from StackExchange
' http://stackoverflow.com/questions/14446951/how-to-intercept-and-manipulate-a-internet-explorer-popup-with-vba
' You will need to reference `Microsoft Internet Controls`
Private Function GetWindowFromTitle(ByVal sTitle As String, _
Optional ByVal bCaseSensitive As Boolean = False, _
Optional ByVal bExact As Boolean = False) As SHDocVw.InternetExplorer
Dim objShellWindows As New SHDocVw.ShellWindows
Dim window As SHDocVw.InternetExplorer
For Each window In objShellWindows
If hasWindowTitle(window, sTitle, bCaseSensitive, bExact) Then
Set GetWindowFromTitle = window
Exit Function
End If
Next
Set GetWindowFromTitle = Nothing
End Function
Private Function hasWindowTitle(win As SHDocVw.InternetExplorer, ByVal title As String, _
ByVal bCaseSensitive As Boolean, ByVal bExact As Boolean) As Boolean
On Error GoTo Nope
Dim pass As Boolean
pass = (typeName(win.document) = "HTMLDocument")
If Not pass Then GoTo CleanExit
Dim docTitle As String
docTitle = win.document.title
If Not bCaseSensitive Then
docTitle = LCase(docTitle)
title = LCase(title)
End If
If bExact Then
pass = (docTitle = title)
Else
pass = InStr(1, docTitle, title) <> 0
End If
CleanExit:
hasWindowTitle = pass
Exit Function
Nope:
pass = False
Resume CleanExit
End Function
2 Answers 2
I have to say that this is pretty damn impressive. Really. I'm not going to go through this line by line; my general impression is that it looks pretty readable, procedures are of decent length, it's clear what's going on.
I wonder why the late binding here:
Private criteria As Object
Private Sub Class_Initialize()
Set criteria = CreateObject("Scripting.Dictionary")
End Sub
With a reference to the Scripting
library, it could look like this:
Private criteria As New Scripting.Dictionary
And nothing in the Initialize
handler. But this whole binding point is moot, I'll get back to why.
There's this Make
function:
Public Function Make(...)
The function is being assigned a new instance of HtmlFields
, and then that instance is being modified:
Set Make = New HtmlFields
With Make
If tagName = "" Then
Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
End If
.Add "tagName", UCase(tagName)
.Add "className", className
.Add "value", value
.Add "innerText", innerText
.Add "innerHTML", innerHTML
.Add "type", typeName
.Add "title", title
.Add "href", href
.Add "style", style
End With
I like the usage you've made of the With
statement, but not how you're using the function's handle as a simple object variable; you and I know that it can be used like that, but it's always better not to. I'd much rather like to see this:
Set result = New HtmlFields
With result
If tagName = "" Then
Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
End If
.Add "tagName", UCase(tagName)
.Add "className", className
.Add "value", value
.Add "innerText", innerText
.Add "innerHTML", innerHTML
.Add "type", typeName
.Add "title", title
.Add "href", href
.Add "style", style
End With
Set Make = result
It's not only a matter of personal preference: treating the function's identifier as a variable can lead to unexpected issues if you're not careful, and almost always creates a little surprise to the reader.
I would prefer to allow users to specify the property name of the element searching for instead of restricting them to the one's I have provided.
Oh yeah. Ok. Ready for a little bit of oop?
Attribute VB_Name = "HtmlField"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = True
Option Explicit
Private Type THtmlField
FieldName As String
FieldValue As String
End Type
Private this As THtmlField
Public Function Make(ByVal field As String, ByVal value As String) As HtmlField
Dim result As New HtmlField
result.FieldName = field
result.FieldValue = value
Set Make = result
End Function
'default property:
Public Property Get FieldName() As String
Attribute FieldName.VB_UserMemId = 0
FieldName = this.FieldName
End Property
Friend Property Let FieldName(ByVal value As String)
If value = vbNullString Then
Err.Raise 720, "HtmlField.FieldName (Property Let)", "Invalid argument: [value] cannot be set to an empty string."
End If
this.FieldName = value
End Property
Public Property Get FieldValue As String
FieldValue = this.FieldValue
End Property
Friend Property Let FieldValue(ByVal value As String)
this.FieldValue = value
End Property
So we have this sort-of-public-interface-immutable type with a default instance, that encapsulates its own validation rule(s). With the first recommended modification, this means we can now remove all these optional parameters and have the Make
function look like this:
Public Function Make(ByVal tagName As String, ) As HtmlFields
Set Make = New HtmlFields
With Make
If tagName = "" Then
Err.Raise 720, "HtmlFields.Make", "tagName cannot be a null string"
End If
.Add "tagName", UCase(tagName)
.Add "className", className
.Add "value", value
.Add "innerText", innerText
.Add "innerHTML", innerHTML
.Add "type", typeName
.Add "title", title
.Add "href", href
.Add "style", style
End With
End Function
And now we don't need this anymore, because value is never going to be blank:
''
' If value is blank then the option wasn't provided.
Friend Sub Add(ByVal key As String, ByVal value As String)
If value <> "" Then criteria.Add key, value
End Sub
Actually, the HtmlField
abstraction implements the key/value concept; there's no need to use a Scripting.Dictionary
anymore, because instead of this:
Public Function IsMatch(ByVal element As Object) As Boolean
Dim pass As Boolean
Dim field As Variant
For Each field In criteria.Keys
pass = (CallByName(element, field, VbGet) = criteria(field))
If Not pass Then Exit For
Next field
IsMatch = pass
End Function
We could have that:
Public Function IsMatch(ByVal element As Object) As Boolean
Dim pass As Boolean
Dim condition As HtmlField
For Each condition In criteria
pass = (CallByName(element, condition.FieldName, VbGet) = condition.FieldValue)
If Not pass Then Exit For
Next
IsMatch = pass
End Function
And then criteria
can be a simple everyday Collection
instance, and whether you late-bind or reference the Scripting
library is moot as promised.
-
\$\begingroup\$ Can you provide an example usage of the new structure? My understanding is that
HtmlFeilds
would now have anParamArray
constructor that acceptsHtmlField
objects. \$\endgroup\$cheezsteak– cheezsteak2014年11月13日 14:16:29 +00:00Commented Nov 13, 2014 at 14:16
It seems the other answer didn't touch the Gator
class, so I'll focus on that.
' Required .dlls ' `Microsoft Internet Controls`
Well done right there. All too often there is no documentation about what references a piece of code needs to work. Someone will thank you for that someday without a doubt.
Public Property Let Visible(ByVal state As Boolean) pBrowser.Visible = state End Property
There's nothing wrong with this code, but it might be nice to expose a getter too. That way the client code could take action based on whether or not the browser is visible. There is such a thing as too much symmetry, but I definitely don't see any harm in exposing it. It always feels a little weird when you can set a property, but not inspect it.
' For constructors ' Friend Property Set Browser(ByRef other As InternetExplorer) Set pBrowser = other End Property Friend Property Get Browser() As InternetExplorer Set Browser = pBrowser End Property Public Function FromWindow(ByVal title As String) As Gator WaitFor "PopupExists", title, 30 Dim navi As New Gator Set navi.Browser = GetWindowFromTitle(title) Set FromWindow = navi End Function Public Function Home(ByVal homeURL As String) As Gator Dim navi As New Gator navi.UrlGo homeURL Set Home = navi End Function
I find this to be a beautiful use of the underutilized Friend
scope. You wouldn't want to expose the property outside of the project, and you don't. Fantastic!
location = Split(pBrowser.LocationURL, "?")(0)
That is smart and clean way to get a substring, but has potential to confuse a rookie. Consider adding a comment.
Public Sub Wait() While pBrowser.readyState <> READYSTATE_COMPLETE Or pBrowser.Busy DoEvents Wend End Sub
This is really picky, but I'd like to see a set of parenthesis here. It just clears up the intent.
While (pBrowser.readyState <> READYSTATE_COMPLETE) Or pBrowser.Busy
DoEvents
Wend
Public Sub WaitFor(ByVal existsName As String, ByVal arg0 As Variant, _ ByVal timeOut As Integer) Dim start As Single start = timer Do Until CallByName(Me, existsName, VbMethod, arg0) _ Or start + timeOut < timer DoEvents Loop If Not CallByName(Me, existsName, VbMethod, arg0) Then pBrowser.Visible = True Err.Raise GatorErrors.failed_el_get, "WaitFor", _ "Hey!, Listen! " & existsName & " failed to find requested element" End If End Sub
arg0
is a variant here because you need to pass an HTMLFields
object to it in one place. Again, consider a comment explaining why you declared it as a variant. Also, I don't much like the name arg0
, but I'm drawing a blank on something better at the moment.
The other thing I'm seeing here is that you raise the same error if the element is not found as you do if it simply times out. I would raise two different errors depending on the situation if you can. It makes for a friendlier API to do so.
Public Function IdExists(ByVal id As String) As Boolean On Error Resume Next Dim el As Object Set el = pBrowser.document.getelementbyID(id) IdExists = Not (el Is Nothing) End Function
You're a good programmer, so I won't question that you need to use On Error Resume Next
here, but you definitely should add a comment here explaining why you're doing it. As it is, I don't have to domain knowledge to understand why you're doing this (and neither will the person maintaining the code someday).
Private Function GetDoc() As Object Wait Set GetDoc = pBrowser.document End Function Private Function FieldSearch(ByRef fields As HtmlFields) As Object Dim element As Object For Each element In fields.GetTags(GetDoc) If fields.IsMatch(element) Then Set FieldSearch = element Exit Function End If Next element End Function
You're already aware that this is a linear and slow search method, but as you pointed out, it shouldn't affect overall performance much. What I wanted to mention is that there's a lot of Object
here. You're already referencing the library, so you should go ahead and strongly type these if you can.
' heavily refactored from StackExchange ' http://stackoverflow.com/questions/14446951/how-to-intercept-and-manipulate-a-internet-explorer-popup-with-vba ' You will need to reference `Microsoft Internet Controls` Private Function GetWindowFromTitle(ByVal sTitle As String, _ Optional ByVal bCaseSensitive As Boolean = False, _ Optional ByVal bExact As Boolean = False) As SHDocVw.InternetExplorer Dim objShellWindows As New SHDocVw.ShellWindows Dim window As SHDocVw.InternetExplorer For Each window In objShellWindows If hasWindowTitle(window, sTitle, bCaseSensitive, bExact) Then Set GetWindowFromTitle = window Exit Function End If Next Set GetWindowFromTitle = Nothing End Function
Thanks for the proper attribution!, but OP's hungarian is showing. Burn it. It mars your otherwise beautiful naming.
That's all I've got. This is pretty darn impeccable looking code as far as I see. Pretty cool project too in my opinion.
-
\$\begingroup\$
IdExists
should sayOn Error GoTo 0
immediately after the line that expects the error, to restore error handling. Otherwise it may be hiding problems, becauseOn Error Resume Next
turns off error handling altogether, even beyond the procedure's scope (until anotherOn Error
statement is encountered). \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年11月13日 17:58:36 +00:00Commented Nov 13, 2014 at 17:58 -
1\$\begingroup\$ @Mat'sMug I might just use
On Error GoTo Nope
as I usually do. It's more lines but it will be self documenting. \$\endgroup\$cheezsteak– cheezsteak2014年11月13日 18:04:08 +00:00Commented Nov 13, 2014 at 18:04 -
1\$\begingroup\$ @ptwales nevermind \$\endgroup\$Mathieu Guindon– Mathieu Guindon2014年11月13日 18:17:29 +00:00Commented Nov 13, 2014 at 18:17