Automating screenshot capture of phishing websites #3694
Hi Michael,
I work at a CERT in Australia and am currently trying to automate one of the services we offer our members using SeleniumBase. Our members will submit phishing takedown requests which we analyse, and if deemed to be a phishing site, we will submit a takedown request to the relevant registrar.
As part of the analysis and takedown submission, we gather screenshots of the phishing website as evidence. This is where we were hoping to use SeleniumBase to automate this process :)
We receive numerous phishing takedown submissions daily, and any given site may or may not be behind a captcha (the captchas themselves can vary).
My question is: do you think this screenshot capture process can be automated using SeleniumBase, given that we are not accessing the same site each time? If we were to host something on AWS, would we need some sort of rotating proxy to avoid bot detection?
Any suggestions would be greatly appreciated :)
Thanks,
Dan
Replies: 1 comment
Hi Dan,
The screenshot part is rather easy, e.g.:
    from seleniumbase import SB

    with SB(uc=True, test=True) as sb:
        url = "https://www.selenium.dev/ecosystem/"
        sb.activate_cdp_mode(url)
        sb.sleep(1)
        sb.save_screenshot("image.png", selector="body")
As for the anti-bot bypass and CAPTCHA-solving, that depends on the CAPTCHA / anti-bot system, as those are not uniformly made. The examples in SeleniumBase/examples/cdp_mode demonstrate various ways of bypassing the anti-bot tech. Cloudflare Turnstiles are among the easiest CAPTCHAs to bypass (they appear in several examples).
You'll likely need to use a residential proxy, since sites would block you for coming from a non-residential IP address range such as AWS. There's a proxy arg that you can set to change your proxy settings. If you have your own data center (or local machines to use), then you might not need residential proxies. GitHub Actions works well too. The CDP Mode ReadMe can help you get started with creating stealthy scripts.
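To connect the screenshot call and the proxy arg to the batch scenario in the question, here is a rough sketch of looping over submitted URLs. The names `screenshot_path`, `capture_all`, and the `evidence` folder are hypothetical, invented for illustration. It assumes a fresh browser session per URL, a fixed 2-second wait, and a proxy string in `SERVER:PORT` or `USER:PASS@SERVER:PORT` form. It does not attempt any CAPTCHA handling, which would need per-site logic along the lines of the cdp_mode examples.

```python
import hashlib
import re
from pathlib import Path
from urllib.parse import urlparse


def screenshot_path(url, out_dir="evidence"):
    """Hypothetical helper: build a unique, filesystem-safe PNG path for a URL."""
    host = urlparse(url).hostname or "unknown-host"
    # Replace characters that are awkward in filenames (e.g. ":" in IPv6 hosts).
    safe_host = re.sub(r"[^A-Za-z0-9.-]", "_", host)
    # A short hash of the full URL keeps two pages on the same host apart.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:12]
    return Path(out_dir) / f"{safe_host}_{digest}.png"


def capture_all(urls, proxy=None, out_dir="evidence"):
    """Hypothetical helper: screenshot each URL, returning {url: Path or None}."""
    # Imported lazily so screenshot_path() is usable without a browser installed.
    from seleniumbase import SB

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    results = {}
    for url in urls:
        target = screenshot_path(url, out_dir)
        try:
            # One session per URL, so a site that hangs or crashes the
            # browser does not take the rest of the batch down with it.
            with SB(uc=True, test=True, proxy=proxy) as sb:
                sb.activate_cdp_mode(url)
                sb.sleep(2)  # assumed wait; slow pages may need longer
                sb.save_screenshot(target.name, folder=out_dir, selector="body")
            results[url] = target
        except Exception as exc:
            # Record the failure and move on to the next submission.
            print(f"capture failed for {url}: {exc}")
            results[url] = None
    return results
```

Calling `capture_all(submitted_urls, proxy="USER:PASS@SERVER:PORT")` with real proxy details would then leave one PNG per reachable site in the `evidence` folder, with `None` recorded for any URL that could not be captured.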