Sb_cdp with proxy / sb.get() not responding after some navigation #3943
-
Hello,
Sorry for my previous post. I thought I was doing the right thing by shortening my code as much as possible to make the problem easier to reproduce, but I must have given the impression that I was trying to attack or disrupt a single website with 100 requests. That is obviously not the purpose of my original code.
I'm currently developing a component, behind an API, that receives a URL with a scenario (clicking a button, filling in a field, scrolling, etc...) and returns the page content after executing the actions. The URLs received are always different (normally). I use a proxy because the machine running the docker container does not have direct access to the internet, for security reasons.
Here is code that should let you reproduce the problem:
from seleniumbase import sb_cdp
proxy = "XXX@XXX:22225"
sb = sb_cdp.Chrome(proxy=proxy)
urls = ["url1", "url2", "url3", ...]
for url in urls:
    print(f"URL : {url}")
    sb.get(url)
    print("DONE")
When I call sb.get() repeatedly, the function stops responding after a certain number of calls.
My logs hang like this:
URL : xxx
DONE
URL : xxx
DONE
URL : xxx
(stuck here indefinitely)
To try to understand, I installed a VNC server inside the Docker container so that I could view what was happening on Google Chrome: I can still navigate in the browser window and change websites. The browser isn't crashing, but I have a feeling the communication with the script is broken.
With the same code and no proxy, I don't have any problem.
Also, in case it helps, I don't have the problem with SB either, used like this:
from seleniumbase import SB
with SB(uc=True, proxy=proxy) as sb:
    sb.activate_cdp_mode()
    for url in urls:
        print(f"URL : {url}")
        sb.get(url)
        print("DONE")
Could there be a bug in sb_cdp when using a proxy? Is there a way around it?
Thank you in advance for your help.
If anything else is not suitable, please let me know.
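To narrow down which navigation hangs, each call can be run under a timeout so the loop reports the stuck URL instead of blocking forever. The following is a minimal sketch, not part of SeleniumBase: call_with_timeout is a hypothetical helper, and it only detects the hang (the stuck call keeps running in a daemon thread, so the browser session would still need to be recreated afterwards).

```python
import threading


def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func(*args, **kwargs) in a daemon thread; raise TimeoutError if it
    has not finished after timeout_s seconds. Hypothetical helper."""
    outcome = {}

    def runner():
        try:
            outcome["value"] = func(*args, **kwargs)
        except BaseException as exc:  # hand any error back to the caller
            outcome["error"] = exc

    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout_s)
    if worker.is_alive():
        # The call is still blocked; it cannot be cancelled from here.
        raise TimeoutError(f"call did not finish within {timeout_s} seconds")
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

In the loop above, this would be used as call_with_timeout(sb.get, 60, url), where the 60-second limit is an arbitrary choice.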
-
Due to the removal of the --load-extension command-line switch in Google Chrome 137, if you have issues using a proxy in pure CDP Mode (via sb_cdp), then you may need to switch to UC Mode + CDP Mode. There's some basic proxy functionality, but that might not work in extreme situations. Also, running in Docker is detectable, unless you know something I don't.
-
Perfect, thanks for your help!
Instead of:
sb = sb_cdp.Chrome(proxy=proxy, window_size="1920,1080", headed=True)
I now do:
driver = Driver(uc=True, proxy=proxy, window_size="1920,1080", headed=True)
driver.uc_activate_cdp_mode()
And that seems to fix the problem.
However, it seems to create a new screen for each driver, in addition to the screen I created myself upfront:
from pyvirtualdisplay import Display
display = Display(size=(1920, screen_size_y))
display.start()
For example, if I instantiate 2 drivers:
root 9 1 1 09:44 ? 00:00:00 Xvfb -br -nolisten tcp -screen 0 1920x3240x24 -displayfd 12
root 477 1 0 09:44 ? 00:00:00 Xvfb -br -nolisten tcp -screen 0 1366x768x24 :1100
root 478 1 0 09:44 ? 00:00:00 Xvfb -br -nolisten tcp -screen 0 1366x768x24 :1101
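To compare the screens at a glance, the geometry and display number can be pulled out of each ps line. This is only a small sketch for reading output in the format shown above; parse_xvfb_line is a hypothetical helper name, and it assumes the -screen option and an optional :N display argument appear as they do in these lines.

```python
import re

# "-screen 0 1920x3240x24" -> screen number, width, height, depth
SCREEN_RE = re.compile(r"-screen\s+(\d+)\s+(\d+)x(\d+)x(\d+)")
# A standalone ":1100" token; times such as "09:44" are not matched
DISPLAY_RE = re.compile(r"(?<!\S)(:\d+)(?!\S)")


def parse_xvfb_line(line):
    """Return geometry and display name for one Xvfb `ps` line, else None."""
    if "Xvfb" not in line:
        return None
    screen = SCREEN_RE.search(line)
    if screen is None:
        return None
    display = DISPLAY_RE.search(line)
    return {
        "width": int(screen.group(2)),
        "height": int(screen.group(3)),
        "depth": int(screen.group(4)),
        # None when Xvfb was started with -displayfd instead of a fixed :N
        "display": display.group(1) if display else None,
    }
```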
With sb_cdp, I only had to pass the headed=True parameter to create the windows on the existing screen.
Maybe I missed something?
Regarding Docker, I haven't yet encountered any sites that detect it, but most of the sites I visit don't have very sophisticated anti-bots. I'll let you know if I notice a difference on a website between a native execution (under Windows or WSL) and an execution under Docker, and if I find a solution to hide the fingerprint.
-
If you're running on Linux, you probably want to use the SB() format, as it already includes the Xvfb virtual display code.
uc_activate_cdp_mode() should only be called once per driver. After that, just navigate to a new URL with open() when you need to go to a new URL.
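The pattern described here (activate once, then only navigate) could be sketched as follows. This is an illustration under assumptions, not the library's documented recipe: visit_all and make_driver are hypothetical helpers, and driver.open() is assumed to be the open() meant above.

```python
def make_driver(proxy):
    """Create a UC Mode driver (hypothetical helper; needs seleniumbase)."""
    from seleniumbase import Driver  # imported lazily, only when a real driver is wanted

    return Driver(uc=True, proxy=proxy)


def visit_all(driver, urls):
    """Activate CDP Mode a single time, then reuse the driver for every URL."""
    driver.uc_activate_cdp_mode()  # once per driver, never inside the loop
    for url in urls:
        print(f"URL : {url}")
        driver.open(url)  # assumption: open() handles navigation after activation
        print("DONE")
```

Usage would look like visit_all(make_driver(proxy), urls), with one driver kept alive for the whole list.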
-
Yes, I started with SB, but we realized that the object only works as a context manager (with), and since I wanted to run multiple instances in the background, we followed these examples:
https://github.com/seleniumbase/SeleniumBase/blob/master/examples/cdp_mode/raw_multi_cdp.py
https://github.com/seleniumbase/SeleniumBase/blob/master/examples/cdp_mode/raw_driver.py
Almost everything is perfect now, but I still have one last problem: when I instantiate a Driver(), I can't use the existing Xvfb screen, as I can with SB or sb_cdp.Chrome().
However, I found that modifying the core/browser_launcher.py file and setting "headed = True" on line 622 fixed the problem.
So, using the driver like this:
driver = Driver(uc=True, proxy=proxy, window_size="1920,1080", headed=True)
driver.uc_activate_cdp_mode()
Is there a proper way to set the "headed" variable in the sb_config configuration to True?
I did:
from seleniumbase import config as sb_config
sb_config.headed = True
I don't know if it's clean, but it works.