-
Hi Folks,
I only recently stumbled upon this project and I'm trying to understand a bit more about how it works in the background. I've read through the code, but my Python knowledge isn't great and the project is so large that I'm struggling to find my bearings (I come from more of a Java background). I'm particularly interested in the subprocess handling and the CDP commands.
I understand that starting Chrome in a separate process reduces the chances of detection, but how is Python then communicating with the browser? Is it via remote debugging, or through more generic CDP commands? I assumed chromedriver wouldn't be able to communicate with a Chrome window that was opened externally, so I figured it was either CDP or a remote WebDriver. Or am I misunderstanding?
If not, is there any benefit in not enabling remote debugging in the browser until after the page load, i.e. sending a CDP command to activate remote debugging after the fact? (I mainly write Java, so apologies if I've massively misunderstood the setup. I'm mainly looking for advice on whether my understanding is correct.)
-
Hello! Have you seen the new UC Mode video showing how it works, and how it fixes bugs with undetected-chromedriver?
https://www.youtube.com/watch?v=5dMFI3e85ig
Many of your questions will be answered there. It dives into specifics.
The remote-debugging-host and remote-debugging-port are key components of the chromedriver/Chrome connection.
-
Thanks Michael,
The specific part I'm struggling with is pretty much one sentence. You launch Chrome in a new subprocess, which makes sense (great idea, btw).
Then you "attach chromedriver to it". How is that attaching done? Is it done via remote-debugging-host and port, or some other method that I'm unaware of?
-
It's a 3-step process.
1. Add the remote-debugging-port to your options.
2. Launch Chrome using the options you set above.
3. Launch chromedriver using the options you set above.
If you did everything correctly, the chromedriver you launched in Step 3 should be able to control the Chrome browser you launched in Step 2.
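The code snippets that originally accompanied these steps are not part of this thread's text, so here is a minimal sketch of the same idea using plain Selenium rather than SeleniumBase's internals. It assumes a local Chrome binary and uses Selenium's debugger_address option to make chromedriver attach to a running browser. The helper names, the Chrome path, and the profile directory are all hypothetical.

```python
import socket
import subprocess
import time


def find_free_port() -> int:
    """Ask the OS for an unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def build_chrome_command(chrome_path: str, port: int, user_data_dir: str) -> list[str]:
    """Steps 1-2: command line for a Chrome that listens for debugger connections."""
    return [
        chrome_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={user_data_dir}",
    ]


def wait_for_port(port: int, timeout: float = 10.0) -> bool:
    """Poll until something accepts connections on the port, or give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection(("127.0.0.1", port), timeout=0.5):
                return True
        except OSError:
            time.sleep(0.1)
    return False


def attach_chromedriver(port: int):
    """Step 3: start chromedriver and point it at the already-running Chrome."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    # With debugger_address set, chromedriver attaches instead of launching a browser.
    options.debugger_address = f"127.0.0.1:{port}"
    return webdriver.Chrome(options=options)


def demo(chrome_path: str = "/usr/bin/google-chrome") -> None:
    """Run all three steps; the path and profile directory are placeholders."""
    port = find_free_port()
    chrome = subprocess.Popen(build_chrome_command(chrome_path, port, "/tmp/uc-profile"))
    try:
        if not wait_for_port(port):
            raise RuntimeError("Chrome did not open its debugging port in time")
        driver = attach_chromedriver(port)
        driver.get("https://example.com")
        print(driver.title)
        driver.quit()
    finally:
        chrome.terminate()
```

Calling demo() with a valid Chrome path runs the three steps end to end. The port check between steps 2 and 3 is an extra precaution in this sketch, not something the answer above prescribes.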
-
Thanks Michael,
That makes sense!
-
How can I start SB with a specific port for the browser and the driver?
I know there is a parameter for multithreading, sys.argv.append("-n"), that searches for a free port.
But I would like to define the port myself for each instance I start.
-
chromium_arg="remote-debugging-port=PORT"
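As a rough sketch of how that argument could be used with one fixed port per worker: the helper names and the base port 9222 below are assumptions for illustration, and the SB call reuses the uc=True form quoted elsewhere in this thread.

```python
def port_for_worker(index: int, base_port: int = 9222) -> int:
    """Give each worker its own port: base_port, base_port + 1, ... (hypothetical scheme)."""
    return base_port + index


def chromium_arg_for_port(port: int) -> str:
    """Build the chromium_arg value in the form given in the answer above."""
    return f"remote-debugging-port={port}"


def run_instance(index: int, url: str = "https://example.com") -> str:
    """Open one browser on the worker's port and return the page title."""
    # Imported lazily so the helpers above work without SeleniumBase installed.
    from seleniumbase import SB

    port = port_for_worker(index)
    with SB(uc=True, chromium_arg=chromium_arg_for_port(port)) as sb:
        sb.open(url)
        return sb.get_title()
```

With five workers, run_instance(0) through run_instance(4) would use ports 9222 to 9226, provided nothing else on the machine is already using them.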
-
Thanks for the answer.
I have another question. I made a script using multiprocessing and a queue to automate 400 different profiles/data_dir_base running 5 threads at the same time. After passing profile 150 the system starts to slow down. After 250 I have to restart the computer because everything slows down. But the CPU and RAM consumption is low, neither exceeding 40%.
with SB(uc=True, user_data_dir=self.user_data_dir, extension_dir=self.extension_path) as sb:
-
There may be options you can add to chromium_arg in order to reduce Chrome's memory usage: https://superuser.com/questions/952302/how-to-make-google-chrome-or-chromium-use-less-memory
But you could also manually run commands to terminate processes that are hogging memory.
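One possible way to script that cleanup between batches is sketched below. It shells out to taskkill on Windows and pkill elsewhere. The function names and the default process names are assumptions, and the exact process names vary by platform. Note that it ends every matching process on the machine, including browsers still in use, so it only makes sense when no workers are running.

```python
import subprocess
import sys


def kill_command(image_name: str, platform: str = sys.platform) -> list[str]:
    """Return the OS command that terminates all processes with the given name."""
    if platform.startswith("win"):
        # /F forces termination, /T also ends child processes, /IM matches by image name.
        return ["taskkill", "/F", "/T", "/IM", f"{image_name}.exe"]
    # pkill matches the pattern against process names.
    return ["pkill", image_name]


def kill_leftovers(names: tuple[str, ...] = ("chromedriver", "chrome")) -> None:
    """Terminate leftover driver and browser processes; ignore 'nothing found' errors."""
    for name in names:
        subprocess.run(kill_command(name), check=False, capture_output=True)
```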
-
Setting the remote-debugging-port improved things a bit. Then I used LatencyMon and identified a bottleneck in tcpip.sys!
After a quick search I found someone recommending disabling network traffic monitoring in Windows (the feature that counts how many GB you've used on the network), and that improved things a lot.
I did this by changing the "Start" value to 4 under HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Services\Ndu\, which disables the Windows Network Data Usage Monitor Driver.
Now LatencyMon is showing a bottleneck in the NVIDIA driver, nvlddmkm.sys.
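For anyone who wants to script the registry change described above, here is a minimal sketch that builds the equivalent reg add command. The function names are hypothetical, the command has to run from an elevated (administrator) prompt, and it is worth exporting the key first so the original value can be restored.

```python
import subprocess
import sys

# Same key as in the post above, using reg.exe's HKLM abbreviation.
NDU_KEY = r"HKLM\SYSTEM\ControlSet001\Services\Ndu"


def ndu_start_command(start_value: int = 4) -> list[str]:
    """Build a 'reg add' command that sets the Ndu service's Start value (4 = disabled)."""
    return [
        "reg", "add", NDU_KEY,
        "/v", "Start",
        "/t", "REG_DWORD",
        "/d", str(start_value),
        "/f",  # overwrite the existing value without prompting
    ]


def disable_ndu() -> None:
    """Apply the change; only meaningful on Windows with administrator rights."""
    if not sys.platform.startswith("win"):
        raise RuntimeError("This registry change only applies to Windows")
    subprocess.run(ndu_start_command(4), check=True)
```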