Hello,
Let me elaborate on my question.
from seleniumbase import Driver
browser = Driver(uc=True, headless=True)
browser.default_get(url)  # Faster, but Selenium can be detected
I recently used this method to parse 4,000 URLs, which took 53,000 seconds with 10 threads (one per core). That seems like a lot: more than 14 hours per iteration, or about 120 seconds per URL on average.
Earlier I used:
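from selenium import webdriver
from selenium.webdriver import ChromeOptions
options = ChromeOptions()
options.add_argument("--headless=new")
options.page_load_strategy = "eager"  # return once the DOM is ready, without waiting for all resources
browser = webdriver.Chrome(options=options)
browser.set_page_load_timeout(10)  # seconds; the argument is positional (time_to_wait)
browser.get(url)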
That is pure Selenium, which takes roughly 20 times less time for a similar number of URLs, but opens fewer pages successfully.
Namely: 2,665 seconds for 4,545 URLs with 10-way parallelism, so about 6 seconds per URL.
However, I am flexible here in tuning the timeout and strategy, so I can trade success rate for speed and vice versa.
So, why is the UC method so much slower, and what exact page load strategy and timeout does it use? Could you advise on a parameter set for UC's default_get(url) to make it faster?
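The per-URL figures in the question can be checked with a few lines of arithmetic. This is only a sketch: the helper name thread_seconds_per_url is made up for illustration, and it assumes all 10 workers were busy for the whole run. Under that assumption the UC Mode run comes out slightly above the quoted 120 seconds per URL.

```python
def thread_seconds_per_url(wall_seconds, urls, workers):
    """Average worker time spent per URL, assuming every worker stays busy."""
    return wall_seconds * workers / urls


# Figures quoted in the question above.
uc_mode = thread_seconds_per_url(53_000, 4_000, 10)
plain = thread_seconds_per_url(2_665, 4_545, 10)

print(round(uc_mode, 1), round(plain, 1), round(uc_mode / plain, 1))
# 132.5 5.9 22.6
```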
There's a whole set of YouTube videos explaining how UC Mode and CDP Mode work:
Watch the 1st UC Mode tutorial on YouTube
Watch the 2nd UC Mode tutorial on YouTube
Watch the 3rd UC Mode tutorial on YouTube
Watch the 4th UC Mode tutorial on YouTube
For anything not covered in the videos, there's documentation on GitHub. And the code is open source.
-
Thank you! I have not watched the YouTube videos yet, but I have read quite a lot of the documentation. Now I have found the code that explains much more to me: Driver.
However, I could not find any mention of a parameter for a timeout or wait time (given that the waits are built in). Could you please tell me whether I can set timeouts while using UC Mode?
-
Many methods have an optional timeout arg.
See the docs:
-
I think I found what I wanted: settings
I will need to modify the global settings to override the default timeout values. I think I was getting exactly 120 seconds of page loading time by default.
I am happy now, thank you again for the product! Have a blessed day.
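For readers who would rather cap the page-load time per driver than edit global settings, here is a minimal sketch. It relies only on Selenium's standard set_page_load_timeout() and get() methods. The helper name visit_all is hypothetical, and it is an assumption (not confirmed in this thread) that a SeleniumBase Driver in UC Mode honors a timeout set this way.

```python
def visit_all(driver, urls, timeout_secs=10):
    """Try each URL with a page-load cap; map url -> True if it loaded in time."""
    # Standard Selenium call; assumed to behave the same on a SeleniumBase driver.
    driver.set_page_load_timeout(timeout_secs)
    results = {}
    for url in urls:
        try:
            driver.get(url)
            results[url] = True
        except Exception:
            # Broad on purpose: Selenium raises TimeoutException on a slow page,
            # and one bad URL should not stop a long crawl.
            results[url] = False
    return results


# Hypothetical usage (needs SeleniumBase and Chrome installed):
#   from seleniumbase import Driver
#   driver = Driver(uc=True, headless=True)
#   print(visit_all(driver, ["https://example.com"], timeout_secs=10))
#   driver.quit()
```

A shorter timeout trades success rate for speed, as noted in the question, so the value is worth tuning against a sample of the URLs.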