Closed
Assignees
@gmyers-aiq
Description
Zyte is a paid proxy service for web scraping. When you sign up for an account, the credentials they give you include a username but no password. Here's a code snippet I'm trying to run:
#!/usr/bin/env python
import os

from seleniumbase import SB
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

PROXY_STRING = f"{os.getenv('PROXY_USER')}:@api.zyte.com:8011"

with SB(uc=True, headless=True, proxy=PROXY_STRING) as sb:
    try:
        sb.driver.get("https://example.com/")
        html = sb.driver.page_source
        sb.driver.sleep(5)
        WebDriverWait(sb.driver, 5).until(
            EC.presence_of_element_located((By.TAG_NAME, "p"))
        )
        print(html)
    except TimeoutException as ex:
        print(ex)
When I set my PROXY_USER environment variable and then run this script, it fails with a TimeoutException. Here's a sample stack trace:
Message:
Stacktrace:
0 chromedriver 0x000000010274ee00 cxxbridge1$str$ptr + 2742224
1 chromedriver 0x0000000102746d00 cxxbridge1$str$ptr + 2709200
2 chromedriver 0x00000001022910b8 cxxbridge1$string$len + 90520
3 chromedriver 0x00000001022d85d8 cxxbridge1$string$len + 382648
4 chromedriver 0x0000000102319980 cxxbridge1$string$len + 649824
5 chromedriver 0x00000001022cc8f4 cxxbridge1$string$len + 334292
6 chromedriver 0x0000000102712478 cxxbridge1$str$ptr + 2494024
7 chromedriver 0x00000001027156a4 cxxbridge1$str$ptr + 2506868
8 chromedriver 0x00000001026f33b0 cxxbridge1$str$ptr + 2366848
9 chromedriver 0x0000000102715f4c cxxbridge1$str$ptr + 2509084
10 chromedriver 0x00000001026e44a8 cxxbridge1$str$ptr + 2305656
11 chromedriver 0x0000000102735644 cxxbridge1$str$ptr + 2637844
12 chromedriver 0x00000001027357d0 cxxbridge1$str$ptr + 2638240
13 chromedriver 0x000000010274694c cxxbridge1$str$ptr + 2708252
14 libsystem_pthread.dylib 0x0000000180b73c0c _pthread_start + 136
15 libsystem_pthread.dylib 0x0000000180b6eb80 thread_start + 8
If I turn off uc=True, it becomes more apparent what's actually happening, as seen in this screenshot:
As you can see, even though I included the credentials in the proxy string (per the documentation), they were not correctly passed to the Chromedriver, because Chrome is asking for them to be provided manually in a dialog box.
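To be explicit about how I expect the proxy string to be read, here is a minimal sketch of the split I have in mind. It assumes the user:pass@host:port format with the port always present. The helper name split_proxy_string and the my_api_key value are placeholders of mine, and this is not SeleniumBase's actual parsing code, which I haven't checked. The point is only that an empty password is still a password, so the credentials should be forwarded rather than dropped.

```python
def split_proxy_string(proxy_string):
    """Split 'user:pass@host:port' into (user, password, host, port).

    Hypothetical helper for illustration only; not SeleniumBase's parser.
    Assumes the server part always has the form host:port.
    """
    # Split on the last '@' so the server part is always host:port.
    creds, at, server = proxy_string.rpartition("@")
    host, _, port = server.rpartition(":")
    if not at:
        # No '@' at all means no credentials were supplied.
        return None, None, host, int(port)
    # Split on the first ':' so an empty password comes back as "".
    user, _, password = creds.partition(":")
    return user, password, host, int(port)


# "my_api_key" is a placeholder for the username Zyte issues.
user, password, host, port = split_proxy_string("my_api_key:@api.zyte.com:8011")
print(repr(user), repr(password), host, port)
```

With this split, the password comes back as an empty string rather than None, which lets the caller tell "blank password" apart from "no credentials given".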