Unable to Fetch XHR Response Body with CDP #2731

Answered by mdmintz
AnirbanPatragithub asked this question in Q&A

I tried to combine Wire Mode and UC Mode, but they are incompatible. So I'm exploring CDP as a way to retrieve XHR responses, first in a standard Selenium WebDriver setup and later in SeleniumBase. I'm also writing a log.txt file with verbose driver output. My goal is to find the string 'Learn More' in the output, but it is not present in logs_raw. Here's the code. @mdmintz

from selenium import webdriver
import json
import time

options = webdriver.ChromeOptions()
service = webdriver.ChromeService(service_args=["--verbose", "--log-path=log.txt"])
url = 'https://www.facebook.com/ads/library/?id=2567767530063004'
# url = 'https://weatherstack.com/'  # <-- this url works as expected
options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
driver = webdriver.Chrome(options=options, service=service)
driver.implicitly_wait(15)
time.sleep(5)
driver.get(url)
time.sleep(30)

# extract requests from logs
logs_raw = driver.get_log("performance")
logs = [json.loads(lr["message"])["message"] for lr in logs_raw]


def log_filter(log_):
    return (
        # is an actual response
        log_["method"] == "Network.responseReceived"
        # and json
        and "json" in log_["params"]["response"]["mimeType"]
    )


for log in filter(log_filter, logs):
    request_id = log["params"]["requestId"]
    resp_url = log["params"]["response"]["url"]
    print(f"Caught {resp_url}")
    print(driver.execute_cdp_cmd("Network.getResponseBody", {"requestId": request_id}))

Replies: 4 comments 11 replies

Examples of fetching responses via CDP:

from rich.pretty import pprint
from seleniumbase import Driver

driver = Driver(uc=True, log_cdp=True)
try:
    url = "seleniumbase.io/apps/turnstile"
    driver.uc_open_with_reconnect(url, 2)
    driver.switch_to_frame("iframe")
    driver.uc_click("span.mark")
    driver.sleep(3)
    pprint(driver.get_log("performance"))
finally:
    driver.quit()

from rich.pretty import pprint
from seleniumbase import BaseCase
BaseCase.main(__name__, __file__, "--uc", "--uc-cdp", "-s")


class CDPTests(BaseCase):
    def add_cdp_listener(self):
        # (To print everything, use "*". Otherwise select specific headers.)
        # self.driver.add_cdp_listener("*", lambda data: pprint(data))
        self.driver.add_cdp_listener(
            "Network.requestWillBeSentExtraInfo",
            lambda data: pprint(data)
        )

    def click_turnstile_and_verify(sb):
        sb.switch_to_frame("iframe")
        sb.driver.uc_click("span.mark")
        sb.assert_element("img#captcha-success", timeout=3)
        sb.highlight("img#captcha-success", loops=8)

    def test_display_cdp_events(self):
        if not (self.undetectable and self.uc_cdp_events):
            self.get_new_driver(undetectable=True, uc_cdp_events=True)
        url = "seleniumbase.io/apps/turnstile"
        self.driver.uc_open_with_reconnect(url, 2)
        self.add_cdp_listener()
        self.click_turnstile_and_verify()
        self.sleep(1)
        self.refresh()
        self.sleep(0.5)

If you don't need UC Mode, you can use Wire Mode: #2145

I have already tried both approaches with SeleniumBase.
[Image: IMG_20240430_111119_370]

I am not getting any output.
Thanks a lot for the help.

Follow the examples. You may need to use driver.refresh() to get the logs from driver.get_log("performance"), especially if you just called a method that disconnects the driver, such as driver.uc_open_with_reconnect().

This generates lots of logs from the WeatherStack website:

from rich.pretty import pprint
from seleniumbase import Driver

driver = Driver(uc=True, log_cdp=True)
try:
    url = "weatherstack.com"
    driver.uc_open_with_reconnect(url, 2)
    driver.refresh()
    pprint(driver.get_log("performance"))
finally:
    driver.quit()
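
Since the performance log gets large, it can help to reduce it to just the request IDs you care about before asking CDP for bodies. Below is a rough sketch, not part of SeleniumBase: the helper name `json_response_ids` and the sample entries are made up for illustration. It assumes each entry from `driver.get_log("performance")` carries a JSON string under `"message"`, as in the question's code, and it keeps only JSON responses that also have a `Network.loadingFinished` event, on the assumption that a body is more likely to be retrievable once loading has finished.

```python
import json


def json_response_ids(perf_entries):
    """Return (request_id, url) pairs for JSON responses that finished loading."""
    received = {}     # requestId -> response URL
    finished = set()  # requestIds that also reported Network.loadingFinished
    for entry in perf_entries:
        message = json.loads(entry["message"])["message"]
        method = message.get("method")
        params = message.get("params", {})
        if method == "Network.responseReceived":
            response = params.get("response", {})
            if "json" in response.get("mimeType", ""):
                received[params["requestId"]] = response.get("url", "")
        elif method == "Network.loadingFinished":
            finished.add(params["requestId"])
    return [(rid, url) for rid, url in received.items() if rid in finished]


def _entry(method, params):
    # Build one hand-written entry in the shape of a performance log record.
    return {"message": json.dumps({"message": {"method": method, "params": params}})}


# Illustrative sample data only; real entries come from driver.get_log("performance").
sample = [
    _entry("Network.responseReceived", {
        "requestId": "1.1",
        "response": {"url": "https://example.com/api", "mimeType": "application/json"},
    }),
    _entry("Network.loadingFinished", {"requestId": "1.1"}),
    _entry("Network.responseReceived", {
        "requestId": "1.2",
        "response": {"url": "https://example.com/a.png", "mimeType": "image/png"},
    }),
]
print(json_response_ids(sample))
```

Each returned request ID could then be passed to `Network.getResponseBody`, as in the original question.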

from rich.pretty import pprint
from seleniumbase import Driver
import time

url = "https://www.facebook.com/ads/library/?id=2567767530063004"
driver = Driver(uc=True, log_cdp=True)
try:
    # url = "weatherstack.com"
    driver.uc_open_with_reconnect(url, 2)
    driver.refresh()
    time.sleep(10)
    log = driver.get_log("performance")
    pprint(log)
    with open('Adlog.txt', 'w') as f:
        f.write(str(log))
finally:
    driver.quit()

I am trying to scrape the ad data from this URL, but the string 'Learn more' is missing from Adlog.txt. Most probably the data is logged as bytes: when I intercepted the same response with plain selenium-wire, the body arrived as bytes that had to be converted to a string. The 'content-encoding' used is either 'br' or, most likely, 'zstd'.

[Image]

from seleniumwire.utils import decode
body = decode(byte_data, 'zstd')

I can't decode the byte data to a string. Any help is appreciated.
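
For the case where you really do hold the compressed bytes (for example from a proxy), here is a minimal sketch of decoding a body according to its content-encoding header. The function name `decode_body` is hypothetical. The `gzip` and `deflate` branches use only the standard library; the `br` and `zstd` branches assume the third-party `brotli` and `zstandard` packages are installed, and they are not exercised here.

```python
import gzip
import zlib


def decode_body(raw: bytes, content_encoding: str = "", charset: str = "utf-8") -> str:
    """Decompress raw response bytes by content-encoding, then decode to text."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "gzip":
        raw = gzip.decompress(raw)
    elif encoding == "deflate":
        try:
            raw = zlib.decompress(raw)  # zlib-wrapped deflate (the usual case)
        except zlib.error:
            raw = zlib.decompress(raw, -zlib.MAX_WBITS)  # raw deflate fallback
    elif encoding == "br":
        import brotli  # assumption: third-party "brotli" package is installed
        raw = brotli.decompress(raw)
    elif encoding == "zstd":
        import zstandard  # assumption: third-party "zstandard" package is installed
        raw = zstandard.ZstdDecompressor().decompressobj().decompress(raw)
    elif encoding != "identity":
        raise ValueError(f"Unsupported content-encoding: {content_encoding!r}")
    return raw.decode(charset, errors="replace")


payload = b'{"cta": "Learn more"}'
print(decode_body(gzip.compress(payload), "gzip"))
```

If the decoded text still looks like noise, the encoding header probably does not match the bytes you captured, so it is worth checking the response headers for that exact request.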

It looks like you're trying to ask a selenium-wire question, but this is the SeleniumBase repo.
selenium-wire questions should be asked in their repo: https://github.com/wkeeling/selenium-wire

driver = Driver(uc=True, log_cdp=True)
try:
    url = "https://www.facebook.com/ads/library/?id=2567767530063004"
    driver.uc_open_with_reconnect(url, 2)
    driver.refresh()
    time.sleep(10)
    log = driver.get_log("performance")
except:
    print('Error')

Is there a way to decode bytes data received in XHR response in SeleniumBase?

Paste fully-coded scripts when showing examples, like this:

from rich.pretty import pprint
from seleniumbase import Driver

driver = Driver(uc=True, log_cdp=True)
try:
    url = "https://www.facebook.com/ads/library/?id=2567767530063004"
    driver.uc_open_with_reconnect(url, 2)
    driver.refresh()
    driver.sleep(3)
    pprint(driver.get_log("performance"))
finally:
    driver.quit()

"Is there a way to decode bytes data received in XHR response in SeleniumBase?"

Check StackOverflow. Something like that definitely falls outside of SeleniumBase's scope.
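
For readers landing here later: the CDP command `Network.getResponseBody` returns a result with a `body` string and a `base64Encoded` flag. As far as I can tell, Chrome hands the body back after undoing the content-encoding, so the remaining step is usually just handling the base64 case. A minimal sketch, where `body_to_text` is a hypothetical helper and the two sample results are hand-written:

```python
import base64


def body_to_text(result: dict, charset: str = "utf-8") -> str:
    """Turn a Network.getResponseBody result into text."""
    body = result.get("body", "")
    if result.get("base64Encoded"):
        # Binary-safe bodies arrive base64-encoded; decode them before reading.
        return base64.b64decode(body).decode(charset, errors="replace")
    return body


# Hand-written results in the shape of a Network.getResponseBody reply.
plain = {"body": '{"cta": "Learn more"}', "base64Encoded": False}
encoded = {
    "body": base64.b64encode(b'{"cta": "Learn more"}').decode("ascii"),
    "base64Encoded": True,
}
print(body_to_text(plain))
print(body_to_text(encoded))
```

With the question's code, the result of `driver.execute_cdp_cmd("Network.getResponseBody", {"requestId": request_id})` would be the dict passed in.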

I will paste fully coded scripts in the future.
Thanks for the help.
Are there any plans to add functionality to SeleniumBase that decodes the bytes data received in an XHR response?

@AnirbanPatragithub did you find a way to get the response body? I can't figure out how to do it.

SeleniumBase won't be able to do it. You can try selenium-wire or nodriver.

There's a SeleniumBase example that does it: SeleniumBase/examples/cdp_mode/raw_xhr_sb.py

"There's a SeleniumBase example that does it: SeleniumBase/examples/cdp_mode/raw_xhr_sb.py"

I checked it, and it's working for me. Unfortunately, I don’t fully understand the code. It doesn’t seem like undetected Selenium is enabled, though. Kudos to you! I wonder if the implementation could be simplified by using nodriver instead.

@mdmintz, are you planning to build these listen-and-receive XHR functions into SeleniumBase in the future, or will they always be separate functions that you add yourself when using UC Mode?

That uses the CDP async API, so modifications must be done in an async way.
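
To illustrate what "in an async way" means in practice, here is a plain asyncio sketch of the listen-then-fetch pattern: an async handler records XHR request IDs as events arrive, and the bodies are awaited afterwards. Everything here is hypothetical: `FakeTab`, its methods, and the event dicts are stand-ins written for this sketch, not SeleniumBase or CDP classes, so treat it as the shape of the idea rather than the example's actual code.

```python
import asyncio


class FakeTab:
    """Hypothetical stand-in for a CDP tab; not a real SeleniumBase class."""

    def __init__(self):
        self._handlers = []
        self._bodies = {"1.1": '{"cta": "Learn more"}'}

    def add_handler(self, handler):
        self._handlers.append(handler)

    async def emit(self, event):
        # Deliver one fake "response received" event to every handler.
        for handler in self._handlers:
            await handler(event)

    async def get_response_body(self, request_id):
        await asyncio.sleep(0)  # pretend to wait on the browser
        return self._bodies[request_id]


async def main():
    tab = FakeTab()
    seen = []  # (url, request_id) pairs collected by the listener

    async def on_response(event):
        # The handler is a coroutine, so it must be awaited by the caller.
        if event["type"] == "XHR":
            seen.append((event["url"], event["request_id"]))

    tab.add_handler(on_response)
    await tab.emit({"type": "XHR", "url": "https://example.com/api", "request_id": "1.1"})
    await tab.emit({"type": "Image", "url": "https://example.com/a.png", "request_id": "1.2"})

    results = []
    for url, request_id in seen:
        results.append((url, await tab.get_response_body(request_id)))
    return results


results = asyncio.run(main())
print(results)
```

The point is that both the listener and the body fetch are coroutines, so any custom filtering or decoding has to live inside async functions too.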
