-3

This is about scraping in general, not any particular language. Some sites let you see the JSON response if you request it directly from a web browser, but as soon as a program makes the same request, the site declines it and asks for an API. How does the site know the difference between a request from a user and, say, a request from a CLI tool or Selenium?

asked Feb 5, 2021 at 18:09
1

1 Answer

4

In general, websites look for anomalous traffic patterns. Cookies that don't match up, requests made in the wrong order, that sort of thing.
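To make that concrete, here is a rough sketch of the kind of server-side check a site might run. The function name, the `session` cookie name, and the specific rules are assumptions for illustration only; real sites use their own, usually more elaborate, signals.

```python
# Hypothetical server-side heuristic. The header names are standard HTTP
# headers, but the rules and the "session" cookie name are assumptions.
def suspicious_signals(headers, issued_session_id, landing_page_seen):
    """Return a list of reasons a request looks automated (empty if none)."""
    reasons = []
    lowered = {name.lower(): value for name, value in headers.items()}

    # Many HTTP libraries announce themselves in the User-Agent by default.
    user_agent = lowered.get("user-agent", "").lower()
    if not user_agent or any(tool in user_agent for tool in ("curl", "python", "wget")):
        reasons.append("missing or tool-like User-Agent")

    # Browsers normally send Accept-Language; bare scripts often do not.
    if "accept-language" not in lowered:
        reasons.append("no Accept-Language header")

    # The cookie should match the session the server handed out earlier.
    cookie = lowered.get("cookie", "")
    if issued_session_id is None or f"session={issued_session_id}" not in cookie:
        reasons.append("session cookie missing or mismatched")

    # A browser loads the page first, then the JSON endpoint the page calls.
    if not landing_page_seen:
        reasons.append("data endpoint requested before the page that links to it")

    return reasons
```

A bare script request would trip several of these at once, while a request that carries the browser's headers and the issued cookie would trip none.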

If a website is keenly interested in this distinction, it can check whether you're a human user by presenting a challenge that is hard for automated programs to negotiate, like a CAPTCHA.

There are other techniques they can employ to limit "bots," like rate limiting.
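As an illustration of rate limiting, a token bucket is one common approach. This is a minimal sketch, not what any particular site runs; the class name, the parameters, and the injectable `clock` are assumptions made for the example.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens/second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock             # injectable so the logic can be tested
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would typically keep one bucket per client (per IP address or per session) and answer with HTTP 429 when `allow()` returns False. A scraper that fires requests much faster than a person clicks empties its bucket quickly.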

Beyond that, some websites look at things like the user agent string and session tokens, which a savvy scraper can easily defeat. Generally, what you want to do is capture the traffic using Wireshark or Fiddler and mimic the requests that the web browser produces. Selenium doesn't do that out of the box.
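Mimicking the browser mostly means sending the same headers and cookies it sends. Below is a minimal sketch using only the Python standard library. The function name is hypothetical and every header value is a placeholder; in practice you would copy the real values from the traffic you captured.

```python
import urllib.request


def build_browser_like_request(url, cookies, referer=None):
    """Build a request carrying browser-style headers and the given cookies.

    All header values here are placeholders; replace them with the ones
    observed in the browser's own traffic.
    """
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
        "Accept": "application/json, text/plain, */*",
        "Accept-Language": "en-US,en;q=0.5",
    }
    if referer:
        headers["Referer"] = referer
    if cookies:
        # Cookies travel as a single "name=value; name=value" header.
        headers["Cookie"] = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return urllib.request.Request(url, headers=headers)
```

Passing the result to `urllib.request.urlopen` sends the request. Whether a given site accepts it depends on which signals that site actually checks.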

answered Feb 5, 2021 at 18:50
2
  • So technically a simple scrape could take place if you had authenticated with the site, or would the site reject the request because it would not be a common request made by a typical user? Edit: By mimic, you mean creating custom header data and a cookie to communicate with the site? Commented Feb 5, 2021 at 19:12
  • 1
    By mimic you mean creating a custom header data and cookie to communicate with the site? -- Yes, in the same way that the browser would. Commented Feb 5, 2021 at 19:14
