I want to write a Python script that interacts with a webpage containing quite a lot of JavaScript (the page computes a bunch of physics stuff).
I don't want my code to break if the page formatting changes, and I want it to run offline, so I would prefer my script to run on a local HTML copy of the page (all the JS code is accessible in the HTML source, and there is no call to an external server). I wanted to use the requests library for this, but it only works with URLs. Is there a library that can do this? Note that I want to interact with the HTML (enter input values, look at the outputs, etc.). I know that I can parse the file, but that's not what I'm asking. I'm also totally new to web bots and anything related.
Right now I can open my .html copy of the page offline in Chrome and interact with it, so there has to be a way to automate this somehow. I'm also not against using something other than Python if another language has a better library for this.
-
Try Selenium. It helps with parsing JavaScript-enabled HTML content. – Compro Prasad, Nov 15, 2020 at 5:06
-
Requests won't retrieve from a local file system. You could very easily serve the page locally using http.server, in which case Requests could retrieve it, but why bother using Requests if the file is local anyway? – DisappointedByUnaccountableMod, Nov 15, 2020 at 18:57
-
@barny Because there's some pretty complicated JS code on the page that produces a result every time I press a button, and I want to interact with it automatically; I haven't found any other way to do that. If not Requests, then what should I use? Understanding how the JS code works would take more time than just having a bot enter a value, press the button and retrieve the result. – johan boscher, Nov 15, 2020 at 19:06
-
So yes, you need a browser simulation such as Selenium. Requests can make HTTP GET requests, but a browser is needed to interpret HTML + JS. – DisappointedByUnaccountableMod, Nov 15, 2020 at 19:31
-
Maybe requests-html + a local HTTP server? See stackoverflow.com/questions/54889023/… Personally, I avoid Selenium with a passion, but Cypress IO, which I know for QA, isn't suited for automation. – JL Peyret, Nov 15, 2020 at 22:52
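For illustration, the Selenium route suggested in the comments could look roughly like the sketch below. It assumes Selenium 4 and a Chrome installation are available; the element IDs passed to `compute_with_page` are hypothetical and would have to be read from the saved page's source. A real browser opens the local copy through a `file://` URL, so the page's JavaScript runs exactly as it does when the file is opened by hand.

```python
from pathlib import Path


def local_page_url(html_path):
    """Turn a path to a saved .html file into a file:// URL a browser can open."""
    return Path(html_path).resolve().as_uri()


def compute_with_page(html_path, input_id, button_id, output_id, value):
    """Type `value` into the page, click the button, and return the output text.

    The three element IDs are placeholders: look them up in the saved HTML.
    """
    # Imported here so local_page_url stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: Chrome and a matching driver exist
    try:
        driver.get(local_page_url(html_path))
        field = driver.find_element(By.ID, input_id)
        field.clear()
        field.send_keys(str(value))
        driver.find_element(By.ID, button_id).click()
        # If the result lands in an <input>, read get_attribute("value") instead.
        return driver.find_element(By.ID, output_id).text
    finally:
        driver.quit()
```

If the page fills in its result asynchronously, an explicit wait (for example `WebDriverWait`) before reading the output would probably be needed.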
1 Answer
Interesting question. The best way I can think of is to serve the page with a web framework and then scrape the data using requests. I am familiar with Flask, and it's simple to use, but I'm sure there are other options as well.
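A minimal sketch of the serve-locally idea follows, using the standard library's `http.server` and `urllib.request` in place of Flask and requests so that it has no dependencies (a swap made for this example, not something the answer prescribes). The helper names `serve_directory` and `fetch` are made up here. Note that an HTTP client only receives the raw HTML source: the page's JavaScript is not executed, so computed results will not appear in the response.

```python
import functools
import threading
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, port=0):
    """Serve `directory` on localhost in a background thread.

    port=0 lets the operating system pick a free port.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server, filename):
    """Download one file from the local server and return it as text."""
    host, port = server.server_address[:2]
    # An empty ProxyHandler bypasses any system proxy for this local request.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}/{filename}") as response:
        return response.read().decode("utf-8")
```

Calling `server.shutdown()` followed by `server.server_close()` stops the server when it is no longer needed. With requests installed, `requests.get` on the same URL should return the same source.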