I wrote some basic Python code to scrape a remote webpage and grab a few pieces of data. On a different page I'm trying to scrape, the data is hidden from view, and only appears after changing the value of a <select> box.
After de-minifying and digging through the remote website's javascript, I confirmed that it is using AJAX (custom implementation of Prototype I think) to switch the <tbody> of the <table> I'm interested in scraping.
Is there a way to use Python (or Javascript via Python) to trigger the onChange event of that select box so I can "refresh" the DOM and grab the new HTML?
-
5Why not just do the AJAX request directly and parse the data you want from that?kindall– kindall2012年01月20日 05:43:58 +00:00Commented Jan 20, 2012 at 5:43
-
Take a look at PySide's WebKit bindings - you can do exactly that. (I'm working on a similar project)synthesizerpatel– synthesizerpatel2012年01月20日 06:21:49 +00:00Commented Jan 20, 2012 at 6:21
-
1@kindall i originally couldn't find the ajax request url (was using chrome's built-in inspect element tool). then a friend suggested i use firebug on firefox (which i use normally, but stupidly didnt think to try) and the request url popped up in the console as expected. if you want to post an answer i'll accept yours.Jonathan Drake– Jonathan Drake2012年01月20日 15:31:31 +00:00Commented Jan 20, 2012 at 15:31
1 Answer 1
Figure out the AJAX request URL and request it directly. :-)