I have a number of web pages, obtained using curl, that I am attempting to parse information from. Each page uses jQuery to transform its content once the document has loaded in the browser (via a document-ready handler), mostly setting the classes/ids of divs. The information is much easier to parse after those JavaScript functions have run.
What are my options (preferably from the command line) for executing the JavaScript content of the pages and dumping the transformed HTML?
- getfirebug.com/commandline – is this what you are looking for? – Tats_innit, May 20, 2012 at 8:41
- +1 sounds interesting :) I thought about node.js for a while, but that won't work for you =/ – Ja͢ck, May 20, 2012 at 8:44
1 Answer
To scrape dynamic web pages, don't use static download tools like curl. Instead, use a headless web browser that you can control from your programming language. The most popular tool for this is Selenium:
http://code.google.com/p/selenium/
With Selenium you can export the modified DOM tree out of the browser as HTML.
An example use case:
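Here is a minimal sketch using Selenium's Python bindings, assuming headless Chrome is available; the URL is a placeholder, and polling document.readyState is a simple heuristic for "the document-ready handlers have had a chance to run":

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
options.add_argument("--headless")  # run Chrome without a visible window

driver = webdriver.Chrome(options=options)
try:
    # Placeholder URL: substitute one of the pages you are scraping.
    driver.get("https://example.com/page.html")

    # Wait until the page has finished loading, so jQuery's
    # document-ready handlers have had a chance to rewrite the DOM.
    WebDriverWait(driver, 10).until(
        lambda d: d.execute_script("return document.readyState") == "complete"
    )

    # Serialize the transformed DOM back out as HTML.
    print(driver.execute_script("return document.documentElement.outerHTML"))
finally:
    driver.quit()
```

Saved as, say, dump_dom.py, this runs from the command line (python dump_dom.py > page.html), and the dumped HTML can then be fed to whatever parser you were pointing at the curl output.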