I want to convert a webpage into an HTML page programmatically.
I searched many sites, but they only provide details on things like converting a page to PDF.
At the moment, my program relies on me saving the page as .html first and then extracts the necessary data.
Is there any way to convert the webpage to an HTML page from code? Can anyone help me?
Any help would be appreciated.

Let me explain in more detail.

I am extracting the names of users who like a page that I am an admin of. I found a link, https://www.facebook.com/browse/?type=page_fans&page_id=pageid, where I can find the list of users. To get the data, I first have to save that page as an .html file and then extract what I need from it. What I need is to convert that page into an HTML page from within my program. I hope my question is clear now.

asked Apr 24, 2014 at 11:02
  • What do you mean by converting a webpage into an HTML page? Aren't they the same? Commented Apr 24, 2014 at 11:03
  • What do you want to convert to an HTML page? Do you mean you want to generate an HTML page with Java? Commented Apr 24, 2014 at 11:04
  • What do you want from this conversion? Commented Apr 24, 2014 at 11:06
  • Perhaps your question is how to programmatically fetch a web page? Commented Apr 24, 2014 at 11:10
  • Do you mean converting a web page to a standalone HTML page that can be used offline? Should it be a single HTML file or can it be a collection of files (possibly packaged as a zip file)? Please clarify by editing the question (including its title). Commented Apr 24, 2014 at 11:12

2 Answers

Oracle provides the following code snippet for programmatically retrieving an HTML page:

    import java.net.*;
    import java.io.*;

    public class URLReader {
        public static void main(String[] args) throws Exception {
            URL oracle = new URL("http://www.oracle.com/");
            // Open a stream to the URL and read the response line by line
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(oracle.openStream()));
            String inputLine;
            while ((inputLine = in.readLine()) != null)
                System.out.println(inputLine);
            in.close();
        }
    }

Instead of printing to console, you can save the contents to a file by using a FileWriter and BufferedWriter (example from this question):

    String line;
    // try-with-resources flushes and closes the writer when the loop finishes
    try (BufferedWriter fbw = new BufferedWriter(new FileWriter("fileName"))) {
        while ((line = in.readLine()) != null) {
            fbw.write(line);
            fbw.newLine();
        }
    }
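
Putting the two snippets together, a minimal sketch of a complete "fetch and save" program could look like the following. The class name PageSaver, the method savePage, and the output file name page.html are hypothetical names chosen for illustration. It copies the raw bytes rather than reading lines, so the page's original encoding is left untouched. Note that a plain request like this does not carry your browser's login session, so a page that requires you to be signed in may come back as something other than what you see in the browser.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PageSaver {
    // Hypothetical helper: copies the raw bytes served at the URL into a local file,
    // overwriting the file if it already exists.
    public static void savePage(URL url, Path target) throws IOException {
        try (InputStream in = url.openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // The default URL is the one from the snippet above; pass another as the first argument.
        URL url = new URL(args.length > 0 ? args[0] : "http://www.oracle.com/");
        savePage(url, Paths.get("page.html"));
    }
}
```

After running it, page.html in the working directory holds the HTML the server returned, and the extraction code can read that file as before.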
answered Apr 24, 2014 at 13:58

1 Comment

Thank you for your reply, but it's not what I want. I have edited my question and explained it more clearly. Please take a look. I'm sure that you can help me.

Webpages are already HTML. If you want to save a webpage as HTML, you can do this via the Save Page As menu item in Firefox, or through the File menu in other browsers.

If you need to download multiple pages as HTML from the same website or from a list of URLs, there is software that makes this easier: http://www.httrack.com/
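
If the list of URLs is already known and the download has to happen from Java code rather than from a separate tool, a rough sketch might look like this. The class name BatchPageSaver, the method saveAll, the pages output directory, and the page-N.html naming scheme are all assumptions made for the example. It only saves the HTML of each listed URL; it does not follow links or fetch images and stylesheets.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class BatchPageSaver {
    // Hypothetical helper: saves each URL as page-0.html, page-1.html, ... inside outputDir
    // and returns the paths of the files it wrote, in the same order as the input list.
    public static List<Path> saveAll(List<URL> urls, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> saved = new ArrayList<>();
        for (int i = 0; i < urls.size(); i++) {
            Path target = outputDir.resolve("page-" + i + ".html");
            try (InputStream in = urls.get(i).openStream()) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            saved.add(target);
        }
        return saved;
    }

    public static void main(String[] args) throws IOException {
        // Each command-line argument is treated as one URL to download.
        List<URL> urls = new ArrayList<>();
        for (String arg : args) {
            urls.add(new URL(arg));
        }
        saveAll(urls, Paths.get("pages"));
    }
}
```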

answered Apr 24, 2014 at 11:37

1 Comment

The user is asking how to do it programmatically.
