I have an ArrayList containing a list of websites in this format:
- google.com
- facebook.com
- youtube.com
- yahoo.com
- wikipedia.org
- t.co
I have to read the HTML text from all of the links. Most of them work fine, but some, such as t.co, cause problems.
Code:

    try {
        String line = "t.co";
        String[] Add_words = line.split("[//:.]");
        if (Add_words[0].contains("http")) {
        } else if (Add_words[0].contains("www"))
            line = "http://" + line;
        else if (!Add_words[0].contains("http") && !Add_words[0].contains("www"))
            line = "http://www." + line;
        URL url = new URL(line);
        URLConnection urlConnection = url.openConnection();
        HttpURLConnection connection = null;
        if (urlConnection instanceof HttpURLConnection) {
            connection = (HttpURLConnection) urlConnection;
        } else {
            System.out.println("Please enter an HTTP URL.");
            return;
        }
        BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream()));
        String urlString = "";
        String current;
        while ((current = in.readLine()) != null) {
            urlString += current + "\n";
        }
        System.out.println(urlString);
    } catch (IOException e) {
        e.printStackTrace();
    }

And I'm getting an error with the last link, `t.co`. Error:
    java.io.FileNotFoundException: http://www.t.co
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1834)
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1439)
        at com.test.code.Main.main(Main.java:109)

What I need: I have a list of links in the above format, and my code should be able to access every link, whatever format the link is in.
3 Answers
You are adding www. to t.co, but http://www.t.co is not the correct address and results in a 404 Not Found.
Just don't add the www. to the URL and it should work.
3 Comments
Replace line = "http://www." + line; with line = "http://" + line; and it should work. Why do you have to append the www.?

You get a FileNotFoundException because the request to http://www.t.co returns:
HTTP/1.1 404 Not Found
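The 404 can also be detected before the stream is opened. Here is a minimal Java sketch, assuming the address already carries an http:// or https:// scheme and that the page is UTF-8 encoded (both are assumptions, and the class and method names are illustrative, not from the question). It asks for getResponseCode() first and only calls getInputStream() on a 2xx status, so a missing page shows up as a status code rather than as a FileNotFoundException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StatusCheckingFetcher {
    /** True for 2xx status codes, the only ones this sketch treats as readable. */
    static boolean isSuccess(int status) {
        return status >= 200 && status < 300;
    }

    /** Returns the page body, or null when the server answers with a non-2xx status. */
    static String fetch(String address) throws IOException {
        // Assumption: address is an http/https URL, so the cast below is safe.
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        try {
            int status = connection.getResponseCode(); // sends the request
            if (!isSuccess(status)) {
                System.err.println(address + " returned HTTP " + status);
                return null;
            }
            StringBuilder body = new StringBuilder();
            // Assumption: the response is UTF-8 text.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                String current;
                while ((current = in.readLine()) != null) {
                    body.append(current).append('\n');
                }
            }
            return body.toString();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String html = fetch("http://www.t.co");
        System.out.println(html == null ? "No page to read." : html);
    }
}
```

Returning null keeps the sketch short; a caller that prefers exceptions could throw an IOException carrying the status code instead.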
9 Comments
If the request throws an exception (in the stack trace above it is getInputStream() that throws, not url.openConnection()), the URL is unreachable or invalid. Move on to the next one, or don't create invalid URLs in the first place.

You are adding www. to your t.co link, which is causing the problem. Do not add that prefix; try http://t.co on its own and it should work if your link is valid.
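The "move on to the next one" advice can be sketched as a loop that catches the exception per link instead of once around the whole job. This is only an illustration: PageFetcher is a hypothetical interface standing in for whatever code downloads a single page, and the fake fetcher in main simulates the failure rather than touching the network.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SkipFailingLinks {
    /** Hypothetical stand-in for the code that downloads one page. */
    interface PageFetcher {
        String fetch(String address) throws IOException;
    }

    /** Tries every address and returns those that failed; one failure never stops the loop. */
    static List<String> fetchAll(List<String> addresses, PageFetcher fetcher) {
        List<String> failed = new ArrayList<>();
        for (String address : addresses) {
            try {
                fetcher.fetch(address);
                System.out.println(address + ": OK");
            } catch (IOException e) {
                // FileNotFoundException (the 404 case) is a subclass of IOException.
                System.err.println("Skipping " + address + ": " + e);
                failed.add(address);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Fake fetcher for demonstration only: pretends that www.t.co does not exist.
        PageFetcher fake = address -> {
            if (address.contains("www.t.co")) {
                throw new FileNotFoundException(address);
            }
            return "<html></html>";
        };
        List<String> failed = fetchAll(
                Arrays.asList("http://google.com", "http://www.t.co", "http://yahoo.com"), fake);
        System.out.println("Failed: " + failed);
    }
}
```

Collecting the failed addresses rather than only printing them leaves room to retry them later, for example without the www. prefix.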
EDIT
Change:
    else if (Add_words[0].contains("www"))
        line = "http://" + line;
    else if (!Add_words[0].contains("http")
            && !Add_words[0].contains("www"))
        line = "http://www." + line;

to

    else if (Add_words[0].contains("www") ||
            (line.contains("t.co") && !Add_words[0].contains("www")))
        line = "http://" + line;
    else if (!Add_words[0].contains("http")
            && !Add_words[0].contains("www")
            && !line.contains("t.co"))
        line = "http://www." + line;
This is not the best way, but it will do. The only case left is line = "www.t.co", where you will need to remove the www. prefix before those if statements.
As @Tim said, appending www. is unnecessary anyway, so the cleanest solution is to fix the second else if as he suggested.
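A minimal sketch of that simpler approach: prepend only the scheme when it is missing and never add www. The class and method names are illustrative, and the sample entries in main are assumptions that mirror the list format in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class UrlNormalizer {
    /** Prepends "http://" when the entry has no scheme; never adds "www.". */
    static String normalize(String line) {
        String trimmed = line.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://") || lower.startsWith("https://")) {
            return trimmed; // already a full URL, leave it alone
        }
        return "http://" + trimmed; // bare host such as "t.co" or "www.google.com"
    }

    public static void main(String[] args) {
        // Sample entries (assumed), covering the three shapes the question's code checks for.
        List<String> sites = Arrays.asList("google.com", "www.facebook.com", "t.co", "https://wikipedia.org");
        for (String site : sites) {
            System.out.println(site + " -> " + normalize(site));
        }
    }
}
```

Checking for a scheme prefix with startsWith avoids the false positives that contains("http") or contains("www") can produce on hosts that merely include those letters.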