I am trying to create a Java application that reads a list of URLs
from a file and stores their contents on the local file system. I
have succeeded in accessing normal websites, but with this approach I
am unable to access secured websites (those using the HTTPS protocol).
I would greatly appreciate it if someone could suggest a way out. I
have looked at the HttpsURLConnection class, but unfortunately this
class is abstract.
[My apologies to readers of comp.lang.java and comp.sources.d since
this posting may appear to be a repost of my previous posting
(although it is not) --- Bhat]
My source code follows:
/////////////////// Source code begin /////////////////////
// This program reads a list of URLs to access and store on the local
// file system from a file. The name of the file is passed as the
// first command line argument. Each URL is on a separate line.
// Lines beginning with the '#' character are treated as blanks and
// are skipped.
//
import java.io.*;
import java.net.*;
import java.security.*;

class WebsiteLoader
{
    public static char replaceChar = '~';

    public static void main(String argv[]) throws IOException
    {
        // The following two lines were suggested by the following website:
        // http://www.javaworld.com/javaworld/j...javatip96.html
        // They help in suppressing the java.net.MalformedURLException
        System.setProperty("java.protocol.handler.pkgs",
                           "com.sun.net.ssl.internal.www.protocol");
        Security.addProvider(new com.sun.net.ssl.internal.ssl.Provider());

        BufferedReader br;
        String origName;
        if(argv.length != 0)
        {
            br = new BufferedReader(new FileReader(argv[0]));
            // Read URLs from the file. Skip blank lines and lines beginning
            // with the '#' character.
            for(;;)
            {
                origName = br.readLine();
                if(origName == null)
                    break;
                origName = origName.trim();
                if(origName.length() == 0)
                    continue;
                if(origName.charAt(0) == '#')
                    continue;
                URL url = new URL(origName);
                BufferedReader bufRdr = new BufferedReader(
                    new InputStreamReader(url.openStream()));
                // The name of the file to which the website contents are
                // written is derived from the URL by substituting the
                // following characters with some "non-offending" character:
                // \,/,:,*,?,",<,>,|
                String modName = origName;
                modName = modName.replace('\\', replaceChar);
                modName = modName.replace('/', replaceChar);
                modName = modName.replace(':', replaceChar);
                modName = modName.replace('*', replaceChar);
                modName = modName.replace('?', replaceChar);
                modName = modName.replace('"', replaceChar);
                modName = modName.replace('<', replaceChar);
                modName = modName.replace('>', replaceChar);
                modName = modName.replace('|', replaceChar);
                FileWriter fWriter = new FileWriter(modName);
                System.out.println("Writing contents of " + origName + " to " +
                                   "the following file: " + modName);
                for(;;)
                {
                    String thisLine = bufRdr.readLine();
                    if(thisLine == null)
                        break;
                    fWriter.write(thisLine);
                    // readLine() strips the line terminator, so write it back
                    fWriter.write(System.getProperty("line.separator"));
                }
                // Close both ends so that buffered output reaches the disk
                fWriter.close();
                bufRdr.close();
            }
            br.close();
        }
    }
}
/////////////////// Source code end //////////////////////
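
On the point about HttpsURLConnection being abstract: the class is not
meant to be instantiated directly. The following is only a minimal
sketch, assuming a current JDK in which HTTPS support is built in, so
that neither the system property nor the provider registration above is
needed. The class name HttpsFetchSketch, the helper names open and
isHttps, and the example.com address are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;
import javax.net.ssl.HttpsURLConnection;

class HttpsFetchSketch {
    // Creates a connection object for the address. No network traffic
    // happens here; the connection is only opened when it is first used.
    static URLConnection open(String address) {
        try {
            return URI.create(address).toURL().openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // For an https URL the JDK hands back a concrete subclass of the
    // abstract HttpsURLConnection, so a type check (or cast) is enough.
    static boolean isHttps(URLConnection conn) {
        return conn instanceof HttpsURLConnection;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default address; pass a real one as the first argument.
        String address = args.length > 0 ? args[0] : "https://example.com/";
        URLConnection conn = open(address);
        System.out.println("HTTPS connection: " + isHttps(conn));

        // Reading the stream triggers the actual (TLS) connection.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```

If this behaves as assumed, the url.openStream() call in the program
above should also work for https URLs without any extra setup, since
openStream() goes through the same connection machinery.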