I've written a very small ASP.NET page to scrape thousands of pages of
content based on database IDs. It loops through a dataset to get the IDs. It
worked well in testing but now I am getting an annoying 403 error that
causes the script to abort halfway through my download.
I am wondering if there is a way in ASP.NET to catch 403s and other network
errors and move on to the next ID in the dataset rather than aborting the
whole job.
My code appears below. Thank you in advance.
-KF
string strConnection = ConfigurationSettings.AppSettings["connwhatever"];
SqlConnection conn = new SqlConnection(strConnection);
string query = // [my query];
SqlDataAdapter a = new SqlDataAdapter(query, conn);
DataSet s = new DataSet();
a.Fill(s);
int counter = 0;
foreach (DataRow dr in s.Tables[0].Rows)
{
    counter++;
    System.Net.WebClient wc = new WebClient();
    string strData =
        wc.DownloadString("http://whatever.org/article.asp?articleid=" +
            dr[0].ToString());
    FileStream fstream = new FileStream(@"c:\whateverpath\" + dr[0].ToString() +
        ".htm", FileMode.Create, FileAccess.Write);
    StreamWriter stream = new StreamWriter(fstream);
    stream.Write(strData);
    stream.Close();
    fstream.Close();
}
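For what it's worth, here is a minimal sketch of how the loop body could be wrapped in a try/catch so one failed download doesn't abort the whole run. `WebClient.DownloadString` throws a `System.Net.WebException` on a 403 (and on most other HTTP/network failures), so catching that type per iteration lets the loop continue. The URL and output path below are the same placeholders as in the question:

```csharp
using System;
using System.Data;
using System.IO;
using System.Net;

// Assumes s (the filled DataSet) and counter exist as in the original code.
foreach (DataRow dr in s.Tables[0].Rows)
{
    counter++;
    string id = dr[0].ToString();
    try
    {
        using (WebClient wc = new WebClient())
        {
            string strData =
                wc.DownloadString("http://whatever.org/article.asp?articleid=" + id);
            // File.WriteAllText replaces the manual FileStream/StreamWriter pair
            // and guarantees the file handle is closed even if Write throws.
            File.WriteAllText(@"c:\whateverpath\" + id + ".htm", strData);
        }
    }
    catch (WebException ex)
    {
        // 403s and other network errors land here; log the ID and move on.
        Console.WriteLine("Skipping article " + id + ": " + ex.Message);
    }
}
```

Catching `WebException` specifically (rather than a bare `catch`) means genuine programming errors such as a bad path still surface instead of being silently swallowed.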