I'm writing a console utility to download specific files from web sites
based on the command line options. In most cases, I can trap the 404
error when the file isn't available because the operator mistyped the
URL or it's offline for whatever reason. The problem I'm running into
is with certain sites where the admin has set up a redirect to handle
the 404 condition and redirects the request to another page.
In this case, the redirected page gets downloaded and saved, which is not
the desired result. This utility is being used in a scheduling process
to download and process specific data files from public web sites, and
the files must exist or the utility must return an error to halt processing.
Is there any way to tell the WebClient or WebRequest objects to not allow
redirected content? Or, is there a property in either of the objects
that reflects the actual URL of the source when it's redirected? Or, is
there an alternative object that can be used for this purpose to
download files from web sites?
Any help would be greatly appreciated.
- Glen
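On the second question: `HttpWebResponse` exposes a `ResponseUri` property, which holds the URI that actually answered the request. A minimal sketch of comparing it against the requested URI follows. The class and method names are made up for illustration, and it assumes a .NET version where `WebRequest.Create` is still available. It can only detect redirects that the server reports to the client.

```csharp
using System;
using System.Net;

static class RedirectCheck
{
    // True when the URI that answered differs from the one requested.
    public static bool WasRedirected(Uri requested, Uri responded)
    {
        return !requested.Equals(responded);
    }

    // Hypothetical helper: throws if the request ended up at another URI.
    public static void EnsureNotRedirected(string url)
    {
        var requested = new Uri(url);
        var request = (HttpWebRequest)WebRequest.Create(requested);

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            if (WasRedirected(requested, response.ResponseUri))
            {
                throw new InvalidOperationException(
                    "Request for " + requested + " was answered by " + response.ResponseUri);
            }
        }
    }
}
```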
I don't think there is a way using the WebClient class.
If you are using HttpWebRequest, there is the AllowAutoRedirect property
which may allow you to accomplish what you need.
--
Adam Clauss ca*****@tamu.edu
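For reference, a rough sketch of the `AllowAutoRedirect` approach: with the property set to `false`, a 3xx response is handed back to the caller instead of being followed, so the utility can treat it as a failure. The `StrictDownloader` class and its method names are hypothetical, and the sketch assumes .NET 4.0 or later (for `Stream.CopyTo`).

```csharp
using System;
using System.IO;
using System.Net;

static class StrictDownloader
{
    // True for any 3xx status code.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code >= 300 && code < 400;
    }

    // Downloads url to destinationPath, refusing to follow redirects.
    // 4xx/5xx responses already surface as WebException from GetResponse().
    public static void Download(string url, string destinationPath)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = false;

        using (var response = (HttpWebResponse)request.GetResponse())
        {
            if (IsRedirect(response.StatusCode))
            {
                throw new WebException(
                    "Redirect refused: " + (int)response.StatusCode +
                    " -> " + response.Headers["Location"]);
            }

            using (Stream body = response.GetResponseStream())
            using (FileStream file = File.Create(destinationPath))
            {
                body.CopyTo(file);
            }
        }
    }
}
```

A console utility could catch the `WebException` in `Main` and return a non-zero exit code so the scheduler halts processing.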
Glen,
I know a lot of sites that use a URL only as a starting point and then
redirect. How would your implementation handle those?
Not that I have the answer, but maybe finding it opens a new problem
and makes searching for this solution useless.
Just my thought,
Cor
Hi Cor,
Since I'm not building a browser and I'm looking to retrieve very specific
data from the web, I expect that any changes in the remote data source will
be documented and handled by the system operators. Again, this is a critical
task, much like getting files via FTP. I don't expect that when I log into
an FTP site, I'm going to get a different file than the one I requested.
In any case, thank you for your input.
- Glen
Thanks Adam. I'll give that one a look.
- Glen
Unfortunately, this property doesn't appear to work in some cases. The
site in question is using IIS 5.0 and is not returning a redirect code or
page-not-found code in any instance. All redirects are done silently with
an OK status returned.
- Glen
Hmm, wait, so the website is not telling the client to redirect anywhere?
In that case, there is probably no way to tell that anything happened.
It sounds like the redirection is actually happening on the server side of
things...
--
Adam Clauss ca*****@tamu.edu
Hi Adam,
I checked a few things and even in the browser (IE) when you enter an
incorrect path on this particular server it doesn't change the actual
address when it displays the help text. I guess it's more of a content
substitution than a redirect scenario on the server side to handle 404
errors.
I'm going to have to figure out another way to validate the download and
return the appropriate codes to the application.
Thanks for your help.
- Glen
NP... good luck on that.
Do you know what type of file you are EXPECTING to retrieve? Could you
check the actual content you downloaded to see whether it is correct (or some
sort of "error" page)?
--
Adam Clauss ca*****@tamu.edu
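One way to act on that suggestion, sketched under assumptions: since the server answers with 200 OK and substitutes a help page, a crude heuristic is to check whether the response claims to be HTML or the body starts like an HTML document. The `ContentSniffer` class and its methods are hypothetical names. This is only a heuristic and would give the wrong answer if the expected file were itself HTML.

```csharp
using System;

static class ContentSniffer
{
    // True when a Content-Type header value (for example from
    // HttpWebResponse.ContentType) announces an HTML document.
    public static bool IsHtmlContentType(string contentType)
    {
        return contentType != null
            && contentType.TrimStart().StartsWith("text/html", StringComparison.OrdinalIgnoreCase);
    }

    // True when the first characters of the body look like an HTML page.
    public static bool LooksLikeHtml(string leadingText)
    {
        string start = leadingText.TrimStart();
        return start.StartsWith("<!DOCTYPE html", StringComparison.OrdinalIgnoreCase)
            || start.StartsWith("<html", StringComparison.OrdinalIgnoreCase);
    }
}
```

A downloader expecting a delimited data file could reject the transfer when either check returns true.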
Yes, these particular files are regulated (pipe delimited) with constant
headers. I'll probably just build a simple script to open the file and
verify the header in a separate step when the transfer is completed. At
least this way, I can keep my C# project generic enough to use with other
processes.
- Glen
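The header check described above might look roughly like this. It is sketched in C# to match the rest of the thread, though it could just as well be the separate script Glen mentions. The `HeaderValidator` class, its method names, and the sample header `ID|NAME|AMOUNT` are all made up for illustration; the real header would come from the file specification.

```csharp
using System;
using System.IO;

static class HeaderValidator
{
    // True when the first line of the input matches the expected header exactly
    // (ignoring leading and trailing whitespace).
    public static bool HasExpectedHeader(TextReader reader, string expectedHeader)
    {
        string firstLine = reader.ReadLine();
        return firstLine != null
            && string.Equals(firstLine.Trim(), expectedHeader, StringComparison.Ordinal);
    }

    // Returns a process exit code: 0 when the header matches, 1 otherwise.
    public static int ValidateFile(string path, string expectedHeader)
    {
        using (var reader = new StreamReader(path))
        {
            return HasExpectedHeader(reader, expectedHeader) ? 0 : 1;
        }
    }
}
```

From a console entry point, `return HeaderValidator.ValidateFile(args[0], "ID|NAME|AMOUNT");` would give the scheduler a non-zero exit code whenever a substituted help page was saved in place of the data file.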