Uri problem/bug

Hi all,
I have found a possible bug in the Uri class constructor.
When I do something like this:
Uri test = new Uri(@"http://www.test.com/dir1/page1.html");

Uri test2 = new Uri(test, @"../../page2.html");

test2.AbsoluteUri returns http://www.test.com/../page2.html.

I know that you cannot go above the root directory, and that is why the
result is strange. If you try to use test2 against a website, the request fails,
whereas if you open such a page (page1.html) in a browser (IE) and click on
such a link, it navigates to http://www.test.com/page2.html. So I'm wondering
...

And yes, there is such a page (not mine) which I have to parse.

I am checking for such links and replacing "../" with "", but maybe this
should be reported somewhere as a possible bug.

Sunny

Nov 15 '05 #1
Sunny, when you create test2 you are creating it with the following URL:

http://www.test.com/dir1/../../page2.html

which is really

http://www.test.com/../page2.html (the first .. goes up one directory, eliminating
the dir1 reference; the second has no segment left to remove, so it stays)

So, the Uri class is returning the correct result. Does that make
sense? If you can give some more details about what you are trying to do,
I'm sure we could help you here.

--
Greg Ewing [MVP]
http://www.citidc.com

Nov 15 '05 #2
I know that this is not a valid construct (since ".." points to
the parent directory in most OSes, and certainly on all web servers).

I have to parse an HTML page (let's say page1.html from the example) in
which, perhaps because of bad design, there is <img
src="../../image.gif">. Even so, in IE and Netscape the image is
displayed, but if I use Uri to build the full URL from the base page, I
get the result I posted, and then I cannot use
imgUri.AbsoluteUri directly to download the image.
As far as my own problem is concerned, I just check the resulting URL and
correct it by removing "../", but I wanted to know why the
Uri class and the browsers behave differently.
I posted this just to start a discussion in that direction. I
assume there are not many web pages coded this badly, but ...

Thanks for reading all this
Sunny

P.S. I try to download the image (if it matters) with:
// requires: using System.IO; using System.Net;
WebClient source = new WebClient();

// sUrl is the absolute URL built from the base page
using (Stream myData = source.OpenRead(sUrl))
{
    byte[] buffer = new byte[4096];
    int br;

    // Read returns 0 only at the end of the stream; a short read is not the end
    while ((br = myData.Read(buffer, 0, buffer.Length)) > 0)
    {
        // process the first br bytes of buffer here
    }
}

Actually, my code fails in the OpenRead method.

Sunny

Nov 15 '05 #3
Hi Sunny,

I ran a test on my machine with
<img src="../../image.gif">
and found that if the current path is the root of the website (usually the
wwwroot directory), ../../image.gif is resolved to image.gif.
That is to say, when the page containing <img src="../../image.gif"> is at
the root of the website, each ../ that would go to the parent
directory is skipped.
So when the current directory is the root of the website, e.g.
http://localhost/test.htm
the <img src="../../image.gif"> in test.htm is equivalent
to <img src="../image.gif"> and to <img src="image.gif">.
I suggest that when you parse the HTML file, you check whether the current
path is the root of the website; in that case you may need to
ignore the ../ as described above.

Did I misunderstand your meaning?
I look forward to hearing from you.
Regards,
Peter Huang
Microsoft Online Partner Support
Get Secure! www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
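
A minimal sketch of the lenient, browser-like handling described in this post, working on path strings only. The names LenientPath and Resolve are hypothetical and not part of the .NET Uri API. The sketch assumes basePath is an absolute path starting with "/" and that relative is a plain relative path without a scheme, query, or fragment. As a simplification it also drops empty segments and any trailing slash.

```csharp
using System;
using System.Collections.Generic;

public static class LenientPath
{
    // Hypothetical helper: merges a base path and a relative reference,
    // silently dropping any ".." that would climb above the root.
    public static string Resolve(string basePath, string relative)
    {
        // Keep the directory part of the base path (everything up to the last '/').
        string dir = basePath.Substring(0, basePath.LastIndexOf('/') + 1);
        string[] parts = (dir + relative).Split('/');

        List<string> output = new List<string>();
        foreach (string part in parts)
        {
            if (part == "" || part == ".")
            {
                continue; // skip empty and current-directory segments
            }
            if (part == "..")
            {
                // Go up one level if possible; at the root the ".." is ignored.
                if (output.Count > 0)
                {
                    output.RemoveAt(output.Count - 1);
                }
                continue;
            }
            output.Add(part);
        }
        return "/" + string.Join("/", output.ToArray());
    }
}
```

With the paths from this thread, LenientPath.Resolve("/dir1/page1.html", "../../page2.html") yields "/page2.html", which matches what the browsers navigate to.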

Nov 15 '05 #4
Hi Peter,
Yes, you have understood me correctly. It is not a problem for me, as I
check for that (../../).
The main point of my original post was that web browsers and the Uri
constructor deal with this case differently (and the way browsers do
it is the right one). I think that this is a possible bug in the Uri
class, since most operating systems use ".." to point to the parent directory.
So I cannot accept that new Uri(baseuri, sPath) can return a URI that
looks like "http://test.com/../image.gif", leaving you to check
for this yourself.
I started this thread only as a warning to other developers, and to the
MS team, if they read this group.

And if someone thinks that Uri SHOULD act like this, for any
reason, I'd like to hear it.

Thanks
Sunny
Nov 15 '05 #5
Hi Sunny,

http://test.com/../image.gif
The URI you posted will not work in IE either. It seems that the
URI was built from the HTML file, wasn't it?
I think IE behaves this way for compatibility reasons. Since there are many
malformed links on the web, browsers have to be very tolerant.
http://test.com/../image.gif does not work in IE, but
<img src="../image.gif"> works; that is all because of compatibility.
The Uri class, however, cannot guarantee that http://test.com/image.gif
will work, so it does not check whether "../" should be skipped when
the root is reached.
Regards,
Peter Huang
Microsoft Online Partner Support
Get Secure! www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.

Nov 15 '05 #6
Hi Peter,
yes, that was my mistake. Actually, if you are looking at the page
http://test.com/dir1/page1.html, and it contains an image with
src="../../image.gif", the browser will find it.
But do you think the browser first tries to fetch
test.com/../image.gif and, when that fails (for sure :) ), tries the
correct one, or does it just detect that a link like /../image.gif is not
valid and go directly to the other?
So, I agree that Uri is strict and that it cannot just "assume" it should
remove ../, but it should at least detect that this is not a valid construct
and notify the user code somehow (throw an exception?).

Cheers
Sunny
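
The stricter behaviour suggested in this post could be sketched as a pre-check that runs before the URI is built. StrictPath and EnsureWithinRoot are hypothetical names, not part of the .NET Uri API, and the choice of ArgumentException is only an assumption. The sketch assumes basePath is an already normalized absolute path and relative is a plain relative path without a scheme, query, or fragment.

```csharp
using System;

public static class StrictPath
{
    // Hypothetical check: throws if the relative reference would climb
    // above the root of the site that basePath belongs to.
    public static void EnsureWithinRoot(string basePath, string relative)
    {
        // Count the directories in the base path ("/dir1/page1.html" has depth 1).
        string dir = basePath.Substring(0, basePath.LastIndexOf('/') + 1);
        int depth = 0;
        foreach (string part in dir.Split('/'))
        {
            if (part != "")
            {
                depth++;
            }
        }

        // Walk the relative reference and track how deep we are.
        foreach (string part in relative.Split('/'))
        {
            if (part == "..")
            {
                depth--;
                if (depth < 0)
                {
                    throw new ArgumentException(
                        "Relative reference climbs above the root: " + relative);
                }
            }
            else if (part != "" && part != ".")
            {
                depth++;
            }
        }
    }
}
```

Calling StrictPath.EnsureWithinRoot("/dir1/page1.html", "../../page2.html") throws, while "../page2.html" passes, so the calling code learns about the bad link instead of receiving a URL that contains "/../".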


Nov 15 '05 #7
Hi Sunny,

I think the behavior is by design.
The RFC specification (RFC 2396) says:
e) All occurrences of "<segment>/../", where <segment> is a
complete path segment not equal to "..", are removed from the
buffer string. Removal of these path segments is performed
iteratively, removing the leftmost matching pattern on each
iteration, until no matching pattern remains.

f) If the buffer string ends with "<segment>/..", where <segment>
is a complete path segment not equal to "..", that
"<segment>/.." is removed.

For more information, please refer to the link below
http://www.ietf.org/rfc/rfc2396.txt

Regards,
Peter Huang
Microsoft Online Partner Support
Get Secure! www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
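
A minimal sketch of steps (e) and (f) as quoted above, to show why the leading ".." survives. Rfc2396Path and RemoveDotDot are hypothetical names, and this is not the actual implementation inside the Uri class. The sketch assumes the buffer is a non-empty path that starts with "/" and that "." segments have already been removed by the earlier steps of the RFC. Note that the later RFC 3986 changed this rule so that a ".." with nothing left to remove is dropped.

```csharp
using System;
using System.Collections.Generic;

public static class Rfc2396Path
{
    // Applies steps (e) and (f) to an absolute path such as "/dir1/../../page2.html".
    public static string RemoveDotDot(string buffer)
    {
        // Split everything after the leading '/' into path segments.
        List<string> segs = new List<string>(buffer.Substring(1).Split('/'));

        // Step (e): repeatedly remove the leftmost "<segment>/../" pair,
        // where <segment> is not ".." and the ".." is not the last segment.
        bool removed = true;
        while (removed)
        {
            removed = false;
            for (int i = 0; i + 2 < segs.Count; i++)
            {
                if (segs[i] != ".." && segs[i + 1] == "..")
                {
                    segs.RemoveRange(i, 2);
                    removed = true;
                    break;
                }
            }
        }

        // Step (f): a trailing "<segment>/.." is removed as well.
        int n = segs.Count;
        if (n >= 2 && segs[n - 2] != ".." && segs[n - 1] == "..")
        {
            segs.RemoveRange(n - 2, 2);
            segs.Add(""); // keep the trailing slash: "/a/b/.." becomes "/a/"
        }

        return "/" + string.Join("/", segs.ToArray());
    }
}
```

For the buffer from this thread, Rfc2396Path.RemoveDotDot("/dir1/../../page2.html") removes "dir1/../" and returns "/../page2.html": the remaining ".." has no preceding segment, so neither step applies to it.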


Nov 15 '05 #8
