Bytes | Software Development & Data Engineering Community

if-modified-since question (protocol problem?)

hug
[originally posted in alt.www.webmaster, was suggested that this ng
could be a better place.]

I've updated my test server to handle if-modified-since. I've noticed
that the (old copies I run of) IE and Netscape seem never to send
if-modified-since. But the strange thing is that Opera sends
if-modified-since but when I reply with "HTTP/1.0 304 Not Modified" it
is not refreshing the screen from its cache, it is leaving the screen
blank.

I can only conclude that either I am not returning a correct protocol
sequence including "HTTP/1.0 304 Not Modified", or that the old Opera
I'm running contains a bug. I'm betting on the incorrect response in
my code.

Anybody have experience with handling if-modified-since themselves and
doing it properly?

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 7 '06
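
For reference, the conditional-GET decision the question asks about can be sketched roughly as follows. This is a minimal illustration in Python, not the poster's own server code; the function name `conditional_get` and its return shape are assumptions made for the example. It answers 304 only when the request actually carried If-Modified-Since and the resource is no newer than that date, and it always advertises Last-Modified so that a client has something to validate against on a later request.

```python
from email.utils import formatdate, mktime_tz, parsedate_tz


def conditional_get(if_modified_since, last_modified_ts):
    """Decide between 200 and 304 for a GET request.

    if_modified_since: raw header value from the request, or None.
    last_modified_ts:  resource modification time as a Unix timestamp.
    Returns (status, headers); both names are hypothetical.
    """
    last_modified_ts = int(last_modified_ts)  # HTTP dates have 1-second resolution
    headers = [("Last-Modified", formatdate(last_modified_ts, usegmt=True))]

    if if_modified_since is not None:
        parsed = parsedate_tz(if_modified_since)  # None if the date is unparseable
        if parsed is not None and last_modified_ts <= mktime_tz(parsed):
            # Unchanged since the client's copy: send headers only, no body.
            return "304 Not Modified", headers

    # No validator in the request (or the resource is newer): send the entity.
    return "200 OK", headers
```

A 304 response carries no message body, and it only makes sense as an answer to a conditional request; an unconditional GET should get the full entity.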
..oO(hug)
"Alan J. Flavell" <fl*****@physics.gla.ac.uk> wrote:
In the event that they do unintentionally leak out, as seems to have
happened here, they take the form of an HTML comment, so the browser
is behaving correctly by displaying a "blank" page.


Interesting that the browser displays any text inserted before/after
the SSI include but not the included text.


SSI is a server-side concept; the browser doesn't know anything about
it. All it gets is a document sent from the server, so the browser
just displays what it receives. If it doesn't show the SSI-included
content then it's _definitely_ a server problem.

Can you post a URL for a test case?

Micha
May 10 '06 #51
On Wed, 10 May 2006 09:26:56 -0600, hug wrote:
"Alan J. Flavell" <fl*****@physics.gla.ac.uk> wrote:
On Wed, 10 May 2006, hug wrote:
As you've already been told before, "SSI includes" have no business
being presented to a browser. They're meant to be resolved *by the
server*, before being served-out.


Yeah, I understand that.
In the event that they do unintentionally leak out, as seems to have
happened here, they take the form of an HTML comment, so the browser
is behaving correctly by displaying a "blank" page.


Interesting that the browser displays any text inserted before/after
the SSI include but not the included text.


I'm just responding to this one point: It is in no way interesting, it is
what one would expect. I'm restating what Alan said in more detail. Given
this HTML:

<html><head><title>no title</title></head>
<body><!--#include virtual="something.html" --></body>
</html>

every compliant browser in existence is required to show nothing at all
because the body is comprised of a single HTML comment. However, given
this HTML:

<html><head><title>no title</title></head>
<body>Hello<!--#include virtual="something.html" -->
World!
</body>
</html>

every compliant browser will display

Hello World!

Once you are in the situation where SSI directives are being pushed out to
the browser unprocessed they will not appear in your page. I guess that
this is exactly the reason why they have the syntax that they do.
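
The point above can be checked with a small sketch. It uses Python's standard `html.parser` as a stand-in for a browser's handling of comments (an assumption for illustration only; real browsers do far more), and the class and function names are hypothetical. Because an unprocessed SSI directive is syntactically an HTML comment, it contributes nothing to the visible text.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects character data inside <body>, ignoring comments."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        if self.in_body:
            self.chunks.append(data)

    # handle_comment() is deliberately not overridden, so comments
    # (including leaked SSI directives) are simply dropped.


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace roughly the way a browser would.
    return " ".join("".join(parser.chunks).split())


only_ssi = ('<html><head><title>no title</title></head>\n'
            '<body><!--#include virtual="something.html" --></body>\n</html>')
with_text = ('<html><head><title>no title</title></head>\n'
             '<body>Hello<!--#include virtual="something.html" -->\n'
             'World!\n</body>\n</html>')

print(repr(visible_text(only_ssi)))   # ''
print(repr(visible_text(with_text)))  # 'Hello World!'
```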

Rich
May 10 '06 #52
VK

Michael Fesser wrote:
When called with gmdate() it returns a "GMT" on my system.


OK, PHP is marked as the first to break the tradition :-) It has a proper
RFC1123 date by default. Irrelevant to ciwah, but nice to know. Thanks.
It does work this way in my browsers (recent Opera and FF). If necessary
they send a conditional GET with If-Modified-Since and/or If-None-Match
headers.


I guess then that "if necessary" doesn't cover the expired Expires
header?


Hmm, actually I'm not sending an Expires header yet. Currently it's just
a max-age value in the Cache-Control header.


Oh, Cache-Control fine-tune modifiers... I keep promising myself to
study them and their support across UAs. Maybe this thread was a sign
to me? :-)

Amazing: indeed in the entire Usenet there is not one group for HTTP
protocol discussions (alt.www doesn't count - just make one visit to
see why). I guess this topic is considered to be crystal clear to
anyone.

I wonder if ciwah charter could be read in some way to squeeze
Content-Type / Cache-Control/ and the like into it? I guess not.

May 10 '06 #53
On 10/05/2006 11:30, VK wrote:
hug wrote:
As far as I can tell, the fact that a 204 response header appears to
work (haven't played with your sample) is coincidence and using it
would be a "hack".
204 No Content header is an official documented HTTP/1 header, so I'm
not sure how could it be a "hack".


The context in which you suggested it

Simply check the data source server-side and take a server-side
decision if it's the same or not. ... If the data is the same,
simply throw back
Status: 204 No Content\n\n
-- 11*********************@i39g2000cwa.googlegroups.com

is incorrect, as evidenced by what follows.
Its application domain are the situations where you want to send a
request, but in response you are interested only if your request
processed properly or not (thus you are not interested in the content
unless it contains something new to show).
                             ^^^^^^^^^^^
To base the decision on whether there is something 'new to show' would
require knowledge about something 'old' - a cached resource. If this
information is communicated using If-Modified-Since, then there should
only be two types of response: 304, to indicate that the existing
variant can be served from the cache; or, whatever would result from a
normal GET request. As the OP has made no indication of responding with
anything other than 200 upon success, 204 is not reasonable. To use it
as a substitute for 304 would be, at best, a hack (if it worked as
intended).

[snip]
The script behind it is necessarily a bit more complicated. A good part of it
is taken up by the sub generating a proper date in RFC1123 format. A great
mystery to me about all known server-side scripting languages (Perl, PHP,
JSP, ASP etc.): this absolutely necessary format is not available in the
default language tools.
PHP has already been covered. In JSP, one should (JSP is on my list of
'Things to Learn') be able to use the java.text.SimpleDateFormat class
with a similar pattern and an English locale:

"EEE, dd MMM yyyy HH:mm:ss 'GMT'"

Presumably, ASP will have a similar feature somewhere.
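
As an illustration in a language not mentioned above, Python's standard library can produce the same date format in one call. This is only a sketch; the wrapper names `http_date` and `http_date_strftime` are made up for the example. `email.utils.formatdate` with `usegmt=True` emits the "GMT" suffix, and the `strftime` pattern shown next to it is the direct analogue of the PHP and Java patterns discussed here (it relies on the default C locale for English day and month names).

```python
import time
from email.utils import formatdate


def http_date(timestamp=None):
    """Format a Unix timestamp (default: now) as an HTTP date."""
    if timestamp is None:
        timestamp = time.time()
    return formatdate(timestamp, usegmt=True)


def http_date_strftime(timestamp):
    """Same result via an explicit pattern; assumes the C locale."""
    return time.strftime("%a, %d %b %Y %H:%M:%S GMT", time.gmtime(timestamp))


print(http_date(0))           # Thu, 01 Jan 1970 00:00:00 GMT
print(http_date_strftime(0))  # Thu, 01 Jan 1970 00:00:00 GMT
```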

[snip]
print "Date: $date\n";
I don't know Perl, but shouldn't all instances of \n in your header
lines be replaced with \r\n?
print "Expires: $date\n";
print "Content-Type: text/html; iso-8859-1\n\n";
If you don't send a Last-Modified header,

[snip]
elsif ( defined($ENV{HTTP_IF_MODIFIED_SINCE}) ){
....why would you expect If-Modified-Since in a subsequent request?

[snip]
1) In theory any cached page may be in two states: fresh (not expired
upon headers) and stale (expired upon headers).
Not 'in theory', in practice. Those are the only two states for an
existing cached resource.
If you call a page which is cached but stale, the UA is supposed to validate
this page before displaying:
Supposed to, yes. However, a client may be configured to return a stale
response (unless forced not to by a Cache-Control directive).
thus contact the origin server and check if the page was updated.
The origin server need not be involved in revalidation if an
intermediate cache can satisfy the request.

[snip]
3) By refreshing the page several times and trying to navigate back
later, you see right away big differences in caching mechanics in
different UAs
I see no differences in how the browsers behave in this instance, so it
would be helpful for you to describe what you think you're seeing.

- No Last-Modified header is sent in the response, so no
If-Modified-Since is used in a subsequent request. That
particular branch of your code should never be executed.
- The Expires header that is sent is equal to the value of the
Date header, therefore the resource is immediately considered
stale. The resource will be cached, but because it is stale,
subsequent requests will be directed towards the origin
server unless expressly overridden by the user.
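
The second point can be made concrete with a rough sketch of the freshness calculation described in RFC 2616 section 13.2.4: max-age takes precedence, otherwise the lifetime is Expires minus Date. The function names and the lowercase-keyed header dictionary are assumptions for the example, and age calculation is left out entirely.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the explicit freshness lifetime in seconds, or None.

    headers: response headers as a dict with lowercase names (assumed).
    """
    for directive in headers.get("cache-control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)  # max-age overrides Expires

    if "expires" in headers and "date" in headers:
        expires = parsedate_to_datetime(headers["expires"])
        date = parsedate_to_datetime(headers["date"])
        return max(0, int((expires - date).total_seconds()))

    return None  # no explicit freshness information at all


def is_fresh(current_age, lifetime):
    """A response is fresh only while its lifetime exceeds its age."""
    return lifetime is not None and lifetime > current_age


same = "Wed, 10 May 2006 10:30:00 GMT"
print(freshness_lifetime({"date": same, "expires": same}))  # 0
```

With Expires equal to Date the lifetime is zero, so the entry is stale from the moment it is stored, which matches the behaviour described in the list above.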

[snip]
[Caching] defines a great part of usability, thus the caching
heuristics are one of the most complicated and often copyrighted parts
of the UAs.
If cache-related headers are sent properly, no heuristics will need to
be employed.

Part of the reason why people get confused about caching behaviour is
they do something like sending a validator, but no freshness
information. Consequently, the browser must guess what to do. Some
browsers, like Opera, give the user the opportunity to set an explicit
waiting period. Others, like Internet Explorer, will guess a time. If
the guess is inappropriate, things will go wrong; set the time
explicitly, and they won't.
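
To give a feel for the kind of guess involved: RFC 2616 section 13.2.4 mentions, as a typical setting, a heuristic lifetime of no more than 10% of the interval since Last-Modified. The sketch below implements just that rule of thumb. It is not a description of what Opera or Internet Explorer actually do; the function name and the `percent` parameter are invented for the example.

```python
def heuristic_lifetime(date_ts, last_modified_ts, percent=10):
    """Guess a freshness lifetime (seconds) from the document's age.

    date_ts:          value of the Date header as a Unix timestamp.
    last_modified_ts: value of Last-Modified as a Unix timestamp.
    percent:          fraction of the interval to use (hypothetical knob).
    """
    interval = max(0, int(date_ts) - int(last_modified_ts))
    return interval * percent // 100


# A document last changed ten days before it was served would be
# treated as fresh for one day under this rule.
print(heuristic_lifetime(10 * 86400, 0))  # 86400
```

Sending an explicit max-age or Expires value removes the need for any such guess.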

[snip]
On a real run it will be MySQL, PostgreSQL or another database
request. But the nature of the situation doesn't change: the HTTP
mediator (this script) has no means to know if data has changed or
not. Respectively it cannot set any reasonable Last-Modified header.
You (your database driver) have to check the data against the request
and inform the mediator.
That's not necessarily difficult though, is it? A time stamp can be
updated whenever a change is made to a particular document. For
instance, a CMS or BB might keep track of - perhaps for the purpose of
display - when a user edited or added content. That same date, combined
with the last modified time of the template used to generate the
document (whether it's an XSLT style sheet, a script, or whatever), can
be used to generate a Last-Modified header.
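
A minimal sketch of that idea, assuming the content's edit time and the template's modification time are both available as Unix timestamps (the function and parameter names are hypothetical): the document as served changed whenever either of them last changed, so the later of the two is the value to send.

```python
from email.utils import formatdate


def last_modified_header(content_mtime, template_mtime):
    """Return a Last-Modified value for a generated document.

    content_mtime:  when the stored content was last edited.
    template_mtime: when the template or script was last changed.
    """
    newest = max(int(content_mtime), int(template_mtime))
    return formatdate(newest, usegmt=True)


print(last_modified_header(0, 86400))  # Fri, 02 Jan 1970 00:00:00 GMT
```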
5) Overall for dynamic sources simply serve all responses with
cache-preventing headers [...].
If a resource is truly dynamic (that is, it will change on every
request), then sure. There'll be no point in caching it. If not, then at
least some effort should be made to cache it, even if it means that
revalidation should always occur. Why send data if there's no need to
bother?
Mostly (but no guarantees of any kind) UA will re-query for the same
page from the server, where you can take some decisions.
What sort of decisions do you think one /could/ make? If the data hasn't
been cached at all, there's no option but to send an entity.
6) An icing on the cake: Opera doesn't check the cache expiration at
all when going back and forwards in the history - it does it only when
you click a link.
Good. It's not supposed to.

History mechanisms and caches are different. In particular
history mechanisms SHOULD NOT try to show a semantically
transparent view of the current state of a resource. Rather, a
history mechanism is meant to show exactly what the user saw at
the time when the resource was retrieved.
-- 13.13 History Lists, RFC 2616
7) The huge popularity of IXMLHTTPRequest/XMLHttpRequest (aka AJAX)
may get a bit clearer now.


'AJAX' is popular because it's seen to be a new technology. That's the
same reason why some people think it's a good idea to churn out XHTML;
they believe it to be the latest and greatest (despite being six years
old as of January, earlier this year).

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 10 '06 #54
On 10/05/2006 13:28, hug wrote:

[snip]
The old Opera 6.05 that I'm using seems to send 3 types of GET
request. If it's a new page, there's nothing special in the headers.
If it's a link that's been clicked before, Opera sends
if-modified-since and deals correctly with the response (nevermind the
case of a file containing only one SSI include). If it's a "refresh"
what Opera sends is a GET request with a "Cache-control: none" header.


In the last instance, I assume you mean

Cache-Control: no-cache

That's certainly what Op 6.06 sends. However, other, more modern
browsers, do precisely the same thing.
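
On the server side, the three kinds of GET described above can be told apart from the request headers alone. The following is only a sketch: the function name, the labels it returns, and the lowercase-keyed header dictionary are assumptions. Pragma: no-cache is included because it is the HTTP/1.0 way of saying the same thing as Cache-Control: no-cache.

```python
def classify_get(headers):
    """Classify a GET request as 'reload', 'conditional' or 'plain'.

    headers: request headers as a dict with lowercase names (assumed).
    """
    directives = {
        part.strip().split("=")[0].lower()
        for part in headers.get("cache-control", "").split(",")
        if part.strip()
    }
    if "no-cache" in directives or headers.get("pragma", "").lower() == "no-cache":
        return "reload"       # user forced a refresh: send the full entity
    if "if-modified-since" in headers or "if-none-match" in headers:
        return "conditional"  # client holds a copy and wants it validated
    return "plain"            # nothing cached: send the full entity
```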

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 10 '06 #55
On 10/05/2006 13:20, VK wrote:
Michael Fesser wrote:
[snip]
<?php print gmdate(DATE_RFC1123)?>

(PHP 5.1)


Oh, they finally added it in the core? Took five releases, but better
late than never :-)


The string,

'D, d M Y H:i:s \G\M\T'

isn't that difficult to construct or use, though.
AFAIK RFC1123 expects "GMT" identifier, not "UTC";
No, it doesn't. RFC 2616 does.
on the other hand, servers should be forgiving about that.
Should, yes. However, one should respect the robustness principle in its
entirety:

an implementation must be conservative in its sending
behavior, and liberal in its receiving behavior.
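
Applied to HTTP dates, the principle might look like the sketch below: accept the three layouts that RFC 2616 section 3.3.1 asks recipients to understand, but only ever emit the RFC 1123 form with "GMT". The function names are made up for the example, and the `strptime` patterns assume the default C locale for English day and month names.

```python
from datetime import datetime, timezone

# Layouts a recipient should accept; only the first should be sent.
_HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850 (obsolete)
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime()
)


def parse_http_date(value):
    """Liberal receiving: try each known layout, return None on failure."""
    for fmt in _HTTP_DATE_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
        return parsed.replace(tzinfo=timezone.utc)
    return None


def format_http_date(moment):
    """Conservative sending: always RFC 1123 with a literal GMT."""
    return moment.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
```

All three spellings of the RFC's own example date parse to the same instant, and whatever comes in goes back out in the one preferred form.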

[snip]
I also send ETag and Cache-Control headers.


[...] you cannot tell if it will be actually regarded on the other
end and even if - you cannot tell in what manner will it be regarded
(unless intranet).


Then what makes you think one can rely on instructions to disable the
cache, either (other than doing things like constructing unique URIs)?
There is certainly no harm in trying, and when it does succeed (which
should be in the vast majority of cases), it could be very beneficial.
Say the same Opera (which was in OP) simply doesn't care about your
headers for < > navigation and doesn't read them. It goes by "Expire in
X days" in UA preferences.
No it doesn't. See my other follow up to you.
At the same time it considers each refresh request of the same page as
new item in the history (if Expires is set properly).
Where exactly do you get these ideas? Not on Earth, certainly.
At the same time Firefox sees only one item in the history no matter
what headers say.
That's because by refreshing, the current document is replaced. It isn't
navigation, so there's no new entry to add to the history list. As far
as I know, all browsers act this way.
Go to <http://www.nskom.com/external/tmp/http/cache.cgi>, refresh the
page several times and try Back in both browsers.
The Back button is greyed out.
That is the major problem with GET on dynamic sources: you can make
only very rough guesses of how the actual refresh will work for the
recipients: it all depends on UA's default caching mechanics and only
just a bit on HTTP headers.


You might need to make rough guesses. Things are looking pretty
predictable from where I'm sitting. Granted, one must consider that the
user can override normal behaviour, but the result, and how it can be
dealt with, still isn't some sort of guessing game.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 10 '06 #56
On 10/05/2006 17:41, VK wrote:

[snip]
I wonder if ciwah charter could be read in some way to squeeze
Content-Type / Cache-Control/ and the like into it? I guess not.


No, because the HTTP protocol isn't a markup issue (though it obviously
has implications). One of the c.i.w.misc, c.i.w.a.misc, and c.i.w.a.cgi
groups is probably more on topic.

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 10 '06 #57
VK

Michael Winter wrote:
204 No Content header is an official documented HTTP/1 header, so I'm
not sure how could it be a "hack".
The context in which you suggested it

Simply check the data source server-side and take a server-side
decision if it's the same or not. ... If the data is the same,
simply throw back
Status: 204 No Content\n\n
-- 11*********************@i39g2000cwa.googlegroups.com

is incorrect, as evidenced by what follows.


OK. Just a short preface: I did not suggest using 204 as a universal
"cache solution" for all situations. It was suggested in the context of
304 troubles.
Its application domain are the situations where you want to send a
request, but in response you are interested only if your request
processed properly or not (thus you are not interested in the content
unless it contains something new to show).
                             ^^^^^^^^^^^
To base the decision on whether there is something 'new to show' would
require knowledge about something 'old' - a cached resource. If this
information is communicated using If-Modified-Since, then there should
only be two types of response: 304, to indicate that the existing
variant can be served from the cache; or, whatever would result from a
normal GET request.


Absolutely right.
If anyone is asking If-Modified-Since, I'm a law-abiding citizen :-) :
"Please take your 304 or the content". The problem is (as demonstrated
earlier with a demo at
<http://www.nskom.com/external/tmp/http/cache.cgi>) that there is a
broad range of situations where If-Modified-Since is not sent even though
the "original intention of the request" - if I'm allowed to use such
wording - is to validate the stale cached version. And if If-Modified-Since
was not sent, there is no use in sending back 304 Not Modified. IE, say,
will simply interpret such a response as a broken page and
generate a blank page automatically.
As the OP has made no indication of responding with
anything other than 200 upon success, 204 is not reasonable. To use it
as a substitute for 304 would be, at best, a hack (if it worked as
intended).


The OP's problem (besides the bad temper) seems to be narrowed down to
exactly the situation described above: sending Not Modified when
If-Modified-Since was not sent. It seems this leads in Opera to the same
result as in IE: an autogenerated blank page.

May 10 '06 #58
..oO(VK)
Michael Fesser wrote:
When called with gmdate() it returns a "GMT" on my system.
OK, PHP is marked as the first to break the tradition :-)


;)
Hmm, actually I'm not sending an Expires header yet. Currently it's just
a max-age value in the Cache-Control header.


Oh, Cache-Control fine-tune modifiers... I keep promising myself to
study them and their support across UAs. Maybe this thread was a sign
to me? :-)


On one side there's HTTP 1.1, which offers a whole new bunch of
possibilities to speed up communications and save traffic for you and
your visitors. On the other side there are a lot of broken UAs like IE,
braindead proxies etc...

But personally I care more about the _working_ UAs than about the broken
ones.
Amazing: indeed in the entire Usenet there is not one group for HTTP
protocol discussions (alt.www doesn't count - just make one visit to
see why). I guess this topic is considered to be crystal clear to
anyone.


At least in the German Usenet we have <news:de.comm.software.webserver>.
Isn't there something similar for English speaking people?

Micha
May 10 '06 #59
hug wrote:
Jack <mr*********@nospam.jackpot.uk.net> wrote:
And what's this rubbish about Opera doing a "naked get" ?
Don't blame other teams for your bugs.
Please feel free to explain why Opera 6.05 when presented with a
not-modified response to an if-modified-since request renders a
blank page for a shtml file that includes only a single SSI
include.
Because the page hasn't been modified.


The page as cached from when it was last displayed should be shown.
I'm wondering what server configuration problems might cause this
bizarre happening.

You don't have SSI working.


Could be a configuration problem, yes; but SSI has been including
files on this server for over 3 years and I haven't touched anything
SSI-related lately.


I didn't say "configuration problem", I said you don't have SSI working.
If SSI used to work, then perhaps other SSI pages are working. Can you
find other SSI pages on the same server? Perhaps one that hasn't changed
recently?

If those are still working, and they really are coming from the same
webserver, then a quick butchers' at your SSI directive might be
instructive (we haven't seen it yet, AFAIK).
I haven't reached any definite conclusions except that what I am
seeing on the Opera screen is not what I expect to see.
That's not Opera's fault, judging by your earlier remarks. Opera
*still* has the SSI source cached; and your Apache server is
*still* serving raw SSI to it. Fix the server, clear out the Opera
cache, and restart your testing cycle.


So SSI is going to include text for one GET and not for another,
within seconds? Kind of hard to believe.

I don't know. All I have to go on is your remarks. What I do know is
that an SSI-enabled server won't send SSI directives to browsers.
Apache's SSI module has to be both installed and switched on before
it will do its thing with SSI directives. Read the Apache
documentation, and then direct any Apache questions you have to an
appropriate newsgroup. If you don't control the Apache server, then
have a word with the dude who does.


SSI includes are generally working fine.


OK, your SSI directive may be in error. Also, SSI can be enabled on a
per-directory basis; it may not be enabled with respect to your webspace.

Here's how to enable mod_include:
http://ucommdev.unl.edu/webdev/wiki/...ng_mod_include

Here's a page on how to turn on SSI for a specific directory, if the
server supports it but doesn't have it switched on for your webspace:
http://www.javascriptkit.com/howto/htaccess4.shtml

--
Jack.
May 10 '06 #60
VK

Michael Winter wrote:
On 10/05/2006 17:41, VK wrote:
I wonder if ciwah charter could be read in some way to squeeze
Content-Type / Cache-Control/ and the like into it? I guess not.


No, because the HTTP protocol isn't a markup issue (though it obviously
has implications). One of the c.i.w.misc, c.i.w.a.misc, and c.i.w.a.cgi
groups is probably more on topic.


<comp.infosystems.www.authoring.cgi> would be the most appropriate
place I guess for all joyful company in this thread - including myself
- for further discussion on the topic (if anyone lets us on the door
step - it's moderated).

May 10 '06 #61
hug
Michael Winter <m.******@blueyonder.co.uk> wrote:
In the last instance, I assume you mean

Cache-Control: no-cache


That's what I meant (but not what I typed).

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 10 '06 #62
hug
Jack <mr*********@nospam.jackpot.uk.net> wrote:
hug wrote:
"Alan J. Flavell" <fl*****@physics.gla.ac.uk> wrote:

In the event that they do unintentionally leak out, as seems to
have happened here, they take the form of an HTML comment, so the
browser is behaving correctly by displaying a "blank" page.
Interesting that the browser displays any text inserted before/after
the SSI include but not the included text.


OK, that's a novel meaning for 'blank'.


Originally the shtml file contained only the SSI include and the page
rendered after a not-modified response was blank. When text was
inserted before/after the SSI include, only that text was shown in the
case of a not-modified response. I still find this puzzling to say
the least.
Now: if you have SSI directives in the page on the server; and if those
directives are being delivered, *raw*, to your browser, then it follows
that they are not being expanded by the server.
It seems like that would be the case, but I haven't yet collected the
specific data necessary to conclude that it actually is the case.
There's possibly some
confusion about the nature of that server - perhaps it's Apache with
mod-ssi, or perhaps it's custom code. It's not a foregone conclusion
that the server is even capable of expanding SSI directives. But if the
server is supposed to be SSI-enabled, then (judging from what you've
said) it seems not to be doing that.


It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own type
of includes that are not mixed with SSI includes. In fact, the only
thing the SSI includes are used for is to include a PHP file.
Everybody is not a moron, some people just lack a few pieces of
information -- that's DATA, not understanding.


Most people don't have either the temperament or the intellectual rigour
for diagnosing problems in complex systems. That doesn't make them
morons; people like that sometimes make excellent salesmen, or Prime
Ministers.


The initial reason that I posted the question was that, as this was my
first time at supporting if-modified-since and I was getting strange
results, it seemed likely that I was not sending some additional
required header that I'd never heard of. Since the responses seem to
be pointing fingers at my supposedly nonexistent debugging skills, I
have to assume that none of the responders know of any additional
headers that are necessary. As for diagnosing problems in complex
systems, it has been my experience that sometimes certain race
conditions are essentially irreproducible and all you can do is close
every timing window you can find; everything else I've run into is
soluble though sometimes not easy.

Thanks Jack.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 10 '06 #63
hug
Michael Fesser <ne*****@gmx.de> wrote:
.oO(hug)
"Alan J. Flavell" <fl*****@physics.gla.ac.uk> wrote:
In the event that they do unintentionally leak out, as seems to have
happened here, they take the form of an HTML comment, so the browser
is behaving correctly by displaying a "blank" page.
Interesting that the browser displays any text inserted before/after
the SSI include but not the included text.


SSI is a server-side concept; the browser doesn't know anything about
it. All it gets is a document sent from the server, so the browser
just displays what it receives. If it doesn't show the SSI-included
content then it's _definitely_ a server problem.


I would assume that to be true, but since nothing beyond the
not-modified response is sent and the browser renders what appears not
to be the correct page, I am not ready to conclude anything.
Can you post a URL for a test case?


I could spend my time generating a testcase, or I could spend my time
actually fixing the problem. I will post results when I know what has
been going on. I still expect that the problem is in my code
somewhere but at the same time I am still not ready for conclusions.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 10 '06 #64
hug
Richard Gration <ri*****@zync.co.uk> wrote:
On Wed, 10 May 2006 09:26:56 -0600, hug wrote:
"Alan J. Flavell" <fl*****@physics.gla.ac.uk> wrote:
On Wed, 10 May 2006, hug wrote:
As you've already been told before, "SSI includes" have no business
being presented to a browser. They're meant to be resolved *by the
server*, before being served-out.


Yeah, I understand that.
In the event that they do unintentionally leak out, as seems to have
happened here, they take the form of an HTML comment, so the browser
is behaving correctly by displaying a "blank" page.


Interesting that the browser displays any text inserted before/after
the SSI include but not the included text.


I'm just responding to this one point: It is in no way interesting, it is
what one would expect. I'm restating what Alan said in more detail. Given
this HTML:

<html><head><title>no title</title></head>
<body><!--#include virtual="something.html" --></body>
</html>

every compliant browser in existence is required to show nothing at all
because the body is comprised of a single HTML comment. However, given
this HTML:

<html><head><title>no title</title></head>
<body>Hello<!--#include virtual="something.html" -->
World!
</body>
</html>

every compliant browser will display

Hello World!

Once you are in the situation where SSI directives are being pushed out to
the browser unprocessed they will not appear in your page. I guess that
this is exactly the reason why they have the syntax that they do.

Rich


The text initially consisted of this only:

<!--#include virtual="something.php" -->

Text was included before/after that to see what was going on.

I will post results when I have resolved the problem. Thank you.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 10 '06 #65
hug
Jack <mr*********@nospam.jackpot.uk.net> wrote:
hug wrote:
Jack <mr*********@nospam.jackpot.uk.net> wrote:
> And what's this rubbish about Opera doing a "naked get" ?
> Don't blame other teams for your bugs.
Please feel free to explain why Opera 6.05 when presented with a
not-modified response to an if-modified-since request renders a
blank page for a shtml file that includes only a single SSI
include.
Because the page hasn't been modified.
The page as cached from when it was last displayed should be shown.
I'm wondering what server configuration problems might cause this
bizarre happening.
You don't have SSI working.


Could be a configuration problem, yes; but SSI has been including
files on this server for over 3 years and I haven't touched anything
SSI-related lately.


I didn't say "configuration problem", I said you don't have SSI working.
If SSI used to work, then perhaps other SSI pages are working. Can you
find other SSI pages on the same server? Perhaps one that hasn't changed
recently?

If those are still working, and they really are coming from the same
webserver, then a quick butchers' at your SSI directive might be
instructive (we haven't seen it yet, AFAIK).


It seems like I have said this a dozen times already in this thread
but I will say it one last time.

The exact same page generated by the exact same file in the exact same
directory is being rendered differently depending on whether the GET
request is an if-modified-since. Without if-modified-since,
regardless of any cache-control headers that may or may not be
included in the GET, the page is rendered correctly. With
if-modified-since when only a not-modified response is returned, the
page renders without whatever text the SSI include caused to be
generated. We are talking about two requests in a timespan of
seconds.

Yet people think it strange that I consider it -possible- that it
-might- be a browser bug; I'm not by any means concluding that it is,
but it seems -possible- given the set of circumstances.
So SSI is going to include text for one GET and not for another,
within seconds? Kind of hard to believe.

I don't know. All I have to go on is your remarks. What I do know is
that an SSI-enabled server won't send SSI directives to browsers.


I know that it's not supposed to, I don't "know" for a fact that it
won't. I assume it won't for the present, but that is not the same as
knowing it to be a fact.
OK, your SSI directive may be in error. Also, SSI can be enabled on a
per-directory basis; it may not be enabled with respect to your webspace.


Same file, same directory, within seconds. ONLY difference that I
have found so far is that of if-modified-since.

I will post results when the issue has been resolved. Thanks.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 10 '06 #66
hug wrote:
Jack <mr*********@nospam.jackpot.uk.net> wrote:
Now: if you have SSI directives in the page on the server; and if
those directives are being delivered, *raw*, to your browser, then
it follows that they are not being expanded by the server.
It seems like that would be the case, but I haven't yet collected the
specific data necessary to conclude that it actually is the case.


What additional data do you think might serve to convince you, this way or
that? Well, some Apache documentation, I guess. [Dumb question, Jack]
There's possibly some confusion about the nature of that server -
perhaps it's Apache with mod-ssi, or perhaps it's custom code. It's
not a foregone conclusion that the server is even capable of
expanding SSI directives. But if the server is supposed to be
SSI-enabled, then (judging from what you've said) it seems not to
be doing that.


It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own
type of includes that are not mixed with SSI includes. In fact, the
only thing the SSI includes are used for is to include a PHP file.


That sounds like a very odd thing to do. Have you ever seen that
working? For it to work, the Apache SSI module would have to ask Apache
to load the PHP, and Apache would have to ask the PHP module to execute
the PHP, returning the results via Apache to the SSI module. I'm
extremely sceptical that this is in fact what would happen, for various
reasons.

--
Jack.
May 10 '06 #67
hug wrote:

The exact same page generated by the exact same file in the exact
same directory is being rendered differently depending on whether the
GET request is an if-modified-since. Without if-modified-since,
regardless of any cache-control headers that may or may not be
included in the GET, the page is rendered correctly.
"Correctly" - do you mean that the SSI is being executed, and that the
PHP which it is supposed to include is also being executed, and the
results of the PHP execution are being returned via the SSI include??? I
really *don't* think you've already said that.

Anyhow, if that's what's happening, have you tried clearing the Opera
cache yet (as I suggested several hours ago)?
With if-modified-since when only a not-modified response is returned,
the page renders without whatever text the SSI include caused to be
generated. We are talking about two requests in a timespan of
seconds.


Look, if Opera has a cached version of the page from xx/xx/xx, and asks
for a page that has been modified more recently than that, and the
server doesn't think it has one, then Opera won't show it to you.

Clear the Opera cache, get the page again, then check that it's in the
cache (and notice that it doesn't have any SSI directives in it). I'm
sure it's not helping you that you have had a version of the page in
Opera's cache all this time that contains SSI directives and nothing else.

--
Jack.
May 10 '06 #68
On Wed, 10 May 2006, Michael Fesser wrote:
At least in the German Usenet we have <news:de.comm.software.webserver>.
gewiss...
Isn't there something similar for English speaking people?


Closest match here would be the comp.infosystems.www.servers.*
hierarchy. However, they're a bit awkwardly divided by OS. In
theory, I suppose cross-OS discussions *ought* to go into the
corresponding .misc group, but - as you may be able to verify for
yourself - that .misc group is practically empty (googroups for
example records 15 postings in the last half-year, and the news server
that I use is showing a similar story). So the theory surely broke
down there!

In practice it seems as if most of the discussion of HTTP interworking
gravitates to the comp.infosystems.www.authoring.cgi group, even
though that isn't entirely logical, seeing that many HTTP interworking
issues don't exactly involve CGI at all. (Of course that's an
auto-moderated group - see its postings for a pointer to details).

hope that makes some kind of sense. Don't blame me (I didn't vote for
it), I'm just trying to review how it is.
May 10 '06 #69
hug
Jack <mr*********@nospam.jackpot.uk.net> wrote:
hug wrote:
Jack <mr*********@nospam.jackpot.uk.net> wrote:
Now: if you have SSI directives in the page on the server; and if
those directives are being delivered, *raw*, to your browser, then
it follows that they are not being expanded by the server.


It seems like that would be the case, but I haven't yet collected the
specific data necessary to conclude that it actually is the case.


What additional data do you think might serve to convince you, this way or
that? Well, some Apache documentation, I guess. [Dumb question, Jack]


A protocol trace that gives sufficient detail and can be relied upon
might help. I'll figure it out one way or the other. I forgot to
download FireFox this morning, I want to check its live-http stuff out
and see how useful it is.
There's possibly some confusion about the nature of that server -
perhaps it's Apache with mod-ssi, or perhaps it's custom code. It's
not a foregone conclusion that the server is even capable of
expanding SSI directives. But if the server is supposed to be
SSI-enabled, then (judging from what you've said) it seems not to
be doing that.


It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own
type of includes that are not mixed with SSI includes. In fact, the
only thing the SSI includes are used for is to include a PHP file.


That sounds like a very odd thing to do. Have you ever seen that
working? For it to work, the Apache SSI module would have to ask Apache
to load the PHP, and Apache would have to ask the PHP module to execute
the PHP, returning the results via Apache to the SSI module. I'm
extremely sceptical that this is in fact what would happen, for various
reasons.


Don't know what to tell you Jack, it's been working in production for
a couple years now.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #70
hug
Jack <mr*********@nospam.jackpot.uk.net> wrote:
hug wrote:

The exact same page generated by the exact same file in the exact
same directory is being rendered differently depending on whether the
GET request is an if-modified-since. Without if-modified-since,
regardless of any cache-control headers that may or may not be
included in the GET, the page is rendered correctly.
"Correctly" - do you mean that the SSI is being executed, and that the
PHP which it is supposed to include is also being executed, and the
results of the PHP execution are being returned via the SSI include???


I don't know whether the results of the PHP execution are passing
through SSI on the way back, or going directly through apache, but the
results of the PHP execution are being rendered one way or another. I
find it difficult to imagine that the SSI developers never encountered
an include that was in fact a PHP file that would in turn be executed
via apache, but lots of things are possible.
I
really *don't* think you've already said that.

Oh. I'll take your word for it. I've been a lot more concerned with
other things (like finding time to shoot this bug) than with keeping
track of this thread, since it seems to have turned into more of a
"roast hug" thread than a "help hug" thread.

Anyhow, if that's what's happening, have you tried clearing the Opera
cache yet (as I suggested several hours ago)?
I think the only way to clear the cache on 6.05 is to exit and
restart, but yes, I've done that.
With if-modified-since when only a not-modified response is returned,
the page renders without whatever text the SSI include caused to be
generated. We are talking about two requests in a timespan of
seconds.


Look, if Opera has a cached version of the page from xx/xx/xx, and asks
for a page that has been modified more recently than that, and the
server doesn't think it has one, then Opera won't show it to you.


Sorry Jack, the meaning of that one didn't quite come through.
Clear the Opera cache, get the page again, then check that it's in the
cache (and notice that it doesn't have any SSI directives in it). I'm
sure it's not helping you that you have had a version of the page in
Opera's cache all this time that contains SSI directives and nothing else.


I've cleared the cache and not cleared the cache dozens of times.
First time through the page is retrieved and rendered correctly (of
course first time through is not an if-modified-since). Second time
when it does send if-modified-since and the server responds
not-modified it renders none of the SSI included stuff.

I'll figure it out by and by. Thanks Jack.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #71
..oO(hug)
It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own type
of includes that are not mixed with SSI includes. In fact, the only
thing the SSI includes are used for is to include a PHP file.


Why do you use SSI at all if you have PHP available?

Micha
May 11 '06 #72
hug
Michael Fesser <ne*****@gmx.de> wrote:
.oO(hug)
It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own type
of includes that are not mixed with SSI includes. In fact, the only
thing the SSI includes are used for is to include a PHP file.


Why do you use SSI at all if you have PHP available?

Micha


When I initially built the site about 4 years ago, things were not the
same as they are now. I've been leaving the shtml files around for
robots mostly. It needs to be revisited once I get a couple bugs
dealt with and put the code into production.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #73
On Sun, 07 May 2006 13:02:05 -0600, hug wrote:
[originally posted in alt.www.webmaster, was suggested that this ng
could be a better place.]

I've updated my test server to handle if-modified-since. I've noticed
that the (old copies I run of) IE and Netscape seem never to send
if-modified-since. But the strange thing is that Opera sends
if-modified-since but when I reply with "HTTP/1.0 304 Not Modified" it
is not refreshing the screen from its cache, it is leaving the screen
blank.

I can only conclude that either I am not returning a correct protocol
sequence including "HTTP/1.0 304 Not Modified", or that the old Opera
I'm running contains a bug. I'm betting on the incorrect response in
my code.

Anybody have experience with handling if-modified-since themselves and
doing it properly?


I have a suggestion of how you could experiment and maybe pin down the
problem. It involves installing (maybe) one application and writing a
smallish test script:

1. [Install and] Use ethereal (http://www.ethereal.com) to sniff the
request and response headers in the 2 cases you appear to be testing: With
and without If-modified-since header requests for the URL which is giving
you problems.

2. Write a perl script which mimics both requests. This sounds difficult
but is actually really easy if you use LWP::Simple. This module provides a
rich set of objects which can be used to construct a 10 line script which
can request URLs from your webserver, and you'll be able to
include/exclude any headers you wish. If you don't have access to a
machine with Perl installed then ActiveState perl installs very easily on
winboxes.

Here is a sample script to get you started:

#!/usr/bin/perl

use strict;
use warnings;

use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
# My Opera, change to what you need
$ua->agent('Opera/7.54 (X11; Linux i686; U) [en]');

my $req = HTTP::Request->new(GET => 'http://www.google.com/search?hl=en&q=smeg&btnG=Google+Search&meta=');

# Add some custom headers
#$req->header(Work=>'you bastard');

my $res = $ua->request($req);

# Inspect the headers ...
# print $res->headers_as_string;
# ... or the whole thing
print $res->as_string;

if ($res->is_success()) {
    # do something
} else {
    # do something else
}

You can find all the info you need in the perldocs for HTTP::Request,
HTTP::Response, HTTP::Headers, HTTP::Message, LWP::UserAgent. The perldoc
that draws all this together is LWP.
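
For readers without Perl to hand, the same two-request experiment can be sketched with Python's standard library. This is only an illustration of the idea above, not part of the original advice: the function names `build_request` and `send` are made up for this example, and the default User-Agent string is simply copied from the Perl sample.

```python
import urllib.error
import urllib.request


def build_request(url, if_modified_since=None,
                  user_agent="Opera/7.54 (X11; Linux i686; U) [en]"):
    """Build a GET request, optionally carrying an If-Modified-Since header."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    if if_modified_since is not None:
        req.add_header("If-Modified-Since", if_modified_since)
    return req


def send(req, timeout=10):
    """Send the request and return (status, headers, body).

    urllib raises HTTPError for a 304 response, so it is caught here and
    returned like any other response to make the two cases easy to compare.
    """
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, dict(resp.headers.items()), resp.read()
    except urllib.error.HTTPError as err:
        return err.code, dict(err.headers.items()), err.read()
```

Calling `send(build_request(url))` and then `send(build_request(url, "Thu, 11 May 2006 15:18:23 GMT"))` against the problem URL gives the status line, headers and body for both cases side by side.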

This may seem like a lot of effort but probably less than the effort of
maintaining your composure in this thread ;-) and will also be useful for
future development. If you write webapps a lot then ethereal is something
you should get to know, ditto some scripted www client if you are
manipulating headers.

HTH
Rich
May 11 '06 #74
On Thu, 11 May 2006 12:24:36 +0100, Richard Gration wrote:
2. Write a perl script which mimics both requests. This sounds difficult
but is actually really easy if you use LWP::Simple. This module provides a


Sorry, Bundle::LWP is what you need
May 11 '06 #75
On Wed, 10 May 2006 21:52:11 +0100, Jack wrote:
hug wrote:
Jack <mr*********@nospam.jackpot.uk.net> wrote:
It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own
type of includes that are not mixed with SSI includes. In fact, the
only thing the SSI includes are used for is to include a PHP file.


That sounds like a very odd thing to do. Have you ever seen that
working? For it to work, the Apache SSI module would have to ask Apache
to load the PHP, and Apache would have to ask the PHP module to execute
the PHP, returning the results via Apache to the SSI module. I'm
extremely sceptical that this is in fact what would happen, for various
reasons.


A thumb suck tells me this would work. I don't know this for sure, but I
would expect Apache to internally create a sub-request to get the content
for the SSI, which means that if the Apache can handle PHP normally then
it would work for the SSI. What you can't do in Apache 1.x is chain
handlers, so SSI cannot be used in output from CGIs

Rich
May 11 '06 #76
hug
Thanks to everyone (including snarly-Andy) for their help, attempts to
help, heckling, et cetera.

The problem has been resolved to my satisfaction for now. I have not
yet put the code in production since there are unrelated issues that
need to be addressed before the time necessary for a full
regression-test is spent.

Although there were a number of bugs in my code that were causing
not-modified to be used at inappropriate times, the primary issue
turned out to be what looks to me like it -might- be some caching bug
in Opera 6.05, though it's not clear to me how that could come about.
I've not seen it happen with any other browsers, but I am running
ancient browsers for a number of what I consider to be good reasons.
One of these days I'll get around to downloading some more modern
stuff, but time seems always short.

The problem is difficult to describe. Opera 6.05 sends three
different types of requests for different situations.

1. New page, no unusual headers sent.
2. Cached page via link, if-modified-since sent.
3. Cached page via hard reload, cache-control: no-cache sent.
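
The three request types listed above can be told apart on the server purely from the request headers. A rough sketch, assuming the headers arrive as a plain dictionary; the function name and the three labels are invented for this illustration:

```python
def classify_request(headers):
    """Classify a request as 'new', 'revalidate' or 'hard-reload'.

    headers is a dict of request header names to values; names are
    compared case-insensitively because HTTP header names are.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if "no-cache" in lowered.get("cache-control", "").lower():
        return "hard-reload"   # case 3: user forced a reload
    if lowered.get("if-modified-since"):
        return "revalidate"    # case 2: cached copy being checked
    return "new"               # case 1: nothing unusual sent
```

Only the "revalidate" case is a candidate for a not-modified response; the other two always need the full page.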

The problem occurs in case 2. If a not-modified is sent in response,
the page is incorrectly rendered. The file that causes this is a file
that contains an SSI include of a PHP file. In the case of
not-modified being sent by the server, the page is rendered
*exclusive*of* the results of the SSI include of the PHP file. I
cannot understand how that can happen, since Opera should be
displaying directly from its cache which should contain the entire
page as previously rendered.

I have resolved the problem in my server code by recognizing calls to
my PHP code that originate in an html file that contains an SSI
directive. In that case I -never- send a not-modified response. I
could have parsed out the SSI directives and determined a correct
last-modified time, but it is an unusual case and I don't want to
become dependent on the format and number of SSI directives since
those may be enhanced or modified at some time in the future.
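
For comparison, the alternative mentioned above (working out a last-modified time that takes the includes into account) might look roughly like the following. It is a sketch under several assumptions: the includes are static files, the `virtual` paths are relative to the page's own directory, and only one level of inclusion is followed. For an included script whose output changes from request to request, a file timestamp says nothing useful, which is an argument for the "never send not-modified" approach. All function names here are hypothetical.

```python
import os
import re
from email.utils import formatdate, parsedate_to_datetime

INCLUDE_RE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def effective_mtime(page_path):
    """Newest modification time among the page and the files it includes."""
    base = os.path.dirname(page_path)
    with open(page_path, encoding="utf-8", errors="replace") as handle:
        text = handle.read()
    paths = [page_path]
    paths += [os.path.join(base, rel) for rel in INCLUDE_RE.findall(text)]
    return max(os.path.getmtime(p) for p in paths if os.path.exists(p))


def http_date(mtime):
    """Format a timestamp the way a Last-Modified header expects."""
    return formatdate(mtime, usegmt=True)


def not_modified(if_modified_since, mtime):
    """True when the client's date is at least as new as mtime (whole seconds)."""
    try:
        client_time = parsedate_to_datetime(if_modified_since).timestamp()
    except (TypeError, ValueError):
        return False  # unparseable date: safest to send the full page
    return int(mtime) <= int(client_time)
```

The server would send `http_date(effective_mtime(page))` as Last-Modified with every full response, and answer 304 only when `not_modified()` returns true for the date the client sends back.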

I have placed minimal test files in the publicly-accessible part of my
server. I will post their contents here and then provide a link
afterward. The funkiness could be occurring for any number of
reasons, including some SSI configuration issue that I am unaware of,
though I do not see how SSI could be at fault since the page is
properly rendered in all cases except when a not-modified response is
returned to the client. (Note that the following PHP file always
responds to if-modified-since with not-modified.)

When Opera 6.05 sends an if-modified-since and receives a not-modified
response, it should either redisplay the page from its cache as last
rendered, or perform a fresh GET, and it appears that neither is
happening. I assume, at least until I get around to downloading a
current copy of Opera, that this is a problem that does not exist in
current code (if in fact it is an Opera problem at all).

If anyone comes to understand what is really happening, I would be
interested to learn from this little exercise.

==== the shtml file follows ====
Text before SSI include.

<!--#include virtual="cache.php" -->

Text after SSI include.

<a href="http://www.ren-prod-inc.com/temp/cache.shtml">Link to this
file</a>
==== end of shtml file ====

==== the PHP file follows ====
<?php

$hdrs = getallheaders();
$ims = $hdrs["If-Modified-Since"];
if (strlen($ims) > 0)
{
    header("HTTP/1.0 304 Not Modified");
}
else
{
    echo "<p>If-Modified-Since was not included, headers follow:<br>";
    $keys = array_keys($hdrs);
    for ($i = 0; $i < count($keys); $i++)
    {
        $name = $keys[$i];
        $value = $hdrs[$name];
        echo "header: $name, value: $value<br>";
    }
    echo "-end of headers-<p>";
}

?>
==== end of PHP file ====

Link: http://www.ren-prod-inc.com/temp/cache.shtml

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #77
hug
Richard Gration <ri*****@zync.co.uk> wrote:

<snipped>

Richard, the problem has been resolved, please see my latest post
(response to original post) in this thread for details.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #78
hug
Richard Gration <ri*****@zync.co.uk> wrote:
On Wed, 10 May 2006 21:52:11 +0100, Jack wrote:
hug wrote:
Jack <mr*********@nospam.jackpot.uk.net> wrote:
It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own
type of includes that are not mixed with SSI includes. In fact, the
only thing the SSI includes are used for is to include a PHP file.


That sounds like a very odd thing to do. Have you ever seen that
working? For it to work, the Apache SSI module would have to ask Apache
to load the PHP, and Apache would have to ask the PHP module to execute
the PHP, returning the results via Apache to the SSI module. I'm
extremely sceptical that this is in fact what would happen, for various
reasons.


A thumb suck tells me this would work. I don't know this for sure, but I
would expect Apache to internally create a sub-request to get the content
for the SSI, which means that if the Apache can handle PHP normally then
it would work for the SSI. What you can't do in Apache 1.x is chain
handlers, so SSI cannot be used in output from CGIs

Rich


I've been assuming that for Apache to run PHP files at all, it's
installing a temporary file-system hook of some kind. That may be
incorrect, but it seems like a more reasonable approach than for
Apache to parse the entire datastream and try to sort it all out. IF
the assumption of a temporary file-system hook is correct, it
absolutely should work because in that case Apache would be
determining the "contents" of PHP files.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #79
On Thu, 11 May 2006 08:08:28 -0600, hug wrote:
Thanks to everyone (including snarly-Andy) for their help, attempts to
help, heckling, and cetera.

The problem has been resolved to my satisfaction for now. I have not
yet put the code in production since there are unrelated issues that
need to be addressed before the time necessary for a full
regression-test is spent.


The problem is not resolved to my satisfaction. The problem *is*
server-side, not client side. I have taken my own advice and used my
script to test the link you posted
(http://www.ren-prod-inc.com/temp/cache.shtml). The results are below. I
shall summarise what I see as the problem. When the If-Modified-Since
header is present, your PHP script is returning 304 Not modified TO THE
WEB SERVER WHICH IS TRYING TO INCLUDE ITS OUTPUT IN A PAGE, NOT THE CLIENT
REQUESTING THE URL. Regardless of the response code, the body is empty
and that empty body is being included in the page in place of the SSI
directive. The "caching issue" is that Apache does not cache previous
output of your PHP script for inclusion in subsequent requests, quite
reasonably. The clients in this situation only ever see a 200 response
from the server. Your misconception causing all this pain is that the 304
returned from your PHP script could ever make it to the browser, it is
only ever seen by Apache when (probably executing a sub-request when)
resolving the SSI include.
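
The mechanism described above can be modelled in a few lines. This is a toy simulation, not Apache's code: `included_script` stands in for cache.php, `expand_ssi` stands in for the SSI module, and the rule "anything below 400 counts as success" is an assumption chosen to match the observed behaviour.

```python
import re

INCLUDE_RE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')
SSI_ERROR = "[an error occurred while processing this directive]"


def included_script(request_headers):
    """Stand-in for cache.php: an empty 304 when If-Modified-Since is present."""
    if request_headers.get("If-Modified-Since"):
        return 304, ""
    return 200, "<p>If-Modified-Since was not included</p>"


def expand_ssi(page_text, request_headers, handlers):
    """Replace each include directive with the body its handler produces.

    The handler's status is only used to tell success from failure; it is
    never passed on. The page itself always goes out as 200.
    """
    def replace(match):
        status, body = handlers[match.group(1)](request_headers)
        return body if status < 400 else SSI_ERROR

    return 200, INCLUDE_RE.sub(replace, page_text)
```

Running the page through `expand_ssi` with an If-Modified-Since header yields status 200 and a page with a hole where the include was, which matches the second trace below.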

I have spent less than 60 minutes testing this. I have used
perl because I can, but this diagnosis could have been performed using
ethereal alone, a strategy suggested to you quite early in this thread.
Also, seeing a test case allows the problem to be seen quite easily, also
suggested quite early in the thread.

As a final note, Mozilla does not send an If-Modified-Since header for any
of the variations of requesting a page I cursorily tried and that is why
the problem is not seen with Mozilla. I guess the same goes for other
browsers which don't exhibit the problem.

HTH
Rich

script 1 (bare bones, not serious code): No "If-Modified-Since"
====script====
#!/usr/bin/perl

use LWP::UserAgent;
use HTTP::Request;

my $ua = new LWP::UserAgent;
$ua->agent('Opera/7.54 (X11; Linux i686; U) [en]');

my $url = 'http://www.ren-prod-inc.com/temp/cache.shtml';
my $req = HTTP::Request->new(GET=>$url);

print "Request headers\n--------------\n";
print $req->headers_as_string,"\n";
print "--------------\n\n";
my $res = $ua->request($req);

print "Response\n------------\n",$res->as_string,"\n";
====/script====
====output====
Request headers
--------------

--------------

Response
------------
HTTP/1.1 200 OK
Connection: close
Date: Thu, 11 May 2006 15:31:10 GMT
Server: Apache/1.3.20 Sun Cobalt (Unix) mod_ssl/2.8.4 OpenSSL/0.9.6b
PHP/4.0.6 mod_auth_pam_external/0.1 FrontPage/4.0.4.3 mod_perl/1.25
Content-Type: text/html
Client-Date: Thu, 11 May 2006 15:31:11 GMT
Client-Peer: 66.29.129.211:80
Client-Response-Num: 1
Client-Transfer-Encoding: chunked

Text before SSI include.

<p>If-Modified-Since was not included, headers follow:<br>header:
Connection, value: TE, close<br>header: Host, value:
www.ren-prod-inc.com<br>header: TE, value: deflate,gzip;q=0.3<br>header:
User-Agent, value: Opera/7.54 (X11; Linux i686; U) [en]<br>-end of
headers-<p>

Text after SSI include.

<a href="http://www.ren-prod-inc.com/temp/cache.shtml">Link to this file</a>

====/output====
script 2: With "If-Modified-Since"
====script====
#!/usr/bin/perl

use LWP::UserAgent;
use HTTP::Request;

my $ua = new LWP::UserAgent;
$ua->agent('Opera/7.54 (X11; Linux i686; U) [en]');

my $url = 'http://www.ren-prod-inc.com/temp/cache.shtml';
my $req = HTTP::Request->new(GET=>$url);

$req->header('If-Modified-Since'=>'Thu, 11 May 2006 15:18:23 GMT');

print "Request headers\n--------------\n";
print $req->headers_as_string,"\n";
print "--------------\n\n";
my $res = $ua->request($req);

print "Response\n------------\n",$res->as_string,"\n";
====/script====
====output====
Request headers
--------------
If-Modified-Since: Thu, 11 May 2006 15:18:23 GMT

--------------

Response
------------
HTTP/1.1 200 OK
Connection: close
Date: Thu, 11 May 2006 15:34:42 GMT
Server: Apache/1.3.20 Sun Cobalt (Unix) mod_ssl/2.8.4 OpenSSL/0.9.6b
PHP/4.0.6 mod_auth_pam_external/0.1 FrontPage/4.0.4.3 mod_perl/1.25
Content-Type: text/html
Client-Date: Thu, 11 May 2006 15:34:43 GMT
Client-Peer: 66.29.129.211:80
Client-Response-Num: 1
Client-Transfer-Encoding: chunked

Text before SSI include.

Text after SSI include.

<a href="http://www.ren-prod-inc.com/temp/cache.shtml">Link to this file</a>
====/output====
May 11 '06 #80
On Thu, 11 May 2006 08:12:43 -0600, hug wrote:
Richard Gration <ri*****@zync.co.uk> wrote:
On Wed, 10 May 2006 21:52:11 +0100, Jack wrote:
hug wrote:
Jack <mr*********@nospam.jackpot.uk.net> wrote:
It's apache with mod_ssi or whatever the mod is called that runs SSI.
In addition, the PHP code running on the server supports its own
type of includes that are not mixed with SSI includes. In fact, the
only thing the SSI includes are used for is to include a PHP file.

That sounds like a very odd thing to do. Have you ever seen that
working? For it to work, the Apache SSI module would have to ask Apache
to load the PHP, and Apache would have to ask the PHP module to execute
the PHP, returning the results via Apache to the SSI module. I'm
extremely sceptical that this is in fact what would happen, for various
reasons.


A thumb suck tells me this would work. I don't know this for sure, but I
would expect Apache to internally create a sub-request to get the content
for the SSI, which means that if the Apache can handle PHP normally then
it would work for the SSI. What you can't do in Apache 1.x is chain
handlers, so SSI cannot be used in output from CGIs

Rich


I've been assuming that for Apache to run PHP files at all, it's
installing a temporary file-system hook of some kind. That may be
incorrect, but it seems like a more reasonable approach than for
Apache to parse the entire datastream and try to sort it all out. IF
the assumption of a temporary file-system hook is correct, it
absolutely should work because in that case Apache would be
determining the "contents" of PHP files.


The process whereby apache resolves URLs to resources is *very*
complicated. What happens here is that the .php file is resolved to
a physical (filesystem) resource and then apache reacts to the php
extension. In the apache config file is a mapping which causes files which
end in .php to be handed off to mod_php for processing. The php language
has its methods for indicating the status of requests that it handles to
mod_php which then communicates them to apache when it gives the response
(output) back to apache to give to the client (a bit of a clumsy
explanation - the line between mod_php and apache is not so distinct, if
it even exists once mod_php is compiled in). This is why a sub-request
would be created by apache to handle SSI includes. Then anything that
apache could serve directly could in theory be included in an .shtml page,
but in practice the content type of the included thingy must be compatible
with the content-type of the page otherwise you could be in the situation
where binary gif data is placed into a document with type text/html.
Probably not what you want.

Rich

May 11 '06 #81
On 11/05/2006 15:08, hug wrote:

[snip]
If anyone comes to understand what is really happening, I would be
interested to learn from this little exercise.


The status code that you try to send in the response (via PHP) is never
used. A look at your server logs should have always shown 200 for the
response.

The problem has nothing to do with Opera 6.05, specifically (Opera 8.53
also produced the same result). What is unique about Opera, though, is
that it sends an If-Modified-Since date when others (I briefly checked with
IE 6 and Fx 1.5.0.3) don't. It would seem that Opera chooses to use the
Date header value in subsequent requests as there's nothing better with
which it can validate.
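
The fallback guessed at above could be pictured like this. It is only a model of the hypothesis, not documented Opera behaviour, and the function name is invented for the illustration:

```python
def validator_date(response_headers):
    """Pick the date a client might send back as If-Modified-Since.

    Prefers Last-Modified; falls back to Date when the response carried
    no Last-Modified header. Returns None if neither is present.
    """
    lowered = {name.lower(): value for name, value in response_headers.items()}
    return lowered.get("last-modified") or lowered.get("date")
```

The responses traced earlier in the thread carry a Date header but no Last-Modified, so under this model the Date value is what comes back on the next request.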

I should note that to get this behaviour to occur, I had to force Opera
to revalidate on every request. If I left my settings to the default, it
would just serve the same document from the cache for the next five
hours or so.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 11 '06 #82
hug
Richard Gration <ri*****@zync.co.uk> wrote:
On Thu, 11 May 2006 08:08:28 -0600, hug wrote:
Thanks to everyone (including snarly-Andy) for their help, attempts to
help, heckling, and cetera.

The problem has been resolved to my satisfaction for now. I have not
yet put the code in production since there are unrelated issues that
need to be addressed before the time necessary for a full
regression-test is spent.
The problem is not resolved to my satisfaction.


Note that I said "for now" and stated that I still did not understand
what was going on.
The problem *is*
server-side, not client side. I have taken my own advice and used my
script to test the link you posted
(http://www.ren-prod-inc.com/temp/cache.shtml). The results are below. I
shall summarise what I see as the problem. When the If-Modified-Since
header is present, your PHP script is returning 304 Not modified TO THE
WEB SERVER WHICH IS TRYING TO INCLUDE ITS OUTPUT IN A PAGE, NOT THE CLIENT
REQUESTING THE URL. Regardless of the response code, the body is empty
and that empty body is being included in the page in place of the SSI
directive. The "caching issue" is that Apache does not cache previous
output of your PHP script for inclusion in subsequent requests, quite
reasonably. The clients in this situation only ever see a 200 response
from the server. Your misconception causing all this pain is that the 304
returned from your PHP script could ever make it to the browser, it is
only ever seen by Apache when (probably executing a sub-request when)
resolving the SSI include.
Clearly there is something basic here that I don't understand. Are
all header() directives issued by PHP basically ignored? What happens
to them? I'm lost here.
I have spent less than 60 minutes testing this. I have used
perl because I can, but this diagnosis could have been performed using
ethereal alone, a strategy suggested to you quite early in this thread.
Also, seeing a test case allows the problem to be seen quite easily, also
suggested quite early in the thread.
Still trying to kick my ass, huh? Let's move past that.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #83
On Thu, 11 May 2006 10:45:30 -0600, hug wrote:
Richard Gration <ri*****@zync.co.uk> wrote:
The problem *is*
server-side, not client side. I have taken my own advice and used my
script to test the link you posted
(http://www.ren-prod-inc.com/temp/cache.shtml). The results are below. I
shall summarise what I see as the problem. When the If-Modified-Since
header is present, your PHP script is returning 304 Not modified TO THE
WEB SERVER WHICH IS TRYING TO INCLUDE ITS OUTPUT IN A PAGE, NOT THE CLIENT
REQUESTING THE URL. Regardless of the response code, the body is empty
and that empty body is being included in the page in place of the SSI
directive. The "caching issue" is that Apache does not cache previous
output of your PHP script for inclusion in subsequent requests, quite
reasonably. The clients in this situation only ever see a 200 response
from the server. The misconception causing all this pain is that the 304
returned from your PHP script could ever make it to the browser. It is
only ever seen by Apache, which is probably executing a sub-request to
resolve the SSI include.
Clearly there is something basic here that I don't understand. Are
all header() directives issued by PHP basically ignored? What happens
to them? I'm lost here.


Ok, the basic point is that in your situation the server is executing the
php file in order to include its output in another page (case 1). It isn't
executing it in order to send the output to a client (case 2). The
difference here is that in case 2 any headers which are set in the script
are passed to the client intact. (I think this is as true for mod_php as
it is for mod_perl - header manipulation is done by modifying the
response structure directly.) In case 1 however, the server is executing
the php for its own ends (to get the output) and is probably only
interested in whether the response code represents success or failure. If
the response code is a success then the body (script output) is put in the
shtml file it is parsing in place of the SSI directive. If the response
code is a failure then the SSI directive is *not* replaced by the output
of the script, but rather by an error message or possibly nothing, it's
server dependent I guess. 200 (OK) is a successful response code, 500
(internal server error) and 404 (page not found) are not. It appears that
in this case your 304 response code is treated as a success and the body
of the output of your script (which is empty) replaces the SSI directive
in the page.
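
The two cases above can be sketched in a few lines of Python. This is only a toy model of the behaviour described, not Apache's actual include handling: the function names, the `cache.php` target, the error text and the "below 400 counts as success" rule are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for the included PHP script: it answers a
# conditional request with 304 and an empty body.
def run_included_script(request_headers):
    if "If-Modified-Since" in request_headers:
        return 304, ""
    return 200, "Output of the included script."

# Matches an SSI include directive in the page being parsed.
INCLUDE = re.compile(r'<!--#include virtual="[^"]*" -->')

# Hypothetical model of the server resolving the directive (case 1).
# Assumption: any inner status below 400 is treated as success.
def serve_shtml(template, request_headers):
    def resolve(_match):
        status, body = run_included_script(request_headers)
        if status >= 400:
            return "[an error occurred while processing this directive]"
        return body  # a 304 ends up here, with an empty body
    # The outer response is always 200; the inner status never leaves the server.
    return 200, INCLUDE.sub(resolve, template)

page = ('Text before SSI include.\n'
        '<!--#include virtual="cache.php" -->\n'
        'Text after SSI include.\n')

print(serve_shtml(page, {}))
print(serve_shtml(page, {"If-Modified-Since": "Thu, 11 May 2006 15:18:23 GMT"}))
```

With the conditional header present, the model returns 200 with a hole where the included output should be, which matches the response captured by the script.
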
Still trying to kick my ass, huh? Let's move past that.


Ok, but please take on board that what has allowed me to see your problem
is you creating a minimal test case and posting the URL :-)

Rich
May 11 '06 #84
Richard Gration wrote:
When the If-Modified-Since header is present, your PHP script is
returning 304 Not modified TO THE WEB SERVER WHICH IS TRYING TO
INCLUDE ITS OUTPUT IN A PAGE, NOT THE CLIENT REQUESTING THE URL.
Regardless of the response code, the body is empty and that empty
body is being included in the page in place of the SSI directive. The
"caching issue" is that Apache does not cache previous output of your
PHP script for inclusion in subsequent requests, quite reasonably.
The clients in this situation only ever see a 200 response from the
server. The misconception causing all this pain is that the 304
returned from your PHP script could ever make it to the browser. It
is only ever seen by Apache, which is probably executing a sub-request
to resolve the SSI include.
Well done - I think that's a very plausible explanation of the state of
affairs, and it's quite an achievement to have got there on the strength
of not much more than a URL.
As a final note, Mozilla does not send an If-Modified-Since header
for any of the variations of requesting a page I cursorily tried and
that is why the problem is not seen with Mozilla. I guess the same
goes for other browsers which don't exhibit the problem.


Neat. Something to (try to) remember.

--
Jack.
May 11 '06 #85
hug
Richard Gration <ri*****@zync.co.uk> wrote:
On Thu, 11 May 2006 10:45:30 -0600, hug wrote:

<snip>
Clearly there is something basic here that I don't understand. Are
all header() directives issued by PHP basically ignored? What happens
to them? I'm lost here.


Ok, the basic point is that in your situation the server is executing the
php file in order to include its output in another page (case 1). It isn't
executing it in order to send the output to a client (case 2). The
difference here is that in case 2 any headers which are set in the script
are passed to the client intact. (I think this is as true for mod_php as
it is for mod_perl - header manipulation is done by modifying the
response structure directly.) In case 1 however, the server is executing
the php for its own ends (to get the output) and is probably only
interested in whether the response code represents success or failure. If
the response code is a success then the body (script output) is put in the
shtml file it is parsing in place of the SSI directive. If the response
code is a failure then the SSI directive is *not* replaced by the output
of the script, but rather by an error message or possibly nothing, it's
server dependent I guess. 200 (OK) is a successful response code, 500
(internal server error) and 404 (page not found) are not. It appears that
in this case your 304 response code is treated as a success and the body
of the output of your script (which is empty) replaces the SSI directive
in the page.


That makes sense. All the other cases, where it was working solidly,
were urls that pointed directly to the php file not to an shtml file.
Still trying to kick my ass, huh? Let's move past that.


Ok, but please take on board that what has allowed me to see your problem
is you creating a minimal test case and posting the URL :-)

Rich


I think the main thing that has allowed you to see the problem where I
couldn't is more the fact that you have adequate instrumentation where
I had nothing more than echo statements in code to work with. If I'd
been able to see that what the client was receiving was a 200 instead
of a 304 it would have made a lot more sense early-on. Ethereal is
downloading as I type... slowly, but it's downloading. I put off even
looking at it as long as I could find another way to attack the
problem because I expected it to be payware, no money for much of
anything, if I'd just looked it would have saved me time but I was
sure of what I'd see and didn't want to see that, too painful. So it
goes, lots of new stuff to learn in a new environment, seldom enough
time to dig into it... that makes me almost yearn for the days when I
sat in a cubicle and could take as long as it took. Oh poor me lol,
the only thing I miss about the cubicle is the paycheck.

So ethereal is downloading and I'll have a new toy to play with if I
can find time to play with it. Now I need to get rid of the shtml
stuff since it really isn't providing any value. I'm not sure quite what
the best way of doing that is; maybe it's time for me to explore some of
those mod_rewrite thingies, we'll see.

Thanks again Rich.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #86
On Thu, 11 May 2006, Richard Gration wrote:
The problem is not resolved to my satisfaction.
Well said...
The problem *is* server-side, not client side. I have taken my own
advice and used my script to test the link you posted
(http://www.ren-prod-inc.com/temp/cache.shtml). The results are
below. I shall summarise what I see as the problem. When the
If-Modified-Since header is present, your PHP script is returning
304 Not modified TO THE WEB SERVER WHICH IS TRYING TO INCLUDE ITS
OUTPUT IN A PAGE, NOT THE CLIENT REQUESTING THE URL.
I admire your patience, in the face of aggressive cluelessness, and
congratulate you on getting a diagnosis. (I have to admit that I
won't be trying to follow your good example, though. I'm getting
extra grumpy in my old age...)
As a final note, Mozilla does not send an If-Modified-Since header
for any of the variations of requesting a page I cursorily tried and
that is why the problem is not seen with Mozilla.


I guess that depends on characteristics of the page which it holds
cached. I've just done a bit of browsing with Moz (to some arbitrary
site - well, actually it was news.bbc.co.uk), and Ethereal running
alongside, and I can see that the logged responses are awash with 304
Not Modified. Looking at the associated requests, there are certainly
some If-Modified-Since headers, while some have If-None-Match.
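
For anyone handling those conditional requests by hand, the server-side decision can be sketched roughly as below. This is a minimal illustration under stated assumptions, not any particular server's logic: the function name `status_for` is made up, only If-Modified-Since is considered (If-None-Match is ignored), and an unparseable date is simply treated as if the header were absent.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def status_for(if_modified_since, last_modified):
    """Return 304 if the client's copy is still current, otherwise 200.

    if_modified_since: raw header value, or None when the header is absent.
    last_modified: timezone-aware datetime of the resource's last change.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: behave as if the header was not sent
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200

changed = datetime(2006, 5, 11, 15, 0, 0, tzinfo=timezone.utc)
print(status_for("Thu, 11 May 2006 15:18:23 GMT", changed))
print(status_for("Thu, 11 May 2006 14:00:00 GMT", changed))
```

A 304 must go out without a body, which is exactly why it does harm when it is produced inside an include rather than sent to the client.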

best regards
May 11 '06 #87
On Thu, 11 May 2006 12:22:44 -0600, hug wrote:
I think the main thing that has allowed you to see the problem where I
couldn't is more the fact that you have adequate instrumentation where
I had nothing more than echo statements in code to work with. If I'd
been able to see that what the client was receiving was a 200 instead
of a 304 it would have made a lot more sense early-on. Ethereal is
downloading as I type... slowly, but it's downloading.
It may come down more quickly from a mirror.
So ethereal is downloading and I'll have a new toy to play with if I
can find time to play with it.
Just a couple of hints about using it. Use filters for capture - "port 80"
or possibly "dst port 80" are the ones you should use for inspecting HTTP
traffic. If you're running it on a windows box then the first time or two
you may find it amusing to see the chatter (haystack) they put out on the
network, but you definitely don't want to see it if you're looking for
specific packets (needles)!

Now I need to get rid of the shtml stuff since it really isn't providing
any value. I'm not sure quite what the best way of doing that is; maybe
it's time for me to explore some of those mod_rewrite thingies, we'll see.
Yep, if you're up to coding in PHP then there is no reason to use SSI
whatsoever. But if it ain't broke, don't fix it :-)

Thanks again Rich.


You're welcome :-)
May 11 '06 #88
On 11/05/2006 18:56, Richard Gration wrote:
[...] the server is executing the php for its own ends (to get the
output) and is probably only interested in whether the response code
represents success or failure.


As far as I can see, it doesn't even care about that. Sending an error
response code still produces output in the same way that success will. I
can only assume that the output filter jumps straight to the entity
returned, ignoring everything that comes before it.

If the OP is really interested, he can always ask in an appropriate
server group.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 11 '06 #89
hug
"Alan J. Flavell" <fl*****@physics.gla.ac.uk> wrote:
I admire your patience, in the face of aggressive cluelessness, and
congratulate you on getting a diagnosis. (I have to admit that I
won't be trying to follow your good example, though. I'm getting
extra grumpy in my old age...)


If you are really getting extra grumpy in your old age, you might want
to cut another grumpy old fart some slack... or not, seeing as you are
getting extra grumpy.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #90
hug
Richard Gration <ri*****@zync.co.uk> wrote:
On Thu, 11 May 2006 12:22:44 -0600, hug wrote:
I think the main thing that has allowed you to see the problem where I
couldn't is more the fact that you have adequate instrumentation where
I had nothing more than echo statements in code to work with. If I'd
been able to see that what the client was receiving was a 200 instead
of a 304 it would have made a lot more sense early-on. Ethereal is
downloading as I type... slowly, but it's downloading.
It may come down more quickly from a mirror.


Tried it from the US mirror the first time, had to kill it after
almost 2 hours and restart it. Sometimes I long for my old DSL
connection.
So ethereal is downloading and I'll have a new toy to play with if I
can find time to play with it.


Just a couple of hints about using it. Use filters for capture - "port 80"
or possibly "dst port 80" are the ones you should use for inspecting HTTP
traffic. If you're running it on a windows box then the first time or two
you may find it amusing to see the chatter (haystack) they put out on the
network, but you definitely don't want to see it if you're looking for
specific packets (needles)!


Thanks for the guidance.
Now I need to get rid of the shtml
stuff since it really isn't providing any value. I'm not sure quite what
the best way of doing that is; maybe it's time for me to explore some of
those mod_rewrite thingies, we'll see.


Yep, if you're up to coding in PHP then there is no reason to use SSI
whatsoever. But if it ain't broke, don't fix it :-)


I think it might be fair to consider it broken-as-used, LOL.

--
http://www.ren-prod-inc.com/hug_soft...action=contact
May 11 '06 #91
On Thu, 11 May 2006 19:53:01 +0100, Alan J. Flavell wrote:
I admire your patience, in the face of aggressive cluelessness, and
congratulate you on getting a diagnosis. (I have to admit that I
won't be trying to follow your good example, though. I'm getting
extra grumpy in my old age...)


I have to say, the OP caught me on a good day. If you google my email you
will find a stack of grumpy responses to usenauts over the years,
including the thread which caused me to stop reading c.l.p.misc a few
months ago and seek sunnier pastures in which to offer help. <shrug>
Usenet is not what it once was, at least not the technical groups I read,
and I've only been reading it for 8 years or so :-(

"Usenet is the place where people continue to flog the memory of grease
spots left behind by dead horses" :-P

Rich
May 12 '06 #92
On Thu, 11 May 2006 19:25:53 +0100, Jack wrote:
Well done - I think that's a very plausible explanation of the state of
affairs, and it's quite an achievement to have got there on the strength
of not much more than a URL.


I'd like to say that it was the creation of the minimal test case at last
by the OP AND THE POSTING OF THE CONTENTS OF THE 2 FILES IN QUESTION that
really cracked it. I was about to abandon this thread until I saw the test
case.

All the technical groups I have read over time insist on minimal test
cases and this is for 2 reasons I know of:

1) The process of creating the test case very often leads to the
OPs seeing and fixing the problem for themselves without the need
for any more help.

2) It is easy to diagnose problems when there are no lines of code which
aren't relevant to the problem. Posting complete files requires
investigators to understand the whole system to know what is relevant and
what is not, and it leaves the job of creating the test case to the
investigators, which is not a fair division of labour :-P

Rich
May 12 '06 #93
On Fri, 12 May 2006, Richard Gration wrote:
On Thu, 11 May 2006 19:25:53 +0100, Jack wrote:
Well done - I think that's a very plausible explanation of the
state of affairs, and it's quite an achievement to have got there
on the strength of not much more than a URL.


I'd like to say that it was the creation of the minimal test case at
last by the OP AND THE POSTING OF THE CONTENTS OF THE 2 FILES IN
QUESTION that really cracked it. I was about to abandon this thread
until I saw the test case.


The O.P had evidently already harangued his way so far down my score
file by then that I was no longer seeing his postings.

I guess that's enough for this thread, anyway. Well done.
May 12 '06 #94
Richard Gration wrote:

All the technical groups I have read over time insist on minimal test
cases and this is for 2 reasons I know of:

1) The process of creating the test case very often leads to the OPs
seeing and fixing the problem for themselves without the need for any
more help.

2) It is easy to diagnose problems when there are no lines of code
which aren't relevant to the problem. Posting complete files requires
investigators to understand the whole system to know what is
relevant and what is not, and it leaves the job of creating the test
case to the investigators, which is not a fair division of labour :-P

Indeed.

I happen to have some problems with XSLT of a couple-months standing,
which I haven't yet raised with the XSL-List people. They also favour
proper test-cases, and I believe the process of producing a suitable
test-case is likely to be equivalent to solving the problem.

One sometimes desires a way of throwing a problem at a list or newsgroup
when it's still at the stage of not being properly understood - like my
XSLT problem. But that doesn't seem to be possible; it seems that
creating the test-case, or at the very least trying to create it, is
often a pre-requisite to even articulating the problem.

--
Jack.
May 12 '06 #95
VK

Michael Winter wrote:
an implementation must be conservative in its sending
behavior, and liberal in its receiving behavior.


ACK
[...] you cannot tell if it will be actually regarded on the other
end and even if - you cannot tell in what manner will it be regarded
(unless intranet).


Then what makes you think one can rely on instructions to disable the
cache, either (other than doing things like constructing unique URIs)?
There is certainly no harm in trying, and when it does succeed (which
should be in the vast majority of cases), it could be very beneficial.


That's a kind of question "What make you think that the requestor
cannot handle XML+XSL transformers? There is certainly no harm in
trying, and when it does succeed (which should be in the vast majority
of cases), it could be very beneficial." :-)

You may substitute any other technique for XML+XSL. It is always a
question of calculating how many visitors you are screwing over, and
deciding whether you are ready to do that (because the benefits would
outweigh losing a fraction of a percent of them). Yet it is always a
per-solution decision, not a universal rule.
Say the same Opera (which was in OP) simply doesn't care of your
headers for < > navigation and doesn't read them. It goes by "Expire in
X days" in UA preferences.


No it doesn't. See my other follow up to you.


Yes it does. See my testcase in this thread.
At the same time it considers each refresh request of the same page as
new item in the history (if Expires is set properly).


Where exactly do you get these ideas? Not on Earth, certainly.


As it was already posted and explained in this thread:

1. <http://www.nskom.com/external/tmp/http/204.cgi>
This script randomly generates new page with server timestamp or sends
204 No Content.

2. Hit Refresh several times to get at least 2 or 3 page updates.

3. Hit the Back button and see where you are going.

4. Do the same for say Firefox.

May 12 '06 #96
On 12/05/2006 21:41, VK wrote:
Michael Winter wrote:
[snip]

[VK:]
[...] you cannot tell if it will be actually regarded on the
other end and even if - you cannot tell in what manner will it be
regarded (unless intranet).


Then what makes you think one can rely on instructions to disable
the cache, either (other than doing things like constructing unique
URIs)? There is certainly no harm in trying, and when it does
succeed (which should be in the vast majority of cases), it could
be very beneficial.


That's a kind of question "What make you think that the requestor
cannot handle XML+XSL transformers? There is certainly no harm in
trying, and when it does succeed (which should be in the vast
majority of cases), it could be very beneficial." :-)


Not at all. Browsers that send HTTP/1.1 in their requests should at
least conditionally comply with RFC 2616. That is, they implement all
'MUST' and 'REQUIRED' level requirements set forth in the specification.
As such, one /should/ be able to depend upon that behaviour. Of course,
browsers don't in all respects, but this is where one can at least try
because general failure is not the expectation.

One would only expect a user agent to handle XSL if they advertised such
an ability in a similarly binding manner.

Anyway, my point was that /you/ stated that one cannot rely on cache
controls, yet you seemed to want to use them to disable caching. It was
a rather contradictory statement.
Say the same Opera (which was in OP) simply doesn't care of your
headers for < > navigation and doesn't read them. It goes by
"Expire in X days" in UA preferences.


No it doesn't. See my other follow up to you.


Yes it does. See my testcase in this thread.


After instructing Opera to validate all requests for cached documents, I
looked at the logs for my test server and I saw no additional requests
when using the Back or Forward buttons. So no, it doesn't.

Even when a document isn't cached, including when the browser cache is
disabled entirely, the navigation controls do not prompt new requests.
The cache validation controls in Opera are used when no freshness
information is provided by the server. They have nothing to do with the
navigation controls.

[snip]
As it was already posted and explained in this thread:

1. <http://www.nskom.com/external/tmp/http/204.cgi>
This script randomly generates new page with server timestamp or sends
204 No Content.

2. Hit Refresh several times to get at least 2 or 3 page updates.

3. Hit the Back button and see where you are going.


[snip]

And as I told you, in the very post that you replied to, visiting that
URL and then refreshing it leaves the Back button disabled. A refresh is
not navigation.

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 12 '06 #97
VK

Michael Winter wrote:
As it was already posted and explained in this thread:

1. <http://www.nskom.com/external/tmp/http/204.cgi>
This script randomly generates new page with server timestamp or sends
204 No Content.

2. Hit Refresh several times to get at least 2 or 3 page updates.

3. Hit the Back button and see where you are going.
[snip]

And as I told you, in the very post that you replied to, visiting that
URL and then refreshing it leaves the Back button disabled. A refresh is
not navigation.


No offence, but it's already the second exclusivity demonstrated by
your test system (counting the first the application/xhtml+xml
handling, see the "DTD in browsers" thread).
The described behavior was tested on two machines with Opera 8.52 and
Opera 9 Beta with the earlier described results.

Vigorously ruling out improper testing/description on your side, one
has to assume that you are sitting behind some really overly smart
proxy that does all kinds of runtime adjustments to the
request/response.

The source code of the testcase (thus all headers it really sends) was
already posted. It may be interesting to check all headers you /really/
get: via an ajaxoid or some more sophisticated network tool.
From my side I can only confirm that all headers are going "as is"
back and forth. If the same is confirmed for you, then the situation must
be further narrowed to OS/browser installation specifics.

May 13 '06 #98
On 13/05/2006 08:45, VK wrote:
Michael Winter wrote:
[VK:]
1. <http://www.nskom.com/external/tmp/http/204.cgi> This script
randomly generates new page with server timestamp or sends 204 No
Content.

2. Hit Refresh several times to get at least 2 or 3 page updates.
3. Hit the Back button and see where you are going.


[snip]
No offence, but it's already the second exclusivity demonstrated by
your test system (counting the first the application/xhtml+xml
handling, see the "DTD in browsers" thread).
You don't need to consider the state of my system to understand that the
precise process you describe is impossible. A little common sense should
be enough.

A refresh action using F5 (with or without modifiers), an equivalent key
combination, or the Refresh GUI control, is not navigation in the sense
of activating a link, loading a bookmark, or typing a URL. These actions
take the user somewhere, so the concept of going backwards or forwards
makes sense. A refresh updates the current document: there is no
movement. That said, requesting an identical URL in Firefox and IE is
considered to be the same as a refresh so your 'test', as it stands,
cannot be used with them to evaluate the behaviour of the Back and
Forward buttons.

Still, even when loading your 'test' from a bookmark or re-entering the
URL, subsequent use of the Back and Forward buttons still doesn't result
in additional network traffic.

[snip]
Vigorously ruling out improper testing/description on your side, one
has to assume that you are sitting behind some really overly smart
proxy that does all kinds of runtime adjustments to the
request/response.
Perhaps you'd care for a brief description, then? You are welcome to
state your method.

First, software you might consider relevant:

- Windows XP Professional (with SP2)
- Opera 8.54 (build 7730)
    Relevant History settings:
      Memory cache: Automatic
      Disk cache: 10MB
      Check ...: Every 5 Hours
    No proxies configured.
- Ethereal 0.99.0

In each case, the cache is emptied before beginning, and a new tab is
created. As I described previously, refreshing will not enable the
navigation controls so the URL:

<http://www.nskom.com/external/tmp/http/204.cgi>

will be pasted into the location bar, instead.

A total of five requests will be made in each run, with Ethereal
capturing all inbound and outbound HTTP traffic over my WAN interface.
However, for simplicity, I'll only include the request and status lines.
After the requests have been made, I'll step back through each to the
first using the Back button, then forward again to the last.

The first run using the initial settings:

1  GET /external/tmp/http/204.cgi HTTP/1.1
   HTTP/1.1 200 OK

There was no further traffic.

As I stated previously, your 'test' doesn't send any freshness
information, therefore Opera reverts to the 'Check ...' options to
determine whether to revalidate a cached resource. As each option was
set to an interval of five hours during that run, the next four requests
were served from the cache.
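
The decision described in that paragraph can be modelled roughly as follows. This is a simplified sketch and certainly not Opera's actual implementation: the function name `must_revalidate`, its parameters, and the use of `None` to stand for the "Always" setting are assumptions for illustration.

```python
from datetime import datetime, timedelta

def must_revalidate(cached_at, now, expires=None,
                    check_interval=timedelta(hours=5)):
    """Decide whether a cached document needs checking with the server.

    expires: freshness information from the server, if any was sent.
    check_interval: the user's 'Check ...' preference; None means 'Always'.
    """
    if expires is not None:
        # Server-supplied freshness information takes precedence.
        return now >= expires
    if check_interval is None:
        return True
    # No freshness information: fall back to the user's preference.
    return now - cached_at >= check_interval

fetched = datetime(2006, 5, 13, 12, 0, 0)
print(must_revalidate(fetched, fetched + timedelta(minutes=10)))
print(must_revalidate(fetched, fetched + timedelta(minutes=10),
                      check_interval=None))
```

With the five-hour default the second and later requests are answered from the cache. With the preference set to "Always", every request is checked, which is the situation the second run sets up.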

Clearly, these settings don't tell us much. One could argue that there
was no traffic from the navigation controls because the requests you
expected to see were served from the cache, too. Therefore, for the
second run, all three 'Check ...' options will be set to 'Always' to
force revalidation (though really, only 'Check documents' needs to be
changed).

1  GET /external/tmp/http/204.cgi HTTP/1.1
   HTTP/1.1 204 No Content
2  GET /external/tmp/http/204.cgi HTTP/1.1
   HTTP/1.1 204 No Content
3  GET /external/tmp/http/204.cgi HTTP/1.1
   HTTP/1.1 200 OK
4  GET /external/tmp/http/204.cgi HTTP/1.1
   If-Modified-Since: Sat, 13 May 2006 12:15:03 GMT
   HTTP/1.1 200 OK
5  GET /external/tmp/http/204.cgi HTTP/1.1
   If-Modified-Since: Sat, 13 May 2006 12:15:17 GMT
   HTTP/1.1 200 OK

There was no further traffic.

In this run, the cache was not automatically used to serve later
requests: once a cacheable entity was available, Opera sent conditional
requests.

With the cache ignored (in the absence of a 304 response), and following
one of your earlier statements:

Say the same Opera (which was in OP) simply doesn't care of
your headers for < > navigation and doesn't read them. It goes
by "Expire in X days" in UA preferences.
-- 11**********************@e56g2000cwe.googlegroups.com

one must conclude that you'd expect four more requests as the
preferences call for revalidation, but they didn't occur.
The source code of the testcase (thus all headers it really sends)
was already posted. It may be interesting to check all headers you
/really/ get: via an ajaxoid or some more sophisticated network tool.
Not that it was necessary, but now I have and the evidence points to a
rather obvious (and predictable) result.

You cannot blame some mythical proxy server (outbound traffic is
monitored before that point), and there are no in-browser settings that
can modify its behaviour in such a fundamental way (and I haven't edited
any configuration files).
From my side I can only confirm that all headers are going "as is"
back and forth. If the same is confirmed for you, then the situation
must be further narrowed to OS/browser installation specifics.


Or, you might just want to consider that you're wrong. You've provided
no evidence that I am, and I would have hoped that another regular would
have jumped in by now to correct me if I was.

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
May 13 '06 #99
VK

Michael Winter wrote:
You don't need to consider the state of my system to understand that the
precise process you describe is impossible. A little common sense should
be enough.
Thank you for the perfect description of your test case. I think that mine
should be at least close to the quality of yours, so I may need a day
or two to respond (depending on my primary load). As ciwah seems to be
making an exception for this thread, we may continue here rather than
move somewhere else.

Also I want to solve the mystery of Content-Type:
application/xhtml+xml being treated completely differently by your machine
and by mine, as no amount of common sense seems able to help here :-)
I have a hypothesis (nothing but hypothesis yet) of a common reason for
all discrepancies: because of intermediary HTTP proxies. It is very
rare when one gets the page directly from the server X. It is usually
server A forwarding to B ... to X and then X forwarding to W ... to
recipient where both request and response path can be completely
different. As any of intermediary servers can have their own cache
configurations, a lot of fancy things may happen. I need more time to
think of some experiments.

Only one preliminary note:
As I stated previously, your 'test' doesn't send any freshness
information, therefore Opera reverts to the 'Check ...' options to
determine whether to revalidate a cached resource.


It is not totally correct. My "test" sends Date and Expires headers
both set to the time of request in RFC1123 format. This is a proper way
to handle caching - but not the only one available, another one
(already mentioned) is via Cache-Control.

<http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html>
<q> To mark a response as "already expired," an origin server sends an
Expires date that is equal to the Date header value. (See the rules for
expiration calculations in section 13.2.4.) </q>
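
As a rough illustration of the quoted rule, the two headers can be generated from the same timestamp like this. It is only a sketch, not the source of the 204.cgi script: the helper names `already_expired_headers` and `is_already_expired` are made up, and the request time is an arbitrary example value.

```python
import calendar
from email.utils import formatdate, parsedate_to_datetime

def already_expired_headers(epoch_seconds):
    """Build Date and Expires headers that carry the same RFC 1123 timestamp."""
    stamp = formatdate(epoch_seconds, usegmt=True)
    return {"Date": stamp, "Expires": stamp}

def is_already_expired(headers):
    """True when Expires is not later than Date, i.e. the response is stale on arrival."""
    expires = parsedate_to_datetime(headers["Expires"])
    date = parsedate_to_datetime(headers["Date"])
    return expires <= date

# Example request time: 13 May 2006 12:15:03 GMT, expressed as epoch seconds.
request_time = calendar.timegm((2006, 5, 13, 12, 15, 3, 0, 0, 0))
headers = already_expired_headers(request_time)
print(headers)
print(is_already_expired(headers))
```

A cache that honours these headers has to revalidate the document before reusing it, though as the next quote shows this says nothing about history navigation.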

The reason of the observed Opera behavior in your test (also already
mentioned) is
<http://www.opera.com/support/search/supsearch.dml?index=82>
<q> Note that cache expiration is not checked when going back and
forwards in the window history. It is only checked when you click a link. </q>

May 13 '06 #100

This discussion thread is closed
