
How to judge whether content type is truly "text/html"?

I know that HttpWebRequest.GetResponse() returns an HttpWebResponse.
The response has a ContentType property, but that property simply
reflects the HTTP response header. It is possible that the content is
actually HTML while the ContentType is "image/jpeg".

Is there any effective way to judge whether the response is truly
"text"?
My idea is to read the first several bytes of the response stream
and check whether they are displayable characters. But they could be
in any encoding. Should I try every encoding?

Sep 21 '06 #1
The property is not decided by the HTTP response header itself. It is decided by
the web server and/or the developer who created the web site. The problem
here is that the whole purpose of the Content-Type header is to tell the client what is
stored in the stream of bits it is sending. Since a stream of bits is just
1's and 0's, there's no way to tell without it.

However, I have never heard of what you describe happening. If it did,
browsers would not be able to view the content, and whoever created the web
site would know about it very shortly (from the response of the users).

--
HTH,

Kevin Spencer
Microsoft MVP
Software Composer

A watched clock never boils.

Sep 21 '06 #2

Kevin Spencer wrote:
> The property is not decided by the HTTP Response Header. It is decided by
> the web server and/or the developer who created the web site. The problem
> here is, the reason for the ContentType header is to tell the client what is
> stored in the stream of bits it is sending. Since a stream of bits is just
> 1's and 0's there's no way to tell without it.

Yes, the web server can configure the response MIME type, which ends up in
the HTTP response header. That is my understanding.

> However, I have never heard of what you describe happening. If it did,
> browsers would not be able to view the content, and whoever created the web
> site would know about it very shortly (from the response of the users).

I manually set one HTML file to be served as "image/jpeg" in IIS 6, then
accessed the page from another machine and captured the HTTP traffic with
Fiddler. It shows that the response header has "Content-Type:
image/jpeg". Interestingly, IE still displays the HTML page, while Firefox
cannot display it. It looks like IE does some further work.
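
For illustration, the kind of "further work" a client could do here is to sniff the leading bytes of the body for HTML tags and let that override the declared header. The sketch below is written in Python only to keep it short; the tag list, the 512-byte window, and the names sniff_is_html and effective_type are assumptions made for this example, not IE's actual rules or any .NET API.

```python
import re

# Tags that commonly appear near the start of an HTML document.
# This list is an illustrative assumption, not a browser's real rule set.
_HTML_MARKERS = re.compile(
    rb"<(?:!doctype\s+html|html|head|body|title|script|meta|div|p|table|a)\b",
    re.IGNORECASE,
)


def sniff_is_html(body: bytes, window: int = 512) -> bool:
    """Return True if the first `window` bytes contain a common HTML tag."""
    return _HTML_MARKERS.search(body[:window]) is not None


def effective_type(declared: str, body: bytes) -> str:
    """Prefer sniffed HTML over the declared Content-Type value."""
    # Drop parameters such as "; charset=utf-8" from the declared value.
    media_type = declared.split(";", 1)[0].strip().lower()
    if sniff_is_html(body):
        return "text/html"
    return media_type
```

With this sketch, a body that starts with an HTML tag is treated as "text/html" even when the header says "image/jpeg", while a body with no recognizable tag keeps its declared type. Because the tags are matched as ASCII bytes, it only works for ASCII-compatible encodings, and it can be fooled by binary data that happens to contain a tag-like byte sequence.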

Sep 21 '06 #3

Vadym Stetsyak wrote:
> Hello, Morgan!
>
> MC> I know that HttpWebRequest.GetResponse() generates a HttpWebResponse.
> MC> The response has one ContentType property. But the property is just
> MC> decided by http response header. It is possible that the content is
> MC> actually HTML, while the ContentType is "image/jpeg".
>
> If you're talking to a "well-behaved" web server, then it gives you content
> from the set you've specified in the Accept header.

I agree, but I happen to have to handle some abnormal situations. :p

> MC> Is there any effective way to judge whether the response type is truly
> MC> "text"?
> MC> I have a idea to read the first several bytes of the response stream;
> MC> and check whether they are real displayable characters. But, they can
> MC> be any kind of Encoding. Should I try all kinds of Encoding?
>
> IMO there is no good way to verify whether it is "text".
> As a workaround you can check the response content for the subset of printable
> characters...

The problem is the encoding.
However, HTML tags are written in English letters, which map to the printable
ASCII range (32-126) in most encodings.
Perhaps trying to parse some HTML tags would work.

> --
> Regards, Vadym Stetsyak
> www: http://vadmyst.blogspot.com
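
The printable-character workaround discussed above can be sketched roughly as follows. This is only a heuristic written in Python for brevity; the name looks_like_text, the 512-byte window, and the 95% threshold are assumptions chosen for the example. It accepts any sample that starts with a Unicode byte order mark, otherwise it counts bytes that are not control characters. Bytes of 0x80 and above are allowed through because they may belong to a multi-byte or legacy 8-bit encoding.

```python
import codecs

# Byte order marks that identify UTF-8 / UTF-16 text outright.
_BOMS = (codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def looks_like_text(data: bytes, window: int = 512, threshold: float = 0.95) -> bool:
    """Guess whether the first `window` bytes of `data` are text."""
    sample = data[:window]
    if not sample:
        return False
    if sample.startswith(_BOMS):
        return True
    # Accept tab, LF, CR and every byte from 0x20 upward except DEL (0x7F).
    acceptable = sum(
        1 for b in sample
        if b in (0x09, 0x0A, 0x0D) or (b >= 0x20 and b != 0x7F)
    )
    return acceptable / len(sample) >= threshold
```

A typical JPEG header fails this check because of its NUL and other control bytes, while ordinary HTML passes. UTF-16 text without a byte order mark would be rejected (every other byte is NUL for English text), and binary data made up mostly of high bytes could slip through, so the result is a hint rather than proof.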
Sep 21 '06 #4

Hi Morgan,

Your experience underscores my point. While it is possible to manually (or,
perhaps unintentionally) change the ContentType header, any web site that
did would find out about it very quickly, because there are many different
browsers in use out there, and they would hear about the problem and fix it.

It isn't productive to imagine the most remote of possibilities and handle
them gracefully. If one did, one would never finish much of anything.
Sometimes the most graceful thing to do is to handle the error as an error
and move on. My guess is that you would never run into the issue at all.

--
HTH,

Kevin Spencer
Microsoft MVP
Software Composer

A watched clock never boils.

Sep 21 '06 #6

Kevin Spencer wrote:
> Hi Morgan,
>
> Your experience underscores my point. While it is possible to manually (or,
> perhaps unintentionally) change the ContentType header, any web site that
> did would find out about it very quickly, because there are many different
> browsers in use out there, and they would hear about the problem and fix it.
>
> It isn't productive to imagine the most remote of possibilities and handle
> them gracefully. If one did, one would never finish much of anything.
> Sometimes the most graceful thing to do is to handle the error as an error
> and move on. My guess is that you would never run into the issue at all.

I happened to find a function, FindMimeFromData, in UrlMon.dll. It
works.

http://msdn.microsoft.com/workshop/n...appendix_a.asp

Sep 25 '06 #7
