Bytes IT Community

Getting correct encoding from a webresponse

Hi guys,

I have the following code that retrieves a webpage.
My problem is getting it to use the right encoding.
I've tested it against a Danish page, but it won't show the Danish
characters. When I set it to sniff the encoding (using
sr = New System.IO.StreamReader(strm, True)
) it sets it to UTF-8, but when I browse the page using MSIE with
auto-selection of encoding, it uses Western European (and displays the Danish
chars correctly).

Any ideas?

The code:
' Create a new WebRequest Object to the mentioned URL.
Dim myWebRequest As System.Net.WebRequest = System.Net.WebRequest.Create(txtUrl.Text)

' Set the 'Timeout' property in Milliseconds.
myWebRequest.Timeout = 10000

Dim encode As System.Text.Encoding = System.Text.Encoding.GetEncoding("Unicode")

' Assign the response object of 'WebRequest' to a 'WebResponse' variable.
Dim myWebResponse As System.Net.WebResponse = myWebRequest.GetResponse()
Dim strm As System.IO.Stream
Dim sr As System.IO.StreamReader
Dim line As String

strm = myWebResponse.GetResponseStream()

sr = New System.IO.StreamReader(strm, True)
'sr = New System.IO.StreamReader(strm, encode)
MsgBox("Encoding: " & sr.CurrentEncoding.ToString)
txtSrc.Text = ""
line = sr.ReadLine()
Do While line IsNot Nothing
    ' ReadLine strips the line terminator, so add it back.
    txtSrc.Text &= line & vbCrLf
    line = sr.ReadLine()
Loop
Jul 21 '05 #1
2 Replies


Jonax <Jo***@discussions.microsoft.com> wrote:
I have the following code that retrieves a webpage.
My problem is getting it to use the right encoding.
I've tested it against a Danish page, but it won't show the Danish
characters. When I set it to sniff the encoding (using
sr = New System.IO.StreamReader(strm, True)
) it sets it to UTF-8, but when I browse the page using MSIE with
auto-selection of encoding, it uses Western European (and displays the Danish
chars correctly).

Any ideas?


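You should look at what the HttpWebResponse says the character set is
(its CharacterSet property, which comes from the Content-Type header;
ContentEncoding reports things like gzip compression, not the charset),
although the server may not tell you. Guessing accurately is tricky
and there's always a chance you'll get it wrong. To use "Western
European", however, use Encoding.GetEncoding(1252).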
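
A minimal sketch of that approach, assuming the .NET Framework (where code page 1252 is available without extra setup). The module name and the PickEncoding and DownloadText helpers are hypothetical names made up for illustration; the fallback to 1252 is one possible choice, not the only sensible one.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Text

Module CharsetSketch
    ' Hypothetical helper: map the charset name declared by the server to an
    ' Encoding, falling back to Windows-1252 ("Western European") when the
    ' name is missing or not recognised.
    Function PickEncoding(charset As String) As Encoding
        Dim fallback As Encoding = Encoding.GetEncoding(1252)
        If String.IsNullOrEmpty(charset) Then Return fallback
        Try
            ' Some servers wrap the charset name in quotes or spaces.
            Return Encoding.GetEncoding(charset.Trim(""""c, " "c))
        Catch ex As ArgumentException
            ' GetEncoding throws ArgumentException for unknown names.
            Return fallback
        End Try
    End Function

    ' Hypothetical helper: download a page and decode it with the declared charset.
    Function DownloadText(url As String) As String
        Dim request As WebRequest = WebRequest.Create(url)
        request.Timeout = 10000
        Using response As WebResponse = request.GetResponse()
            ' CharacterSet only exists on HttpWebResponse, so cast defensively.
            Dim httpResponse As HttpWebResponse = TryCast(response, HttpWebResponse)
            Dim charset As String = If(httpResponse IsNot Nothing, httpResponse.CharacterSet, Nothing)
            Using reader As New StreamReader(response.GetResponseStream(), PickEncoding(charset))
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

In the original form this could be called as txtSrc.Text = DownloadText(txtUrl.Text). It still trusts the server's header, so a page that declares the wrong charset (or only declares it in a meta tag) will decode incorrectly.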

(With the detect flag set to True, StreamReader only looks for a byte
order mark; if there isn't one it falls back to UTF-8, as the
documentation states.)
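
To illustrate the point, here is a small sketch that decodes the same three bytes twice, assuming the .NET Framework. The bytes are "æøå" as a Windows-1252 page would send them, with no byte order mark; the module name and the Decode helper are hypothetical.

```vbnet
Imports System.IO
Imports System.Text

Module BomSketch
    ' Hypothetical helper: decode a byte array through a StreamReader.
    Function Decode(bytes As Byte(), enc As Encoding, detectBom As Boolean) As String
        Using reader As New StreamReader(New MemoryStream(bytes), enc, detectBom)
            Return reader.ReadToEnd()
        End Using
    End Function

    Sub Main()
        ' "æøå" encoded as Windows-1252, without a byte order mark.
        Dim danish As Byte() = {&HE6, &HF8, &HE5}

        ' BOM detection finds nothing, so the reader stays on UTF-8.
        ' These bytes are not valid UTF-8, so replacement characters come out.
        Console.WriteLine(Decode(danish, Encoding.UTF8, True))

        ' Naming the encoding explicitly recovers the Danish characters.
        Console.WriteLine(Decode(danish, Encoding.GetEncoding(1252), False))
    End Sub
End Module
```

The first call mirrors New StreamReader(strm, True), which is why the reader reports UTF-8 even though the page is Western European.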

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #2

Thx Jon,

I'll have a go at it - and merry xmas...

Jul 21 '05 #3
