Hi,
I am trying to write a generic RSS/Atom/OPML feed client. The problem
is that those xml feeds may have different encodings:
- <?xml version="1.0" encoding="ISO-8859-1" ?>...
- <?xml version="1.0" encoding="utf-8" ?>...
- ...
I am using the WebRequest functionality to get the feeds. Simplified,
my code looks like this:
WebRequest req = WebRequest.Create(url);
StreamReader reader = new StreamReader(..., Encoding.Default);
string result = reader.ReadToEnd();
As you can see on the second line, I have to choose the encoding of the
expected stream up front (StreamReader assumes utf-8 if I pass none).
However, as I do not know the encoding while fetching the xml stream,
I use Encoding.Default.
And this is the core of the problem: I would like to read the result
xml string, extract the declared encoding, and re-encode the result
string with the correct encoding. Otherwise, special characters are
unreadable or missing in the result string.
I have unsuccessfully tried the following workarounds:
- Converting the result xml string directly from Encoding.Default to
the encoding declared in the XML:
result = this.convertString(result, Encoding.Default,
Encoding.GetEncoding(myEncodingStringFromXMLFile)) ;
The convertString function uses code similar to the convert example on
msdn: http://msdn.microsoft.com/library/de...classtopic.asp
--> did not work - the characters remained as they were before
- Creating a second StreamReader instance with the right encoding:
StreamReader reader2 = new StreamReader(...,
Encoding.GetEncoding(myEncodingStringFromXMLFile)) ;
string result = reader2.ReadToEnd();
--> did not work - it seems that the ResponseStream from the
WebRequest class can only be read once! I get an error when trying to
modify the Position property on the stream (another guy had exactly
the same problem:
http://groups.google.ch/groups?hl=de...f67a0c2&rnum=1)
Is there a solution other than fetching the URL twice? Am I missing
some basic functionality? Thanks for your help...
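One idea that might avoid the second fetch, as an untested sketch rather than a confirmed solution: copy the response stream into a byte array once, peek at the XML declaration, and only then decode the bytes. The names FeedDecoder, ReadAllBytes and Decode are hypothetical helpers, and the utf-8 fallback is an assumption. The sketch only covers ASCII-compatible encodings such as utf-8 or ISO-8859-1.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the .NET framework.
public static class FeedDecoder
{
    // Copies a forward-only stream into memory so the bytes can be
    // inspected and decoded as often as needed.
    public static byte[] ReadAllBytes(Stream input)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
            {
                buffer.Write(chunk, 0, read);
            }
            return buffer.ToArray();
        }
    }

    // Decodes the bytes with the encoding named in the XML declaration.
    // Assumption: falls back to utf-8 if no usable declaration is found.
    public static string Decode(byte[] data)
    {
        // The declaration is plain ASCII in ASCII-compatible encodings,
        // so the first bytes can safely be peeked at as ASCII.
        int headLength = Math.Min(data.Length, 256);
        string head = Encoding.ASCII.GetString(data, 0, headLength);
        Match m = Regex.Match(
            head, "<\\?xml[^>]*encoding\\s*=\\s*[\"']([^\"']+)[\"']");

        Encoding enc = Encoding.UTF8;
        if (m.Success)
        {
            try { enc = Encoding.GetEncoding(m.Groups[1].Value); }
            catch (ArgumentException) { /* unknown name: keep utf-8 */ }
        }

        // A byte order mark, if present, takes precedence and is stripped.
        using (StreamReader reader =
            new StreamReader(new MemoryStream(data), enc, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the WebRequest from above, this would be used roughly as: byte[] data = FeedDecoder.ReadAllBytes(req.GetResponse().GetResponseStream()); string result = FeedDecoder.Decode(data);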
Greets,
Phil