Question about ReadLine UTF8 line truncation

(Dot Net 2 C# application - using Encoding.UTF8 with a StreamReader)
I have a very strange problem that I cannot explain with a UTF8 Readline()
although this could exist in other types of encoding, I have not tried them.

Our application wrote this sequence to a UTF8 file. Now I am loading it
back, and the text is not coming back the same as it went out.

DATA:
from: processfrom checkemail failed: 501 syntax error in parameters: invalid
char in email: "sometext\content-transfer-encoding:"@server.com
command: mail from:"sometext\content-transfer-encoding:"@server.com
Each of those lines is split at the \c of content-transfer-encoding.
These are not high-order characters, and I don't remember \c ever being
a control character for anything.
So instead of getting back the two lines that I wrote out, I get four lines.

Any ideas as to why this is, and how would I correct it? I am guessing that
I will need to escape the sequence prior to writing it out, but I don't know
why and that bugs me.

Thanks for any help.

Feb 12 '07 #1
EmeraldShield <em***********@noemail.noemail> wrote:
> (Dot Net 2 C# application - using Encoding.UTF8 with a StreamReader)
> I have a very strange problem that I cannot explain with a UTF8 Readline()
> although this could exist in other types of encoding, I have not tried them.
<snip>

Could you post a short but complete program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 12 '07 #2
Hello Jon,

Based on your description, when you use StreamReader with UTF8 encoding to
read some text data written out previously, you get the wrong
string (different from the original output), correct?

Could you show a simple test code snippet for this so that we can get a
clearer view of your code logic? Also, as for the following test
fragment you mentioned:

==============
from: processfrom checkemail failed: 501 syntax error in parameters:
invalid char in email: "sometext\content-transfer-encoding:"@server.com
command: mail from:"sometext\content-transfer-encoding:"@server.com
==============

Did you embed this text directly in C# code, like

string txt = " ..... the text here.....";

or is it loaded from some other source (such as a TextBox or another
file)? Also, when you write the data to the text file (through
StreamWriter with UTF8 encoding), have you checked the output file to see
whether it contains what you expect?

In my experience, this kind of problem is likely to occur when you embed
the string directly in code, since some characters need escaping in a C#
string literal. For example, you need to escape \ as \\. So if you embed
the string directly in C# code, you need to escape the whole string
as below:

=============
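// Requires: using System.IO; using System.Text;
// A regular string literal cannot span lines, so the text is concatenated.
string txt = "from: processfrom checkemail failed: 501 syntax error in "
    + "parameters: invalid char in email: "
    + "\"sometext\\content-transfer-encoding:\"@server.com command: mail "
    + "from:\"sometext\\content-transfer-encoding:\"@server.com";

// The using block closes (and flushes) the writer even if Write throws.
using (StreamWriter sw = new StreamWriter("direct_output.txt", false, Encoding.UTF8))
{
    sw.Write(txt);
}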

===============
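
As a side note on the escaping rule above, here is a minimal sketch comparing a regular string literal with a verbatim (`@"..."`) literal for the same fragment. The class name `LiteralDemo` is made up for illustration. In a regular literal an unescaped `\c` does not compile at all (the compiler reports an unrecognized escape sequence), so it cannot silently turn into a control character.

```csharp
using System;

static class LiteralDemo
{
    // Regular literal: the backslash and the quotes are escaped with a backslash.
    public const string Escaped =
        "\"sometext\\content-transfer-encoding:\"@server.com";

    // Verbatim literal: backslashes are taken literally; a quote is written as "".
    public const string Verbatim =
        @"""sometext\content-transfer-encoding:""@server.com";

    static void Main()
    {
        // Both literals produce exactly the same characters.
        Console.WriteLine(Escaped == Verbatim); // prints True
        Console.WriteLine(Verbatim);
    }
}
```

Either form only matters for text typed into source code; data that arrives at run time (from a file, a TextBox, or a socket) is never interpreted for backslash escapes.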

In addition, I suggest you put the string in a TextBox and write it out
from that TextBox through the StreamWriter to see whether the output is as
expected. Here is my test code, which works well for the text fragment you
provided.

=============================
private void btnSave_Click(object sender, EventArgs e)
{
    // Write the TextBox contents to output.txt as UTF-8, overwriting any existing file.
    using (StreamWriter sw = new StreamWriter("output.txt", false, Encoding.UTF8))
    {
        sw.Write(textBox1.Text);
    }
}

private void btnLoad_Click(object sender, EventArgs e)
{
    // Read the whole file back in one call, so no line splitting is involved.
    string txt;
    using (StreamReader sr = new StreamReader("output.txt", Encoding.UTF8))
    {
        txt = sr.ReadToEnd();
    }
    MessageBox.Show(txt);
}
=======================
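
The test above uses ReadToEnd(), so it does not exercise ReadLine(), which is where the reported splitting happens. The sketch below is one way to check that path without a form or a file. It relies on the documented behavior that StreamReader.ReadLine() ends a line at "\n", at "\r", or at "\r\n". The sample strings are hypothetical: the second one assumes, purely for illustration, that a bare carriage return sits where the backslash appears in the file. Nothing in this thread confirms that this is what the real data contains.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ReadLineRoundTrip
{
    // Writes text as UTF-8 into memory and reads it back one line at a time.
    public static List<string> RoundTrip(string text)
    {
        MemoryStream ms = new MemoryStream();
        StreamWriter sw = new StreamWriter(ms, Encoding.UTF8);
        sw.Write(text);
        sw.Flush();
        ms.Position = 0;

        List<string> lines = new List<string>();
        StreamReader sr = new StreamReader(ms, Encoding.UTF8);
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }

    static void Main()
    {
        // Two lines containing a literal backslash followed by 'c'.
        string clean =
            "from: \"sometext\\content-transfer-encoding:\"@server.com\r\n" +
            "command: mail from:\"sometext\\content-transfer-encoding:\"@server.com\r\n";

        // Hypothetical variant: a bare CR where the backslash was.
        string withBareCr = clean.Replace("\\c", "\rc");

        Console.WriteLine(RoundTrip(clean).Count);      // prints 2
        Console.WriteLine(RoundTrip(withBareCr).Count); // prints 4
    }
}
```

If the real file behaves like the second string, the split comes from a line-ending character in the data rather than from the UTF8 decoding.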

Please feel free to post here if anything is unclear or if you run into
any further problems.

Sincerely,

Steven Cheng

Microsoft MSDN Online Support Lead

This posting is provided "AS IS" with no warranties, and confers no rights.

Feb 12 '07 #3
> Based on your description, when you use StreamReader with UTF8 encoding to
> read some text data written out previously, you get the wrong
> string (different from the original output), correct?
>
> Would you show a simple test code snippet for this so that we can get a
> clearer view of your code logic? Also, as for the following test
> fragment you mentioned:

I will do that tonight or tomorrow.

> Did you directly embed it in C# code like
>
> string txt = " ..... the text here.....";
>
> or is it loaded from some other source (such as a TextBox or another
> file)? Also, when you output the data to the txt file (through
> StreamWriter+UTF8 encoding), have you checked the txt output file to see
> whether the output is as expected?

No, this is data sent to my application from a remote system over a socket.
I have looked at the output file and it is correct. I have loaded it in
WordPad, Notepad, and VS2005 and it looks correct in all of them.

> In addition, I suggest you put the string in a TextBox and write it out
> from that TextBox through the StreamWriter to see whether the output is as
> expected. Here is my test code, which works well for the text fragment you
> provided.

I will try that and see what it looks like. I just wanted to make sure that
the \C was not some sort of escape character that I didn't know about. It
is being sent to my app that way and written straight to disk without any
escaping.

Jason
Feb 12 '07 #4
Thanks for your reply Jason,

Sure, we'll wait for your further update.

Also, I'm sure that "\C" is not treated as a special escape sequence by the
UTF8 encoding or by readers for other encodings. Generally, escaping only
matters when we put a string directly in code or in a script.
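
Since the text editors show nothing unusual, one way to see what is actually in the file is to list every control character with its offset. This is only a diagnostic sketch: the class name `ControlCharScan` is made up, and the default file name "output.txt" is borrowed from the earlier test code rather than from the real application.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ControlCharScan
{
    // Returns one entry per control character (CR, LF, TAB, and so on).
    public static List<string> Scan(string text)
    {
        List<string> hits = new List<string>();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (char.IsControl(c))
            {
                hits.Add(string.Format("offset {0}: U+{1:X4}", i, (int)c));
            }
        }
        return hits;
    }

    static void Main(string[] args)
    {
        // Assumed default path; pass the real file name as the first argument.
        string path = args.Length > 0 ? args[0] : "output.txt";
        string text = File.ReadAllText(path, Encoding.UTF8);
        foreach (string hit in Scan(text))
        {
            Console.WriteLine(hit);
        }
    }
}
```

A U+000D or U+000A reported in the middle of what should be a single line would explain an extra split in ReadLine().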

Anyway, please feel free to post here if you run into any further problems
with this.

Sincerely,

Steven Cheng

Microsoft MSDN Online Support Lead

Feb 13 '07 #5
