I have the following function I use in my application quite a bit (I
missed the VFP one and decided to make my own):
Public Shared Function File2String(ByVal strFile As String) As String
    'Open a file for reading
    Dim strFilename As String = strFile
    'Get a StreamReader class that can be used to read the file
    Dim objStreamReader As System.IO.StreamReader
    Try
        objStreamReader = System.IO.File.OpenText(strFilename)
    Catch ex As Exception
        Return Nothing
    End Try
    Dim str As String = objStreamReader.ReadToEnd()
    objStreamReader.Close()
    Return str
End Function
It's been working well, but I just found out it doesn't correctly read a
text file with a different encoding. I have a text file with some French
accents in it, like "acheté". My function would return "achet", dropping
the é completely. I'm not sure how to address this and it's very
important to make it continue to work as it has with the plain English
files I usually use it with. Anyone know how to address this? Thanks!
Matt
MattB wrote:
[...] I have a text file with some French accents in it, like "acheté". My function would return "achet", dropping the é completely. [...] Anyone know how to address this?
Same answer as always: Use the correct encoding. File.OpenText() always
uses a UTF-8 StreamReader implicitly. Create your own StreamReader
instance instead and specify the desired encoding.
using (StreamReader reader =
    new StreamReader(@"C:\Foo\Bar.txt", Encoding.Default))
{
    // ...
}
Note that Encoding.Default is your OS default ANSI code page (most likely
Windows-1252) and is the best guess if the file is not UTF-8.
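For the VB.NET function in the question, a minimal sketch of the same idea might look like the following. The extra enc parameter and the FileHelpers module name are assumptions made for illustration, not part of the original code, and the Using statement assumes VB 2005 or later.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Public Module FileHelpers
    ' Same idea as File2String, but the caller chooses the encoding.
    ' The enc parameter is an illustrative addition, not the original signature.
    ' Returns Nothing if the file cannot be opened or read.
    Public Function File2String(ByVal strFile As String, ByVal enc As Encoding) As String
        Try
            ' Using closes the reader even if ReadToEnd throws
            Using reader As New StreamReader(strFile, enc)
                Return reader.ReadToEnd()
            End Using
        Catch ex As Exception
            Return Nothing
        End Try
    End Function
End Module
```

A call such as File2String("C:\Foo\Bar.txt", Encoding.Default) would then read the file with the OS default code page instead of UTF-8. Plain English (ASCII) files come out the same with either encoding.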
Cheers,
-- http://www.joergjooss.de
mailto:ne********@joergjooss.de
Joerg Jooss wrote:
[...] Create your own StreamReader instance instead and specify the desired encoding.
Thanks for the reply!
Do you know if I can detect the encoding of the text file somehow, so
this app will work correctly with differently encoded text files?
Got any links or examples?
Thanks again!
Matt
Joerg Jooss wrote:
[...] Create your own StreamReader instance instead and specify the desired encoding.
OK, so I tried creating the StreamReader as you said, and I tried every
encoding I could and nothing could read my text file with French
characters correctly. For example, the word "acheté" comes across as
"achet".
It's entirely possible (even likely) I'm taking the wrong approach.
Can anyone with US English Windows put the word "acheté" in a text file
and have the last character come through?
Maybe I'll try reading it as binary next...
Any suggestions appreciated!
Matt
MattB wrote:
[...] Can anyone with US English Windows put the word "acheté" in a text file and have the last character come through?
Maybe I'll try reading it as binary next...
There's no such thing as binary text. There are only bytes, which after
decoding them to characters, may become meaningful text.
The only way to solve this problem is to understand which character
encoding is being used. Can you load the file in a hex editor and try
to find out what bytes are used to represent the 'é'?
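If no hex editor is at hand, a few lines of code can show the same thing. This is only a sketch: the HexPeek module and DumpBytes function are hypothetical names, and File.ReadAllBytes assumes .NET 2.0 or later. As a reference point, é is the single byte E9 in Windows-1252 and ISO-8859-1, and the two bytes C3 A9 in UTF-8.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Public Module HexPeek
    ' Hypothetical helper: returns the first maxBytes bytes of a file
    ' as space-separated two-digit hex values.
    Public Function DumpBytes(ByVal fileName As String, ByVal maxBytes As Integer) As String
        Dim bytes As Byte() = File.ReadAllBytes(fileName)
        Dim count As Integer = Math.Min(bytes.Length, maxBytes)
        Dim sb As New StringBuilder()
        For i As Integer = 0 To count - 1
            If i > 0 Then sb.Append(" ")
            sb.Append(bytes(i).ToString("X2"))
        Next
        Return sb.ToString()
    End Function
End Module
```

For a file containing only the word "acheté", a Windows-1252 file would dump as 61 63 68 65 74 E9, while a UTF-8 file without a byte order mark would dump as 61 63 68 65 74 C3 A9.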
Cheers,
-- http://www.joergjooss.de
mailto:ne********@joergjooss.de
MattB wrote:
[...] Do you know if I can detect the encoding of the text file somehow, so this app will work correctly with differently encoded text files?
I answered this yesterday -- see http://tinyurl.com/cn7z8.
Got any links or examples?
See Jon Skeet's page on Unicode and .NET: http://www.yoda.arachsys.com/csharp/unicode.html
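One common heuristic, which is not necessarily what the linked answers recommend, is to try a strict UTF-8 decode first and fall back to a single-byte encoding only if the bytes are not valid UTF-8. The sketch below assumes .NET 2.0 or later; the EncodingGuess module, the ReadTextGuessingEncoding function and its fallback parameter are hypothetical names chosen for illustration. It is a guess, not true detection: it cannot tell one single-byte code page from another.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Public Module EncodingGuess
    ' Hypothetical helper: decode as strict UTF-8 if possible,
    ' otherwise decode with the caller-supplied fallback encoding.
    Public Function ReadTextGuessingEncoding(ByVal fileName As String, ByVal fallback As Encoding) As String
        Dim bytes As Byte() = File.ReadAllBytes(fileName)
        ' Second argument True = throw on invalid bytes instead of hiding them
        Dim strictUtf8 As New UTF8Encoding(False, True)
        Try
            Dim decoded As String = strictUtf8.GetString(bytes)
            ' GetString keeps a leading byte order mark (U+FEFF); drop it
            If decoded.Length > 0 AndAlso decoded(0) = ChrW(&HFEFF) Then
                decoded = decoded.Substring(1)
            End If
            Return decoded
        Catch ex As ArgumentException
            ' Not valid UTF-8, so assume the single-byte fallback encoding
            Return fallback.GetString(bytes)
        End Try
    End Function
End Module
```

Passing Encoding.Default as the fallback matches the "best guess" described earlier in the thread. Plain ASCII files are valid UTF-8, so they are read exactly as before.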
Cheers,
-- http://www.joergjooss.de
mailto:ne********@joergjooss.de