Drew,
With the high bit stripped, 0x93 in ASCII becomes 0x13 and 0x94 becomes 0x14.
As Herfried stated, ASCII is a 7-bit character set; the high bit is ignored
at best and raises an exception at worst (0x93 and 0x94 are simply not valid
ASCII).
When you open Notepad you are working in ANSI, with a specific code page (the
code page is set in the Windows Control Panel). ANSI is a full 8-bit
character set. 0x93 and 0x94 are typographic quote characters in the US ANSI
code page; I believe they are typographic quote characters in most European
ANSI code pages as well.
Based on your original post, it appears you are using
System.IO.File.OpenText, which opens the file as UTF-8. 0x93 and 0x94 are NOT
typographic quote characters in UTF-8! UTF-8 is a variable-length encoding
for Unicode built from 8-bit units: only code points 0 to 127 are stored as a
single byte, and every code point above that takes two or more bytes, so a
lone 0x93 or 0x94 byte is not a valid character. The typographic quote
characters are 0x201C and 0x201D in Unicode.
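To make the difference concrete, here is a minimal sketch (not from the
original posts) that decodes the bytes 0x93 and 0x94 with two encodings and
prints the resulting code points. It assumes the .NET Framework, where code
page 1252 is available through Encoding.GetEncoding; on newer runtimes that
code page may need to be registered first. The module name EncodingDemo and
the helper DumpCodePoints are made up for illustration.

```vbnet
Imports System
Imports System.Text

Module EncodingDemo
    ' Hypothetical helper: prints each character of s as a hex code point.
    Sub DumpCodePoints(ByVal label As String, ByVal s As String)
        Console.Write(label & ":")
        For Each c As Char In s
            Console.Write(" U+" & AscW(c).ToString("X4"))
        Next
        Console.WriteLine()
    End Sub

    Sub Main()
        Dim raw As Byte() = New Byte() {&H93, &H94}

        ' Windows-1252 (the US ANSI code page) maps these bytes
        ' to U+201C and U+201D.
        Dim ansi As Encoding = Encoding.GetEncoding(1252)
        DumpCodePoints("Windows-1252", ansi.GetString(raw))

        ' In UTF-8 a lone 0x93 or 0x94 is an invalid sequence. What the
        ' decoder emits for it (nothing, or a replacement character)
        ' depends on the framework version.
        DumpCodePoints("UTF-8", Encoding.UTF8.GetString(raw))

        ' Going the other way: U+201C takes three bytes in UTF-8.
        Dim leftQuote As String = ChrW(&H201C)
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(leftQuote)
        Console.WriteLine("U+201C as UTF-8: " & BitConverter.ToString(utf8Bytes))
    End Sub
End Module
```

The last line should show the three bytes E2-80-9C, which is why a file
containing the single byte 0x93 cannot be read as a quote character through a
UTF-8 reader.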
For a full explanation of Unicode and encodings, see:
http://www.yoda.arachsys.com/csharp/unicode.html
To see the Unicode code point for a character in Character Map, look in the
lower left corner of the window. It gives the Unicode code point, while the
lower right gives the ANSI/keyboard shortcut. Note that the "character set"
combo box in Character Map corresponds to the Encoding in .NET.
Common Unicode typographic quote chars include, but are not limited to:
' what most people think of quote chars
Const Apostrophe As Char = ChrW(&H27) ' single quotes
Const Quote As Char = ChrW(&H22) ' double quotes
' various typographic quote characters
Const LeftSingleQuote As Char = ChrW(&H2018)
Const RightSingleQuote As Char = ChrW(&H2019)
Const LeftDoubleQuote As Char = ChrW(&H201C)
Const RightDoubleQuote As Char = ChrW(&H201D)
' other typographic quote characters (international)
' Note: HP48 uses these for delimiters
Const LeftPointingDoubleAngleQuote As Char = ChrW(&HAB)
Const RightPointingDoubleAngleQuote As Char = ChrW(&HBB)
' low-9 and reversed-9 typographic quote characters (international)
Const SingleLow9Quote As Char = ChrW(&H201A)
Const SingleHighReversed9Quote As Char = ChrW(&H201B)
Const DoubleLow9Quote As Char = ChrW(&H201E)
The above are Unicode code points, so they are valid for any Unicode encoding
(UTF-8, for example).
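As one possible use of these code points, the sketch below maps the
typographic quotes to their plain equivalents. The module QuoteCleanup and the
function NormalizeQuotes are hypothetical names chosen for this example, and
the angle quotes (&HAB, &HBB) are deliberately left alone.

```vbnet
Imports System

Module QuoteCleanup
    ' Hypothetical helper: replaces typographic quotes with plain ones.
    Function NormalizeQuotes(ByVal value As String) As String
        Dim result As String = value

        ' Single-quote variants become an apostrophe (&H27).
        result = result.Replace(ChrW(&H2018), ChrW(&H27))
        result = result.Replace(ChrW(&H2019), ChrW(&H27))
        result = result.Replace(ChrW(&H201A), ChrW(&H27))
        result = result.Replace(ChrW(&H201B), ChrW(&H27))

        ' Double-quote variants become a plain double quote (&H22).
        result = result.Replace(ChrW(&H201C), ChrW(&H22))
        result = result.Replace(ChrW(&H201D), ChrW(&H22))
        result = result.Replace(ChrW(&H201E), ChrW(&H22))

        Return result
    End Function

    Sub Main()
        Dim sample As String = ChrW(&H201C) & "Hello" & ChrW(&H201D)
        ' Prints the word Hello wrapped in plain double quotes.
        Console.WriteLine(NormalizeQuotes(sample))
    End Sub
End Module
```

This only helps once the file has been decoded with the right Encoding; if
the quotes were already lost while reading, there is nothing left to replace.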
Hope this helps
Jay
"Drew Berkemeyer" <sp**@menow.com> wrote in message
news:uG**************@TK2MSFTNGP11.phx.gbl...
Thank you for the reply, but...
ASCII 0x93 and 0x94 are *not* control characters. They are the open and
close quotes.
If I open Notepad and type Alt+0147 (0x93) I get “. If I type Alt+0148
(0x94) I get ”.
So, to answer my own question posted earlier... Here's the solution. I was
not using the proper Encoding. I had created a StreamReader like this:
Dim sr As StreamReader = New StreamReader(strFilePath)
The correct code for reading in a plain text file (which does not eat “
and ” chars) is:
Dim sr As StreamReader = New StreamReader(strFilePath, _
System.Text.Encoding.Default)
Dim input As String = sr.ReadLine()
While Not input Is Nothing
strReturn += input + vbCrLf
input = sr.ReadLine()
End While
sr.Close()
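A more compact variant of the same idea is sketched below. It is not part of
the original solution and assumes .NET Framework 2.0 or later (for Using and
File.WriteAllBytes); ReadAnsiText and the temporary file are hypothetical.
Note that Encoding.Default is the system ANSI code page on the .NET
Framework, while newer runtimes may return UTF-8 instead.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module ReadAnsiFile
    ' Hypothetical helper: reads a whole file with the system default encoding.
    Function ReadAnsiText(ByVal filePath As String) As String
        ' Using closes the reader even if an exception is thrown.
        Using reader As New StreamReader(filePath, Encoding.Default)
            Return reader.ReadToEnd()
        End Using
    End Function

    Sub Main()
        ' Write the bytes 0x93, "A", 0x94 to a temporary file, then read them back.
        Dim tempPath As String = Path.GetTempFileName()
        File.WriteAllBytes(tempPath, New Byte() {&H93, &H41, &H94})

        Dim content As String = ReadAnsiText(tempPath)
        For Each c As Char In content
            Console.WriteLine("U+" & AscW(c).ToString("X4"))
        Next

        File.Delete(tempPath)
    End Sub
End Module
```

Unlike the ReadLine loop, ReadToEnd keeps the file's own line endings and
does not append a trailing vbCrLf.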
Thanks again for your help. I appreciate the effort.
- Drew
""Peter Huang"" <v-******@online.microsoft.com> wrote in message
news:X5**************@cpmsftngxa10.phx.gbl...
Hi Drew,
In addition to Herfried's suggestion, it seems that the " character is
represented by 0x22 in ASCII.
You may try the code below.
Sub Main()
For i As Integer = 0 To 255
Console.WriteLine("0x" + Hex(i) + ": " + Chr(i))
Next
End Sub
You will find results like the following.
0x22: "
and
0x93:
0x94:
These two are control characters; they are not printable characters.
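Whether these values come out as control characters can depend on which
function does the conversion. The sketch below (an illustration added here,
not part of the original test) compares Chr, which interprets the value in
the current thread's ANSI code page, with ChrW, which treats it directly as a
Unicode code point. The module name ChrVersusChrW is made up, and the Chr
column will vary with the machine's code page.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module ChrVersusChrW
    Sub Main()
        For Each code As Integer In New Integer() {&H22, &H93, &H94}
            ' Chr: value is a character code in the current ANSI code page.
            Dim viaChr As Char = Chr(code)
            ' ChrW: value is a Unicode code point.
            Dim viaChrW As Char = ChrW(code)

            Console.Write("0x" & Hex(code))
            Console.Write("  Chr -> U+" & AscW(viaChr).ToString("X4"))
            Console.Write("  ChrW -> U+" & AscW(viaChrW).ToString("X4"))
            Console.WriteLine("  ChrW is control: " & Char.IsControl(viaChrW).ToString())
        Next
    End Sub
End Module
```

U+0093 and U+0094 are control characters in Unicode, so the ChrW column
reports them as such, while Chr on a machine using the US ANSI code page
would be expected to return the typographic quotes instead.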
If I have misunderstood anything, please feel free to let me know.
Best regards,
Peter Huang
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no
rights.