TextFieldParser - reading tab delimited file

I'm using TextFieldParser to read a tab-delimited data file which contains, for example:

Amondó Szegi Amondo Szegi
andré nossek André Nossek
© Characte Character

Note the vowels with diacritics and the copyright symbol: the parser is
dropping these (and other similar) characters, which apparently fall
outside the ASCII range.

The code is simple and looks like:
    Using MyReader As New TextFieldParser(Application.StartupPath & "\designers.txt")
        MyReader.TextFieldType = FileIO.FieldType.Delimited
        MyReader.CommentTokens = New String() {"#"}
        MyReader.Delimiters = New String() {vbTab}
        MyReader.TrimWhiteSpace = True
        Dim currentRow As String()
        intElement = 0
        While Not MyReader.EndOfData
            Try
                currentRow = MyReader.ReadFields()
                ' A row whose first field starts with "UNKNOWN" names the fallback designer.
                If Microsoft.VisualBasic.Left(currentRow(0), 7) = "UNKNOWN" Then
                    strUnknownDesigner = currentRow(1)
                    Continue While
                End If
                arDesigner(intElement, 0) = currentRow(0)
                arDesigner(intElement, 1) = currentRow(1)
                arDesignerCounter(intElement) = 0
                intElement += 1
            Catch ex As MalformedLineException
                MsgBox("Designer Line " & ex.Message & " is not valid and will be skipped.")
            End Try
        End While
    End Using

I can't see any reason in the documentation for it dropping the copyright
symbol or the accented French and German (etc.) vowels.

Comments or suggestions anyone??

Thanks //al
Sep 21 '06 #1
3 Replies


al jones wrote:
> I'm using textfieldparser to read a data file. which contains, for
> example:
>
> Amondó Szegi Amondo Szegi
> andré nossek André Nossek
> © Characte Character
>
> Note the vowels with diacriticals and the copyright symbol - it is
> dropping these (and other similar) characters which fall outside
> ascii range (apparently)

It appears to be an encoding problem: the file uses (I'm guessing)
ISO-8859-1 or maybe Windows-1252, whereas the .NET framework defaults to
Unicode. Does a TextFieldParser have a setting for that (or have a
.BaseClass that does)?

Or perhaps you can arrange for the file to be encoded with Unicode?

Andrew
Sep 21 '06 #2
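
To make the suggestion above concrete: TextFieldParser has constructor overloads that accept a System.Text.Encoding, so the file's encoding can be stated explicitly rather than left to the default (UTF-8, as far as the documentation goes). The following is only a minimal sketch under assumptions: the module name EncodingAwareReader and the helper ReadRows are made up for illustration, and ISO-8859-1 is one of the two guesses from the reply, not a confirmed fact about designers.txt.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports Microsoft.VisualBasic
Imports Microsoft.VisualBasic.FileIO

Module EncodingAwareReader
    ' Hypothetical helper: reads every row of a tab-delimited file,
    ' decoding the bytes with the encoding supplied by the caller.
    Function ReadRows(ByVal path As String, ByVal fileEncoding As Encoding) As List(Of String())
        Dim rows As New List(Of String())
        ' The (String, Encoding) overload uses fileEncoding unless a
        ' byte-order mark in the file indicates a different encoding.
        Using parser As New TextFieldParser(path, fileEncoding)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(vbTab)
            parser.CommentTokens = New String() {"#"}
            parser.TrimWhiteSpace = True
            While Not parser.EndOfData
                rows.Add(parser.ReadFields())
            End While
        End Using
        Return rows
    End Function

    Sub Main()
        ' Assumption: the file is ISO-8859-1. If it is really Windows-1252,
        ' Encoding.GetEncoding(1252) is the equivalent call on .NET Framework.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        For Each row As String() In ReadRows("designers.txt", latin1)
            Console.WriteLine(String.Join(" | ", row))
        Next
    End Sub
End Module
```

For the sample characters in the question (ó, é, ©) ISO-8859-1 and Windows-1252 use the same byte values, so either guess would decode them identically; the two encodings differ only in the 0x80-0x9F range. The sketch leaves out the MalformedLineException handling from the original loop for brevity.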

On Thu, 21 Sep 2006 13:02:59 +0100, Andrew Morton wrote:
> al jones wrote:
>> I'm using textfieldparser to read a data file. which contains, for
>> example:
>>
>> Amondó Szegi Amondo Szegi
>> andré nossek André Nossek
>> © Characte Character
>>
>> Note the vowels with diacriticals and the copyright symbol - it is
>> dropping these (and other similar) characters which fall outside
>> ascii range (apparently)
>
> It appears to be an encoding problem where the file uses (I'm guessing)
> ISO-8859-1 or maybe Windows-1252 whereas the .NET framework defaults to
> Unicode. Does a TextFieldParser have a setting for that (or have a
> .BaseClass that does)?
>
> Or perhaps you can arrange for the file to be encoded with Unicode?
>
> Andrew

Possibly my confusion comes from the fact that I maintain these files (there
are three of them) within VS 2005, so I would have expected them to be
Unicode. The characters exist within the files (the three example lines are
cut & pasted from the file itself), so I don't understand why reading them
would literally eliminate the characters.

I've been over the TextFieldParser docs and see nothing that indicates that
it shouldn't take the data as presented.
Sep 21 '06 #3
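
One way to test the expectation that the files are Unicode is to look at their first bytes. A file saved in a legacy single-byte code page has no byte-order mark, and bytes such as 0xF3 (ó), 0xE9 (é) and 0xA9 (©) are not valid on their own as UTF-8, which would be consistent with those characters going missing when the file is decoded as UTF-8. The sketch below is an illustration only: the module name BomCheck and the function DescribeBom are hypothetical, and the absence of a BOM does not prove which encoding the file uses, since UTF-8 files may also be saved without one.

```vb
Imports System
Imports System.IO

Module BomCheck
    ' Hypothetical helper: names the byte-order mark found at the start of
    ' the data, or returns "none". UTF-32 marks are not distinguished here;
    ' a UTF-32 little-endian file would be reported as UTF-16 little-endian.
    Function DescribeBom(ByVal head As Byte()) As String
        If head.Length >= 3 AndAlso head(0) = &HEF AndAlso head(1) = &HBB AndAlso head(2) = &HBF Then
            Return "UTF-8"
        ElseIf head.Length >= 2 AndAlso head(0) = &HFF AndAlso head(1) = &HFE Then
            Return "UTF-16 little-endian"
        ElseIf head.Length >= 2 AndAlso head(0) = &HFE AndAlso head(1) = &HFF Then
            Return "UTF-16 big-endian"
        End If
        Return "none"
    End Function

    Sub Main()
        ' Assumption: designers.txt sits in the current directory.
        Dim bytes As Byte() = File.ReadAllBytes("designers.txt")
        Console.WriteLine("BOM: " & DescribeBom(bytes))
    End Sub
End Module
```

If this reports "none" and the file contains bytes above 0x7F that do not form valid UTF-8 sequences, re-saving the file from the editor with an explicit Unicode encoding, or passing the actual encoding to the parser, are the two options raised earlier in the thread.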

Try the OrchidGrid control, which can parse/import data from delimited files.
Sep 22 '06 #4
