
Hi...about some strange character in textfile

Newbie
 
Join Date: Sep 2007
Posts: 9
#1: Sep 3 '07
Hi all,
I'm a newbie in VB.NET programming.

I hope some of you can help me solve this.

I'm trying to read, parse, and save a text file into SQL Server.
The text file contains thousands of rows with about 50 columns in every row.

Everything went well until I found one text file with some strange characters. They seem to be Japanese characters (the file belongs to a Japanese company).

The problem is this:
Not all rows in this file have these strange characters, and because of them the parse function can't work properly (I'm using Substring to read every column),
because the length of the row becomes different.
The length of a row with the strange characters is 208. The length without them is only 206.

Can someone please tell me how to fix this kind of problem?

Thanks for any suggestions and answers.
Robbie
Familiar Sight
 
Join Date: Mar 2007
Location: igirisu~
Posts: 184
#2: Sep 3 '07

Hi. I do not know about .NET programming much at all, but I know a lot about dealing with such Unicode characters in VB6, and maybe .NET is similar in this respect...?

A single one of these Unicode characters is made of two bytes.
If, for example, you ask VB to give you the length of...
"aaaか" (3 of letter 'a', then a Hiragana 'ka' - don't know if that'll come out alright on these forums)
...using Len(), then VB will say that it is 4 characters long, but if you use LenB() it'll say it's 5 characters long (5 bytes).
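
For anyone checking this in VB.NET, here is a minimal sketch of the characters-versus-bytes distinction. It assumes nothing about the poster's file: the module name is made up, and the two encodings shown are simply ones built into every .NET version. String.Length counts characters (UTF-16 code units), while the byte count depends entirely on which encoding is used to write the text out.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module CharVsByteCount
    Sub Main()
        ' Three letters 'a' followed by Hiragana "ka" (U+304B).
        Dim s As String = "aaa" & ChrW(&H304B)

        ' Number of characters in the string.
        Console.WriteLine(s.Length)                          ' 4

        ' Number of bytes when the same text is encoded as UTF-8.
        Console.WriteLine(Encoding.UTF8.GetByteCount(s))     ' 6

        ' Number of bytes when encoded as UTF-16 (two bytes per character here).
        Console.WriteLine(Encoding.Unicode.GetByteCount(s))  ' 8
    End Sub
End Module
```

In a double-byte code page such as Shift-JIS the same text would take 5 bytes (one per 'a', two for the kana), so a file laid out in fixed byte widths stops lining up with character positions as soon as a multi-byte character appears.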

I'm not understanding your problem though.
You say that it's not letting something parse properly, and you also seemed to think that VB thinking it was 2 characters was wrong.

Well, I've tried to explain why it says it's 2 characters less when you remove the Unicode character, but I don't understand what you mean about the parsing, sorry. -_-;

EDIT: So basically, if there are different functions in .NET which let you deal with strings of characters by the 'number of characters' and 'number of bytes', as there are in VB6, maybe you need to play around with the different functions to find ones which work...? (Yep, I'm still not understanding >__<)
Newbie
 
Join Date: Sep 2007
Posts: 9
#3: Sep 3 '07

Thanks, Robbie.

I know that it has something to do with Unicode, but I don't know how to get around it in my code.

So, it is like this:
I have one text file with almost 40,000 rows and 50 columns.
One of these columns contains some strange characters,
but they show up only in some rows, not in all of them. That makes the rows different lengths.

As I said before, I'm trying to read, parse, and save the text file into SQL Server.
First I read the text file line by line. Then I read and save the data column by column in every line, using Substring.
Simply put, I rely on the length of every column to extract the data and save it to the database.

For ex:

The column:
ItemCode ItemDescription InvoiceNo

The Data :
CV1025 HandkerchiefRED SX100 --> no strange character
SC22254 Leather Purse Orange U SC452 --> with strange character

Let's say the length of the ItemCode column is 10,
the length of the ItemDescription column is 15 (this one sometimes contains strange characters),
and the length of the InvoiceNo column is 10.


But because the length differs in some rows, the function I wrote to read out the data per column no longer works properly.

In rows without strange characters I get the data exactly as it appears in the text file, column by column:

column itemcode : CV1025
column ItemDescription : HandkerchiefRED
InvoiceNo : SX100


But in rows with the strange characters, I get the data like this:

column itemcode : SC22254
column ItemDescription : Leather Purse Orange (the strange character is missing)
InvoiceNo : 452 (the SC is missing)

and it can't be saved into the database.

I hope the problem is clearer now.

Can anyone help me?

Thanks.
Robbie
Familiar Sight
 
Join Date: Mar 2007
Location: igirisu~
Posts: 184
#4: Sep 3 '07

Sorry, I can't help with SQLServer at all. It means nothing to me.
But I think that your function would only fail if different parts of .NET are working differently. For example, getting the length of a piece of text measures in characters (a Unicode character is counted as 1), but using Substring to pick out some text measures in bytes (a Unicode character is counted as 2).
Or something to that effect.
That's all I can do, only hint towards where I think the problem is; I don't know how to solve it. T_T
QVeen72
Moderator
 
Join Date: Oct 2006
Location: Bangalore
Posts: 1,385
#5: Sep 3 '07

Hi,

Looking at your last post, I noticed you are using a space character as a separator between columns. If a field value contains a space, you may not be able to parse/read it properly. So why not use some other unusual character to separate the columns in the text file, say Chr(165) or Chr(166), or a tab,
a semicolon, or a star?
Another way out is to use fixed-length strings: say characters 1 to 10 contain the item code, 11 to 50 contain the item description, and so on.

Regards
Veena
Newbie
 
Join Date: Sep 2007
Posts: 9
#6: Sep 3 '07

Thanks anyway, Robbie.

I hope that others can give me solutions too.

Thanks in advance.
Newbie
 
Join Date: Sep 2007
Posts: 9
#7: Sep 3 '07

Thanks, Veena.
I think I'm already using fixed-length strings:

Dim itemDescription As String = l.Substring(85, 50)
Dim invoiceNo As String = l.Substring(135, 10)

I start reading the item description at position 85 with length 50,
and the invoiceNo begins at position 135.

How can I fix this problem?

Thanks.
QVeen72
Moderator
 
Join Date: Oct 2006
Location: Bangalore
Posts: 1,385
#8: Sep 3 '07

Hi,

Can you post the code (reading and writing the text file)? How are you populating the text file? If you are using fixed length, are you padding the field values with the proper spaces when writing to the text file? Then it will be easier to help you.

Regards
Veena
Newbie
 
Join Date: Sep 2007
Posts: 9
#9: Sep 4 '07

Actually, I don't generate the text file. It's generated by another application.
I only use the text file as an input file in my application.
So I upload the file, open it, read it, and then work with it.

Here's the code to open the file:

Private Sub textfile()
    ' Needs Imports System.IO at the top of the file
    Dim sr As StreamReader = File.OpenText(Me.txtFile.Text)
    Dim read As String
    read = sr.ReadLine()
    While Not read Is Nothing
        Dim x = parseLineTextfile(read)
        read = sr.ReadLine()
        Try
            ' I open the connection and execute the query here
        Finally
            ' I close the connection
        End Try
    End While
    sr.Close()
End Sub


and here's the code to parse the line:
Private Function parseLineTextfile(ByVal l As String)
    If l.Length <= 0 Then
        Return Nothing
    End If

    Dim year As String = l.Substring(0, 4)
    '
    '
    '
    Dim itemCode As String = l.Substring(70, 10)
    Dim itemDescription As String = l.Substring(80, 50)
    Dim InvoiceNo As String = l.Substring(130, 10)
    '
    '
    '

    'after that I save the data into the database using parameter

End Function

As I said, the problem is that the strange characters change the length of the rows, so the parsing code no longer works properly and I can't get the right data.

So, any suggestions?
And because there's no separator like a tab or | or anything else, I can't use the Split function to cut the data into columns. I can only count the starting position and the length of each column. Does anyone have other ideas?
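
One thing that may be worth trying with code like the above: File.OpenText always decodes the file as UTF-8. If the file was written in a legacy code page (for a Japanese system, Shift-JIS is a plausible guess, but that is an assumption, not something the thread confirms), bytes that are not valid UTF-8 can be dropped or replaced during decoding, which would be one way the row length could change. Below is a minimal sketch of opening the file with an explicit encoding instead. The helper name OpenWithEncoding and the sample bytes are made up for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module ExplicitEncodingSketch
    ' Hypothetical helper: opens a text file with a caller-chosen encoding
    ' instead of the UTF-8 default used by File.OpenText.
    Function OpenWithEncoding(ByVal filePath As String, ByVal enc As Encoding) As StreamReader
        Return New StreamReader(filePath, enc)
    End Function

    Sub Main()
        ' Build a small sample file: "AB", two high bytes (&HB8, &HDB), then "CD".
        Dim samplePath As String = Path.GetTempFileName()
        File.WriteAllBytes(samplePath, New Byte() {&H41, &H42, &HB8, &HDB, &H43, &H44})

        ' ISO-8859-1 is used here only because it maps every byte to exactly one
        ' character and is built into every .NET version. For the real file the
        ' assumed choice would be Encoding.GetEncoding("shift_jis").
        Using sr As StreamReader = OpenWithEncoding(samplePath, Encoding.GetEncoding("iso-8859-1"))
            Dim line As String = sr.ReadLine()
            Console.WriteLine(line.Length)           ' 6: every byte became one character
            Console.WriteLine(line.Substring(4, 2))  ' CD: the later columns stay in place
        End Using

        File.Delete(samplePath)
    End Sub
End Module
```

On newer .NET versions (Core and later), code-page encodings such as Shift-JIS are not available until the code pages encoding provider has been registered, so Encoding.GetEncoding("shift_jis") may throw there without that extra step.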

thanks....
QVeen72
Moderator
 
Join Date: Oct 2006
Location: Bangalore
Posts: 1,385
#10: Sep 4 '07

Hi,

What type of Strange Chars do you have? Can you post some text lines which contain these Characters..?

Regards
Veena
Newbie
 
Join Date: Sep 2007
Posts: 9
#11: Sep 4 '07

Here are two rows of data:

200202ML11 SM01 GUDANG WM05 PART ASSY V844930 RACK MOLDING (R) CLP-120/130/150/170 Y20020228693374 PC 288.0000 2.0325 585.36 KZZ

--> without strange characters --> length 208


200202ML11 SM01 GUDANG WM05 PART ASSY V845560 CUSHION 380X7XT4 ¸Û Y20020206729872 PC 50.0000 0.0241 1.21 KZZ
--> with strange characters ( ¸Û ) --> length 206

For my database I'm using varchar.

How can I count this strange character as 1 character, not as 2 bytes?
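
If the column positions in the file are really byte positions rather than character positions, one possible approach is to cut each field out of the raw bytes first and decode it afterwards. The sketch below is only an illustration under that assumption: SliceField is a made-up helper, the sample line is invented, and UTF-8 stands in for whatever encoding the real file uses.

```vbnet
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module ByteSliceSketch
    ' Hypothetical helper: takes `length` bytes starting at byte `start`,
    ' decodes just that field, and trims the padding spaces on the right.
    Function SliceField(ByVal lineBytes As Byte(), ByVal start As Integer, _
                        ByVal length As Integer, ByVal enc As Encoding) As String
        Return enc.GetString(lineBytes, start, length).TrimEnd()
    End Function

    Sub Main()
        ' Stand-in encoding; the real file is assumed to use a different one.
        Dim enc As Encoding = Encoding.UTF8

        ' Invented row: a 12-byte description field, then a 5-byte invoice number.
        ' The Hiragana character U+304B is one character but three bytes in UTF-8.
        Dim line As String = "CUSHION " & ChrW(&H304B) & " SC452"
        Dim lineBytes As Byte() = enc.GetBytes(line)

        ' Character-based slicing starts two characters too late.
        Console.WriteLine(line.Substring(12))                ' 452

        ' Byte-based slicing lands on the start of the invoice number.
        Console.WriteLine(SliceField(lineBytes, 12, 5, enc)) ' SC452
    End Sub
End Module
```

The byte offsets have to match the file layout exactly. An offset that falls in the middle of a multi-byte character would produce a garbled field, so this only helps when the layout is known to be fixed in bytes.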

thanks...
QVeen72
Moderator
 
Join Date: Oct 2006
Location: Bangalore
Posts: 1,385
#12: Sep 4 '07

Hi,

Before you parse the string, you can build another string that excludes all the special characters, something like this:

Public Function RemoveSpChar(ByVal TempStr As String) As String
    ' Returns a copy of TempStr with every character outside the kept ranges removed.
    Dim i As Integer
    Dim NewStr As String
    Dim TStr As String
    Dim TAsc As Integer
    NewStr = ""
    For i = 1 To Len(TempStr)
        TStr = Mid(TempStr, i, 1)
        TAsc = Asc(TStr)
        If TAsc >= 65 And TAsc <= 122 Then
            ' A to Z, a to z, and the few symbols between them: keep
        ElseIf TAsc >= 32 And TAsc <= 57 Then
            ' space, punctuation, and digits: keep
        Else
            ' special character: drop it
            TStr = ""
        End If
        NewStr = NewStr & TStr
    Next
    RemoveSpChar = NewStr
End Function

Maybe you can check for a few more ASCII codes and zap the character if it isn't in the ranges you want to keep.

Regards
Veena
Newbie
 
Join Date: Sep 2007
Posts: 9
#13: Sep 5 '07

Thanks, Veena.
I'll try the code now.

Li
Newbie
 
Join Date: Sep 2007
Posts: 9
#14: Sep 5 '07

Veena, it doesn't work.
I can't catch the strange characters.

But thanks anyway. I'm still trying to find the solution.
Newbie
 
Join Date: Sep 2007
Posts: 9
#15: Sep 5 '07

Veena, I've got it.
Thanks a lot for your time and suggestions. :)

Li