Hi... about some strange characters in a text file

Hi all,
I'm a newbie in VB.NET programming.

I hope some of you can help me solve this.

I'm trying to read, parse, and save a text file into SQL Server.
The text file contains thousands of rows with about 50 columns in every row.

Everything went well until I found one text file with some strange characters. They seem to be Japanese characters (the file belongs to a Japanese company).

The problem is this:
Not all rows in the file have these strange characters, and because of them the parse function can't work properly (I'm using Substring to read every column),
because the length of the row becomes different.
The length of a row with the strange characters is 208. The length without them is only 206.

Can someone please tell me how to fix this kind of problem?

Thanks for any suggestions and answers.
Sep 3 '07 #1
Hi. I do not know about .NET programming much at all, but I know a lot about dealing with such Unicode characters in VB6, and maybe .NET is similar in this respect...?

A single one of these Unicode characters is made of two bytes.
If, for example, you ask VB to give you the length of...
"aaaか" (3 of letter 'a', then a Hiragana 'ka' - don't know if that'll come out alright on these forums)
...using Len(), then VB will say that it is 4 characters long, but if you use LenB() it'll say it's 5 characters long (5 bytes).

I'm not understanding your problem though.
You say that something isn't parsing properly, and you also seem to think VB is wrong to count it as 2 characters more.

Well, I've tried to explain why it says it's 2 characters less when you remove the Unicode character, but I don't understand what you mean about the parsing, sorry. -_-;

EDIT: So basically, if there are different functions in .NET which let you deal with strings of characters by the 'number of characters' and 'number of bytes', as there are in VB6, maybe you need to play around with the different functions to find ones which work...? (Yep, I'm still not understanding >__<)
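
For reference, here is a minimal VB.NET sketch of the character-count versus byte-count distinction described above. It is an illustration only: the module name and the sample string are made up, and the byte counts depend on which encoding is chosen. .NET strings are UTF-16 internally, so Length counts 16-bit code units, not bytes.

```vbnet
Imports System
Imports System.Text

Module LengthDemo
    Sub Main()
        ' Three ASCII letters followed by the hiragana "ka" (U+304B).
        Dim s As String = "aaa" & ChrW(&H304B)

        ' String.Length counts UTF-16 code units, so "ka" counts once.
        Console.WriteLine(s.Length)                          ' 4

        ' Byte counts depend on the encoding used to measure them.
        Console.WriteLine(Encoding.UTF8.GetByteCount(s))     ' 6 (3 + 3 bytes)
        Console.WriteLine(Encoding.Unicode.GetByteCount(s))  ' 8 (4 * 2 bytes)
    End Sub
End Module
```

In a double-byte Japanese code page such as Shift-JIS the same string would take 5 bytes (3 + 2), which matches the 5 quoted above.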
Sep 3 '07 #2

I know that it has something to do with Unicode, but I don't know how to get around it in my code.

So, it is like this.
I have one text file with almost 40,000 rows and 50 columns.
One of these columns contains some strange characters,
but only in some rows, not in all rows of the text file. That makes the length of the rows differ.

As I said before, I'm trying to read, parse, and save the text file into SQL Server.
First I read the text file line by line. After that I read and save the data column by column in every line, using Substring.
Simply put, I count the length of every column to get the data and save it to the database.

For example:

The column:
ItemCode ItemDescription InvoiceNo

The Data :
CV1025 HandkerchiefRED SX100 --> no strange character
SC22254 Leather Purse Orange U SC452 --> with strange character

Let's say the length of column ItemCode is 10,
the length of column ItemDescription is 15 (this one sometimes contains strange characters),
and the length of column InvoiceNo is 10.

But because the length becomes different in some rows, the function I wrote to read the data per column no longer works properly.

In rows without the strange character, I get the data exactly as it appears in the text file for each column:

column itemcode : CV1025
column ItemDescription : HandkerchiefRED
InvoiceNo : SX100

But in rows with the strange character, I get data like this:

column itemcode : SC22254
column ItemDescription : Leather Purse Orange (the strange character is missing)
InvoiceNo : 452 (the SC is missing)

and it can't be saved into the database.

I hope the problem is clearer now.

Can anyone help me?

Sep 3 '07 #3
Sorry, I can't help with SQLServer at all. It means nothing to me.
But I think your function would only fail if different parts of .NET count differently. For example, getting the length of a piece of text might measure in characters (a Unicode character is counted as 1), while using Substring to pick out some text might measure in bytes (a Unicode character is counted as 2).
Or something to that effect.
That's all I can do, only hint towards where I think the problem is; I don't know how to solve it. T_T
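
As a quick check of that guess, this small sketch (hypothetical module name and sample string) suggests that in .NET both String.Length and Substring count in characters (UTF-16 code units), not bytes. If so, the mismatch would more likely come from a column layout that was defined in bytes by whatever wrote the file.

```vbnet
Imports System

Module SubstringDemo
    Sub Main()
        ' "ab", then the hiragana "ka" (U+304B), then "cd": five characters.
        Dim s As String = "ab" & ChrW(&H304B) & "cd"

        ' Length and Substring use the same unit (UTF-16 code units),
        ' so an offset computed from one is valid for the other.
        Console.WriteLine(s.Length)            ' 5
        Console.WriteLine(s.Substring(3, 2))   ' cd
    End Sub
End Module
```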
Sep 3 '07 #4

Looking at your last post, I noticed you are using a space character as the separator between columns. If a field value contains a space, you may not be able to parse it properly. So why not use some other, rarely used character to separate the columns in the text file, say Chr(165) or Chr(166), a tab,
a semicolon, or an asterisk?
Another way out is to use fixed-length strings: say characters 1 to 10 contain the item code, 11 to 50 contain the item description, and so on.

Sep 3 '07 #5
Thanks anyway, Robbie.

I hope that others can give me solutions too.

Thanks in advance.
Sep 3 '07 #6
Thanks, Veena.
I think I've already been using fixed-length strings:

Dim itemDescription As String = l.Substring(85, 50)
Dim invoiceNo As String = l.Substring(135, 10)

I begin reading the item description at position 85 with length 50,
and the invoiceNo begins at position 135.

How can I fix this problem?

Sep 3 '07 #7

Can you post the code (reading and writing the text file)? How are you populating the text file? If you are using fixed length, are you padding the field values with the proper number of spaces when writing to the text file? That will make it easier to help you.

Sep 3 '07 #8
Actually I don't generate the text file. It's generated by another application.
I only use the text file as an input file in my application.
So I upload the file, open it, read it, and then work with it.

Here's the code to open the file:

Private Sub textfile()
    ' Requires Imports System.IO at the top of the file.
    Using sr As StreamReader = File.OpenText(Me.txtFile.Text)
        Dim read As String = sr.ReadLine()
        While read IsNot Nothing
            Dim x = parseLineTextfile(read)
            Try
                ' I open the connection and execute the query here
            Finally
                ' I close the connection
            End Try
            read = sr.ReadLine()
        End While
    End Using
End Sub

And here's the code to parse the line:
Private Function parseLineTextfile(ByVal l As String)
    If l.Length <= 0 Then
        Return Nothing
    End If

    Dim year As String = l.Substring(0, 4)
    Dim itemCode As String = l.Substring(70, 10)
    Dim itemDescription As String = l.Substring(80, 50)
    Dim InvoiceNo As String = l.Substring(130, 10)

    ' After that I save the data into the database using parameters

End Function

As I said, the problem is that the strange character changes the length of the rows, so the parsing code can't work properly anymore and I can't get the right data.

So, any suggestions?
And because there's no separator like a tab or | or anything else, I can't use the Split function to cut the data per column. I can only count the starting position and the length of each column. Does anyone have other ideas?
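
One thing that may be worth trying, offered as a sketch and not a confirmed fix: File.OpenText assumes the file is UTF-8 (unless it starts with a byte order mark), so a file written in another encoding can come out with damaged characters. A StreamReader can be given the encoding explicitly. In the sketch below UTF-8 is only a stand-in so the code runs without setup; the real file's encoding is unknown here, and a Japanese code page such as 932 (Shift-JIS) is just a guess. The module name and the temporary file are made up for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module ReadWithEncodingDemo
    Sub Main()
        ' Assumption: UTF-8 stands in for the file's real (unknown) encoding.
        ' On .NET Framework, Encoding.GetEncoding(932) would select Shift-JIS.
        Dim enc As Encoding = Encoding.UTF8

        ' Write a one-line sample file: "aaa" plus the hiragana "ka" (U+304B).
        Dim fileName As String = Path.GetTempFileName()
        File.WriteAllText(fileName, "aaa" & ChrW(&H304B) & Environment.NewLine, enc)

        ' Same read loop as in the post, but with the encoding stated explicitly.
        Using sr As New StreamReader(fileName, enc)
            Dim line As String = sr.ReadLine()
            While line IsNot Nothing
                Console.WriteLine(line.Length)   ' 4
                line = sr.ReadLine()
            End While
        End Using

        File.Delete(fileName)
    End Sub
End Module
```

When the reader's encoding matches the file, each Japanese character arrives as one character in the string, whatever number of bytes it took on disk.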

Sep 4 '07 #9

What type of strange characters do you have? Can you post some text lines which contain these characters?

Sep 4 '07 #10
Here are two rows of data:

200202ML11 SM01 GUDANG WM05 PART ASSY V844930 RACK MOLDING (R) CLP-120/130/150/170 Y20020228693374 PC 288.0000 2.0325 585.36 KZZ

--> without strange characters --> length 208

200202ML11 SM01 GUDANG WM05 PART ASSY V845560 CUSHION 380X7XT4 Y20020206729872 PC 50.0000 0.0241 1.21 KZZ
--> with strange characters ( ) --> length 206

For my database I'm using varchar.

How can I count this strange character as 1 character, not as 2 bytes?
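
If the column positions in the file are defined in bytes (an assumption, the thread does not confirm it), one possible approach is to cut each field by byte offset instead of character offset. The sketch below uses a hypothetical helper named SliceByBytes, with UTF-8 standing in for the file's real encoding; none of these names come from the original code.

```vbnet
Imports System
Imports System.Text

Module ByteSliceDemo
    ' Hypothetical helper: returns the text held in byteCount bytes starting at
    ' startByte, measuring the line in the given encoding instead of in characters.
    Function SliceByBytes(ByVal line As String, ByVal enc As Encoding, ByVal startByte As Integer, ByVal byteCount As Integer) As String
        Dim raw As Byte() = enc.GetBytes(line)
        Return enc.GetString(raw, startByte, byteCount)
    End Function

    Sub Main()
        ' Assumption: UTF-8 stands in for the file's real encoding.
        Dim enc As Encoding = Encoding.UTF8
        ' "AB", then the hiragana "ka" (3 bytes in UTF-8), then "CD".
        Dim line As String = "AB" & ChrW(&H304B) & "CD"

        ' Bytes 0-1 hold "AB", bytes 2-4 hold "ka", bytes 5-6 hold "CD".
        Console.WriteLine(SliceByBytes(line, enc, 5, 2))   ' CD
        ' Counted in characters, "CD" starts at index 3 instead.
        Console.WriteLine(line.Substring(3, 2))            ' CD
    End Sub
End Module
```

A byte range that starts or ends in the middle of a multi-byte character will decode to replacement characters, so this only helps when the byte layout really is fixed and fields begin on character boundaries.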

Sep 4 '07 #11

Before you parse the string, you can build another string which excludes all the special characters, something like this:

Public Function RemoveSpChar(ByVal TempStr As String) As String
    Dim i As Integer
    Dim NewStr As String
    Dim TStr As String
    Dim TAsc As Integer
    NewStr = ""
    For i = 1 To Len(TempStr)
        TStr = Mid(TempStr, i, 1)
        TAsc = Asc(TStr)
        If TAsc >= 65 And TAsc <= 122 Then
            ' A to Z and a to z (plus the punctuation between them)
        ElseIf TAsc >= 32 And TAsc <= 57 Then
            ' space, punctuation and digits
        Else
            ' special character: drop it
            TStr = ""
        End If
        NewStr = NewStr & TStr
    Next
    RemoveSpChar = NewStr
End Function
Maybe you can check for a few more ASCII codes and drop the character if it doesn't fall within your accepted ranges.

Sep 4 '07 #12
I'll try the code now..

Sep 5 '07 #13
Veena, it doesn't work.
I can't catch the strange character.

But thanks anyway. I'm still trying to find the solution.
Sep 5 '07 #14
Veena....I've got it..
thanks a lot for your time and suggestion..:)

Sep 5 '07 #15
