Strip Unwanted Characters from a text file

I download some files for processing every day that have unwanted
characters in them. In VB6 I use InputB to read in the text and StrConv
to convert it to Unicode:

vLinesFromFile = StrConv(InputB(LOF(nFileNumGENERIC), nFileNumGENERIC),
vbUnicode)

If the string has any unwanted characters (e.g. Chr(26)), I use Replace to
remove them and save the file.

Now the size of some of these files has grown to several megabytes.
Processing them in VB6 is now slower than a slug in salt. Can someone
give me a C# program stub that can help a VB guy check for unwanted
characters and eliminate them? I'm thinking it will be much faster.

David A. Beck

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 16 '05 #1
David,

Are you writing these back to the main file, or to a new file? Either
way, you should open two streams (one to read, one to write) and wrap
them in a StreamReader and a StreamWriter respectively. As you cycle
through the characters in the stream (you can read in chunks and decide
which chunk size works best), check each one against the unwanted
characters. If a character needs replacing, replace it before writing to
the output stream.
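
The chunked approach described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name CopyWithoutChars, the sample text, and the tiny chunk size are all made up for the demo, and it uses in-memory readers so it runs without touching any files. It is written with top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.IO;

// Hypothetical demo input: Chr(26) and Chr(0) mixed into ordinary text.
string sample = "abc\u001Adef\0ghi\r\n";
var source = new StringReader(sample);
var sink = new StringWriter();

long removed = CopyWithoutChars(source, sink, new[] { '\u001A', '\0' }, 4);
Console.WriteLine($"removed {removed} character(s)"); // prints "removed 2 character(s)"

// Reads fixed-size chunks and writes back only the characters to keep.
// CopyWithoutChars is a hypothetical helper, not a framework method.
static long CopyWithoutChars(TextReader input, TextWriter output, char[] unwanted, int chunkSize)
{
    char[] buffer = new char[chunkSize];
    long removedCount = 0;
    int read;
    while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        int kept = 0;
        for (int i = 0; i < read; i++)
        {
            if (Array.IndexOf(unwanted, buffer[i]) >= 0)
            {
                removedCount++;             // drop this character
            }
            else
            {
                buffer[kept++] = buffer[i]; // compact kept characters in place
            }
        }
        output.Write(buffer, 0, kept);
    }
    return removedCount;
}
```

For real files, the same helper should accept a StreamReader and a StreamWriter opened on two different paths, with a much larger chunk size (tens of kilobytes, say); which size works best is something to measure.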

Hope this helps.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Nov 16 '05 #2
The simplest way would probably be to read the file line by line with a
StreamReader (using whatever encoding the file is in), use
String.Replace (or possibly a regular expression) to remove the
characters, then write the line back to the new file.
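
As a rough illustration of the two options mentioned above, here is a small sketch of stripping characters from a single line with String.Replace and with a regular expression. The helper names StripWithReplace and StripWithRegex and the sample line are hypothetical, and the choice of Chr(26) and Chr(0) as the "unwanted" set is an assumption taken from the rest of the thread. It uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// Assumption: "unwanted" means Chr(26) (Ctrl-Z) and Chr(0) (NUL).
string line = "field1\u001A,field2\0,field3";

Console.WriteLine(StripWithReplace(line)); // prints "field1,field2,field3"
Console.WriteLine(StripWithRegex(line));   // prints "field1,field2,field3"

// One Replace call per unwanted character; an empty replacement removes it.
static string StripWithReplace(string text)
{
    return text.Replace("\u001A", "").Replace("\0", "");
}

// A single character class listing every unwanted character.
static string StripWithRegex(string text)
{
    return Regex.Replace(text, @"[\x00\x1A]", "");
}
```

With only one or two characters to remove, chained Replace calls are probably the simpler choice; the character class becomes more convenient as the list of unwanted characters grows.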

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #3

Jon:

Actually all I want to do is read line by line. However, some files had
Chr(0) embedded in them and that gets translated as EOF.

I set up a test opening an append file and processing chunks of 1K, 10K,
and 100K on a 101MB file. The total time is not much different from
VB6 loading the whole thing at once. However, it does not kill the whole
machine in the process.

Are you saying I can use file streams in/out in VB6? Can you give me a
code stub?
Nov 16 '05 #4
David Beck <da****@beckb.com> wrote:
> Actually all I want to do is read line by line. However, some files had
> Chr(0) embedded in them and that gets translated as EOF.

It shouldn't do when read by StreamReader, IIRC.
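
One way to check that claim is a small experiment like the sketch below. It is not from the thread: the helper name CountLinesAndNuls and the sample bytes are invented for the test, and ASCII is assumed as the encoding. It reads from a MemoryStream instead of a real file and uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical file content: two lines, the first with an embedded Chr(0).
byte[] bytes = Encoding.ASCII.GetBytes("first\0half\r\nsecond line\r\n");

var (lineCount, nulCount) = CountLinesAndNuls(bytes);
Console.WriteLine($"lines: {lineCount}, NULs seen: {nulCount}"); // prints "lines: 2, NULs seen: 1"

// Reads every line with StreamReader and counts the NUL characters it sees.
static (int Lines, int Nuls) CountLinesAndNuls(byte[] data)
{
    int lines = 0;
    int nuls = 0;
    using (var reader = new StreamReader(new MemoryStream(data), Encoding.ASCII))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines++;
            foreach (char c in line)
            {
                if (c == '\0') nuls++;
            }
        }
    }
    return (lines, nuls);
}
```

Both lines come back, and the NUL shows up as an ordinary character inside the first one, so it can be stripped like any other unwanted character.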

> I set up a test opening an append file and processing chunks of 1K, 10K,
> and 100K on a 101MB file. The total time is not much different from
> VB6 loading the whole thing at once. However, it does not kill the whole
> machine in the process.
>
> Are you saying I can use file streams in/out in VB6? Can you give me a
> code stub?


Well, something like:

    using (StreamReader input = new StreamReader (...))
    {
        using (StreamWriter output = new StreamWriter (...))
        {
            string line;
            while ( (line = input.ReadLine()) != null)
            {
                // Munge the line however you want to

                output.WriteLine (line);
            }
        }
    }

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
