Strip Unwanted Characters from a text file

I download some files for processing every day that have unwanted
characters in them. In VB6 I use InputB to read in the text and StrConv
to convert it to a string.

vLinesFromFile = StrConv(InputB(LOF(nFileNumGENERIC), nFileNumGENERIC), vbUnicode)

If the string has any unwanted characters (e.g. Chr(26)), I use Replace
to remove them and then save the file.

Now the size of some of these files has grown to several megabytes.
Processing them in VB6 is now slower than a slug in salt. Can someone
give me a C# program stub that can help a VB guy check for unwanted
characters and eliminate them? I'm thinking it will be much faster.

David A. Beck

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 16 '05 #1
David,

Are you writing these back to the original file, or to a new file? Either
way, you should open two streams (one to read, one to write) and wrap
them in a StreamReader and a StreamWriter respectively. As you cycle
through the characters in the stream (you can read in chunks, and you can
decide what chunk size works best), check for the unwanted character. If
it needs to be replaced, replace it before writing to the output
stream.
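
A minimal sketch of the chunked approach described above, not a definitive implementation. The class and method names (ChunkedStripper, Strip, StripFile) and the 64 KB chunk size are made up for illustration, and it assumes the output goes to a different file than the input, in an encoding you supply yourself.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; the names are not from the thread.
public static class ChunkedStripper
{
    // Copies everything from reader to writer in chunks of chunkSize
    // characters, dropping every character that appears in 'unwanted'.
    public static void Strip(TextReader reader, TextWriter writer,
                             char[] unwanted, int chunkSize)
    {
        char[] buffer = new char[chunkSize];
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Compact the wanted characters to the front of the buffer.
            int kept = 0;
            for (int i = 0; i < read; i++)
            {
                if (Array.IndexOf(unwanted, buffer[i]) < 0)
                {
                    buffer[kept++] = buffer[i];
                }
            }
            writer.Write(buffer, 0, kept);
        }
    }

    // File-based wrapper. Assumes inputPath and outputPath are different
    // files and that the caller knows the file's encoding.
    public static void StripFile(string inputPath, string outputPath,
                                 Encoding encoding, char[] unwanted)
    {
        using (StreamReader input = new StreamReader(inputPath, encoding))
        using (StreamWriter output = new StreamWriter(outputPath, false, encoding))
        {
            Strip(input, output, unwanted, 64 * 1024); // assumed chunk size
        }
    }
}
```

To drop Chr(26) and Chr(0), you would pass something like new char[] { '\u001A', '\0' } as the unwanted list. If the result has to end up under the original name, one option is to write to a temporary file and rename it afterwards.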

Hope this helps.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Nov 16 '05 #2
The simplest way would probably be to read the file line by line with a
StreamReader (using whatever encoding the file is in), use
String.Replace (or possibly a regular expression) to remove the
characters, then write the line back to the new file.
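
As a rough sketch of that line-by-line idea (the class and method names here are hypothetical, and the choice of NUL and Ctrl-Z as the unwanted characters is an assumption based on the Chr(26) mentioned in the question), both the String.Replace and the regular expression variants might look like this:

```csharp
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the thread.
public static class LineStripper
{
    // Matches NUL (0x00) and Ctrl-Z (0x1A); extend the class as needed.
    private static readonly Regex Unwanted = new Regex("[\\x00\\x1A]");

    // Variant 1: chained String.Replace calls, one per unwanted character.
    public static string CleanWithReplace(string line)
    {
        return line.Replace("\u001A", "").Replace("\0", "");
    }

    // Variant 2: a single regular expression pass.
    public static string CleanWithRegex(string line)
    {
        return Unwanted.Replace(line, "");
    }

    // Reads the input line by line, cleans each line and writes it out.
    public static void CleanLines(TextReader input, TextWriter output)
    {
        string line;
        while ((line = input.ReadLine()) != null)
        {
            output.WriteLine(CleanWithReplace(line));
        }
    }
}
```

Note that ReadLine drops the line terminator and WriteLine adds the writer's own NewLine, so line endings may be normalized and the last line gains a terminator if it did not have one.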

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #3

Jon:

Actually all I want to do is read line by line. However, some files had
chr(0) embedded in them and that gets translated as EOF.

I set up a test opening an append file and processing chunks of 1K, 10K,
and 100K on a 101MB file. The total time is not much different from
VB6 loading the whole thing at once. However, it does not kill the whole
machine in the process of doing it.

Are you saying I can use file streams in/out in VB6? Can you give me a
code stub?
Nov 16 '05 #4
David Beck <da****@beckb.com> wrote:
> Actually all I want to do is read line by line. However, some files had
> chr(0) embedded in them and that gets translated as EOF.

It shouldn't do when read by StreamReader, IIRC.

> I set up a test opening an append file and processing chunks of 1K, 10K,
> and 100K on a 101MB file. The total time is not much different from
> VB6 loading the whole thing at once. However, it does not kill the whole
> machine in the process of doing it.
>
> Are you saying I can use file streams in/out in VB6? Can you give me a
> code stub?


Well, something like:

using (StreamReader input = new StreamReader(...))
{
    using (StreamWriter output = new StreamWriter(...))
    {
        string line;
        while ((line = input.ReadLine()) != null)
        {
            // Munge the line however you want to

            output.WriteLine(line);
        }
    }
}
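
To check the point about chr(0) and EOF for yourself, a small self-contained test along these lines may help. It is only a sketch: the EofCheck class and ReadAllLines method are hypothetical names, and it feeds in-memory ASCII bytes rather than a real file.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; the names are not from the thread.
public static class EofCheck
{
    // Reads every line StreamReader can find in the given bytes.
    // If NUL or Ctrl-Z were treated as end of file, lines after the
    // first such character would be missing from the result.
    public static List<string> ReadAllLines(byte[] data, Encoding encoding)
    {
        List<string> lines = new List<string>();
        using (StreamReader reader = new StreamReader(new MemoryStream(data), encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```

Calling EofCheck.ReadAllLines(Encoding.ASCII.GetBytes("one\0two\r\nthree\u001Afour\r\nfive"), Encoding.ASCII) should give three lines, with the NUL and Ctrl-Z still sitting inside the first two, ready to be stripped in the munge step.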

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
