Modifying a text file - .NET Framework

soup_nazi

I want to remove duplicate entries within a text file. So if I had
this within a text file...

Applications/Diabetic Registry/
Applications/Diabetic Registry/
Applications/Diabetic Registry/
Applications/Great Plains/
Applications/Great Plains/
Applications/Great Plains/
Applications/Great Plains/Servers/
Applications/Great Plains/Servers/
Applications/HeartBase/
Applications/HeartBase/
Applications/HeartBase/
Applications/HHC/
Applications/HHC/
Applications/HHC/
Applications/HHC/

I would want the result to be this:

Applications/Diabetic Registry/
Applications/Great Plains/
Applications/Great Plains/Servers/
Applications/HeartBase/
Applications/HHC/

I've tried using StreamReader and StreamWriter simultaneously with no
success...any other ideas?

Jan 23 '06 #1

Kevin Spencer

Use a StreamReader to read the lines into an array of strings, then close the
StreamReader. Loop through the array and eliminate the duplicates by
comparing each string in the array with all of the strings before it. You
can mark a duplicate by setting its entry to a blank string. Then write the
array back to the file using a StreamWriter, skipping the blank array
members.

If your file contains blank lines, use a different string to indicate a
removed string (e.g. "[REMOVED]").
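
A minimal sketch of the steps above, in C#. The class and method names are made up for illustration, and it marks removed entries with null instead of a blank string or "[REMOVED]", which is a substitution so that genuine blank lines in the file are treated like any other line. Treat it as one possible reading of the description, not a tested production routine.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper; names are illustrative only.
static class MarkAndSkipDedup
{
    // Sets every entry that already appeared earlier in the array to null.
    public static void MarkDuplicates(string[] lines)
    {
        for (int i = 1; i < lines.Length; i++)
        {
            for (int j = 0; j < i; j++)
            {
                // Entries already set to null never match a real line.
                if (lines[j] != null && lines[j] == lines[i])
                {
                    lines[i] = null;
                    break;
                }
            }
        }
    }

    public static void RemoveDuplicateLines(string path)
    {
        // Read everything first and close the reader before writing.
        var buffer = new List<string>();
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                buffer.Add(line);
            }
        }

        string[] lines = buffer.ToArray();
        MarkDuplicates(lines);

        // Overwrite the original file, skipping the entries marked as removed.
        using (var writer = new StreamWriter(path, false))
        {
            foreach (string line in lines)
            {
                if (line != null)
                {
                    writer.WriteLine(line);
                }
            }
        }
    }
}
```

Because the reader is closed before the writer is opened, the two never hold the same file at once. The nested loop compares each line with all earlier ones, so the work grows with the square of the line count.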

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
Who is Mighty Abbott?
A twin turret scalawag.

Jan 23 '06 #2

Peter Rilling

If the file is large this might be a drain on resources and cause
performance problems.

Jan 23 '06 #3

Peter Rilling

Question, will the duplicate entries always be next to each other?

Can you provide some code that shows how you used the reader and writer?
There just might be something wrong with your logic.

Jan 23 '06 #4

Kevin Spencer

> If the file is large this might be a drain on resources and cause
> performance problems.

If the file is *very* large, perhaps. However, I have written applications
that load hundreds of MB of data into memory without any performance issues.
Considering the sample he posted, I estimated that the size of the file is
not likely to be very large.

Other solutions that would handle very large files and check for duplicate
lines would definitely slow down performance. Disk IO is costly and slow,
especially in a managed app. When possible, it's best to read an entire file
into memory and work with it from there.

Yes, it would be possible to open a stream to the file, and read a line (or
a chunk of lines) at a time, comparing each line to another line (or chunk
of lines) read from the stream. If it were a very large file, this might be
necessary. But again, it would be costly to do so, because of the constant
disk IO involved. In addition, the constant re-allocation of strings would
consume a lot of managed memory. You'll notice that my solution did not
involve any reallocation of strings, except for the blank strings used to
replace removed strings.
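
For comparison, here is a rough sketch of the line-at-a-time variant. It is not the chunk-against-chunk comparison described above: it swaps in a HashSet<string> of lines already seen (available on current .NET versions), so the file is read once and written once. The class name, method names, and the ".tmp" suffix are assumptions for illustration. Memory use is still proportional to the number of distinct lines, so this only helps when the file has many repeats.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper; names are illustrative only.
static class StreamingDedup
{
    // Copies input to output, dropping any line that was already written.
    public static void Copy(TextReader input, TextWriter output)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        string line;
        while ((line = input.ReadLine()) != null)
        {
            // HashSet.Add returns false when the line is already in the set.
            if (seen.Add(line))
            {
                output.WriteLine(line);
            }
        }
    }

    public static void RemoveDuplicateLines(string path)
    {
        // Write to a separate temporary file (assumed name) so the reader
        // and the writer never target the same file at the same time.
        string tempPath = path + ".tmp";
        using (var reader = new StreamReader(path))
        using (var writer = new StreamWriter(tempPath, false))
        {
            Copy(reader, writer);
        }

        // Swap the deduplicated copy into place.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

The first occurrence of each line is kept in its original position, and the duplicates do not need to be next to each other.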

Yes, my solution could be optimized a bit more. For example, rather than
replacing a string with a blank string in the array, removed strings could
be replaced with null, now that I think of it.

If you have a better idea, let's hear it.

Jan 24 '06 #5

rossum

On 23 Jan 2006 10:26:02 -0800, "soup_nazi" <bc*****@wfs-ops.org>
wrote:

> I want to remove duplicate entries within a text file.
> [...]
> I've tried using StreamReader and StreamWriter simultaneously with no
> success...any other ideas?

The usual way to remove duplicates is to load the file into memory,
sort it then run through it keeping any line that does not match the
previous line.
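
A small C# sketch of the sort-then-compare idea, assuming the lines are already in memory. The class and method names are invented for illustration. Note that sorting changes the line order: with the ordinal comparison used here, "Applications/HHC/" would come out before "Applications/HeartBase/", unlike the order in the original post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; names are illustrative only.
static class SortDedup
{
    // Returns the distinct lines in ordinal sort order.
    public static List<string> SortedUnique(IEnumerable<string> lines)
    {
        var sorted = new List<string>(lines);
        sorted.Sort(StringComparer.Ordinal);

        var result = new List<string>();
        string previous = null;
        foreach (string line in sorted)
        {
            // After sorting, equal lines are adjacent, so comparing with
            // the previous line is enough to detect a duplicate.
            if (previous == null || line != previous)
            {
                result.Add(line);
            }
            previous = line;
        }
        return result;
    }
}
```

The sort and the equality test both use ordinal comparison on purpose: if they disagreed (say, a case-insensitive sort with a case-sensitive comparison), equal lines might not end up next to each other.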

If the file is too big to load into memory in one piece then you will
have to look at other techniques. Either process the file in chunks
(read up on "merge sort" for ideas) or else use the structure inherent
in the example you showed. You could load the whole thing into a
tree, reducing the amount of memory used:
<ASCII art ahead - monospaced font strongly recommended>

Applications -+-> Diabetic Registry ---> end
              |
              +-> Great Plains -+-> end
              |                 |
              |                 +-> Servers ---> end
              |
              +-> HeartBase ---> end
              |
              +-> HHC ---> end
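
One way the tree could look in C#, as a sketch only. PathNode, PathTree and their members are hypothetical names, and the code assumes every line is a "/"-separated path ending in "/", as in the sample. Each segment is stored once per parent, and the "end" markers from the diagram become a flag on the node.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type: one instance per distinct path segment.
sealed class PathNode
{
    // Child segments keyed by name, kept in ordinal order.
    public readonly SortedDictionary<string, PathNode> Children =
        new SortedDictionary<string, PathNode>(StringComparer.Ordinal);

    // True when some input line ended exactly here ("end" in the diagram).
    public bool IsEnd;
}

static class PathTree
{
    // Inserts one line; adding the same line again changes nothing.
    public static void Add(PathNode root, string line)
    {
        PathNode node = root;
        string[] segments =
            line.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string segment in segments)
        {
            PathNode child;
            if (!node.Children.TryGetValue(segment, out child))
            {
                child = new PathNode();
                node.Children.Add(segment, child);
            }
            node = child;
        }
        node.IsEnd = true;
    }

    // Walks the tree depth-first and rebuilds each distinct line once.
    public static IEnumerable<string> Lines(PathNode node, string prefix)
    {
        if (node.IsEnd)
        {
            yield return prefix;
        }
        foreach (KeyValuePair<string, PathNode> pair in node.Children)
        {
            foreach (string line in Lines(pair.Value, prefix + pair.Key + "/"))
            {
                yield return line;
            }
        }
    }
}
```

To use it, create a root with new PathNode(), call PathTree.Add for every line read from the file, then write out PathTree.Lines(root, ""). The output comes back in the tree's ordinal order, not the order of the input file.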

rossum
--

The ultimate truth is that there is no ultimate truth

Jan 24 '06 #6
