I want to remove duplicate entries within a text file. So if I had
this within a text file...
Applications/Diabetic Registry/
Applications/Diabetic Registry/
Applications/Diabetic Registry/
Applications/Great Plains/
Applications/Great Plains/
Applications/Great Plains/
Applications/Great Plains/Servers/
Applications/Great Plains/Servers/
Applications/HeartBase/
Applications/HeartBase/
Applications/HeartBase/
Applications/HHC/
Applications/HHC/
Applications/HHC/
Applications/HHC/
I would want the result to be this:
Applications/Diabetic Registry/
Applications/Great Plains/
Applications/Great Plains/Servers/
Applications/HeartBase/
Applications/HHC/
I've tried using StreamReader and StreamWriter simultaneously with no
success... any other ideas?
Use a StreamReader to read the lines into an array of strings. Close the
StreamReader. Loop through the array and eliminate the duplicates by
comparing each string in the array with all of the strings before it. You
can mark a duplicate by setting its entry to a blank string. Then write the
strings back to the file using a StreamWriter, skipping the blank array
members.
If your file contains blank lines that must be kept, use a different string
to mark a removed entry (e.g. "[REMOVED]").
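For illustration, here is a minimal C# sketch of the steps described above. It is not code from this thread: the class and method names are made up, and it assumes the file fits comfortably in memory and that blank lines do not need to be preserved (the blank string doubles as the "removed" marker).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DuplicateLineRemover
{
    // Blank out every entry that already appeared earlier in the array.
    // Assumption: blank lines carry no meaning, so "" can mark a removal.
    public static void BlankOutDuplicates(string[] lines)
    {
        for (int i = 1; i < lines.Length; i++)
        {
            for (int j = 0; j < i; j++)
            {
                if (lines[i].Length > 0 && lines[i] == lines[j])
                {
                    lines[i] = string.Empty;
                    break;
                }
            }
        }
    }

    public static void RemoveDuplicates(string path)
    {
        // Read everything first; the reader is closed before any writing starts.
        List<string> buffer = new List<string>();
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                buffer.Add(line);
            }
        }

        string[] lines = buffer.ToArray();
        BlankOutDuplicates(lines);

        // Overwrite the same file, skipping the blanked-out entries.
        using (StreamWriter writer = new StreamWriter(path, false))
        {
            foreach (string entry in lines)
            {
                if (entry.Length > 0)
                {
                    writer.WriteLine(entry);
                }
            }
        }
    }
}
```

Calling `DuplicateLineRemover.RemoveDuplicates(path)` rewrites the file in place. The nested loop compares each line with every earlier one, so the work grows with the square of the line count, which is fine for a short list like the sample.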
--
HTH,
Kevin Spencer
Microsoft MVP
.Net Developer
Who is Mighty Abbott?
A twin turret scalawag.
"soup_nazi" <bc*****@wfs-ops.org> wrote in message
news:11**********************@g44g2000cwa.googlegroups.com...
> I want to remove duplicate entries within a text file. [...]
If the file is large this might be a drain on resources and cause
performance problems.
Question: will the duplicate entries always be next to each other?
Can you provide some code that shows how you used the reader and writer?
There just might be something wrong with your logic.
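The adjacency question matters because, if duplicates are always next to each other (as they are in the posted sample), the file can be filtered in a single pass with a reader and a writer open at the same time. The sketch below is only an illustration under that assumption. The names are hypothetical, and the output must go to a different file than the input, since opening a StreamWriter on a file that a StreamReader still has open will normally fail.

```csharp
using System;
using System.IO;

public static class AdjacentDuplicateFilter
{
    // Copies lines from reader to writer, dropping any line that is
    // identical to the line immediately before it.
    public static void Copy(TextReader reader, TextWriter writer)
    {
        string previous = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line != previous)
            {
                writer.WriteLine(line);
            }
            previous = line;
        }
    }

    // inputPath and outputPath are assumed to be two different files.
    public static void CopyFile(string inputPath, string outputPath)
    {
        using (StreamReader reader = new StreamReader(inputPath))
        using (StreamWriter writer = new StreamWriter(outputPath, false))
        {
            Copy(reader, writer);
        }
    }
}
```

Only two lines are held in memory at any time, so this works for files of any size. It does not catch duplicates that are separated by other lines.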
> If the file is large this might be a drain on resources and cause performance problems.
If the file is *very* large, perhaps. However, I have written applications
that load hundreds of MB of data into memory without any performance issues.
Considering the sample he posted, I estimated that the size of the file is
not likely to be very large.
Other solutions that would handle very large files and check for duplicate
lines would definitely slow down performance. Disk IO is costly and slow,
especially in a managed app. When possible, it's best to read an entire file
into memory and work with it from there.
Yes, it would be possible to open a stream to the file, and read a line (or
a chunk of lines) at a time, comparing each line to another line (or chunk
of lines) read from the stream. If it were a very large file, this might be
necessary. But again, it would be costly to do so, because of the constant
disk IO involved. In addition, the constant re-allocation of strings would
consume a lot of managed memory. You'll notice that my solution did not
involve any reallocation of strings, except for the blank strings used to
replace removed strings.
Yes, my solution could be optimized a bit more. For example, rather than
replacing a string with a blank string in the array, removed strings could
be replaced with null, now that I think of it.
If you have a better idea, let's hear it.
--
HTH,
Kevin Spencer
Microsoft MVP
.Net Developer
Who is Mighty Abbott?
A twin turret scalawag.
"Peter Rilling" <pe***@nospam.rilling.net> wrote in message
news:OQ****************@TK2MSFTNGP15.phx.gbl...
> If the file is large this might be a drain on resources and cause performance problems.
On 23 Jan 2006 10:26:02 -0800, "soup_nazi" <bc*****@wfs-ops.org> wrote:
> I want to remove duplicate entries within a text file. [...]
The usual way to remove duplicates is to load the file into memory,
sort it then run through it keeping any line that does not match the
previous line.
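As a rough C# illustration of the sort-then-scan idea (a sketch with made-up names, not a definitive implementation): it sorts a copy of the lines with an ordinal comparison, then keeps each line that differs from the one before it. Note that the result comes back in sorted order, not in the file's original order.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDeduplicator
{
    public static List<string> SortAndDeduplicate(string[] lines)
    {
        // Sort a copy so the caller's array is left untouched.
        string[] sorted = (string[])lines.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);

        // After sorting, equal lines are adjacent, so one pass is enough.
        List<string> unique = new List<string>();
        for (int i = 0; i < sorted.Length; i++)
        {
            if (i == 0 || sorted[i] != sorted[i - 1])
            {
                unique.Add(sorted[i]);
            }
        }
        return unique;
    }
}
```

The lines could come from a StreamReader loop or from `File.ReadAllLines`, and the returned list can be written back with a StreamWriter.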
If the file is too big to load into memory in one piece then you will
have to look at other techniques. Either process the file in chunks
(read up on "merge sort" for ideas) or else use the structure inherent
in the example you showed. You could load the whole thing into a
tree, reducing the amount of memory used:
<ASCII art ahead - monospaced font strongly recommended>
Applications -+-> Diabetic Registry ---> end
              |
              +-> Great Plains -+-> end
              |                 |
              |                 +-> Servers ---> end
              |
              +-> HeartBase ---> end
              |
              +-> HHC ---> end
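One possible way to build such a tree in C# is sketched below. The type and method names are hypothetical. It assumes every line is a '/'-separated path ending in '/', as in the sample, and it writes the lines back out in ordinal order of the path segments, not in original file order. Each distinct segment is stored once per parent, and a repeated line simply sets the same end marker again.

```csharp
using System;
using System.Collections.Generic;

public sealed class PathNode
{
    // Child segments keyed by name; SortedDictionary keeps the output order stable.
    public readonly SortedDictionary<string, PathNode> Children =
        new SortedDictionary<string, PathNode>(StringComparer.Ordinal);

    // True when some input line ends at this node (the "end" markers in the diagram).
    public bool IsEnd;
}

public static class PathTree
{
    public static void Add(PathNode root, string line)
    {
        string[] segments = line.Split(new char[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        if (segments.Length == 0)
        {
            return; // ignore blank lines
        }

        PathNode node = root;
        foreach (string segment in segments)
        {
            PathNode child;
            if (!node.Children.TryGetValue(segment, out child))
            {
                child = new PathNode();
                node.Children.Add(segment, child);
            }
            node = child;
        }
        node.IsEnd = true; // adding the same line again changes nothing
    }

    public static List<string> ToLines(PathNode root)
    {
        List<string> lines = new List<string>();
        Collect(root, "", lines);
        return lines;
    }

    private static void Collect(PathNode node, string prefix, List<string> lines)
    {
        if (node.IsEnd)
        {
            lines.Add(prefix);
        }
        foreach (KeyValuePair<string, PathNode> pair in node.Children)
        {
            Collect(pair.Value, prefix + pair.Key + "/", lines);
        }
    }
}
```

Feed each line read from the file to `PathTree.Add`, then write the result of `PathTree.ToLines` back out.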
rossum
--
The ultimate truth is that there is no ultimate truth