How to add a string to a big file in C#

I want to add a string to a file, and the file is sorted alphabetically. For
example, the following is a big file:
//////////////////////
abort
black
cabbage
dog
egg
fly
..
..
////////////////////
Now I want to add "dad" to it, just after "cabbage" and in front of "dog".
Because there are so many words in the file, I need to use a binary search to
find the location.

/// <summary>
/// Tries to find the word in the given file.
/// </summary>
/// <param name="word">The word to look for.</param>
private bool find(string word)
{
    if (word == null)
    {
        throw new ArgumentNullException("word", "word is null.");
    }

    string str;
    StreamReader sr = new StreamReader(file.FullName); // file is a FileInfo object
    lock (this)
    {
        // Check whether the word is on the first line!
        str = sr.ReadLine();
        if (str == null)
        {
            return false;
        }
        // string.Compare returns an int; 0 means the strings are equal
        if (string.Compare(str.Trim(), word) == 0)
            return true;
    }

    // binary search starts
    FileStream fs = File.OpenRead(file.FullName);
    long lower = 0;
    long upper = fs.Length - 1;
    while (lower <= upper)
    {
        long index = (lower + upper) / 2;
        fs.Seek(index, SeekOrigin.Begin);

        // read off an incomplete line
        str = fs.Read();
        //// I do not know how to set the parameters of Read() so that it can
        //// read a line

        // the line might be null if it's the end of the file
        int t = str == null ? -1
            : string.Compare(word, str.Trim());
        // found it
        if (t == 0)
        {
            return true;
        }
        if (t > 0)
        {
            lower = index + 1;
        }
        else
        {
            upper = index - 1;
        }
    }
    return false;
}

That is what the method is meant to do, and my questions are:
1: Is FileStream suitable for this?
2: Is string.Compare suitable for this?
3: Is there a better way to do it?

Thanks to all!
Nov 16 '05 #1
"zjut" <zj**@discussions.microsoft.com> wrote in message
news:A2**********************************@microsoft.com...
> I want to add a string to the file and the file is sorted by letter! For
> example:
> the following file is a big file
> //////////////////////
> abort
> black
> cabbage
> dog
> egg
> fly
> .
> .
> ////////////////////
> and now I want to add "dad" into it!


Given:
* Your file ("old_file") is in alphabetical order
* old_file is immensely big

Required:
* Adding word ("new_word") to old_file at the right place.

Solution:
* Sequentially read old_file (each word read is "word_read") and write word_read to new_file
* If new_word > word_read and new_word < the next word read (word_read+1), shove new_word in between them
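
The steps above can be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the class and method names (SortedFileInsert, InsertWord) are made up for the example, and the ordinal string comparison is an assumption. Use whatever ordering the file was actually sorted with.

```csharp
using System;
using System.IO;

static class SortedFileInsert
{
    // Copies oldPath to newPath line by line and writes newWord just before
    // the first line that sorts after it (ordinal comparison is an assumption).
    public static void InsertWord(string oldPath, string newPath, string newWord)
    {
        bool inserted = false;
        using (StreamReader reader = new StreamReader(oldPath))
        using (StreamWriter writer = new StreamWriter(newPath))
        {
            string wordRead;
            while ((wordRead = reader.ReadLine()) != null)
            {
                if (!inserted && string.CompareOrdinal(newWord, wordRead) < 0)
                {
                    writer.WriteLine(newWord);
                    inserted = true;
                }
                writer.WriteLine(wordRead);
            }
            if (!inserted)
            {
                // newWord sorts after every existing line (or the file was empty)
                writer.WriteLine(newWord);
            }
        }
    }
}
```

Only one line is held in memory at a time, so the size of old_file does not matter. A word that is already present would be written a second time right after the existing copy.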

Nov 16 '05 #2
"zjut" <zj**@discussions.microsoft.com> wrote in message
news:A2**********************************@microsoft.com...
> I want to add a string to the file and the file is sort by letter! for


If you are having problems with the algorithm let me know and I will post an
example. The example sorts a short alphabetically ordered file into a very
big alphabetically ordered file. By the way, WordPerfect can deal with
immensely big files (100,000+ words). Microsoft Word can't.
Nov 16 '05 #3
I see another NG member has already given you a possible
solution, but I don't feel it would be an optimal solution. You
really have a couple of different options that all revolve around the
same set of principles. First, you know the size of the new word
you are sorting into place, so you'll want to open the file and then
grow it by that amount. This is to make sure you have room to move
the rest of the file around once your searching is done. For a
sanity check, go ahead and check the first and last elements to make
sure this isn't a trivial case.

Okay, the binary search is going to involve cutting the file in half
(you can do this based on length) and then seeking to that location.
Once you've done that, you are going to walk backwards and
forwards until you encounter newlines on either side. The text between
them is your *word*; you compare it and continue the process of
cutting the file in half (aka a binary search). Once you've found
your insertion location, you are going to do large buffer copies (4K
is probably best) of bytes, moving all of the end elements into the
space you allocated in the beginning. With that done, write your
word into place. You've just managed an in-place insertion.
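
A rough sketch of that idea, under several assumptions that are not from the thread: the file is ASCII with "\n" (or "\r\n") line endings, the last line ends with a newline, and the words were sorted ordinally. The names (InPlaceInsert, FindInsertOffset, InsertAt, InsertWord) are hypothetical. As a simplification, this variant only scans forward to the next newline instead of walking both backwards and forwards.

```csharp
using System;
using System.IO;
using System.Text;

static class InPlaceInsert
{
    // Returns the first line start at or after pos (or the file length).
    static long NextLineStart(FileStream fs, long pos)
    {
        if (pos == 0) return 0;
        fs.Seek(pos - 1, SeekOrigin.Begin);
        int b;
        while ((b = fs.ReadByte()) != -1)
        {
            if (b == '\n') return fs.Position;
        }
        return fs.Length;
    }

    // Reads the line that starts at the given offset, without its terminator.
    static string ReadLineAt(FileStream fs, long start)
    {
        fs.Seek(start, SeekOrigin.Begin);
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = fs.ReadByte()) != -1 && b != '\n')
        {
            if (b != '\r') sb.Append((char)b);
        }
        return sb.ToString();
    }

    // Binary search over byte offsets for the start of the first line >= word.
    public static long FindInsertOffset(FileStream fs, string word)
    {
        long lo = 0, hi = fs.Length;
        while (lo < hi)
        {
            long mid = lo + (hi - lo) / 2;
            long start = NextLineStart(fs, mid);
            bool atOrAfter = start >= fs.Length
                || string.CompareOrdinal(ReadLineAt(fs, start), word) >= 0;
            if (atOrAfter) hi = mid; else lo = mid + 1;
        }
        return NextLineStart(fs, lo);
    }

    // Grows the file, shifts the tail toward the end in 4K chunks (last chunk
    // first, so nothing unread is overwritten), then writes data at offset.
    public static void InsertAt(FileStream fs, long offset, byte[] data)
    {
        long readEnd = fs.Length;
        fs.SetLength(readEnd + data.Length);
        byte[] buffer = new byte[4096];
        while (readEnd > offset)
        {
            int chunk = (int)Math.Min(buffer.Length, readEnd - offset);
            long readPos = readEnd - chunk;
            fs.Seek(readPos, SeekOrigin.Begin);
            int got = 0;
            while (got < chunk)
            {
                int n = fs.Read(buffer, got, chunk - got);
                if (n == 0) throw new EndOfStreamException();
                got += n;
            }
            fs.Seek(readPos + data.Length, SeekOrigin.Begin);
            fs.Write(buffer, 0, chunk);
            readEnd = readPos;
        }
        fs.Seek(offset, SeekOrigin.Begin);
        fs.Write(data, 0, data.Length);
    }

    public static void InsertWord(string path, string word)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            long offset = FindInsertOffset(fs, word);
            InsertAt(fs, offset, Encoding.ASCII.GetBytes(word + "\n"));
        }
    }
}
```

The search touches only a handful of lines, but the shift still rewrites everything after the insertion point, so the cost of one insertion depends on how close to the start of the file the word lands.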

If you have multiple words to merge, then merge sorting and other
heuristics come into play. Get your basic algorithm working first and
then think about refactoring.

--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

Nov 16 '05 #4
Zach <no*@this.address> wrote:
>> I want to add a string to the file and the file is sort by letter! for
>
> If you are having problems with the algorithm let me know and I will post an
> example. The example sorts a short alphabetically ordered file into a very
> big alphabetically ordered file. By the way, WordPerfect can deal with
> immensely big files (100,000+ words). Microsoft Word can't.


While I wouldn't be surprised if Word had some limits somewhere, Word
can certainly cope with 100,000+ words easily. I just created a
document with over 300,000 words, and Word didn't have any problems
with it.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Zach <no*@this.address> wrote:
>>> I want to add a string to the file and the file is sort by letter! for
>>
>> If you are having problems with the algorithm let me know and I will post an
>> example. The example sorts a short alphabetically ordered file into a very
>> big alphabetically ordered file. By the way, WordPerfect can deal with
>> immensely big files (100,000+ words). Microsoft Word can't.
>
> While I wouldn't be surprised if Word had some limits somewhere, Word
> can certainly cope with 100,000+ words easily. I just created a
> document with over 300,000 words, and Word didn't have any problems
> with it.

I had to process more than 100,000 words (sort them and so on) to create a
spellcheck vocabulary, and Word wouldn't do it. It gave a message saying it
couldn't handle the volume. WordPerfect had no problems sorting more than
100,000 words.
(So I wrote some software for the job.)
Nov 16 '05 #6
"Justin Rogers" <Ju****@games4dotnet.com> wrote in message
news:e$**************@TK2MSFTNGP14.phx.gbl...

NB: the words of the OP are in a
file to start with and have to be read
and rewritten at least once!

Sequentially reading through the parent file,
slipping in the words from a sorted array
at their respective right places in the parent
file, whilst checking for duplicates, is fast, simple
to write and has no capacity constraints.
IMO doing binary searches in this situation is silly,
even more so if the new words are in random order.
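
A minimal sketch of that sequential merge, with the same caveats as any illustration: the names (SortedFileMerge, Merge, WriteIfNew) are hypothetical, ordinal ordering is an assumption, and the result is written to a separate output file rather than back over the parent file.

```csharp
using System;
using System.IO;

static class SortedFileMerge
{
    // Merges newWords (in any order) into the sorted parent file, writing the
    // result to outputPath and skipping words that are already present.
    public static void Merge(string parentPath, string outputPath, string[] newWords)
    {
        string[] sorted = (string[])newWords.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);
        int i = 0;
        string last = null; // most recent word seen, used to drop duplicates
        using (StreamReader reader = new StreamReader(parentPath))
        using (StreamWriter writer = new StreamWriter(outputPath))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Slip in every new word that sorts before the current line.
                while (i < sorted.Length && string.CompareOrdinal(sorted[i], line) < 0)
                {
                    last = WriteIfNew(writer, sorted[i++], last);
                }
                last = WriteIfNew(writer, line, last);
            }
            // Whatever is left sorts after the end of the parent file.
            while (i < sorted.Length)
            {
                last = WriteIfNew(writer, sorted[i++], last);
            }
        }
    }

    static string WriteIfNew(StreamWriter writer, string word, string last)
    {
        if (word != last) // string != compares values in C#
        {
            writer.WriteLine(word);
        }
        return word;
    }
}
```

The parent file is read once and the output written once, however many new words there are; only the array of new words has to fit in memory.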


Nov 16 '05 #7
Zach <no*@this.address> wrote:
>> While I wouldn't be surprised if Word had some limits somewhere, Word
>> can certainly cope with 100,000+ words easily. I just created a
>> document with over 300,000 words, and Word didn't have any problems
>> with it.
>
> I had to process > 100,000 words - sort them and so on to create a
> spellcheck vocabulary, and Word wouldn't do it. Message saying it couldn't
> handle the volume. WordPerfect had no problems sorting >100,000 words etc.
> (So I wrote some software for the job.)


I would argue that a word processor isn't the right tool for sorting a
vocabulary file anyway. While Word may not be able to sort a document
with over 100,000 words, it's fine when it comes to normal word
processing tasks with the same size of document.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #8
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Zach <no*@this.address> wrote:
> I would argue that a word processor isn't the right tool for sorting a
> vocabulary file anyway. While Word may not be able to sort a document
> with over 100,000 words, it's fine when it comes to normal word
> processing tasks with the same size of document.

Yes, and I wanted to throw out the words that WP didn't know,
because they wouldn't be everyday vocabulary.
Nov 16 '05 #9
I have a few suggestions:

1) Can you just split the names in the files into 26 separate files
such that 1st file has all A's, 2nd file has all B's and so on. I
think that will reduce the amount of text you need to process.

2) You can also try using a B+ tree. Very good for frequent finds and
few updates.

3) Alternatively, why don't you use a hash table to store the hash of
each of the words? That way, when you want to find where a word goes,
compute its hash and you should be able to see which hash should come
before the word you want to add. That way, when you're searching for the
word, it will be much faster because you can search for a
specific word, rather than comparing each and every word (i.e. you can
ignore chunks of data using hashes).

Let me know what you decide to do.
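
For suggestion 1, a small sketch of the split might look like this. The file naming scheme ("a.txt" to "z.txt", plus "_.txt" for anything that does not start with a letter a-z) and the names LetterBuckets, BucketPath and Split are assumptions made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LetterBuckets
{
    // Maps a word to its bucket file: "a.txt" .. "z.txt", or "_.txt" for
    // empty lines and words that do not start with a letter a-z.
    public static string BucketPath(string directory, string word)
    {
        char first = word.Length > 0 ? char.ToLowerInvariant(word[0]) : '_';
        if (first < 'a' || first > 'z') first = '_';
        return Path.Combine(directory, first + ".txt");
    }

    // Splits the big file into one file per first letter. Because the source
    // is read in order, each bucket keeps the order of the source file.
    public static void Split(string sourcePath, string directory)
    {
        Dictionary<string, StreamWriter> writers = new Dictionary<string, StreamWriter>();
        try
        {
            using (StreamReader reader = new StreamReader(sourcePath))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    string path = BucketPath(directory, line);
                    StreamWriter writer;
                    if (!writers.TryGetValue(path, out writer))
                    {
                        writer = new StreamWriter(path);
                        writers.Add(path, writer);
                    }
                    writer.WriteLine(line);
                }
            }
        }
        finally
        {
            foreach (StreamWriter w in writers.Values) w.Dispose();
        }
    }
}
```

To add a word afterwards, BucketPath tells you which of the smaller files to open, and any of the insertion approaches discussed earlier in the thread can then be applied to that file alone.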
Nov 16 '05 #10
