How to add a string to a big sorted file in C#

I want to add a string to a file whose lines are sorted alphabetically. For
example, the following is a big file:
//////////////////////
abort
black
cabbage
dog
egg
fly
..
..
////////////////////
Now I want to add "dad" to it, just after "cabbage" and in front of "dog".
Because there are so many words in the file, I need to use a binary search to
find the location.

// Requires: using System; using System.IO;

/// <summary>
/// Looks for the given word in the sorted file.
/// </summary>
/// <param name="word">The word to look for.</param>
private bool find(string word)
{
    if (word == null)
    {
        throw new ArgumentNullException("word");
    }

    string str;
    lock (this)
    {
        // Check whether the word is the first line.
        using (StreamReader sr = new StreamReader(file.FullName)) // file is a FileInfo object
        {
            str = sr.ReadLine();
        }
        if (str == null)
        {
            return false;
        }
        if (string.Compare(str.Trim(), word) == 0)
        {
            return true;
        }
    }

    // binary search starts
    using (FileStream fs = File.OpenRead(file.FullName))
    {
        long lower = 0;
        long upper = fs.Length - 1;
        while (lower <= upper)
        {
            long index = (lower + upper) / 2;
            fs.Seek(index, SeekOrigin.Begin);

            // read off an incomplete line
            str = fs.Read();
            // I do not know how to set the parameters of Read() so that it
            // can read a line.

            // the line might be null if it's the end of the file
            int t = str == null ? -1 : string.Compare(word, str.Trim());
            // found it
            if (t == 0)
            {
                return true;
            }
            if (t > 0)
            {
                lower = index + 1;
            }
            else
            {
                upper = index - 1;
            }
        }
    }
    return false;
}

That is the method, and my questions are:
1: Is FileStream suitable here?
2: Is string.Compare suitable here?
3: Is there a better way to do this?

Thanks to all!
Nov 16 '05 #1
"zjut" <zj**@discussions.microsoft.com> wrote in message
news:A2**********************************@microsoft.com...
I want to add a string to the file and the file is sort by letter! for
examply:
the follow file is a big file
//////////////////////
abort
black
cabbage
dog
egg
fly
.
.
////////////////////
and now i want to add "dad" into it !


Given:
* Your file ("old_file") is in alphabetical order
* old_file is immensely big

Required:
* Add a word ("new_word") to old_file at the right place.

Solution:
* Sequentially read each word ("word_read") from old_file and write it to new_file
* If new_word sorts after word_read and before the next word read, write new_word to new_file between the two
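
The steps above could be sketched roughly like this. It is only a minimal sketch: the class and method names are made up for illustration, it assumes one word per line, and it assumes the file is sorted by ordinal (byte-wise) comparison, which may not match how the real file was sorted.

```csharp
using System;
using System.IO;

static class SortedFileInsert
{
    // Copies oldPath to newPath line by line and writes newWord just before
    // the first line that sorts after it (ordinal comparison, an assumption).
    public static void InsertWord(string oldPath, string newPath, string newWord)
    {
        bool inserted = false;
        using (StreamReader reader = new StreamReader(oldPath))
        using (StreamWriter writer = new StreamWriter(newPath))
        {
            string wordRead;
            while ((wordRead = reader.ReadLine()) != null)
            {
                if (!inserted && string.CompareOrdinal(newWord, wordRead) < 0)
                {
                    writer.WriteLine(newWord);
                    inserted = true;
                }
                writer.WriteLine(wordRead);
            }

            // newWord sorts after every existing line, or the file was empty.
            if (!inserted)
            {
                writer.WriteLine(newWord);
            }
        }
    }
}
```

Calling SortedFileInsert.InsertWord("old_file.txt", "new_file.txt", "dad") on the sample list would put "dad" between "cabbage" and "dog". Note that this sketch does not check for duplicates: a word that is already present gets written a second time.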

Nov 16 '05 #2
"zjut" <zj**@discussions.microsoft.com> wrote in message
news:A2**********************************@microsoft.com...
I want to add a string to the file and the file is sort by letter! for


If you are having problems with the algorithm, let me know and I will post an
example. The example merges a short alphabetically ordered file into a very
big alphabetically ordered file. By the way, WordPerfect can deal with
immensely big files (100,000+ words). Microsoft Word can't.
Nov 16 '05 #3
I see another NG member has already given you a possible
solution, but I don't feel it would be an optimal solution... You
really have a couple of different options that all revolve around the
same set of principles... First, you know the size of the new word
you are sorting into place, so you'll want to open the file and then
grow it by that amount (plus the line terminator). This is to make sure
you have room to copy the rest of the file around once your searching
is done. For a sanity check, go ahead and check the first and last
elements to make sure this isn't a trivial case.

Okay, the binary search is going to involve cutting the file in half
(you can do this based on length) and then seeking to that location.
Once you've done that, you are going to walk backwards and
forwards until you encounter newlines on either side. This'll be
your *word*, and you'll compare it and continue the process of
cutting the file in half (aka a binary search)... Once you've found
your insertion location, you are going to do large buffer copies (4K
is probably best) of bytes, moving everything after that location
towards the end of the file, into the space you allocated in the
beginning. With that done, write your word into place. You've just
managed an in-place insertion.
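
As a rough sketch of the approach just described (not a tested production implementation), something like the following could work. The names InPlaceSortedInsert, FindInsertOffset and Insert are made up for illustration. It assumes an ASCII file, one word per line, every line (including the last) terminated by "\n", and ordinal sort order.

```csharp
using System;
using System.IO;
using System.Text;

static class InPlaceSortedInsert
{
    // Returns the byte offset of the first line that is >= word (ordinal),
    // or fs.Length if every line sorts before word.
    public static long FindInsertOffset(FileStream fs, string word)
    {
        long lo = 0, hi = fs.Length;   // both always sit on a line boundary
        while (lo < hi)
        {
            long mid = lo + (hi - lo) / 2;

            // Walk backwards from mid to the start of the line containing it.
            long start = mid;
            while (start > lo)
            {
                fs.Seek(start - 1, SeekOrigin.Begin);
                if (fs.ReadByte() == '\n') break;
                start--;
            }

            // Walk forwards to the end of that line, collecting its bytes.
            fs.Seek(start, SeekOrigin.Begin);
            MemoryStream lineBytes = new MemoryStream();
            int b;
            while ((b = fs.ReadByte()) != -1 && b != '\n')
            {
                lineBytes.WriteByte((byte)b);
            }
            long next = fs.Position;   // first byte after the line terminator
            string line = Encoding.ASCII.GetString(lineBytes.ToArray()).TrimEnd('\r');

            if (string.CompareOrdinal(word, line) > 0)
                lo = next;    // word belongs somewhere after this line
            else
                hi = start;   // word belongs at or before this line
        }
        return lo;
    }

    public static void Insert(string path, string word)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            long offset = FindInsertOffset(fs, word);
            byte[] insert = Encoding.ASCII.GetBytes(word + "\n");
            long oldLength = fs.Length;
            fs.SetLength(oldLength + insert.Length);   // grow the file first

            // Shift the tail right in 4K blocks, starting from the end so
            // that no block is overwritten before it has been copied.
            byte[] buffer = new byte[4096];
            long readPos = oldLength;
            while (readPos > offset)
            {
                int chunk = (int)Math.Min(buffer.Length, readPos - offset);
                readPos -= chunk;
                fs.Seek(readPos, SeekOrigin.Begin);
                int got = 0;
                while (got < chunk)
                {
                    int n = fs.Read(buffer, got, chunk - got);
                    if (n == 0) throw new EndOfStreamException();
                    got += n;
                }
                fs.Seek(readPos + insert.Length, SeekOrigin.Begin);
                fs.Write(buffer, 0, chunk);
            }

            // Finally drop the new word into the gap.
            fs.Seek(offset, SeekOrigin.Begin);
            fs.Write(insert, 0, insert.Length);
        }
    }
}
```

With the sample list, InPlaceSortedInsert.Insert(path, "dad") should land the word between "cabbage" and "dog". The search itself only touches a handful of lines, but the shift still rewrites everything after the insertion point, so an insert near the start of the file costs about as much I/O as a sequential copy.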

If you have multiple words to merge, then merge sorting and other
heuristics come into play. Get your basic algorithm and then think
about refactoring.

--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

Nov 16 '05 #4
Zach <no*@this.address> wrote:
I want to add a string to the file and the file is sort by letter! for


If you are having problems with the algorithm let me know and I will post an
example. The example sorts a short alphabetically ordered file into a very
big alphabetically ordered file. By the way, WordPerfect can deal with
immensely big files (100,000+ words). Microsoft Word can't.


While I wouldn't be surprised if Word had some limits somewhere, Word
can certainly cope with 100,000+ words easily. I just created a
document with over 300,000 words, and Word didn't have any problems
with it.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Zach <no*@this.address> wrote:
I want to add a string to the file and the file is sort by letter! for


If you are having problems with the algorithm let me know and I will post an example. The example sorts a short alphabetically ordered file into a very big alphabetically ordered file. By the way, WordPerfect can deal with
immensely big files (100,000+ words). Microsoft Word can't.


While I wouldn't be surprised if Word had some limits somewhere, Word
can certainly cope with 100,000+ words easily. I just created a
document with over 300,000 words, and Word didn't have any problems
with it.


I had to process > 100,000 words - sort them and so on to create a
spellcheck vocabulary, and Word wouldn't do it. Message saying it couldn't
handle the volume. WordPerfect had no problems sorting >100,000 words etc.
(So I wrote some software for the job.)
Nov 16 '05 #6
"Justin Rogers" <Ju****@games4dotnet.com> wrote in message
news:e$**************@TK2MSFTNGP14.phx.gbl...

NB: the words of the OP are in a
file to start with and have to be read
and rewritten at least once!

Sequentially reading through the parent file,
slipping in the words from a sorted array
at their respective right places in the parent
file, whilst checking for duplicates, is fast, simple
to write and has no capacity constraints.
IMO doing binary searches in this situation is silly,
even more so if the new words are in random order.
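
A sequential merge along those lines might look something like the sketch below. The names are hypothetical, and it assumes one word per line and ordinal ordering in the parent file. The new words are sorted first, so they can arrive in any order.

```csharp
using System;
using System.IO;

static class SortedFileMerge
{
    // Merges newWords into the sorted file at parentPath, writing outputPath.
    // Words already present (or repeated in newWords) are written only once.
    public static void Merge(string parentPath, string[] newWords, string outputPath)
    {
        string[] sorted = (string[])newWords.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);

        using (StreamReader reader = new StreamReader(parentPath))
        using (StreamWriter writer = new StreamWriter(outputPath))
        {
            int i = 0;
            string last = null;               // last word written, for the duplicate check
            string line = reader.ReadLine();
            while (line != null || i < sorted.Length)
            {
                string next;
                if (line == null ||
                    (i < sorted.Length && string.CompareOrdinal(sorted[i], line) < 0))
                {
                    next = sorted[i++];       // the new word comes first
                }
                else
                {
                    next = line;              // the parent file's word comes first
                    line = reader.ReadLine();
                }

                if (last == null || string.CompareOrdinal(next, last) != 0)
                {
                    writer.WriteLine(next);
                    last = next;
                }
            }
        }
    }
}
```

The parent file is read once and the output written once, no matter how many new words there are, which is the property being argued for above.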


Nov 16 '05 #7
Zach <no*@this.address> wrote:
While I wouldn't be surprised if Word had some limits somewhere, Word
can certainly cope with 100,000+ words easily. I just created a
document with over 300,000 words, and Word didn't have any problems
with it.


I had to process > 100,000 words - sort them and so on to create a
spellcheck vocabulary, and Word wouldn't do it. Message saying it couldn't
handle the volume. WordPerfect had no problems sorting >100,000 words etc.
(So I wrote some software for the job.)


I would argue that a word processor isn't the right tool for sorting a
vocabulary file anyway. While Word may not be able to sort a document
with over 100,000 words, it's fine when it comes to normal word
processing tasks with the same size of document.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #8
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Zach <no*@this.address> wrote:
I would argue that a word processor isn't the right tool for sorting a
vocabulary file anyway. While Word may not be able to sort a document
with over 100,000 words, it's fine when it comes to normal word
processing tasks with the same size of document.


Yes, and I wanted to throw out the words that WP didn't know,
because they wouldn't be every day vocabulary.
Nov 16 '05 #9
I have a few suggestions:

1) Can you just split the words in the file into 26 separate files,
such that the 1st file has all the A's, the 2nd file has all the B's and so on? I
think that will reduce the amount of text you need to process.

2) You can also try using a B+ tree. Very good for frequent finds and
few updates.

3) Alternatively, why don't you use a hash table to store the hash of
each of the words? That way, when you want to find where a word goes,
compute its hash and you should be able to see which hash should come
before the word you want to add. That way, when you're searching for the
word, it will be much, much faster because you can search for a
specific word, rather than comparing each and every word (i.e. you can
ignore chunks of data using hashes).
Let me know what you decide to do.
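
For what it's worth, suggestion 1 could be sketched roughly as follows. The class name, the bucket file names ("a.txt" through "z.txt", plus "other.txt" for anything that does not start with a letter a-z) and the output directory parameter are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class LetterBuckets
{
    // Splits a word file into one file per first letter inside outputDir.
    // Because lines are appended in the order read, each bucket stays sorted
    // if the source file was sorted.
    public static void Split(string sourcePath, string outputDir)
    {
        Dictionary<string, StreamWriter> writers = new Dictionary<string, StreamWriter>();
        try
        {
            using (StreamReader reader = new StreamReader(sourcePath))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    string key = BucketName(line);
                    StreamWriter w;
                    if (!writers.TryGetValue(key, out w))
                    {
                        w = new StreamWriter(Path.Combine(outputDir, key + ".txt"));
                        writers[key] = w;
                    }
                    w.WriteLine(line);
                }
            }
        }
        finally
        {
            foreach (StreamWriter open in writers.Values)
            {
                open.Dispose();
            }
        }
    }

    // Maps a word to its bucket: "a".."z" by first letter, otherwise "other".
    public static string BucketName(string word)
    {
        if (word.Length > 0)
        {
            char c = char.ToLowerInvariant(word[0]);
            if (c >= 'a' && c <= 'z') return c.ToString();
        }
        return "other";
    }
}
```

Adding "dad" then only means rewriting "d.txt" instead of the whole list, at the cost of having to stitch the buckets back together whenever a single file is needed again.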
Nov 16 '05 #10
