473,386 Members | 1,706 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

LinearSearch to BinarySearch

stealwings
I have this code that does a linear search within two files, but it is too slow......
Expand|Select|Wrap|Line Numbers
  1.  
  2. for(unsigned i=0;i<Standard.size();i++) 
  3.   for(unsigned j=0;j<MyWords.size();j++)  //each MyWords compare to the Standard 
  4.                                       //and see if it is in the standard vector
  5.         {
  6.             string::size_type FoundAt=MyWords[i].find(Standard[j]);
  7.             while( string::npos != FoundAt )
  8.                 {
  9.                     MyWords[i].replace( FoundAt, Standard[j].length(),Standard[j]+"/");
  10. FoundAt = MyWords[i].find( Standard[j], FoundAt + Standard[j].length() );
  11.                 }
  12.                             }
  13.               }
  14. /***********************************Create output file****************************************/
  15.     for(unsigned j=0;j<MyWords.size();j++)    // for each word in MyWords vector  
  16.     {
  17.         WriteWord<<MyWords[j]<<endl;    // write it to the destination file
  18.     }
  19.     WriteWord.flush();    // flush the memory to ensure all is written to the file
  20.     WriteWord.close();   // close the write file handls
  21.     return 0;
  22. }
I want to change it into binarysearch so it will perform faster, but the problem is I don't know how can I do that, I mean, if I change it to binarysearch the structure itself will have to be changed. I have the following questions that I can't solve so the binarysearch can be done:
1. Cut the sentences into words.
2. Compare first word of the sentences and see if matches, if match take second and so on till a phrase in my standard file is formed. Otherwise, discard word.
e.g. Lets say I have a sentences "This world is full of beauty!" and I have in my sorted array the phrase "full of beauty", I would like to take each word of the sentence to and compare it with my array like the following:
1. Take the word "This", since it doesn't match delete it
2. Take the word "world", since it doesn't match delete it.
3. and so on until "full", since "full" is part of my phrase "full of beauty" conserve it and take the next and compare it, and so on till I get the full phrase "full of beauty" then return that the phrase was found.
Hope someone can give me some hint, I really need to get this thing to work, and want to get it work myself but I really can't do. Thanks in advance to all.
Mar 21 '07 #1
8 1387
DeMan
1,806 1GB
You can do a binary search in the same structure, provided the data is sorted...Then instead of iterating through every option, you start in the middle, and choose whether this is your value, otherwise which side to search. You continue to do this until you find your word, (or conversely find you have no words left).

Thus, you have a sorted array,
"beauty", "Full", "of"

You cut "This" off the input string and compare it to the middle elemnet ("Full"), since "This" is lexicongraphically later, you compare to "of" and discard.

Same for "world"

"full" you find immediately...etc

In this example, there is no huge saving, but if your array contains more words it becomes more efficient
Mar 21 '07 #2
Thanks, but I think I haven't been clear in my question. I have I sort array of phrases which I named as standard, it contains 400 phrases sorted in alphabetical order. Then I have another file that contains lets say 1000 sentences, I want to cut this sentences into words, then do the comparison.
e.g. in the sentences I there is the word "excellent" , I want to take that word, and look if there is any phrase in my array that begins with that word, if true take the next word, if false discard the word and take next one.
Lets say I have in my array "excellent job" and "excellent work" so I will have to check the next word to make sure which is the match. And if the word that follows "excellent" is not neither discard, if is any of both return true.
Mar 21 '07 #3
DeMan
1,806 1GB
Sorry if I'm still on the wrong page.....

I think we can still modify the same idea.....

Assumiong you are always searching from the beginning of the word stored in the list you are searching......

say you array has:
"apples and oranges", "burgers and fries", "cats and dog", "excellent job", "excellent weather", "excellent work", "fantastic stuff"

and you want to search for "Doug does excellent work", we chop off Doug and does when we don't find them.....
we compare excellent and (depending on how you implement the comparison (it is trivial to implement one that ignores size and will match "excellent" with "excellent anything") you can do one of many things. At this point, you are at "excellent job" and you could concatenate " work" back onto your "excellent" and search linearly from where you are. You could add "w work" back on and search binary (although you's have to search over the remainder of the array, otherwise finding excellent from the other end may be a little complex). Or (depending on what you actually store in the array (like if you store a list of words in teh array, you could search again without the concatenation, although I'd suggest that's more work).

This (above) should work provided that ytou always want to match a longer phrase with a shorter substring (that is something having "and oranges" won't match to "apples and oranges" but "fred's apples and oranges" will.
You can also modify so that a phrase like "It was somewhere in the country, in a land of rock and scrub" matches to "somewhere in the country".

Hope this helps...

As alway post again if you want further clarification....
Mar 21 '07 #4
Sorry if I'm still on the wrong page.....

I think we can still modify the same idea.....

Assumiong you are always searching from the beginning of the word stored in the list you are searching......

say you array has:
"apples and oranges", "burgers and fries", "cats and dog", "excellent job", "excellent weather", "excellent work", "fantastic stuff"

and you want to search for "Doug does excellent work", we chop off Doug and does when we don't find them.....
we compare excellent and (depending on how you implement the comparison (it is trivial to implement one that ignores size and will match "excellent" with "excellent anything") you can do one of many things. At this point, you are at "excellent job" and you could concatenate " work" back onto your "excellent" and search linearly from where you are. You could add "w work" back on and search binary (although you's have to search over the remainder of the array, otherwise finding excellent from the other end may be a little complex). Or (depending on what you actually store in the array (like if you store a list of words in teh array, you could search again without the concatenation, although I'd suggest that's more work).

This (above) should work provided that ytou always want to match a longer phrase with a shorter substring (that is something having "and oranges" won't match to "apples and oranges" but "fred's apples and oranges" will.
You can also modify so that a phrase like "It was somewhere in the country, in a land of rock and scrub" matches to "somewhere in the country".

Hope this helps...

As alway post again if you want further clarification....
Thank you DeMan, but the problem here is I don't get how can I chop off Doug and does and how to concatenate work back to excellent, sorry if I driving you crazy but that is the exact thing I want to know. I mean, I also have the idea of how should it work but I don't know how do I code that idea.
Mar 21 '07 #5
I still don't get it, can some one help me out with my question??? Thank you all in advance!
Mar 23 '07 #6
DeMan
1,806 1GB
Sorry I didn't repkly sooner (I somehow missed your last post)

Given what you originally posted I had assumed you were already taking words from a string. You don't have to chop words per'se (especially if you are using local variables).

Say you have the String from before "Doug does excellent work".
You compare the beginning of this string to those in the array/linkedlist. As soon as we can tell Doug can't be find, we iterate to the character after the next space (so the string is still the same size, but we're pointing at the d of "does excellent work".
We begin the search again, quickly realising "does" is no good so we iterate to the char after the next space which is the e in "excellent".
Then we find a match on excellent and continue until one (or both) of the strings (the original "Doug does excellent work" and "excellent work") are completed (Thus avoiding the concatenate side of things).

I realise that much of this is probably not clear, but I can't draw diagrams, and cannot post code unless either
a) you can show you have done work based on previous posts
b) you can explain specifically what part you are having difficulty with (you can learn about strings by googling "c string" - and if you add terms like concatenate, manipulate, modify, compare you'll get more specific results)
Mar 23 '07 #7
Ok thanks again DeMan, I will try my best to figure it out these days, and if I have further questions I will continue pop upping the post, if these is not against the forum's rules. Or shall I star a new post further?
Mar 23 '07 #8
DeMan
1,806 1GB
If you feel the question is related to this post, then feel free to post again in this thread. (On the other hand a new thread tends to get mroe attention, I think)....
So it's up to you.

Look forward to hearing from you again!
Mar 23 '07 #9

Sign in to post your reply or Sign up for a free account.

Similar topics

4
by: dotNetDave | last post by:
I have created my own comparer class using IComparer for use with ArrayList.BinarySearch. My class seems to work with BinarySearch, but the problem is that my ArrayList has three items in it and it...
4
by: Homa | last post by:
I can' believe my own eye, but it happens...there is a bug in ArrayList.BinarySearch!! It should be such a simple function...... Here is the detail. (I'm using C#, don't know if this is C#'s...
2
by: Yogi21 | last post by:
Hi I have a sorted array containing strings. I am iterating through the array and clearing the contents one by one using "array.BinarySearch" to find each element. So far so good. But the moment I...
2
by: Mario | last post by:
Hi there, I'm writing a piece of code with VS.Net 2003, Framework 1.1. And I can't make BinarySearch to work right. Look at this sample: Dim sexy() As String = {"-", "a", "b", "ba", "A",...
5
by: Lee | last post by:
I have a custom collection that implements IComarer right now which implements sorting. I've googled and it seems that built in binary search is not available in the underlying List object. If...
11
by: solandre | last post by:
my motivation is as follows: i have to binarysearch several million times an long that is several million items big. this costs time, although i use Array.Binarysearch<long>. so i try to keep...
8
by: Guy | last post by:
Is there a better way to search identical elements in a sorted array list than the following: iIndex = Array.BinarySearch( m_Array, 0, m_Array.Count, aSearchedObject ); aFoundObject= m_Array;...
2
by: tshad | last post by:
I can do the following with Find: Racer theRacer2 = racers.Find(delegate(Racer racer) { return racer.Car == "Ferrari"; }); But I can't seem to do the same with BinarySearch, this: int iktr...
3
by: JinFTW | last post by:
Hey guys I was just curious as to how (if) you could use linearsearch to find two of the same item in an array. For example: public static Comparable linearSearch (Comparable list, Comparable...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.