Can anyone suggest some efficient way to search a substring in
a text file. For eg. suppose i want to search the pattern "abc"
in a text file, then which algorithm should i use.
I just want some hints, not the complete program.
thanx in advance..... 7 2198
"junky_fell ow" <ju**********@y ahoo.co.in> wrote in message
news:8c******** *************** ***@posting.goo gle.com... Can anyone suggest some efficient way to search a substring in a text file. For eg. suppose i want to search the pattern "abc" in a text file, then which algorithm should i use. I just want some hints, not the complete program.
thanx in advance.....
It is a fact that you must check all characters (except the last two). The
simplest method would be to search for 'a' and if you find it, test for 'b'
then 'c'.
The fastest method in this case (byte data[4] = a,b,c) would be to cast data
to an int (if you work with a 32-bit processor) then "xor" it with every
third byte of the data string, if the result is 0 you have found a sub
string. This can be optimized in assembler as well, and I think this is the
fastest method if combined with a suitable buffering method.
junky_fellow wrote: Can anyone suggest some efficient way to search a substring in a text file. For eg. suppose i want to search the pattern "abc" in a text file, then which algorithm should i use. I just want some hints, not the complete program.
Use srstr(), from <string.h>.
--
pete
Martin Johansen wrote: "junky_fell ow" <ju**********@y ahoo.co.in> wrote in message news:8c******** *************** ***@posting.goo gle.com... Can anyone suggest some efficient way to search a substring in a text file. For eg. suppose i want to search the pattern "abc" in a text file, then which algorithm should i use. I just want some hints, not the complete program.
thanx in advance..... It is a fact that you must check all characters (except the last two). The simplest method would be to search for 'a' and if you find it, test for 'b' then 'c'.
The fastest method in this case (byte data[4] = a,b,c) would be to cast data to an int (if you work with a 32-bit processor) then "xor" it with every
xor "what" with every third byte ?
third byte of the data string, if the result is 0 you have found a sub string.
Even assuming CHAR_BIT equals 8 and sizeof(int) equals 4, and
ignoring alignment problems,
I have no idea of what you think you're saying.
--
pete
On Thu, 12 Feb 2004, junky_fellow wrote: Can anyone suggest some efficient way to search a substring in a text file. For eg. suppose i want to search the pattern "abc" in a text file, then which algorithm should i use. I just want some hints, not the complete program.
You have to define efficient. If primary memory is a concern then
efficient would be reading one character at a time and seeing if it
matches the first letter of the string. If it does then compare the
second, third, etc. characters of the string.
If efficient is as fast as possible without concern for primary memory
then determine file size, allocate a block of memory the size of the file,
read the entire contents into memory then search the memory for the string
(maybe using strstr() or just using the search for the first letter, if
you find it search the the entire string).
Or you could try a combination. Allocate a large chunk of memory, read in
part of the file and search the memory for the string you are looking for.
Only danger here is if the last letter of block #1 is 'a' and the first
letters of block #2 are "bc". You'll never to figure out how to handle
that situations.
--
Send e-mail to: darrell at cs dot toronto dot edu
Don't send e-mail to vi************@ whitehouse.gov
"pete" <pf*****@mindsp ring.com> wrote in message
news:40******** ***@mindspring. com... Martin Johansen wrote: "junky_fell ow" <ju**********@y ahoo.co.in> wrote in message news:8c******** *************** ***@posting.goo gle.com... Can anyone suggest some efficient way to search a substring in a text file. For eg. suppose i want to search the pattern "abc" in a text file, then which algorithm should i use. I just want some hints, not the complete program.
thanx in advance..... It is a fact that you must check all characters (except the last two). The simplest method would be to search for 'a' and if you find it, test for 'b' then 'c'.
The fastest method in this case (byte data[4] = a,b,c) would be to cast data to an int (if you work with a 32-bit processor) then "xor" it with every
xor "what" with every third byte ?
Oh, sorry, I ment xor every third byte with data[3] casted to int, three
byte from a file could be read into an array in the same manner as a,b,c in
data[3].
third byte of the data string, if the result is 0 you have found a sub string.
Even assuming CHAR_BIT equals 8 and sizeof(int) equals 4, and ignoring alignment problems, I have no idea of what you think you're saying.
-- pete
Martin Johansen wrote: "pete" <pf*****@mindsp ring.com> wrote in message news:40******** ***@mindspring. com... Martin Johansen wrote: "junky_fell ow" <ju**********@y ahoo.co.in> wrote in message news:8c******** *************** ***@posting.goo gle.com... > Can anyone suggest some efficient way to search a substring in > a text file. For eg. suppose i want to search the pattern "abc" > in a text file, then which algorithm should i use. > I just want some hints, not the complete program. > > thanx in advance.....
It is a fact that you must check all characters (except the last two). The simplest method would be to search for 'a' and if you find it, test for 'b' then 'c'.
The fastest method in this case (byte data[4] = a,b,c) would be to cast data to an int (if you work with a 32-bit processor) then "xor" it with every
xor "what" with every third byte ?
Oh, sorry, I ment xor every third byte with data[3] casted to int, three byte from a file could be read into an array in the same manner as a,b,c in data[3].
I think you mean data[2], if you mean 'c'. third byte of the data string, if the result is 0 you have found a sub string.
I don't understand why you prefer to check the result of an xor,
instead of checking equality directly.
Your described procedure,
only checks for one letter, not a substring.
Even assuming CHAR_BIT equals 8 and sizeof(int) equals 4, and ignoring alignment problems, I have no idea of what you think you're saying.
--
pete
Yes, I was a bit sloppy, simple character comparizon is the fastest. This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics |
by: Adrian |
last post by:
This script works well for searching thru a SELECT list but doesnt work so
well with the MULTIPLE option. I need to modify it to work for a SELECT
MULTIPLE but I have no idea where to begin, any tips?
<script>
var searchString = "";
function clearSearchString() {
|
by: C3 |
last post by:
I have to process some data in C that is given to me as a char * array. I
have a fairly large number of substrings (well, they're not actually
printable, but let's treat them as strings) that I have to search for in the
data.
I need to keep a count of how often each of these substrings occurs in my
original data and then print it out at the end.
This is a fairly mundane task, but since I have so many substrings, it's a
pain having to...
|
by: Andy |
last post by:
I am having trouble using the ComboBox in my app. I am attempting to use the
text from the ComboBox, to search the list box portion. However: In my
"TextChanged" event: I cannot get the value of the entered text. Below is
my code.
private void cbEdit_TextChanged(object sender, System.EventArgs e)
{
bool bFound = false;
string sText = cbEdit.Text;
int iPosition = cbEdit.SelectionStart;
|
by: Cor |
last post by:
Hallo,
I have promised Jay B yesterday to do some tests.
The subject was a string evaluation that Jon had send in. Jay B was in doubt
what was better because there was a discussion in the C# newsgroup on 25
September. The regular expressions where in that newsgroup too involved.
I told yesterday night, to Jay that I would test all 4 methods and the
stupid method I was thinking of the first time that night when I saw Jon's
|
by: Emily |
last post by:
I am new to VB.NET and I need help finding and updating
values in an arraylist.
My arraylist is called arrVarInfo and holds data in this
structure type:
Public Structure VarInfo
Dim Name As String
Dim Format As String
Dim Desc As String
| |
by: chernetsov |
last post by:
I am creating a server indexing files in my local area network, in
order to provide a searching feature.
So i want to make it possible to searchsuch rows where the 'name'
(VARCHAR) column contains a substring submitted by user.
an i use a full-text index to make it faster? from my first sight i
noticed that this index can be used to search for words in a string. So
the example below would not go:
searching 'cool' in 'uncoolness'
Am i...
|
by: pbd22 |
last post by:
Hi.
I am somewhat new to this and would like some advice.
I want to search my xml file using "keyword" search and
return results based on "proximity matching" - in other words,
since the search string will often not produce a direct match,
the results will be based on proximity (50%, 20% 100%, etc).
are there any good examples out there on how to do keyword
searches on XML data? How should i set up my xml file so
|
by: Aaron |
last post by:
I'm trying to parse a table on a webpage to pull down some data I
need. The page is based off of information entered into a form. when
you submit the data from the form it displays a "Searching..." page
then, refreshes and displays the table I want. I have code that grabs
data from the page using cURL but when I look at the data it contains
the "Searching..." page and not the table that I want. below is the
code i have so far....Thanks...
|
by: jhansi |
last post by:
I need c program to display the possible names of the persons in a record when there starting letter or some substring is given.
example: if we give "v" it has to display the person names starting with "v" in a record or if i give a substring like "vij" it has to give me the possible names with that substring. plzzzzzzzz i need the solution very urgently
|
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed.
This is as boiled down as I can make it.
Here is my compilation command:
g++-12 -std=c++20 -Wnarrowing bit_field.cpp
Here is the code in...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth.
The Art of Business Website Design
Your website is...
| |
by: Hystou |
last post by:
Overview:
Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
|
by: isladogs |
last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM).
In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules.
He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms.
Adolph will...
|
by: conductexam |
last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one.
At the time of converting from word file to html my equations which are in the word document file was convert into image.
Globals.ThisAddIn.Application.ActiveDocument.Select();...
|
by: adsilva |
last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
|
by: muto222 |
last post by:
How can i add a mobile payment intergratation into php mysql website.
| |
by: bsmnconsultancy |
last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...
| |