Search for a string backwards in a file.

SK
Hey folks,

I am searching for a string (say "ABC") backwards in a file.
First I seek to the end.
Then I try to make a check like -

do {
file.clear ();
file.get(c);
file.seekg(-2, std::ios::cur);
} while (c != 'A' && c.readback() != 'B' && c.readback() != 'C');
// readback() is a hypothetical function; this is where the problem is
Is there some function/trick like peek that can instead read backwards
one character at a time?
Or is there a better way to solve this problem.

Thank you.
Jul 22 '05 #1

"SK" <sk******@rediffmail.com> wrote in message
news:83**************************@posting.google.com...

> Is there some function/trick like peek that can instead read backwards
> one character at a time?
> Or is there a better way to solve this problem.


Is the file small enough to read into memory in one go? If so, read
the whole file into a string and search the string rather than the file.
Anything else rapidly gets very complicated and is not terribly efficient either.
Files aren't designed to be read backwards. I would rather redesign your
file format so that you don't need to read backwards than actually
attempt this.

And no, there is no quick trick to do this.

john
Jul 22 '05 #2
SK wrote:

> Is there some function/trick like peek that can instead read backwards
> one character at a time?

No.

> Or is there a better way to solve this problem.


Seek to the end.

Seek back a number of characters.
Read that many characters into a buffer.
Search that buffer from the end.

If you find the pattern -> fine.
If you don't find the pattern: seek back a further number
of characters, read them into the buffer, check the buffer,
and repeat.
--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #3

"Karl Heinz Buchegger" <kb******@gascad.at> wrote in message
news:40***************@gascad.at...
> Seek to the end.
>
> Seek back a number of characters
> Read a number of characters into a buffer.
> Search that buffer from the end.
>
> If you find the pattern -> fine
> If you don't find the pattern: seek back a number
> of characters into the buffer, check the buffer
> and repeat.


And don't forget to deal with the case where the string you are searching
for straddles one of your seek positions. I.e. if you aren't careful, one
half of the string ends up in one buffer and the other half in another
buffer, so you never find it.

john
Jul 22 '05 #4
John Harrison wrote:

> And don't forget to deal with the case where the string you are searching
> for straddles one of your seek positions. I.e. if you aren't careful, one
> half of the string ends up in one buffer and the other half in another
> buffer, so you never find it.


:-)
It's a tricky thing, which could be solved by letting the reads overlap:

                    +---------------------------+
+---------------------------+
                    |       |
       Overlapping area, large enough that the pattern will fit
       into it.

But something should be left for the OP to think about.

The best thing the OP could do is to avoid the topic altogether by redesigning
the file format.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #5
How about reversing "ABC" and doing a plain search?

Henrik Vallgren

Jul 22 '05 #6
SK
"John Harrison" <jo*************@hotmail.com> wrote in message news:<2j************@uni-berlin.de>...

> Is the file small enough to read the whole file into memory?

No, it runs to several MB.

> If so then read the whole file into a string and search in the string not in the file.
> Anything else rapidly gets very complicated and also not terribly efficient.
> Files aren't designed to be read backwards, I would prefer to redesign your
> file format so that you don't need to read backwards than to actually
> attempt this.

I wish I could redesign the file format, but I can't :-(
What I am parsing are timestamps in a file. I need to know the
first and the last timestamp in a given file.

> And no there is no quick trick to do this.

Thanks.
Jul 22 '05 #7

"SK" <sk******@rediffmail.com> wrote in message
news:83**************************@posting.google.com...

> Wish I could redesign the file format, but can't actually :-(
> But what I am parsing are timestamps in a file. I need to know the
> first and the last timestamp in a given file.


Then Karl's suggestion is the right one: read blocks of data from the end of
the file and search backwards through each block, repeating if you don't find
anything. Watch out for the timestamp falling halfway between two
blocks. In practice this means that the blocks have to overlap sufficiently
so that the timestamp will always be contained entirely within one block.
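
A minimal sketch of the block-wise search described above, for anyone who wants something concrete to start from. The function name find_last, the default block size, and the -1 "not found" convention are my own assumptions, not anything from Karl's post. It takes a std::istream so it can be tried on an in-memory stream; for a real file, pass a std::ifstream opened with std::ios::binary so that offsets are plain byte positions. Consecutive blocks overlap by the pattern length minus one, so a match that straddles a block boundary still lands whole in the next block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): returns the byte offset of the
// last occurrence of `pattern` in `in`, or -1 if it is not found.
std::streamoff find_last(std::istream& in, const std::string& pattern,
                         std::size_t block_size = 4096)
{
    assert(!pattern.empty() && block_size >= pattern.size());

    const std::streamoff block = static_cast<std::streamoff>(block_size);
    const std::streamoff needle = static_cast<std::streamoff>(pattern.size());

    in.clear();
    in.seekg(0, std::ios::end);
    std::streamoff window_end = in.tellg();   // one past the last byte of the window

    std::string buffer;
    while (window_end >= needle) {
        const std::streamoff window_begin =
            std::max<std::streamoff>(0, window_end - block);
        buffer.resize(static_cast<std::size_t>(window_end - window_begin));

        in.clear();
        in.seekg(window_begin, std::ios::beg);
        in.read(&buffer[0], static_cast<std::streamsize>(buffer.size()));
        if (in.gcount() != static_cast<std::streamsize>(buffer.size()))
            return -1;                         // short read: give up

        // Search this block from its end.
        const std::string::size_type hit = buffer.rfind(pattern);
        if (hit != std::string::npos)
            return window_begin + static_cast<std::streamoff>(hit);

        if (window_begin == 0)
            break;                             // reached the start of the file

        // Step back, keeping needle - 1 bytes so that a match straddling
        // the block boundary falls entirely inside the next block.
        window_end = window_begin + needle - 1;
    }
    return -1;
}

// Usage sketch on an in-memory stream, with a deliberately tiny block size
// so that the second "ABC" straddles the first block boundary.
std::streamoff demo_find_last()
{
    std::istringstream in("ABC...ABC..");
    return find_last(in, "ABC", 4);
}
```

If the timestamp is known to sit in the last hundred bytes or so, the first block read will normally contain it and the loop runs only once.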

john
Jul 22 '05 #8
"SK" <sk******@rediffmail.com> wrote in message
> I am searching for a string (say "ABC") backwards in a file.
> First I seek to the end.
> Then I try to make a check like -
>
> do {
> file.clear ();
> file.get(c);
> file.seekg(-2, std::ios::cur);
> } while (c != 'A' && c.readback() != 'B' && c.readback() != 'C');
> // readback() is a hypothetical function; this is where the problem is
>
> Is there some function/trick like peek that can instead read backwards
> one character at a time?
> Or is there a better way to solve this problem.

This approach seems fine to me. Others suggested reading the file in
blocks. But I'm not sure there's a need to do this, as the
default fstreams already read the file in blocks (the
standard doesn't require them to, but they do for file streams). As for how they
normally handle reading from the end of the file, I don't know for sure, but
I think it's like this: when the user reads the last character, the stream reads
the last N chars into memory where N is the size of a block, positions the
streambuf's get pointer at the last character, and returns the character at the
get pointer.

> file.get(c);
> file.seekg(-2, std::ios::cur);


The above might be more clearly expressed with

c = file.peek();
file.seekg(-1, std::ios::cur);

Also, since you're looking for a string of length 3, you could save the last
3 chars read into variables like c1, c2, c3. Or even an array, vector, etc.
Then even if you encounter an 'A' you could ignore it if the next characters
are not 'BC'.
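
A rough sketch of the character-at-a-time idea above, using peek() and seekg(-1, std::ios::cur) and keeping the most recently read characters in a small window. The name find_last_bytewise and the -1 return value are assumptions of mine; the window is a std::string rather than c1, c2, c3 so that the pattern can have any length. For a real file, open the stream with std::ios::binary so relative seeks move by exactly one byte.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: walks backwards from the end of `in`, one character
// at a time, and returns the offset of the last occurrence of `pattern`
// (or -1 if there is none).
std::streamoff find_last_bytewise(std::istream& in, const std::string& pattern)
{
    assert(!pattern.empty());

    in.clear();
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0)
        return -1;

    in.seekg(-1, std::ios::end);               // position on the last character
    std::string window;                        // most recent chars, in file order

    for (std::streamoff pos = size - 1; pos >= 0; --pos) {
        const int ch = in.peek();              // read without advancing
        if (ch == std::char_traits<char>::eof())
            return -1;

        // Prepend the new character and keep at most pattern.size() chars.
        window.insert(window.begin(), static_cast<char>(ch));
        if (window.size() > pattern.size())
            window.erase(window.size() - 1);   // drop the char furthest from pos

        if (window == pattern)
            return pos;

        if (pos > 0)
            in.seekg(-1, std::ios::cur);       // step back one character
    }
    return -1;
}

// Usage sketch on an in-memory stream.
std::streamoff demo_find_last_bytewise()
{
    std::istringstream in("ABCxABCx");
    return find_last_bytewise(in, "ABC");
}
```

Because the whole window is compared, a lone 'A' that is not followed by "BC" is simply skipped.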

Finally, to get improved performance you could work directly with the
underlying streambuf. Call file.rdbuf() to get a pointer to the streambuf.
It will really be a filebuf, but pretend you don't know this. To position
the stream at an absolute position such as the beginning use pubseekpos. To
position the stream relative to the current position or to the end use
pubseekoff. To get the current character as in peek use sgetc.
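
Something along these lines shows the streambuf calls in action. It is only a sketch: read_backwards is a made-up helper that collects the last few characters in reverse order, using pubseekoff to move and sgetc to look at the current character. With a file you would pass *file.rdbuf(), and the file should be opened in binary mode.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: reads up to `count` characters backwards from the end
// of `sb` and returns them in the order read (last character first).
std::string read_backwards(std::streambuf& sb, std::size_t count)
{
    std::string out;
    // Jump to the end; the returned position is the length of the sequence.
    std::streamoff pos = sb.pubseekoff(0, std::ios::end, std::ios::in);

    while (pos > 0 && out.size() < count) {
        // Move the get position back by one character.
        pos = sb.pubseekoff(-1, std::ios::cur, std::ios::in);
        if (pos < 0)
            break;                              // seek failed
        const int ch = sb.sgetc();              // look at it without advancing
        if (ch == std::char_traits<char>::eof())
            break;
        out.push_back(static_cast<char>(ch));
    }
    return out;
}

// Usage sketch on an in-memory stream.
void demo_read_backwards()
{
    std::istringstream in("hello");
    assert(read_backwards(*in.rdbuf(), 3) == "oll");
}
```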

If this is not fast enough, you could even write your own class derived from
filebuf that implements a sbumpcminus() function that gets the current
character then decrements the get pointer. Use setg to set the get pointers
and range of the get area. But this is getting really advanced.

Jul 22 '05 #9
You can do something like this.

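#include <fstream>
#include <iostream>
#include <iterator>
#include <string>

int main()
{
    std::string toFind = "ABC";

    // Binary mode, so that positions correspond to byte offsets in the file.
    std::ifstream ifs("MyFile.txt", std::ios::in | std::ios::binary);
    if(ifs)
    {
        // Read the whole file into memory.
        std::string fileContent = std::string(
            std::istreambuf_iterator<char>(ifs),
            std::istreambuf_iterator<char>());

        // rfind() already searches from the end, so the pattern is used
        // as-is; a reversed copy would look for "CBA" instead.
        std::string::size_type found_position = fileContent.rfind(toFind);
        if(std::string::npos != found_position)
            std::cout << "Requested string found at position: "
                      << found_position << std::endl;
    }
}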

(Forgive any typos...)

JLR
Jul 22 '05 #10
sk******@rediffmail.com (SK) wrote:

> I am searching for a string (say "ABC") backwards in a file.
> First I seek to the end.
> Then I try to make a check like -
> Is there some function/trick like peek that can instead read backwards
> one character at a time?
> Or is there a better way to solve this problem.

> it goes in several mbs.

> Wish I could redesign the file format, but can't actually :-(
> But what I am parsing are timestamps in a file. I need to know the
> first and the last timestamp in a given file.


If the file is in ASCII format (i.e. it is made up of lines), then
you can read forwards a line at a time (it won't take long even to
parse several megs that way).
If you really must start at the end, then start, say, 50k from the
end, find the first newline, and read forwards from there
(assuming your file does not contain lines longer than 50k).
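
Roughly, the "start 50k from the end and read forwards" idea could look like the sketch below. The helper name last_line and its tail_size parameter are invented for illustration, and it assumes '\n' line endings and that no line is longer than tail_size. It returns the whole last non-empty line; pulling the timestamp out of that line is left to the caller, since the format hasn't been posted.

```cpp
#include <algorithm>
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the last non-empty line of `in`, assuming
// '\n' line endings and that no line is longer than `tail_size` bytes.
std::string last_line(std::istream& in, std::streamoff tail_size = 50 * 1024)
{
    assert(tail_size > 0);

    in.clear();
    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0)
        return std::string();

    const std::streamoff start = std::max<std::streamoff>(0, size - tail_size);
    std::string line;
    if (start > 0) {
        // Start one byte early and discard up to the next newline, so the
        // loop below begins at the start of a line.
        in.seekg(start - 1, std::ios::beg);
        std::getline(in, line);
    } else {
        in.seekg(0, std::ios::beg);
    }

    // Read forwards to the end, remembering the last non-empty line.
    std::string last;
    while (std::getline(in, line)) {
        if (!line.empty())
            last = line;
    }
    return last;
}

// Usage sketch on an in-memory "log" with a tiny tail size.
std::string demo_last_line()
{
    std::istringstream log("10:00 start\n10:05 work\n10:09 stop\n");
    return last_line(log, 16);
}
```

The stream is left at end-of-file afterwards, so call clear() and seek again before reusing it, for example to read the first line for the first timestamp.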
Jul 22 '05 #11
Karl Heinz Buchegger wrote:
<snip>
> The best thing the OP could do is: Avoid that topic at all by redesigning
> the file format.


This comment really has no value to the discussion because it assumes that: the
OP has the liberty to arbitrarily redesign the file format, that the
requirements for the existing file format are secondary to the (code)
implementation details, and that reading from the end of the file isn't a
suitable solution to the problem as expressed by the OP.
Jul 22 '05 #12
Julie wrote:

> This comment really has no value to the discussion because it assumes that: the
> OP has the liberty to arbitrarily redesign the file format, that the
> requirements for the existing file format are secondary to the (code)
> implementation details, and that reading from the end of the file isn't a
> suitable solution to the problem as expressed by the OP.


On the other hand, it happens quite frequently that posters ask
how to do something and it later turns out that
a simple change in the assignment (if possible) is the far better
solution.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #13
Karl Heinz Buchegger wrote:

> On the other hand it happens quite frequently, that posters post
> questions about how to do something and it turns out later that
> a simple change in the assignment (if possible) is the far better
> solution.


Perhaps you should phrase your original comment as a question, rather than an
assertion.
Jul 22 '05 #14
Why not just reverse the string and search it in the normal way? :)

Jul 22 '05 #15
Hardy wrote:

> why just reverse the string and search it in the normal way?:)

Because it isn't very efficient to start searching a 10 MB file
at the beginning when all you are interested in is the *last*
occurrence of the search string, and you know that this
search string is located somewhere in the last 100 bytes or so.
--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #16
