
Locating partially matched strings in a file

P: n/a

I have a situation where I have a bunch of strings stuffed into a
vector.

I need to run through each of them and locate approximate (or complete)
matches to them in another file (which also contains a string on each
line). If I am searching for CAD I need to locate even something that
looks like ACAD or CADINC (along with the exact match for CAD of
course).

I ran into some examples using istream_iterator and have an idea how to
do this if I am just looking for an exact match.

Can someone point me in the right direction what I need to do if I have
to make substring searches too?

thanks!

Sep 18 '06 #1
8 Replies


P: n/a
Dilip wrote:
> I ran into some examples using istream_iterator and have an idea how to
> do this if I am just looking for an exact match.

Actually here is what I came up with. Does it work for the situation I
mentioned above (partial matches)?

ifstream ifs("filetosearch.txt");
string strtosearch("CAD");

istream_iterator<string> pos(std::find(istream_iterator<string>(ifs),
                                       istream_iterator<string>(), strtosearch));

// at this point, how do I check if I located what I need?
// and how do I extract what I located in the file?

Sep 18 '06 #2

P: n/a
Dilip wrote:
> // at this point, how do I check if I located what I need?
> // and how do I extract what I located in the file?

Just read each line of your file into a std::string object, then use
one of std::string's "find" functions to locate your substring (search
for the std::string documentation). This isn't necessarily the most
efficient way to do it, but it's one way.

Sep 18 '06 #3

P: n/a
Dilip wrote:
> ifstream ifs("filetosearch.txt");
> string strtosearch("CAD");
>
> istream_iterator<string> pos(std::find(istream_iterator<string>(ifs),
> istream_iterator<string>(), strtosearch));

This assumes that the lines in your file contain no whitespace (IIRC
istream_iterator<string> splits on any whitespace char). IMHO trying to do this
with std algorithms/iterators/binders is likely to give you a brain
hemorrhage... the easiest thing is to read your file one line at a time
(in a loop) and search for each of your substrings in the current line.

D.

Sep 18 '06 #4

P: n/a
Davlet Panech wrote:
> This assumes that the lines in your file contain no whitespace (IIRC
> istream_iterator<string> splits on any whitespace char). IMHO trying to do this
> with std algorithms/iterators/binders is likely to give you a brain
> hemorrhage... the easiest thing is to read your file one line at a time
> (in a loop) and search for each of your substrings in the current line.

Oops, there could be whitespace in a line, so istream_iterator is
out.
I do exactly what you suggested above, but I see all the cool kids out
there have littered their code with iterators and binders and whatnot. I
thought I'd use this opportunity to learn a more C++/STL-ish way of
doing things...
Oh well.

Sep 18 '06 #5

P: n/a
Dilip wrote:
> oops.. there could be white space in a line. so istream_iterator is
> out.

No, it's not. istream_iterator doesn't do anything special with
whitespace. It just copies one character after another.

--

-- Pete

Author of "The Standard C++ Library Extensions: a Tutorial and
Reference." For more information about this book, see
www.petebecker.com/tr1book.
Sep 18 '06 #6

P: n/a
Pete Becker wrote:
> No, it's not. istream_iterator doesn't do anything special with
> whitespace. It just copies one character after another.

Actually istream_iterator uses operator>> to read things, which in the case
of std::string reads one "word", e.g. "abc efg" would stop after "c".
Sep 18 '06 #7

P: n/a
Davlet Panech wrote:
> Actually istream_iterator uses operator>> to read things, which in case
> of std::string reads one "word", e.g. "abc efg" would stop after "c".

You're right. Sorry about the confusion. I was thinking of
istreambuf_iterator, which would be the right choice here.

--

-- Pete

Sep 18 '06 #8

P: n/a
In article <oY******************************@storm.ca>,
dp***********@yahoo.ca says...

[ ... ]
> Actually istream_iterator uses operator>> to read things, which in case
> of std::string reads one "word", e.g. "abc efg" would stop after "c".

I'd use "token" instead of "word", but that's more or less correct. Keep in
mind, however, that tokens/words are broken (only) at what is defined as
whitespace by the ctype facet of the locale associated with the stream.

You can define a new ctype facet that only defines new-line as white
space, and go from there.

Alternatively, you can define a string proxy that overloads operator>>
to use std::getline:

class line {
    std::string data;
public:
    operator std::string() const { return data; }

    friend std::istream &operator>>(std::istream &is, line &l) {
        return std::getline(is, l.data);
    }
};

Then use it something like:

std::vector<std::string> lines;

std::copy(std::istream_iterator<line>(wherever),
          std::istream_iterator<line>(),
          std::back_inserter(lines));

The only place we use the 'line' type is to instantiate the iterator.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Sep 18 '06 #9
