Bytes IT Community

Matching chars in a std::string

Hi, I need a function that takes a match pattern, which may include
the wildcard characters described below, and uses it to find chars
in a std::string.

The match pattern can contain the wildcard characters "*" and "?",
where "*" matches zero or more consecutive occurrences of any
character and "?" matches a single occurrence of any character.

Does Boost or some other library have this capability? If Boost does
have this, do I need to include the entire Boost library or just the
part I want? How much extra code size would result from using just a
single utility function from the library?

Thanks
Jun 27 '08 #1
11 Replies


tech wrote:
> Hi, I need a function to specify a match pattern including using
> wildcard characters as below
> to find chars in a std::string.

Use a regular expression library.

> The match pattern can contain the wildcard characters "*" and "?",
> where "*" matches zero or more consecutive occurrences of any
> character and "?" matches a single occurrence of any character.

Example:

using namespace boost;
...
regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);

> Does boost or some other library have this capability?

Yes, it's called Boost.Regex:
http://www.boost.org/doc/libs/1_35_0...tml/index.html

> If boost does have this, do i need to include an entire
> boost library or just the bit i want. How much extra code size would
> result from just using a single
> utility function from the library?

On my Linux system, the size of the shared library

/usr/lib/libboost_regex.so.1.34.1

is 768320 bytes.

Regards

M.
Jun 27 '08 #2

On Jun 23, 12:41 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> tech wrote:
> > Hi, I need a function to specify a match pattern including
> > using wildcard characters as below to find chars in a
> > std::string.
>
> Use a Regular expression library.

Yes, but...

> > The match pattern can contain the wildcard characters "*" and "?",
> > where "*" matches zero or more consecutive occurrences of any
> > character and "?" matches a single occurrence of any character.
>
> Example:
> using namespace boost;
> ...
> regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);

This is a joke, right? You need code to convert a match pattern
to a regular expression; you have to convert "*" to something like
"[^/]*", for example (under Unix---under Windows, the equivalent
mapping would be "[^/\\]*"---and under Unix, at least, if it is
the first thing in a filename, you also have to exclude "."). And
you have to escape the regular expression meta-characters as
well.

It's still easier to use a regular expression class than to do
it all by hand, but you do need some extra code to generate
the regular expression from the initial pattern.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #3
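
The conversion described in the post above (escape the regular expression meta-characters, then map the two wildcards) can be sketched roughly as follows. This is a minimal sketch and not code from the thread: it assumes C++11 `std::regex` in place of Boost.Regex, the names `wildcard_to_regex` and `wildcard_regex_match` are made up for illustration, and it follows the original question's definition, so "*" matches any character (including "/" and newlines) rather than applying filename rules.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: translate a "*"/"?" wildcard pattern into an
// ECMAScript regular expression (the default grammar of std::regex).
std::string wildcard_to_regex(const std::string& pattern) {
    // Characters that have a special meaning in a regular expression.
    static const std::string meta = "\\^$.|+()[]{}";
    std::string re;
    for (char c : pattern) {
        if (c == '*') {
            re += "[\\s\\S]*";      // zero or more of any character
        } else if (c == '?') {
            re += "[\\s\\S]";       // exactly one of any character
        } else if (meta.find(c) != std::string::npos) {
            re += '\\';             // escape the meta-character
            re += c;
        } else {
            re += c;                // ordinary character, copied as-is
        }
    }
    return re;
}

// Hypothetical helper: true if the whole of `text` matches the wildcard pattern.
bool wildcard_regex_match(const std::string& pattern, const std::string& text) {
    const std::regex re(wildcard_to_regex(pattern));
    return std::regex_match(text, re);   // regex_match requires a full match
}
```

With this sketch, `wildcard_regex_match("*s?r*g", "a string")` is true, while the pattern "a.b" matches only the literal text "a.b" because the dot is escaped. For filename matching as described in the post, the "[\\s\\S]" classes would have to be replaced by something like "[^/]".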

James Kanze wrote:
> On Jun 23, 12:41 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> > Use a Regular expression library.
>
> Yes, but...
>
> > Example:
> > using namespace boost;
> > ...
> > regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);
>
> This is a joke, right? You need code to convert a match pattern
> to a regular expression; you have to convert "*" to something like
> "[^/]*", for example (under Unix---under Windows, the equivalent
> mapping would be "[^/\\]*"---and under Unix, at least, if it is
> the first thing in a filename, you also have to exclude "."). And
> you have to escape the regular expression meta-characters as
> well.

What are you talking about? There's no 'filename' mentioned
anywhere. It's plain text processing with regular expressions
(if I'm not completely off the road).

> It's still easier to use a regular expression class than to do
> it all by hand, but you do need some extra code to generate
> the regular expression from the initial pattern.

Not at all. The above would be (OK, I made this up, it's
a pseudo expression) a valid regular expression. Another
(maybe related) example: find all links in a web page:

#include <boost/regex.hpp>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int linkparser(const char* htmlname)
{
    // i: ignore case, s: "." matches newlines, x: ignore whitespace in the pattern
    boost::regex reg(
        "(?isx-m: \
        < \\s* A [^>]* href \\s* = \
        [\"\\s]* \
        \\w+:// ([^\"\\s]*) \
        )"
    );

    string line;            // read lines and perform one match/search per line
    int linecount = 0;      // count lines (nice)
    ifstream fin(htmlname); // open saved .html file

    cout << "trying to find links in " << htmlname << endl;
    while( getline(fin, line) ) {
        ++linecount;
        boost::smatch match; // instantiate match variable
        if( boost::regex_search(line, match, reg) )
            cout << linecount << "\t" << match[1] << endl;
    }

    ...

What part of the above expression exactly would you consider
when saying:

> you do need some extra code to generate the regular expression

Maybe we speak of different things?

Regards

Mirco
Jun 27 '08 #4

On Jun 23, 7:21 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
> > On Jun 23, 12:41 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> > > Use a Regular expression library.
> > Yes, but...
> > > Example:
> > > using namespace boost;
> > > ...
> > > regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);
> > This is a joke, right? You need code to convert a match pattern
> > to a regular expression; you have to convert "*" to something like
> > "[^/]*", for example (under Unix---under Windows, the equivalent
> > mapping would be "[^/\\]*"---and under Unix, at least, if it is
> > the first thing in a filename, you also have to exclude "."). And
> > you have to escape the regular expression meta-characters as
> > well.
>
> What are you talking about? There's no 'filename' mentioned
> nowhere. It's plain text processing with regular expressions
> (if I'm not completely off the road).

The pattern matching he described was wildcard matching of
filenames, not regular expression evaluation. The conventions
are different (but it is possible to map the wildcard matching
to regular expressions, sort of).

> > It's still easier to use a regular expression class than to
> > do it all by hand, but you do need some extra code to
> > generate the regular expression from the initial pattern.
>
> Not at all. The above would be (OK I made this up, its
> a pseudo expression) a valid regular expression.

Yes, but it's not what he asked for. What he asked for was that
``"*" matches zero or more consecutive occurrences of any
character and "?" matches a single occurrence of any
character.'' That is a subset of the classical filename globbing
patterns.

[...]

> What part of the above expression exactly would you consider
> when saying:
> > you do need some extra code to generate the regular expression
> Maybe we speak of different things?

I was talking about what the original poster asked for. You can
do it with regular expressions (I have code which translates a
Unix globbing pattern into a regular expression), but it takes
some pre-processing.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #5
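
On POSIX systems, filename-style globbing of the kind discussed in the post above is already available through `fnmatch()` from `<fnmatch.h>`, so no translation to a regular expression is needed there. The sketch below wraps it; the wrapper name `glob_match` and its `filename_rules` parameter are hypothetical, and the header is not part of standard C++ (it is typically missing with MSVC on Windows). Note that `fnmatch()` also gives special meaning to "[" and, by default, to "\", which goes beyond the two wildcards in the original question.

```cpp
#include <fnmatch.h>   // POSIX header, not part of standard C++
#include <string>

// Hypothetical wrapper: true if `text` matches the shell-style `pattern`.
// With filename_rules set, "*" and "?" do not match "/" (FNM_PATHNAME)
// and a leading "." must be matched explicitly (FNM_PERIOD).
bool glob_match(const std::string& pattern, const std::string& text,
                bool filename_rules = false) {
    const int flags = filename_rules ? (FNM_PATHNAME | FNM_PERIOD) : 0;
    // fnmatch() returns 0 on a match and FNM_NOMATCH otherwise.
    return fnmatch(pattern.c_str(), text.c_str(), flags) == 0;
}
```

Called with the default flags, `glob_match("*", "dir/file")` is true because "*" matches any character; with `filename_rules` set to true the same call is false, which is the Unix mapping to "[^/]*" mentioned earlier in the thread.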

James Kanze wrote:
> > Maybe we speak of different things?
>
> I was talking about what the original poster asked for. You can
> do it with regular expressions (I have code which translates a
> Unix globbing pattern into a regular expression), but it takes
> some pre-processing.

[...]

> The pattern matching he described was wildcard matching of
> filenames, not regular expression evaluation. The conventions
> are different (but it is possible to map the wildcard matching
> to regular expressions, sort of).

This is the OP's question:

|[Subject: Matching chars in a std::string]
| Hi, I need a function to specify a match pattern including using
| wildcard characters as below to find chars in a std::string. The
| match pattern can contain the wildcard characters "*" and "?",
| where "*" matches zero or more consecutive occurrences of any
| character and "?" matches a single occurrence of any character.

I fail to see anything here that you mentioned in your two
preceding posts.

Regards

Mirco
Jun 27 '08 #6

On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
> > > Maybe we speak of different things?
> > I was talking about what the original poster asked for. You can
> > do it with regular expressions (I have code which translates a
> > Unix globbing pattern into a regular expression), but it takes
> > some pre-processing.
> [...]
> > The pattern matching he described was wildcard matching of
> > filenames, not regular expression evaluation. The conventions
> > are different (but it is possible to map the wildcard matching
> > to regular expressions, sort of).
> This is the OP's question:
> |[Subject: Matching chars in a std::string]
> | Hi, I need a function to specify a match pattern including using
> | wildcard characters as below to find chars in a std::string. The
> | match pattern can contain the wildcard characters "*" and "?",
> | where "*" matches zero or more consecutive occurrences of any
> | character and "?" matches a single occurrence of any character.
> I fail to see anything here you mentioned in your two
> preceding posts.

Really? You don't see any mention of "wildcard"? You don't see
a definition of "*" which says it matches zero or more
consecutive occurrences of any character? You don't see a
definition of "?" which matches a single occurrence of any
character?

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #7

James Kanze wrote:
> On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> > This is the OP's question:
> > |[Subject: Matching chars in a std::string]
> > | Hi, I need a function to specify a match pattern including using
> > | wildcard characters as below to find chars in a std::string. The
> > | match pattern can contain the wildcard characters "*" and "?",
> > | where "*" matches zero or more consecutive occurrences of any
> > | character and "?" matches a single occurrence of any character.
>
> [...]
>
> Really? You don't see any mention of "wildcard"? You don't see
> a definition of "*" which says it matches zero or more
> consecutive occurrences of any character? You don't see a
> definition of "?" which matches a single occurrence of any
> character?

OK, I'm sorry, my mistake. When I read your post saying:

> > The pattern matching he described was wildcard matching of
> > filenames, not regular expression evaluation. The conventions
> > are different (but it is possible to map the wildcard matching
> > to regular expressions, sort of).

I understood it more like:

| The pattern matching he described was wildcard matching of
| filenames, not regular expression evaluation. The conventions
| are different (but it is possible to map the wildcard matching
| to regular expressions, sort of).

So you didn't really mean:
"/... matching of filenames, not regular expression evaluation .../"

but rather meant exactly what the OP wanted to know. Sorry
for not being able to deduce that from it (I'm new to c.l.c++).

Regards & Thanks for clearing this up

Mirco
Jun 27 '08 #8

On 24 Jun, 10:12, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
> > On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> > > This is the OP's question:
> > > |[Subject: Matching chars in a std::string]
> > > | Hi, I need a function to specify a match pattern including using
> > > | wildcard characters as below to find chars in a std::string. The
> > > | match pattern can contain the wildcard characters "*" and "?",
> > > | where "*" matches zero or more consecutive occurrences of any
> > > | character and "?" matches a single occurrence of any character.

I think you're both mind-reading. You're translating what the
user asked for into what you think he wants.

<snip>

> > > The pattern matching he described was wildcard matching of
> > > filenames, not regular expression evaluation.

No... I wonder if he wants pattern matching and has only seen
file globbing. He may not *know* he wants reg-exprs. I think the
*, ? was possibly only an example.

> > > The conventions
> > > are different (but it is possible to map the wildcard matching
> > > to regular expressions, sort of).
>
> I understood it more like:
>
> | The pattern matching he described was wildcard matching of
> | filenames, not regular expression evaluation. The conventions
> | are different (but it is possible to map the wildcard matching
> | to regular expressions, sort of).
>
> So you didn't really mean:
> "/... matching of filenames, not regular expression evaluation .../"
>
> but rather meant exactly what the OP wanted to know. Sorry
> for not being able to deduce that from it (I'm new to c.l.c++).

Well, it confused me too. I too thought James Kanze was insisting
that the OP was matching file names.

Perhaps the OP could give more info?
--
Nick Keighley

Jun 27 '08 #9

On Jun 24, 10:34 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:
> [...]
> well it confused me too. I too thought James Kanze was insisting
> that the OP was matching file names.
>
> Perhaps the OP could give more info?

Sorry for not being clear. I just wanted a simple pattern matcher that
does not use regular expressions; I think those are too much.

The match pattern can contain the wildcard characters "*" and "?",
where "*" matches zero or more consecutive occurrences of any
character and "?" matches a single occurrence of any character.

std::string does not have such a match function that returns a
bool.
Jun 27 '08 #10

On Jun 24, 11:12 am, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
> > On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> > > This is the OP's question:
> > > |[Subject: Matching chars in a std::string]
> > > | Hi, I need a function to specify a match pattern including using
> > > | wildcard characters as below to find chars in a std::string. The
> > > | match pattern can contain the wildcard characters "*" and "?",
> > > | where "*" matches zero or more consecutive occurrences of any
> > > | character and "?" matches a single occurrence of any character.
> > [...]
> > Really? You don't see any mention of "wildcard"? You don't
> > see a definition of "*" which says it matches zero or more
> > consecutive occurrences of any character? You don't see a
> > definition of "?" which matches a single occurrence of any
> > character?
>
> OK, I'm sorry, my mistake. When I read your post saying:
> > > The pattern matching he described was wildcard matching of
> > > filenames, not regular expression evaluation. The conventions
> > > are different (but it is possible to map the wildcard matching
> > > to regular expressions, sort of).

Exactly. Since that's what he said.

> I understood it more like:
> | The pattern matching he described was wildcard matching of
> | filenames, not regular expression evaluation. The conventions
> | are different (but it is possible to map the wildcard matching
> | to regular expressions, sort of).
> So you didn't really mean:
> "/... matching of filenames, not regular expression evaluation .../"

What I meant was "wildcard matching of filenames", since that's
what the poster described. Maybe he wants to use it for
something else, but the patterns he described correspond to
those used in filename globbing, not in regular expressions.

Of course, maybe he doesn't really want what he asked for, but
is looking for something else. It does happen a lot here. But
in this particular case, I've needed both at various times in
the past, so I more or less assume that both have some utility,
and that if he took the time to write "wildcard matching", and
describe the conventions, it's because he didn't want "regular
expression matching" (which uses significantly different
conventions).

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #11

In article <73b61208-0920-43d3-b9e2-a901dc7d8b55
@m73g2000hsh.googlegroups.com>, na************@googlemail.com says...

[ ... ]

> Sorry for not being clear, i just wanted a simple pattern matcher not
> using regular expressions, i think this is too much
>
> The match pattern can contain the wildcard characters "*" and "?",
> where "*" matches zero or more consecutive occurrences of any
> character and "?" matches a single occurrence of any character
>
> The std::string does not have such a match function which returns a
> bool

I tend to agree -- given that the matching itself only takes up
something like 4 lines of code, it's probably easier to do the match
directly than to convert to an RE and then use an RE engine to do the job.

#include <string>
#include <functional>

class patmat : public std::unary_function<char const *, bool> {
    std::string pat;

    // Recursively match the pattern against the string, one character at a time.
    bool match(char const *pat, char const *str) const {
        switch (*pat) {
        case '\0': return *str=='\0';   // pattern used up: string must be too
        case '*':                       // '*' matches nothing, or one more char
            return match(pat+1, str) || (*str && match(pat, str+1));
        case '?': return *str && match(pat+1, str+1);   // any single char
        default: return *pat==*str && match(pat+1, str+1);
        }
    }
public:
    patmat(std::string pattern) : pat(pattern) {}

    bool operator()(std::string const &str) const {
        return match(pat.c_str(), str.c_str());
    }
};

#ifdef TEST

#include <iostream>
#include <iterator>
#include <cstddef>
#include <algorithm>

void test(char const * const *strings, std::size_t num, std::string pat) {
    std::cout << "\nTesting against " << pat << "\n";
    // remove_copy_if drops the non-matching strings and prints the rest.
    std::remove_copy_if(strings, strings+num,
        std::ostream_iterator<std::string>(std::cout, "\n"),
        std::not1(patmat(pat)));
}

int main() {

    char const *test_strings[] = {
        "longstring",
        "a really, really long string, compared to the others",
        "string",
        "spring",
        "a string"
    };
    std::size_t const count = sizeof test_strings / sizeof test_strings[0];

    std::cout << "Test strings:\n";
    std::copy(test_strings, test_strings+count,
        std::ostream_iterator<std::string>(std::cout, "\n"));

    test(test_strings, count, "a*");
    test(test_strings, count, "*g");
    test(test_strings, count, "*s?r*g");
    test(test_strings, count, "*st*g");
    return 0;
}

#endif

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jun 27 '08 #12
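
The recursive matcher in the post above retries every split point for each "*", so its running time can grow quickly when a pattern contains several "*" and the string almost matches, and its recursion depth grows with the length of the string. A common alternative is an iterative loop that remembers only the most recent "*" and backtracks to it. The following is a sketch of that approach, not code from the thread; the function name `wildcard_match` is made up for illustration, and it works on `std::string` lengths rather than on terminating NUL characters.

```cpp
#include <string>

// Hypothetical helper: true if the whole of `str` matches `pat`, where
// '*' matches zero or more characters and '?' matches exactly one.
bool wildcard_match(const std::string& pat, const std::string& str) {
    const std::string::size_type npos = std::string::npos;
    std::string::size_type p = 0;       // current position in pat
    std::string::size_type s = 0;       // current position in str
    std::string::size_type star = npos; // position of the last '*' seen in pat
    std::string::size_type mark = 0;    // position in str where that '*' started

    while (s < str.size()) {
        if (p < pat.size() && pat[p] == '*') {
            star = p++;                 // remember the '*', try matching nothing
            mark = s;
        } else if (p < pat.size() && (pat[p] == '?' || pat[p] == str[s])) {
            ++p;                        // single character matches
            ++s;
        } else if (star != npos) {
            p = star + 1;               // mismatch: let the last '*' absorb
            s = ++mark;                 // one more character and retry
        } else {
            return false;               // mismatch and no '*' to fall back on
        }
    }
    // The string is used up; only trailing '*' may remain in the pattern.
    while (p < pat.size() && pat[p] == '*') ++p;
    return p == pat.size();
}
```

With the test strings from the post, `wildcard_match("*s?r*g", "a string")` is true and `wildcard_match("a*", "string")` is false. Because only the latest "*" is ever revisited, the work is bounded by roughly the pattern length times the string length.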
