Matching chars in a std::string

Hi, I need a function to specify a match pattern, including wildcard
characters as described below, to find chars in a std::string.

The match pattern can contain the wildcard characters "*" and "?",
where "*" matches zero or more consecutive occurrences of any
character and "?" matches a single occurrence of any character.

Does Boost or some other library have this capability? If Boost does
have this, do I need to include the entire Boost library or just the
bit I want? How much extra code size would result from using just a
single utility function from the library?

Thanks
Jun 27 '08 #1
tech wrote:
> Hi, I need a function to specify a match pattern including using
> wildcard characters as below
> to find chars in a std::string.

Use a regular expression library.

> The match pattern can contain the wildcard characters "*" and "?",
> where "*" matches zero or more consecutive occurrences of any
> character and "?" matches a single occurrence of any character.

Example:

using namespace boost;
...
regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);

> Does boost or some other library have this capability?

Yes, it's called boost_regex
http://www.boost.org/doc/libs/1_35_0...tml/index.html

> If boost does have this, do i need to include an entire
> boost library or just the bit i want. How much extra code size would
> result from just using a single
> utility function from the library?

On my (Linux-)System, the size of the shared library

/usr/lib/libboost_regex.so.1.34.1

is 768320 bytes.

Regards

M.
Jun 27 '08 #2
On Jun 23, 12:41 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> tech wrote:
>> Hi, I need a function to specify a match pattern including
>> using wildcard characters as below to find chars in a
>> std::string.

> Use a Regular expression library.

Yes, but...

>> The match pattern can contain the wildcard characters "*" and "?",
>> where "*" matches zero or more consecutive occurrences of any
>> character and "?" matches a single occurrence of any character.

> Example:
> using namespace boost;
> ...
> regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);

This is a joke, right. You need code to convert a match pattern
to a regular expression; you have to convert "*" to something like
"[^/]*", for example (under Unix---under Windows, the equivalent
mapping would be "[^/\\]*"---and under Unix, at least, if it is
the first thing in a filename, you also have to exclude .). And
you have to escape the regular expression meta-characters as
well.

It's still easier to use a regular expression class than to do
it all by hand, but you do need some extra code to generate
the regular expression from the initial pattern.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #3
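
The conversion step described in the post above can be sketched roughly as follows. This is only an illustration, not the poster's code: it assumes C++11 std::regex rather than Boost, the plain "any character" meaning of "*" and "?" from the original question (not the filename rules with "[^/]*"), and text without line breaks, since "." does not match a newline in this syntax. The names wildcard_to_regex and wildcard_match_via_regex are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: translate a wildcard pattern ('*' and '?') into an
// ECMAScript regular expression, escaping every regex metacharacter.
std::string wildcard_to_regex(const std::string& pattern) {
    static const std::string meta = "\\^$.|?*+()[]{}";
    std::string re;
    for (char c : pattern) {
        if (c == '*') {
            re += ".*";            // any run of characters, including none
        } else if (c == '?') {
            re += '.';             // exactly one character
        } else {
            if (meta.find(c) != std::string::npos)
                re += '\\';        // neutralise a regex metacharacter
            re += c;
        }
    }
    return re;
}

// Hypothetical helper: true if the whole of `text` matches the wildcard
// pattern. regex_match (unlike regex_search) anchors at both ends.
bool wildcard_match_via_regex(const std::string& pattern,
                              const std::string& text) {
    return std::regex_match(text, std::regex(wildcard_to_regex(pattern)));
}
```

With this sketch, the pattern "a.c" is translated to the regular expression a\.c, so it matches the text "a.c" but not "abc".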
James Kanze wrote:
> On Jun 23, 12:41 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>> Use a Regular expression library.
>
> Yes, but...
>
>> Example:
>> using namespace boost;
>> ...
>> regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);
>
> This is a joke, right. You need code to convert a match pattern
> to a regular expression; you have to convert "*" to something like
> "[^/]*", for example (under Unix---under Windows, the equivalent
> mapping would be "[^/\\]*"---and under Unix, at least, if it is
> the first thing in a filename, you also have to exclude .). And
> you have to escape the regular expression meta-characters as
> well.

What are you talking about? There's no 'filename' mentioned
anywhere. It's plain text processing with regular expressions
(if I'm not completely off the road).

> It's still easier to use a regular expression class than to do
> it all by hand, but you do need some extra code to generate
> the regular expression from the initial pattern.

Not at all. The above would be (OK I made this up, it's
a pseudo expression) a valid regular expression. Another
(maybe related) example: find all links in a web page.

int linkparser(const char* htmlname)
{
    boost::regex reg(
        "(?isx-m: \
        < \\s* A [^>]* href \\s* = \
        [\"\\s]* \
        \\w+:// ([^\"\\s]*) \
        )"
    );

    string line;            // read lines and perform one match/search per line
    int linecount = 0;      // count lines (nice)
    ifstream fin(htmlname); // open saved .html file

    cout << "trying to find links in " << htmlname << endl;
    while( getline(fin, line) ) {
        ++linecount;
        boost::smatch match; // instantiate match variable
        if( boost::regex_search(line, match, reg) )
            cout << linecount << "\t" << match[1] << endl;
    }

    ...

What part of the above expression exactly would you consider
when saying:

> you do need some extra code to generate the regular expression

Maybe we speak of different things?

Regards

Mirco
Jun 27 '08 #4
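
For readers who want to try the link-finding idea from the post above without Boost, here is a self-contained sketch. It is an approximation under stated assumptions, not the poster's code: it uses C++11 std::regex (which has no (?isx-m:) inline modifiers, so case-insensitivity is requested with std::regex::icase and the pattern is written without free spacing), it works on an in-memory string instead of a file, and it collects every link rather than the first one per line. The function name find_links is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: return the part after "scheme://" of every
// href attribute found in an <a ...> tag inside `html`.
std::vector<std::string> find_links(const std::string& html) {
    static const std::regex link(
        R"re(<\s*a\b[^>]*\bhref\s*=\s*["']?\s*\w+://([^"'\s>]*))re",
        std::regex::icase);

    std::vector<std::string> out;
    // sregex_iterator walks over all non-overlapping matches.
    for (std::sregex_iterator it(html.begin(), html.end(), link), end;
         it != end; ++it) {
        out.push_back((*it)[1].str());  // capture group 1: text after "://"
    }
    return out;
}
```

Note that this is a regular expression written by hand for one purpose; it is not derived from a user-supplied wildcard pattern.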
On Jun 23, 7:21 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
>> On Jun 23, 12:41 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>>> Use a Regular expression library.
>> Yes, but...
>>> Example:
>>> using namespace boost;
>>> ...
>>> regex reg("^ \\s* .*? (\\d+) [^\\n\\r]* \\d? [\\n\\r]+", regex::mod_x);
>> This is a joke, right. You need code to convert a match pattern
>> to a regular expression; you have to convert "*" to something like
>> "[^/]*", for example (under Unix---under Windows, the equivalent
>> mapping would be "[^/\\]*"---and under Unix, at least, if it is
>> the first thing in a filename, you also have to exclude .). And
>> you have to escape the regular expression meta-characters as
>> well.

> What are you talking about? There's no 'filename' mentioned
> nowhere. It's plain text processing with regular expressions
> (if I'm not completely off the road).

The pattern matching he described was wildcard matching of
filenames, not regular expression evaluation. The conventions
are different (but it is possible to map the wildcard matching
to regular expressions, sort of).

>> It's still easier to use a regular expression class than to
>> do it all by hand, but you do need some extra code to
>> generate the regular expression from the initial pattern.

> Not at all. The above would be (OK I made this up, its
> a pseudo expression) a valid regular expression.

Yes, but it's not what he asked for. What he asked for was that
``"*" matches zero or more consecutive occurrences of any
character and "?" matches a single occurrence of any
character.'' A subset of the classical filename globbing
patterns.

[...]

> What part of the above expression exactly would you consider
> when saying:
>> you do need some extra code to generate the regular expression
> Maybe we speak of different things?

I was talking about what the original poster asked for. You can
do it with regular expressions (I have code which translates a
Unix globbing pattern into a regular expression), but it takes
some pre-processing.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #5
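
To make the point about differing conventions concrete, the sketch below feeds wildcard-looking patterns directly to a regular expression engine. It assumes C++11 std::regex with its default ECMAScript syntax rather than Boost, and the helper name matches_as_regex is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: interpret `pattern` with regular-expression rules
// (ECMAScript syntax) and require it to match all of `text`.
bool matches_as_regex(const std::string& pattern, const std::string& text) {
    return std::regex_match(text, std::regex(pattern));
}

// Under regex rules:
//   "a*"     means zero or more 'a' characters, so it matches "aaa" and ""
//            but not "a string";
//   "s?ring" means an optional 's' followed by "ring", so it matches "ring"
//            but not "string".
// The regex spellings of the wildcard patterns a* and s?ring are
// "a.*" and "s.ring".
```

So a wildcard pattern handed unchanged to a regex engine is usually still a valid expression, but it matches a different set of strings, which is why the pre-processing step is needed.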
James Kanze wrote:
>> Maybe we speak of different things?
>
> I was talking about what the original poster asked for. You can
> do it with regular expressions (I have code which translates a
> Unix globbing pattern into a regular expression), but it takes
> some pre-processing.

[...]

> The pattern matching he described was wildcard matching of
> filenames, not regular expression evaluation. The conventions
> are different (but it is possible to map the wildcard matching
> to regular expressions, sort of).

This is the OP's question:

|[Subject: Matching chars in a std::string]
| Hi, I need a function to specify a match pattern including using
| wildcard characters as below to find chars in a std::string. The
| match pattern can contain the wildcard characters "*" and "?",
| where "*" matches zero or more consecutive occurrences of any
| character and "?" matches a single occurrence of any character.

I fail to see anything here that you mentioned in your two
preceding posts.

Regards

Mirco
Jun 27 '08 #6
On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
>>> Maybe we speak of different things?
>> I was talking about what the original poster asked for. You can
>> do it with regular expressions (I have code which translates a
>> Unix globbing pattern into a regular expression), but it takes
>> some pre-processing.
> [...]
>> The pattern matching he described was wildcard matching of
>> filenames, not regular expression evaluation. The conventions
>> are different (but it is possible to map the wildcard matching
>> to regular expressions, sort of).
> This is the OP's question:
> |[Subject: Matching chars in a std::string]
> | Hi, I need a function to specify a match pattern including using
> | wildcard characters as below to find chars in a std::string. The
> | match pattern can contain the wildcard characters "*" and "?",
> | where "*" matches zero or more consecutive occurrences of any
> | character and "?" matches a single occurrence of any character.
> I fail to see anything here you mentioned in your two
> preceding posts.

Really? You don't see any mention of "wildcard"? You don't see
a definition of "*" which says it matches zero or more
consecutive occurrences of any character? You don't see a
definition of "?" which matches a single occurrence of any
character?

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #7
James Kanze wrote:
> On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>> This is the OP's question:
>> |[Subject: Matching chars in a std::string]
>> | Hi, I need a function to specify a match pattern including using
>> | wildcard characters as below to find chars in a std::string. The
>> | match pattern can contain the wildcard characters "*" and "?",
>> | where "*" matches zero or more consecutive occurrences of any
>> | character and "?" matches a single occurrence of any character.

[...]

> Really? You don't see any mention of "wildcard"? You don't see
> a definition of "*" which says it matches zero or more
> consecutive occurrences of any character? You don't see a
> definition of "?" which matches a single occurrence of any
> character?

OK, I'm sorry, my mistake. When I read your post saying:

>> The pattern matching he described was wildcard matching of
>> filenames, not regular expression evaluation. The conventions
>> are different (but it is possible to map the wildcard matching
>> to regular expressions, sort of).

I understood it more like:

| The pattern matching he described was wildcard matching of
| filenames, not regular expression evaluation. The conventions
| are different (but it is possible to map the wildcard matching
| to regular expressions, sort of).

So you didn't really mean:
"/... matching of filenames, not regular expression evaluation .../"

but rather meant exactly what the OP wanted to know. Sorry
for not being able to deduce that from it (I'm new to c.l.c++).

Regards & Thanks for clearing this up

Mirco
Jun 27 '08 #8
On 24 Jun, 10:12, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
>> On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>>> This is the OP's question:
>>> |[Subject: Matching chars in a std::string]
>>> | Hi, I need a function to specify a match pattern including using
>>> | wildcard characters as below to find chars in a std::string. The
>>> | match pattern can contain the wildcard characters "*" and "?",
>>> | where "*" matches zero or more consecutive occurrences of any
>>> | character and "?" matches a single occurrence of any character.

I think you're both mind-reading. You're translating what the
user asked for into what you think he wants.

<snip>

>>> The pattern matching he described was wildcard matching of
>>> filenames, not regular expression evaluation.

no... I wonder if he wants pattern matching and has only seen
file globbing. He may not *know* he wants reg-exprs. I think the
*, ? was possibly only an example.

>>> The conventions
>>> are different (but it is possible to map the wildcard matching
>>> to regular expressions, sort of).
>
> I understood it more like:
>
> | The pattern matching he described was wildcard matching of
> | filenames, not regular expression evaluation. The conventions
> | are different (but it is possible to map the wildcard matching
> | to regular expressions, sort of).
>
> So you didn't really mean:
> "/... matching of filenames, not regular expression evaluation .../"
>
> but rather meant exactly what the OP wanted to know. Sorry
> for not being able to deduce that from it (I'm new to c.l.c++).

well it confused me too. I too thought James Kanze was insisting
that the OP was matching file names.

Perhaps the OP could give more info?
--
Nick Keighley

Jun 27 '08 #9
On Jun 24, 10:34 am, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:
> On 24 Jun, 10:12, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>> James Kanze wrote:
>>> On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>>>> This is the OP's question:
>>>> |[Subject: Matching chars in a std::string]
>>>> | Hi, I need a function to specify a match pattern including using
>>>> | wildcard characters as below to find chars in a std::string. The
>>>> | match pattern can contain the wildcard characters "*" and "?",
>>>> | where "*" matches zero or more consecutive occurrences of any
>>>> | character and "?" matches a single occurrence of any character.
>
> I think you're both mind-reading. You're translating what the
> user asked for into what you think he wants.
>
> <snip>
>
>>>> The pattern matching he described was wildcard matching of
>>>> filenames, not regular expression evaluation.
>
> no... I wonder if he wants pattern matching and has only seen
> file globbing. He may not *know* he wants reg-exprs. I think the
> *, ? was possibly only an example.
>
>>>> The conventions
>>>> are different (but it is possible to map the wildcard matching
>>>> to regular expressions, sort of).
>> I understood it more like:
>> | The pattern matching he described was wildcard matching of
>> | filenames, not regular expression evaluation. The conventions
>> | are different (but it is possible to map the wildcard matching
>> | to regular expressions, sort of).
>> So you didn't really mean:
>> "/... matching of filenames, not regular expression evaluation .../"
>> but rather meant exactly what the OP wanted to know. Sorry
>> for not being able to deduce that from it (I'm new to c.l.c++).
>
> well it confused me too. I too thought James Kanze was insisting
> that the OP was matching file names.
>
> Perhaps the OP could give more info?
>
> --
> Nick Keighley

Sorry for not being clear. I just wanted a simple pattern matcher that
does not use regular expressions; I think that is too much.

The match pattern can contain the wildcard characters "*" and "?",
where "*" matches zero or more consecutive occurrences of any
character and "?" matches a single occurrence of any character.

std::string does not have such a match function which returns a
bool.
Jun 27 '08 #10
On Jun 24, 11:12 am, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
> James Kanze wrote:
>> On Jun 23, 9:46 pm, Mirco Wahab <wa...@chemie.uni-halle.de> wrote:
>>> This is the OP's question:
>>> |[Subject: Matching chars in a std::string]
>>> | Hi, I need a function to specify a match pattern including using
>>> | wildcard characters as below to find chars in a std::string. The
>>> | match pattern can contain the wildcard characters "*" and "?",
>>> | where "*" matches zero or more consecutive occurrences of any
>>> | character and "?" matches a single occurrence of any character.
> [...]
>> Really? You don't see any mention of "wildcard"? You don't
>> see a definition of "*" which says it matches zero or more
>> consecutive occurrences of any character? You don't see a
>> definition of "?" which matches a single occurrence of any
>> character?

> OK, I'm sorry, my mistake. When I read your post saying:
>>> The pattern matching he described was wildcard matching of
>>> filenames, not regular expression evaluation. The conventions
>>> are different (but it is possible to map the wildcard matching
>>> to regular expressions, sort of).

Exactly. Since that's what he said.

> I understood it more like:
> | The pattern matching he described was wildcard matching of
> | filenames, not regular expression evaluation. The conventions
> | are different (but it is possible to map the wildcard matching
> | to regular expressions, sort of).
> So you didn't really mean:
> "/... matching of filenames, not regular expression evaluation .../"

What I meant was "wildcard matching of filenames", since that's
what the poster described. Maybe he wants to use it for
something else, but the patterns he described correspond to
those used in filename globbing, not in regular expressions.

Of course, maybe he doesn't really want what he asked for, but
is looking for something else. It does happen a lot here. But
in this particular case, I've needed both at various times in
the past, so I more or less assume that both have some utility,
and that if he took the time to write "wildcard matching", and
describe the conventions, it's because he didn't want "regular
expression matching" (which uses significantly different
conventions).

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #11
In article <73b61208-0920-43d3-b9e2-a901dc7d8b55
@m73g2000hsh.googlegroups.com>, na************@googlemail.com says...

[ ... ]
> Sorry for not being clear, i just wanted a simple pattern matcher not
> using regular expressions, i think this is too much
>
> The match pattern can contain the wildcard characters "*" and "?",
> where "*" matches zero or more consecutive occurrences of any
> character and "?" matches a single occurrence of any character
>
> The std::string does not have such a match function which returns a
> bool

I tend to agree -- given that the matching itself only takes up
something like 4 lines of code, it's probably easier to do the match
than convert to an RE, and then use an RE engine to do the job.

#include <string>

// Function object that tests strings against a wildcard pattern:
// '*' matches zero or more characters, '?' matches exactly one.
class patmat {
    std::string pat;

    // Recursive matcher over NUL-terminated strings.
    bool match(char const *pat, char const *str) const {
        switch (*pat) {
        case '\0': return *str=='\0';
        case '*':
            // Either '*' matches nothing, or it absorbs one more character.
            return match(pat+1, str) || (*str && match(pat, str+1));
        case '?': return *str && match(pat+1, str+1);
        default: return *pat==*str && match(pat+1, str+1);
        }
    }
public:
    patmat(std::string pattern) : pat(pattern) {}

    bool operator()(std::string const &str) const {
        return match(pat.c_str(), str.c_str());
    }
};

#ifdef TEST

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>

void test(char const * const *strings, std::size_t num, std::string pat) {
    std::cout << "\nTesting against " << pat << "\n";
    // std::copy_if (C++11) prints only the strings that match the pattern.
    std::copy_if(strings, strings+num,
        std::ostream_iterator<std::string>(std::cout, "\n"),
        patmat(pat));
}

int main() {

    char const *test_strings[] = {
        "longstring",
        "a really, really long string, compared to the others",
        "string",
        "spring",
        "a string"
    };
    std::cout << "Test strings:\n";
    std::copy(test_strings, test_strings+5,
        std::ostream_iterator<std::string>(std::cout, "\n"));

    test(test_strings, 5, "a*");
    test(test_strings, 5, "*g");
    test(test_strings, 5, "*s?r*g");
    test(test_strings, 5, "*st*g");
    return 0;
}

#endif

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jun 27 '08 #12
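
One caveat about the recursive matcher in the post above: each "*" tries two branches, so a pattern with many "*" tested against a string that does not match can take time exponential in the number of stars. A common alternative is an iterative matcher that remembers only the most recent "*" and backtracks to it. The following is a sketch of that technique, not code from the thread, and the name wildcard_match is hypothetical.

```cpp
#include <string>

// Hypothetical helper: true if the whole of `str` matches `pat`, where
// '*' matches zero or more characters and '?' matches exactly one.
bool wildcard_match(const std::string& pat, const std::string& str) {
    const std::string::size_type npos = std::string::npos;
    std::string::size_type p = 0, s = 0;
    std::string::size_type star = npos;  // index of the last '*' seen in pat
    std::string::size_type mark = 0;     // position in str tied to that '*'

    while (s < str.size()) {
        if (p < pat.size() && pat[p] == '*') {
            star = p++;                  // first try '*' as the empty match
            mark = s;
        } else if (p < pat.size() && (pat[p] == '?' || pat[p] == str[s])) {
            ++p;                         // one character consumed on each side
            ++s;
        } else if (star != npos) {
            p = star + 1;                // backtrack: the last '*' absorbs
            s = ++mark;                  // one more character of str
        } else {
            return false;                // mismatch and no '*' to fall back on
        }
    }
    while (p < pat.size() && pat[p] == '*')
        ++p;                             // trailing stars match the empty rest
    return p == pat.size();
}
```

Because only the latest "*" is ever revisited, the work is bounded by roughly the pattern length times the string length. For the test strings in the post, wildcard_match("*s?r*g", "a string") returns true and wildcard_match("*st*g", "spring") returns false.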
