473,395 Members | 1,677 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

Pattern match question

BCC
What is the best way to find out how many times a pattern is matched in a
string? For example if I have:

std::string s = "AAAAABBAAAABBAAAABAAABB";

I want to know how many "BB" patterns there are... is there a function that
exists to do this already or do I need to roll my own?

Thanks,
B
Jul 22 '05 #1
6 2134
BCC wrote:
What is the best way to find out how many times a pattern is matched in a
string? For example if I have:

std::string s = "AAAAABBAAAABBAAAABAAABB";

I want to know how many "BB" patterns there are... is there a function that
exists to do this already or do I need to roll my own?


You could write a [simple enough] predicate for 'count_if', given that
"BBB" has two 'BB' patterns, or you need to roll your own if "BBB" should
only have one.

V
Jul 22 '05 #2
Pattern matching is done with something called regular language recognition.

If you're not familiar with this, read up on "deterministic finite
automata". By implementing a little state machine, you could do what you
want.
"BCC" <br***@akanta.com> wrote in message
news:Zq*****************@newssvr29.news.prodigy.co m...
What is the best way to find out how many times a pattern is matched in a
string? For example if I have:

std::string s = "AAAAABBAAAABBAAAABAAABB";

I want to know how many "BB" patterns there are... is there a function that exists to do this already or do I need to roll my own?

Thanks,
B

Jul 22 '05 #3

"BCC" <br***@akanta.com> wrote in message
news:Zq*****************@newssvr29.news.prodigy.co m...
What is the best way to find out how many times a pattern is matched in a
string? For example if I have:

std::string s = "AAAAABBAAAABBAAAABAAABB";

I want to know how many "BB" patterns there are... is there a function that exists to do this already or do I need to roll my own?

Thanks,
B


Please forgive and ignore my earlier inadvertent top-post. Here's my
response again, formatted properly:
Pattern matching is done with something called regular language recognition.

If you're not familiar with this, read up on "deterministic finite
automata". By implementing a little state machine, you could do what you
want.

Jul 22 '05 #4
"Victor Bazarov" <v.********@comAcast.net> wrote in message news:0KqWc.189
What is the best way to find out how many times a pattern is matched in a string? For example if I have:

std::string s = "AAAAABBAAAABBAAAABAAABB";

I want to know how many "BB" patterns there are... is there a function that exists to do this already or do I need to roll my own?


You could write a [simple enough] predicate for 'count_if', given that
"BBB" has two 'BB' patterns, or you need to roll your own if "BBB" should
only have one.


How to do this? The predicate of count_if can only check one char at a
time. I suppose one could do

class IsBB {
public:
explicit IsBB(const char * expect);
bool operator()(const char& arg) const {
const char * a = &lhs;
return strncmp(a, d_expect, strlen(d_expect));
}
private:
const char * d_expect;
};

But it's ugly because in operator()(const char&) we assume that the original
container is a continuous array of chars, which is only true for C style
strings and std::vector<char>, but not std::string and other more exotic
containers.

One could use std::search though.

unsigned int count(const string& s, const string& expect) {
unsigned int count = 0;
typedef string::const_iterator Iter;
Iter iter = s.begin();
const Iter end = s.end();
while (iter!=end) {
Iter find = std::search(iter, end, expect.begin(), expect.end());
if (find == end) break;
++count;
/* either one of the below */
++iter;
iter += expect-end() - expect.begin();
}
return count;
}

Jul 22 '05 #5
"Dave" <be***********@yahoo.com> wrote in message
"BCC" <br***@akanta.com> wrote in message Pattern matching is done with something called regular language recognition.
If you're not familiar with this, read up on "deterministic finite
automata". By implementing a little state machine, you could do what you
want.


The std::search is worst case O(n*m) where n is the length of the string we
are searching and m is the length of the expected string. It's average case
is O(n) though.

The automata would be best and worst case O(n). But we have to construct
the automata which is I think is at least O(m^2), and traversing each char
in the string is possibly a little longer. But regular text editors usually
use this elaborate approach.
Jul 22 '05 #6

"BCC" <br***@akanta.com> wrote in message news:Zq*****************@newssvr29.news.prodigy.co m...
What is the best way to find out how many times a pattern is matched in a
string? For example if I have:

std::string s = "AAAAABBAAAABBAAAABAAABB";

I want to know how many "BB" patterns there are... is there a function that
exists to do this already or do I need to roll my own?

Thanks,
B


Something like?

--------- C++ code ---------
#include <string>
#include <iostream>
#include <cassert>
using namespace std;

typedef unsigned long ulong;

ulong get_pattern_times (const string& str_i, const string& pattern_i)
{
assert (!pattern_i.empty());

ulong ret = 0;
ulong pos = 0;

while ((pos = str_i.find (pattern_i, pos)) != string::npos)
{
ret++;
if (++pos == str_i.size()) break;
}

return ret;

}

int main()
{
cout << get_pattern_times ("AAAAABBBAAAABBAAAABAAABB", "BB") << endl;
return 0;
}
----------------------------
--
Alex Vinokur
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn


Jul 22 '05 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Fabian | last post by:
Hi all there, Sorry for this newbee question but how comes that the following pattern: $r = "%<td valign=top><a href=\"(+?)\"(.*?)>%"; does not return any result, while the pattern: $r = "%<a...
9
by: Tjerk Wolterink | last post by:
I have an xsl file wich xsl:includes this file: <?xml version="1.0" encoding="ISO-8859-1"?> <xsl:stylesheet version="1.0" xmlns="http://www.w3.org/1999/xhtml"...
4
by: aevans1108 | last post by:
expanding this message to microsoft.public.dotnet.xml Greetings Please direct me to the right group if this is an inappropriate place to post this question. Thanks. I want to format a...
5
by: Terry Olsen | last post by:
Is there a good way to find a pattern of bytes/chars in a stream? I've got a serial port connected to a tcp port. I need to be able to catch a unique character string in the stream so that I can...
4
by: Jéjé | last post by:
Hi, I have a file which contain 1 pair of values by line like: Name1=Value1 = I nned to store these pair of values in a sortedlist. So the result expected for the 2 samples lines is: Key ...
10
by: greatprovider | last post by:
i'm starting with a string such as "Na**3C**6H**5O**7*2H**20" im attempting to match all **\d+ ...once i can match all the double asterix \d i intend to wrap the \d in "<sub>" tags for display...
8
by: sherifffruitfly | last post by:
Hi, I've been searching as best I can for this - coming up with little. I have a file that is full of lines fitting this pattern: (?<year>\d{4}),(?<amount>\d{6,7}) I'm likely to get a...
19
by: konrad Krupa | last post by:
I'm not expert in Pattern Matching and it would take me a while to come up with the syntax for what I'm trying to do. I hope there are some experts that can help me. I'm trying to match...
3
by: konrad Krupa | last post by:
This message is a continuation of my previous post "Pattern Match" Doug - Thank you for your help. Doug Semler was able to solve my problem to some point but I still need some help. Doug's...
4
by: mosesdinakaran | last post by:
Can any one explain how the rule is applied for the following Regular expression $Str = 'the red king'; $Pattern = '/((red|white) (king|queen))/'; preg_match($Pattern,$Str,$Val); Result:
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.