excluding search string in regular expressions

Hello,

The problem:

Find only those occurrences where the line contains '::' and
the preceding line is not equal to '**/',

so 2) and 3) should be found and 1) should not.

1)
"""
**/
void C::B
"""

2)
"""

void C::B
"""

3)
"""
*/
void C::B
"""

I tried something like
"\*\*/\n.*::"

but this does the opposite.

So my question is: how can I exclude a pattern?

Single characters can be excluded with [^ab], but I need not(ab):

not_this_brace_pattern(\*\*/\n).*::

Thank you in advance,
--
Franz Steinhaeusler
Jul 18 '05 #1
On Thu, 21 Oct 2004 13:36:46 +0200, Franz Steinhaeusler
<fr*****************@utanet.at> wrote:

> single characters with [^ab] but I need not(ab)
>
> not_this_brace_pattern(\*\*/\n).*::

Sorry,
is this the solution (simply concatenating [^*][^*][^/]\n.*::)?
The background:
I want to scan a cpp file to check whether its functions already have a doxygen comment.
It should find all positions where one is missing:

ok

doxygen comment
**/
void CBs::InitButtonPanel (int progn1, int progn2)

The problem is finding the method or function definition, and for
that I need a regex.
It should ignore blabla::InitButtonPanel(a, b);

So one indicator is that if there is a semicolon at the end,
it is not a function or method definition.

So I would need
[^*][^*][^/]\n.*[)]*[^;]
but this is not working.

Thank you again in advance!
--
Franz Steinhaeusler
Jul 18 '05 #2
Franz Steinhaeusler wrote:
> On Thu, 21 Oct 2004 13:36:46 +0200, Franz Steinhaeusler
> <fr*****************@utanet.at> wrote:
>
> > single characters with [^ab] but I need not(ab)
> >
> > not_this_brace_pattern(\*\*/\n).*::
>
> Sorry,
> is this the solution (simply concatenating
> [^*][^*][^/]\n.*::)?

That should do, though it's admittedly far from elegant; I, too, would like to see a nicer solution.

> The background:
> I want to scan a cpp file to check whether its functions already
> have a doxygen comment. It should find all positions where
> one is missing:
>
> ok
>
> doxygen comment
> **/
> void CBs::InitButtonPanel (int progn1, int progn2)

In this case, I'd replace \n with \s*, meaning any amount of whitespace.
Jul 18 '05 #3
On Thu, 21 Oct 2004 14:40:24 +0200, "Mitja" <nu*@example.com> wrote:
> Franz Steinhaeusler wrote:
> > On Thu, 21 Oct 2004 13:36:46 +0200, Franz Steinhaeusler
> > <fr*****************@utanet.at> wrote:
> >
> > > single characters with [^ab] but I need not(ab)
> > >
> > > not_this_brace_pattern(\*\*/\n).*::
> >
> > Sorry,
> > is this the solution (simply concatenating
> > [^*][^*][^/]\n.*::)?
>
> That should do, though it's admittedly far from elegant; I, too, would like to see a nicer solution.
>
> > The background:
> > I want to scan a cpp file to check whether its functions already
> > have a doxygen comment. It should find all positions where
> > one is missing:
> >
> > ok
> >
> > doxygen comment
> > **/
> > void CBs::InitButtonPanel (int progn1, int progn2)
>
> In this case, I'd replace \n with \s*, meaning any amount of whitespace.

Hello, thank you.

Oh, that is not quite right (for finding a C function/method definition):

[^*][^*][^/]\s*.*[)]*[^;]

would also find

if func()
{

A more general solution for detecting functions/methods would be nice:

[^*][^*][^/]\s*--c-method/function/definition
--
Franz Steinhaeusler
Jul 18 '05 #4
Mitja <nu*@example.com> wrote:
> Franz Steinhaeusler wrote:
> > Franz Steinhaeusler wrote:
> > > [...]
> > > single characters with [^ab] but I need not(ab)
> > >
> > > not_this_brace_pattern(\*\*/\n).*::
> >
> > Sorry,
> > is this the solution (simply concatenating
> > [^*][^*][^/]\n.*::)?
>
> That should do, though it's admittedly far from elegant; I, too,
> would like to see a nicer solution.


It won't work correctly. Franz needs a sub-expression that
matches anything which is not "**/". However, [^*][^*][^/]
is a character-wise negation, not word-wise. It doesn't
match "**/", but neither does it match "xx/", nor any other
string which has only one or two of the characters at the
right position.
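
To make the point above concrete, here is a small sketch that runs the character-class attempt against the three samples from the original question. The dictionary name `samples` and the variable names are only illustrative; the pattern itself is the one quoted in the thread.

```python
import re

# The character-class attempt quoted above.
naive = re.compile(r"[^*][^*][^/]\n.*::")

# The three samples from the original question, keyed by their number.
samples = {
    1: '"""\n**/\nvoid C::B\n"""',
    2: '"""\n\nvoid C::B\n"""',
    3: '"""\n*/\nvoid C::B\n"""',
}

# True means the pattern found a match somewhere in the sample.
naive_hits = {n: bool(naive.search(text)) for n, text in samples.items()}
print(naive_hits)
```

Sample 3 is rejected although it should be found: the line "*/" has a '*' in the second checked position, so the second [^*] refuses it even though the line is not "**/".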

What you need is a "negative look-behind assertion". The
following Python-RE will do: (?<!\*\*/)\n.*::
Remember to use raw string notation, or you need to double
the backslashes:

import re

my_re_str = r"(?<!\*\*/)\n.*::"
my_re_obj = re.compile(my_re_str)

Note that you might want to use \s* instead of \n, so any
amount of whitespace (including newlines) is matched, not
just one single newline.

For more information about regular expressions supported by
Python, refer to the Library Reference manual:

http://docs.python.org/lib/re-syntax.html
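
As a rough check of the look-behind pattern, the sketch below runs it against the three samples from the original question. The names `strict`, `loose` and `samples` are made up for this illustration. It also tries the \s* variant, which seems worth testing before adopting: because \s* may match nothing, the search can start at the beginning of the '::' line itself, where the text just before the match is "*/" plus a newline rather than "**/".

```python
import re

strict = re.compile(r"(?<!\*\*/)\n.*::")   # pattern as given above
loose = re.compile(r"(?<!\*\*/)\s*.*::")   # same pattern with \n swapped for \s*

# The three samples from the original question, keyed by their number.
samples = {
    1: '"""\n**/\nvoid C::B\n"""',
    2: '"""\n\nvoid C::B\n"""',
    3: '"""\n*/\nvoid C::B\n"""',
}

strict_hits = {n: bool(strict.search(text)) for n, text in samples.items()}
loose_hits = {n: bool(loose.search(text)) for n, text in samples.items()}

print(strict_hits)  # sample 1 is excluded, 2 and 3 are found
print(loose_hits)   # sample 1 is found as well, which is not wanted
```

With these samples the \n version behaves as requested, while the \s* version no longer excludes sample 1. Keeping a mandatory \n in the pattern anchors the look-behind to the end of the previous line.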

Best regards
Oliver

--
Oliver Fromme, Konrad-Celtis-Str. 72, 81369 Munich, Germany

``All that we see or seem is just a dream within a dream.''
(E. A. Poe)
Jul 18 '05 #5
> A more general solution for detecting functions/methods would be nice.

Maybe you should go for a real parser here - together with a
C-syntax-grammar. Trying to cram this stuff into regexps is bound to
miss special cases. And it's generally difficult to have a regexp _not_
match a certain word.

Another approach would be to look for closing comments and function
definitions in several rexes, and use python-logic:

if doxy_close_rex.match(line):
    line = lines.next()
    if fun_def_rex.match(line):
        ....
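
A minimal sketch of this two-regex idea, written for current Python, might look as follows. The names `doxy_close_rex` and `fun_def_rex` come from the snippet above, but both patterns, the helper `undocumented_definitions` and the sample source are assumptions made for illustration. In particular, the definition pattern is only a heuristic: a line that starts in column 0, contains `::name(...)` and ends with the closing parenthesis.

```python
import re

# Assumed pattern: a line consisting only of the doxygen terminator '**/'.
doxy_close_rex = re.compile(r'^\s*\*\*/\s*$')
# Assumed heuristic: unindented line with '::name(...)' and no trailing ';'.
fun_def_rex = re.compile(r'^\S.*::\w+\s*\(.*\)\s*$')

def undocumented_definitions(lines):
    """Yield (line_number, text) for definitions not preceded by '**/'."""
    previous = ''
    for number, line in enumerate(lines, start=1):
        text = line.rstrip('\n')
        if fun_def_rex.match(text) and not doxy_close_rex.match(previous):
            yield number, text
        previous = text

# Hypothetical input: one documented and one undocumented method.
source = """\
/** doxygen comment
**/
void CBs::InitButtonPanel (int progn1, int progn2)
{
    blabla::InitButtonPanel(a, b);
}

void CBs::Undocumented (int x)
{
}
"""

hits = list(undocumented_definitions(source.splitlines()))
print(hits)
```

Here only the second method is reported. The indented call with a trailing semicolon is skipped by the heuristic, and the first definition is skipped because the line before it is '**/'. Definitions split over several lines, or with the opening brace on the same line, would need a looser pattern.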
--
Regards,

Diez B. Roggisch
Jul 18 '05 #6
On Thu, 21 Oct 2004 13:36:46 +0200, Franz Steinhaeusler <fr*****************@utanet.at> wrote:
> Hello,
>
> The problem:
>
> Find only those occurrences where the line contains '::' and
> the preceding line is not equal to '**/',
>
> so 2) and 3) should be found and 1) should not.
>
> 1)
> """
> **/
> void C::B
> """
>
> 2)
> """
>
> void C::B
> """
>
> 3)
> """
> */
> void C::B
> """
>
> I tried something like
> "\*\*/\n.*::"
>
> but this does the opposite.
>
> So my question is: how can I exclude a pattern?
>
> Single characters can be excluded with [^ab], but I need not(ab):
>
> not_this_brace_pattern(\*\*/\n).*::
>
> Thank you in advance,


To look back a line, I think I'd just use a generator, and test current
and last lines to get what I wanted. E.g., perhaps you can adapt this:
(I am just going literally by
"""
Find only those occurrences where the line contains '::' and
the preceding line is not equal to '**/'
"""
which doesn't need a regex)
>>> def findem(lineseq):
...     getline = iter(lineseq).next
...     curr = getline().rstrip()
...     while True:
...         last, curr = curr, getline().rstrip()
...         if '::' in curr and last != '**/': yield curr
...

I made a file, modifying your data a little:
>>> print '----\n%s----'% file('franz.txt').read()
----
1)
"""
**/
void C::B -- no (1)
"""

2)
"""

void C::B -- yes (2)
"""

3)
"""
*/
void C::B -- yes (3)
"""
----

Here's what the generator returns:
>>> for line in findem(file('franz.txt')): print repr(line)
...
'void C::B -- yes (2)'
'void C::B -- yes (3)'
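
The session above is Python 2. On Python 3 iterators have no `.next` attribute, and letting StopIteration escape from a generator is an error since Python 3.7, so a rough equivalent could be written with a plain for loop. This is only a sketch of the same idea; the in-memory `text` stands in for the file franz.txt.

```python
def findem(lineseq):
    """Yield lines containing '::' whose preceding line is not '**/'."""
    last = None
    for raw in lineseq:
        curr = raw.rstrip()
        # The first line has no predecessor and is never reported,
        # which mirrors the behaviour of the generator above.
        if last is not None and '::' in curr and last != '**/':
            yield curr
        last = curr

# Same content as the franz.txt shown above, kept in a string here.
text = '''1)
"""
**/
void C::B -- no (1)
"""

2)
"""

void C::B -- yes (2)
"""

3)
"""
*/
void C::B -- yes (3)
"""
'''

found = list(findem(text.splitlines()))
for line in found:
    print(repr(line))
```

Passing an open file object instead of `text.splitlines()` works the same way, since the function only needs an iterable of lines.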
Regards,
Bengt Richter
Jul 18 '05 #7
On Thu, 21 Oct 2004 15:32:37 +0200, "Diez B. Roggisch"
<de************@web.de> wrote:

> > A more general solution for detecting functions/methods would be nice.
>
> Maybe you should go for a real parser here - together with a
> C-syntax-grammar. Trying to cram this stuff into regexps is bound to
> miss special cases. And it's generally difficult to have a regexp _not_
> match a certain word.

Hello Diez,

thanks, yes, it is difficult to "not" find a search string with a regex ;)

I only want a regex for an editor (which is written in Python),
so that it has a general function for finding a function/method
definition (of course it cannot be as accurate as a parser).

> Another approach would be to look for closing comments and function
> definitions in several rexes, and use python-logic:
>
> if doxy_close_rex.match(line):
>     line = lines.next()
>     if fun_def_rex.match(line):
>         ....



--
Franz Steinhaeusler
Jul 18 '05 #8
On 21 Oct 2004 13:28:28 GMT, Oliver Fromme <ol**@haluter.fromme.com>
wrote:
> Mitja <nu*@example.com> wrote:
> > Franz Steinhaeusler wrote:
> > > Franz Steinhaeusler wrote:
> > > > [...]
> > > > single characters with [^ab] but I need not(ab)
> > > >
> > > > not_this_brace_pattern(\*\*/\n).*::
> > >
> > > Sorry,
> > > is this the solution (simply concatenating
> > > [^*][^*][^/]\n.*::)?
> >
> > That should do, though it's admittedly far from elegant; I, too,
> > would like to see a nicer solution.

Hello Oliver,

> It won't work correctly. Franz needs a sub-expression that
> matches anything which is not "**/". However, [^*][^*][^/]
> is a character-wise negation, not word-wise. It doesn't
> match "**/", but neither does it match "xx/", nor any other
> string which has only one or two of the characters at the
> right position.

yes, you are right, the approach above is wrong.

> What you need is a "negative look-behind assertion".

??, sounds interesting ;)

> The following Python-RE will do: (?<!\*\*/)\n.*::
> Remember to use raw string notation, or you need to double
> the backslashes:
>
> my_re_str = r"(?<!\*\*/)\n.*::"
> my_re_obj = re.compile(my_re_str)
>
> Note that you might want to use \s* instead of \n, so any
> amount of whitespace (including newlines) is matched, not
> just one single newline.
>
> For more information about regular expressions supported by
> Python, refer to the Library Reference manual:
>
> http://docs.python.org/lib/re-syntax.html

(?<!...)
Matches if the current position in the string is not preceded by a
match for ...

That is it.

Many thanks for your helpful reply,

--
Franz Steinhaeusler
Jul 18 '05 #9
On Thu, 21 Oct 2004 22:38:00 GMT, bo**@oz.net (Bengt Richter) wrote:

> To look back a line, I think I'd just use a generator, and test current
> and last lines to get what I wanted. E.g., perhaps you can adapt this:
> (I am just going literally by
> """
> Find only those occurrences where the line contains '::' and
> the preceding line is not equal to '**/'
> """
> which doesn't need a regex)
>
> >>> def findem(lineseq):
> ...     getline = iter(lineseq).next
> ...     curr = getline().rstrip()
> ...     while True:
> ...         last, curr = curr, getline().rstrip()
> ...         if '::' in curr and last != '**/': yield curr
> ...
> [...]
>
> Regards,
> Bengt Richter


Hello Bengt,

thank you for suggesting this interesting approach,

regards
--
Franz Steinhaeusler
Jul 18 '05 #10
