How can this Perl regular expression be expressed in Python?

Here's a large Perl regular expression, from a Perl address parser in CPAN:

use re 'eval';
$Addr_Match{street} = qr/
(?:
# special case for addresses like 100 South Street
(?:($Addr_Match{direct})\W+ (?{ $_{street} = $^N })
($Addr_Match{type})\b (?{ $_{type} = $^N }))
|
(?:($Addr_Match{direct})\W+ (?{ $_{prefix} = $^N }))?
(?:
([^,]+) (?{ $_{street} = $^N })
(?:[^\w,]+($Addr_Match{type})\b (?{ $_{type} = $^N }))
(?:[^\w,]+($Addr_Match{direct})\b (?{ $_{suffix} = $^N }))?
|
([^,]*\d) (?{ $_{street} = $^N })
($Addr_Match{direct})\b (?{ $_{suffix} = $^N })
|
([^,]+?) (?{ $_{street} = $^N })
(?:[^\w,]+($Addr_Match{type})\b (?{ $_{type} = $^N }))?
(?:[^\w,]+($Addr_Match{direct})\b (?{ $_{suffix} = $^N }))?
)
)
/ix;

I'm trying to convert this to Python.

Those entries like "$Addr_Match{direct}" are other regular expressions,
being used here as subexpressions. Those have already been converted
to forms like "Addr_Match.direct" in Python. But how to call them?
Is that possible in Python, and if so, where is it documented?

John Nagle
Feb 14 '07 #1
On Wed, 14 Feb 2007 01:07:33 -0300, John Nagle <na***@animats.com>
wrote:
> Those entries like "$Addr_Match{direct}" are other regular expressions,
> being used here as subexpressions. Those have already been converted
> to forms like "Addr_Match.direct" in Python. But how to call them?
> Is that possible in Python, and if so, where is it documented?

That would be string interpolation, like this:

Addr_Match = {"direct": "some_re_string",
              "type": "other_re"
             }

regexp = "%(direct)s %(type)s" % Addr_Match
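
To make the interpolation idea concrete, here is a minimal sketch of just the "100 South Street" special case from the Perl pattern. The two sub-patterns are shortened, hypothetical stand-ins (the real lists in the CPAN module are much longer), and Python named groups are used where the Perl code stores captures through its (?{ ... }) blocks.

```python
import re

# Hypothetical stand-ins for the real sub-patterns; the CPAN module's
# actual lists of directions and street types are much longer.
Addr_Match = {
    "direct": r"north|south|east|west|[nsew]",
    "type": r"street|st|avenue|ave|road|rd",
}

# Perl's /x and /i flags map to re.VERBOSE and re.IGNORECASE.
# Wrapping each inserted string in a group keeps its alternation
# from leaking into the surrounding pattern.
special_case = re.compile(
    r"""
    (?P<street>%(direct)s) \W+   # e.g. "South" used as the street name
    (?P<type>%(type)s) \b        # e.g. "Street"
    """ % Addr_Match,
    re.VERBOSE | re.IGNORECASE,
)

m = special_case.search("100 South Street")
print(m.groupdict())
```

One caveat with the % operator: a literal percent sign anywhere in the pattern has to be written as %% or the formatting step will fail.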

--
Gabriel Genellina

Feb 14 '07 #2
Gabriel Genellina wrote:
> That would be string interpolation, like this:
>
> Addr_Match = {"direct": "some_re_string",
>               "type": "other_re"
>              }
>
> regexp = "%(direct)s %(type)s" % Addr_Match

You're right. I looked at the Perl code, and the strings are just being
inserted, not precompiled as regular expressions and called.
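
If a sub-pattern has already been compiled on the Python side, the same "insert the string" approach still works, because a compiled pattern keeps its source text in its .pattern attribute. A small sketch, with a hypothetical sub-pattern and example address:

```python
import re

# Hypothetical sub-pattern, compiled on its own so it can also be used directly.
direct = re.compile(r"north|south|east|west", re.IGNORECASE)

# A compiled pattern cannot be "called" from inside another pattern, but its
# source text is available as .pattern and can be interpolated like any string.
# Flags on the sub-pattern are not carried over; set them on the outer pattern.
with_number = re.compile(r"(\d+)\s+(%s)\b" % direct.pattern, re.IGNORECASE)

m = with_number.match("100 South Street")
print(m.groups())
```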

Incidentally, does anybody know what "$^N" means in Perl? That
abbreviation isn't in the list of special variables.

John Nagle
Feb 14 '07 #3
On Wed, 14 Feb 2007 04:11:37 -0300, John Nagle <na***@animats.com>
wrote:
> Incidentally, does anybody know what "$^N" means in Perl? That
> abbreviation isn't in the list of special variables.

From the context it appears to be the "last matched group", or something
like that... but it's best to look for an authoritative answer.
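
For reference, perlvar describes $^N as the text matched by the most recently closed capture group, so each (?{ $_{street} = $^N }) block stores the group just before it under that key. Python's re has no embedded code blocks. One way to get a similar result, sketched here with shortened hypothetical sub-patterns and only two of the alternatives, is to give each alternative its own numbered group names and merge them after the match:

```python
import re

# Hypothetical, shortened sub-patterns for illustration only.
DIRECT = r"north|south|east|west"
TYPE = r"street|st|avenue|ave|road|rd"

# Python's re rejects a group name that appears twice in one pattern, so each
# alternative gets a numbered name (street1, street2) that is merged afterwards.
street_re = re.compile(
    r"""
    (?:
        (?P<street1>{direct}) \W+ (?P<type1>{type}) \b
      |
        (?P<street2>[^,]+?) [^\w,]+ (?P<type2>{type}) \b
    )
    """.format(direct=DIRECT, type=TYPE),
    re.VERBOSE | re.IGNORECASE,
)


def parse_street(text):
    """Return the captured fields as a dict, or None if nothing matches."""
    m = street_re.match(text)
    if m is None:
        return None
    fields = {}
    for name, value in m.groupdict().items():
        if value is not None:
            # Strip the trailing digit: "street2" -> "street".
            fields[name.rstrip("0123456789")] = value
    return fields


print(parse_street("South Street"))
print(parse_street("Main St"))
```

The numbered names are only a workaround for the duplicate-name restriction; the merged dict plays the role of the hash that the Perl code fills in as it matches.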

--
Gabriel Genellina

Feb 14 '07 #4
