Conversion of perl based regex to python method

I have two Perl expressions
If windows:

perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge" somefile.txt

If posix

perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt

The [^\w\s] is a negated character class: it matches any character
that is not a-zA-Z0-9_ and not whitespace (space, tab, newline).

The () captures whatever matches and throws it into the $1 for
processing by the sprintf

In this case the format is %%%2X, which gives a literal % followed by
the character code in hex: three characters in total.
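
As a quick check of that format string, here is a small sketch in Python, whose % operator follows the same printf conventions. The helper name percent_hex and the sample characters are only illustrations, not part of the original one-liner.

```python
# "%%" emits a literal percent sign; "%2X" is uppercase hex, space-padded to width 2.
def percent_hex(ch):
    return "%%%2X" % ord(ch)

print(percent_hex("!"))   # %21
print(percent_hex("#"))   # %23

# Width 2 pads with a space, not a zero, so code points below 0x10
# would need "%02X" to come out as two hex digits.
print(repr("%%%2X" % 0x0A))    # '% A'
print(repr("%%%02X" % 0x0A))   # '%0A'
```

For the punctuation characters the pattern normally catches, the two formats give the same result; the difference only shows up for control characters.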

How would you convert this to a python equivalent using the re or
similar module?

I've begun reading about using re expressions at
http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation.

Any help you can provide would be greatly appreciated.

Thanks,
Andy
May 24 '06 #1
Andrew Robert wrote:

Okay.. I got part of it..

The code/results below seem to do the first part of the expression.

I believe the next part is iterating across each of the characters,
evaluating the results and replacing with hex as needed.

# Import the module
import re

# Open test file
file = open(r'm:\mq\mq\scripts\testme.txt', 'r')

# Read in a sample line
line = file.readline()

# Compile expression matching anything that is not a word character or whitespace
pattern = re.compile(r'[^\w\s]')

# Look to see if I can find a non-standard character
# from test line #! C:\Python24\Python

var = pattern.match('!')

# gotcha!
print(var)
<_sre.SRE_Match object at 0x009DA8E0>

# I got
print(var.group())

!

# See if pattern will come back with something it shouldn't
var = pattern.match('C')
print(var)

# I got
None
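
One thing worth noting about the experiment above: pattern.match() only tests the start of the string, so it will not find a special character further along a line. The sketch below (Python 3 syntax, with a made-up sample line modelled on the test line in the comment) shows findall() visiting every match instead.

```python
import re

pattern = re.compile(r"[^\w\s]")
sample = "#! C:\\Python24\\Python"   # hypothetical test line

# match() anchors at position 0, so it only sees the leading "#".
print(pattern.match(sample).group())      # #

# A line starting with a letter fails to match even though ":" occurs later.
print(pattern.match("C:\\Python24"))      # None

# findall() scans the whole string and returns every non-word, non-space character.
print(pattern.findall(sample))            # ['#', '!', ':', '\\', '\\']
```

This is why re.sub() is a better fit for the replacement step: like findall(), it processes every match in the line.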

Instead of being so linear, I was thinking that this might be closer.
Got to figure out the hex line but then we are golden
# Evaluate captured character as hex
def ret_hex(ch):
    return chr((ord(ch) + 1) % )

# Evaluate the value of whatever was matched
def eval_match(match):
    return ret_hex(match.group(0))

# open file
file = open(r'm:\mq\mq\scripts\testme.txt','r')

# Read each line, pass any matches on line to function
for line in file.readlines():
    re.sub('[^\w\s]',eval_match, line)
May 24 '06 #2
Andrew Robert wrote:

Wanted:
perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt
Got:
# Evaluate captured character as hex
def ret_hex(ch):
    return chr((ord(ch) + 1) % )

Make it compile at least before posting :-)

# Evaluate the value of whatever was matched
def eval_match(match):
    return ret_hex(match.group(0))

# open file
file = open(r'm:\mq\mq\scripts\testme.txt','r')

# Read each line, pass any matches on line to function
for line in file.readlines():
    re.sub('[^\w\s]',eval_match, line)


for line in file:
    ...

without readlines() is better because it doesn't read the whole file into
memory first. If you want to read data from files passed as commandline
args or from stdin you can use fileinput.input():

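import re
import sys
import fileinput

# Called once per match; returns the replacement text
def replace(match):
    return "%%%2X" % ord(match.group(0))

# Reads files named on the command line, or stdin if there are none
for line in fileinput.input():
    sys.stdout.write(re.sub(r"[^\w\s]", replace, line))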
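
For anyone trying this under Python 3, here is a minimal sketch of the same idea wrapped in a function so it can be tried on a plain string. The names SPECIAL and encode_line and the sample text are made up for illustration, and the sketch uses %02X instead of %2X, which is a deliberate deviation from the Perl one-liner so that small code points are zero-padded. In Python 3, \w is Unicode-aware for str patterns, so accented letters are left alone unless re.ASCII is passed.

```python
import re

# Anything that is neither a word character nor whitespace
SPECIAL = re.compile(r"[^\w\s]")

def encode_line(line):
    """Replace each non-word, non-whitespace character with %XX (uppercase hex)."""
    return SPECIAL.sub(lambda m: "%%%02X" % ord(m.group(0)), line)

print(encode_line("a+b = c!"))   # a%2Bb %3D c%21
```

Letters, digits, underscores and whitespace pass through unchanged, so a line without special characters comes back exactly as it went in.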

Peter

May 25 '06 #3
