Andrew Robert wrote:
I have two Perl expressions
On Windows:
perl -ple "s/([^\w\s])/sprintf(q#%%%2X#, ord $1)/ge" somefile.txt
On POSIX:
perl -ple 's/([^\w\s])/sprintf("%%%2X", ord $1)/ge' somefile.txt
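The [^\w\s] is a negated character class: it matches any character that
is not a word character (a-zA-Z0-9_) or whitespace (space, tab, newline
and so on), so those characters are left alone.
The () captures whatever matches and puts it into $1 for
processing by the sprintf.
The format here is %%%2X: %% is a literal percent sign and %2X is the
character code in uppercase hex, which gives a three-character value
such as %21.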
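To make the format string concrete, here is a small sketch of what the sprintf call produces for a single character. Python's % operator follows the same printf conventions, so the format string carries over unchanged. The helper name percent_hex is made up for illustration and does not appear in the Perl.

```python
def percent_hex(ch):
    # "%%" is a literal percent sign; "%2X" is uppercase hex,
    # space-padded to a minimum width of 2
    return "%%%2X" % ord(ch)

print(percent_hex("!"))   # %21
print(percent_hex("~"))   # %7E
```

Note that %2X pads with a space, so a character code below 0x10 would come out as "% 1" rather than "%01". If zero padding is what is wanted, %02X is the usual spelling, but that would be a change from the Perl shown above.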
How would you convert this to a Python equivalent using the re or a
similar module?
I've begun reading about using re expressions at
http://www.amk.ca/python/howto/regex/ but I am still hazy on implementation.
Any help you can provide would be greatly appreciated.
Thanks,
Andy
Okay.. I got part of it..
The code and results below seem to handle the first part of the expression.
I believe the next part is iterating across each of the characters,
evaluating the results and replacing with hex as needed.
# Import the module
import re
# Open test file
file=open(r'm:\mq\mq\scripts\testme.txt','r')
# Read in a sample line
line=file.readline()
# Compile an expression matching any character that is not a word
# character or whitespace (raw string so the backslashes reach re intact)
pattern=re.compile(r'[^\w\s]')
# Look to see if I can find a non-standard character
# from test line #! C:\Python24\Python
var=pattern.match('!')
# gotcha!
print var
<_sre.SRE_Match object at 0x009DA8E0>
# I got
print var.group()
!
# See if pattern will come back with something it shouldn't
var =pattern.match('C')
print var
#I got
None
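One thing the test above does not show is that match() only looks at the start of the string, so it would also return None for 'C!' even though a non-standard character is present. A minimal sketch of the difference, using the same pattern and the test line from the comment above:

```python
import re

pattern = re.compile(r'[^\w\s]')
line = '#! C:\\Python24\\Python'

# match() only tests the very start of the string
print(pattern.match('C!'))            # None, because 'C' is a word character

# search() scans forward to the first hit anywhere in the string
print(pattern.search('C!').group())   # !

# findall() returns every hit, in order
hits = pattern.findall(line)
print(hits)
```

For replacing every occurrence on a line, neither match() nor search() is needed directly; sub() does the scanning itself.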
Instead of being so linear, I was thinking that this might be closer.
I still have to figure out the hex line, but then we are golden.
# Evaluate captured character as hex
def ret_hex(ch):
    # Unfinished: the right-hand operand of % is still missing
    return chr((ord(ch) + 1) % )
# Evaluate the value of whatever was matched
def eval_match(match):
    return ret_hex(match.group(0))
# open file
file = open(r'm:\mq\mq\scripts\testme.txt','r')
# Read each line, pass any matches on line to function
for line in file.readlines():
    re.sub(r'[^\w\s]',eval_match, line)
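For comparison, here is one way the whole substitution might look once the hex line is filled in. This is only a sketch under the assumption that the goal is to reproduce the Perl sprintf("%%%2X", ord $1) exactly; the names NON_WORD, to_hex and encode_line are invented for the example.

```python
import re

# Same character class as the Perl one-liner: anything that is
# neither a word character nor whitespace
NON_WORD = re.compile(r'[^\w\s]')

def to_hex(match):
    # Mirror Perl's sprintf("%%%2X", ord $1): a literal percent sign
    # followed by the character code in uppercase hex
    return '%%%2X' % ord(match.group(0))

def encode_line(line):
    # sub() calls to_hex once per match and returns a new string
    return NON_WORD.sub(to_hex, line)

print(encode_line('#! C:\\Python24\\Python'))
# %23%21 C%3A%5CPython24%5CPython
```

Because sub() returns a new string rather than changing line in place, the result has to be printed or stored for the substitution to have any visible effect. Applying encode_line to each line read from the file and printing the result should roughly match what perl -p does.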