Florian Schulze wrote:
See the following results:
Python 2.3.5 (#62, Feb 8 2005, 16:23:02) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> s = "1"
>>> re.sub('1','\\n',s)
'\n'
>>> '\\n'
'\\n'
>>> re.sub('1',r'\\n',s)
'\\n'
>>> s.replace('1','\\n')
'\\n'
>>> repl = '\\n'
>>> re.sub('1',repl,s)
'\n'
>>> s.replace('1',repl)
'\\n'
Why is the behaviour of the regexp substitution so weird, and can I
prevent that? It breaks my assumptions and thus my code.
Regards,
Florian Schulze
"Why" questions are always tough to answer. E.g.: Why are we here?
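The answer to "what is happening" is much easier. The replacement string
passed to re.sub is processed twice: once by Python when it parses the
string literal, and again by the regex engine, which expands backslash
escapes such as \n in the replacement. str.replace does no second pass,
so escapes meant for re.sub must themselves be escaped. This is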
why raw strings were invented. If it weren't for these, I'd still be
using perl. In raw strings, as you have noticed, a '\' is already
escaped. In the olden days, you'd have to type "\\\\" to mean a literal
backslash, so creating a literal backslash in a regex that produced a
string that would then itself be used in a regex would be
'\\\\\\\\\\\\\\\\', which scared me away from Python for a couple of
years (remember, the final printed product would be '\').
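
To make the two passes concrete, here is a minimal sketch. It is written for a current Python 3 rather than the 2.3.5 in the question, and the helper name literal_sub is made up for illustration; it is not part of the re module. It shows the replacement being expanded by re.sub, and two ways that should prevent it: doubling the backslashes before the call, or passing a function as the replacement, whose return value is inserted as-is.

```python
import re

s = "1"
repl = '\\n'  # two characters: a backslash followed by 'n'

# re.sub treats a string replacement as a template, so backslash + 'n'
# is expanded into a single newline character.
processed = re.sub('1', repl, s)

# Option 1: double every backslash so the template yields a literal backslash.
escaped = re.sub('1', repl.replace('\\', '\\\\'), s)

# Option 2: pass a function; its return value is used without template expansion.
def literal_sub(pattern, replacement, text):
    """Hypothetical helper: substitute `replacement` verbatim."""
    return re.sub(pattern, lambda match: replacement, text)

verbatim = literal_sub('1', repl, s)

print(repr(processed))  # a one-character string: newline
print(repr(escaped))    # a two-character string: backslash, 'n'
print(repr(verbatim))   # same two-character string
```

With either option the result should match what s.replace('1', repl) gives.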
That patently doesn't answer your question, but here is something to ponder:
py> s.replace('1',repl)[0]
'\\'
py> print s.replace('1',repl)
\n
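
The session above hints that the doubled backslash is only how the interactive prompt displays the string (its repr), not what the string holds. A small sketch of the same check, again assuming Python 3, where print is a function; the variable names are my own:

```python
result = "1".replace('1', '\\n')

shown_by_prompt = repr(result)  # what the interactive prompt echoes
length = len(result)            # number of characters actually stored
first_char = result[0]          # the single backslash

print(shown_by_prompt)  # echoed form, with the backslash doubled
print(result)           # the actual contents: one backslash, then 'n'
print(length)
```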
James