Fernando Rodriguez wrote:
> I'm trying to write a regex that matches a \r char if and only if it
> is not followed by a \n (I want to translate text files from unix
> newlines to windows\dos).
Unix uses \n and Windows uses \r\n, so matching lone \r isn't
going to help you in the slightest... (read on)
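
for reference, here is a rough sketch of what the two "lone character" patterns could look like with lookaround assertions. It is written in Python 3 syntax, and the names lone_cr, lone_lf, unix_text and dos_text are made up for this illustration. It only shows why a lone-\r pattern finds nothing in Unix text; it is not meant as the recommended way to convert files.

```python
import re

# \r that is NOT followed by \n (what the question asks for)
lone_cr = re.compile(r'\r(?!\n)')
# \n that is NOT preceded by \r (what a Unix-to-Windows conversion needs)
lone_lf = re.compile(r'(?<!\r)\n')

unix_text = 'one\ntwo\n'

# Unix text contains no \r at all, so the lone-\r pattern finds nothing.
print(lone_cr.findall(unix_text))    # []

# Replacing every bare \n gives Windows line endings.
dos_text = lone_lf.sub('\r\n', unix_text)
print(repr(dos_text))                # 'one\r\ntwo\r\n'

# Text that already uses \r\n is left alone, so the step can be repeated.
print(lone_lf.sub('\r\n', dos_text) == dos_text)    # True

# The lone-\r pattern only fires on a \r that has no \n after it.
print(lone_cr.findall('a\rb\r\n'))   # ['\r']
```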
> I tried this, but it doesn't work:
> p = re.compile(r'(\r)[^\n]', re.IGNORECASE)
> it still matches a string such as r'\r\n'
really?
>>> import re
>>> p = re.compile(r'(\r)[^\n]', re.IGNORECASE)
>>> print p.match('\r\n')
None
>>> print p.match(r'\r\n')
None
on the other hand,
>>> print p.match('\rx')
<_sre.SRE_Match object at 0x0083B160>
>>> print p.match(r'\rx')
None
it might be a good idea to play a little more with ''-literals and
r''-literals (and print x and print repr(x)) until you understand
exactly how things work...
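
as a starting point for that kind of experiment, here is a small sketch that compares a plain literal with a raw one. It uses Python 3 syntax (print as a function), and the names plain and raw are only there for the illustration.

```python
import re

p = re.compile(r'(\r)[^\n]')

plain = '\r\n'   # 2 characters: carriage return, line feed
raw = r'\r\n'    # 4 characters: backslash, 'r', backslash, 'n'

print(len(plain), repr(plain))   # 2 '\r\n'
print(len(raw), repr(raw))       # 4 '\\r\\n'

# Neither string matches, but for different reasons:
print(p.match(plain))   # None: the \r is followed by \n
print(p.match(raw))     # None: the string starts with a backslash, not \r

# A carriage return followed by anything except \n does match.
m = p.match('\rx')
print(repr(m.group(0)))   # '\rx'
```

repr() shows the characters that are really in the string, which is usually the quickest way to see what a literal turned into.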
:::
> I want to translate text files from unix newlines to windows\dos
you don't need regular expressions for that; the easiest way to
convert any kind of line endings to the local format is to open the
source file with the "U" flag:
infile = open(filename, "rU") # universal line endings
outfile = open(outfilename, "w") # text mode is default
for s in infile: # copy every line, not just the first one
    outfile.write(s)
infile.close()
outfile.close()
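
the same idea can be sketched for Python 3, where the "U" flag is no longer needed because universal newlines are the default when reading text. The function name convert_newlines and its parameters are invented for this example, and it assumes the file can be decoded with the platform's default encoding.

```python
def convert_newlines(src_path, dst_path, newline="\r\n"):
    """Copy src_path to dst_path, writing every line ending as `newline`."""
    # newline=None (the default) enables universal newlines on input:
    # "\n", "\r" and "\r\n" all arrive in the program as "\n".
    with open(src_path, "r", newline=None) as infile:
        # An explicit newline= on the output side makes the result
        # independent of the platform the script runs on.
        with open(dst_path, "w", newline=newline) as outfile:
            for line in infile:
                outfile.write(line)
```

calling convert_newlines("unix.txt", "dos.txt") (both file names are placeholders) should then produce \r\n line endings even when the script runs on a Unix box; passing newline="\n" goes the other way.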
:::
if you're converting files from Unix format to Windows format on a
Windows box, you don't have to do anything -- just open the files
in text mode, and Python's file I/O layer will fix the rest for you.
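
one way to check what text mode does on a given machine is to write a line and read the raw bytes back. This is a small Python 3 sketch; the temporary file and the name data exist only for the demonstration.

```python
import os
import tempfile

# Create an empty temporary file and remember its path.
fd, path = tempfile.mkstemp()
os.close(fd)

# Write a line ending in "\n" through a text-mode file object.
with open(path, "w") as f:
    f.write("hello\n")

# Read the raw bytes back to see what actually reached the disk.
with open(path, "rb") as f:      # binary mode, no translation
    data = f.read()
os.remove(path)

print(repr(data))   # b'hello\r\n' on Windows, b'hello\n' on Unix
print(data == b"hello" + os.linesep.encode())   # True on either platform
```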
</F>