backslash plague

hi,

I've already read many pages on this but I'm not able to separate the
string 'R0\1.2646\1.2649\D' in four elements, using the \ as the separator.

a='R0\1.2644\1.2344\D'
re.sub(r'\'','ff',a) does nothing
and why must I write two '' after the \? If I hadn't used r I would
understand...

how should I do it?

Luis
Jul 18 '05 #1
Luis P. Mendes <lu************@netvisaoXX.pt> wrote:
...
I've already read many pages on this but I'm not able to separate the
string 'R0\1.2646\1.2649\D' in four elements, using the \ as the separator.
x = r'R0\1.2646\1.2649\D'
elements = x.split('\\')
and why must I write two '' after the \? If I hadn't used r I would
understand...
A raw literal can't end with an odd number of backslashes (_some_ way
has to be there to escape the quote char, after all).
how should I do it?


I think the string's split method, as above, is the simplest, fastest
way.
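To make the point about raw literals concrete, here is a minimal sketch in Python 3 syntax. The names sep_escaped, sep_sliced, sep_chr and folder are only illustrative and do not come from the thread. It shows a few ways to spell a string that ends in a backslash, followed by the split itself.

```python
# A raw literal cannot end in an odd number of backslashes, so a raw
# literal holding just one backslash cannot be written. These three
# spellings all produce a single backslash character.
sep_escaped = '\\'        # ordinary literal, backslash doubled
sep_sliced = r'\\'[0]     # raw literal holds two backslashes; keep one
sep_chr = chr(92)         # 92 is the code point of the backslash

assert sep_escaped == sep_sliced == sep_chr
assert len(sep_escaped) == 1

# A value that must end in a backslash: raw body plus an escaped tail.
folder = r'c:\whatever' + '\\'

x = r'R0\1.2646\1.2649\D'
elements = x.split(sep_escaped)
print(elements)
```

Which spelling to use is mostly taste; the doubled form '\\' is the one used elsewhere in this thread.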
Alex
Jul 18 '05 #2
So, the trick is to put the r in front of the string?!

with the r in front:

>>> x = r'R0\1.2646\1.2649\D'
>>> elements = x.split('\\')
>>> elements
['R0', '1.2646', '1.2649', 'D'] <--what I want

without it:

>>> y='R0\1.2646\1.2649\D'
>>> elements = y.split('\\')
>>> elements
['R0\x01.2646\x01.2649', 'D'] <-- not good
Jul 18 '05 #3
Luis P. Mendes wrote:
I've already read many pages on this but I'm not able to separate the
string 'R0\1.2646\1.2649\D' in four elements, using the \ as the separator.

a='R0\1.2644\1.2344\D'

>>> a=r'R0\1.2644\1.2344\D'

re.sub(r'\'','ff',a) does nothing

>>> re.sub(r'\\','ff',a)
'R0ff1.2644ff1.2344ffD'

and also

>>> a.replace('\\','ff')
'R0ff1.2644ff1.2344ffD'

You said you wanna _split_ them:

>>> a.split('\\')
['R0', '1.2644', '1.2344', 'D']

HTH,
Mike

Jul 18 '05 #4
"Luis P. Mendes" <lu************@netvisaoXX.pt> wrote in message
news:2t*************@uni-berlin.de...
hi,

I've already read many pages on this but I'm not able to separate the
string 'R0\1.2646\1.2649\D' in four elements, using the \ as the separator.
a='R0\1.2644\1.2344\D'
re.sub(r'\'','ff',a) does nothing
and why must I write two '' after the \? If I hadn't used r I would
understand...

how should I do it?

Luis


Problem 1:
Did you perhaps intend:
a = r'R0\1.2644\1.2344\D'
Otherwise, your string contains this: "R0?.2644?.2344\D" - the \1 sequence
is interpreted to mean "ascii character 1", which is the ASCII <SOH>.

Problem 2: "why must I write two '' after the \?"
Even a raw string cannot end in a single backslash, because a backslash
directly before the closing quote is read as escaping that quote. Your
assignment to a above is a good candidate for a raw string, but a lone
backslash such as the separator cannot be written as one, so it has to be
spelled '\\' instead. (See next.)

Problem 3:
If you are trying to re.sub the delimiting backslashes, then you need to
double-double them for re to process them correctly.

Try this:

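import re

a = r'R0\1.2644\1.2344\D'
bslash = '\\'        # one backslash, written as an ordinary literal
re_bslash = '\\\\'   # two backslashes: the regex that matches one backslash
print(a.split(bslash))
print(re.sub(re_bslash, 'ff', a))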

gives:
['R0', '1.2644', '1.2344', 'D']
R0ff1.2644ff1.2344ffD
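The "double-double" rule is easier to see when the different spellings are compared side by side. The following is a rough sketch in Python 3 syntax; the names pattern_plain, pattern_raw and pattern_escaped are made up for illustration. It checks that all three are the same two-character pattern.

```python
import re

a = r'R0\1.2644\1.2344\D'

# Three spellings of the same two-character pattern (backslash, backslash),
# which the regex engine reads as "match one literal backslash".
pattern_plain = '\\\\'             # ordinary literal: each pair collapses to one
pattern_raw = r'\\'                # raw literal: what you type is what re sees
pattern_escaped = re.escape('\\')  # let re build the escaped form itself

assert pattern_plain == pattern_raw == pattern_escaped
assert len(pattern_raw) == 2

replaced = re.sub(pattern_raw, 'ff', a)
parts = re.split(pattern_raw, a)
print(replaced)
print(parts)
```

For a fixed separator like this one, plain a.split('\\') needs only one layer of escaping, which is probably why it keeps being recommended in this thread.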
-- Paul
Jul 18 '05 #5
"Luis P. Mendes" <lu************@netvisaoXX.pt> wrote in message
news:2t*************@uni-berlin.de...
hi,

I've already read many pages on this but I'm not able to separate the
string 'R0\1.2646\1.2649\D' in four elements, using the \ as the separator.
a='R0\1.2644\1.2344\D'


Here's your first problem: the name "a" is not mapped to the same string you
think it is. Python interprets the backslashes in a quoted string as escape
sequences, unless you specify a raw string by putting an "r" before the
first quotation mark. Specifically, it interprets the sequence \1 as the
character with an ASCII value of 1. So what you've done is:
>>> a = 'R0\1.2646\1.2649\D'
>>> print(a)
R0?.2646?.2649\D

(The preceding line contains non-printable control characters that may display oddly...)

I think what you really want is:

>>> a = r'R0\1.2646\1.2649\D'
>>> print(a)
R0\1.2646\1.2649\D

Once you have stored your string properly, what's wrong with this?

>>> print(a.split("\\"))
['R0', '1.2646', '1.2649', 'D']
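When it is unclear what a literal really contains, repr(), len() and ord() show it without relying on how the terminal draws control characters. This is a small sketch in Python 3 syntax, with illustrative names cooked and raw. One assumption to note: the final backslash of the non-raw literal is doubled here, only to avoid the invalid-escape warning that newer Python versions emit for \D.

```python
# Without the r prefix, \1 is an octal escape for the character with code 1.
# The last backslash is doubled only to keep newer interpreters quiet.
cooked = 'R0\1.2646\1.2649\\D'
raw = r'R0\1.2646\1.2649\D'

print(repr(cooked))   # repr makes the control characters visible
print(repr(raw))

# The two strings differ in length and in how many backslashes they hold.
print(len(cooked), len(raw))
print(cooked.count('\\'), raw.count('\\'))

# The third character of the non-raw string is the control character SOH.
print(ord(cooked[2]))
```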

--
I don't actually read my hotmail account, but you can replace hotmail with
excite if you really want to reach me.
Jul 18 '05 #6
thanks!

I was just considering the effect of backslashes in the search/replace
criterion and not in the string itself. Thank you again!

Luis
Jul 18 '05 #7
Luis P. Mendes <lu************@netvisaoXX.pt> wrote:
So, the trick is to put the r in front of the string?!
If you want a literal string with backslashes in it, you either double
each backslash or use a raw literal (r in front).
y='R0\1.2646\1.2649\D'


the third character of y, \1, is in this case the byte with value 1 in
the ASCII code, etc. Apparently that's not what you want.

But is this string going to be a literal in your code, rather than, say,
read from a file? Sounds unlikely. When you read from a file there's
no escape-sequence interpretation, so the issue of how to write
literals, raw or otherwise, is irrelevant there.
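A quick way to convince yourself of this is to round-trip the text through a real file. The sketch below uses Python 3 syntax and a temporary file created just for the demonstration; the names sample, path, line and fields are illustrative. The raw literal is used only to create the sample data, and the line read back is split with no special handling.

```python
import os
import tempfile

# The text as it would sit in a data file: 18 characters, three of them
# backslashes. A raw literal is used only to create the sample data.
sample = r'R0\1.2646\1.2649\D'

with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as handle:
    handle.write(sample + '\n')
    path = handle.name

try:
    with open(path) as source:
        line = source.readline().rstrip('\n')
finally:
    os.remove(path)   # clean up the temporary file

# No escape processing happened on the way in, so the split just works.
fields = line.split('\\')
print(fields)
```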
Alex
Jul 18 '05 #8
On Fri, 22 Oct 2004 21:20:30 +0200, al*****@yahoo.com (Alex Martelli) wrote:
Luis P. Mendes <lu************@netvisaoXX.pt> wrote:
...
I've already read many pages on this but I'm not able to separate the
string 'R0\1.2646\1.2649\D' in four elements, using the \ as the separator.


x = r'R0\1.2646\1.2649\D'
elements = x.split('\\')
and why must I write two '' after the \? If I hadn't used r I would
understand...


A raw literal can't end with an odd number of backslashes (_some_ way
has to be there to escape the quote char, after all).

Hm, just had the thought that something analogous to HDLC bit-stuffing
could be used. IIRC bitstreams had escape flags composed of 5 successive bits,
and if you wanted to transmit 5 successive data bits, you just added an extra bit
at the end to make 6 to show that the five did not comprise a flag. The extra bits
would get dropped on decoding when a 6th 1 followed 11111 and would be recognized
as a flag otherwise.

Translating this to quoted character sequences, we could have an alternate triple
quoted raw string format, with quote-stuffing instead of escapes. I.e., to quote
three successive quote characters, we stuff a 4th quote, which the tokenizer drops
as it creates the internal byte sequence string representation, so we don't need
escapes in the usual sense.

Thus (using f prefix to indicate flagged quote-stuffing syntax) you could write:

x = f'''c:\whatever\'''

and to quote the line above (without taking advantage of alternate quotes):

q = f''' x = f''''c:\whatever\'''''''
     ^^^      ^^^|            ^^^|^^^

where ^^^ is flag and | indicates a stuffed quote that
makes the previous otherwise-flag into three quotes in the data.
You could quote again (using same type quote for illustrative purposes
again, since obviously you could do better using both ' and "):

r = f'''f'''' x = f'''''c:\whatever\'''''''''''
         ^^^|      ^^^|             ^^^|^^^|^^^

(I think ;-)

I guess the worst-case data to quote would be a repeating pattern of
'''""" or """''' since neither type of quote character would give an
advantage, but 1-in-6 overhead is still not too bad, and it would be rare.

Is there a hole in this raw string quoting syntax?

Regards,
Bengt Richter
Jul 18 '05 #9
On Sat, 23 Oct 2004 22:33:05 GMT, bo**@oz.net (Bengt Richter) wrote:
On Fri, 22 Oct 2004 21:20:30 +0200, al*****@yahoo.com (Alex Martelli) wrote:
Luis P. Mendes <lu************@netvisaoXX.pt> wrote:
...
I've already read many pages on this but I'm not able to separate the
string 'R0\1.2646\1.2649\D' in four elements, using the \ as the separator.
x = r'R0\1.2646\1.2649\D'
elements = x.split('\\')
and why must I write two '' after the \? If I hadn't used r I would
understand...


A raw literal can't end with an odd number of backslashes (_some_ way
has to be there to escape the quote char, after all).

Hm, just had the thought that something analogous to HDLC bit-stuffing
could be used. IIRC bitstreams had escape flags composed of 5 successive bits,
and if you wanted to transmit 5 successive data bits, you just added an extra bit
at the end to make 6 to show that the five did not comprise a flag. The extra bits
would get dropped on decoding when a 6th 1 followed 11111 and would be recognized
as a flag otherwise.

BZZT! wrong ;-(
The flag is 01111110 and I believe 0 gets stuffed after the 5th 1 to make sure
the flag is not part of data between real flags.
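For reference, the corrected rule (a 0 stuffed after every fifth consecutive 1, with 01111110 as the flag) can be sketched in a few lines. This is Python 3 working on strings of '0' and '1' characters for readability, not real bit-level HDLC code; stuff, unstuff, payload, encoded and frame are illustrative names, and unstuff assumes its input was produced by stuff.

```python
FLAG = '01111110'  # frame delimiter named in the correction above

def stuff(bits):
    """Insert a 0 after every run of five consecutive 1s."""
    out = []
    run = 0
    for bit in bits:
        out.append(bit)
        run = run + 1 if bit == '1' else 0
        if run == 5:
            out.append('0')   # stuffed bit, never part of the payload
            run = 0
    return ''.join(out)

def unstuff(bits):
    """Drop the bit that follows every run of five consecutive 1s.

    Assumes the input came from stuff(), so that bit is always a 0.
    """
    out = []
    run = 0
    skip = False
    for bit in bits:
        if skip:              # this is the stuffed 0; discard it
            skip = False
            run = 0
            continue
        out.append(bit)
        run = run + 1 if bit == '1' else 0
        if run == 5:
            skip = True
    return ''.join(out)

payload = '0111111011111100'   # contains the flag pattern as plain data
encoded = stuff(payload)
frame = FLAG + encoded + FLAG
print(encoded)
print(unstuff(encoded) == payload)
```

After stuffing, the payload can no longer contain six 1s in a row, so the flag pattern can only appear at the real frame boundaries.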

Translating this to quoted character sequences, we could have an alternate triple
quoted raw string format, with quote-stuffing instead of escapes. I.e., to quote
three successive quote characters, we stuff a 4th quote, which the tokenizer drops
as it creates the internal byte sequence string representation, so we don't need
escapes in the usual sense.

This does not work for, e.g., quoting a single quote, so it's not general at all :-(

Thus (using f prefix to indicate flagged quote-stuffing syntax) you could write:

x = f'''c:\whatever\'''

and to quote the line above (without taking advantage of alternate quotes):

q = f''' x = f''''c:\whatever\'''''''
     ^^^      ^^^|            ^^^|^^^

where ^^^ is flag and | indicates a stuffed quote that
makes the previous otherwise-flag into three quotes in the data.
You could quote again (using same type quote for illustrative purposes
again, since obviously you could do better using both ' and "):

r = f'''f'''' x = f'''''c:\whatever\'''''''''''
         ^^^|      ^^^|             ^^^|^^^|^^^

(I think ;-)

I guess the worst-case data to quote would be a repeating pattern of
'''""" or """''' since neither type of quote character would give an
advantage, but 1-in-6 overhead is still not too bad, and it would be rare.

Is there a hole in this raw string quoting syntax?

Unfortunately, yes.

I thought of another format, but it doesn't quote previously quoted arbitrary text
without modifying at least the last character, so phooey. Might as well go to the
previously suggested mime-style delimiting, which an editor macro could do for
arbitrary selected text. It could use str(time.time()) as delimiter text without
much risk, IWT.
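The MIME-style scheme referred to is not shown in this thread, so the following is only a guess at its general shape: pick a delimiter that does not occur in the text and fence the text with it, so nothing inside needs escaping. It is a Python 3 sketch; wrap, unwrap and the make_delimiter parameter are hypothetical names, and the loop that extends the delimiter is an added safeguard, not part of the suggestion.

```python
import time

def wrap(text, make_delimiter=lambda: str(time.time())):
    """Return (delimiter, block), where block is text fenced by delimiter lines."""
    delimiter = make_delimiter()
    while delimiter in text:      # very unlikely; extend until it is unique
        delimiter += 'x'
    return delimiter, delimiter + '\n' + text + '\n' + delimiter

def unwrap(delimiter, block):
    """Recover the original text from a block produced by wrap()."""
    head = delimiter + '\n'
    tail = '\n' + delimiter
    assert block.startswith(head) and block.endswith(tail)
    return block[len(head):len(block) - len(tail)]

# Text with both kinds of triple quote and a trailing-backslash path.
nasty = "x = f'''c:\\whatever\\''' and \"\"\" too"
delimiter, block = wrap(nasty)
print(unwrap(delimiter, block) == nasty)
```

The price is that the reader has to learn the delimiter from somewhere, which is why this suits an editor macro better than a literal syntax.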

Regards,
Bengt Richter
Jul 18 '05 #10
