raw string tail escape revisited

Why wouldn't quote-stuffing solve the problem, and let you treat \ as
an ordinary character? In a raw string, it's no good for preventing
end-of-quoting anyway, unless you want the literal \ in front of the quote
you are escaping.

Quote-stuffing is a variation on the old quote-doubling, extended to
deal with triple quotes as well (which makes it a little like HDLC bit stuffing).

IOW, treat \ as an ordinary character, and then if you don't want the
string to end, just stuff one quote character of the starting kind after
the otherwise terminating sequence. You could do this with single quoting
or triple quoting, where of course you'd need it less for triple quotes.
E.g., using uppercase R as a prefix for this kind of raw string syntax,

R'\' # just fine
R'C:\' # one of the motivations
R'''' # dumb way to do "'"
R""" <just about anything> ->[""""]<-makes 3 quotes, and we end with \"""
R""" ->[""""""""]<-two stuffing-extended triple quotes make 6 quotes."""

The tokenizer would recognize a stuffed quote mark and just discard it if present,
otherwise recognize end of string.
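
For illustration only, the rule above can be sketched in Python. This is one reading of the proposal, not the real tokenizer: the name `decode_stuffed` is made up, it takes the whole literal as a string, and it treats anything after the closing delimiter as an error (a real tokenizer would just carry on with the next token).

```python
def decode_stuffed(literal: str) -> str:
    """Decode a hypothetical R-prefixed, quote-stuffed raw string literal."""
    if len(literal) < 2 or literal[0] != "R" or literal[1] not in "'\"":
        raise ValueError("expected R followed by a quote character")
    quote = literal[1]
    # The opening delimiter is triple if three quote characters follow the R.
    delim = quote * 3 if literal.startswith(quote * 3, 1) else quote
    i = 1 + len(delim)
    out = []
    while i < len(literal):
        if literal.startswith(delim, i):
            after = i + len(delim)
            if literal.startswith(quote, after):
                # Stuffed quote: keep the delimiter as data, discard the extra quote.
                out.append(delim)
                i = after + 1
                continue
            if after != len(literal):
                raise ValueError("trailing characters after closing delimiter")
            return "".join(out)
        # Backslash gets no special treatment; it is an ordinary character.
        out.append(literal[i])
        i += 1
    raise ValueError("unterminated literal")


print(decode_stuffed("R'C:\\'"))    # the literal R'C:\' decodes to C:\
print(decode_stuffed("R'it''s'"))   # the literal R'it''s' decodes to it's
```

Under this reading, a run of four quotes inside a triple-quoted literal comes out as three, and a lone closing delimiter ends the string.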

Just had this idea. Do I need more coffee? What did I forget?

Regards,
Bengt Richter
Jul 18 '05 #1
Well, one problem is that this is incompatible with all existing
R-strings, which have been in Python for comparative ages. So we'd be
forced to implement them as B'' strings (for Bengt). 16 ways to declare
string literals (single and triple, ' and ", standard, r, u, and ur)
are bad enough, I don't want to add another 8 (single and triple, ' and
", b and ub) to the mix.
$ python -c 'import this' | grep "only one"

Secondly, the price in the tokenizer for an R-string vs a regular string is
essentially zero, since after the leading r, u or ur is parsed, the
regular rule for parsing any string is used. Your rule will require
near-duplication of a 60-line segment of Parser/tokenizer.c and a new
function similar to PyString_DecodeEscape, probably another 60 lines of
C.

Finally, I'm not convinced by your description that triple-quotes and
quote-stuffing work well together. Right now, if the parser sees
R'''' # dumb way to do "'"
it'll still be in the midst of parsing a triple-quoted raw string. How
will you be able to write a B''' string that begins with a ' if this
rule is followed? So there must be strings that you can't write with
B-quoting, just like there are strings you can't write with R-quoting
(but this time the problem is with strings that start with quotes
instead of ending with backslashes).

Jeff

Jul 18 '05 #2
On 9 Aug 2003 15:33:39 GMT, bo**@oz.net (Bengt Richter) wrote:
> Why wouldn't quote-stuffing solve the problem, and let you treat \ as
> an ordinary character? In a raw string, it's no good for preventing
> end-of-quoting anyway, unless you want the literal \ in front of the quote
> you are escaping.
>
> Quote-stuffing is a variation on the old quote-doubling, extended to
> deal with triple quotes as well (which makes it a little like HDLC bit stuffing).
>
> IOW, treat \ as an ordinary character, and then if you don't want the
> string to end, just stuff one quote character of the starting kind after
> the otherwise terminating sequence. You could do this with single quoting
> or triple quoting, where of course you'd need it less for triple quotes.
> E.g., using uppercase R as a prefix for this kind of raw string syntax,
>
> R'\' # just fine
> R'C:\' # one of the motivations
> R'''' # dumb way to do "'"

Really dumb ;-/ That makes an un-terminated triple quoted string
starting with one quote. D'oh. The logic doesn't start until the beginning
delimiter - single or triple - has been passed and established. So if you
perversely wanted to use only single quotes to quote one single quote,
you couldn't. Is there one you couldn't do at all? I don't think so, since
you could always do single-quote doubling and choose the opposite quote of a leading
quote in the data. E.g., R'"""''''''' Would be a painful R'"""'+R"'''"
Actually, that could be triple quoted as R"""""""'''""", but putting an ending '"'
in that data would make a problem. Nope, R'''"""''''"''' would handle that.
But what if we add another "'"? Then the data would be ["""'''"'] Still ok,
looks like we can always start with a triple quote opposite to the end of the data:
R"""""""'''"'""" would do it. Is there an impossible case I'm missing that would have
to be split into two adjacent (thus concatenated) string representations?
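
One way to hunt for an impossible case is to brute-force it. The sketch below is only that, a sketch under assumptions: `decode_stuffed`, `valid_literals` and `DELIMS` are made-up names, and the decoder is one reading of the stuffing rule (the opening delimiter is triple whenever three quotes follow the R). For a given piece of data it tries all four delimiters, stuffs a quote after each would-be terminator, and keeps the literals that decode back to the data.

```python
DELIMS = ("'", '"', "'''", '"""')


def decode_stuffed(literal: str) -> str:
    """Decode a hypothetical R-prefixed, quote-stuffed literal (assumed rule)."""
    if len(literal) < 2 or literal[0] != "R" or literal[1] not in "'\"":
        raise ValueError("expected R followed by a quote character")
    quote = literal[1]
    delim = quote * 3 if literal.startswith(quote * 3, 1) else quote
    i, out = 1 + len(delim), []
    while i < len(literal):
        if literal.startswith(delim, i):
            after = i + len(delim)
            if literal.startswith(quote, after):
                out.append(delim)        # stuffed quote: delimiter is data
                i = after + 1
                continue
            if after != len(literal):
                raise ValueError("trailing characters after closing delimiter")
            return "".join(out)
        out.append(literal[i])
        i += 1
    raise ValueError("unterminated literal")


def valid_literals(data: str) -> list:
    """Return every candidate literal that round-trips back to data."""
    found = []
    for delim in DELIMS:
        # Stuff one extra quote after each occurrence of the delimiter.
        literal = "R" + delim + data.replace(delim, delim + delim[0]) + delim
        try:
            ok = decode_stuffed(literal) == data
        except ValueError:
            ok = False                   # this delimiter is ambiguous here
        if ok:
            found.append(literal)
    return found


for data in ('"""' + "'''", '"""' + "'''" + '"', '"""' + "'''" + '"' + "'"):
    print(repr(data), "->", valid_literals(data))
```

This is not a proof, but it backs up the single-quote argument: if the data does not start with a given quote character, doubling that character and using it as a single delimiter always decodes cleanly, and no data can start with both kinds of quote.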

Is there a reasonable use case that is messed up as the price of getting R'\' ?

Otherwise I guess it should be ok. Woke up too early and not enough ;-)

> R""" <just about anything> ->[""""]<-makes 3 quotes, and we end with \"""
> R""" ->[""""""""]<-two stuffing-extended triple quotes make 6 quotes."""
>
> The tokenizer would recognize a stuffed quote mark and just discard it if present,
> otherwise recognize end of string.
>
> Just had this idea. Do I need more coffee? What did I forget?
>
> Regards,
> Bengt Richter


Regards,
Bengt Richter
Jul 18 '05 #3
