
optparse escaping control characters

optparse seems to be escaping control characters that I pass as
arguments on the command line. Is this a bug? Am I missing
something? Can this be prevented, or worked around?

This behaviour doesn't occur with non-control characters.

For example, if this program (called test.py):
from optparse import OptionParser
parser = OptionParser()
parser.add_option("-d", dest="delimiter", action="store")
(options, args) = parser.parse_args()
print options

is run as follows:
python test.py -d '\t'

it outputs:
{'delimiter': '\\t'}

i.e. the \t has had an escape character added to give \\t.

Aug 19 '08 #1
7 Replies


wa**********@gmail.com writes:
> optparse seems to be escaping control characters that I pass as
> arguments on the command line. Is this a bug? Am I missing
> something? Can this be prevented, or worked around?

It has nothing to do with optparse, it's how Python prints strings:

$ python -c 'import sys; print sys.argv' '\t'
['-c', '\\t']

Note that you're not really passing a control character to Python,
you're passing a two-character string consisting of \ and t. When
representing the string inside a data structure, Python escapes the \
to avoid confusion with a real control character such as \t.

If you try printing the string itself, you'll see that everything is
correct:

$ python -c 'import sys; print sys.argv[1]' '\t'
\t
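
The same distinction can be sketched without a shell. This is only an illustration in Python 3 syntax (the thread itself uses Python 2 print statements), and the names `arg` and `as_list` are made up for the example:

```python
# The two-character argument the shell delivers for '\t': a backslash and a t.
arg = "\\t"

# Displaying a list uses repr() for each element, so the backslash is doubled.
as_list = str([arg])
print(as_list)       # ['\\t']

# Printing the string itself shows its actual contents.
print(arg)           # \t
print(len(arg))      # 2
```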
Aug 19 '08 #2

wa**********@gmail.com wrote:
> optparse seems to be escaping control characters that I pass as
> arguments on the command line. Is this a bug? Am I missing
> something?

you're missing the distinction between the content of a string object,
and how the corresponding string literal looks.
>>> x = {'delimiter': '\\t'}
>>> x
{'delimiter': '\\t'}
>>> x["delimiter"]
'\\t'
>>> print x["delimiter"]
\t
>>> len(x["delimiter"])
2

</F>

Aug 19 '08 #3

On Aug 19, 1:45 pm, Hrvoje Niksic <hnik...@xemacs.org> wrote:
> wannymaho...@gmail.com writes:
>> optparse seems to be escaping control characters that I pass as
>> arguments on the command line. Is this a bug? Am I missing
>> something? Can this be prevented, or worked around?
>
> It has nothing to do with optparse, it's how Python prints strings:
>
> $ python -c 'import sys; print sys.argv' '\t'
> ['-c', '\\t']
>
> Note that you're not really passing a control character to Python,
> you're passing a two-character string consisting of \ and t. When
> representing the string inside a data structure, Python escapes the \
> to avoid confusion with a real control character such as \t.
>
> If you try printing the string itself, you'll see that everything is
> correct:
>
> $ python -c 'import sys; print sys.argv[1]' '\t'
> \t

Thanks for the reply, much clearer now, just one more question. How
would I pass a control character to python on the command line?

Aug 19 '08 #4

On Aug 19, 10:35 pm, wannymaho...@gmail.com wrote:
> optparse seems to be escaping control characters that I pass as
> arguments on the command line. Is this a bug? Am I missing
> something? Can this be prevented, or worked around?
>
> This behaviour doesn't occur with non-control characters.
>
> For example, if this program (called test.py):
> from optparse import OptionParser
> parser = OptionParser()
> parser.add_option("-d", dest="delimiter", action="store")
> (options, args) = parser.parse_args()
> print options
>
> is run as follows:
> python test.py -d '\t'
>
> it outputs:
> {'delimiter': '\\t'}
>
> i.e. the \t has had an escape character added to give \\t.

You are inputting a TWO-byte string composed of a backslash and a
lowercase t, and feeding that to OptionParser.

C:\junk>type test.py
import sys; a = sys.argv[1]; d = {'delimiter': a}
print len(a), a, str(a), repr(a)
print d

# Note: this is Windows, where the shell quote is ", not '
C:\junk>python test.py "\t"
2 \t \t '\\t'
{'delimiter': '\\t'}

The extra backslash that you see is caused by the (implicit) use of
repr() to display the string.

If you want/need to enter a literal TAB character in the command line,
consult the manual for your shell.

HTH,
John
Aug 19 '08 #5

On Tue, 19 Aug 2008 05:35:27 -0700, wannymahoots wrote:
> optparse seems to be escaping control characters that I pass as
> arguments on the command line. Is this a bug? Am I missing something?
> Can this be prevented, or worked around?

You are misinterpreting the evidence. Here's the short explanation:

optparse isn't escaping a control character, because you're not supplying
it with a control character. You're supplying it with two normal
characters, which merely *look* like five (including the quote marks)
because of Python's special handling of backslashes.

If you need it, here's the long-winded explanation.

I've made a small change to your test.py file to demonstrate:

# test.py (modified)
from optparse import OptionParser
parser = OptionParser()
parser.add_option("-d", dest="delimiter", action="store")
(options, args) = parser.parse_args()
print "Options:", options
print "str of options.delimiter =", str(options.delimiter)
print "repr of options.delimiter =", repr(options.delimiter)
print "len of options.delimiter =", len(options.delimiter)

Here's what it does when I call it:

$ python test.py -d '\t'
Options: {'delimiter': '\\t'}
str of options.delimiter = \t
repr of options.delimiter = '\\t'
len of options.delimiter = 2

When you pass '\t' in the command line, the shell sends a literal
backslash followed by a lowercase t to Python. That is, it sends the
literal string '\t', not a control character.

Proof: pass the same string to the "wc" program using "echo". Don't
forget that echo adds a newline to the string:

$ echo 't' | wc # just a t
1 1 2
$ echo '\t' | wc # a backslash and a t, not a control character
1 1 3
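
Some shells' built-in echo interprets backslash escapes itself, so the wc counts can differ from shell to shell. As a shell-independent cross-check, here is a minimal sketch in Python 3 that launches a child interpreter directly and asks it how long its argument is. The names `CHILD` and `arg_length` are invented for this example:

```python
import subprocess
import sys

# Child program: report the length of its first command-line argument.
CHILD = "import sys; print(len(sys.argv[1]))"

def arg_length(arg):
    """Run a child Python with `arg` as its only argument, with no shell involved."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, arg],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout)

print(arg_length("\\t"))  # backslash + t: 2 characters
print(arg_length("\t"))   # a real TAB: 1 character
```

Because the argument list goes straight to the child process, whatever length it reports is what Python itself received.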

That's the first half of the puzzle. Now the second half -- why is Python
adding a *second* backslash to the backslash-t? Actually, it isn't, but
it *seems* to be adding not just a second backslash but also two quote
marks.

The backslash in Python is special. If you wanted a literal backslash t
in a Python string, you would have to type *two* backslashes:

'\\t'

because a single backslash followed by t is an escape sequence that
Python turns into a tab character.

But be careful to note that even though you typed five characters (quote,
backslash, backslash, t, quote) Python creates a string of length two: a
single backslash and a t.
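
A quick way to check this in an interpreter, written in Python 3 syntax. The variable names are illustrative, and the raw-string form is an extra that the post does not mention:

```python
literal_backslash_t = '\\t'   # escaped backslash, then t
raw_backslash_t = r'\t'       # raw string: the same two characters
real_tab = '\t'               # escape sequence for one TAB character

print(len(literal_backslash_t))                # 2
print(len(real_tab))                           # 1
print(literal_backslash_t == raw_backslash_t)  # True
print(repr(literal_backslash_t))               # '\\t'
```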

Now, when you print something using the str() function, Python hides all
that complexity from you. Hence the line of output that looks like this:

str of options.delimiter = \t

The argument is a literal backslash followed by a t, not a tab character.

But when you print using the repr() function, Python shows you what you
would have typed -- five characters as follows:

repr of options.delimiter = '\\t'

But that's just the *display* of a two character string. The actual
string itself is only two characters, despite the two quotes and the two
backslashes.

Now for the final piece of the puzzle: when you print most composite
objects, such as the optparse Values object named "options" in your
code, Python displays the values inside it using repr() rather than
str(). That is why the dictionary-style output shows '\\t' even though
options.delimiter holds only a backslash and a t.
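
This can be reproduced without the command line by handing parse_args() an explicit argument list. The following is a sketch in Python 3 syntax; the variable `shown` is introduced only for the illustration:

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-d", dest="delimiter", action="store")

# Pass the argument list explicitly instead of reading sys.argv;
# "\\t" is the two-character string the shell delivers for '\t'.
options, args = parser.parse_args(["-d", "\\t"])

shown = str(options)             # the Values object displays its attributes as a dict
print(shown)                     # {'delimiter': '\\t'}
print(options.delimiter)         # \t
print(len(options.delimiter))    # 2
```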

--
Steven
Aug 19 '08 #6

Thanks for all the responses!
Aug 19 '08 #7

Dan Halligan <da**********@gmail.com> writes:
> How would I pass a control character to python on the command line?

It depends on which command line you are using. Most Unix-like shells
will allow you to input a control character by preceding it with ^V.
Since \t is the TAB character, you should be able to input it like
this:

$ python -c 'import sys; print sys.argv' '^V<tab>'
['-c', '\t'] # note single backslash
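
If typing a literal TAB is awkward, another possible workaround is to keep passing the two characters \t and decode them inside the program. The helper below is a hypothetical sketch in Python 3, not something optparse provides. It assumes the argument is plain ASCII and relies on the standard unicode_escape codec:

```python
def decode_escapes(arg):
    """Turn backslash escapes such as \\t in an ASCII argument into real characters."""
    # unicode_escape interprets Python-style escapes; encoding as ASCII first
    # makes non-ASCII input fail loudly instead of being silently mis-decoded.
    return arg.encode("ascii").decode("unicode_escape")

delimiter = decode_escapes("\\t")   # what the program receives for -d '\t'
print(repr(delimiter))              # '\t'
print(len(delimiter))               # 1
```

In the test.py from the question, this would be applied to options.delimiter after parse_args() returns.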
Aug 19 '08 #8
