
UTF-8 characters in doctest

Hello,
I have problems running doctests when I use Czech national
characters in UTF-8 encoding.

I have a Python script which begins with an encoding declaration:

# -*- coding: utf-8 -*-

I have this function with a doctest:

def get_inventary_number(block):
    """
    >>> t = u'''28. České královské insignie
    ... mědirytina, grafika je zcela vyřezána z papíru - max. rozměr
    ... 420×582 neznačeno
    ... text: opis v levém medailonu: CAROL VI IMP.ELIS.CHR. AVG. P.P.'''
    >>> get_inventary_number(t)
    (u'nezna\xc4\x8deno', u'28. \xc4\x8cesk\xc3\xa9 kr\xc3\xa1lovsk\xc3\xa9 insignie\nm\xc4\x9bdirytina, grafika je zcela vy\xc5\x99ez\xc3\xa1na z pap\xc3\xadru \xe2\x80\x93 max. rozm\xc4\x9br\n420\xc3\x97582 \ntext: opis v lev\xc3\xa9m medailonu: CAROL VI IMP.ELIS.CHR. AVG. P.P.')
    """
    m = RE_INVENTARNI_CISLO.search(block)
    if m:
        return m.group(1), block.replace(m.group(0), '')
    else:
        return None, block

After running doctest.testmod() I get this error message:

  File "vizovice_03.py", line 417, in ?
    doctest.testmod()
  File "/usr/local/lib/python2.4/doctest.py", line 1841, in testmod
    for test in finder.find(m, name, globs=globs, extraglobs=extraglobs):
  File "/usr/local/lib/python2.4/doctest.py", line 851, in find
    self._find(tests, obj, name, module, source_lines, globs, {})
  File "/usr/local/lib/python2.4/doctest.py", line 910, in _find
    globs, seen)
  File "/usr/local/lib/python2.4/doctest.py", line 895, in _find
    test = self._get_test(obj, name, module, globs, source_lines)
  File "/usr/local/lib/python2.4/doctest.py", line 985, in _get_test
    filename, lineno)
  File "/usr/local/lib/python2.4/doctest.py", line 602, in get_doctest
    return DocTest(self.get_examples(string, name), globs,
  File "/usr/local/lib/python2.4/doctest.py", line 616, in get_examples
    return [x for x in self.parse(string, name)
  File "/usr/local/lib/python2.4/doctest.py", line 577, in parse
    (source, options, want, exc_msg) = \
  File "/usr/local/lib/python2.4/doctest.py", line 648, in _parse_example
    lineno + len(source_lines))
  File "/usr/local/lib/python2.4/doctest.py", line 732, in _check_prefix
    raise ValueError('line %r of the docstring for %s has '
ValueError: line 17 of the docstring for __main__.get_inventary_number has inconsistent leading whitespace: 'm\xc4\x9bdirytina, grafika je zcela vy\xc5\x99ez\xc3\xa1na z pap\xc3\xadru \xe2\x80\x93 max. rozm\xc4\x9br'

I tried filling in the expected output in the docstring from the output
of the Python shell and from doctest itself (if I leave it out of the
docstring, doctest tells me what it expected and what it got). I also
tried setting the variable t both as t='some text' and as
t=u'some unicode text'. Everything fails.

So my question is: is it possible to run doctests with UTF-8
characters? And if the answer is yes, please tell me how.

Thank you for any advice.
Regards
Michal

Sep 19 '07 #1
6 Replies


Bzyczek wrote:
> So my question is: Is it possible to run doctests with UTF-8
> characters? And if your answer will be YES, tell me please how...

Use raw strings in combination with explicit decoding and a little
trial and error. E.g. this little gem passes ;)

# -*- coding: utf8 -*-
r"""
>>> f("äöü".decode("utf8"))
(u'\xe4\xf6\xfc',)
"""
def f(s):
    return (s,)

if __name__ == "__main__":
    import doctest
    doctest.testmod()
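
A minimal sketch of why the raw prefix matters, assuming the failure comes from the \n escapes in the expected output: in a plain (non-raw) docstring Python turns each \n into a real line break before doctest parses the text, so the rest of the expected output lands at column 0 and trips the leading-whitespace check. The two docstrings and the helper parse_outcome are hypothetical, written for Python 3, and only exercise doctest's parsing step (nothing is executed, so f does not need to exist).

```python
import doctest

# Non-raw literal: \n becomes a real line break, so "y',)" starts at
# column 0 while the example itself is indented by four spaces.
NON_RAW = """
    >>> f(1)
    ('x\ny',)
"""

# Raw literal: the backslash and the "n" stay as two ordinary characters,
# so the expected output remains on a single, consistently indented line.
RAW = r"""
    >>> f(1)
    ('x\ny',)
"""


def parse_outcome(docstring):
    """Return the number of examples parsed, or the ValueError message."""
    try:
        return len(doctest.DocTestParser().get_examples(docstring))
    except ValueError as exc:
        return str(exc)


print(parse_outcome(NON_RAW))  # message mentions inconsistent leading whitespace
print(parse_outcome(RAW))      # one example parsed
```

The same reasoning would apply to the \n sequences in the expected tuple of get_inventary_number(), which sits in a non-raw docstring.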

Peter
Sep 19 '07 #2

Peter Otten <__*******@web.de> writes:
[...]
> # -*- coding: utf8 -*-
> r"""
> >>> f("äöü".decode("utf8"))
> (u'\xe4\xf6\xfc',)
> """
> def f(s):
>     return (s,)

Forgive me if this is a stupid question, but: What purpose does
function f serve?
John
Sep 20 '07 #3

John J. Lee wrote:
> Peter Otten <__*******@web.de> writes:
> [...]
>> def f(s):
>>     return (s,)
>
> Forgive me if this is a stupid question, but: What purpose does
> function f serve?
>
> John

Well, it has nothing to do with the unicode bit that came before it. It
just takes an argument, and wraps it in a 1-tuple. Guessing by the
argument of "s", that argument is expected to be a string.

One use I can think of is that sometimes you'll find a function that
returns a string or a list or tuple of strings. If you want to pass that
result on to a for loop, and only loop once on the string (instead of
looping on each letter of the string), you might want to wrap it in a
tuple or a list before passing it to the loop.
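
A small sketch of that wrapping idea, written for Python 3. The helper name as_items is hypothetical and only illustrates the point about looping once over a string instead of once per character.

```python
def as_items(value):
    """Wrap a lone string in a 1-tuple; turn any other iterable into a tuple."""
    if isinstance(value, str):
        return (value,)
    return tuple(value)


# Iterating over the bare string would visit "a", "b" and "c" separately;
# iterating over the wrapped value visits the whole string once.
for item in as_items("abc"):
    print(item)

# A list of strings passes through unchanged apart from becoming a tuple.
for item in as_items(["abc", "def"]):
    print(item)
```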

Cheers,
Cliff
Sep 21 '07 #4

J. Cliff Dyer wrote:
> John J. Lee wrote:
>> Peter Otten <__*******@web.de> writes:
>> [...]
>>> def f(s):
>>>     return (s,)
>>
>> Forgive me if this is a stupid question, but: What purpose does
>> function f serve?
>>
>> John
>
> Well, it has nothing to do with the unicode bit that came before it. It
> just takes an argument, and wraps it in a 1-tuple. Guessing by the
> argument of "s", that argument is expected to be a string.
>
> One use I can think of is that sometimes you'll find a function that
> returns a string or a list or tuple of strings. If you want to pass that
> result on to a for loop, and only loop once on the string (instead of
> looping on each letter of the string), you might want to wrap it in a
> tuple or a list before passing it to the loop.
>
> Cheers,
> Cliff

(replying to my own post)

Sorry. Itchy trigger finger and tired brain. I didn't read the whole
context of the thread. Dunno what it's doing here. Forcing __repr__ to
be called on a print statement? Funny way to do that. Like I said, I
don't know, so I'll leave it to someone else to say.

Cheers,
Cliff
Sep 21 '07 #5

John J. Lee wrote:
> Peter Otten <__*******@web.de> writes:
> [...]
>> # -*- coding: utf8 -*-
>> r"""
>> >>> f("äöü".decode("utf8"))
>> (u'\xe4\xf6\xfc',)
>> """
>> def f(s):
>>     return (s,)
>
> Forgive me if this is a stupid question, but: What purpose does
> function f serve?

Like the OP's get_inventary_number() it takes a unicode string and
returns a tuple of unicode strings. It's pointless otherwise. I hoped I
had stripped down his code to a point where the analogy was still
recognizable.
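
For comparison, a hedged Python 3 counterpart of the same stripped-down case. It assumes Python 3, where str is already Unicode, so the .decode("utf8") call is not needed and the repr shows the characters themselves rather than \x escapes. The helper run_doctests is hypothetical; it runs the docstring through doctest's parser and runner directly instead of calling testmod().

```python
import doctest


def f(s):
    r"""Take a string and return a tuple of strings, like the function above.

    >>> f("äöü")
    ('äöü',)
    >>> f("\xe4\xf6\xfc") == f("äöü")
    True
    """
    return (s,)


def run_doctests(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


print(run_doctests(f))
```

The docstring is still raw, so the \xe4 escapes reach doctest as written and are evaluated there as part of the example source.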

Peter
Sep 21 '07 #6

Peter Otten <__*******@web.de> writes:
[...]
>> Forgive me if this is a stupid question, but: What purpose does
>> function f serve?
>
> Like the OP's get_inventary_number() it takes a unicode string and
> returns a tuple of unicode strings. It's pointless otherwise. I hoped I
> had stripped down his code to a point where the analogy was still
> recognizable.

Ah, right.
John
Sep 22 '07 #7
