Unicode formatting for Strings

Hi,

I'm trying desperately to tell the interpreter to put an 'á' in my
string, so here is the code snippet:

# -*- coding: utf-8 -*-
filename = u"Ataris Aquáticos #2.txt"
f = open(filename, 'w')

Then I save it with Windows Notepad, in the UTF-8 format. So:

1) I put the "magic comment" at the start of the file
2) I write u"" to specify my unicode string
3) I save it in the UTF-8 format

And even so, I get an error!

  File "Ataris Aqußticos #2.py", line 1
SyntaxError: Non-ASCII character '\xff' in file Ataris Aqußticos #2.py on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details

I don't know how to tell Python that it should use UTF-8; it keeps
saying "no encoding declared"!

Robson

Feb 5 '07 #1
I can't tell from your email if you get the message when you try to
open or close the file. So, I recommend that you read the following
article as it explains the whole unicode business quite well:
http://www.pyzine.com/Issue008/Secti...Encodings.html

Feb 5 '07 #2
It looks like you are saving the file in Unicode format (not utf-8) and
Python is choking on the Byte Order Mark that Notepad puts at the
beginning of the document.

Try using an editor that will save UTF-8 without a BOM, e.g. jEdit or
TextPad.
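
For anyone who wants to check for a BOM without switching editors, here is a minimal sketch. It is not from the thread, and it is written for Python 3 (the thread itself uses Python 2). The helper names sniff_bom and to_utf8_without_bom are made up for illustration; only the codecs.BOM_* constants come from the standard library.

```python
import codecs

# Known byte order marks. The UTF-32 marks come first so that a UTF-32 LE
# file (ff fe 00 00) is not mistaken for UTF-16 LE (ff fe).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return (codec name, BOM length) for a leading BOM, or (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def to_utf8_without_bom(data):
    """Re-encode BOM-marked bytes as plain UTF-8; return BOM-less bytes unchanged."""
    name, length = sniff_bom(data)
    if name is None:
        return data
    return data[length:].decode(name).encode("utf-8")
```

Reading the script with open(path, 'rb'), passing the bytes through to_utf8_without_bom and writing the result back in binary mode would give a file in the "UTF-8 without a BOM" form suggested above.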

Kent
Feb 5 '07 #3
On 2/5/07, Kent Johnson <ke**@kentsjohnson.com> wrote:
> It looks like you are saving the file in Unicode format (not utf-8) and
> Python is choking on the Byte Order Mark that Notepad puts at the
> beginning of the document.

Notepad does support saving to UTF-8, and I was able to do this
without the problem the OP was having. I also saved both with and
without a BOM (in UTF-8) using SciTE, and Python worked correctly in
both cases.
Feb 5 '07 #4
I saved it in UTF-8 with Notepad. I was wondering: could it be a
limitation of the file.open() method? Has anyone tested that?

Feb 5 '07 #5
> I saved it in UTF-8 with Notepad.

Please consider that you might possibly be mistaken.

Here are dumps of 4 varieties of file:

>>> for i in range(4):
...     print '\nFile %d:\n%r' % (i, open('robson' + str(i) + '.py', 'rb').read())
...

File 0:
'\xef\xbb\xbf# -*- coding: utf-8 -*-\r\nfilename = u"Ataris Aqu\xc3\xa1ticos #2.txt"\r\nf = open(filename, \'w\')'

File 1:
'# -*- coding: utf-8 -*-\r\nfilename = u"Ataris Aqu\xc3\xa1ticos #2.txt"\r\nf = open(filename, \'w\')'

File 2:
'# -*- coding: cp1252 -*-\r\nfilename = u"Ataris Aqu\xe1ticos #2.txt"\r\nf = open(filename, \'w\')'

File 3:
'\xff\xfe#\x00 \x00-\x00*\x00-\x00 \x00c\x00o\x00d\x00i\x00n\x00g\x00:\x00 \x00u\x00t\x00f\x00-\x008\x00 \x00-\x00*\x00-\x00\r\x00\n\x00f\x00i\x00l\x00e\x00n\x00a\x00m\x00e\x00 \x00=\x00 \x00u\x00"\x00A\x00t\x00a\x00r\x00i\x00s\x00 ]
[snip]

File 0 was saved in UTF-8 with Notepad. Notepad puts a "UTF-8 BOM" at
the front of the file. It works (that is, it creates a file with the a-
acute character in its name). There is no \xff character in line 1 for
Python to complain about.

File 1 was saved in UTF-8 with another editor. No BOM, no problem.
Works.

File 2 (which specifies cp1252 encoding (my default, and probably
yours too)) was saved normally (i.e. without the stuffing about
necessary to get UTF-8). Works.

File 3 was saved in "Unicode" (really utf_16_le) using Notepad. As you
can see, it has a UTF-16-LE BOM (which contains \xff) at the start.
Python is not amused, giving exactly the same error message as you
reported.
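
The four variants can also be rebuilt in memory, which makes it easy to see that only the UTF-16-LE one contains a \xff byte. This is a rough sketch rather than the exact procedure used for the dumps above: it is written for Python 3, and the names SOURCE and variants are chosen here for illustration.

```python
import codecs

# The script from the original post, as text with Windows line endings.
SOURCE = (
    u'# -*- coding: utf-8 -*-\r\n'
    u'filename = u"Ataris Aqu\u00e1ticos #2.txt"\r\n'
    u"f = open(filename, 'w')"
)

variants = {
    0: codecs.BOM_UTF8 + SOURCE.encode("utf-8"),          # Notepad "UTF-8"
    1: SOURCE.encode("utf-8"),                            # UTF-8, no BOM
    2: SOURCE.replace(u"utf-8", u"cp1252").encode("cp1252"),  # native cp1252
    3: codecs.BOM_UTF16_LE + SOURCE.encode("utf-16-le"),  # Notepad "Unicode"
}

for number in sorted(variants):
    data = variants[number]
    first_line = data.split(b"\n")[0]
    # Only a file whose first line holds 0xff can trigger the reported error.
    print("File %d: starts with %r, 0xff on line 1: %s"
          % (number, data[:4], b"\xff" in first_line))
```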

So:

(1) If you still believe that you are getting a problem with a file
saved as UTF-8, please present reproducible, credible evidence: for
example, a copy/paste of what happens when you (a) dump the file,
immediately followed by (b) running the file with Python.
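
One possible way to collect both pieces of evidence in a single step is sketched below. It assumes Python 3.5 or later (for subprocess.run), and dump_and_run is a hypothetical helper name, not something from the thread.

```python
import subprocess
import sys


def dump_and_run(path, count=40):
    """Return (leading bytes, exit code, stderr text) for the script at path."""
    # (a) Dump: read the raw bytes so that any BOM is visible.
    with open(path, "rb") as handle:
        head = handle.read(count)
    # (b) Run: execute the same file with the current interpreter.
    result = subprocess.run(
        [sys.executable, path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return head, result.returncode, result.stderr.decode("utf-8", "replace")


if __name__ == "__main__" and len(sys.argv) > 1:
    head, code, errors = dump_and_run(sys.argv[1])
    print("First bytes: %r" % (head,))
    print("Exit code:   %d" % code)
    print(errors)
```

Saved under a name of your choice and called with the path of the problem script as its only argument, it prints the first bytes next to whatever error the interpreter reports, so the two can be pasted together.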

(2) Consider using your "native" encoding (e.g. cp1252) with your
normal/usual editor/IDE.
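
To see why the declared encoding has to match the bytes on disk, the short sketch below encodes the same word both ways. It is Python 3, written for illustration only. The last step assumes the console was using code page 850, which the thread does not confirm; under that assumption it reproduces the "Aqußticos" spelling seen in the traceback.

```python
text = u"Aqu\u00e1ticos"

as_cp1252 = text.encode("cp1252")  # a-acute is the single byte 0xe1
as_utf8 = text.encode("utf-8")     # a-acute is the two bytes 0xc3 0xa1
print(repr(as_cp1252))
print(repr(as_utf8))

# Reading UTF-8 bytes as cp1252 does not fail, it silently yields mojibake.
mismatch = as_utf8.decode("cp1252")
print(repr(mismatch))

# Reading cp1252 bytes as UTF-8 fails, because 0xe1 followed by "t" is not
# a valid UTF-8 sequence.
try:
    as_cp1252.decode("utf-8")
    strict_utf8_ok = True
except UnicodeDecodeError:
    strict_utf8_ok = False

# Assumption: a console on code page 850 shows byte 0xe1 as a sharp s.
console_view = as_cp1252.decode("cp850")
print(repr(console_view))
```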

> I was wondering: could it be a limitation of the file.open() method?

No, it can't.

> Has anyone tested that?

Unlikely.

HTH,
John

Feb 5 '07 #6
