Help needed with python unicode cgi-bin script

Dear web gods:

After much, much, much struggle with unicode, many an hour reading all the
examples online, coding them, testing them, ripping them apart and putting
them back together, I am humbled. Therefore, I humble myself before you to
seek guidance on a simple python unicode cgi-bin scripting problem.

My problem is more complex than this, but how about I boil down one sticking
point for starters. I have a file with a Spanish word in it, "años", which I
wish to read with:
#!C:/Program Files/Python23/python.exe

STARTHTML= u'''Content-Type: text/html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
</head>
<body>
'''
ENDHTML = u'''
</body>
</html>
'''
print STARTHTML
print open('c:/test/spanish.txt','r').read()
print ENDHTML
Instead of seeing "año" I see "a?o". BAD BAD BAD
Yet, if I open the file with the browser (IE/Mozilla), I see "año." THIS IS
WHAT I WANT

WHAT GIVES?

Next, I'll get into codecs and stuff, but how about starting with this?

The general question is, does anybody have a complete working example of a
cgi-bin script that does the above properly that they'd be willing to share?
I've tried various examples online but haven't been able to get any to work.
I end up seeing hex code for the non-ascii characters u'a\xf1o', and later
on 'a\xc3\xb1o', which are also BAD BAD BAD.

Thanks -- your humble supplicant.
Dec 10 '07 #1
You probably need to set stdout to binary mode. It is not in binary mode by
default on Windows.
Dec 10 '07 #2
> My problem is more complex than this, but how about I boil down one sticking
> point for starters. I have a file with a Spanish word in it, "años", which I
> wish to read with:

What is the encoding of that file? Without a correct answer to that
question, you will not be able to achieve what you want.

Possible answers are "iso-8859-1", "utf-8", "windows-1252", and "cp850"
(these all support the word "años").

> Instead of seeing "año" I see "a?o". BAD BAD BAD

I don't see anything here. Where do you see the question mark? Did you
perhaps run the CGI script in a web server, point your web browser
at the web page, and see the question mark in the web browser?

> WHAT GIVES?

Sending "Content-type: text/html" is not enough. The web browser needs
to know what the encoding is. So you should send

Content-type: text/html; charset="your-encoding-here"

Use "extras/page information" in Firefox to find out what the web
browser thinks the encoding of the page is.
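
For illustration, here is a minimal sketch of a response whose header and body agree on the encoding. It is written for Python 3 rather than the Python 2.3 used in this thread, and the helper name build_cgi_response and its windows-1252 default are assumptions made up for the example, not part of the original script.

```python
def build_cgi_response(body_text, charset="windows-1252"):
    """Return a complete CGI response as bytes.

    The same charset is named in the Content-Type header and used to
    encode the body, so the browser is told the truth about the bytes.
    """
    header = "Content-Type: text/html; charset=%s\r\n\r\n" % charset
    # Header names and values here are plain ASCII; the body uses the declared charset.
    return header.encode("ascii") + body_text.encode(charset)
```

In a real script the result would be written to stdout as bytes, for example with sys.stdout.buffer.write(build_cgi_response(page)), so that no second, implicit encoding step gets in the way.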

Regards,
Martin

P.S. Please, stop shouting.
Dec 10 '07 #3
Thanks for the reply, Jack. I tried setting mode to binary but it had no
effect.

"Jack" <no****@invalid.com> wrote in message
news:y_******************************@comcast.com...
> You probably need to set stdout mode to binary. They are not by default on
> Windows.
Dec 10 '07 #4
Hi Martin, thanks for your response. My updates are interleaved with your
response below:

> What is the encoding of that file? Without a correct answer to that
> question, you will not be able to achieve what you want.

I don't know for sure the encoding of the file. I'm assuming it has no
intrinsic encoding since I copied the word "año" into vim and then saved it
as the example text file called, "spanish.txt".

> Possible answers are "iso-8859-1", "utf-8", "windows-1252", and "cp850"
> (these all support the word "año")
>> Instead of seeing "año" I see "a?o".
>
> I don't see anything here. Where do you see the question mark? Did you
> perhaps run the CGI script in a web server, and pointed your web browser
> to the web page, and saw the question mark in the web browser?

The cgi-bin scripts prints to stdout, i.e. to my browser, and when I use
print I see a square box where the ñ should be. When I use print repr(...) I
see 'a\xf1o'. I never see the desired 'ñ' character.

> Sending "Content-type: text/html" is not enough. The web browser needs
> to know what the encoding is. So you should send
>
> Content-type: text/html; charset="your-encoding-here"

Sorry, somehow my cut and paste job into outlook missed the exact line you
had above that specifies encoding tp be set as "utf8", but it's there in my
program. Not to worry.

> Use "extras/page information" in Firefox to find out what the web
> browser thinks the encoding of the page is.

Firefox says the page is UTF8.

> P.S. Please, stop shouting.

OK, it's just that it hurts when I've been pulling my hair out for days on
end over a single line of code. I don't want to go bald just yet.
Dec 10 '07 #5
On Dec 11, 9:55 am, "weheh" <we...@verizon.net> wrote:
> Hi Martin, thanks for your response. My updates are interleaved with your
> response below:
>
>> What is the encoding of that file? Without a correct answer to that
>> question, you will not be able to achieve what you want.
>
> I don't know for sure the encoding of the file. I'm assuming it has no
> intrinsic encoding since I copied the word "año" into vim and then saved it
> as the example text file called, "spanish.txt".

Every text file is encoded, and very few of them are tagged with the name
of the encoding in any reliable fashion.

Forget for the moment what you see in the browser. You need to find
out how your file is encoded.

Look at your file using
print repr(open('c:/test/spanish.txt','rb').read())

If you see 'a\xf1o' then use charset="windows-1252" else if you see
'a\xc3\xb1o' then use charset="utf-8" else ????

Based on your responses to Martin, it appears that your file is
actually windows-1252 but you are telling browsers that it is utf-8.

Another check: if the file is utf-8, then doing
open('c:/test/spanish.txt','rb').read().decode('utf8')
should be OK; if it's not valid utf8, it will complain.

Yet another check: open the file with Notepad. Do File/SaveAs, and
look at the Encoding box -- ANSI or UTF-8?
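
The checks above can be rolled into one small helper. This is only a sketch in Python 3 (the thread itself uses Python 2.3); the function name guess_encoding is hypothetical, and falling back to windows-1252 for anything that is not valid UTF-8 is an assumption that suits a Western European Windows machine, not a general rule.

```python
def guess_encoding(raw):
    """Classify raw file bytes as 'ascii', 'utf-8' or, as a fallback, 'windows-1252'."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Valid UTF-8 is very unlikely to occur by accident in single-byte text.
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: any other 8-bit text on this machine is windows-1252.
        return "windows-1252"
```

With the two byte strings discussed in this thread, guess_encoding(b'a\xf1o') gives 'windows-1252' and guess_encoding(b'a\xc3\xb1o') gives 'utf-8'. To check a real file, read it in binary mode first, e.g. open(path, 'rb').read().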

HTH,
John
Dec 10 '07 #6
Just want to make sure, how exactly are you doing that?

> Thanks for the reply, Jack. I tried setting mode to binary but it had no
> effect.

Dec 11 '07 #7
import sys

if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

"Jack" <no****@invalid.com> wrote in message
news:A7******************************@comcast.com...
> Just want to make sure, how exactly are you doing that?
>> Thanks for the reply, Jack. I tried setting mode to binary but it had no
>> effect.


Dec 11 '07 #8
Hi John:
Thanks for responding.

> Look at your file using
> print repr(open('c:/test/spanish.txt','rb').read())
> If you see 'a\xf1o' then use charset="windows-1252"

I did this ... no change ... still see 'a\xf1o'

> else if you see 'a\xc3\xb1o' then use charset="utf-8" else ????
> Based on your responses to Martin, it appears that your file is
> actually windows-1252 but you are telling browsers that it is utf-8.
> Another check: if the file is utf-8, then doing
> open('c:/test/spanish.txt','rb').read().decode('utf8')
> should be OK; if it's not valid utf8, it will complain.

No. this causes decode error:

UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-4: invalid
data
args = ('utf8', 'a\, 1, 5, 'invalid data')
encoding = 'utf8'
end = 5
object = 'a\xf1o'
reason = 'invalid data'
start = 1

> Yet another check: open the file with Notepad. Do File/SaveAs, and
> look at the Encoding box -- ANSI or UTF-8?

Notepad says it's ANSI

Thanks. What now? Also, this is a general problem for me, whether I read
from a file or read from an html text field, or read from an html text area.
So I'm looking for a general solution. If it helps to debug by reading from
textarea or text field, let me know.
Dec 11 '07 #9
On Dec 12, 4:46 am, "weheh" <we...@verizon.net> wrote:
> Hi John:
> Thanks for responding.
>> Look at your file using
>> print repr(open('c:/test/spanish.txt','rb').read())
>> If you see 'a\xf1o' then use charset="windows-1252"
>
> I did this ... no change ... still see 'a\xf1o'

So it's not utf-8, it's windows-1252, so stop lying to browsers: like
I said, use charset="windows-1252"

>> else if you see 'a\xc3\xb1o' then use charset="utf-8" else ????
>> Based on your responses to Martin, it appears that your file is
>> actually windows-1252 but you are telling browsers that it is utf-8.
>> Another check: if the file is utf-8, then doing
>> open('c:/test/spanish.txt','rb').read().decode('utf8')
>> should be OK; if it's not valid utf8, it will complain.
>
> No. this causes decode error:
>
> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 1-4: invalid
> data

No what? YES, the "decode error" is complaining that the data supplied
is NOT valid utf-8 data. So it's not utf-8, it's windows-1252, so stop
lying to browsers: like I said, use charset="windows-1252"

> args = ('utf8', 'a\, 1, 5, 'invalid data')
> encoding = 'utf8'
> end = 5
> object = 'a\xf1o'
> reason = 'invalid data'
> start = 1
>
>> Yet another check: open the file with Notepad. Do File/SaveAs, and
>> look at the Encoding box -- ANSI or UTF-8?
>
> Notepad says it's ANSI

That's correct (in Microsoft jargon) -- it's NOT utf-8. It's
windows-1252, so stop lying to browsers: like I said, use
charset="windows-1252"

> Thanks. What now?

Listen to the Bellman: "What I tell you three times is true".
Your file is encoded using windows-1252, NOT utf-8.
You need to use charset="windows-1252".

> Also, this is a general problem for me, whether I read
> from a file or read from an html text field, or read from an html text area.
> So I'm looking for a general solution. If it helps to debug by reading from
> textarea or text field, let me know.

If you are creating a file, you should know what its encoding is. As I
said earlier, *every* file is encoded -- so-called "Unicode" files on
Windows are encoded using utf16le. If you don't explicitly specify the
encoding, it will typically be the default encoding for your locale
(e.g. cp1252 in Western Europe etc).
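
To make "every file is encoded" concrete, here is a small Python 3 sketch (the thread's own code is Python 2.3) that encodes the same three-character word with the encodings mentioned in this thread. The variable names are just for the example.

```python
word = "a\u00f1o"  # the word "año" as Unicode text

# One piece of text, several different byte sequences on disk.
encoded = {
    name: word.encode(name)
    for name in ("iso-8859-1", "windows-1252", "cp850", "utf-8", "utf-16-le")
}

for name, raw in encoded.items():
    print(name, repr(raw))
```

iso-8859-1 and windows-1252 both give b'a\xf1o', cp850 gives b'a\xa4o', utf-8 gives b'a\xc3\xb1o', and utf-16-le gives b'a\x00\xf1\x00o\x00'. None of these byte strings says which encoding produced it, which is why the reader has to be told.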

If you are reading a file created by others and its encoding is not
known, you will have to inspect the file and/or guess (using whatever
knowledge you have about the language/locale of the creator).

"whether I ... read from an html text field, or read from an html text
area": isn't that what "charset" is for?

HTH,
John
Dec 11 '07 #10
> No what? YES, the "decode error" is complaining that the data supplied
> is NOT valid utf-8 data. So it's not utf-8, it's windows-1252, so stop
> lying to browsers: like I said, use charset="windows-1252"

I think weheh can manage to resist good advice for a long time.

Regards,
Martin
Dec 11 '07 #11
John & Martin,

Thanks for your help and kind words of encouragement. Still, what you have
suggested doesn't seem to work, unless I'm not understanding your directive
to encode as 'windows-1252'. Here's my program in full:

#!C:/Program Files/Python23/python.exe
import cgi, cgitb
import sys, codecs
import os,msvcrt

cgitb.enable()

print u"""Content-Type: text/html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" lang="en,sp,fr"
xml:lang="en,sp,fr">
<head>
<meta http-equiv="content-type" content="text/html; charset=windows-1252" />
<meta http-equiv="content-language" content="en,fr,sp" />
</head>
<body>
"""
if sys.platform == 'win32':
    msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

x = repr(open('c:/test/spanish.txt','rb').read())
print '<p>',x,'# first print</p>'

x = open('c:/test/spanish.txt','rb').read()
print '<p>',x,'# second print</p>'

x = repr((open('c:/test/spanish.txt','rb').read()).decode('windows-1252'))
print '<p>',x,'# third print</p>'

print """
</body>
</html>
"""
The output of the program is this:

'a\xf1o\r\n' # first print

a?o # second print #### Note that there is no ñ between the a and the o,
only a box

u'a\xf1o\r\n' # third print

So what do you advise?
Dec 11 '07 #12
> Thanks for your help and kind words of encouragement. Still, what you have
> suggested doesn't seem to work, unless I'm not understanding your directive
> to encode as 'windows-1252'.

Please read John's message again. Nowhere did he say you should "encode as
'windows-1252'". Instead, he said 'use charset="windows-1252"'.

> print u"""Content-Type: text/html

In one message, you said "somehow my cut and paste job into outlook
missed the exact line you had above that specifies encoding tp be set as
"utf8", but it's there in my program"

What EXACTLY do you mean by that? Where PRECISELY did you set
"encoding tp" (what is tp?) to "utf8"?

I suggested doing that in the Content-type CGI/HTTP header, but
perhaps you do it in the http-equiv HTML element instead:

> <meta http-equiv="content-type" content="text/html; charset=windows-1252" />

So this should work fine.

> x = repr(open('c:/test/spanish.txt','rb').read())
> print '<p>',x,'# first print</p>'

It's no surprise that this prints an escaped text, because that's
what repr() does.

> x = open('c:/test/spanish.txt','rb').read()
> print '<p>',x,'# second print</p>'

This should work fine.

> a?o # second print #### Note that there is no ñ between the a and the o,
> only a box

What does the browser say the encoding of the page is?

What browser are you using, and did you configure it to default to
UTF-8 for all pages? (which you should not have done)

Try "telnet server 80", then type

GET /path HTTP/1.1<enter>
Host: server<enter>
<enter>

and report what response from the server is (the complete one,
not just the character in question)

Regards,
Martin
Dec 11 '07 #13
> What does the browser say the encoding of the page is?
>
> What browser are you using, and did you configure it to default to
> UTF-8 for all pages? (which you should not have done)

Browser is both IE and Firefox. IE is defaulting to UTF8. If I force it to
"Encoding Western European (Windows)" it shows the ñ. The browser encoding
"Autoselect" feature is enabled, yet it always seems to default to UTF8. Any
idea how to change that?

Is there something I can put in html that forces it to do that?

I'm using Apache and have the following line in my httpd.conf file:

AddDefaultCharset utf-8

Is this a problem?

> Try "telnet server 80", then type
>
> GET /path HTTP/1.1<enter>
> Host: server<enter>
> <enter>
>
> and report what response from the server is (the complete one,
> not just the character in question)

OK, telnet session yields this:

HTTP/1.1 200 OK
Date: Tue, 11 Dec 2007 23:58:02 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8

1f8
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.or
g/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html
xmlns="http://www.w3.org/1999/xh
tml" lang="en,sp,fr" xml:lang="en,sp,fr">
<head>
<meta
http-equiv="content-type" c
ontent="text/html; charset=utf8" />
<meta http-equiv="content-language"
content="
en,fr,sp" />
</head>
<body>

<pa±o
# first print</p>
<p'a\xf1o\r\n' # second print</p>
<p'a\xf1o\r\n' # third
pri
nt</p>
<p'a\xf1o\r\n' # third print</p>

</body>
</html>
0

Connection to host lost.
Dec 12 '07 #14
p.s. I modified the code to break things out more explicitly:

#!C:/Program Files/Python23/python.exe
import cgi, cgitb
import sys, codecs
import os,msvcrt

cgitb.enable()

print u"""Content-Type: text/html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
<html xmlns="http://www.w3.org/1999/xhtml" lang="en,sp,fr"
xml:lang="en,sp,fr">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf8" />
<meta http-equiv="content-language" content="en,fr,sp" />
</head>
<body>
"""
if sys.platform == 'win32':
    msvcrt.setmode(sys.stdin.fileno(), os.O_BINARY)
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

x = open('c:/test/spanish.txt','rb').read()
print '<p>',x,'# first print</p>'

x = open('c:/test/spanish.txt','rb').read()
x = repr(x)
print '<p>',x,'# second print</p>'

x = open('c:/test/spanish.txt','rb').read()
x = repr(x)
x = x.decode('windows-1252')
print '<p>',x,'# third print</p>'

x = open('c:/test/spanish.txt','rb').read()
x = repr(x)
x = x.decode('windows-1252')
x = x.encode('utf8')
print '<p>',x,'# third print</p>'

print """
</body>
</html>
"""

(The last print should read "fourth print")
Dec 12 '07 #15
On Dec 12, 11:06 am, "weheh" <we...@verizon.net> wrote:
> p.s. I modified the code to break things out more explicitly:
>
> #!C:/Program Files/Python23/python.exe
> import cgi, cgitb
> import sys, codecs
> import os,msvcrt
>
> cgitb.enable()
>
> print u"""Content-Type: text/html
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" >
> <html xmlns="http://www.w3.org/1999/xhtml" lang="en,sp,fr"
> xml:lang="en,sp,fr">
> <head>
> <meta http-equiv="content-type" content="text/html; charset=utf8"

WTF?

Dec 12 '07 #16
John and Martin,

Thanks for your help. However, I have identified the culprit to be with
Apache and the command:
AddDefaultCharset utf-8
which forces my browser to utf-8 encoding.

It looks like your suggestions to change charset were incorrect. My example
works equally well with charset=utf8 as it does with charset=windows-1252.

Incidentally, next time, if you really want to be helpful, might I suggest
you leave out the mocking. I could care less, myself, but someone else might
have gotten their feelings hurt. And in the end, it doesn't make you look
good.

Thanks again. Cheers.
Dec 12 '07 #17
FWIW, the code you posted only ever attempted to set the character set
encoding using an html meta tag which is the wrong place to set it. The
encoding specified in the HTTP headers always takes precedence. This is why
the default charset setting in Apache was the one which applied.

What you should have been doing was setting the encoding in the
Content-Type header, i.e. in this line of your code:

print u"""Content-Type: text/html

You should have changed it to read:

Content-Type: text/html; charset=windows-1252

but because you didn't, Apache was quietly changing it to read:

Content-Type: text/html; charset=utf-8
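
To see why the header matters, here is a rough Python 3 sketch of what a browser effectively does with the bytes from spanish.txt. The function names are made up for the example, and the windows-1252 fallback used when the header names no charset is an assumption for illustration, not a statement of what any particular browser does.

```python
from email.message import Message


def charset_from_header(content_type, default=None):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def render(body, content_type):
    """Decode the response body the way the Content-Type header says to."""
    # Assumed fallback when the header carries no charset at all.
    charset = charset_from_header(content_type, "windows-1252")
    # errors="replace" mimics the box or question mark shown for undecodable bytes.
    return body.decode(charset, errors="replace")
```

render(b'a\xf1o', 'text/html; charset=utf-8') yields 'a\ufffdo', which is the replacement character seen as a box, while render(b'a\xf1o', 'text/html; charset=windows-1252') yields 'año'. The bytes are identical in both cases; only the label differs.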
Dec 12 '07 #18
Hi Duncan, thanks for the reply.

Will this work under the following situation? Let's say the user is filling
out a text field on a form on my website. The user has their browser
encoding set to utf8. My website has charset=windows-1252 as you indicate
above. Will I run into a conflict somewhere?
Dec 12 '07 #19
> Will this work under the following situation? Let's say the user is
> filling out a text field on a form on my website. The user has their
> browser encoding set to utf8. My website has charset=windows-1252 as
> you indicate above. Will I run into a conflict somewhere?

If you are sending the user a form, you should specify the acceptable
character sets with the accept-charset attribute on the form tag. Basically,
the rule should be to pick an appropriate character set and stick to it.
Don't depend on defaults; be explicit.
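
On the receiving side, being explicit means decoding the submitted form data with the same character set the form was served with. A minimal Python 3 sketch, assuming the form was sent with accept-charset="utf-8"; the names PAGE_CHARSET and decode_form and the field name word are hypothetical.

```python
from urllib.parse import parse_qs, quote

PAGE_CHARSET = "utf-8"  # assumption: the one charset used for page, form and decoding


def decode_form(query_string, charset=PAGE_CHARSET):
    """Decode URL-encoded form data using an explicitly chosen charset."""
    fields = parse_qs(query_string, encoding=charset, errors="strict")
    # Keep only the first value of each field to keep the example small.
    return {name: values[0] for name, values in fields.items()}


# What a browser would submit for the word "año" from a UTF-8 form.
submitted = "word=" + quote("a\u00f1o", encoding=PAGE_CHARSET)
```

Here submitted is 'word=a%C3%B1o' and decode_form(submitted) returns {'word': 'año'}. A windows-1252 form would send 'word=a%F1o' instead, which decodes correctly only when decode_form is told charset='windows-1252'.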

Dec 12 '07 #20
> It looks like your suggestions to change charset were incorrect. My example
> works equally well with charset=utf8 as it does with charset=windows-1252.

It rather looks like you didn't follow the suggestions carefully.
In my very first message, I wrote

# Sending "Content-type: text/html" is not enough. The web browser needs
# to know what the encoding is. So you should send
#
# Content-type: text/html; charset="your-encoding-here"

As Duncan Booth explains, this is what you should have done instead - if
you do that, you can also leave the AddDefaultCharset declaration.

Regards,
Martin
Dec 13 '07 #21
