
UNICODE input for CGI using C

Dear All,
I'm trying to accept a multi-lingual string (UNICODE) in a
form and am trying to parse it. What I am getting is %XX (which is a
single byte, not 2 bytes). So, is the data getting lost? What format
is it, if it is not getting lost?

Thanks in advance,
Punit.

May 29 '07 #1
2 Replies


In article <11**********************@a26g2000pre.googlegroups.com>,
<pu***********@gmail.com> wrote:
> I'm trying to accept a multi-lingual string (UNICODE) in a
> form and am trying to parse it. What I am getting is %XX (which is a
> single byte, not 2 bytes). So, is the data getting lost? What format
> is it, if it is not getting lost?

You should be getting 2 or more successive %XXs for each non-ASCII
character. HTML form data sent using GET is part of the URL. Non-ASCII
characters are represented in UTF-8, then each byte of the UTF-8
sequence is encoded in hex as %XX.

See

http://www.ietf.org/rfc/rfc3986.txt
http://www.ietf.org/rfc/rfc2279.txt
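
To make the decoding step concrete, here is a minimal sketch in C,
assuming the value has already been split out of the query string and
that the bytes really are UTF-8. The names url_decode, hex_value and
utf8_length are illustrative only, not from any library. It turns each
%XX into one byte (and '+' into a space, as form encoding does), then
counts characters by skipping UTF-8 continuation bytes. It does not
validate the UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode a form-urlencoded value from src into dst (dst_size bytes).
 * "%XX" becomes the single byte 0xXX; '+' becomes a space.
 * Returns the number of bytes written, not counting the terminating
 * '\0', or -1 if an escape is malformed or dst is too small.
 */
static long url_decode(const char *src, unsigned char *dst, size_t dst_size)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;

    while (*src != '\0') {
        int c = (unsigned char)*src;

        if (c == '%') {
            int hi = hex_value((unsigned char)src[1]);
            int lo = (hi < 0) ? -1 : hex_value((unsigned char)src[2]);
            if (hi < 0 || lo < 0)
                return -1;              /* truncated or non-hex escape */
            c = hi * 16 + lo;
            src += 3;
        } else {
            if (c == '+')
                c = ' ';
            src += 1;
        }

        if (n + 1 >= dst_size)
            return -1;                  /* keep room for the '\0' */
        dst[n++] = (unsigned char)c;
    }
    dst[n] = '\0';
    return (long)n;
}

/*
 * Count characters (code points) in a UTF-8 string by counting every
 * byte that is not a continuation byte of the form 10xxxxxx.
 */
static size_t utf8_length(const unsigned char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++)
        if ((*s & 0xC0) != 0x80)
            count++;
    return count;
}
```

With this, "caf%C3%A9" decodes to 5 bytes but 4 characters: the
accented letter arrives as the two bytes C3 A9, each sent as its own
%XX, so nothing is lost.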

For POST data, I can't find up-to-date documentation. The very old
http://www.w3.org/TR/html4/interact/forms.html describes the
application/x-www-form-urlencoded MIME type, but it does not mention
non-ASCII characters. I think you'll find that it uses the same
method as GET, but it's possible that it might use the encoding
specified by the HTTP charset declaration rather than UTF-8. You'll
need to ask about that somewhere other than comp.lang.c.
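
Whichever character encoding the browser ends up using, a CGI program
receives the POST body the same way: the bytes arrive on standard
input, and the CONTENT_LENGTH environment variable gives their count.
A rough sketch, assuming a plain CGI environment; read_post_body and
its max_len cap are illustrative names, not part of any library:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read the request body announced by content_length (a decimal string,
 * normally getenv("CONTENT_LENGTH")) from the stream in.
 * Refuses bodies longer than max_len bytes.
 * Returns a malloc'd, '\0'-terminated buffer that the caller must
 * free, or NULL on any error.
 */
static char *read_post_body(FILE *in, const char *content_length,
                            size_t max_len)
{
    char *end;
    unsigned long len;
    char *body;

    assert(in != NULL);
    if (content_length == NULL || *content_length == '\0')
        return NULL;                    /* variable missing or empty */

    len = strtoul(content_length, &end, 10);
    if (*end != '\0')
        return NULL;                    /* not a plain decimal number */
    if (len > max_len || len == (unsigned long)-1)
        return NULL;                    /* too large to accept */

    body = malloc(len + 1);
    if (body == NULL)
        return NULL;

    if (fread(body, 1, len, in) != len) {
        free(body);                     /* short read: body truncated */
        return NULL;
    }
    body[len] = '\0';
    return body;
}
```

A CGI program might call read_post_body(stdin, getenv("CONTENT_LENGTH"),
65536) and then split and percent-decode the result just as it would a
GET query string. Which charset the decoded bytes are in remains the
open question described above.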

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.
May 29 '07 #2

pu***********@gmail.com wrote:
>
> I'm trying to accept a multi-lingual string (UNICODE) in a form
> and am trying to parse it. What I am getting is %XX (which is a
> single byte, not 2 bytes). So, is the data getting lost? What
> format is it, if it is not getting lost?

I suspect a coding error. It must be line 42.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>
<http://kadaitcha.cx/vista/dogsbreakfast/index.html>
cbfalconer at maineline dot net


May 29 '07 #3
