I'd like to implement some sort of search function on my site, so I took
Google's sample code and tried it, i.e. basically:
<form method="GET" action="http://www.google.com/search">
<input type="hidden" name="as_sitesearch" value="www.relinquiere.com">
<input type="text" name="q" size="15" value="">
<input type="image" id="submit" value="" src="..." ...>
</form>
It works fine most of the time, but if I type in accented characters, they
somehow get misinterpreted.
My test page is: http://www.relinquiere.com/search.html
As you can see for yourself, the charset parameter in Content-Type is
ISO-8859-1 (that's intended), so I expect my client to use the same
encoding for the request when submitting the form (even if it is not
required to do so).
Here is the request when I enter "préhistorique" in my search box:
GET /search?as_sitesearch=www.relinquiere.com&q=pr%E9historique&x=8&y=8 HTTP/1.1
where %E9 is actually the value for "é" in the Latin-1 repertoire. But
Google interprets it as "pr?historique". If I enter some UTF-8 data in
the search field, this works fine (accented characters are correctly
passed to Google). Does it mean that Google expects UTF-8 data? Or that
something is wrong with my form?
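To make the difference concrete, here is a small Python sketch (not part of my site, just an illustration) that percent-encodes the same word under both encodings and then decodes the Latin-1 form as if it were UTF-8. The variable names are my own, and I am only assuming that Google's decoding behaves like a UTF-8 decoder that substitutes a placeholder for invalid bytes.

```python
from urllib.parse import quote, unquote

word = "préhistorique"

# Percent-encode the same word under the two candidate encodings.
latin1_form = quote(word, encoding="latin-1")  # what my ISO-8859-1 page sends
utf8_form = quote(word, encoding="utf-8")      # what a UTF-8 page would send

# Decode the Latin-1 form as if it were UTF-8 (assumption about the server);
# the lone byte 0xE9 is not valid UTF-8, so it is replaced by U+FFFD.
misread = unquote(latin1_form, encoding="utf-8", errors="replace")

print(latin1_form)  # pr%E9historique
print(utf8_form)    # pr%C3%A9historique
print(misread)      # the é is replaced by the U+FFFD placeholder
```

If that assumption holds, it would explain why the accented character shows up as "?" on the results page.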
Then I added a hidden field to my form:
<input type="hidden" name="ie" value="ISO88591">
as you can see in: http://www.relinquiere.com/search-latin-1.html
(I assume that this "ie" field stands for "input encoding", so that Google
can interpret the received data as Latin-1.)
Now, entering "préhistorique" as before works and returns one page. Here
is the request sent to Google:
GET /search?as_sitesearch=www.relinquiere.com&ie=ISO88591&q=pr%E9historique&x=9&y=3 HTTP/1.1
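For anyone who wants to reproduce this request without a browser, the query string can be rebuilt with a few lines of Python. This is only a sketch: the "ie" value is copied from my form as is, the x/y coordinates added by the image button are left out, and the variable names are mine.

```python
from urllib.parse import urlencode

# Same fields as the form, in the same order; "ie" is taken verbatim
# from my hidden field, and its meaning is my assumption.
params = {
    "as_sitesearch": "www.relinquiere.com",
    "ie": "ISO88591",
    "q": "préhistorique",
}

# Encode the values as Latin-1 bytes before percent-escaping them.
query = urlencode(params, encoding="latin-1")
request_path = "/search?" + query

print(request_path)
```

The important point is that the declared encoding ("ie") and the encoding actually used for "q" have to agree.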
What I conclude is that Google needs to be told what encoding is used
for the parameters, which is fair, but this raises a big issue: how am I
supposed to know what encoding my visitors use?
Imagine that a French-speaking Japanese visitor comes to my site: he will
receive my page encoded in ISO-8859-1, enter some text (let's assume this
text is made of Latin characters - is this possible in a Japanese encoding?),
submit the form, and now what? Will his input be encoded in ISO-8859-1 too?
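One way to probe the "Latin characters in a Japanese encoding" part is a quick Python check. This is a rough sketch only: I am assuming that Python's "shift_jis" codec is a fair stand-in for what such a visitor's system might use, and try_encode is a helper name I made up.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the text cannot be represented."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None

# Unaccented Latin letters versus an accented one, in two encodings.
for sample in ("prehistorique", "é"):
    for encoding in ("latin-1", "shift_jis"):
        print(repr(sample), encoding, try_encode(sample, encoding))
```

With that codec, the unaccented word comes out as plain ASCII bytes, whereas "é" cannot be encoded at all, so the question of what the browser would actually send for accented input remains open.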