Unicode and UrlEncode!

I am trying to translate some French via Google.

Here is my code.

# -*- coding: latin-1 -*-

import httplib, urllib, re

def translate_french_to_english(french):
    # Build the form body; 'ie' and 'oe' tell Google the input and output are UTF-8.
    params = urllib.urlencode( {'text': french, 'langpair': 'fr|en',
                                'hl':'en', 'ie':'UTF8', 'oe':'UTF8'} )
    print params
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "text/plain"}
    Cn = httplib.HTTPConnection("translate.google.com")
    Cn.request("POST", "/translate_t", params, headers)
    response = Cn.getresponse()
    data = response.read()
    Cn.close()
    # Pull the translated text out of the result page's textarea.
    match = re.compile('<textarea name=q..*?>(.*?)</textarea>').search(data)
    translated = match.groups()[0]
    return translated

# The literal below is a byte string in the file's encoding (latin-1).
print translate_french_to_english('sÈdimentation')

The problem is that my params are being encoded to
langpair=fr%7Cen&text=s%C8dimentation&oe=UTF8&ie=UTF8&hl=en
-- s%C8dimentation --
when they should be encoded to
text=s%C3%88dimentation&langpair=fr%7Cen&hl=en&ie=UTF8&oe=UTF8
-- s%C3%88dimentation --

What should I do?
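
For reference, the two outputs differ only in which bytes get percent-escaped: 0xC8 is È in latin-1, while 0xC3 0x88 is the same character in UTF-8. A minimal sketch, written for Python 3 (an assumption; the code in this thread is Python 2), using urllib.parse.quote on raw bytes:

```python
from urllib.parse import quote

word = "s\u00c8dimentation"  # \u00c8 is the character È

# The same text as bytes in two different encodings.
latin1_bytes = word.encode("latin-1")  # one byte for È: 0xC8
utf8_bytes = word.encode("utf-8")      # two bytes for È: 0xC3 0x88

# quote() percent-escapes each non-ASCII byte as it finds it.
as_latin1 = quote(latin1_bytes)
as_utf8 = quote(utf8_bytes)

print(as_latin1)
print(as_utf8)
```

The first result corresponds to what the script sends, the second to what a server expecting UTF-8 input needs.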
Jul 18 '05 #1
Good evening!

Indeed, I ran some tests, and as soon as there is an accented character, it breaks.
On the other hand, I have a function that strips the accents, and with that there is no problem.
--
@-salutations
--
Michel Claveau
mail: http://cerbermail.com/?6J1TthIa8B
site: http://mclaveau.com

Jul 18 '05 #2
You should decode the byte string 'french' from whatever encoding
it's in (latin-1 according to your coding: directive) to unicode, and
then encode it as UTF-8, since you are telling Google (ie=UTF8) that
the input is UTF-8.

Example:
"s\xc8dimentation".decode("latin-1").encode("utf-8")

Or, you can tell Python that the string is a Unicode literal, and it
will do the .decode() step for you:
u"sÈdimentation".encode("utf-8")

"" is always a bytestring literal, and u"" is always a unicode string
literal. If you have "<sequence of bytes>" then the string's value at
runtime is "<sequence of bytes>", and if you have u"<sequence of bytes>"
then the string's value is "<sequence of bytes>".decode(<file encoding>).
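
The decode-then-encode step can be sketched as follows. This is written for Python 3 (an assumption; the thread's code is Python 2), where byte strings need an explicit b"" prefix and plain "" literals are already Unicode text, so the two steps are easier to see:

```python
# The bytes a latin-1 source file stores for the word 'sÈdimentation'.
latin1_bytes = b"s\xc8dimentation"

# Step 1: bytes -> text, interpreting the bytes as latin-1.
text = latin1_bytes.decode("latin-1")

# Step 2: text -> bytes, this time in UTF-8.
utf8_bytes = text.encode("utf-8")

print(repr(text))
print(utf8_bytes)
```

Only the accented character changes: the single latin-1 byte 0xC8 becomes the two UTF-8 bytes 0xC3 0x88, and the ASCII letters stay as they are.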

Jeff

Jul 18 '05 #3
OK

It works with:
sret=translate_french_to_english('sÈdimentation'.decode("latin-1").encode("utf-8"))
and also with:

sret=translate_french_to_english('sédimentation'.decode("latin-1").encode("utf-8"))
Thanks on his behalf.
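
Instead of converting at every call site, the conversion could live where the form body is built. A minimal sketch, assuming Python 3 and its urllib.parse.urlencode; the helper name build_translate_params is hypothetical, and no request is sent:

```python
from urllib.parse import urlencode

def build_translate_params(french_text):
    """Hypothetical helper: build the POST body with the text encoded as UTF-8."""
    fields = [
        ("text", french_text),
        ("langpair", "fr|en"),
        ("hl", "en"),
        ("ie", "UTF8"),
        ("oe", "UTF8"),
    ]
    # urlencode() turns each str value into bytes with the given encoding
    # before percent-escaping it; UTF-8 is also its default.
    return urlencode(fields, encoding="utf-8")

params = build_translate_params("s\u00c8dimentation")  # \u00c8 is È
print(params)
```

Passing a list of pairs rather than a dict keeps the field order fixed, which makes the output easy to compare with the expected string from the original question.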

@-salutation
--
Michel Claveau
Jul 18 '05 #4
Thanks a bunch. It works.

Jul 18 '05 #5
