HTML Encoded Translation

How can I translate this:

&#103;&#105;

to this:

"gi"

I've tried urllib.unencode and it doesn't work.

Thanks!

Oct 17 '06 #1
Dave wrote:
> How can I translate this:
>
> &#103;&#105;
>
> to this:
>
> "gi"

the easiest way is to run it through an HTML or XML parser (depending on
what the source is). or you could use something like this:

import re

def fix_charrefs(text):
    # replace numeric character references (&#NNN; or &#xHHH;)
    def fixup(m):
        text = m.group(0)
        try:
            if text[:3] == "&#x":
                return unichr(int(text[3:-1], 16))  # unichr: Python 2 only
            else:
                return unichr(int(text[2:-1]))
        except ValueError:
            pass
        return text # leave as is
    return re.sub(r"&#?\w+;", fixup, text)

>>> fix_charrefs("&#103;&#105;")
'gi'
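
For anyone reading this on Python 3, where unichr() no longer exists, here is a rough port of the same idea. It is only a sketch, not the original code: the stricter regular expression and the extra OverflowError handling are my own assumptions, added so that named entities such as &amp; are never touched.

```python
import re

def fix_charrefs(text):
    # Replace numeric character references; leave everything else untouched.
    def fixup(m):
        ref = m.group(1)
        try:
            if ref[0] in "xX":
                return chr(int(ref[1:], 16))  # hexadecimal form: &#x67;
            return chr(int(ref))              # decimal form: &#103;
        except (ValueError, OverflowError):
            return m.group(0)  # not a valid code point: leave as is
    return re.sub(r"&#([xX][0-9a-fA-F]+|[0-9]+);", fixup, text)

print(fix_charrefs("&#103;&#105;"))           # gi
print(fix_charrefs("&#x67;&#x69; &amp; co"))  # gi &amp; co
```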

also see:

http://effbot.org/zone/re-sub.htm#strip-html

> I've tried urllib.unencode and it doesn't work.

those are HTML/XML character references, not encoded URL characters.
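
To make the difference concrete, here is a small comparison using the Python 3 standard library (html.unescape and urllib.parse.unquote; neither is mentioned in the thread, so treat the choice of functions as an assumption). The sample strings are made up for illustration.

```python
from html import unescape
from urllib.parse import unquote

url_encoded = "%67%69"         # percent-encoding, as used in URLs
html_encoded = "&#103;&#105;"  # HTML/XML character references

print(unquote(url_encoded))    # gi
print(unescape(html_encoded))  # gi

# Using the wrong decoder leaves the input unchanged.
print(unquote(html_encoded))   # &#103;&#105;
print(unescape(url_encoded))   # %67%69
```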

</F>

Oct 17 '06 #2
Dave enlightened us with:
> How can I translate this:
>
> &#103;&#105;
>
> to this:
>
> "gi"
>
> I've tried urllib.unencode and it doesn't work.

As you put so nicely in the subject: it is HTML encoding, not URL
encoding. Those are two very different things! Try an HTML decoder,
you'll have more luck with that...
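
One way to do that with the Python 3 standard library is to feed the text through html.parser.HTMLParser, which decodes character references in text content when convert_charrefs is true. This is a sketch only; TextCollector and decode_html_text are hypothetical names made up for the example, and the sample markup is invented.

```python
from html.parser import HTMLParser

class TextCollector(HTMLParser):
    """Collects text content, with character references already decoded."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def decode_html_text(markup):
    parser = TextCollector()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)

print(decode_html_text("<p>&#103;&#105;</p> &amp; more"))  # gi & more
```

Note that this also drops the tags themselves, so it only fits when plain text is what you want back.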

Sybren
--
Sybren Stüvel
Stüvel IT - http://www.stuvel.eu/
Oct 17 '06 #3
Got it, great. This worked like a charm. I knew I was barking up the
wrong tree with urllib, but I didn't know which tree to bark up...

Thanks!

Oct 17 '06 #4
