Encoding HTML Entities

Well, today I needed to process some data for upload to a web page
and it needed higher ASCII characters encoded as HTML entities.

So, I wrote a function to do the job, which works with a table of
the entities.

The table and the function, along with sample data and a sample
query using the function, can be downloaded from:

http://www.bway.net/~dfassoc/downloa...MLEntities.zip

A couple of comments:

1. because Access is not case sensitive but the HTML entities *are*
case sensitive, there are a couple of little kludges:

a. the data table has a checkoff field to indicate if an entity
is an upper-case version of another entity. Examples would be
&aacute; and &Aacute;.

b. the function that does the replacing compares the character
being encoded to the upper case version of the same character. If
it's the same, it capitalizes the 2nd character of the entity
encoding.

So, what happens in the function is that it looks up only the lower
case version, then if the character being encoded is upper case, it
alters the entity to be uppercase. Example:

É [capital e acute]

will look up the entity definition:

&eacute;

Then the upper case version of the character being tested is
compared to the character itself:

If Asc(UCase(strChar)) = Asc(strChar) Then

and if it's equal, it converts the retrieved entity, "&eacute;", to
"&Eacute;".
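The capitalization step can be pulled out into a small helper to make it easier to test on its own. This is only a sketch: the name CapitalizeEntity is hypothetical and is not part of the download. It assumes the entity is passed in its lower-case form. It also adds one guard that the comparison quoted above does not have: the character must differ from its own lower-case version, so that caseless symbols are left alone.

```vba
' Hypothetical helper, not part of the original download.
' strEntity: a lower-case entity such as "&eacute;"
' strChar:   the single character being encoded
Public Function CapitalizeEntity(strEntity As String, _
    strChar As String) As String
  CapitalizeEntity = strEntity
  If Len(strEntity) < 2 Or Len(strChar) = 0 Then Exit Function
  ' The character equals its own upper-case version...
  If Asc(UCase(strChar)) = Asc(strChar) Then
    ' ...and differs from its lower-case version, so it is a real
    ' upper-case letter rather than a caseless symbol
    If Asc(LCase(strChar)) <> Asc(strChar) Then
      CapitalizeEntity = "&" & StrConv(Mid(strEntity, 2), vbProperCase)
    End If
  End If
End Function
```

Assuming a Windows-1252 code page, CapitalizeEntity("&eacute;", Chr(201)) should give "&Eacute;", while CapitalizeEntity("&copy;", Chr(169)) leaves "&copy;" unchanged. One limitation of capitalizing only the first letter: a few entities capitalize more than that (&AElig;, &ETH;, &THORN;), so those would need their own handling.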

The code for the function is after my signature below, and uses
Trevor Best's tLookup function in a version based on his old
version (he's since rewritten it significantly).
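For readers who do not have tLookup to hand, a simplified stand-in along the following lines would fit the way it is called in the function. This is NOT Trevor Best's code: the name tLookupSketch, the parameter names and the internals are all assumptions inferred from the call. The built-in DLookup(expr, domain, criteria) does a similar job, minus the database argument.

```vba
' Simplified stand-in for a tLookup-style function (an assumption,
' not Trevor Best's implementation). Returns the first matching
' value of strField from strTable, or Null if nothing matches.
Public Function tLookupSketch(strField As String, strTable As String, _
    Optional strCriteria As String = vbNullString, _
    Optional db As DAO.Database) As Variant
  Dim dbLocal As DAO.Database
  Dim rs As DAO.Recordset
  Dim strSQL As String

  tLookupSketch = Null
  ' Use the caller's database reference if one was supplied
  If db Is Nothing Then
    Set dbLocal = CurrentDb()
  Else
    Set dbLocal = db
  End If
  strSQL = "SELECT " & strField & " FROM " & strTable
  If Len(strCriteria) > 0 Then strSQL = strSQL & " WHERE " & strCriteria
  Set rs = dbLocal.OpenRecordset(strSQL, dbOpenSnapshot)
  If Not rs.EOF Then tLookupSketch = rs.Fields(0).Value
  rs.Close
  Set rs = Nothing
End Function
```

Because it returns Null when no row matches, the caller wraps the result in Nz(), as the function does.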

Note also that the code works either as a function or with a ByRef
variable passed to it.
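The two calling styles would look roughly like this. It is a sketch that assumes HTMLEntityReplace and tblHTMLEntities from the download are present in the current database; the procedure name DemoHTMLEntityReplace is made up for illustration.

```vba
' Sketch only: relies on HTMLEntityReplace and tblHTMLEntities
' being present in the current database.
Public Sub DemoHTMLEntityReplace()
  Dim varText As Variant
  Dim varResult As Variant

  ' 1. As a function: use the return value. The extra parentheses
  '    pass varText by value, so the original is left untouched.
  varText = "Caf" & Chr(233)     ' Chr(233) is e-acute in Windows-1252
  varResult = HTMLEntityReplace((varText))
  Debug.Print varResult

  ' 2. ByRef: ignore the return value; varText itself is changed
  Call HTMLEntityReplace(varText)
  Debug.Print varText
End Sub
```

If the table has a row for the e-acute character, both lines printed should contain its entity in place of the character. In a query, the field is passed as an expression, so only the return value matters there.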

[confidential to Steve J.: yes, I used a static variable -- I'm
changing my mind on this]

Commentary, suggestions and improvements welcome.

--
David W. Fenton http://www.bway.net/~dfenton
dfenton at bway dot net http://www.bway.net/~dfassoc

Public Function HTMLEntityReplace(varInput As Variant) As Variant
  ' Replaces each high-ASCII character in varInput with its HTML
  ' entity from tblHTMLEntities. Returns the encoded string and
  ' also writes it back to varInput (ByRef).
  Static db As DAO.Database
  Dim lngLen As Long
  Dim i As Long
  Dim lngOutputCounter As Long
  Dim strChar As String
  Dim strEntity As String
  Dim lngLenEntity As Long
  Dim strOutput As String

  lngLen = Len(Nz(varInput))
  If lngLen = 0 Or IsNull(varInput) Then GoTo exitRoutine
  If db Is Nothing Then Set db = CurrentDb()
  strOutput = varInput
  For i = 1 To lngLen
    lngOutputCounter = lngOutputCounter + 1
    strChar = Mid(varInput, i, 1)
    ' Character 128 is itself high-ASCII, so test for > 127
    If Asc(strChar) > 127 Then
      ' Look up only the lower-case form of the entity
      strEntity = Nz(tLookup("HTMLEntity", "tblHTMLEntities", _
         "[Letter]='" & strChar & "' AND [UCase]=False", db), _
         vbNullString)
      lngLenEntity = Len(strEntity)
      If lngLenEntity > 0 Then
        If Asc(UCase(strChar)) = Asc(strChar) Then
          ' Caseless symbols (e.g. the copyright sign) also equal
          ' their upper-case form; capitalize real letters only
          If Asc(LCase(strChar)) <> Asc(strChar) Then
            strEntity = "&" & StrConv(Mid(strEntity, 2), _
               vbProperCase)
          End If
        End If
        ' Splice the entity in place of the single character
        strOutput = Left(strOutput, lngOutputCounter - 1) _
           & strEntity & Mid(strOutput, lngOutputCounter + 1)
        lngOutputCounter = lngOutputCounter + lngLenEntity - 1
      End If
    End If
  Next i
  varInput = strOutput
  HTMLEntityReplace = strOutput

exitRoutine:
  Exit Function

End Function
Nov 12 '05 #1