Removing obscure chars

Hi All

I have an ASP function in place to strip invalid chars out of a data store
before I create an XML file of this data, but my function doesn't work on a
certain set of chars.

As far as I can see these are the following:

a) trademark char
b) long hyphen/dash char
c) smart/curly quotes (both left and right)

Even though my function is set up as follows:

Function ReFormatStringForXML(s)
    IF LEN(s) > 0 AND NOT IsNull(s) THEN
        s = Replace(s,"™","&trade;")
        s = Replace(s,"—","-")
        s = Replace(s,"’","&quot;")
        s = Replace(s,"'","&quot;")
        s = Replace(s,"""","&quot;")
        s = Replace(s,"&","&amp;")
        s = Replace(s,"<","&lt;")
        s = Replace(s,">","&gt;")
    END IF
    ReFormatStringForXML = s
End Function

These chars still pass by and foul up my XML file.

I have a feeling that it's down to the fact that my function is looking for
the html equiv rather than the actual char, but I can't possibly get away
with simply copying and pasting these friggin(!!) chars into my function.
Surely this is bad practice?

Does anybody know how I can trap and replace/remove these chars if need be?

Thanks

Apr 3 '07 #1
Gazing into my crystal ball I observed "Yobbo" <in**@NoSpamIt.com>
writing in news:ug**************@TK2MSFTNGP05.phx.gbl:
> c) smart/curly quotes (both left and right)

I detest these "smart" quotes. Are regular quotes dumb by comparison?

> I have a feeling that its down to the fact that my function is looking
> for the html equiv rather than the actual char, but I can't possibly
> get away with simply copy and pasting these friggin(!!) chars into my
> function. Surely this is bad practise?

You are putting in the HTML entity; you may need to put in the character
itself, by its character code, instead. For example:
s = Replace(s, Chr(60), "&lt;")
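
As a minimal sketch of that idea, assuming the data reaches the function as an ordinary (Unicode) VBScript string, the characters from the original post can be matched by code point with ChrW instead of being pasted into the source. The name ReplaceSpecialChars and the choice of replacements are illustrative only, not the original poster's code. Two details matter for XML: & is escaped first so that the references added afterwards are not escaped a second time, and the trademark sign becomes the numeric reference &#8482; because &trade; is an HTML entity that XML does not predefine.

```vbscript
' Hypothetical helper: matches the troublesome characters by Unicode code point.
Function ReplaceSpecialChars(ByVal s)
    If IsNull(s) Then
        ReplaceSpecialChars = ""
        Exit Function
    End If
    ' Escape & first so the references added below are not double-escaped.
    s = Replace(s, "&", "&amp;")
    s = Replace(s, "<", "&lt;")
    s = Replace(s, ">", "&gt;")
    s = Replace(s, ChrW(&H2122), "&#8482;")  ' trademark sign
    s = Replace(s, ChrW(&H2014), "-")        ' em dash
    s = Replace(s, ChrW(&H2013), "-")        ' en dash
    s = Replace(s, ChrW(&H2018), "'")        ' left single quote
    s = Replace(s, ChrW(&H2019), "'")        ' right single quote
    s = Replace(s, ChrW(&H201C), "&quot;")   ' left double quote
    s = Replace(s, ChrW(&H201D), "&quot;")   ' right double quote
    ReplaceSpecialChars = s
End Function
```

Passing the argument ByVal keeps the caller's variable unchanged, which the default ByRef behaviour would not.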

HTH

--
Adrienne Boswell at Home
Arbpen Web Site Design Services
http://www.cavalcade-of-coding.info
Please respond to the group so others can share

Apr 4 '07 #2
Yobbo wrote on Tue, 3 Apr 2007 18:17:59 +0100:
> Does anybody know how I can trap and replace/remove these chars if need
> be?

Your function is quite limited. What happens when a character not in your
list appears? The XML supported entity list is pretty small.

Here's the function I use in my own XML generation code. It's crude, but it works:

Function XMLEncode(strText)
    ' Loop through the text and replace every character other than a space,
    ' a hyphen, a digit, or a letter with an entity or its numeric code.
    Dim i, j, strNewText
    strNewText = ""

    For i = 1 To Len(strText)

        j = Asc(Mid(strText, i, 1))

        If j = 10 Then
            ' line feed: replace with an escaped <br>
            strNewText = strNewText & "&lt;br&gt;"
        ElseIf j = 13 Or j = 9 Then
            ' carriage return or tab: strip it
        ElseIf j = 34 Then ' "
            strNewText = strNewText & "&quot;"
        ElseIf j = 39 Then ' '
            strNewText = strNewText & "&apos;"
        ElseIf j = 32 Or j = 45 Or (j >= 48 And j <= 57) Or _
               (j >= 65 And j <= 90) Or (j >= 97 And j <= 122) Then
            ' space, hyphen, digit (0-9), or letter: keep as-is
            ' (the lower bound for digits was 49, which skipped "0")
            strNewText = strNewText & Mid(strText, i, 1)
        ElseIf j = 38 Then ' &
            strNewText = strNewText & "&amp;"
        ElseIf j = 60 Then ' <
            strNewText = strNewText & "&lt;"
        ElseIf j = 62 Then ' >
            strNewText = strNewText & "&gt;"
        Else
            ' anything else: numeric character reference
            strNewText = strNewText & "&#" & j & ";"
        End If

    Next

    XMLEncode = strNewText
End Function
This checks each character in the string in turn, replaces some with
entities, and replaces every other character that is not a space, hyphen,
digit, or letter with its numeric value. You could easily add a few more
entity replacements as required. Just watch out for the first couple of
replacements, where I replace line feeds with a <br> and strip out carriage
returns and tabs, as that might not fit what you want to do with the XML
yourself.
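
One caveat, sketched here under the assumption that the server uses the Windows-1252 code page: Asc returns the character's code in the ANSI code page, so the trademark sign comes back as 153, and &#153; in XML refers to a control character rather than to the trademark sign (U+2122). A variant built on AscW, which returns the UTF-16 code unit, avoids that. The name XMLEncodeW is hypothetical, and unlike the function above this version leaves tabs and line breaks in place and passes all printable ASCII through. It does not handle characters outside the Basic Multilingual Plane, which VBScript stores as two code units.

```vbscript
' Hypothetical variant of XMLEncode that works on Unicode code units.
Function XMLEncodeW(ByVal strText)
    Dim i, ch, code, out
    out = ""
    If IsNull(strText) Then strText = ""

    For i = 1 To Len(strText)
        ch = Mid(strText, i, 1)
        code = AscW(ch)
        ' AscW returns a signed 16-bit value; map negatives to 32768..65535.
        If code < 0 Then code = code + 65536

        If code = 38 Then                       ' &
            out = out & "&amp;"
        ElseIf code = 60 Then                   ' <
            out = out & "&lt;"
        ElseIf code = 62 Then                   ' >
            out = out & "&gt;"
        ElseIf code = 34 Then                   ' "
            out = out & "&quot;"
        ElseIf code = 39 Then                   ' '
            out = out & "&apos;"
        ElseIf code = 9 Or code = 10 Or code = 13 Then
            out = out & ch                      ' whitespace XML allows as-is
        ElseIf code < 32 Then
            ' other control characters are not legal in XML 1.0: drop them
        ElseIf code < 127 Then
            out = out & ch                      ' printable ASCII
        Else
            out = out & "&#" & code & ";"       ' numeric character reference
        End If
    Next

    XMLEncodeW = out
End Function
```

With this version the trademark sign is written as &#8482; and an em dash as &#8212;, whatever the server's code page.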

Dan
Apr 4 '07 #3

"Yobbo" <in**@NoSpamIt.com> wrote in message
news:ug**************@TK2MSFTNGP05.phx.gbl...
> Does anybody know how I can trap and replace/remove these chars if need
> be?

If you are creating an XML file, could you use a DOMDocument to build it and
save it? That will ensure well-formed XML is created.
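
A rough sketch of what that could look like, assuming MSXML 6 is installed on the server (the ProgID, the element names "items" and "item", and the function name BuildXml are all assumptions for illustration). Text assigned through the DOM is stored as plain text, and the parser takes care of escaping when the document is serialized.

```vbscript
' Sketch: let MSXML do the escaping instead of hand-written Replace calls.
' The element names and the function name are made up for this example.
Function BuildXml(ByVal text)
    Dim doc, root, item
    Set doc = CreateObject("MSXML2.DOMDocument.6.0")

    Set root = doc.createElement("items")
    doc.appendChild root

    Set item = doc.createElement("item")
    item.text = text        ' stored as text; markup characters are escaped on output
    root.appendChild item

    BuildXml = doc.xml      ' serialized XML as a string
End Function
```

To write a file instead of returning a string, the document's save method takes a path, for example doc.save Server.MapPath("data.xml") in an ASP page, where "data.xml" is a placeholder name.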

Apr 5 '07 #4
