Hi,
I've created a UTF-8 encoded RSS feed that presents news data drawn from a database. I've set all aspects of my database to UTF-8, and I also saved the text I put into the database as UTF-8 by pasting it into Notepad and saving it as UTF-8. So everything should be encoded in UTF-8 when the RSS feed is presented to the browser, but I am still getting the weird question mark characters for pound signs :(
Here is my RSS feed code (ColdFusion):
<cfsilent>
    <!--- Get News --->
    <cfinvoke component="com.news" method="getAll" dsn="#Request.App.dsn#" returnvariable="news" />
</cfsilent>
<!--- If we have news items --->
<cfif news.RecordCount GT 0>
    <!--- Serve RSS content-type --->
    <cfcontent type="application/rss+xml">
    <!--- Output feed --->
    <cfcontent reset="true"><?xml version="1.0" encoding="utf-8"?>
    <cfoutput>
    <rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>News RSS Feed</title>
        <link>#Application.siteRoot#</link>
        <description>Welcome to the News RSS Feed</description>
        <lastBuildDate>Wed, 19 Nov 2008 09:05:00 GMT</lastBuildDate>
        <language>en-uk</language>
        <atom:link href="#Application.siteRoot#news/rss/index.cfm" rel="self" type="application/rss+xml" />
        <cfloop query="news">
            <!--- Make data xml compliant --->
            <cfscript>
                news.headline = replace(news.headline, "<", "&lt;", "ALL");
                news.body = replace(news.body, "<", "&lt;", "ALL");
                news.date = dateformat(news.date, "ddd, dd mmm yyyy");
                news.time = timeformat(news.time, "HH:mm:ss") & " GMT";
            </cfscript>
            <item>
                <title>#news.headline#</title>
                <link>#Application.siteRoot#news/index.cfm?id=#news.id#</link>
                <guid>#Application.siteRoot#news/index.cfm?id=#news.id#</guid>
                <pubDate>#news.date# #news.time#</pubDate>
                <description>#news.body#</description>
            </item>
        </cfloop>
    </channel>
    </rss>
    </cfoutput>
<cfelse>
    <!--- If we have no news items, relocate to news page --->
    <cflocation url="../news/index.cfm" addtoken="no">
</cfif>
Has anyone any suggestions? I've done loads of research but can't find the right answers :(
Thanks in advance,
Chromis
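One thing that may be worth checking in a template like the one above, offered as a guess rather than a diagnosis: the charset can be declared to ColdFusion itself as well as in the XML declaration. The fragment below is a minimal sketch of that idea using the standard cfprocessingdirective and cfcontent tags. Whether it cures the pound sign problem is an assumption, since the text comes from the database and the datasource settings are not shown in this thread.

```cfml
<!--- Tell ColdFusion that this template file is itself saved as UTF-8 --->
<cfprocessingdirective pageencoding="utf-8">

<!--- Send the charset in the Content-Type header, not only in the XML declaration --->
<cfcontent type="application/rss+xml; charset=utf-8" reset="true"><?xml version="1.0" encoding="utf-8"?>
```

If the header and the XML declaration disagree, a feed reader may decode the bytes using the wrong charset, which typically shows up as question marks or stray characters.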
Does it help if you use &pound;? (I know that may be only a workaround, but from the code I can tell nothing without the actual feed.)
Is £ the only character outside the ASCII charset? Maybe your generator has some problems with UTF-8 or doesn't know which charset to use...
regards
PS please don't post your questions in the insights section, ask a moderator to move it to the answers section.
Hi Dormilich, thanks for your reply. My apologies, I am aware of the answers section but I accidentally put it in here; it's very easy to make the mistake, sadly (preferred the old layout!).
Yes, the only bad character is the pound sign. I've tried replacing it manually in the database, presuming that would replace the character with the UTF-8 equivalent, but it didn't work. If I use &pound; it breaks the feed. I could use CDATA, but I need to display paragraph formatting, and using CDATA displays the p element tags.
&pound; breaks your feed because it's an undefined entity in XML (you'd need a DTD to fix that). Have you tried the numeric character reference &#163;? This should not break the feed.
regards
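To illustrate the numeric reference approach: XML predefines only five named entities (&lt;, &gt;, &amp;, &apos; and &quot;), but a numeric character reference such as &#163; is valid in any XML document without a DTD. Below is a minimal sketch of a helper that converts every character above ASCII 127 into such a reference. The function name toNumericRefs is hypothetical (it is not from this thread or from ColdFusion itself), and it assumes the input contains only characters from the Basic Multilingual Plane.

```cfml
<cfscript>
// Hypothetical helper, not part of the original code in this thread.
// Replaces each character above ASCII 127 with a numeric character
// reference, e.g. the pound sign (code point 163) becomes &#163;
function toNumericRefs(text) {
    var out = "";
    var i = 0;
    var ch = "";
    for (i = 1; i LTE len(arguments.text); i = i + 1) {
        ch = mid(arguments.text, i, 1);
        if (asc(ch) GT 127) {
            // "##" is how a literal # is written inside a CFML string
            out = out & "&##" & asc(ch) & ";";
        } else {
            out = out & ch;
        }
    }
    return out;
}
</cfscript>
```

For example, toNumericRefs("Cost: " & chr(163) & "5") should return Cost: &#163;5. A helper like this would need to run after any step that escapes ampersands (such as XMLFormat), otherwise the & in each reference gets escaped a second time.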
OK, I've replaced all occurrences of £ with &#163; and it now works great, thanks! Why would the pound sign not be recognised, though? Do you think that when I saved the file as UTF-8 it didn't convert the character properly?
Ideally I would like to create a function in ColdFusion which will doctor text and make it UTF-8 compliant. Do you know the best way to do this?
I am most of the way there with the following function. Apart from putting some code in to replace the pound signs, in what other ways could I improve it?
<cfcomponent>
    <cffunction name="CustomParagraphFormatXMLSafe" access="public" returntype="string">
        <cfargument name="paragraph" type="string" required="yes">
        <cfscript>
            /**
             * Returns a XHTML string suitable for insertion into a database in the UTF-8 encoding format.
             * The string is then wrapped with opening and closing paragraph tags whilst ignoring list elements.
             *
             * @param paragraph String you want XHTML / XML formatted.
             * @return Returns a string.
             * @author ****
             * @version 1.0, December 10th, 2008
             */
            var returnValue = '';
            var newParagraph = arguments.paragraph;
            var sqlList = "-- ,'";
            var replacementList = "#chr(38)##chr(35)##chr(52)##chr(53)##chr(59)##chr(38)##chr(35)##chr(52)##chr(53)##chr(59)# , #chr(38)##chr(35)##chr(51)##chr(57)##chr(59)##chr(163)#";

            /* Replace pound signs */
            Replace(newParagraph,"£","&##163;");

            /* Make sql safe */
            newParagraph = trim(replaceList( newParagraph , sqlList , replacementList ));

            /* Make XML and UTF-8 Safe */
            newParagraph = XMLFormat(CharsetEncode(CharsetDecode(newParagraph,"utf-8"),"utf-8"));

            /* Break into paragraphs */
            newParagraph = ListToArray(newParagraph,Chr(13) & Chr(10));
            newParagraphCount = ArrayLen(newParagraph);

            for(i=1;i LTE newParagraphCount;i=i+1) {
                //WriteOutput(newParagraph[i]);

                /* Ignore blank lines */
                if(newParagraph[i] NEQ "") {

                    /* Remove excess paragraph elements */
                    REReplace(newParagraph[i], "<?p*>", "", "All");

                    /* Loop through array of paragraphs wrapping in p elements, skipping list elements */
                    containsList = REFind("<\/?ul[^>]*>$|<\/?li[^>]*>",newParagraph[i]); //
                    if(containsList EQ 0) {
                        returnValue = returnValue & "<p>" & newParagraph[i] & "</p>" & Chr(13) & Chr(10);
                    }
                    else {
                        returnValue = returnValue & newParagraph[i] & Chr(13) & Chr(10);
                    }
                }
            }
            return trim(returnValue);
        </cfscript>
    </cffunction>
</cfcomponent>
@chromis
This is a question better suited to the ColdFusion forum. I have never used CF and I'm probably no help there...
regards
OK, thanks anyway, I'll ask in the CF forum.
I've moved your thread to the ColdFusion forum.
Hopefully you'll get more help here.
-Moderator Frinny