how to stuff HTML into RSS??

Some friends and I are working on some PHP-based templates for web
pages. We have templates that look like this (simplified):

<html>
<head>
<title>
The green and blue design for carpentry companies
</title>
</head>
<body>
<?php showMainContent(); ?>
<div style="width:200px; float:right">
<?php showLinkArea(3); ?>
</div>
</body>
</html>
I'd like to publish all the templates in our database in an RSS feed so
it will be easier to import them on other sites. Does it screw things
up if I stuff HTML into the DESCRIPTION tag on an RSS .91 feed?

Jul 20 '05 #1
On 2 Dec 2004 01:03:34 -0800, lk******@geocities.com wrote:
> Does it screw things
> up if I stuff HTML into the DESCRIPTION tag on an RSS .91 feed?


It's not what you stuff, it's how you stuff it.

You should encode HTML, so that

<description><p>Some <b>HTML</b> in RSS</p></description>

becomes this

<description>&lt;p&gt;Some &lt;b&gt;HTML&lt;/b&gt; in
RSS&lt;/p&gt;</description>

Watch out as well for & (becomes &amp;) and for &eacute; etc. (turn
them into the equivalent numeric character reference, e.g. &#233;).
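
As a rough illustration of the encoding described above, here is a minimal sketch in Python. The original poster works in PHP, so treat it as a model of the steps, not drop-in code. The names `named_to_numeric` and `encode_for_description` are made up for this example, and converting named entities before escaping is only one reading of the advice.

```python
import re
from html.entities import name2codepoint
from xml.sax.saxutils import escape

# Entity names that XML itself predefines; these are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def named_to_numeric(html_fragment):
    """Turn HTML-only named entities (e.g. &eacute;) into numeric references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave unknown or XML-native names alone
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, html_fragment)


def encode_for_description(html_fragment):
    """Escape &, < and > so the fragment can sit inside <description> as text."""
    return escape(named_to_numeric(html_fragment))


item = "<description>%s</description>" % encode_for_description(
    "<p>Some <b>HTML</b> in RSS</p>"
)
```

With this, `encode_for_description("<p>Some <b>HTML</b> in RSS</p>")` gives the escaped form shown above, on a single line.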

I'd also suggest that you make your HTML fragments into well-formed,
balanced XHTML fragments before you embed them (lower case element
names, close open elements). Although this isn't required, it can make
life easier with XML toolsets.

This stuff isn't hard to do, but it's very poorly documented. There
are many RSS versions, and few of them describe it fully. This is a
useful read:
http://diveintomark.org/archives/200...compatible-rss
I'd also avoid the obsolete RSS 0.91 in favour of RSS 1.0 (far
better), or you might prefer the more popular RSS 2.0.

--
Smert' spamionam
Jul 20 '05 #2
> You should encode HTML, so that
>
> <description><p>Some <b>HTML</b> in RSS</p></description>
>
> becomes this
>
> <description>&lt;p&gt;Some &lt;b&gt;HTML&lt;/b&gt; in
> RSS&lt;/p&gt;</description>

Hi,

I don't know anything about RSS, but wouldn't it be easier and more logical to insert the XHTML as elements using namespaces? And if that isn't possible yet, shouldn't it become possible?

regards,
--
Joris Gillis (http://www.ticalc.org/cgi-bin/acct-v...i?userid=38041)
Ceterum censeo XML omnibus esse utendum
Jul 20 '05 #3
On Thu, 02 Dec 2004 15:41:32 GMT, "Joris Gillis" <ro**@pandora.be>
wrote:
> I don't know anything about RSS,

I suggest you read the Dive Into Mark article. It gives a good
explanation of some of the background to this.
http://diveintomark.org/archives/200...compatible-rss

RSS has suffered because of too many standards, and especially because
these standards have generally been poorly specified. In particular
there is no clear guidance on how to embed HTML content within an RSS
item.

A problem with RSS, and all such protocols that try to become an open
publication medium, is that many creators will make content and many
consumers will try to read it. Where the spec isn't exhaustive on how
it _must_ be done, then a situation soon develops of de facto
behaviour for how it _is_ done. Readers become dependent on this, and
you diverge from it at your peril.

> but wouldn't it be easier and more logical to insert the XHTML as elements using namespaces?

That's an attractive option. However it's not a viable one.
There are several reasons:

Namespacing relies on using XHTML, and you may wish to include HTML
_as_HTML_ not XHTML. Some consumers may be confused if they receive
XHTML.

Namespacing relies on including a balanced fragment (i.e. one that can
be well-formed as an XML fragment). This wasn't a requirement on the
original RSS/HTML enclosure, so this is hard to re-impose in some
cases (<a name="..." > is one of the more awkward cases to deal
with).

RSS is not an XML protocol. Successive versions of badly-written specs
have clouded this. There are all sorts of references to "ASCII" when
it should really be CDATA. It's commonplace to include HTML entities,
even when these aren't valid outside the HTML DTD. Reliable parsing
of RSS from external sources is a mess, and it often relies on
knife-and-fork parsing with non-XML tools. It's not reliable to
assume good support for standard XML features if you're working with
external feeds, even though you "should" be able to do this.
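
The point about HTML entities can be seen with any strict XML parser. A small sketch, assuming Python's standard `xml.etree.ElementTree` as the parser; `parses_as_xml` is a hypothetical helper name used only here:

```python
import xml.etree.ElementTree as ET


def parses_as_xml(document):
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# &eacute; is defined by the HTML DTD, not by XML, so this is rejected.
with_html_entity = "<description>caf&eacute;</description>"

# A numeric character reference needs no DTD and is accepted.
with_numeric_ref = "<description>caf&#233;</description>"
```

Here `parses_as_xml(with_html_entity)` is false and `parses_as_xml(with_numeric_ref)` is true, which is why feeds carrying bare HTML entities tend to push consumers towards non-XML parsing.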

> And if that wouldn't be possible yet, shouldn't it become possible?


RSS is old. It's post-XML, but pre-XHTML and (arguably)
pre-namespacing. So even if a namespaced approach became widespread,
consumers should (strongly) keep supporting the old way if they still
want to accept content supplied that way.

I use namespaced content for internal RSS feeds within my projects,
where I always use RSS 1.0. For external work though, I encode plain
HTML. I use balanced fragments, so I close elements like <p>...</p>,
but I don't use the <br /> form for <br>.
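
One way to check that a fragment is balanced in the sense used here (elements closed, but <br> left in its HTML form) is a small tag stack. This is only a sketch: the `BalanceChecker` class, the `is_balanced` function and the short list of void elements are assumptions made for the example, not anything taken from an RSS spec.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML (a partial list, assumed
# sufficient for this illustration).
VOID_ELEMENTS = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class BalanceChecker(HTMLParser):
    """Track open elements on a stack and flag mismatched closing tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.balanced = True

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        # A closing tag must match the most recently opened element.
        if not self.stack or self.stack.pop() != tag:
            self.balanced = False


def is_balanced(fragment):
    checker = BalanceChecker()
    checker.feed(fragment)
    checker.close()
    # Balanced means no mismatches and nothing left open at the end.
    return checker.balanced and not checker.stack
```

Under these assumptions, a fragment such as `<p>line one<br>line two</p>` counts as balanced, while `<p>Some <b>HTML</b> in RSS` (no closing `</p>`) does not.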

--
Smert' spamionam
Jul 20 '05 #4
On Thu, 02 Dec 2004 20:30:17 +0000, Andy Dingley <di*****@codesmiths.com> wrote:
> [full quote of post #4 snipped]

Now that's what I call a valuable reply :-)
Thank you very much.

--
Joris Gillis (http://www.ticalc.org/cgi-bin/acct-v...i?userid=38041)
Ceterum censeo XML omnibus esse utendum
Jul 20 '05 #5
lk******@geocities.com wrote:
> [template example snipped]
> Does it screw things
> up if I stuff HTML into the DESCRIPTION tag on an RSS .91 feed?

Yes. Implementations of RSS readers are almost all hopelessly broken and
non-conformant, and the RSS "spec" -- such as it is -- has been so kicked
about and bastardised as to be virtually worthless except as a carrier
format like HTML. There were plans to make a newer, better version, but
like HTML it has now become so fossilised that it's not worth changing.

///Peter
--
"The cat in the box is both a wave and a particle"
-- Terry Pratchett, introducing quantum physics in _The Authentic Cat_
Jul 20 '05 #6
Thank you for your in-depth reply. I've already read Mark's article and
one thing I got from it was that it didn't matter much which version of
RSS you used, they were all broken.

For now I'm in the lucky position of being the consumer of my own
output. We have some HTML templates we'd like to publish, but we are
publishing them for people who have our software, so we control the
source and the point of consumption. I'd love to eventually use a richer
RSS but I'm short on time this month and so I'd like to reuse what PHP
code we already have written and tested. The code we have puts out
valid RSS .91.

To publish an HTML template in the description tag of RSS, should I
just wrap it in a CDATA section? Or escape it, as someone above remarked?
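
For comparison, here is a small sketch of the two options in Python, assuming the consumer uses a conforming XML parser (the standard `xml.etree.ElementTree` here). The `cdata` helper is a hypothetical name; it splits any `]]>` in the content, since that sequence cannot appear inside a single CDATA section.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Part of the template from the first post, PHP call included.
template = '<div style="width:200px; float:right"><?php showLinkArea(3); ?></div>'


def cdata(text):
    """Wrap text in a CDATA section, splitting any embedded ']]>'."""
    return "<![CDATA[%s]]>" % text.replace("]]>", "]]]]><![CDATA[>")


# Option 1: escape &, < and > as character data.
escaped_item = "<item><description>%s</description></item>" % escape(template)

# Option 2: wrap the raw markup in a CDATA section.
cdata_item = "<item><description>%s</description></item>" % cdata(template)

# A conforming XML parser hands back the same string either way.
escaped_text = ET.fromstring(escaped_item).findtext("description")
cdata_text = ET.fromstring(cdata_item).findtext("description")
```

Both `escaped_text` and `cdata_text` come back equal to `template`, so when you control both ends and parse with real XML tools the choice is mostly one of convenience. Consumers that do not use an XML parser may treat the two forms differently.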

Jul 20 '05 #7
