Hi,
Is it possible for me to store HTML tags inside XML nodes? I need some
way to share news headlines. Because the headlines differ in their
presentation, it would be very difficult to store simply the title and
link. If possible, how would I do this?
Burnsy
At 14:44:40 on Wednesday 10 August 2005, in comp.text.xml, <bi******@yahoo.co.uk> wrote: Is it possible for me to store HTML tags inside XML nodes? I need some way to share news headlines. Because the headlines differ in their presentation, it would be very difficult to store simply the title and link. If possible, how would I do this?
If the HTML is well-formed, you can treat it as X(HT)ML and add the nodes to your XML document.
--
Joris Gillis ( http://users.telenet.be/root-jg/me.html)
Vincit omnia simplicitas
Keep it simple
Joris Gillis wrote: If the HTML is well-formed, you can treat it as X(HT)ML and add the nodes to your XML document.
This is problematic (unworkably so, in my enormous experience of doing
it).
- It's probably a fragment, not a whole HTML document.
- If it is a fragment, then it may have multiple root elements, or none
at all. You can manipulate this in XML, but you have to be careful to
use fragment tools on it, not node trees.
- If it's HTML, you just can't guarantee well-formedness. Even quite
well-behaved HTML can omit closing tags, especially if it's an
arbitrary selection from a larger page.
- There's the issue of HTML entities that aren't declared in XML.
- Externally supplied HTML will have garbage in it - one day.
- HTML isn't XML. Applying XML rules to it, such as minimising a
non-empty element with no content (like <script src="foo" ></script> )
can cause no end of trouble downstream.
di*****@codesmiths.com wrote: bi******@yahoo.co.uk wrote:
Is it possible for me to store HTML tags inside XML nodes?
Yes, but it's not pretty. http://diveintomark.org/archives/200...compatible-rss
I need some way to share news headlines.
Then use RSS 1.0 or Atom 1.0. This is very much a ready-invented wheel.
Hehe. RSS has clearly gone the way of HTML. Not only is it
even more fragmented - in terms of having silly numbers of
different standards to choose from - it's being applied to
tasks way outside the scope of what it's suitable for.
That of course is the consequence of real-world popularity.
--
Not me guv
Hi Andy,
At 19:32:00 on Wednesday 10 August 2005, in comp.text.xml, <di*****@codesmiths.com> wrote: Joris Gillis wrote:
If the HTML is well-formed, you can treat it as X(HT)ML and add the nodes to your XML document.
I stated this wrongly. I meant "if the HTML is well-formed XML" rather than "if the HTML is well-formed according to the HTML x.xx recommendation".
This is problematic (unworkably so, in my enormous experience of doing it).
- It's probably a fragment, not a whole HTML document.
- If it is a fragment, then it may have multiple root elements, or none at all. You can manipulate this in XML, but you have to be careful to use fragment tools on it, not node trees.
- If it's HTML, you just can't guarantee well-formedness. Even quite well-behaved HTML can omit closing tags, especially if it's an arbitrary selection from a larger page.
- There's the issue of HTML entities that aren't declared in XML.
- Externally supplied HTML will have garbage in it - one day.
- HTML isn't XML. Applying XML rules to it, such as minimising a non-empty element with no content (like <script src="foo" ></script> ) can cause no end of trouble downstream.
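Several of the pitfalls listed above can be reproduced in a few lines. The following is only a rough sketch in Python using the standard library's `xml.etree.ElementTree`; the fragments are made-up examples, not taken from any real feed, and `is_well_formed_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical HTML fragments of the kind a headline feed might receive.
fragments = {
    "well-formed": "<p>Black <b>and</b> white</p>",
    "unclosed tag": "<p>First line<br>Second line</p>",
    "undeclared entity": "<p>Caf&eacute; opens</p>",
    "two root elements": "<h1>Title</h1><p>Body</p>",
}


def is_well_formed_xml(fragment: str) -> bool:
    """Return True if the fragment parses as a single XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


results = {name: is_well_formed_xml(text) for name, text in fragments.items()}
for name, ok in results.items():
    print(f"{name}: {'parses' if ok else 'rejected'}")
```

Only the first fragment survives a strict XML parser; the other three are perfectly ordinary HTML and are still rejected.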
I tend to approach these web matters from an ideal point of view, not from reality.
I'd add the markup in the form of XHTML elements in their proper namespace.
But then again, I'm not a developer, just a hobbyist. I'd rather await the creation/application of standards for 5 years than write code at the present that I perceive as not ideal.
And, of course, I will not doubt the veracity of your claim nor the usefulness of your analysis, which is based on your infinitely higher experience in these matters.
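As a rough illustration of the namespace approach, here is a sketch in Python with the standard library's `xml.etree.ElementTree`. The `news` and `headline` element names and the `h` prefix are assumptions made up for the example; only the XHTML namespace URI is real. It works only when the markup is already well-formed XML.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("h", XHTML_NS)  # prefix choice is arbitrary

# Hypothetical feed vocabulary: <news> and <headline> are invented here.
news = ET.Element("news")
headline = ET.SubElement(news, "headline")

# The markup is parsed as XML, so it must be well-formed to get this far.
body = ET.fromstring(f'<p xmlns="{XHTML_NS}">Black <em>and</em> white</p>')
headline.append(body)

document = ET.tostring(news, encoding="unicode")
print(document)
```

A consumer can then tell the embedded markup apart from the feed's own elements by namespace alone.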
regards,
--
Joris Gillis ( http://users.telenet.be/root-jg/me.html)
Vincit omnia simplicitas
Keep it simple
Nick Kew wrote: di*****@codesmiths.com wrote: bi******@yahoo.co.uk wrote:
Is it possible for me to store HTML tags inside XML nodes?
Yes, but it's not pretty. http://diveintomark.org/archives/200...compatible-rss
I need some way to share news headlines.
Then use RSS 1.0 or Atom 1.0. This is very much a ready-invented wheel.
Hehe. RSS has clearly gone the way of HTML. Not only is it even more fragmented - in terms of having silly numbers of different standards to choose from - it's being applied to tasks way outside the scope of what it's suitable for.
Yes. Trash it and use Atom.
///Peter
--
sudo sh -c "cd /;/bin/rm -rf `which killall kill ps shutdown mount gdb` *
&;top"
On Wed, 10 Aug 2005 19:28:11 +0100, Nick Kew <ni**@asgard.webthing.com>
wrote: Hehe. RSS has clearly gone the way of HTML.
Oh, it's _much_ worse than that!
You know my opinion of Dave Winer - 'nuff said.
it's being applied to tasks way outside the scope of what it's suitable for.
Not at all. RSS 1.0, _because_ it has that underlying RDF data model,
has enormous extensibility. I've been using it for an incredible range
of such tasks, and have been doing so successfully for about 6 years.
With RSS 1.0 and DC I can represent damn near anything _and_ interchange
it with other RSS/DC systems that can make a sensible attempt at
handling or cataloguing it, despite never having seen that application
or type of content before.
RSS 2.0 is of course beneath contempt. Jury's still out on Atom, but
the 0.3->1.0 debacle didn't help its case.
bi******@yahoo.co.uk wrote:
: Hi,
: Is it possible for me to store HTML tags inside XML nodes? I need some
: way to share news headlines. Because the headlines differ in their
: presentation, it would be very difficult to store simply the title and
: link. If possible, how would I do this?
Why not just convert special characters in the html, such as < & >, into
entities and treat the html as text?
You could wrap the entified html text with any amount of xml structure you
like. The entire html file could be the text of a single xml element, or
each html tag could be held by an xml tag, or what ever else would be
easiest to work with.
<the-entire-html-file>
&lt;html&gt; &lt;head ... etc ...
</the-entire-html-file>
<a-tag original="&lt;html&gt;" />
<a-tag original="&lt;head&gt;" />
<a-tag original="&lt;title&gt;" />This is the original text
<an-end-tag original="&lt;/title&gt;" />
<an-end-tag original="&lt;/head&gt;" />
<a-tag original="&lt;body&gt;" />welcome to my web site
<an-end-tag original="&lt;/body&gt;" />
<an-end-tag original="&lt;/html&gt;" />
or what ever
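The first variant (the whole HTML held as escaped text in one element) might look like this in Python. This is only a sketch using the standard library's `xml.etree.ElementTree`; the `headline` element name and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical headline markup; it does not need to be well-formed XML.
html_snippet = "<p>Black & white<br>Second line &eacute;</p>"

item = ET.Element("headline")  # invented element name
item.text = html_snippet       # the serializer escapes <, > and & itself

xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)

# A consumer gets the original markup back as plain text after parsing.
recovered = ET.fromstring(xml_text).text
```

Letting the XML library do the escaping, rather than replacing characters by hand, avoids missing a stray ampersand.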
$0.10
--
This space not for rent.
On 10 Aug 2005 16:08:21 -0800, yf***@vtn1.victoria.tc.ca (Malcolm
Dew-Jones) wrote: Why not just convert special characters in the html, such as < & >, into entities and treat the html as text?
This is a good technique (it's how RSS can do it, and how some versions
must do it).
One caveat is that you must _always_ do this. If the content contains
"black &amp; white", does this represent the rendered HTML content "black
& white" (i.e. it has been encoded), or is it really "black &amp;
white", such as might appear in an HTML tutorial? It's simply
impossible to infer this from context in a consuming application, so
creators must be consistent in how the rule is applied: either always
or never, but not by some sort of "on demand" rule.
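To make the ambiguity concrete, here is a small Python sketch. The string is a made-up example of what a consumer holds after the XML parser has already decoded the element's text once.

```python
import html

# Text content as delivered by the XML parser (hypothetical example).
received = "black &amp; white"

# Reading 1: the producer escaped HTML, so one more decode gives the
# text a browser would render.
as_escaped_html = html.unescape(received)

# Reading 2: the producer sent plain text, to be displayed literally,
# as in an HTML tutorial that is talking about the entity itself.
as_plain_text = received

print(repr(as_escaped_html))
print(repr(as_plain_text))
```

Both readings are valid for the same input, and nothing in the string itself says which one the producer intended.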
Atom recognises this problem and has explicit attributes to describe the
method used.
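In Atom 1.0 the attribute in question is `type`, which can be "text", "html" or "xhtml" and is treated as "text" when absent. A sketch of how a consumer might read it, in Python with the standard library; the entry below is an invented example, not a complete valid Atom entry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Invented sample: one plain-text field and one escaped-HTML field.
entry_xml = f"""
<entry xmlns="{ATOM_NS}">
  <title type="text">AT&amp;T results</title>
  <summary type="html">Black &amp;amp; &lt;b&gt;white&lt;/b&gt;</summary>
</entry>
"""

entry = ET.fromstring(entry_xml)

# Map each child's local name to (declared type, decoded text content).
fields = {
    child.tag.split("}")[1]: (child.get("type", "text"), child.text)
    for child in entry
}

for name, (kind, text) in fields.items():
    print(f"{name} [{kind}]: {text}")
```

A consumer would show the "text" value literally and hand the "html" value to an HTML parser, so the "always or never" question is answered per element by the producer.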