archival of newsfeeds?

I want to write a feed reader that can archive all messages so that
I can view messages from weeks or months ago.

The question is now *how* to archive them. Since the feeds can have
different formats, do I have to convert them into my own format? Is it better
to store them in a database or in one large XML file?
Will I still get satisfactory performance if I search an XmlDocument for
feed items containing specific words in the title, or for items with a specific
category?
Nov 12 '05 #1
Hello!
> The question is now *how* to archive them. Since the feeds can have
> different formats, do I have to convert them into my own format?

Atom and RSS have roughly equivalent features, so you could, for example, use
Atom internally with your own extensions to cover RSS-specific features (or the
other way round).
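
To make the "one internal format" idea concrete, here is a minimal sketch that maps both RSS 2.0 items and Atom 1.0 entries onto one hypothetical FeedItem type. The type, its field names, and the FeedNormalizer helper are assumptions made up for illustration, and only title, link, and category are handled; a real reader would need dates, ids, content, and older feed dialects as well.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.XPath;

// Hypothetical internal item type; the property names are assumptions.
public sealed class FeedItem
{
    public string Title { get; set; }
    public string Link { get; set; }
    public string Category { get; set; }
}

public static class FeedNormalizer
{
    // Namespace of Atom 1.0; older Atom drafts used a different one.
    const string AtomNs = "http://www.w3.org/2005/Atom";

    public static List<FeedItem> Parse(string xml)
    {
        XPathNavigator nav = new XPathDocument(new StringReader(xml)).CreateNavigator();
        var ns = new XmlNamespaceManager(nav.NameTable);
        ns.AddNamespace("a", AtomNs);

        var items = new List<FeedItem>();

        // RSS 2.0: title, link and category are plain child elements.
        foreach (XPathNavigator n in nav.Select("/rss/channel/item"))
            items.Add(new FeedItem
            {
                Title = Text(n, "title", ns),
                Link = Text(n, "link", ns),
                Category = Text(n, "category", ns)
            });

        // Atom 1.0: link and category carry their values in attributes.
        foreach (XPathNavigator n in nav.Select("/a:feed/a:entry", ns))
            items.Add(new FeedItem
            {
                Title = Text(n, "a:title", ns),
                Link = Text(n, "a:link/@href", ns),
                Category = Text(n, "a:category/@term", ns)
            });

        return items;
    }

    // Returns the string value of the first match, or "" if there is none.
    static string Text(XPathNavigator n, string xpath, IXmlNamespaceResolver ns)
    {
        XPathNavigator hit = n.SelectSingleNode(xpath, ns);
        return hit == null ? "" : hit.Value;
    }
}
```

Whatever storage is chosen afterwards (database or XML), only the normalized items would need to be written, so the archive does not have to care which format a feed arrived in.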

> Is it better
> to store them in a database or in one large XML file?
> Will I still get satisfactory performance if I search an XmlDocument for
> feed items containing specific words in the title, or for items with a specific
> category?

For the best performance, use a database. It is not as flexible, though, as
XML files, which you can search using XPath.

Try to avoid XmlDocument, especially for large files.
It reads the whole file into memory, which is a waste of
resources (why parse and build a DOM tree from 10 MB of XML when you just want
to read the first element's text value?).
For the most convenience, use XPathDocument, which lets you run XPath queries
against a read-only in-memory store that is lighter and faster to query than
the XmlDocument DOM (it still loads the whole document, so it is not streamed).
For even more performance, at the cost of more specific and schema-centric code,
use XmlReader/XmlTextReader, which does stream the file.
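
The kind of search asked about (words in the title, a given category) could be sketched with XPathDocument roughly as follows. This is only an illustration: the FeedSearch class and FindTitles method are hypothetical names, and the archive is assumed to be a plain RSS 2.0 document.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class FeedSearch
{
    // Returns the titles of RSS items whose title contains `word`
    // (case-insensitive) and, if `category` is not null, that have
    // a <category> element with exactly that text.
    public static List<string> FindTitles(TextReader xml, string word, string category)
    {
        // XPathDocument builds a read-only in-memory store of the whole input.
        XPathNavigator nav = new XPathDocument(xml).CreateNavigator();
        var hits = new List<string>();

        foreach (XPathNavigator item in nav.Select("/rss/channel/item"))
        {
            XPathNavigator titleNode = item.SelectSingleNode("title");
            string title = titleNode == null ? "" : titleNode.Value;

            // Filter in C# rather than inside the XPath string: XPath 1.0
            // contains() is case-sensitive, and this avoids quoting user input.
            if (title.IndexOf(word, StringComparison.OrdinalIgnoreCase) < 0)
                continue;

            bool categoryOk = category == null;
            foreach (XPathNavigator c in item.Select("category"))
                if (c.Value == category) categoryOk = true;

            if (categoryOk) hits.Add(title);
        }
        return hits;
    }
}
```

Every call walks all items once, so the cost grows linearly with the size of the archive; that is usually fine for a few thousand items, and it is where a database starts to pay off.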
--
Pascal Schmitt
Nov 12 '05 #2
But if I want to search within my feed, is XmlDocument the right solution or
is there a better way?
How fast is XPath? Does it simply walk through all nodes, or does it use
optimized algorithms, for example hashing?
Nov 12 '05 #3
Hello!
> But if I want to search within my feed is XmlDocument the right solution or
> is there a better way?


XPathDocument. If there is no need to modify anything, use it!
(And if you do need to modify a big XML file, consider using XmlTextReader and
XmlTextWriter together: read a node, modify it, and write it out straight away.
It is not as nice to look at as DOM operations, but it is really fast.)

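// requires: using System.Xml.XPath;
XPathDocument x = new XPathDocument("file.xml");
// count() yields an XPath number, which Evaluate returns as a boxed double
int f = (int)(double)x.CreateNavigator().Evaluate("count(//foo)");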
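
The read/modify/write approach mentioned above could look roughly like this. It is a sketch under assumptions: the StreamingFilter class and the choice of "modification" (dropping every element with a given local name) are invented for illustration, and it uses the XmlReader.Create/XmlWriter.Create factories from later .NET versions rather than constructing XmlTextReader and XmlTextWriter directly.

```csharp
using System;
using System.IO;
using System.Xml;

public static class StreamingFilter
{
    // Copies XML from input to output node by node, skipping every element
    // (and its subtree) whose local name equals dropName.
    public static void CopyWithout(TextReader input, TextWriter output, string dropName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == dropName)
                {
                    // Skip() consumes the whole subtree and already moves
                    // to the following node, so no Read() here.
                    reader.Skip();
                }
                else
                {
                    WriteShallow(reader, writer);
                    reader.Read();
                }
            }
        }
    }

    // Writes only the current node, not its children.
    static void WriteShallow(XmlReader reader, XmlWriter writer)
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                bool empty = reader.IsEmptyElement;
                writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                writer.WriteAttributes(reader, true);
                if (empty) writer.WriteEndElement();
                break;
            case XmlNodeType.EndElement:
                writer.WriteFullEndElement();
                break;
            case XmlNodeType.Text:
                writer.WriteString(reader.Value);
                break;
            case XmlNodeType.CDATA:
                writer.WriteCData(reader.Value);
                break;
            case XmlNodeType.Whitespace:
            case XmlNodeType.SignificantWhitespace:
                writer.WriteWhitespace(reader.Value);
                break;
            case XmlNodeType.Comment:
                writer.WriteComment(reader.Value);
                break;
            case XmlNodeType.ProcessingInstruction:
                writer.WriteProcessingInstruction(reader.Name, reader.Value);
                break;
            // Other node types (e.g. the XML declaration, which the writer
            // emits itself) are deliberately ignored in this sketch.
        }
    }
}
```

Note that this still produces a complete new copy of the file; the gain is that only one node is held in memory at a time instead of the whole tree.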

> How fast is XPath? Does it simply walk through all nodes or are there
> optimized algorithms used, for example hashing?


As far as I know there is no such optimisation: XPath just walks the
document using an XPathNavigator. Both XPathDocument and XmlDocument can
create one via CreateNavigator(); XPathDocument is faster, but it is read-only.
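
Since each XPath query walks the nodes again, repeated lookups by category could be served from a hash table built in a single pass. The following is only a sketch: CategoryIndex is a hypothetical helper, and the input is assumed to be an RSS 2.0 document.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class CategoryIndex
{
    // Walks the document once and maps each category to the titles
    // of the items that carry it.
    public static Dictionary<string, List<string>> Build(TextReader xml)
    {
        var index = new Dictionary<string, List<string>>();
        XPathNavigator nav = new XPathDocument(xml).CreateNavigator();

        // Compiling once avoids re-parsing the expression if it is reused.
        XPathExpression items = XPathExpression.Compile("/rss/channel/item");

        foreach (XPathNavigator item in nav.Select(items))
        {
            XPathNavigator t = item.SelectSingleNode("title");
            string title = t == null ? "" : t.Value;

            foreach (XPathNavigator c in item.Select("category"))
            {
                List<string> titles;
                if (!index.TryGetValue(c.Value, out titles))
                {
                    titles = new List<string>();
                    index[c.Value] = titles;
                }
                titles.Add(title);
            }
        }
        return index;
    }
}
```

After the one-off walk, a lookup such as index["news"] is a dictionary access and no longer depends on the size of the document.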
--
Pascal Schmitt
Nov 12 '05 #4
But the problem with XML files is that if I want to modify them, I
have to rewrite the entire file, right?
Nov 12 '05 #5
