Data Storage for news client

Hi all,

I'm writing a news client (mainly to test out CAB & ClickOnce!) and trying
to decide what to use for storing messages, etc. SQL Express seems
like overkill (and is a hefty download for a < 1 MB app!). Also, since there
could be thousands of messages (potentially binary), I'm not sure that
serializing my classes to disk would perform at all well.

What would other people use for a small app like this? And why?

Thanks,
Nov 17 '05 #1
I would simply store them sequentially in a single file and then create an
index file that holds some header information (perhaps subject, author, date,
etc.) and an offset to the message's text in the main file. This is similar to
my response to the message just a bit earlier under "squeeze few image file
into on binary file".

You could also compress the text prior to storing it in the single file
(using SharpZipLib or 7-Zip or something). I suspect it would compress well,
even messages with uuencoded or yEnc-encoded binaries.

I actually need to integrate a newsreader, at some point, into an app I'm
writing and I suspect this is the direction I'll take.

The advantage of this is that access is quick and it easily accommodates
thousands of messages. If you store the messages in separate files, you'll
soon find your directory getting large. Getting to the data in a single
file with an index, using Seek, will be much faster than having the file
system find a match for your file name in a directory with thousands of
files.
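
The two-file layout described above could be sketched roughly as follows. This is only an illustration, not code from the thread: the IndexEntry and MessageStore names, the choice of fields, and the stored Length are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical index record: a few headers plus where the body lives in the data file.
public class IndexEntry
{
    public string Subject = "";
    public string Author = "";
    public DateTime Date;
    public long Offset;  // start of the message body in the data file
    public int Length;   // body length in bytes (an assumption; the thread only mentions the offset)
}

public static class MessageStore
{
    // Appends the body to the data file and a matching record to the index file.
    public static void Append(string dataPath, string indexPath,
                              string subject, string author, DateTime date, byte[] body)
    {
        long offset;
        using (FileStream data = new FileStream(dataPath, FileMode.OpenOrCreate, FileAccess.Write))
        {
            offset = data.Seek(0, SeekOrigin.End);  // new body starts at the current end of file
            data.Write(body, 0, body.Length);
        }
        using (FileStream fs = new FileStream(indexPath, FileMode.Append, FileAccess.Write))
        using (BinaryWriter w = new BinaryWriter(fs, Encoding.UTF8))
        {
            w.Write(subject);
            w.Write(author);
            w.Write(date.ToBinary());
            w.Write(offset);
            w.Write(body.Length);
        }
    }

    // Reads the whole index into memory; the data file is not touched.
    public static List<IndexEntry> LoadIndex(string indexPath)
    {
        List<IndexEntry> entries = new List<IndexEntry>();
        using (FileStream fs = File.OpenRead(indexPath))
        using (BinaryReader r = new BinaryReader(fs, Encoding.UTF8))
        {
            while (fs.Position < fs.Length)
            {
                IndexEntry e = new IndexEntry();
                e.Subject = r.ReadString();
                e.Author = r.ReadString();
                e.Date = DateTime.FromBinary(r.ReadInt64());
                e.Offset = r.ReadInt64();
                e.Length = r.ReadInt32();
                entries.Add(e);
            }
        }
        return entries;
    }

    // Seeks straight to one message body using its index entry.
    public static byte[] ReadBody(string dataPath, IndexEntry entry)
    {
        using (FileStream data = File.OpenRead(dataPath))
        using (BinaryReader r = new BinaryReader(data))
        {
            data.Seek(entry.Offset, SeekOrigin.Begin);
            return r.ReadBytes(entry.Length);
        }
    }
}
```

With this shape, displaying a folder only needs LoadIndex; ReadBody is called when the user actually opens a message.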

It's also fairly easy to purge lots of contiguous messages (which is likely
how you'd want to handle purging from a newsreader) from the file. For
example, if you want to delete the first 1000 messages, simply find the
index entry for the 1001st message, then copy the data from its offset to the
end into a new file, delete the original file, and rename the new one to the
name of the old. Do the same with the index file.
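
A minimal sketch of that copy-and-rename purge, assuming the caller already knows the byte offset of the first message to keep. StorePurge and DropPrefix are hypothetical names. One detail worth noting: if the index stores absolute offsets, every surviving index entry also needs its offset reduced by the number of bytes removed from the data file.

```csharp
using System;
using System.IO;

public static class StorePurge
{
    // Drops everything before keepFrom by copying the tail of the file to a
    // temporary file and then swapping it in place of the original.
    public static void DropPrefix(string path, long keepFrom)
    {
        string tempPath = path + ".tmp";  // hypothetical temp-file naming
        using (FileStream source = File.OpenRead(path))
        using (FileStream target = File.Create(tempPath))
        {
            source.Seek(keepFrom, SeekOrigin.Begin);
            byte[] buffer = new byte[64 * 1024];
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                target.Write(buffer, 0, read);
            }
        }
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

For example, DropPrefix(dataPath, offsetOfMessage1001) would leave a data file that starts with the 1001st message.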

Pete

"Danny Tuppeny" <gr****@dannytu ppeny.commmmmm> wrote in message
news:43******** *************** @ptn-nntp-reader03.plus.n et...
> Hi all,
>
> I'm writing a news client (mainly to test out CAB & ClickOnce!), and
> trying to decide on what to use for the storage of messages etc.. SQL
> Express seems like overkill (and is a hefty download for a < 1MB app!).
> Also, since there could be thousands of messages (potentially binary), I'm
> not sure that serializing my classes to disk would perform at all well.
>
> What would other people use for a small app like this? And why?
>
> Thanks,

Nov 17 '05 #2
"Pete Davis" <pdavis68@[nospam]hotmail.com> wrote in message
news:Zs******** ************@gi ganews.com...

Hi Peter,
> I would simply store them sequentially in a single file and then create an
> index file which has some header information (perhaps subject, author,
> date, etc) and an offset to the message's text in the main file. Similar to
> my response to the message just a bit earlier under "squeeze few image file
> into on binary file"
>
> You could also compress the text prior to storing it in the single file
> (using SharpZipLib or 7Zip or something). I suspect it would compress
> well, even messages with uuencoded or yenc encoded binaries.
>
> I actually need to integrate a newsreader, at some point, into an app I'm
> writing and I suspect this is the direction I'll take.


Interesting response. What about performance though? If the user opens a
folder that has 1,000 messages, either I have to load them all *very*
quickly (I need to display Sender, Subject, Date, etc.), or I fetch them as
the user scrolls (which could be pretty unresponsive if the user is dragging
the scrollbar).

What would you store in the index file? The user will be able to change the
sort order in the display, so unless I maintain a few indexes, it'd be
difficult to get a list in order. The message list will show the Sender,
Date, Subject etc., and so if I have to scan through the data file for
thousands of these things, surely it'll take an age? I've never done this
kind of processing before, so I've no idea of how it would perform. I don't
want to build it and find it's unacceptable, so any experiences anyone can
share would be much appreciated! :)

As for compression - again, without testing it, I wouldn't know - but
although compression would save tons of disk space, wouldn't the overhead of
the compression make it slower than reading more uncompressed data? I assume
the compressed sizes would vary, so it'd be difficult to seek within a
compressed stream. Any ideas?

Thanks,

Danny
Nov 17 '05 #3


"Danny Tuppeny" <gr****@dannytu ppeny.commmmmm> wrote in message
news:43******** **************@ ptn-nntp-reader02.plus.n et...
"Pete Davis" <pdavis68@[nospam]hotmail.com> wrote in message
news:Zs******** ************@gi ganews.com...

> Hi Peter,

[snip]

> Interesting response. What about performance though? If the user opens a
> folder that has 1,000 messages, either I have to load them all *very*
> quickly (I need to display Sender, Subject, Date, etc.), or I fetch them
> as the user scrolls (which could be pretty unresponsive if the user is
> dragging the scrollbar).

I suspect it will load much faster than you think.

Assuming in the index you store Sender, subject, date, message ID, offset in
main file, and a few other header items, I suspect you're looking at an
average of roughly 100-200 bytes per message. Let's say 200 bytes, but
that's probably on the high side. That works out to only 200K per thousand
messages or 5000 messages per megabyte. That will load into memory pretty
quickly.
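
The estimate above is easy to check: at 200 bytes per entry, 1,000 entries come to 200,000 bytes, and one megabyte holds roughly 5,000 entries. To see what a real entry costs, one rough approach is to serialize a sample record into memory and look at its size. The helper below is a sketch with a made-up field layout (three strings, an 8-byte date and an 8-byte offset), not a format taken from the thread.

```csharp
using System;
using System.IO;
using System.Text;

public static class IndexSizeEstimate
{
    // Serializes one sample index record in a hypothetical layout and
    // returns how many bytes it occupies.
    public static int MeasureEntry(string sender, string subject, string messageId)
    {
        using (MemoryStream ms = new MemoryStream())
        using (BinaryWriter w = new BinaryWriter(ms, Encoding.UTF8))
        {
            w.Write(sender);                      // length prefix + UTF-8 bytes
            w.Write(subject);
            w.Write(messageId);
            w.Write(DateTime.UtcNow.ToBinary());  // 8-byte date
            w.Write(0L);                          // 8-byte offset into the data file
            w.Flush();
            return (int)ms.Length;
        }
    }
}
```

Multiplying the result for a typical header by the number of messages in a group gives a quick idea of how much memory the whole index would need.
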

> What would you store in the index file? The user will be able to change
> the sort order in the display, so unless I maintain a few indexes, it'd be
> difficult to get a list in order. The message list will show the Sender,
> Date, Subject etc., and so if I have to scan through the data file for
> thousands of these things, surely it'll take an age? I've never done this
> kind of processing before, so I've no idea of how it would perform. I
> don't want to build it and find it's unacceptable, so any experiences
> anyone can share would be much appreciated! :)

Well, if they're going to be able to sort them, then it makes sense to load
it all into memory, assuming that's feasible. Given the figures above, that
should be doable on most modern computers, assuming you're just loading
messages from a single group at a time. Load the messages into memory and
then sort them. Leave them sorted in the files however you want. It won't
make much difference.

I don't expect it to be lightning fast, but I think it will be much faster
than you think. If you implement the IComparer interface, sorting should be a
piece of cake, and the built-in sort algorithm is quicksort, I believe.
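
As a rough illustration of the comparer idea, the sketch below sorts in-memory header rows by a chosen column. It uses the generic IComparer<T> and List<T>.Sort rather than the non-generic interface mentioned above, and the HeaderRow and HeaderComparer names and the column strings are assumptions for the example. Note that List<T>.Sort is documented as an unstable sort, so rows that compare equal may change their relative order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory row built from one index entry.
public class HeaderRow
{
    public string Sender = "";
    public string Subject = "";
    public DateTime Date;
}

// Compares rows by one column; 'descending' flips the order.
public class HeaderComparer : IComparer<HeaderRow>
{
    private readonly string column;
    private readonly bool descending;

    public HeaderComparer(string column, bool descending)
    {
        this.column = column;
        this.descending = descending;
    }

    public int Compare(HeaderRow x, HeaderRow y)
    {
        int result;
        switch (column)
        {
            case "Sender":
                result = string.Compare(x.Sender, y.Sender, StringComparison.OrdinalIgnoreCase);
                break;
            case "Date":
                result = x.Date.CompareTo(y.Date);
                break;
            default:  // anything else sorts by subject
                result = string.Compare(x.Subject, y.Subject, StringComparison.OrdinalIgnoreCase);
                break;
        }
        return descending ? -result : result;
    }
}
```

A column-header click could then call something like rows.Sort(new HeaderComparer("Date", true)) on the loaded List<HeaderRow>.
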

> As for compression - again, without testing it, I wouldn't know - but
> although compression would save tons of disk space, wouldn't the overhead
> of the compression make it slower than reading more uncompressed data? I
> assume compression would be variable, so it'd be difficult to seek within
> a compressed stream. Any ideas?


Compressing data is slow. Decompressing is generally quite fast. I suspect
it'll be faster to read due to the large amount of saved space, particularly
if the data is located on a network drive.

Remember, 2 files: Index file and Data File. Leave the index file
uncompressed. Don't compress the entire data file, just compress the
individual messages. That way you have an offset to each compressed message
and just begin decompression at the beginning of the message. Again, look at
the message I posted earlier where I use a simple index file and store a
bunch of thumbnails in a single file. It easily loads 500 thumbnails (and
that includes jpeg decoding of the data) in a matter of maybe 2 seconds.
Without the jpeg decoding, it would be less than half a second, I'm sure.
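
The per-message compression described above might look something like this. It is a sketch under a few assumptions: it swaps in GZipStream from System.IO.Compression instead of SharpZipLib or 7-Zip, the CompressedStore name is made up, and the caller is expected to keep the returned offset and stored length in the (uncompressed) index.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressedStore
{
    // Compresses one message on its own and appends it to the data file.
    // The offset and stored (compressed) length go into the index entry.
    public static void AppendCompressed(string dataPath, byte[] message,
                                        out long offset, out int storedLength)
    {
        byte[] packed;
        using (MemoryStream buffer = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(buffer, CompressionMode.Compress))
            {
                gzip.Write(message, 0, message.Length);
            }
            packed = buffer.ToArray();  // valid even after the stream is closed
        }
        using (FileStream data = new FileStream(dataPath, FileMode.OpenOrCreate, FileAccess.Write))
        {
            offset = data.Seek(0, SeekOrigin.End);
            data.Write(packed, 0, packed.Length);
        }
        storedLength = packed.Length;
    }

    // Seeks to one compressed message and decompresses only that message.
    public static byte[] ReadCompressed(string dataPath, long offset, int storedLength)
    {
        byte[] packed;
        using (FileStream data = File.OpenRead(dataPath))
        using (BinaryReader r = new BinaryReader(data))
        {
            data.Seek(offset, SeekOrigin.Begin);
            packed = r.ReadBytes(storedLength);
        }
        using (MemoryStream input = new MemoryStream(packed))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gzip.Read(chunk, 0, chunk.Length)) > 0)
            {
                output.Write(chunk, 0, read);
            }
            return output.ToArray();
        }
    }
}
```

Because each message is its own compressed block, seeking still works on plain byte offsets; nothing ever has to seek inside a compressed stream.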

Nov 18 '05 #4
Why don't you try the SQLite database engine? It's a single small DLL,
requires no installation, has an ADO.NET provider, and it's extremely fast.
There's now a 2.0 version as well. Check it out at SourceForge.net.
peter

--
Co-founder, Eggheadcafe.com developer portal:
http://www.eggheadcafe.com
UnBlog:
http://petesbloggerama.blogspot.com


"Danny Tuppeny" wrote:
> Hi all,
>
> I'm writing a news client (mainly to test out CAB & ClickOnce!), and trying
> to decide on what to use for the storage of messages etc.. SQL Express seems
> like overkill (and is a hefty download for a < 1MB app!). Also, since there
> could be thousands of messages (potentially binary), I'm not sure that
> serializing my classes to disk would perform at all well.
>
> What would other people use for a small app like this? And why?
>
> Thanks,

Nov 18 '05 #5
"Pete Davis" <pdavis68@[nospam]hotmail.com> wrote in message
news:M4******** *************** *******@giganew s.com...
> I suspect it will load much faster than you think.

After Googling a little more last night, I think you're right! :)
I ran this:

http://www.codeproject.com/csharp/Fa...yFileInput.asp

Which didn't take long to create 10,000,000 structs in a binary file -
276MB of data :)

> Well, if they're going to be able to sort them, then it makes sense to
> load it all into memory, assuming that's feasible. Given the figures
> above, that should be doable on most modern computers, assuming you're just
> loading messages from a single group at a time. Load the messages into
> memory and then sort them. Leave them sorted in the files however you
> want. It won't make much difference.

I was thinking about this - if, once it's loaded into memory, I let the user
sort (probably by clicking column headers), then when they move to another
folder (or close the app), I can write the index back in this order - which
persists their sort order, but also means I don't ever have to load it and
immediately sort afterwards :)

Do you think the index file would perform well as normal serialized objects?
The smaller things (folders, user settings, etc.) I was going to just
serialize as XML. Since the messages (indexes) won't be huge, I'm wondering
if they can be done the same way, or if I'd need to think about something
slightly different, like the message data?

> Remember, 2 files: Index file and Data File. Leave the index file
> uncompressed. Don't compress the entire data file, just compress the
> individual messages. That way you have an offset to each compressed
> message and just begin decompression at the beginning of the message.
> Again, look at the message I posted earlier where I use a simple index
> file and store a bunch of thumbnails in a single file. It easily loads 500
> thumbnails (and that includes jpeg decoding of the data) in a matter of
> maybe 2 seconds. Without the jpeg decoding, it would be less than half a
> second, I'm sure.


I forgot to look! Just looked now, and it looks very helpful - thanks! :)

Danny
Nov 18 '05 #6

"Danny Tuppeny" <gr****@dannytu ppeny.commmmmm> wrote in message
news:43******** *************** @ptn-nntp-reader03.plus.n et...
"Pete Davis" <pdavis68@[nospam]hotmail.com> wrote in message
news:M4******** *************** *******@giganew s.com...
> > I suspect it will load much faster than you think.

[snip]

> I was thinking about this - if once loaded into memory, I let the user
> sort (probably by clicking column headers), once they move to another
> folder (or close the app), I can write the index back in this order -
> which persists their sort order, but also means I don't ever have to load
> it and immediately sort afterwards :)

Yes, it will. But again, I don't think sorting is going to be slow at all.

> Do you think the index file would perform well as normal Serialized
> objects? The smaller things (Folders, user settings etc.) I was going to
> just serialize as XML. Since the messages (indexes) won't be huge, I'm
> wondering if they can be done the same way, or if I'd need to think about
> something slightly different, like the message data..?

Xml serialization for the headers is probably fine.
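
For reference, XML-serializing a list of headers with XmlSerializer can be as short as the sketch below. The MessageHeader type, its fields, and the HeaderFile helper are assumptions for illustration; XmlSerializer itself only requires a public type with a parameterless constructor and public fields or properties.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical header record kept in the index.
public class MessageHeader
{
    public string Subject = "";
    public string Sender = "";
    public DateTime Date;
    public long Offset;  // where the message body starts in the data file
}

public static class HeaderFile
{
    // Writes the whole header list to one XML file.
    public static void Save(string path, List<MessageHeader> headers)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(List<MessageHeader>));
        using (FileStream fs = File.Create(path))
        {
            serializer.Serialize(fs, headers);
        }
    }

    // Reads the header list back into memory.
    public static List<MessageHeader> Load(string path)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(List<MessageHeader>));
        using (FileStream fs = File.OpenRead(path))
        {
            return (List<MessageHeader>)serializer.Deserialize(fs);
        }
    }
}
```

Saving the list in its current in-memory order also keeps whatever sort order the user last chose, as discussed earlier in the thread.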

Pete
Nov 18 '05 #7
