Hi all,
I'm writing a news client (mainly to test out CAB & ClickOnce!), and trying
to decide on what to use for the storage of messages etc.. SQL Express seems
like overkill (and is a hefty download for a < 1MB app!). Also, since there
could be thousands of messages (potentially binary), I'm not sure that
serializing my classes to disk would perform at all well.
What would other people use for a small app like this? And why?
Thanks,
I would simply store them sequentially in a single file and then create an
index file which has some header information (perhaps subject, author, date,
etc) and an offset to the message's text in the main file. Similar to my
response to the message just a bit earlier under "squeeze few image file
into on binary file"
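A minimal sketch of the two-file idea, written in Python rather than C# and using in-memory streams in place of real files. The record layout (`ENTRY`, an 8-byte offset plus a 4-byte length) and the function names are assumptions for illustration only; a real index entry would also carry the subject, author, date and so on.

```python
import io
import struct

# Hypothetical fixed-size index record: offset and length into the data file.
ENTRY = struct.Struct("<QI")  # 8-byte offset, 4-byte length


def append_message(data, index, body: bytes) -> None:
    """Append one message body to the data stream and one entry to the index."""
    data.seek(0, io.SEEK_END)
    offset = data.tell()
    data.write(body)
    index.seek(0, io.SEEK_END)
    index.write(ENTRY.pack(offset, len(body)))


def read_message(data, index, n: int) -> bytes:
    """Read the n-th message (0-based) with one seek in each stream."""
    index.seek(n * ENTRY.size)
    offset, length = ENTRY.unpack(index.read(ENTRY.size))
    data.seek(offset)
    return data.read(length)


# In-memory stand-ins for the data file and the index file.
data, index = io.BytesIO(), io.BytesIO()
for text in (b"first post", b"a longer second post", b"third"):
    append_message(data, index, text)

second = read_message(data, index, 1)
```

Because every index entry has the same size, finding entry `n` is a multiplication and a seek, with no scanning of the data file.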
You could also compress the text prior to storing it in the single file
(using SharpZipLib or 7Zip or something). I suspect it would compress well,
even messages with uuencoded or yenc encoded binaries.
I actually need to integrate a newsreader, at some point, into an app I'm
writing and I suspect this is the direction I'll take.
The advantage of this is that access is quick and it easily accommodates
thousands of messages. If you store the messages in separate files, your
directory soon gets large. Getting to the data in a single file through an
index and Seek will be much faster than having the file system find a match
for your file name in a directory with thousands of files.
It's also fairly easy to purge lots of contiguous messages (which is likely
how you'd want to handle purging from a newsreader) from the file. For
example, to delete the first 1000 messages, find the index entry for the
1001st message, copy the data from there to the end into a new file, delete
the original file, and rename the new one to the name of the old. Do the
same with the index file, subtracting the number of bytes you removed from
each stored offset.
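The purge described above could look roughly like this. It is a Python sketch, not a C# implementation; the file names, the `ENTRY` record layout (offset plus length) and `purge_first` are all made up for the example. One detail the sketch makes explicit: the surviving index entries have to be rebased, because every remaining message moves toward the start of the new data file.

```python
import os
import shutil
import struct
import tempfile

ENTRY = struct.Struct("<QI")  # hypothetical index record: offset, length


def purge_first(data_path: str, index_path: str, count: int) -> None:
    """Drop the first `count` messages by copying the tail into new files."""
    with open(index_path, "rb") as f:
        raw = f.read()
    entries = [ENTRY.unpack_from(raw, i) for i in range(0, len(raw), ENTRY.size)]
    kept = entries[count:]
    # Byte position where the surviving data starts in the old file.
    cut = kept[0][0] if kept else os.path.getsize(data_path)

    with open(data_path, "rb") as src, open(data_path + ".new", "wb") as dst:
        src.seek(cut)
        shutil.copyfileobj(src, dst)
    with open(index_path + ".new", "wb") as dst:
        for offset, length in kept:
            dst.write(ENTRY.pack(offset - cut, length))  # rebase the offset

    # os.replace does the "delete the original, rename the new one" step.
    os.replace(data_path + ".new", data_path)
    os.replace(index_path + ".new", index_path)


with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "group.dat")
    index_path = os.path.join(tmp, "group.idx")
    with open(data_path, "wb") as d, open(index_path, "wb") as i:
        for body in (b"one", b"two!", b"three"):
            i.write(ENTRY.pack(d.tell(), len(body)))
            d.write(body)

    purge_first(data_path, index_path, 2)

    with open(data_path, "rb") as d, open(index_path, "rb") as i:
        remaining_data = d.read()
        remaining_index = ENTRY.unpack(i.read())
```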
Pete
Hi Peter,
Interesting response. What about performance though? If the user opens a
folder that has 1,000 messages, either I have to load them all *very*
quickly (I need to display Sender, Subject, Date, etc.), or I fetch them as
the user scrolls (which could be pretty unresponsive if the user is dragging
the scrollbar).
What would you store in the index file? The user will be able to change the
sort order in the display, so unless I maintain a few indexes, it'd be
difficult to get a list in order. The message list will show the Sender,
Date, Subject etc., and so if I have to scan through the data file for
thousands of these things, surely it'll take an age? I've never done this
kind of processing before, so I've no idea of how it would perform. I don't
want to build it and find it's unacceptable, so any experiences anyone can
share would be much appreciated! :)
As for compression - again, without testing it, I wouldn't know - but
although compression would save tons of disk space, wouldn't the overhead of
the compression make it slower than reading more uncompressed data? I assume
the compression ratio would vary, so it'd be difficult to seek within a
compressed stream. Any ideas?
Thanks,
Danny
"Danny Tuppeny" <gr****@dannytu ppeny.commmmmm> wrote in message
news:43******** **************@ ptn-nntp-reader02.plus.n et...
"Pete Davis" <pdavis68@[nospam]hotmail.com> wrote in message
news:Zs******** ************@gi ganews.com...
Hi Peter,
[snip]
Interesting response. What about performance though? If the user opens a folder that has 1,000 messages, either I have to load them all *very* quickly (I need to display Sender, Subject, Date, etc.), or I fetch them as the user scrolls (which could be pretty unresponsive if the user is dragging the scrollbar).
I suspect it will load much faster than you think.
Assuming in the index you store Sender, subject, date, message ID, offset in
main file, and a few other header items, I suspect you're looking at an
average of roughly 100-200 bytes per message. Let's say 200 bytes, but
that's probably on the high side. That works out to only 200K per thousand
messages or 5000 messages per megabyte. That will load into memory pretty
quickly.
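The estimate above is easy to check. This is just the arithmetic from the post written out in Python, assuming the 200-byte upper figure and a megabyte of 1,000,000 bytes; the function names are made up for the example.

```python
def index_size_bytes(messages: int, bytes_per_entry: int = 200) -> int:
    """Rough size of the index for a given number of messages."""
    return messages * bytes_per_entry


def entries_per_megabyte(bytes_per_entry: int = 200) -> int:
    """How many index entries fit in 1,000,000 bytes."""
    return 1_000_000 // bytes_per_entry


for count in (1_000, 10_000, 100_000):
    print(count, "messages ->", index_size_bytes(count) // 1000, "KB of index")
```

Even a group with 100,000 messages would need an index of only about 20 MB at this estimate, and the low end of the 100-200 byte range halves that.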
What would you store in the index file? The user will be able to change the sort order in the display, so unless I maintain a few indexes, it'd be difficult to get a list in order. The message list will show the Sender, Date, Subject etc., and so if I have to scan through the data file for thousands of these things, surely it'll take an age? I've never done this kind of processing before, so I've no idea of how it would perform. I don't want to build it and find it's unacceptable, so any experiences anyone can share would be much appreciated! :)
Well, if they're going to be able to sort them, then it makes sense to load
it all into memory, assuming that's feasible. Given the figures above, that
should be doable on most modern computers, assuming you're just loading
messages from a single group at a time. Load the messages into memory and
then sort them. Leave them sorted in the files however you want. It won't
make much difference.
I don't expect it to be lightning fast, but I think it will be much faster
than you think. If you implement the IComparer interface, sorting should be
a piece of cake, and the built-in sort algorithm is quick sort, I believe.
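To illustrate sorting the loaded index by whichever column the user clicks, here is a small Python sketch. A key function plays the role that an IComparer would play in .NET. The `IndexEntry` fields and the sample rows are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class IndexEntry:
    sender: str
    subject: str
    date: datetime
    offset: int  # position of the message body in the data file


def sort_entries(entries, column: str, descending: bool = False):
    """Return a new list sorted by one column; text compares case-insensitively."""
    def key(entry):
        value = getattr(entry, column)
        return value.casefold() if isinstance(value, str) else value
    return sorted(entries, key=key, reverse=descending)


# Made-up sample rows.
entries = [
    IndexEntry("pete", "Re: storage", datetime(2005, 11, 2), 120),
    IndexEntry("Danny", "storage", datetime(2005, 11, 1), 0),
    IndexEntry("Peter", "SQLite?", datetime(2005, 11, 3), 300),
]

by_sender = [e.sender for e in sort_entries(entries, "sender")]
newest_first = [e.subject for e in sort_entries(entries, "date", descending=True)]
```

Only the small in-memory entries are reordered. The data file is untouched, since each entry still carries its own offset.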
As for compression - again, without testing it, I wouldn't know - but although compression would save tons of disk space, wouldn't the overhead of the compression make it slower than reading more uncompressed data? I assume compression would be variable, so it'd be difficult to seek within a compressed stream. Any ideas?
Compressing data is slow. Decompressing is generally quite fast. I suspect
it'll be faster to read due to the large amount of saved space, particularly
if data is located on a network drive.
Remember, 2 files: Index file and Data File. Leave the index file
uncompressed. Don't compress the entire data file, just compress the
individual messages. That way you have an offset to each compressed message
and just begin decompression at the beginning of the message. Again, look at
the message I posted earlier where I use a simple index file and store a
bunch of thumbnails in a single file. It easily loads 500 thumbnails (and
that includes jpeg decoding of the data) in a matter of maybe 2 seconds.
Without the jpeg decoding, it would be less than half a second, I'm sure.
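A sketch of compressing each message on its own, so that the index can still point straight at it. It is written in Python with the standard `zlib` module standing in for SharpZipLib or 7Zip; the function names and sample messages are assumptions for the example. The index stores the offset and the compressed length, so there is never a need to seek inside a compressed stream.

```python
import io
import zlib


def append_compressed(data, body: bytes):
    """Compress one message by itself and return its (offset, stored_length)."""
    packed = zlib.compress(body)
    data.seek(0, io.SEEK_END)
    offset = data.tell()
    data.write(packed)
    return offset, len(packed)


def read_compressed(data, offset: int, stored_length: int) -> bytes:
    """Seek to the start of one compressed message and decompress only that."""
    data.seek(offset)
    return zlib.decompress(data.read(stored_length))


data = io.BytesIO()  # in-memory stand-in for the data file
messages = [
    b"Subject: test\r\n\r\n" + b"hello world " * 50,
    b"short reply",
]
index = [append_compressed(data, m) for m in messages]

second = read_compressed(data, *index[1])
```

Very short messages may not shrink at all, so it could be worth storing a flag in the index and leaving those uncompressed.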
Why don't you try the SQLite database engine? It's a single small DLL,
requires no installation, has an ADO.NET provider, and it's extremely fast.
There's now a 2.0 version as well. Check it out at Sourceforge.net
peter
--
Co-founder, Eggheadcafe.com developer portal: http://www.eggheadcafe.com
UnBlog: http://petesbloggerama.blogspot.com
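For comparison, the SQLite route suggested above might look something like this. The sketch uses Python's built-in `sqlite3` module rather than the ADO.NET provider, and the table layout, column names and sample rows are all invented for the example.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained; a real client
# would pass a file path instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE messages (
           id      INTEGER PRIMARY KEY,
           sender  TEXT NOT NULL,
           subject TEXT NOT NULL,
           posted  TEXT NOT NULL,   -- ISO 8601 text sorts chronologically
           body    BLOB NOT NULL
       )"""
)
conn.execute("CREATE INDEX idx_messages_posted ON messages (posted)")

# Made-up sample rows.
rows = [
    ("Danny", "storage", "2005-11-01T09:00:00", b"original question"),
    ("Pete", "Re: storage", "2005-11-02T10:30:00", b"index file idea"),
    ("Peter", "SQLite?", "2005-11-03T08:15:00", b"try SQLite"),
]
conn.executemany(
    "INSERT INTO messages (sender, subject, posted, body) VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()

# Fetch only the header columns for the message list, newest first.
headers = conn.execute(
    "SELECT sender, subject, posted FROM messages ORDER BY posted DESC"
).fetchall()
```

Here the database does the sorting and the offset bookkeeping, at the cost of shipping the engine with the app.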
"Pete Davis" <pdavis68@[nospam]hotmail.com> wrote in message
news:M4******** *************** *******@giganew s.com... I suspect it will load much faster than you think.
After Googling a little more last night, I think you're right! :)
I ran this: http://www.codeproject.com/csharp/Fa...yFileInput.asp
Which didn't take long to create 10,000,000 structs in a binary file -
276MB of data :)
Well, if they're going to be able to sort them, then it makes sense to load it all into memory, assuming that's feasible. Given the figures above, that should be doable on most modern computers, assuming you're just loading messages from a single group at a time. Load the messages into memory and then sort them. Leave them sorted in the files however you want. It won't make much difference.
I was thinking about this - if, once the index is loaded into memory, I let
the user sort (probably by clicking column headers), then when they move to
another folder (or close the app) I can write the index back in that order.
That persists their sort order, and also means I never have to load it and
immediately sort afterwards :)
Do you think the index file would perform well as normal serialized objects?
The smaller things (folders, user settings etc.) I was going to just
serialize as XML. Since the messages (indexes) won't be huge, I'm wondering
if they can be done the same way, or if I'd need to think about something
slightly different, as with the message data..?
Remember, 2 files: Index file and Data File. Leave the index file uncompressed. Don't compress the entire data file, just compress the individual messages. That way you have an offset to each compressed message and just begin decompression at the beginning of the message. Again, look at the message I posted earlier where I use a simple index file and store a bunch of thumbnails in a single file. It easily loads 500 thumbnails (and that includes jpeg decoding of the data) in a matter of maybe 2 seconds. Without the jpeg decoding, it would be less than half a second, I'm sure.
I forgot to look! Just looked now, and it looks very helpful - thanks! :)
Danny
"Danny Tuppeny" <gr****@dannytu ppeny.commmmmm> wrote in message
news:43******** *************** @ptn-nntp-reader03.plus.n et...
"Pete Davis" <pdavis68@[nospam]hotmail.com> wrote in message
news:M4******** *************** *******@giganew s.com...
I suspect it will load much faster than you think.
[snip]
I was thinking about this - if once loaded into memory, I let the user sort (probably by clicking column headers), once they move to another folder (or close the app), I can write the index back in this order - which persists their sort order, but also means I don't ever have to load it and immediately sort afterwards :)
Yes, it will. But again, I don't think sorting is going to be slow at all.
Do you think the index file would perform well as normal Serialized objects? The smaller things (Folders, user settings etc.) I was going to just serialize as XML. Since the messages (indexes) won't be huge, I'm wondering if they can be done the same way, or if I'd need to think about something slightly different, like the message data..?
Xml serialization for the headers is probably fine.
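As a rough illustration of keeping the headers in XML, here is a Python sketch using the standard `xml.etree.ElementTree` module in place of .NET's XML serializer. The element and attribute names and the sample headers are invented. Entries are written in list order, so whatever sort order the user last chose is what comes back on the next load.

```python
import xml.etree.ElementTree as ET


def headers_to_xml(headers) -> str:
    """Serialize header dicts to XML, preserving the order of the list."""
    root = ET.Element("index")
    for h in headers:
        ET.SubElement(root, "message", {
            "sender": h["sender"],
            "subject": h["subject"],
            "date": h["date"],
            "offset": str(h["offset"]),
        })
    return ET.tostring(root, encoding="unicode")


def headers_from_xml(text: str):
    """Load the header dicts back in the order they were written."""
    return [
        {
            "sender": m.get("sender"),
            "subject": m.get("subject"),
            "date": m.get("date"),
            "offset": int(m.get("offset")),
        }
        for m in ET.fromstring(text).iter("message")
    ]


# Made-up sample headers, already in the user's chosen sort order.
headers = [
    {"sender": "Pete", "subject": "Re: CAB & ClickOnce", "date": "2005-11-02", "offset": 120},
    {"sender": "Danny", "subject": "storage <ideas>", "date": "2005-11-01", "offset": 0},
]

restored = headers_from_xml(headers_to_xml(headers))
```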
Pete