
Serialization/Compression

Hello,

We are developing a Windows Forms application that lets users take a snapshot of some database tables, save it into another set of tables (called Model tables), and work with that copy. Since the data that goes into the model tables is huge (on the order of 30,000 records per table), we anticipate running out of database space as more users start hitting our application database. To solve this problem I suggested splitting the records that go into the model tables into chunks of, say, 5,000; binary-serializing each chunk; compressing it; and storing the compressed form in a BLOB field in the database. Of course, the application will have to do the reverse as well: decompress, deserialize, and then render the data to the GUI. This process does incur overhead because of the intermediate operations, but I thought it was worth implementing since it can save us at least 60-70% of the space, which I guess is pretty significant. Also, retrieving 6 rows (30,000 records / 5,000 records per chunk) instead of 30,000 from the database should be much more efficient and faster.
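For reference, a minimal sketch of the pack/unpack round trip described above. It assumes .NET 2.0's GZipStream and BinaryFormatter; the ModelRow type and member names are illustrative, not from the actual application:

using System;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class ModelRow
{
    public int Id;
    public string Value;
}

public static class ChunkCodec
{
    // Binary-serialize a chunk of rows and gzip the result into a
    // byte[] suitable for storing in a BLOB column.
    public static byte[] Pack(ModelRow[] chunk)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(buffer, CompressionMode.Compress, true))
            {
                new BinaryFormatter().Serialize(gzip, chunk);
            } // disposing the GZipStream flushes the compressed data
            return buffer.ToArray();
        }
    }

    // The reverse path: decompress a BLOB and deserialize it back into rows.
    public static ModelRow[] Unpack(byte[] blob)
    {
        using (MemoryStream buffer = new MemoryStream(blob))
        using (GZipStream gzip = new GZipStream(buffer, CompressionMode.Decompress))
        {
            return (ModelRow[])new BinaryFormatter().Deserialize(gzip);
        }
    }
}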

I just want to know if this approach is a good solution. Please let me know
if there is a better way of resolving this issue. The downside of this
approach shows up when the user modifies the data. The problems are as follows.

1. If the user has edited the data, I will have to find out which chunk he
has modified, and serialize and compress only that portion of the data. I don't
want to re-serialize all the chunks when the user modifies only one of them.
Though I can use some kind of identifier to identify each chunk, the process
may be cumbersome (see the sketch after this list).

2. Even if the user modifies just one record, I will have to serialize and
compress all 5,000 records in that chunk no matter what, which is kind of bad.
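One way to keep that bookkeeping from getting out of hand, sketched here with hypothetical names (ChunkStore and its members are illustrative, and it reuses the ChunkCodec sketch above): key each BLOB row by a chunk ID, flag a chunk dirty whenever one of its records changes, and re-pack only the dirty chunks on save.

using System.Collections.Generic;

public class ChunkStore
{
    private const int ChunkSize = 5000;

    // In-memory working copy, keyed by chunk ID.
    private readonly Dictionary<int, ModelRow[]> chunks = new Dictionary<int, ModelRow[]>();

    // IDs of chunks edited since the last save.
    private readonly HashSet<int> dirty = new HashSet<int>();

    // Map a record's overall position to the chunk that contains it.
    public int ChunkIdOf(int recordIndex)
    {
        return recordIndex / ChunkSize;
    }

    // Call this whenever the GUI edits a record: only that record's
    // chunk gets flagged for re-serialization.
    public void MarkEdited(int recordIndex)
    {
        dirty.Add(ChunkIdOf(recordIndex));
    }

    // On save, re-pack only the dirty chunks; each (chunk ID, blob)
    // pair updates one BLOB row in the database.
    public List<KeyValuePair<int, byte[]>> PackDirtyChunks()
    {
        List<KeyValuePair<int, byte[]>> packed = new List<KeyValuePair<int, byte[]>>();
        foreach (int id in dirty)
        {
            packed.Add(new KeyValuePair<int, byte[]>(id, ChunkCodec.Pack(chunks[id])));
        }
        dirty.Clear();
        return packed;
    }
}

This addresses problem 1; problem 2 is inherent to the scheme, though a smaller chunk size trades more BLOB rows for cheaper re-packs.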

I am not sure how to tackle these problems and would greatly appreciate your help.

Thanks a lot for the help.

Bala



Jan 27 '06 #1
Here are some questions.

1) Will the snapshots exist forever, or only while they are being worked
on? Will you be able to delete the snapshots at some point in time?
2) How many users are expected to make snapshots at around the same time?
3) Are the snapshots only for viewing, or will the user be making changes as
well?
4) If the user can make changes to a snapshot, where will those changes
persist? Will they be pushed back to the main table that the snapshot came
from?

"Bala Nagarajan" <ba********@new sgroups.nospam> wrote in message
news:%2******** ********@TK2MSF TNGP10.phx.gbl. ..
Hello,

We are developing an application (Windows forms) that allows users to take
a snapshot of the some database tables and save them another set set of
tables (called Model tables)and work with them . Since the data that goes
into model tables are huge (in the order of 30000 records for each table)
we envisaged that we are going to run out of database space if more users
start hitting our application database. To solve this problem i suggetsed
to split the records that go into the model tables in chunks of say 5000
,binary serialize the data, compress it and store the compressed form of
the data in a blob field in the database. Of course the application will
have to the reverse : decompress, deserialize and then render data to GUI.
This proces does incur overhead because of the intermediate operations
but i thought it is worth implementing since it can save us atleast 60-70%
of space which i guess is pretty signifcant. Also the time taken to
retrieve 6 records (instead of 30000) from the database which contains
30000 records in serialized format will be much efficient and faster.

I just want to know if this approach is a good solution. Please let me
know if there is a better way of resolving this issue. The downside if
this approach is when the user modifies the data. The problems are as
follows.

1. If the user has edited the data i will have to find out which chunk he
has modified, serialize and compress only that portion of the data. I
don't want to serialize all the chunks of data if the user just modifies
only one chunk of the data. Though i can use some kind of identifier to
identify the chunk the process may be cumbersome.

2. Even If the user just modifies one record i will have to serialize and
compress 5000 records not matter what, which is kind of bad.

I am not sure as to how to tackle these problem and will greatly
apperciate if you help me out.

Thanks a lot for the help.

Bala


Jan 27 '06 #2

I wrote a smart-client, multi-user application that handles much less
data, but it does serialize individual data on the client as XML.

I find that a more balanced approach, architecturally.
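As an illustration of that per-client approach (not the poster's actual code), a working copy held in a DataSet can be persisted to local XML with the built-in WriteXml/ReadXml methods:

using System.Data;

public static class LocalCache
{
    // Persist the user's working copy to a local XML file, including
    // the schema so the DataSet can be reconstructed faithfully.
    public static void Save(DataSet model, string path)
    {
        model.WriteXml(path, XmlWriteMode.WriteSchema);
    }

    // Reload the working copy from disk.
    public static DataSet Load(string path)
    {
        DataSet model = new DataSet();
        model.ReadXml(path, XmlReadMode.ReadSchema);
        return model;
    }
}

This keeps per-user working data off the shared database entirely, at the cost of the copies living on client machines.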

Bala Nagarajan wrote:
[original post quoted in full; snipped]


Jan 27 '06 #3

Peter,
Thanks for responding. Let me know if you have more questions. I really
appreciate your time.

1) Will the snapshots exist forever or only while they are being worked
on? Will you be able to delete the snapshots at some point in time?

The snapshot gets refreshed every month by a batch process.

2) How many users are expected to make snapshots at around the same time?

Around 50.

3) Are the snapshots only for viewing or will the user be making changes
as well?

The snapshot is for viewing only. But the users can create a copy of this
snapshot (called a Model in our world), and they can insert/update/delete
data at the model level.

Thanks
Bala


"Bala Nagarajan" <ba********@new sgroups.nospam> wrote in message
news:%2******** ********@TK2MSF TNGP10.phx.gbl. ..
Hello,

We are developing an application (Windows forms) that allows users to
take a snapshot of the some database tables and save them another set set
of tables (called Model tables)and work with them . Since the data that
goes into model tables are huge (in the order of 30000 records for each
table) we envisaged that we are going to run out of database space if
more users start hitting our application database. To solve this problem
i suggetsed to split the records that go into the model tables in chunks
of say 5000 ,binary serialize the data, compress it and store the
compressed form of the data in a blob field in the database. Of course
the application will have to the reverse : decompress, deserialize and
then render data to GUI. This proces does incur overhead because of the
intermediate operations but i thought it is worth implementing since it
can save us atleast 60-70% of space which i guess is pretty signifcant.
Also the time taken to retrieve 6 records (instead of 30000) from the
database which contains 30000 records in serialized format will be much
efficient and faster.

I just want to know if this approach is a good solution. Please let me
know if there is a better way of resolving this issue. The downside if
this approach is when the user modifies the data. The problems are as
follows.

1. If the user has edited the data i will have to find out which chunk he
has modified, serialize and compress only that portion of the data. I
don't want to serialize all the chunks of data if the user just modifies
only one chunk of the data. Though i can use some kind of identifier to
identify the chunk the process may be cumbersome.

2. Even If the user just modifies one record i will have to serialize and
compress 5000 records not matter what, which is kind of bad.

I am not sure as to how to tackle these problem and will greatly
apperciate if you help me out.

Thanks a lot for the help.

Bala



Jan 30 '06 #4
Hi

Commonly we do not recommend compressing whole tables and storing them in the
database, because that will make the data hard to maintain. Once a minor error
occurs, the whole package will be unavailable: commonly a small error in a
compressed package makes the entire package unreadable. That is why we store
the rows in the database directly and let the database store and maintain
them; the database has special mechanisms for maintaining the data, including
backup.

Also, I am curious why you need to take a snapshot of the tables: if the
snapshot is for viewing only, you can query the database directly.
If the users will make changes to the model, will the model be updated
back into the database?
If not, why not use a DataSet directly?
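For the DataSet suggestion, a minimal sketch of loading the source tables straight into a client-side DataSet, assuming SQL Server (the connection string and the SourceTable name are illustrative assumptions):

using System.Data;
using System.Data.SqlClient;

public static class ModelLoader
{
    // Fill a DataSet from the live tables. Edits accumulate in the
    // DataSet on the client and consume no server space unless they
    // are pushed back through the same adapter.
    public static DataSet Load(string connectionString)
    {
        DataSet model = new DataSet();
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlDataAdapter adapter =
                   new SqlDataAdapter("SELECT * FROM SourceTable", conn))
        {
            adapter.Fill(model, "SourceTable");
        }
        return model;
    }
}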

Best regards,

Peter Huang
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

Jan 31 '06 #5
