About XML serialization scalability and persistence

I am developing a Windows Service that stays resident on the machine.
The program needs to keep a list of objects in memory (typed as
List<Foo>) synchronized with disc by serializing it to XML and
deserializing it again.

The first, simplest technique, which I am using at the moment, is the
following:

- At program start, if the XML file exists, deserialize it and create an
object with its content.
- Every time the list changes (an element is added or removed), I
serialize the entire object again and write a new file (overwriting
the old one).
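
For reference, the baseline described in the two steps above can be sketched roughly as follows. The thread is about .NET, but the sketch uses Python to stay short; the element names `Foos` and `Foo`, the `name` attribute, and both function names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def save_list(items, path):
    """Serialize the whole list to XML, overwriting any existing file."""
    root = ET.Element("Foos")
    for name in items:
        ET.SubElement(root, "Foo", name=name)
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_list(path):
    """Return the stored list, or an empty list if no file exists yet."""
    if not os.path.exists(path):
        return []
    return [foo.get("name") for foo in ET.parse(path).getroot().iter("Foo")]
```

Every call to `save_list` rewrites the file in place, which is exactly where the two concerns below come from: the cost grows with the list, and a crash in mid-write can leave a truncated file.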

The problem is that I am a little concerned about this going to
production, for the following reasons:

- Scalability: What would happen if the list grows to, for example,
1000 or 10000 elements? (I suppose that serialization would take much
longer and the program would run much slower whenever the list is
modified.)

- Data loss: What would happen if the computer loses power at the
moment the serializer is writing data to disc? (I suppose that I would
lose the last list written and end up with a new, corrupt file, am I
right?)

Do you have any ideas to improve this behaviour?

Thanks in advance.
Andrés [ knocte ]

--
Jan 18 '07 #1
Andres,

For performance, the biggest question you need to ask is how frequently
this list will change, and therefore how often you'll be writing it to disc.
If it's only occasionally, then this will work; otherwise, I would look
for a different solution.

To ensure that this file is always written out, you might look at Messaging
(MSMQ). It can ensure that a message is processed, and can therefore ensure
that the message is written to disc.

I'd do this in the following manner:

When a synchronization request is made, write a message to the queue with
the new value.
Update the list with the new value.

On a separate thread, you have a queue monitor that looks for new messages
in the queue.
When it sees a new message, it peeks the queue (or reads the queue in a
transaction) and starts the file write process to a temporary file.
Upon completion of the file write process, the new temp file is copied over
the old file and the temp file is deleted.
The process then consumes the message from the queue (or commits the
transaction) and looks for the next message to process.

If the power fails while the write process is in progress, worst case is
that you end up with a partial temp file, which wastes disk space. Because
the message hasn't been consumed from the queue, the thread that monitors
the queue will see it after the failure and immediately write out the file
again.
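
A minimal sketch of the temp-file step on its own, without the queue, might look like the following. It is written in Python for brevity, and `atomic_write` is a hypothetical name. Note one deliberate change from the description above: the temp file is renamed over the old file instead of being copied over it, because a copy interrupted halfway could still leave the target half-written. Whether the rename is atomic and durable depends on the operating system and filesystem.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Write data so that a crash leaves either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on
    # one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # swap the new file in place of the old one
    except BaseException:
        # The original file is untouched; remove the partial temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If power fails before `os.replace`, the old file is still complete and only a stray `.tmp` file is left behind, which matches the worst case described above.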

Additionally, if you have multiple threads changing the list, you can ensure
that only one change is processed at a time (you could consume all of the
outstanding messages in one file update as well). You could also batch
updates to the file: once at startup, and then once every 5 minutes or so.
This would limit the impact on the overall performance of the application.
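
The batching idea can be sketched with an in-process queue and a worker thread, again in Python. `BatchedSaver`, `notify_change` and the `save` callback are hypothetical names. Unlike MSMQ, this queue lives only in memory, so it is an assumption of the sketch that losing the changes made since the last save is acceptable after a crash.

```python
import queue
import threading


class BatchedSaver:
    """Coalesce change notifications so that many changes trigger one save."""

    def __init__(self, save, interval=300.0):
        self._save = save            # callable that persists the current list
        self._interval = interval    # seconds between batches (5 minutes here)
        self._pending = queue.Queue()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def notify_change(self):
        """Called by any thread after it has modified the list."""
        self._pending.put(None)

    def _drain(self):
        """Remove all queued notifications and return how many there were."""
        count = 0
        while True:
            try:
                self._pending.get_nowait()
            except queue.Empty:
                return count
            count += 1

    def _run(self):
        # Wake up once per interval; save only if something changed.
        while not self._stop.wait(self._interval):
            if self._drain():
                self._save()

    def close(self):
        """Stop the worker and flush anything still pending."""
        self._stop.set()
        self._thread.join()
        if self._drain():
            self._save()
```

Because only the worker thread (or `close`, after the worker has stopped) calls `save`, writes to the file are serialized even when several threads change the list.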

Make sense?

Robert
"Andrés G. Aragoneses [ knocte ]" <kn****@NO-SPAM-PLEASE-gmail.com> wrote
in message news:ef******** ******@TK2MSFTNGP03.phx.gbl...
> I am developing a Windows Service that is resident on the machine. [...]

Jan 18 '07 #2
Robert May wrote:
> For performance reasons, the biggest question you need to ask is how
> frequently this list will change and you'll be writing this list to disc?
> [...]

Thanks for your comment.
I had already thought about the temp file solution but wasn't sure it
was the most correct one. Perhaps I was dreaming of a more elegant one,
involving some transactional way of accessing the disc (one that does
not delete the original file until the next one has been completely
written). I am not going to use MSMQ because it's not portable, but the
idea is interesting.

Well, the application won't update the list very frequently, but I am
still concerned about rewriting the whole object on each modification
(even if I create a thread to write the data at a longer interval).
Isn't there a way to write and remove the data incrementally?
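
One common way to get incremental writes is an append-only journal: each add or remove is appended as one small record, and the list is rebuilt by replaying the records at startup. The sketch below is in Python and swaps the single XML document for one JSON record per line, which is a different format from the one discussed in this thread. `JournaledList` and its methods are hypothetical names.

```python
import json
import os


class JournaledList:
    """Keep a list in memory and log each add/remove as one appended line."""

    def __init__(self, path):
        self.items = []
        last = "\n"
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as log:
                for last in log:
                    self._apply(last)
        self._log = open(path, "a", encoding="utf-8")
        if not last.endswith("\n"):
            self._log.write("\n")  # terminate a record torn by a crash

    def _apply(self, line):
        try:
            op, value = json.loads(line)
        except ValueError:
            return  # skip a torn record left by a crash in mid-append
        if op == "add":
            self.items.append(value)
        elif op == "remove" and value in self.items:
            self.items.remove(value)

    def _record(self, op, value):
        self._log.write(json.dumps([op, value]) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # make the record durable before returning

    def add(self, value):
        self._record("add", value)
        self.items.append(value)

    def remove(self, value):
        self.items.remove(value)  # raises ValueError if the value is absent
        self._record("remove", value)

    def close(self):
        self._log.close()
```

Each change costs one short append regardless of the list size, and a crash in mid-write can damage at most the last record. The log grows without bound, so a real service would occasionally rewrite it as a fresh snapshot.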

Regards,

Andrés [ knocte ]

--
Jan 18 '07 #3
