A bug in .Net Binary Serialization?

Hi all,

I recently came across something really strange, and after a couple of days
of debugging I finally nailed down the cause. However, I have no idea whether
I am doing something wrong or whether it is a bug in binary serialization. The
following is a simple example of the code:


using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            A a = new A();
            B b = new B(a);
            List<C> cList = new List<C>();
            for (int i = 0; i < 10000; i++)
            {
                cList.Add(new C("someValue"));
            }
            b.CList = cList;

            MemoryStream stream = new MemoryStream();
            BinaryFormatter objFormatter = new BinaryFormatter();
            objFormatter.Serialize(stream, b);

            // Print the size of the serialized data.
            Console.WriteLine(stream.Length);
        }
    }

    [Serializable]
    class A
    {
        private Dictionary<string, string> _dic1 = new Dictionary<string, string>();

        public A()
        {
            _dic1.Add("key1", "value1");
            _dic1.Add("key2", "value2");
        }
    }

    [Serializable]
    class B
    {
        private List<C> _cList = new List<C>();
        private A _a;

        public B(A a)
        {
            _a = a;
        }

        public List<C> CList
        {
            get { return _cList; }
            set { _cList = value; }
        }
    }

    [Serializable]
    class C
    {
        private Dictionary<string, string> _dic2 = new Dictionary<string, string>();
        private string _value;

        public C(string value)
        {
            _value = value;
        }
    }
}

If you run the code, you will find that the stream has a length of 4,532,517
bytes. Now try changing _dic1 (in class A) to a Dictionary<string, object>
and run the code again. This time the stream length is 462,924 bytes. Why is
there such a big difference just from changing the type? I also noticed that
this might be related to the fact that I have another dictionary of the same
type in class C.

Am I doing something wrong here? If not, is this a bug?

Thanks in advance!!

Jul 2 '08 #1
On Tue, 01 Jul 2008 22:50:00 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> [...]
> If you run the code, you will find that the stream has a length of
> 4,532,517 bytes. Now, try changing _dic1(Class A) to be a
> Dictionary<string, object> and run the code again. Now, the stream length
> is 462,924 bytes. Why is there such a big difference just by changing the
> type? What I noticed also was that this might be due to the fact that I
> have another dictionary of the same type in Class C.
>
> Am I doing something wrong here? If not, is this a bug?
I'll vote bug. But I admit, I'm no serialization expert so I might be
missing something.

But I do agree that it seems remarkable that such a simple change in the
type parameter for a Dictionary<TKey, TValue> instance would produce such
a dramatic difference. And I have in fact confirmed the behavior (albeit
with slightly different numbers in the output...but the relative scale is
the same).

It would be interesting to try to serialize to a more readable format and
see what the specific differences are. I don't have the time at the
moment to explore too much, but it's something you might like to try.

Pete
Jul 2 '08 #2
Sorry, my post seems to have a huge white space in the middle. [Repost of the
original message, identical to #1.]
Jul 2 '08 #3
> It would be interesting to try to serialize to a more readable format and
> see what the specific differences are. I don't have the time at the
> moment to explore too much, but it's something you might like to try.

This example was actually derived from more complex code, if that is what
you meant. In my unit tests of it, I noticed that the size recently
tripled after the addition of one dictionary, even though there is only ever
one instance of it. That was when I started debugging; I was eventually able
to pinpoint the cause and came up with a simpler example that shows the
problem.
Jul 2 '08 #4
On Wed, 02 Jul 2008 00:20:01 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
>> It would be interesting to try to serialize to a more readable format
>> and see what the specific differences are. I don't have the time at the
>> moment to explore too much, but it's something you might like to try.
>
> This example was actually derived from a more complex code if that was
> what you meant.
No, it's not. The code you posted was fine. I'm talking about the
resulting data itself. Serialize less, and to a format like SOAP so that
you can take the two alternatives and inspect them as text files
side-by-side. That should give you some clues as to what differences
exist between the two. And that _might_ lead you to some useful
conclusion as to why such a simple change produces such a dramatic
difference.

If you can accomplish that with the output from the BinaryFormatter, more
power to you. :) But I'd go with a text-format serialization. I naïvely
tried to swap in an XmlSerializer for the BinaryFormatter, but of course
it has different requirements from the regular serialization stuff (for
one, it requires everything to be public that's going to be serialized).
I didn't have the time to make the necessary adjustments, but that could
be something you might try, since the output from the XmlSerializer is yet
again much more readable than SOAP.
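
If no text-based formatter is available, one rough way to compare the two binary outputs is to pull the readable strings out of the raw bytes and count how often each one appears. BinaryFormatter writes type, assembly and member names into the stream as text, so a name that shows up thousands of times in one output and only once in the other is a useful clue. The following is only a sketch under that assumption; BinaryDump and CountPrintableRuns are made-up names, not part of .NET, and the run detection is a heuristic (length-prefix bytes that happen to be printable can end up glued onto a run).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of the framework.
static class BinaryDump
{
    // Collects runs of printable ASCII characters that are at least
    // minLength long and counts how often each distinct run occurs.
    public static Dictionary<string, int> CountPrintableRuns(byte[] data, int minLength)
    {
        var counts = new Dictionary<string, int>();
        var run = new StringBuilder();

        // Iterate one step past the end so the final run is flushed too.
        for (int i = 0; i <= data.Length; i++)
        {
            bool printable = i < data.Length && data[i] >= 0x20 && data[i] < 0x7F;
            if (printable)
            {
                run.Append((char)data[i]);
                continue;
            }

            // End of a run (or end of data): record it if it is long enough.
            if (run.Length >= minLength)
            {
                string text = run.ToString();
                int seen;
                counts.TryGetValue(text, out seen);
                counts[text] = seen + 1;
            }
            run.Length = 0;
        }
        return counts;
    }
}
```

With the example from the first post, this could be called as BinaryDump.CountPrintableRuns(stream.ToArray(), 8) once for each variant of _dic1, printing the entries with the highest counts and comparing the two lists side by side.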

Pete
Jul 2 '08 #5
The problem with something like the XmlSerializer is that it does not support
serialization of dictionaries.

Does anyone else have any other ideas about this problem?

Thanks.
Jul 3 '08 #6
On Wed, 02 Jul 2008 18:17:10 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> The problem with something like the XmlSerializer is that it does not
> support serialization of dictionaries.
Not by default, no. You can customize it though. Of course, it's
entirely possible that the problem is related to the default, automatic
serialization of dictionaries, in which case customizing XmlSerializer to
serialize your dictionaries won't help.
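
One common form of that customization is to copy the dictionary into a list of simple key/value objects, which XmlSerializer does accept, and serialize the list instead. The sketch below assumes string keys and values; Entry and DictionaryXml are hypothetical names introduced for illustration. It only covers the dictionary itself: XmlSerializer still looks at public members only, so classes like A, B and C from the first post would have to expose their data before this could be applied to them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical helper type: one key/value pair in a shape XmlSerializer accepts.
public class Entry
{
    public string Key;
    public string Value;
}

// Hypothetical helper, not part of the framework.
public static class DictionaryXml
{
    // Copies the dictionary into a List<Entry> and serializes that list to XML text.
    public static string ToXml(Dictionary<string, string> source)
    {
        var entries = new List<Entry>();
        foreach (KeyValuePair<string, string> pair in source)
        {
            entries.Add(new Entry { Key = pair.Key, Value = pair.Value });
        }

        var serializer = new XmlSerializer(typeof(List<Entry>));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, entries);
            return writer.ToString();
        }
    }

    // Reads XML produced by ToXml back into a dictionary.
    public static Dictionary<string, string> FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(List<Entry>));
        using (var reader = new StringReader(xml))
        {
            var entries = (List<Entry>)serializer.Deserialize(reader);
            var result = new Dictionary<string, string>();
            foreach (Entry entry in entries)
            {
                result[entry.Key] = entry.Value;
            }
            return result;
        }
    }
}
```

Because the output is plain XML text, two serialized variants can be saved to files and compared in any diff tool.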
> Does anyone else have any other ideas to this problem?
Well, like I said, SOAP is also basically text-based and you can use that
as easily as BinaryFormatter.

Pete
Jul 3 '08 #7
But isn't it the same with SOAP? I think SOAP does not support generics,
which means that it doesn't support dictionaries?

Jul 3 '08 #8
On Wed, 02 Jul 2008 21:06:00 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> But isn't it the same with SOAP? I think SOAP does not support generics,
> which means that it doesn't support dictionaries?
My recollection is that it does. I admit, I haven't tested it recently to
make sure. But I am under the impression that you can just swap in
SoapFormatter where you have BinaryFormatter, and it will "just work".

If I'm wrong, well...it should only take you a few minutes to find out. :)

Pete
Jul 3 '08 #9
I actually did that yesterday using the SoapFormatter and it did not work,
which was why I thought maybe you meant something else.
Jul 3 '08 #10
On Wed, 02 Jul 2008 22:28:00 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> I actually did that yesterday using the SoapFormatter and it did not
> work, which was why I thought maybe you meant something else.
Okay. Well, since I posted that message I've had a chance to try it
myself, and found the same thing you did. :)

I find it ironic, in a very unfortunate way, that the various
serialization techniques in .NET are basically incompatible with each
other. That is, there is no uniform serialization paradigm that allows
"pluggable" formatters.

Oh well. Sorry I wasn't of any help. Though, perhaps it was at least
some help to have someone validate your findings. :)

If you do make progress on the issue, please post your results here so
that others can benefit from the experience.

Thanks,
Pete
Jul 3 '08 #11
On Jul 2, 6:50 am, ztRon <zt...@discussions.microsoft.com> wrote:
> [...]
> Am I doing something wrong here? If not, is this a bug?
ztRon,

I don't think this is a bug, but just the way the data is stored when
you use binary serialization.

I had a similar problem when serializing classes to a file, where my
class contained an array of strings. If the string values were all the
same, then only one copy of the string was stored rather than multiple
copies of the same string (which I think is quite clever really; it saves
space and is probably quicker too).

My bug was that when I changed one of the strings, the serialized size of
the class changed, so it should not have been written back to the same
slot in my file, and I ended up corrupting my data file.

So I think you don't have a bug, just a feature of binary
serialization.

SMJT
Jul 3 '08 #12
I think you'll have better luck with the newer DataContractSerializer.
It works way better than the older serialization stuff. Here's a cut
from my code.

// Requires: using System.IO; using System.Runtime.Serialization; using System.Xml;
public byte[] GetDataBytes(params Type[] types)
{
    // "types" lists any additional known types the serializer may encounter.
    var ds = new DataContractSerializer(GetType(), types);
    using (var mem = new MemoryStream())
    {
        //using (var w = XmlDictionaryWriter.CreateTextWriter(mem)) // for xml
        using (var w = XmlDictionaryWriter.CreateBinaryWriter(mem))
        {
            ds.WriteObject(w, this);
        }
        return mem.ToArray();
    }
}

And I prefer the DataContract and DataMember attributes more than the
Serializable one.
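
For readers who have not used those attributes, here is a minimal self-contained sketch of what a class shaped like C from the first post might look like with [DataContract] and [DataMember], together with a round trip through the binary XML writer and reader. Item and ContractBytes are hypothetical names, and nothing here has been measured against the size problem discussed in this thread; it only illustrates the attribute style.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical class, modeled loosely on class C from the first post.
[DataContract]
public class Item
{
    // Private fields are included as long as they carry [DataMember].
    [DataMember]
    private Dictionary<string, string> _settings = new Dictionary<string, string>();

    [DataMember]
    private string _value;

    public Item(string value)
    {
        _value = value;
    }

    public string Value { get { return _value; } }

    public void Set(string key, string value) { _settings[key] = value; }
    public string Get(string key) { return _settings[key]; }
}

// Hypothetical helper, not part of the framework.
public static class ContractBytes
{
    // Serializes a data contract object to the binary XML format.
    public static byte[] ToBytes<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var mem = new MemoryStream())
        {
            using (XmlDictionaryWriter writer = XmlDictionaryWriter.CreateBinaryWriter(mem))
            {
                serializer.WriteObject(writer, value);
            }
            // ToArray still works after the writer has closed the stream.
            return mem.ToArray();
        }
    }

    // Reads an object back from bytes produced by ToBytes.
    public static T FromBytes<T>(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (XmlDictionaryReader reader =
            XmlDictionaryReader.CreateBinaryReader(data, XmlDictionaryReaderQuotas.Max))
        {
            return (T)serializer.ReadObject(reader);
        }
    }
}
```

Swapping CreateBinaryWriter for CreateTextWriter, as in the commented-out line of the snippet above, produces readable XML, which is handy when the goal is to see what actually ends up in the output.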
Jul 3 '08 #13
