A bug in .Net Binary Serialization?

Hi all,

I recently came across something really strange, and after a couple of days
of debugging I finally nailed down the cause. However, I have no idea whether
I am doing something wrong or whether it is a bug in binary serialization. The
following is a simple example of the code:


using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            A a = new A();
            B b = new B(a);
            List<C> cList = new List<C>();
            for (int i = 0; i < 10000; i++)
            {
                cList.Add(new C("someValue"));
            }
            b.CList = cList;

            MemoryStream stream = new MemoryStream();
            BinaryFormatter objFormatter = new BinaryFormatter();
            objFormatter.Serialize(stream, b);

            // Print the serialized size so the two variants can be compared.
            Console.WriteLine(stream.Length);
        }
    }

    [Serializable]
    class A
    {
        private Dictionary<string, string> _dic1 = new Dictionary<string, string>();

        public A()
        {
            _dic1.Add("key1", "value1");
            _dic1.Add("key2", "value2");
        }
    }

    [Serializable]
    class B
    {
        private List<C> _cList = new List<C>();
        private A _a;

        public B(A a)
        {
            _a = a;
        }

        public List<C> CList
        {
            get { return _cList; }
            set { _cList = value; }
        }
    }

    [Serializable]
    class C
    {
        // Every C instance carries its own (empty) dictionary of the same type as A._dic1.
        private Dictionary<string, string> _dic2 = new Dictionary<string, string>();
        private string _value;

        public C(string value)
        {
            _value = value;
        }
    }
}

If you run the code, you will find that the stream has a length of 4,532,517
bytes. Now try changing _dic1 (in class A) to a Dictionary<string, object>
and run the code again: the stream length drops to 462,924 bytes. Why is there
such a big difference just from changing the type? I also noticed that
this might be related to the fact that I have another dictionary of the same
type in class C.

Am I doing something wrong here? If not, is this a bug?

Thanks in advance!!

Jul 2 '08 #1
12 Replies


On Tue, 01 Jul 2008 22:50:00 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> [...]
> If you run the code, you will find that the stream has a length of
> 4,532,517 bytes. Now, try changing _dic1(Class A) to be a
> Dictionary<string, object> and run the code again. Now, the stream
> length is 462,924 bytes. Why is there such a big difference just by
> changing the type? What I noticed also was that this might be due to
> the fact that I have another dictionary of the same type in Class C.
>
> Am I doing something wrong here? If not, is this a bug?

I'll vote bug. But I admit, I'm no serialization expert so I might be
missing something.

But I do agree that it seems remarkable that such a simple change in the
type parameter for a Dictionary<TKey, TValue> instance would produce such
a dramatic difference. And I have in fact confirmed the behavior (albeit
with slightly different numbers as the output...but the relative scale is
the same).

It would be interesting to try to serialize to a more readable format and
see what the specific differences are. I don't have the time at the
moment to explore too much, but it's something you might like to try.

Pete
Jul 2 '08 #2

> It would be interesting to try to serialize to a more readable format and
> see what the specific differences are. I don't have the time at the
> moment to explore too much, but it's something you might like to try.

This example was actually derived from more complex code, if that was what
you meant. In my unit testing of it, I noticed that the size recently
tripled due to the addition of one dictionary, even though there is only ever
one instance of it. That was when I started to debug; I was finally able to
pinpoint the cause and came up with a simpler example to demonstrate the
problem.
Jul 2 '08 #4

On Wed, 02 Jul 2008 00:20:01 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
>> It would be interesting to try to serialize to a more readable format
>> and see what the specific differences are. I don't have the time at the
>> moment to explore too much, but it's something you might like to try.
>
> This example was actually derived from a more complex code if that was
> what you meant.

No, it's not. The code you posted was fine. I'm talking about the
resulting data itself. Serialize less, and to a format like SOAP so that
you can take the two alternatives and inspect them as text files
side-by-side. That should give you some clues as to what differences
exist between the two. And that _might_ lead you to some useful
conclusion as to why such a simple change produces such a dramatic
difference.

If you can accomplish that with the output from the BinaryFormatter, more
power to you. :) But I'd go with a text-format serialization. I naïvely
tried to swap in an XmlSerializer for the BinaryFormatter, but of course
it has different requirements from the regular serialization stuff (for
one, it requires everything to be public that's going to be serialized).
I didn't have the time to make the necessary adjustments, but that could
be something you might try, since the output from the XmlSerializer is yet
again much more readable than SOAP.

Pete
Jul 2 '08 #5

The problem with something like the XmlSerializer is that it does not support
serialization of dictionaries.

Does anyone else have any other ideas to this problem?

Thanks.
Jul 3 '08 #6

On Wed, 02 Jul 2008 18:17:10 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> The problem with something like the XmlSerializer is that it does not
> support serialization of dictionaries.

Not by default, no. You can customize it though. Of course, it's
entirely possible that the problem is related to the default, automatic
serialization of dictionaries, in which case customizing XmlSerializer to
serialize your dictionaries won't help.
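
A minimal sketch of one such customization, assuming the dictionary is first copied into a list of key/value entries that XmlSerializer can handle. The Entry and DictionaryXml names are hypothetical and not from this thread, and the result only shows the dictionary's contents as text; it says nothing about how BinaryFormatter lays out its own stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical helper type: XmlSerializer needs a public type with a
// parameterless constructor and public members.
public class Entry
{
    public string Key;
    public string Value;
}

// Hypothetical helper class, used here only for illustration.
public static class DictionaryXml
{
    // Copies the dictionary into a List<Entry> and serializes that list to XML text.
    public static string ToXml(Dictionary<string, string> dictionary)
    {
        var entries = new List<Entry>();
        foreach (KeyValuePair<string, string> pair in dictionary)
        {
            entries.Add(new Entry { Key = pair.Key, Value = pair.Value });
        }

        var serializer = new XmlSerializer(typeof(List<Entry>));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, entries);
            return writer.ToString();
        }
    }
}
```

Calling DictionaryXml.ToXml on a dictionary returns a string that can be written to a file and compared in any text diff tool.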

> Does anyone else have any other ideas to this problem?

Well, like I said, SOAP is also basically text-based and you can use that
as easily as BinaryFormatter.

Pete
Jul 3 '08 #7

But isn't it the same with SOAP? I think SoapFormatter does not support
generics, which would mean that it doesn't support generic dictionaries
either?

> Well, like I said, SOAP is also basically text-based and you can use that
> as easily as BinaryFormatter.
>
> Pete
Jul 3 '08 #8

On Wed, 02 Jul 2008 21:06:00 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> But isn't it the same with SOAP? I think SOAP does not support Generics
> which thus means that it doesn't support dictionaries?

My recollection is that it does. I admit, I haven't tested it recently to
make sure. But I am under the impression that you can just swap in
SoapFormatter where you have BinaryFormatter, and it will "just work".

If I'm wrong, well...it should only take you a few minutes to find out. :)

Pete
Jul 3 '08 #9

I actually tried that yesterday using the SoapFormatter and it did not work,
which is why I thought maybe you meant something else.
Jul 3 '08 #10

On Wed, 02 Jul 2008 22:28:00 -0700, ztRon
<zt***@discussions.microsoft.com> wrote:
> I actually did that yesterday using the SOAPFormatter and it did not
> work, which was why I thought maybe you meant something else.

Okay. Well, since I posted that message I have had a chance to try it
myself, and I found the same thing you did. :)

I find it ironic, in a very unfortunate way, that the various
serialization techniques in .NET are basically incompatible with each
other. That is, there is no uniform serialization paradigm that allows
"pluggable" formatters.

Oh well. Sorry I wasn't of any help. Though, perhaps it was at least
some help to have someone validate your findings. :)

If you do make progress on the issue, please post your results here so
that others can benefit from the experience.

Thanks,
Pete
Jul 3 '08 #11

ztRon,

I don't think this is a bug, but just the way the data is stored when
you use binary serialization.

I had a similar problem when serializing classes to a file, where my
class contained an array of strings. If the string values were all the
same, then only one copy of the string was stored rather than multiple
copies of the same string (which I think is quite clever really; it saves
space and is probably quicker too).

My bug was that when I changed one of the strings, the serialized class
size changed, so it shouldn't have been written back to the same slot in my
file, and I ended up corrupting my data file.

So I think you don't have a bug, just a feature of binary
serialization.

SMJT
Jul 3 '08 #12

I think you'll have better luck with the newer DataContractSerializer.
It works way better than the older serialization stuff. Here's a cut
from my code.

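// Requires: using System; using System.IO;
//           using System.Runtime.Serialization; using System.Xml;
public byte[] GetDataBytes(params Type[] types)
{
    // 'types' are the known types the serializer may meet in the object graph.
    var ds = new DataContractSerializer(GetType(), types);
    using (var mem = new MemoryStream())
    {
        //using (var w = XmlDictionaryWriter.CreateTextWriter(mem)) // for xml
        using (var w = XmlDictionaryWriter.CreateBinaryWriter(mem))
        {
            ds.WriteObject(w, this);
        }
        return mem.ToArray();
    }
}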

And I prefer the DataContract and DataMember attributes more than the
Serializable one.
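
To make the DataContract/DataMember suggestion a bit more concrete, here is a minimal round-trip sketch with a class that holds a Dictionary<string, string>. Settings and SettingsSerializer are hypothetical names made up for this illustration, and this uses DataContractSerializer's binary XML format, not the BinaryFormatter format discussed earlier in the thread, so the sizes are not comparable to the numbers above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical example type: only members marked [DataMember] are serialized.
[DataContract]
public class Settings
{
    [DataMember]
    private Dictionary<string, string> _values = new Dictionary<string, string>();

    public void Set(string key, string value) { _values[key] = value; }
    public string Get(string key) { return _values[key]; }
}

// Hypothetical helper class, used here only for illustration.
public static class SettingsSerializer
{
    // Writes the object using the binary XML writer, as in the post above.
    public static byte[] ToBytes(Settings settings)
    {
        var serializer = new DataContractSerializer(typeof(Settings));
        using (var memory = new MemoryStream())
        {
            using (var writer = XmlDictionaryWriter.CreateBinaryWriter(memory))
            {
                serializer.WriteObject(writer, settings);
            }
            return memory.ToArray();
        }
    }

    // Reads the object back from the bytes produced by ToBytes.
    public static Settings FromBytes(byte[] data)
    {
        var serializer = new DataContractSerializer(typeof(Settings));
        using (var reader = XmlDictionaryReader.CreateBinaryReader(data, XmlDictionaryReaderQuotas.Max))
        {
            return (Settings)serializer.ReadObject(reader);
        }
    }
}
```

Swapping CreateBinaryWriter for CreateTextWriter (and the reader likewise) should give readable XML instead, which is handy when you want to inspect what was actually written.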
Jul 3 '08 #13
