Binary Differencing

I posted a question a few days ago concerning file differencing and got
some thought-provoking answers. My original question dealt with identifying
file differences for storing multiple versions of documents, some binary,
some textual. I know that there are many excellent version control systems
out there (CVS, Subversion, etc.), but due to restrictions on the project
I'm working on we have to have a custom implementation. To gain what could
be substantial savings on both bandwidth and file storage, we have decided
to store and transmit only version differences in our system (the backend
is SQL Server). For the most flexibility we want to go with a binary
differencing model. While searching around I found RFC 3284, "The VCDIFF
Generic Differencing and Compression Data Format", which really seemed to
fit my needs from a very high level. A little more work showed that other
version control systems (Vault) actually implemented this RFC for its
system. Now I've started to digest the RFC in an effort to start some
prototype implementations, but as is common, the RFC is a fairly high-level
document. Is anyone aware of any material on implementing binary
differencing such as VCDIFF, preferably with some .NET (or any) code
snippets? I plan to implement this in the .NET Framework (C#), hence my
post to this group. Any advice or direction would be greatly welcomed.
Thanks!

Josh Carlisle

Nov 16 '05 #1
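
For readers new to the idea in the post above: a VCDIFF delta describes the
new version as a sequence of ADD, COPY and RUN instructions applied against
the old version (RFC 3284, section 3). The following is a minimal sketch of
applying such instructions, written in Python for brevity rather than the
C# the poster is targeting. The names Add, Copy, Run and apply_delta are
made up for illustration, and the sketch works on already-parsed
instructions; it does not read the RFC's byte-level delta format.

```python
from dataclasses import dataclass


@dataclass
class Add:
    data: bytes  # literal bytes appended to the target


@dataclass
class Copy:
    addr: int  # offset in the combined "source then target" address space
    size: int  # number of bytes to copy


@dataclass
class Run:
    byte: int  # single byte value to repeat
    size: int  # number of repetitions


def apply_delta(source: bytes, instructions) -> bytes:
    """Rebuild the target from the source and a list of instructions."""
    target = bytearray()
    for ins in instructions:
        if isinstance(ins, Add):
            target += ins.data
        elif isinstance(ins, Run):
            target += bytes([ins.byte]) * ins.size
        elif isinstance(ins, Copy):
            # Copy one byte at a time so that a COPY may overlap the bytes
            # it is producing, which is how short repeats are expressed.
            for i in range(ins.size):
                pos = ins.addr + i
                if pos < len(source):
                    target.append(source[pos])
                else:
                    target.append(target[pos - len(source)])
        else:
            raise TypeError(f"unknown instruction: {ins!r}")
    return bytes(target)


# The worked example from RFC 3284, section 3.
source = b"abcdefghijklmnop"
delta = [
    Copy(addr=0, size=4),
    Add(b"wxyz"),
    Copy(addr=4, size=4),
    Copy(addr=24, size=12),  # address 24 lies in the target itself
    Run(byte=ord("z"), size=4),
]
result = apply_delta(source, delta)
```

Only the instruction list has to be stored or transmitted per version,
which is where the bandwidth and storage savings would come from. Finding a
good instruction list (the encoder side) is the harder part and is not
shown here.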

I have a C# *decoder* for VCDiff which is freely available -
http://www.pobox.com/~skeet/csharp/miscutil

Unfortunately I don't have an encoder in C#. It's one of those things
I'd like to do some time, but don't have the time at the moment.

I found RFC 3284 to be one of the best written ones I've seen - the
implementation of a decoder only took about 4 hours.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #2
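
As a taste of what a decoder has to handle, here is a minimal sketch of the
integer encoding RFC 3284 defines in section 2: unsigned integers are
written base 128, most significant digit first, with the high bit set on
every byte except the last. This is an independent illustration in Python,
not code from the MiscUtil decoder mentioned above, and the function names
read_vcdiff_int and write_vcdiff_int are made up for the example.

```python
def read_vcdiff_int(data: bytes, pos: int = 0):
    """Decode one RFC 3284 integer; return (value, position after it)."""
    value = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated integer")
        b = data[pos]
        pos += 1
        # Low 7 bits carry the next base-128 digit.
        value = (value << 7) | (b & 0x7F)
        # A clear high bit marks the final byte.
        if not (b & 0x80):
            return value, pos


def write_vcdiff_int(value: int) -> bytes:
    """Encode a non-negative integer in the same format."""
    if value < 0:
        raise ValueError("only unsigned integers can be encoded")
    out = [value & 0x7F]  # last byte: continuation bit clear
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)  # earlier bytes: bit set
        value >>= 7
    return bytes(reversed(out))


# The RFC's own example value.
encoded = write_vcdiff_int(123456789)
decoded, next_pos = read_vcdiff_int(encoded)
```

In C# the same loop would read from a Stream one byte at a time; a real
decoder should also guard against values that overflow the integer type in
use.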
Thanks Jon, I'll take a look at your decoder. I'm sure it will prove
helpful, at the very least for getting me on the right track. I don't have
much experience taking an RFC to code, but it does seem to be well written.
Thanks again.

Josh

Nov 16 '05 #3
