MD5CryptoServiceProvider Hashing a split file

Hi,

I am very new to C# and the .NET Framework. I am trying to hash (using
MD5CryptoServiceProvider) a source that is split into several files.

Now when the source is in one file I can produce the correct md5 hash.

My issue is how can I reproduce the correct hash when the file is split
into different files.

Thanks :)

Jun 27 '08 #1
John Smith wrote:
> My issue is how can I reproduce the correct hash when the file is split
> into different files.
A hash is calculated based on the byte content.

Why does it make a difference whether those bytes are read
from a single file or from multiple files?

Arne
Jun 27 '08 #2
Arne Vajhøj wrote:
> A hash is calculated based on the byte content.
>
> Why does it make a difference whether those bytes are read
> from a single file or from multiple files?

Thanks Arne.

I think I might not have explained myself well. Let me rephrase: I have no
clue how to do it. :?

I think the best way is to show you my problem with some quick example code:

------------------------------------------------------------
MD5CryptoServiceProvider oMD5 = new MD5CryptoServiceProvider();
string sRet;

string s1 = "First String Sample";
string s2 = "Second String Sample";
string s3 = s1 + s2;

// Hash of s1 on its own
byte[] bBytes = System.Text.Encoding.ASCII.GetBytes(s1);
sRet = BitConverter.ToString(oMD5.ComputeHash(bBytes)).Replace("-", string.Empty);
System.Diagnostics.Debug.WriteLine(sRet);

// Hash of s2 on its own
bBytes = System.Text.Encoding.ASCII.GetBytes(s2);
sRet = BitConverter.ToString(oMD5.ComputeHash(bBytes)).Replace("-", string.Empty);
System.Diagnostics.Debug.WriteLine(sRet);

// Hash of the concatenation s1 + s2
bBytes = System.Text.Encoding.ASCII.GetBytes(s3);
sRet = BitConverter.ToString(oMD5.ComputeHash(bBytes)).Replace("-", string.Empty);
System.Diagnostics.Debug.WriteLine(sRet);
------------------------------------------------------------

The output hash is as follows:
s1 = 1EC25881AD012D4CA6E73D1986AE93FB
s2 = D8D46AC432C7251F863C2D5B91FE48FC
s3 = 9E158DDEE697EBAEC2A036F459B02448

What I want is to hash s1, get that result, then continue
hashing s2 and end up with the final s3 result.

Right now the only way I know of getting the s3 hash is by first
concatenating the strings and then running the result through ComputeHash.

This isn't much of an issue when the input is a small string. However,
if I am trying to hash several files then that is a different matter.
These files can be large, and the only way I know of doing it is to
combine all the files into a single temporary file and then
pass the stream to ComputeHash.

Surely there has to be a better method.

Any advice?

Thanks
Jun 27 '08 #3
John Smith wrote:
> I think the best way is to show you my problem with some quick example code:
Example code is always good.
> These files can be large, and the only way I know of doing it is to
> combine all the files into a single temporary file and then
> pass the stream to ComputeHash.
You cannot "add" MD5 checksums.

But if you use TransformBlock and TransformFinalBlock instead
of ComputeHash, then you should be able to process small
chunks (like 1 MB or 10 MB) at a time - even coming from
multiple files.

Arne
Jun 27 '08 #4
Arne Vajhøj wrote:
> You cannot "add" MD5 checksums.
>
> But if you use TransformBlock and TransformFinalBlock instead
> of ComputeHash, then you should be able to process small
> chunks (like 1 MB or 10 MB) at a time - even coming from
> multiple files.

Example:

using System;
using System.Text;
using System.Security.Cryptography;

namespace E
{
    public class Program
    {
        public static void Main(string[] args)
        {
            MD5CryptoServiceProvider md5 = new MD5CryptoServiceProvider();

            // Hash s1, s2 and s1 + s2 in one go each with ComputeHash.
            string s1 = "First String Sample";
            Console.WriteLine(BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(s1))).Replace("-", ""));
            string s2 = "Second String Sample";
            Console.WriteLine(BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(s2))).Replace("-", ""));
            string s3 = s1 + s2;
            Console.WriteLine(BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(s3))).Replace("-", ""));

            // Now feed s1 and s2 as two separate blocks; the result
            // should be the same as the hash of s3 above.
            md5.Initialize();
            byte[] b1 = Encoding.UTF8.GetBytes(s1);
            byte[] b2 = Encoding.UTF8.GetBytes(s2);
            // The output buffer is only a copy of the input, so null is passed.
            md5.TransformBlock(b1, 0, b1.Length, null, 0);
            md5.TransformFinalBlock(b2, 0, b2.Length);
            Console.WriteLine(BitConverter.ToString(md5.Hash).Replace("-", ""));
            Console.ReadKey();
        }
    }
}

(it may be possible to optimize it a bit, but it should
show the concept)
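
The same pattern applied to actual files might look roughly like the sketch below. It is only an illustration: the class name SplitFileHash and the idea of passing the part files on the command line are made up for the example, and it uses MD5.Create() rather than MD5CryptoServiceProvider, which should behave the same for this purpose. The parts have to be given in the same order in which they were split.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class SplitFileHash
{
    // Hashes the given files, in order, as if they were one concatenated file.
    public static string ComputeMd5(string[] partPaths)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] buffer = new byte[1024 * 1024]; // read in 1 MB chunks
            foreach (string path in partPaths)
            {
                using (FileStream fs = File.OpenRead(path))
                {
                    int read;
                    while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        // Feed the chunk into the running hash state.
                        // No output copy is needed, so null is passed.
                        md5.TransformBlock(buffer, 0, read, null, 0);
                    }
                }
            }
            // Close the hash with an empty final block.
            md5.TransformFinalBlock(buffer, 0, 0);
            return BitConverter.ToString(md5.Hash).Replace("-", string.Empty);
        }
    }

    public static void Main(string[] args)
    {
        // Hypothetical usage: SplitFileHash.exe part1.bin part2.bin part3.bin
        Console.WriteLine(ComputeMd5(args));
    }
}
```

Only one buffer's worth of data is in memory at a time, so no temporary combined file is needed.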

Arne
Jun 27 '08 #5
> (it may be possible to optimize it a bit, but it should
> show the concept)

Ahhhh. I wish I had seen the code earlier. I actually figured it out after you pointed me to TransformBlock.
Thanks Arne, you've been a great help. Saved me a lot of time.

I still have one final issue and I don't think it can be solved (easily): working out the hash at each stage.

So the hash for s1
Then the hash for s1 + s2
Then the hash for s1 + s2 + s3
etc...

It seems that I can use TransformBlock, but I am unable to get the current "total" hash of the chunks processed so far.

The only way I can think of doing it is to make a copy of the md5 object, which to my understanding is a pain in the butt in C#.

Any suggestions?

Thx for all the help
Jun 27 '08 #6
John Smith wrote:
> Still have one final issue and I don't think it can be solved (easily).
> That is working out the hash at each stage.
>
> So hash for s1
> So hash for s1 + s2
> So hash for s1 + s2 + s3
> etc...
>
> It seems that I can use the TransformBlock but I am unable to get the
> current "total" hash of processed chunks.
>
> The only way I can think of doing it is if I can make a copy of the md5
> object, which to my understanding is a pain in the butt in C#;
>
> Have any suggestions?

I don't think that is easily possible.

I think what I would do is have two MD5 hashers:
one that I reset for each file and one for the total. And
then call both of them with the data.

I know that MD5(individual) and MD5(total) are not the
same as MD5(accumulate(individual)) and MD5(total), but
it may be OK.
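
A rough sketch of the two-hasher idea, using the two sample strings from earlier in the thread as stand-ins for file contents. The class and helper names are made up for the example. For brevity the per-part hasher uses ComputeHash, which resets itself after each call; with large files it would be fed chunk by chunk with TransformBlock and then reinitialized, just like the total hasher.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class TwoHashers
{
    private static string Hex(byte[] hash)
    {
        return BitConverter.ToString(hash).Replace("-", string.Empty);
    }

    public static void Main()
    {
        // Stand-ins for the contents of each file part.
        string[] parts = { "First String Sample", "Second String Sample" };

        using (MD5 perPart = MD5.Create())
        using (MD5 total = MD5.Create())
        {
            foreach (string part in parts)
            {
                byte[] bytes = Encoding.ASCII.GetBytes(part);

                // ComputeHash finishes and resets the hasher on every call,
                // so this is the hash of this part alone.
                Console.WriteLine("part:  " + Hex(perPart.ComputeHash(bytes)));

                // The total hasher keeps accumulating across all parts.
                total.TransformBlock(bytes, 0, bytes.Length, null, 0);
            }

            // Close the total hash with an empty final block.
            total.TransformFinalBlock(new byte[0], 0, 0);
            Console.WriteLine("total: " + Hex(total.Hash));
        }
    }
}
```

If the values posted earlier in the thread are right, the two "part" lines should match s1 and s2 and the "total" line should match s3. Note that this still gives no intermediate hash of s1 + s2 when there are three or more parts.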

Arne

Jun 27 '08 #7
Thanks. I think it would have to be separate hashers like you said.
Jun 27 '08 #8
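
A note for later readers: on newer runtimes the running "total so far" hash can probably be obtained without copying the hasher. The sketch below assumes .NET 5 or later, where, as far as I know, the IncrementalHash class offers GetCurrentHash. That API is not mentioned anywhere in the thread above, so treat this as an alternative under that assumption rather than as what the posters used.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class RunningHash
{
    public static void Main()
    {
        // Stand-ins for the contents of each file part.
        string[] parts = { "First String Sample", "Second String Sample" };

        using (IncrementalHash md5 = IncrementalHash.CreateHash(HashAlgorithmName.MD5))
        {
            foreach (string part in parts)
            {
                md5.AppendData(Encoding.ASCII.GetBytes(part));

                // GetCurrentHash reports the hash of everything appended so far
                // and leaves the accumulated state intact (assumes .NET 5+).
                byte[] soFar = md5.GetCurrentHash();
                Console.WriteLine(BitConverter.ToString(soFar).Replace("-", string.Empty));
            }
        }
    }
}
```

Each printed line is the hash of all parts appended up to that point, i.e. s1, then s1 + s2, and so on.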
