File Hashing

I'm wondering (and hoping that somebody will be able to answer this):

If I calculate the hash value of files (either MD5 or SHA1), can I then be
sure that:

1) Two files with the same hash value are in fact identical?

2) Two different files will NEVER have the same hash value?

3) If two files have the same MD5 hash value, they will ALSO have the same
SHA1 hash value (I should think that will always be the case)?

TIA,
Johnny J.
Jun 27 '08 #1
Johnny Jörgensen wrote:
> I'm wondering (and hoping that somebody will be able to answer this):
>
> If I calculate the hash value of files (either MD5 or SHA1), can I then be
> sure that:
>
> 1) Two files with the same hash value are in fact identical?

No, but fairly sure.

> 2) Two different files will NEVER have the same hash value?

No, but fairly sure.

> 3) If two files have the same MD5 hash value, they will ALSO have the same
> SHA1 hash value (I should think that will always be the case)?

No, but fairly sure.

-- Barry

--
http://barrkel.blogspot.com/
Jun 27 '08 #2
Please do not cross-post between language groups. It's one thing to
"abuse" the C# newsgroup with non-language .NET questions (we all do it
all the time :) ). But if your .NET question is even nominally on-topic
in the C# newsgroup (by virtue of the language you're using), it's
definitely off-topic in the VB.NET newsgroup, and vice versa.

Follow-ups to m.p.d.l.csharp.

As for the question, you would do well to search this newsgroup for
keywords like "hash", "identical", "file", etc. You'd be amazed at what's
already been said on the topic (especially on your first two questions).

But, the short version is:

On Wed, 28 May 2008 11:44:27 -0700, Johnny Jörgensen <jo**@altcom.se>
wrote:
> I'm wondering (and hoping that somebody will be able to answer this):
>
> If I calculate the hash value of files (either MD5 or SHA1), can I then
> be sure that:
>
> 1) Two files with the same hash value are in fact identical?
>
> 2) Two different files will NEVER have the same hash value?

Your first two questions are the same, and so the answer for both is the
same: no, you cannot be sure of that.

> 3) If two files have the same MD5 hash value, they will ALSO have the
> same SHA1 hash value (I should think that will always be the case)?

Granted, I'm not a crypto expert. However, I'd say the answer to this is
also "no". If MD5 provided just as much differentiating power as SHA1,
even though it's 128 bits while SHA1 is 160 bits, then why would anyone
bother with SHA1? No, I think it's safe to say that there are at least
some pairs of files for which the MD5 hash is identical, but the SHA1 hash
is not.

Of course, finding two different files that produce the exact same hash in
either algorithm is either contrived or very difficult. But then, it's
still a possibility (see the answer to questions #1 and #2). :)

Pete
Jun 27 '08 #3
On May 28, 8:44 pm, "Johnny Jörgensen" <j...@altcom.se> wrote:
[...]

Hello,

All hashing functions have a finite set of return values (say, 2^128)
but an infinite number of possible input values. This clearly implies
that two input values CAN generate the same output value.

But in practice, the probability of stumbling on two input values that
generate the same hash is pretty close to zero. I would say:

1) Yes. It will be the same file (well, most of the time; read this:
http://www.mathstat.dal.ca/~selinger/md5collision/)
2) Yes, in practice.
3) No. Using both an MD5 and a SHA-1 hash will in fact reduce the number of
possible collisions.
Jun 27 '08 #4
Johnny Jörgensen <jo**@altcom.se> wrote:
> I'm wondering (and hoping that somebody will be able to answer this):
>
> If I calculate the hash value of files (either MD5 or SHA1), can I then be
> sure that:
>
> 1) Two files with the same hash value are in fact identical?

No. Think how much data is contained in a hash. Suppose you have a 128
bit hash. Now think about just files which are (say) 136 bits in
length. How many possible files of that length are there? Now how many
possible 128 bit hash values are there?

A slightly different way of looking at this: suppose you see some
people, and label each one with a different (capital) letter of the
alphabet to tell them apart. When you've got more than 26 people,
you're *bound* to have at least two people who have the same letter.

> 2) Two different files will NEVER have the same hash value?

That's the same question as question 1.

> 3) If two files have the same MD5 hash value, they will ALSO have the same
> SHA1 hash value (I should think that will always be the case)?

No, not necessarily. It's incredibly likely - hashes are designed such
that you'd be extremely unlucky to run into two files with the same
hash but different content. It's possible though.
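
The counting argument above can be tried out on a small scale. This is only a sketch, assuming Python and its standard hashlib module: it shrinks MD5 to its first 2 bytes (65,536 possible values) so that the pigeonhole effect shows up quickly. The function name and the choice of inputs (the decimal strings "0", "1", "2", ...) are made up for illustration.

```python
import hashlib

def first_truncated_collision(prefix_bytes=2):
    """Find two different inputs whose MD5 digests share the first prefix_bytes bytes."""
    seen = {}  # truncated digest -> the input that produced it
    i = 0
    while True:
        data = str(i).encode("ascii")
        key = hashlib.md5(data).digest()[:prefix_bytes]
        if key in seen:
            # Two distinct inputs, same truncated hash value.
            return seen[key], data, i + 1
        seen[key] = data
        i += 1

a, b, tried = first_truncated_collision()
print(a, b, "collide after trying", tried, "inputs")
```

With only 65,536 possible values, the loop is guaranteed to stop within 65,537 inputs. The full 128-bit MD5 obeys the same counting argument; the space is just far too large to exhaust by accident.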

--
Jon Skeet - <sk***@pobox.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon.skeet
C# in Depth: http://csharpindepth.com
Jun 27 '08 #5
You could use the file length as an additional piece of "metadata" -- if two
files were to have the same hash but different byte lengths then they are not
the same. That's probably going to solve most hash collisions. If you do
find a case of two files having the same hash and length, then you need to do
a byte-for-byte comparison to determine equality.
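
The length check, hash check and byte-for-byte fallback described here could be sketched as follows. This assumes Python and its standard hashlib and os modules; `file_hash` and `files_identical` are hypothetical helper names, and SHA-1 is only a default picked to match the thread.

```python
import hashlib
import os

def file_hash(path, algorithm="sha1", chunk_size=1 << 20):
    """Hex digest of a file, read in chunks so large files need little memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def files_identical(path_a, path_b, chunk_size=1 << 20):
    """True only if both files contain exactly the same bytes."""
    # Cheapest check first: files of different length can never be identical.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # Different hashes prove that the contents differ.
    if file_hash(path_a) != file_hash(path_b):
        return False
    # Same length and same hash: confirm byte for byte.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(chunk_size)
            block_b = fb.read(chunk_size)
            if block_a != block_b:
                return False
            if not block_a:  # both files ended at the same point
                return True
```

When only two files are involved, the byte-for-byte loop alone would be enough; the hash step pays off when digests are computed once, stored, and compared across many files. Python's standard library also offers `filecmp.cmp(a, b, shallow=False)` for a plain content comparison.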

HTH
"Johnny Jörgensen" wrote:
[...]
Jun 27 '08 #6
Johnny Jörgensen wrote:
> If I calculate the hash value of files (either MD5 or SHA1), can I then be
> sure that:
>
> 1) Two files with the same hash value are in fact identical?
>
> 2) Two different files will NEVER have the same hash value?

Others have already answered that question.

But there is an important point that should be
emphasized:

* if you want to protect against accidentally matching
files, then you should not worry: collision probabilities
on the order of 1/2^128 (MD5) and 1/2^160 (SHA1) are
negligible, so both MD5 and SHA1 are fine

* if you want to protect against maliciously matching
files, then it is a completely different game - MD5 is
completely broken and SHA1 is somewhat broken - neither
is usable and you should go for SHA256 instead
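
For the second case, a minimal sketch of hashing a file with SHA-256, assuming Python and its standard hashlib module (the function name is illustrative only; in .NET the corresponding classes live in System.Security.Cryptography):

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at path."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so that multi-gigabyte files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Two files are then treated as matching when their digests are equal; whether that is strong enough depends on whether an attacker could have crafted the files, as described above.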

Arne
Jun 27 '08 #7
Peter Duniho wrote:
> On Wed, 28 May 2008 11:44:27 -0700, Johnny Jörgensen <jo**@altcom.se>
> wrote:
>> 3) If two files have the same MD5 hash value, they will ALSO have the
>> same SHA1 hash value (I should think that will always be the case)?
>
> Granted, I'm not a crypto expert. However, I'd say the answer to this
> is also "no". If MD5 provided just as much differentiating power as
> SHA1, even though it's 128 bits while SHA1 is 160 bits, then why would
> anyone bother with SHA1? No, I think it's safe to say that there are at
> least some pairs of files for which the MD5 hash is identical, but the
> SHA1 hash is not.

I would rephrase it as: if identical MD5 hash implied identical SHA1
hash, then SHA1 could only return 2^128 different values.
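
The independence of the two functions can be observed on a toy scale. The sketch below assumes Python's standard hashlib and keeps only the first byte of each digest, so collisions are frequent enough to count; the function name and the inputs (decimal strings) are invented for the illustration and say nothing about full-length digests beyond the counting argument.

```python
import hashlib
from itertools import combinations

def count_collisions(n_inputs=2000):
    """Count input pairs that collide on the first MD5 byte, and how many of
    those pairs also collide on the first SHA-1 byte."""
    buckets = {}  # first MD5 byte -> list of first SHA-1 bytes of inputs in that bucket
    for i in range(n_inputs):
        data = str(i).encode("ascii")
        md5_byte = hashlib.md5(data).digest()[0]
        sha1_byte = hashlib.sha1(data).digest()[0]
        buckets.setdefault(md5_byte, []).append(sha1_byte)

    md5_pairs = 0
    both_pairs = 0
    for sha1_bytes in buckets.values():
        for s1, s2 in combinations(sha1_bytes, 2):
            md5_pairs += 1          # this pair collides on (truncated) MD5
            if s1 == s2:
                both_pairs += 1     # ... and on (truncated) SHA-1 as well
    return md5_pairs, both_pairs

md5_pairs, both_pairs = count_collisions()
print(md5_pairs, "pairs collide on MD5;", both_pairs, "of them also collide on SHA-1")
```

Only a small fraction of the MD5-colliding pairs also collide on SHA-1, which is the toy version of the point being made: a match in one hash does not force a match in the other.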

> Of course, finding two different files that produce the exact same hash
> in either algorithm is either contrived or very difficult.

Serious weaknesses in both have been found.

Arne
Jun 27 '08 #8
Johnny,

No, you can only be sure that there is a low chance that somebody could
recreate your files by guessing what their content would be.

In my view, checking whether something is complete has nothing to do with
security encryption.

Cor

"Johnny Jörgensen" <jo**@altcom.se> wrote in message
news:%2****************@TK2MSFTNGP04.phx.gbl...
[...]
Jun 27 '08 #9
Good idea - Thanks

/Johnny
"KH" <KH@discussions.microsoft.com> wrote in message
news:16**********************************@microsoft.com...
[...]
Jun 27 '08 #10
Thanks

/Johnny J.


"Barry Kelly" <ba***********@gmail.com> wrote in message
news:i5********************************@4ax.com...
[...]
Jun 27 '08 #11
Hi Peter

First of all - thanks for your reply to my question. Much appreciated.

As for your comments on my posting technique:

1) The question is NOT crossposted. It would have been if I posted TWO
separate messages to which people responded individually. But I have posted
ONE message to two different groups and replies from one group will show up
in the other.

2) Are you a programmer at all? How can you reason that a post that's
relevant in a C# group cannot possibly be relevant in a VB.NET group? The
only difference (ok maybe not the only, but the most important difference)
is different syntax. If somebody has a general question about the
functionality of a .NET class then syntax doesn't matter, and a VB.NET
programmer can just as well tell you the correct answer as a C# programmer.

3) To everybody who answered that my first two questions were identical:
They're not - it depends on the answer.

Thanks,
Johnny J.
"Peter Duniho" <Np*********@nnowslpianmk.com> wrote in message
news:op***************@petes-computer.local...
[...]
Jun 27 '08 #12
Thanks

Johnny J.

<sy**************@gmail.com> wrote in message
news:2b**********************************@d1g2000hsg.googlegroups.com...
[...]
Jun 27 '08 #13
Thanks

Johnny J.

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP*********************@msnews.microsoft.com...
[...]
Jun 27 '08 #14
Thanks

Johnny J.

"Arne Vajhøj" <ar**@vajhoej.dk> wrote in message
news:48***********************@news.sunsite.dk...
[...]
Jun 27 '08 #15
That wasn't the intention behind my question either. I simply want to
determine whether two files are identical or not.

/Johnny J.


"Cor Ligthert[MVP]" <no************@planet.nl> wrote in message
news:30**********************************@microsoft.com...
[...]
Jun 27 '08 #16
KH wrote:
> You could use the file length as an additional piece of "metadata" -- if two
> files were to have the same hash but different byte lengths then they are not
> the same. That's probably going to solve most hash collisions.

File length (specifically, bit count) is part of the MD5 and SHA1 hash
calculations. There is less information per bit of a separate length
indicator than you're getting out of the bits in the MD5 or SHA1 hashes.
If minimizing collisions is the priority, then using a better hash
function, like SHA-224, SHA-256, etc. will give better "bang for buck"
in terms of bit information. Given that you could probably expect file
length to require a 64-bit number, choosing SHA-224 over SHA-160 seems
to be obvious.

Because of the birthday paradox, accidental collisions with hash
functions are more common than astronomical numbers like 2**128 and
2**160 seem to suggest: the chance of a collision reaches 50% after
roughly 1.18 times the square root of the number of possible hash
values, assuming the hash values are distributed evenly.

That works out to a 50% chance of collision after around 2**64 (MD5) or
2**80 (SHA-1).

2**64 and 2**80 are still large numbers, unlikely to be met in practice
where file comparison is the goal of hashing.

Of course, specially crafted collisions have been found for MD5, and
attacks are underway with 2**35 evaluations for SHA-1. But these won't
be of concern for file comparison.
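
The birthday estimate can be put into numbers. The sketch below assumes Python and the usual approximation p ≈ 1 - exp(-n(n-1)/(2N)) for n evenly distributed values drawn from N possibilities; under that approximation the 50% point lies at sqrt(2 ln 2), about 1.18, times the square root of N. Both function names are illustrative.

```python
import math

def collision_probability(n_items, hash_bits):
    """Approximate chance that at least two of n_items random hash values collide."""
    space = 2 ** hash_bits
    # Birthday approximation; expm1 keeps very small probabilities accurate.
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))

def items_for_half_chance(hash_bits):
    """Approximate number of hashed items at which a collision becomes 50% likely."""
    return math.sqrt(2 * math.log(2) * 2 ** hash_bits)

print(math.log2(items_for_half_chance(128)))   # a little above 64 for MD5
print(math.log2(items_for_half_chance(160)))   # a little above 80 for SHA-1
print(collision_probability(10 ** 9, 128))     # a billion files hashed with MD5
```

For a billion files the estimated probability is many orders of magnitude below anything that would matter for accidental collisions, which matches the conclusion above; none of this applies to deliberately crafted files.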

-- Barry

--
http://barrkel.blogspot.com/
Jun 27 '08 #17
Johnny Jörgensen wrote:
> As for your comments on my posting technique:
>
> 1) The question is NOT crossposted. It would have been if I posted TWO
> separate messages to which people responded individually.

That is called multi-posting. Your message was indeed cross-posted.

> 2) Are you a programmer at all? How can you reason that a post that's
> relevant in a C# group cannot possibly be relevant in a VB.NET group?

The question was a general cryptography one, and indeed it could be argued
that it wasn't relevant to any language-specific group. .framework might have
been closest, had it been phrased as recommended use of
System.Security.Cryptography.

-- Barry

--
http://barrkel.blogspot.com/
Jun 27 '08 #18
To my mind, that would mean that you could recreate the file from the hash code.

Cor

"Johnny Jörgensen" <jo**@altcom.se> wrote in message
news:OW**************@TK2MSFTNGP03.phx.gbl...
[...]
Jun 27 '08 #19
On May 29, 10:12 am, "Johnny Jörgensen" <j...@altcom.se> wrote:
> 3) To everybody who answered that my first two questions were identical:
> They're not - it depends on the answer.

No it doesn't. They're logically equivalent.

Suppose the answer to question 2 was "yes" - i.e. there can never be
two different files with the same hash value. That implies that if two
files have the same hash value, they can't be different, i.e. they are
identical. Hence the answer to question 1 would have to be "yes" as
well.

To put it in pure logic terms, if A is the predicate "X and Y have
identical hashes" and B is the predicate "X and Y are identical
files", your questions were:

1) A => B
2) !B => !A

I hope you can see how these are logically equivalent questions.
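
The equivalence of an implication and its contrapositive can be checked by brute force over the four truth assignments. A small sketch in Python, where `implies` and `contrapositive_equivalent` are names made up for the illustration:

```python
from itertools import product

def implies(p, q):
    """Material implication: p => q is false only when p is true and q is false."""
    return (not p) or q

def contrapositive_equivalent():
    """True if (A => B) and (!B => !A) agree for every combination of A and B."""
    for a, b in product([False, True], repeat=2):
        question_1 = implies(a, b)          # same hash => same file
        question_2 = implies(not b, not a)  # different file => different hash
        if question_1 != question_2:
            return False
    return True

print(contrapositive_equivalent())
```

Since the two statements agree in all four cases, a "yes" or "no" to one question is automatically the answer to the other.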

Jon
Jun 27 '08 #20

Well, if you could create the file from the hash code, then it really
wouldn't be hashing, would it?

Then it would be cryptography, in which case there would be no
point for the purposes of what the OP is trying to achieve.

Regards,

Joergen Bech

On Thu, 29 May 2008 12:36:38 +0200, "Cor Ligthert [MVP]"
<no************@planet.nl> wrote:
[...]
Jun 27 '08 #21
I guess I don't see your point; I don't know exactly what the OP is doing,
but I was simply suggesting a shortcut he could take to identify hash
collisions using a piece of info he may very well have on hand, before doing
a possibly more expensive byte-for-byte comparison.
"Barry Kelly" wrote:
[...]
Jun 27 '08 #22
On Thu, 29 May 2008 02:12:22 -0700, Johnny Jörgensen <jo**@altcom.se>
wrote:
[...]
> 1) The question is NOT crossposted. It would have been if I posted TWO
> separate messages to which people responded individually. But I have
> posted ONE message to two different groups and replies from one group
> will show up in the other.

As Barry says, you are confusing "multi-posting" with "cross-posting".
Inasmuch as a message should go into multiple newsgroups, "cross-posting"
is the correct approach and "multi-posting" is not, and I do appreciate
that you cross-posted instead of multi-posted. But not all cross-posting
is appropriate.

> 2) Are you a programmer at all?

What kind of question is that? Do you think that by insulting me, you'll
have an easier time achieving whatever goal you have?

> How can you reason that a post that's
> relevant in a C# group cannot possibly be relevant in a VB.NET group?

The two newsgroups' _primary_ purpose is to answer questions that are
_specific_ to each language. In the C# newsgroup, the questions really
should only be about questions relating to the specific use of C#.
Likewise, in the VB.NET newsgroup, the questions really should only be
about questions relating to the specific use of VB.NET.

So, questions like "how do I write a C# iterator method" or "when should I
use the 'Shadows' keyword" are examples of appropriate questions in the C#
and VB.NET newsgroups, respectively.

Technically, _any_ other question is off-topic.

In reality, the C# newsgroup (and I assume the VB.NET newsgroup, though I
haven't looked to see) winds up used for broader questions. But they are
acceptable only inasmuch as they are asked in the context of someone
writing a C# program. Since such questions "inherit" acceptability only
by virtue of the language being used, the same rules for the use of the
language-specific newsgroups apply, and that means that two completely
different languages are mutually exclusive.

Either you're writing the code in C# or you're writing in VB.NET.
Whichever language you're using, that's the appropriate newsgroup to use.
If you don't want a language-specific answer, you shouldn't be posting in
_either_ language newsgroup. If you do want a language-specific answer,
you should only be posting in one or the other newsgroup.

> The only difference (ok maybe not the only, but the most important
> difference) is different syntax. If somebody has a general question about
> the functionality of a .NET class then syntax doesn't matter, and a VB.NET
> programmer can just as well tell you the correct answer as a C#
> programmer.

If you don't care about the language aspects of your question, then the
question doesn't belong in _either_ language newsgroup. Pick a different
newsgroup where your question is actually on-topic on its own merits,
rather than by sneaking in on the basis of the language being used.

> 3) To everybody who answered that my first two questions were identical:
> They're not - it depends on the answer.

I see Jon's already addressed that. I have to admit, I'm _sorely_ tempted
to reply by simply quoting the first sentence in your #2 question.

But that would be childish. :)

Pete
Jun 27 '08 #23
Johnny Jörgensen wrote:
> I'm wondering (and hoping that somebody will be able to answer this):
>
> If I calculate the hash value of files (either MD5 or SHA1), can I then be
> sure that:
>
> 1) Two files with the same hash value are in fact identical?

Yes (sort of). If you hash two non-identical files and the same hash is
produced, this is more likely to be due to memory corruption than a
break in either MD5 or SHA1.

> 2) Two different files will NEVER have the same hash value?

No (sort of). By the pigeonhole principle.

> 3) If two files have the same MD5 hash value, they will ALSO have the same
> SHA1 hash value (I should think that will always be the case)?

Yes and No. As above.

Alun Harford
Jun 27 '08 #24
Johnny Jörgensen wrote:
> 1) The question is NOT crossposted. It would have been if I posted TWO
> separate messages to which people responded individually. But I have posted
> ONE message to two different groups and replies from one group will show up
> in the other.

It has already been explained a couple of times, but:

two identical messages posted separately to two groups is called multi-posting

the same message posted to two groups at once is called cross-posting

You did not multi-post, but you did cross-post.

Those terms are very well defined.

> 2) Are you a programmer at all? How can you reason that a post that's
> relevant in a C# group cannot possibly be relevant in a VB.NET group? The
> only difference (ok maybe not the only, but the most important difference)
> is different syntax. If somebody has a general question about the
> functionality of a .NET class then syntax doesn't matter, and a VB.NET
> programmer can just as well tell you the correct answer as a C# programmer.

There is a separate group for language independent framework questions.

But I don't have a problem with a question being posted to both C# and
VB.NET groups.

> 3) To everybody who answered that my first two questions were identical:
> They're not - it depends on the answer.

I can not see any way they can be anything but true+true and
false+false.

Arne

Jun 27 '08 #25
