
Java's MD5 output shorter than 128 bits?

Hello,

I have a problem with MessageDigest MD5 in Java. I want to calculate the
digest from a file, encode it in base64 and display it. Now, according to
MD5 spec, digest shall be 128 bits so, encoding it to base64 shall give 44
characters output ((128bit + 4bit padding)/3 = 44). When I execute a
program to digest a given file:
# java Masher Masher.java
# nfEOH/5M+yDLaxaJ+XpJ5Q==
So, the output is 24 chars, ie. 64bits + 8 bit padding (two '=' chars at
the end represent padding in base64). Now, when I execute the following:
# more Masher.java | openssl md5 | openssl base64
# OWYxNGU2YzZjYTAyOGVhYzJhMjgyM2E0ZTJhNmU2ZTAK
So, 44 characters as expected. My assumption is that the output of these two
commands should be the same (after all, MD5 and base64 implementations should
always give the same output for the same input, shouldn't they? Hash
collisions aside, of course).

What's wrong here? Java's MD5 output is too short? base64 encoding in Java
doesn't work properly? Or am I simply missing something?

The code I used for MD5 calculation in Java is from Knudsen's "Java
Cryptography" book:

// obtain a message digest object
MessageDigest md = MessageDigest.getInstance("MD5");

// calculate the digest for the given file
FileInputStream in = new FileInputStream(args[0]);
byte[] buffer = new byte[8192];
int length;
while ((length = in.read(buffer)) != -1) {
    md.update(buffer, 0, length);
}
in.close(); // release the file handle once everything has been read
byte[] raw = md.digest();

// print out the digest in base64
BASE64Encoder encoder = new BASE64Encoder();
String base64 = encoder.encode(raw);
System.out.println(base64);

Any help appreciated.

BRs,
Zulik
Jul 17 '05 #1
2 Replies


Zulik <zu*****@yahoo.com> said:
> I have a problem with MessageDigest MD5 in Java. I want to calculate the
> digest from a file, encode it in base64 and display it. Now, according to
> MD5 spec, digest shall be 128 bits so, encoding it to base64 shall give 44
> characters output ((128bit + 4bit padding)/3 = 44). When I execute a
> program to digest a given file:
> # java Masher Masher.java
> # nfEOH/5M+yDLaxaJ+XpJ5Q==
> So, the output is 24 chars, ie. 64bits + 8 bit padding (two '=' chars at
> the end represent padding in base64).

Are you certain you're calculating this correctly? The divisor of three
you're using seems strange. I guess it should be a factor of 4/3, and your
source number should be in bytes, not bits, since 3 binary bytes encode
into 4 base64 characters.

A single base64 character carries 6 bits of data, so 21 characters hold
only 126 bits and 128 bits need 22 characters (128 / 6 = 21, remainder 2).
With '=' padding up to a multiple of four, that makes 24 characters.
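
To make the arithmetic concrete, here is a minimal Java sketch. It assumes
Java 8 or later for java.util.Base64 (not the BASE64Encoder class from the
original post), and the class and method names are made up for illustration.

```java
import java.util.Base64;

// Hypothetical helper, for illustration only.
class Base64Length {
    // Padded base64 emits 4 characters for every group of up to 3 input bytes.
    static int encodedLength(int inputBytes) {
        return 4 * ((inputBytes + 2) / 3);
    }

    public static void main(String[] args) {
        // 16 zero bytes stand in for an MD5 digest (128 bits); only the size matters here.
        byte[] digestSized = new byte[16];
        String encoded = Base64.getEncoder().encodeToString(digestSized);

        // Compare the formula against what the encoder actually produces.
        System.out.println(encodedLength(digestSized.length));
        System.out.println(encoded.length());
    }
}
```

Both lines should print 24. For a 33-byte input the same formula gives 44.
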
> Now, when I execute the following:
> # more Masher.java | openssl md5 | openssl base64


Note that here the output of "openssl md5" is already hex-encoded,
so you're getting two characters for each binary byte, and your output
is twice the binary length (128 bits is 16 binary bytes; the output of
"openssl md5" is 33 bytes: 32 bytes of hex-encoded data and a newline;
the output of "openssl md5 -binary" is exactly 16 bytes). Base64-encoding
those 33 bytes is what gives you 44 characters.

However, if you do
echo foo | openssl md5 -binary | openssl base64
you again get 24 chars (plus newline): 22 data characters carrying
22*6=132 bits, i.e. 128 bits and 4 bits of padding, followed by "==".
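
The same comparison can be reproduced in Java. This is only a sketch: it
assumes Java 8 or later for java.util.Base64, it assumes "openssl md5" prints
just the 32 hex digits and a newline as described above (newer versions may
print a prefix as well), and the class and helper names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical class, for illustration only.
class DigestEncodings {
    // Returns the raw 16-byte MD5 digest of the given data.
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Lower-case hex text, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    static String base64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Same bytes that "echo foo" writes: the word plus a newline.
        byte[] raw = md5("foo\n".getBytes(StandardCharsets.US_ASCII));

        // Like "openssl md5 -binary | openssl base64": base64 of the 16 raw bytes.
        System.out.println(base64(raw).length());

        // Like "openssl md5 | openssl base64": base64 of the hex text plus newline.
        byte[] hexLine = (toHex(raw) + "\n").getBytes(StandardCharsets.US_ASCII);
        System.out.println(base64(hexLine).length());
    }
}
```

The first length should come out as 24 and the second as 44, matching the
two results in the original question.
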
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
Jul 17 '05 #2

On Sun, 15 Feb 2004 13:27:01 +0000, Juha Laiho wrote:

Hello,

<cut>
....
</cut>
> Note that here the output of "openssl md5" is already hex-encoded,
> so you're getting two bytes for each one binary byte, so your output
> length is twice the binary length (128 bits would be 16 binary bytes;
> output of "openssl md5" is 33 bytes; 32 bytes of hex-encoded data and
> a newline; output of "openssl md5 -binary" is exactly 16 bytes).
>
> However, if you do
> echo foo | openssl md5 -binary | openssl base64
> you get again 24 chars (plus newline) - which is nicely 22*6=132 bits,
> so 128 bits and 4 bits of padding.


Yup, you're right - I somehow missed the point that the digest() method
returns the raw binary result. I encoded the buffer to hex and the result is
now correct.
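
For anyone finding this later, a hex-printing variant might look roughly like
the sketch below. It is not the code from the book: the class name HexMasher
and the md5Hex helper are made up for illustration, and it assumes Java 7 or
later for try-with-resources and java.nio.file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class, for illustration only.
class HexMasher {
    // Streams the file through MD5 and returns the digest as 32 lower-case hex digits.
    static String md5Hex(Path file) {
        try (InputStream in = Files.newInputStream(file)) {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192];
            int length;
            while ((length = in.read(buffer)) != -1) {
                md.update(buffer, 0, length);
            }
            StringBuilder hex = new StringBuilder(32);
            for (byte b : md.digest()) {
                // High nibble first, then low nibble.
                hex.append(Character.forDigit((b >> 4) & 0xf, 16));
                hex.append(Character.forDigit(b & 0xf, 16));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex(Paths.get(args[0])));
    }
}
```

The printed string is the hex form of the digest, so it can be compared
against the hex digits that "openssl md5" reports for the same file.
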

Thanks!

BRs,
Zulik
Jul 17 '05 #3
