MD5 for Large Files

I am new to VB.NET and I am trying to learn. So, your indulgence for the
triviality of my questions is kindly requested.

I would like to calculate an MD5 hash for very large files. The examples I
came across read the file into a byte array and apply the hash to that
array. The following code illustrates what I am doing. I would like to
perform the hash calculation on a stream. Is this possible? If so, an
example would be greatly appreciated.

Imports System.Text
Imports System.Security.Cryptography
Imports System.IO

Public Class Form1
    Inherits System.Windows.Forms.Form
    Dim fs As FileStream = New FileStream("c:\ConfDenise.PDF", FileMode.Open)
    Dim r As BinaryReader = New BinaryReader(fs)

    Public Function GenerateHash(ByRef Buff() As Byte) As String
        Dim Md5 As New MD5CryptoServiceProvider()
        Dim ByteHash() As Byte = Md5.ComputeHash(Buff)
        Return Convert.ToBase64String(ByteHash)
    End Function

#Region " Windows Form Designer generated code "
    ' snip
#End Region

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        fs.Seek(0, SeekOrigin.Begin)
        Label1.Text = GenerateHash(r.ReadBytes(fs.Length))
    End Sub
End Class

--
Richard
Please reply to group or
Remove 'REMOVE' from my email address.
Nov 20 '05 #1
If you are familiar with block cipher encryption in .NET, hashes are
supposed to work the same way.
In other words, hash algorithms (like MD5) implement the ICryptoTransform
interface, which allows them to be used by the CryptoStream object.
You could create a memory stream, wrap it with a CryptoStream, and pass
the MD5 instance to the CryptoStream constructor. Any time you "write" to the
CryptoStream, it should transform the data through the protected HashCore
function of MD5. Then, when you are done, you call FlushFinalBlock on the
CryptoStream, which in turn calls the MD5 HashFinal method. After that you can
read the Hash property of the MD5 object for the final hash value.

Function GetHash() As String

    Dim cs As CryptoStream = Nothing
    Dim ms As MemoryStream = New MemoryStream()
    Dim md5Hash As MD5CryptoServiceProvider = New MD5CryptoServiceProvider()

    Try

        ' Sample data; a real caller would read this from the file.
        Dim buffer As Byte() = New Byte() {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}

        ' Writing to the CryptoStream feeds the bytes through the hash.
        cs = New CryptoStream(ms, md5Hash, CryptoStreamMode.Write)
        cs.Write(buffer, 0, buffer.Length)

        ' FlushFinalBlock finalizes the hash so that the Hash property is valid.
        cs.FlushFinalBlock()
        Return Convert.ToBase64String(md5Hash.Hash)

    Catch ex As Exception

        MsgBox("Error during hash operation: " & ex.ToString())
        Return Nothing

    Finally
        If Not (cs Is Nothing) Then cs.Close()
        If Not (md5Hash Is Nothing) Then md5Hash.Clear()
    End Try
End Function

The only thing to note is that in my sample, I just filled the buffer once
with a bunch of numbers.
In your case, you need to wrap a loop around the cs.Write(...) line: read
from your file into the buffer one chunk at a time, and write each chunk
to the CryptoStream until you run out of file.
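
For illustration, the loop described above might look roughly like the sketch below. It is not tested against the original poster's project. The names GetFileHash and fileName and the 64 KB chunk size are assumptions made for this example. It also swaps the MemoryStream for Stream.Null, because a hash transform passes its input through to the underlying stream, so a MemoryStream would end up holding a copy of the whole file.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

Module LargeFileHash

    ' Hypothetical helper: streams the file through MD5 in fixed-size
    ' chunks and returns the hash as a Base64 string.
    Function GetFileHash(ByVal fileName As String) As String
        Dim buffer(65535) As Byte   ' 64 KB chunks; the size is an arbitrary choice
        Dim md5Hash As New MD5CryptoServiceProvider()
        Dim fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
        ' Stream.Null discards the pass-through bytes, so memory use
        ' stays at one buffer regardless of the file size.
        Dim cs As New CryptoStream(Stream.Null, md5Hash, CryptoStreamMode.Write)
        Try
            ' Read returns 0 once the end of the file is reached.
            Dim bytesRead As Integer = fs.Read(buffer, 0, buffer.Length)
            Do While bytesRead > 0
                cs.Write(buffer, 0, bytesRead)
                bytesRead = fs.Read(buffer, 0, buffer.Length)
            Loop
            ' Finalize the hash before reading the Hash property.
            cs.FlushFinalBlock()
            Return Convert.ToBase64String(md5Hash.Hash)
        Finally
            cs.Close()
            fs.Close()
            md5Hash.Clear()
        End Try
    End Function

End Module
```

With the form from the original post, the click handler could then be reduced to something like Label1.Text = GetFileHash("c:\ConfDenise.PDF"). HashAlgorithm also has a ComputeHash overload that takes a Stream, so md5Hash.ComputeHash(fs) should give the same bytes without a hand-written loop. The explicit loop is mainly useful if you want to report progress while a very large file is being read.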

-Rob Teixeira [MVP]
Nov 20 '05 #2
Thanks Rob,
I'll give it a try.
Richard
Nov 20 '05 #3
