MD5 hash on very large files 500mb to 4gb+

I need to compute the MD5 hash of very large files (500 MB to 4 GB+).

I have found two ways, but neither of them does what I need.

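' Requires: Imports System.IO and Imports System.Security.Cryptography
Private Function ComputeDataMD5(ByVal path As String) As String
    ' Open the file once (calling OpenRead twice leaks the first handle).
    Dim fs As FileStream = New FileInfo(path).OpenRead()
    Try
        Dim md5Hash As New MD5CryptoServiceProvider
        ' ComputeHash blocks the calling thread until the whole stream is read.
        Dim hash As String = BitConverter.ToString(md5Hash.ComputeHash(fs)).Replace("-", "")
        Return hash.ToLower()
    Finally
        fs.Close()
    End Try
End Function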

This function computes the hash directly from the FileStream object. The
problem is that it locks up the application and does not allow me to
show or update a progress bar.

Function GetHash(ByVal path As String) As String
    Dim md5Hash As New MD5CryptoServiceProvider
    Dim ms As New MemoryStream
    Dim fs As FileStream = Nothing
    Dim cs As CryptoStream = Nothing

    Try
        ' Open the file once (the original opened it twice).
        fs = New FileInfo(path).OpenRead()
        ' Create the CryptoStream once, outside the read loop.
        cs = New CryptoStream(ms, md5Hash, CryptoStreamMode.Write)

        Dim buffer(1023) As Byte ' 1024 bytes; VB upper bounds are inclusive
        Dim size As Integer = fs.Read(buffer, 0, buffer.Length)

        Do While size > 0
            ' Every block written here also ends up in ms.
            cs.Write(buffer, 0, size)
            size = fs.Read(buffer, 0, buffer.Length)
        Loop

        cs.FlushFinalBlock()
        Return BitConverter.ToString(md5Hash.Hash).Replace("-", "").ToLower()
    Catch ex As Exception
        MsgBox("Error during hash operation: " + ex.ToString())
        Return Nothing
    Finally
        If Not (cs Is Nothing) Then cs.Close()
        If Not (fs Is Nothing) Then fs.Close()
        md5Hash.Clear()
    End Try
End Function

This function reads the file one block at a time and writes each block to
the CryptoStream object; once the whole file has been read, we compute the
MD5. The problem with this function is that it ends up holding the whole
file in memory: a 500 MB file takes 500 MB of RAM.
Since I need to hash files in the 4 GB range, this method is useless.
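
For comparison, the memory growth described above can probably be avoided by feeding each block straight to the hash object instead of routing it through a CryptoStream backed by a MemoryStream. The following is only a minimal sketch under stated assumptions: the module name ChunkedMd5, the function name HashInBlocks, and the 64 KB buffer size are hypothetical, and MD5.Create() is used in place of MD5CryptoServiceProvider.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

Public Module ChunkedMd5
    ' Hypothetical helper: hashes a stream block by block in constant memory.
    Public Function HashInBlocks(ByVal input As Stream) As String
        Using hasher As MD5 = MD5.Create()
            Dim buffer(64 * 1024 - 1) As Byte ' 64 KB; the size is an arbitrary choice
            Dim size As Integer = input.Read(buffer, 0, buffer.Length)

            Do While size > 0
                ' Feed the block to the hash. Passing Nothing as the output
                ' buffer skips the pass-through copy, so nothing accumulates.
                hasher.TransformBlock(buffer, 0, size, Nothing, 0)
                size = input.Read(buffer, 0, buffer.Length)
            Loop

            ' Finalize with an empty last block, then read the digest.
            hasher.TransformFinalBlock(buffer, 0, 0)
            Return BitConverter.ToString(hasher.Hash).Replace("-", "").ToLower()
        End Using
    End Function
End Module
```

Because the loop is explicit, the caller can also update a counter or progress value after each block; only one buffer's worth of file data is held at a time.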
Nov 21 '05 #1
This function uses the filestream object to create the hash from,
problem is that it locks up the application and does not allow me to
show/update a progress bar.


Generate the hash on another thread so that the main thread stays free to
update the form and its controls and won't lock up.
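
One way the suggestion could look, as a minimal sketch rather than a definitive implementation: the class name BackgroundHasher and its members are hypothetical, and MD5.Create() stands in for MD5CryptoServiceProvider.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography
Imports System.Threading

' Hypothetical helper class: computes the MD5 of a file on a worker thread.
Public Class BackgroundHasher
    Private ReadOnly _path As String
    Private _worker As Thread
    Private _result As String

    Public Sub New(ByVal path As String)
        _path = path
    End Sub

    ' Starts hashing without blocking the caller (for example the UI thread).
    Public Sub Start()
        _worker = New Thread(AddressOf Run)
        _worker.IsBackground = True
        _worker.Start()
    End Sub

    ' Blocks until the worker finishes, then returns the lowercase hex digest.
    Public Function WaitForResult() As String
        _worker.Join()
        Return _result
    End Function

    ' Runs on the worker thread.
    Private Sub Run()
        Using fs As FileStream = File.OpenRead(_path)
            Using hasher As MD5 = MD5.Create()
                _result = BitConverter.ToString(hasher.ComputeHash(fs)).Replace("-", "").ToLower()
            End Using
        End Using
    End Sub
End Class
```

In a Windows Forms application you would normally not call WaitForResult from the UI thread, since that blocks again; instead the worker would hand the result back to the form, for instance through Control.BeginInvoke. The sketch also leaves out error handling: an exception thrown inside Run is not caught here.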
Imran.
Nov 21 '05 #2

As Imran said, run the hash on a separate thread so that the main thread
stays free to update the UI.

As for the progress bar, you'll need to know how far the hash has got. To
do this, you can create a custom stream that wraps FileStream and raises
progress events as it is read. Or you can use an indeterminate progress
bar that just shows something is happening, without showing the percentage
completed.
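
A rough sketch of the wrapping-stream idea, under stated assumptions: the names ProgressStream, Progress, ProgressDemo, HashWithProgress and ReportProgress are all hypothetical, and MD5.Create() is used in place of MD5CryptoServiceProvider. The wrapper forwards reads to the inner stream and raises an event after each one, so ComputeHash drives the progress reports as it consumes the file.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

' Hypothetical wrapper: a read-only stream that reports progress as it is read.
Public Class ProgressStream
    Inherits Stream

    Private ReadOnly _inner As Stream

    ' Raised after each successful read: bytes consumed so far and total length.
    Public Event Progress(ByVal bytesRead As Long, ByVal totalBytes As Long)

    Public Sub New(ByVal inner As Stream)
        _inner = inner
    End Sub

    Public Overrides Function Read(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer) As Integer
        Dim n As Integer = _inner.Read(buffer, offset, count)
        If n > 0 Then RaiseEvent Progress(_inner.Position, _inner.Length)
        Return n
    End Function

    Public Overrides ReadOnly Property CanRead() As Boolean
        Get
            Return True
        End Get
    End Property

    Public Overrides ReadOnly Property CanSeek() As Boolean
        Get
            Return False
        End Get
    End Property

    Public Overrides ReadOnly Property CanWrite() As Boolean
        Get
            Return False
        End Get
    End Property

    Public Overrides ReadOnly Property Length() As Long
        Get
            Return _inner.Length
        End Get
    End Property

    Public Overrides Property Position() As Long
        Get
            Return _inner.Position
        End Get
        Set(ByVal value As Long)
            Throw New NotSupportedException()
        End Set
    End Property

    Public Overrides Sub Flush()
    End Sub

    Public Overrides Function Seek(ByVal offset As Long, ByVal origin As SeekOrigin) As Long
        Throw New NotSupportedException()
    End Function

    Public Overrides Sub SetLength(ByVal value As Long)
        Throw New NotSupportedException()
    End Sub

    Public Overrides Sub Write(ByVal buffer() As Byte, ByVal offset As Integer, ByVal count As Integer)
        Throw New NotSupportedException()
    End Sub
End Class

Public Module ProgressDemo
    ' Hypothetical usage: hash a file and print the percentage as it goes.
    Public Function HashWithProgress(ByVal path As String) As String
        Using fs As FileStream = File.OpenRead(path)
            Using hasher As MD5 = MD5.Create()
                Dim wrapped As New ProgressStream(fs)
                AddHandler wrapped.Progress, AddressOf ReportProgress
                Return BitConverter.ToString(hasher.ComputeHash(wrapped)).Replace("-", "").ToLower()
            End Using
        End Using
    End Function

    Private Sub ReportProgress(ByVal bytesRead As Long, ByVal totalBytes As Long)
        Console.WriteLine("{0}%", bytesRead * 100 \ totalBytes)
    End Sub
End Module
```

The event is raised on whichever thread calls ComputeHash. If that is a worker thread, a form would need to marshal the update back to the UI thread (for example with Control.BeginInvoke) before touching a progress bar.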

HTH,

Sam

On 22 Sep 2004 15:01:25 -0700, ps*********@insightvideonet.com (Paul
Spielvogel) wrote:
I need to compute the MD5 hash of very large files (500 MB to 4 GB+).

I have found two ways, but neither of them does what I need.

Private Function ComputeDataMD5(ByVal path As String) As String
    ' Open the file once (calling OpenRead twice leaks the first handle).
    Dim fs As FileStream = New FileInfo(path).OpenRead()
    Try
        Dim md5Hash As New MD5CryptoServiceProvider
        Dim hash As String = BitConverter.ToString(md5Hash.ComputeHash(fs)).Replace("-", "")
        Return hash.ToLower()
    Finally
        fs.Close()
    End Try
End Function

This function computes the hash directly from the FileStream object. The
problem is that it locks up the application and does not allow me to
show or update a progress bar.

Nov 21 '05 #4
