MD5 hash on very large files 500mb to 4gb+

From: Paul Spielvogel (pspielvogel_at_insightvideonet.com)
Date: 09/22/04


Date: 22 Sep 2004 15:01:25 -0700

I need to compute the MD5 hash of VERY large files, 500 MB to 4 GB+.

I have found two ways, but neither of them does what I need.

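' Requires: Imports System.IO and Imports System.Security.Cryptography
Private Function ComputeDataMD5(ByVal path As String) As String
        ' Open the file once (opening it twice leaks the first handle).
        Dim fs As FileStream = File.OpenRead(path)
        Dim md5Hash As New MD5CryptoServiceProvider
        Try
            ' ComputeHash(Stream) does not return until the whole file is hashed.
            Dim hash As String = _
                BitConverter.ToString(md5Hash.ComputeHash(fs)).Replace("-", "")
            Return hash.ToLower()
        Finally
            fs.Close()
        End Try
 End Function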

This function creates the hash directly from the FileStream object. The
problem is that it locks up the application and does not allow me to
show or update a progress bar.

Function GetHash(ByVal path As String) As String

        Dim fs As FileStream = Nothing
        Dim cs As CryptoStream = Nothing
        Dim ms As MemoryStream = New MemoryStream
        Dim md5Hash As MD5CryptoServiceProvider = New MD5CryptoServiceProvider

        Try
            ' Open the file once (opening it twice leaks the first handle).
            fs = File.OpenRead(path)

            ' One CryptoStream for the whole file. Every byte written to it is
            ' also passed through to ms, so ms grows to the size of the file.
            cs = New CryptoStream(ms, md5Hash, CryptoStreamMode.Write)

            Dim buffer(1023) As Byte   ' 1024 bytes; the VB upper bound is inclusive
            Dim size As Integer

            Do While fs.Position <> fs.Length
                size = fs.Read(buffer, 0, buffer.Length)
                cs.Write(buffer, 0, size)
            Loop

            cs.FlushFinalBlock()
            Return BitConverter.ToString(md5Hash.Hash).Replace("-", "").ToLower()

        Catch ex As Exception

            MsgBox("Error during hash operation: " + ex.ToString())
            Return Nothing

        Finally
            If Not (cs Is Nothing) Then cs.Close()
            If Not (fs Is Nothing) Then fs.Close()
            If Not (md5Hash Is Nothing) Then md5Hash.Clear()
        End Try
    End Function

This function reads the file a block at a time and writes each block to
the CryptoStream object; once the whole file has been read, we compute
the MD5. The problem with this function is that it ends up holding the
whole file in memory: a 500 MB file takes 500 MB of RAM.
Since I need to hash files in the range of 4 GB, this method is
useless.
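
One way to get both constant memory use and a place to hook a progress bar might be to drive the hash yourself with HashAlgorithm.TransformBlock and TransformFinalBlock instead of going through a CryptoStream and MemoryStream. The following is only a minimal sketch under assumptions: the module name ChunkedMd5Sketch, the function ComputeMd5Chunked, the ProgressCallback delegate and the 64 KB buffer size are all made up for illustration and are not part of the original code.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

Module ChunkedMd5Sketch

    ' Hypothetical callback type: bytes hashed so far and total file size.
    Public Delegate Sub ProgressCallback(ByVal bytesDone As Long, ByVal totalBytes As Long)

    ' Hashes the file one buffer at a time, so memory use stays at a single
    ' buffer no matter how large the file is.
    Public Function ComputeMd5Chunked(ByVal filePath As String, _
                                      ByVal onProgress As ProgressCallback) As String
        Dim buffer(65535) As Byte   ' 64 KB; the size is an arbitrary assumption
        Dim md5Hash As MD5 = MD5.Create()
        Dim fs As FileStream = File.OpenRead(filePath)
        Try
            Dim total As Long = fs.Length
            Dim done As Long = 0
            Dim size As Integer = fs.Read(buffer, 0, buffer.Length)
            Do While size > 0
                ' Feed this chunk into the running hash state.
                md5Hash.TransformBlock(buffer, 0, size, buffer, 0)
                done += size
                If Not (onProgress Is Nothing) Then onProgress.Invoke(done, total)
                size = fs.Read(buffer, 0, buffer.Length)
            Loop
            ' An empty final block finishes the hash and makes .Hash available.
            md5Hash.TransformFinalBlock(buffer, 0, 0)
            Return BitConverter.ToString(md5Hash.Hash).Replace("-", "").ToLower()
        Finally
            fs.Close()
            md5Hash.Clear()
        End Try
    End Function

End Module
```

Called on the UI thread this would still block the form, so it would presumably need to run on a worker thread, with the callback marshalling its updates back to the progress bar (for example through Control.Invoke).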


