0
\$\begingroup\$

How do I make this go faster? It currently takes about 30 seconds on a 3gb file. I am using this to get a checksum from the file to compare it to Microsoft's official checksums. I am wondering if using a buffer would help and if so how should I do that?

public static String GetSHA1Hash(string fileName)
 {
 string strResult = String.Empty;
 string strHashData = String.Empty;
 byte[] arrbytHashValue;
 System.IO.FileStream oFileStream = null;
 System.Security.Cryptography.SHA1CryptoServiceProvider oSHAHasher =
 new System.Security.Cryptography.SHA1CryptoServiceProvider();
 try
 {
 oFileStream = GetFileStream(fileName);
 arrbytHashValue = oSHAHasher.ComputeHash(oFileStream);
 oFileStream.Close();
 strHashData = BitConverter.ToString(arrbytHashValue);
 strHashData = strHashData.Replace("-", String.Empty);
 strResult = strHashData;
 }
 catch (Exception ex) { return ex.Message; }
 finally { ((IDisposable)oSHAHasher).Dispose(); }
 return (strResult);
 }
asked Dec 3, 2015 at 20:13
\$\endgroup\$
5
  • \$\begingroup\$ how long do other methods (e.g. FCIV) take to run on the file? are the results the same? \$\endgroup\$ Commented Dec 3, 2015 at 20:15
  • \$\begingroup\$ What does GetFileStream look like? \$\endgroup\$ Commented Dec 3, 2015 at 20:57
  • \$\begingroup\$ " I am wondering if using a buffer would help" - ComputeHash(System.IO.Stream) calculates the hash by using blocks of 4096 bytes already, according to the reference source \$\endgroup\$ Commented Dec 3, 2015 at 21:11
  • \$\begingroup\$ Is the disk you're reading from even faster than 100MB/s? Computing the hash from RAM takes 7-10 seconds on my machine. \$\endgroup\$ Commented Dec 4, 2015 at 11:06
  • \$\begingroup\$ When you write gb in the question, you mean GB (gigabyte)? Lowercase b should be used for bits and lower case g isn't standard. \$\endgroup\$ Commented Dec 4, 2015 at 11:08

1 Answer 1

2
\$\begingroup\$

Your code is full of boilerplate and what seems to be misunderstood bits of the language as well as outmoded naming schemes. What I will show below does the following:

  1. Eliminates a number of intermediate variables, as well as their unused initializations, not needed, in order to reduce noise from what the method is attempting to do.

  2. use the using construct instead of try..finally. Very nice and compact way to deal with IDisposable resources.

  3. Followed common camelCase naming convention of local variables without any Hungarian notation.

Et, voila:

public static string GetSHA1Hash(string fileName)
{
 try
 {
 using (var shaHasher = new System.Security.Cryptography.SHA1CryptoServiceProvider())
 using (var fileStream = GetFileStream(fileName))
 {
 return BitConverter.ToString(shaHasher.ComputeHash(fileStream)).Replace("-", string.Empty);
 }
 }
 catch (Exception ex)
 {
 return ex.Message;
 }
}

However, I would also caution you to not return the exception message as a string. In fact, get rid of the try..except completely. This method clearly doesn't know how to deal with the exception, so let the caller catch it and provide whatever processing necessary.

As for speed, profile the code. Seriously. Find where the bottleneck is. I'm betting it's purely the I/O subsystem. You can control a lot of the read buffering and caching if you wish in code, but a file on spinning rust (or coming over a network!) is slow-going in any case.

Good luck!

answered Dec 3, 2015 at 22:17
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.