I wrote this class:
public class MediaFile : IDisposable
{
private const int SECTION_SIZE = 8192;
public string FilePath { get; private set; }
public List<byte[]> Checksums { get; private set; }
private FileInfo fileInfo;
private MemoryMappedFile memoryMappedFile;
public MediaFile(string filePath)
{
if (!File.Exists(filePath))
throw new FileNotFoundException($"{filePath} was not found!");
this.FilePath = filePath;
this.Checksums = null;
this.fileInfo = new FileInfo(filePath);
this.memoryMappedFile = MemoryMappedFile.CreateFromFile(filePath, FileMode.Open);
this.Checksums = this.GenerateChecksums();
}
public void Dispose()
{
this.memoryMappedFile.Dispose();
}
public byte[] Read(long offset, int size)
{
using (MemoryMappedViewStream memoryMappedViewStream = this.memoryMappedFile.CreateViewStream(offset, size, MemoryMappedFileAccess.Read))
using (BinaryReader binaryReader = new BinaryReader(memoryMappedViewStream))
return binaryReader.ReadBytes(size);
}
private List<byte[]> GenerateChecksums()
{
List<byte[]> checksums = new List<byte[]>();
double sectionsCalc = this.fileInfo.Length / (double) SECTION_SIZE;
long normalSizedSectionCount = (long) Math.Floor(sectionsCalc);
int lastSectionSize = (int) ((sectionsCalc - normalSizedSectionCount) * (double) SECTION_SIZE);
using (MD5 hashProvider = MD5.Create())
{
for (long i = 0; i < normalSizedSectionCount; i++)
checksums.Add(hashProvider.ComputeHash(this.Read(i * SECTION_SIZE, SECTION_SIZE)));
checksums.Add(hashProvider.ComputeHash(this.Read(normalSizedSectionCount * SECTION_SIZE, lastSectionSize)));
}
return checksums;
}
}
The purpose of this class is to read large media files and create md5 hashes (checksums in this case) for all x bytes (SECTION_SIZE
) of the file and add them to a list.
I want to know if there is a better/faster way to accomplish this.
-
\$\begingroup\$ If "faster" is a goal then why are you using a crypto-strength hash? Remember, crypto strength hashes are more secure when they are slower. You want an attacker to have to do a lot of work to break a hash. If all you want is a checksum then why are you not using an algorithm specifically designed for checksums, like CRC? \$\endgroup\$Eric Lippert– Eric Lippert2016年02月29日 17:57:42 +00:00Commented Feb 29, 2016 at 17:57
1 Answer 1
You named your class
MediaFile
but basically its a class which creates checksums for any given file. You should consider to rename the class to e.gChecksumCreator
ofFileChecksumCreator
. This would make the purpose and the responsibility of that class obvious by looking at its name.Omitting braces
{}
although they might be optional can lead to serious bugs which aren't tracked easily. I would like to encourage you to always use them.You are using C# 6 but you still use
private
setters for your properties. With C# 6 you can just omit them if you either set the proerties at startup or from inside of the constructor. The backing field of a getter only property is internally handled asreadonly
and because you only set them inside the constructor you should take this route. While we are at the constructor, you are first assigningnull
to theChecksums
property and later on you are using theGenerateChecksums()
method to assign a value to it. The mentioned points could be implemented like sopublic string FilePath { get; } public List<byte[]> Checksums { get; } private FileInfo fileInfo; private MemoryMappedFile memoryMappedFile; public MediaFile(string filePath) { if (!File.Exists(filePath)) { throw new FileNotFoundException($"{filePath} was not found!"); } this.FilePath = filePath; this.fileInfo = new FileInfo(filePath); this.memoryMappedFile = MemoryMappedFile.CreateFromFile(filePath, FileMode.Open); this.Checksums = this.GenerateChecksums(); }
But basically, you won't need a
FileInfo
object here, because you are only using theLength
property of theFileInfo
object. So let us change this like sopublic string FilePath { get; } public List<byte[]> Checksums { get; } private long fileLength; private MemoryMappedFile memoryMappedFile; public MediaFile(string filePath) { if (!File.Exists(filePath)) { throw new FileNotFoundException($"{filePath} was not found!"); } this.FilePath = filePath; this.fileLength= new FileInfo(filePath).Length; this.memoryMappedFile = MemoryMappedFile.CreateFromFile(filePath, FileMode.Open); this.Checksums = this.GenerateChecksums(); }
But what happens if you decide later on to go the way @EricLippert has shown in the comments, like using a
CRC
hashprovider to generate the checksums ? You would need to change the class in question. Maybe it would be better to inject aHashAlgorithm
provider so you could easily switch to another one.This would result in an overloaded constructor and a slightly changed
Dispose()
method like soprivate bool hasOwnershipOfHashprovider = false; private HashAlgorithm hashProvider; public MediaFile(string filePath) : this(filePath, MD5.Create()) { hasOwnershipOfHashprovider = true; } public MediaFile(string filePath, HashAlgorithm hashProvider) { if (!File.Exists(filePath)) { throw new FileNotFoundException($"{filePath} was not found!"); } if (hashProvider == null) { throw new ArgumentNullException($"hashProvider"); } this.FilePath = filePath; this.hashProvider = hashProvider; this.fileLength = new FileInfo(filePath).Length; this.memoryMappedFile = MemoryMappedFile.CreateFromFile(filePath, FileMode.Open); this.Checksums = this.GenerateChecksums(); } public void Dispose() { this.memoryMappedFile.Dispose(); if (this.hasOwnershipOfHashprovider) { this.hashProvider.Dispose(); } }
But what happens if the user of that class won't call
Dispose()
? Take a look at proper-use-of-the-idisposable-interface.I don't see why the
Read()
method should be public. If it needs to be public, you should add some proper validation of the passed arguments.If a
filePath
is passed to the constructor which represents a file withLength == 0
theMemoryMappedFikle.Create()
method will throw anArgumentException
which should better be thrown by your code.