Enhancing speed of looping cycle using Freepascal

Question 1

I use Lazarus 1.2.4 and Freepascal 2.6.4.

I have created a program that reads a disk in buffers of 64Kb (tried various buffer sizes) using a repeat...until loop. Each buffer is hashed using the SHA1 unit, specifically, SHA1Init, SHA1Update and SHA1Final.

The trouble is, is that although it works and the hashes always match that computed by other tools that do the same job, my program is not as fast. On a specific workstation with an 80Gb disk attached, it reads and hashes at about 1.8Gb per minute, and this is as a result of some enhanced compiler directives (and using specific optimisations offered by the Lazarus\FPC compiler). Before those tweaks, it was just 1.22Gb p\min as an average. The other tools do it at about 2.5Gb+ a minute (around 45Mb a second) and some are faster than that.

If I remove the hashing element and just do the disk reading, it reads at about 4Gb per minute, so I am fairly sure my loop structure is actually fairly quick. So I'm almost certain the bottleneck is the hashing aspect and this has been discussed at the Lazarus forum, here where it has been suggested that maybe the library needs to be improved a little for better speed. One poster suggested I re-write the three functions in assembly but I am not that good.

There is a related post HERE regarding SHA256, where the gentlemen concerned experienced similar issues, though with a different language. His implementation was very similar to mine - Init, Update, Final. One suggestion was to use a buffer of 16Mb in that post. I have tried 4Kb, 8Kb, 64Kb, 256Kb, 512Kb and 1Mb. I haven't gone to 16Mb or anywhere near that - might that prove to be worthwhile? I read that once you go above about 1Mb programs usually go backward?

Is there an obvious way to improve speed?

I have included only the relevant parts in the hope it will make the task easier to read.

// Main parts of my code responsible for loop.
// The SHA1 functions from the SHA1 Freepascal unit follow
hSelectedDisk := CreateFileW(PWideChar(DiskName), FILE_READ_DATA,
 FILE_SHARE_READ or FILE_SHARE_WRITE, nil, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, 0); // please note I have also tried FILE_FLAG_NO_BUFFER but that made no apparant difference
// We need the exact disk size in bytes to know when to stop reading
ExactDiskSize := GetDiskLengthInBytes(hSelectedDisk);
SectorCount := ExactDiskSize DIV 512;
// Now read the disk FROM START TO END and hash it until completion or the user aborts it
try
 SHA1Init(ctx);
 FileSeek(hSelectedDisk, 0, 0);
 repeat
 ProgressCounter := ProgressCounter + 1; // We use this update the progress display occasionally, instead of every buffer read
 TimeStartRead := Now;
 // The hashing bit...read the disk in buffers, hash each buffer and then
 // finalise the finished hash. If there's a read error, abort.
 // Step 1 : Check we are not at the end of the disk where bytes remaining
 // could be less than the size of the buffer
 if (ExactDiskSize - TotalBytesRead) < SizeOf(Buffer) then
 begin
 BytesRead := FileRead(hSelectedDisk, Buffer, (ExactDiskSize - TotalBytesRead)); // Read 65535 or less bytes
 end
 else
 begin
 BytesRead := FileRead(hSelectedDisk, Buffer, SizeOf(Buffer)); // Read 65536 (64kb) at a time
 end;
 if BytesRead = -1 then
 begin
 ShowMessage('There was a read error encountered. Aborting');
 exit;
 end
 else
 // Step 2 : No read errors, so now we hash ...
 // Update positions, update hash sequence, and update GUI
 begin
 inc(TotalBytesRead, BytesRead);
 NewPos := NewPos + BytesRead;
 SHA1Update(ctx, Buffer, BytesRead);
 lblBytesLeftToHashB.Caption := IntToStr(ExactDiskSize - NewPos) + ' bytes, ' + FormatByteSize(ExactDiskSize - NewPos);
 until (TotalBytesRead = ExactDiskSize) or (Stop = true);
 // Compute the final hash value
 SHA1Final(ctx, Digest);
 lblHash.Caption := SHA1Print(Digest);
 end;
end; // End of main looping cycle. Following code is the FPC procedures
procedure SHA1Init(out ctx: TSHA1Context);
begin
 FillChar(ctx, sizeof(TSHA1Context), 0);
 ctx.State[0] := 67452301ドル;
 ctx.State[1] := $efcdab89;
 ctx.State[2] := 98ドルbadcfe;
 ctx.State[3] := 10325476ドル;
 ctx.State[4] := $c3d2e1f0;
end;
procedure SHA1Update(var ctx: TSHA1Context; const Buf; BufLen: PtrUInt);
var
 Src: PByte;
 Num: PtrUInt;
begin
 if BufLen = 0 then
 Exit;
 Src := @Buf;
 Num := 0;
 // 1. Transform existing data in buffer
 if ctx.BufCnt > 0 then
 begin
 // 1.1 Try to fill buffer up to block size
 Num := 64 - ctx.BufCnt;
 if Num > BufLen then
 Num := BufLen;
 Move(Src^, ctx.Buffer[ctx.BufCnt], Num);
 Inc(ctx.BufCnt, Num);
 Inc(Src, Num);
 // 1.2 If buffer is filled, transform it
 if ctx.BufCnt = 64 then
 begin
 SHA1Transform(ctx, @ctx.Buffer);
 ctx.BufCnt := 0;
 end;
 end;
 // 2. Transform input data in 64-byte blocks
 Num := BufLen - Num;
 while Num >= 64 do
 begin
 SHA1Transform(ctx, Src);
 Inc(Src, 64);
 Dec(Num, 64);
 end;
 // 3. If there's less than 64 bytes left, add it to buffer
 if Num > 0 then
 begin
 ctx.BufCnt := Num;
 Move(Src^, ctx.Buffer, Num);
 end;
end;
const
 PADDING: array[0..63] of Byte =
 (80,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,ドル
 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
 );
procedure SHA1Final(var ctx: TSHA1Context; out Digest: TSHA1Digest);
var
 Length: QWord;
 Pads: Cardinal;
begin
 // 1. Compute length of the whole stream in bits
 Length := 8 * (ctx.Length + ctx.BufCnt);
 // 2. Append padding bits
 if ctx.BufCnt >= 56 then
 Pads := 120 - ctx.BufCnt
 else
 Pads := 56 - ctx.BufCnt;
 SHA1Update(ctx, PADDING, Pads);
 // 3. Append length of the stream (8 bytes)
 Length := NtoBE(Length);
 SHA1Update(ctx, Length, 8);
 // 4. Invert state to digest
 Invert(@ctx.State, @Digest, 20);
 FillChar(ctx, sizeof(TSHA1Context), 0);
end;

Question 2

There is a frequently run code in SHA1Transform() which you did not show (it maybe the assembly rewrite candidate). Do you use this github.com/graemeg/freepascal/blob/master/packages/hash/src/… code?

Question 3

There is a assembly language implementation of the SHA1 e.g. at nayuki.eigenstate.org/page/…, it should be easily possible to link assembly code (other then inline assembly, in Delphi terms it would be a *.obj file) into FreePascal

Question 4

Another one is at software.intel.com/en-us/articles/… (just using Google "sha1 assembly algorithm")

Question 5

If you're processing 1.8Gb per minute using 64Kb buffers, that's (1800000 / 64 =) 28000 buffers per minute i.e. (28000 / 60 =) 470 buffers / second.

I don't know what you're doing elsewhere with these statements ...

ProgressCounter := ProgressCounter + 1; // We use this update the progress display occasionally, instead of every buffer read
TimeStartRead := Now;

... but you should probably NOT try to update a GUI progress bar 500 times/second!

Also, avoid calling this 500 times times/second:

 lblBytesLeftToHashB.Caption := IntToStr(ExactDiskSize - NewPos) + ' bytes, ' + FormatByteSize(ExactDiskSize - NewPos);

Try disabling/removing that GUI-updating code completely, to see whether that improves performance. If it does improve performance then partially re-add the GUI-updating code: for example, update the GUI once every 100 buffers (instead of once every buffer as you're doing now).

Also you should get better performance if you move the I/O to a separate thread.

Alternatively this would be a time to use 'overlapped I/O' (e.g. passing a non-null LPOVERLAPPED parameter to the Win32 ReadFile function) but I don't know how to do that with the Freepascal run-time library.

Question 6

I have a progress speed checker that is activated every 2000 buffer reads. So, moving lblBytesLeftToHashB.Caption to the same if statement that checks the progress has had the ffect of meaning the interface refreshes every second or so rather than 500 times a second, and that shaves about another 8 minutes off. So it is now about as fast as the other tool! Thanks for your help. Answer accepted.

ChrisW ChrisW 13k1 gold badge35 silver badges76 bronze badges · Accepted Answer · 2014-07-06 22:49:36Z

If you're processing 1.8Gb per minute using 64Kb buffers, that's (1800000 / 64 =) 28000 buffers per minute i.e. (28000 / 60 =) 470 buffers / second.

I don't know what you're doing elsewhere with these statements ...

ProgressCounter := ProgressCounter + 1; // We use this update the progress display occasionally, instead of every buffer read
TimeStartRead := Now;

... but you should probably NOT try to update a GUI progress bar 500 times/second!

Also, avoid calling this 500 times times/second:

 lblBytesLeftToHashB.Caption := IntToStr(ExactDiskSize - NewPos) + ' bytes, ' + FormatByteSize(ExactDiskSize - NewPos);

Try disabling/removing that GUI-updating code completely, to see whether that improves performance. If it does improve performance then partially re-add the GUI-updating code: for example, update the GUI once every 100 buffers (instead of once every buffer as you're doing now).

Also you should get better performance if you move the I/O to a separate thread.

Alternatively this would be a time to use 'overlapped I/O' (e.g. passing a non-null LPOVERLAPPED parameter to the Win32 ReadFile function) but I don't know how to do that with the Freepascal run-time library.

I have a progress speed checker that is activated every 2000 buffer reads. So, moving lblBytesLeftToHashB.Caption to the same if statement that checks the progress has had the ffect of meaning the interface refreshes every second or so rather than 500 times a second, and that shaves about another 8 minutes off. So it is now about as fast as the other tool! Thanks for your help. Answer accepted.

Stack Exchange Network

Enhancing speed of looping cycle using Freepascal

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Enhancing speed of looping cycle using Freepascal

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions