NitroPascal Performance Benchmarks #7

jarroddavis68 started this conversation in DevLog
Overview

Performance comparison between NitroPascal (releasefast optimization) and Delphi (Release mode) using the NPBench micro-benchmark suite.

Test Environment:

  • Platform: Windows x64
  • Date: October 17, 2025
  • NitroPascal: releasefast optimization mode
  • Delphi: Release mode with full optimizations

Benchmark Suite

NPBench consists of three micro-benchmarks designed to measure different aspects of compiler performance:

1. String Concatenation (string_concat_1k)

Repeatedly concatenates a single character to build a 1KB string:

LStr := '';
for LIndex := 1 to 1024 do
  LStr := LStr + 'x';

Measures: String memory allocation, copy operations, runtime library efficiency
Data processed: 1,024 bytes per iteration

2. Array Sum (array_sum_10m)

Computes a running sum over 10 million integers, with array writes:

LSum := 0;
for LIndex := 1 to 10000000 do
begin
  LSum := LSum + LIndex;
  LArray[LIndex mod 100] := LSum;
end;

Measures: Integer arithmetic, array indexing, loop optimization, cache utilization
Data processed: 80 MB per iteration

3. Matrix Multiplication (matmul_64)

Standard matrix multiplication of two 64×64 double-precision matrices:

for i := 0 to 63 do
  for j := 0 to 63 do
    for k := 0 to 63 do
      C[i,j] := C[i,j] + A[i,k] * B[k,j];

Measures: Floating-point arithmetic, nested loop optimization, memory access patterns
Data processed: 98,304 bytes per iteration

Methodology

Each benchmark:

  1. Performs 2 warmup iterations
  2. Auto-adjusts iteration count to achieve ~400ms total runtime
  3. Reports average time per operation
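
The three steps above can be sketched roughly as follows. This is not NPBench's actual harness: the function name `run_benchmark`, the single-call calibration step, and the way the iteration count is derived from it are all assumptions made for illustration, and Python is used only to keep the sketch short.

```python
import time

def run_benchmark(op, target_ns=400_000_000, warmups=2):
    """Hypothetical harness: warm up, pick an iteration count, report ns/op."""
    # Step 1: warmup iterations (results discarded).
    for _ in range(warmups):
        op()

    # Step 2 (assumed strategy): time one call, then choose enough
    # iterations to fill roughly `target_ns` of total runtime.
    start = time.perf_counter_ns()
    op()
    estimate_ns = max(1, time.perf_counter_ns() - start)
    iterations = max(1, target_ns // estimate_ns)

    # Step 3: timed run, reported as average time per operation.
    start = time.perf_counter_ns()
    for _ in range(iterations):
        op()
    elapsed_ns = time.perf_counter_ns() - start
    return iterations, elapsed_ns / iterations


def string_concat_1k():
    """Python stand-in for the string_concat_1k loop body."""
    s = ""
    for _ in range(1024):
        s = s + "x"
    return s


if __name__ == "__main__":
    iters, ns_per_op = run_benchmark(string_concat_1k)
    print(f"string_concat_1k: {iters} iterations, {ns_per_op:.2f} ns/op")
```

A slow operation such as array_sum_10m ends up with only a handful of iterations under this scheme, while a fast one gets tens of thousands.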

Metrics:

  • ns/op: Nanoseconds per operation (lower is better)
  • ops/s: Operations per second (higher is better)
  • MB/s: Data throughput in megabytes per second (higher is better)
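
The three metrics are tied together by simple arithmetic, sketched below. Two things here are assumptions rather than something the report states: the MB/s column appears to use 2^20-byte megabytes (that is the only divisor that reproduces the reported figures), and the "80 MB" for array_sum_10m is taken as 80,000,000 bytes. The helper names are made up for this example.

```python
MIB = 1024 * 1024  # assumed: reported MB/s matches 2**20-byte megabytes

def ops_per_sec(ns_per_op):
    """Operations per second from nanoseconds per operation."""
    return 1e9 / ns_per_op

def mb_per_sec(ns_per_op, bytes_per_op):
    """Throughput from ns/op and the bytes processed by one operation."""
    return ops_per_sec(ns_per_op) * bytes_per_op / MIB

# (benchmark, ns/op from the NitroPascal results, bytes per operation)
rows = [
    ("string_concat_1k", 242_495.00, 1_024),
    ("array_sum_10m", 6_343_000.00, 80_000_000),  # assumed 10,000,000 x 8 bytes
    ("matmul_64", 136_708.22, 98_304),
]

if __name__ == "__main__":
    for name, ns, nbytes in rows:
        print(f"{name}: {ops_per_sec(ns):,.2f} ops/s, {mb_per_sec(ns, nbytes):,.2f} MB/s")
```

Feeding in the ns/op values from the results reproduces the reported ops/s and MB/s columns to within rounding.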

Results

NitroPascal (releasefast)

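| Benchmark        | Iterations |        ns/op |    ops/s |      MB/s |
|------------------|-----------:|-------------:|---------:|----------:|
| string_concat_1k |         40 |   242,495.00 | 4,123.80 |      4.03 |
| array_sum_10m    |          1 | 6,343,000.00 |   157.65 | 12,028.05 |
| matmul_64        |         73 |   136,708.22 | 7,314.85 |    685.77 |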

Delphi (Release)

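| Benchmark        | Iterations |         ns/op |     ops/s |     MB/s |
|------------------|-----------:|--------------:|----------:|---------:|
| string_concat_1k |     28,169 |     14,689.94 | 68,073.80 |    66.48 |
| array_sum_10m    |         40 | 10,328,132.50 |     96.82 | 7,387.00 |
| matmul_64        |      1,899 |    215,436.28 |  4,641.74 |   435.16 |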

Performance Comparison

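| Benchmark        |  NitroPascal |        Delphi | Faster      | Ratio |
|------------------|-------------:|--------------:|-------------|------:|
| string_concat_1k |   242,495 ns |     14,690 ns | Delphi      | 16.5× |
| array_sum_10m    | 6,343,000 ns | 10,328,133 ns | NitroPascal | 1.63× |
| matmul_64        |   136,708 ns |    215,436 ns | NitroPascal | 1.58× |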
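
The "Faster" and "Ratio" columns follow directly from the ns/op figures: the slower time divided by the faster one. A small sketch, with the `compare` helper being a name invented for this example:

```python
def compare(nitro_ns, delphi_ns):
    """Return (faster compiler, slower time / faster time)."""
    if nitro_ns < delphi_ns:
        return "NitroPascal", delphi_ns / nitro_ns
    return "Delphi", nitro_ns / delphi_ns

# ns/op values taken from the two results tables: (NitroPascal, Delphi)
results = {
    "string_concat_1k": (242_495.00, 14_689.94),
    "array_sum_10m": (6_343_000.00, 10_328_132.50),
    "matmul_64": (136_708.22, 215_436.28),
}

if __name__ == "__main__":
    for name, (nitro_ns, delphi_ns) in results.items():
        winner, ratio = compare(nitro_ns, delphi_ns)
        print(f"{name}: {winner} is {ratio:.2f}x faster")
```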

Analysis

String Concatenation

  • Winner: Delphi (about 16.5× faster)
  • Delphi's native string implementation with copy-on-write semantics and optimized memory allocation significantly outperforms NitroPascal's C++ string runtime

Array Sum

  • Winner: NitroPascal (about 1.63× faster)
  • NitroPascal benefits from modern C++ compiler optimizations including aggressive loop optimization and better instruction scheduling
  • Throughput: NitroPascal 12,028 MB/s vs Delphi 7,387 MB/s

Matrix Multiplication

  • Winner: NitroPascal (about 1.58× faster)
  • NitroPascal's C++ backend provides superior floating-point optimization and nested loop handling
  • Throughput: NitroPascal 685.77 MB/s vs Delphi 435.16 MB/s

Summary

NitroPascal Performance Profile:

  • Excels at numeric computation (roughly 1.6× faster than Delphi)
  • Strong performance on array operations and floating-point math
  • Slower string handling (about 16.5× slower than Delphi)

Delphi Performance Profile:

  • Superior string manipulation performance
  • Competitive numeric performance
  • Mature, well-optimized runtime library

Reproducibility

Both compilers use identical source code to ensure fair comparison. Benchmarks can be reproduced using the NPBench suite included with NitroPascal.

Version: NPBench 1.0
Report Date: October 17, 2025

Replies: 1 comment

🎉 UPDATE: String Performance Problem SOLVED!

Great news! We've just implemented a code generation optimization that dramatically improves the string concatenation benchmark results:

New Results:

  • String concatenation: 18,628 ns/op (was 242,495 ns/op)
  • That's about a 13× performance improvement! 🚀
  • Now only about 1.27× slower than Delphi (was about 16.5× slower)

What we did:
The compiler now detects the pattern LStr := LStr + 'x' and generates efficient += operations instead of creating new allocations every time.
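
To show why that rewrite matters, here is a loose analogy in Python. It is not NitroPascal's generated C++ and says nothing about how the compiler detects the pattern. It only contrasts the two shapes of code, using `bytearray` because it supports true in-place append. The function names are made up for the example.

```python
def concat_rebuild(n):
    """Analogue of s = s + 'x': each step builds a brand-new buffer."""
    buf = bytearray()
    new_buffers = 0
    for _ in range(n):
        before = buf
        buf = buf + b"x"          # allocates a new object and copies the old bytes
        new_buffers += buf is not before
    return bytes(buf), new_buffers


def concat_in_place(n):
    """Analogue of s += 'x': the same buffer is extended in place."""
    buf = bytearray()
    original = buf
    for _ in range(n):
        buf += b"x"               # mutates the existing object, amortized growth
    return bytes(buf), buf is original


if __name__ == "__main__":
    data, new_buffers = concat_rebuild(1024)
    print(len(data), "bytes built with", new_buffers, "fresh buffers")
    data, same_object = concat_in_place(1024)
    print(len(data), "bytes built, same buffer throughout:", same_object)
```

Both versions produce the same 1 KB result. The first copies the growing contents on every step, while the second only reallocates occasionally as the buffer grows.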

Bottom line: NitroPascal is now competitive with Delphi on string operations while maintaining its roughly 1.6× advantage on numeric benchmarks. Best of both worlds! 💪
