🚀 NitroPascal Deep Dive: The RTL Wrapping Strategy - Our Secret Sauce! #3
Hey everyone! 👋

I'm excited to share a major architectural breakthrough in NitroPascal that makes the entire compiler dramatically simpler and more maintainable. Let me explain the key insight that changed everything, with all the juicy technical details!

The Problem We Solved

Traditional transpilers face a huge challenge: how do you map one language's semantics to another?

We could have built a complex code generator with 11+ units full of intricate logic trying to translate Delphi constructs to C++... but that's a maintenance nightmare waiting to happen.

The Traditional Approach (Complex):

Our Solution: Wrap Everything in the RTL! 🎁

Instead, we took a fundamentally different approach:

✨ We wrap ALL Delphi semantics in C++ Runtime Library (RTL) functions and classes!

This means:
Show Me The Code! 💻

Example 1: For Loops

Delphi code:

```pascal
for i := 1 to 10 do
  WriteLn(i);
```

Traditional transpiler approach (complex):

```cpp
// Generator must handle:
// - Inclusive end (<=, not <)
// - Range evaluation
// - Iterator protection
for (int i = 1; i <= 10; i++) {
    std::cout << i << std::endl;
}
// Problem: Easy to get wrong, hard to maintain
```

Our RTL approach (simple):

The RTL provides this (runtime.h):

```cpp
namespace np {
    template<typename Func>
    void ForLoop(Integer start, Integer end, Func body) {
        for (Integer i = start; i <= end; i++) {
            body(i);
        }
    }
}
```

Code generator just emits:

```cpp
np::ForLoop(1, 10, [&](int i) {
    np::WriteLn(i);
});
```

Result:
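The "range evaluation" point in the comments above deserves one concrete case. Delphi evaluates the bounds of a `for` loop once, before the first iteration. The sketch below assumes `Integer` is a 32-bit alias (its definition is not shown in this post) and uses a hypothetical helper, `CountIterations`, to show that passing the bounds by value gives that behavior without any generator logic.

```cpp
#include <cstdint>

namespace np {
    using Integer = std::int32_t;  // Assumption: the real alias lives in runtime.h

    // Same shape as the ForLoop shown above; both bounds are copied on entry.
    template<typename Func>
    void ForLoop(Integer start, Integer end, Func body) {
        for (Integer i = start; i <= end; i++) {
            body(i);
        }
    }
}

// Hypothetical demo: the body changes `limit`, but the loop still runs
// over the range that was captured when ForLoop was called.
inline np::Integer CountIterations() {
    np::Integer limit = 3;
    np::Integer count = 0;
    np::ForLoop(1, limit, [&](np::Integer) {
        limit = 100;  // No effect on the iteration count
        count++;
    });
    return count;
}
```

A hand-written `for (i = 1; i <= limit; i++)` would re-read `limit` on every iteration and keep going, which is exactly the kind of mismatch a generator would otherwise have to guard against.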
Example 2: String Class with 1-Based Indexing

Delphi uses 1-based indexing for strings (first character is at index 1), while C++ uses 0-based. Our RTL handles this transparently:

The RTL String class (runtime.h):

```cpp
namespace np {
    class String {
    private:
        std::u16string data_;  // UTF-16 internally (matches Delphi)

    public:
        // Constructors omitted here; the code below relies on ones taking
        // a string literal and a std::u16string.

        // 1-based indexing operator
        char16_t operator[](Integer index) const {
            #ifndef NDEBUG
            assert(index >= 1 && index <= Length());
            #endif
            return data_[index - 1];  // Convert to 0-based
        }

        Integer Length() const {
            return static_cast<Integer>(data_.length());
        }

        String operator+(const String& other) const {
            return String(data_ + other.data_);
        }

        // UTF-8 ↔ UTF-16 conversion (more on this below!)
        std::string ToStdString() const;
    };
}
```

Code generator emits:

```cpp
np::String s = "Hello";
auto first = s[1];  // Gets 'H' (1-based!); operator[] returns a char16_t
```

No special logic needed in the generator - just emit the subscript operator!

This Works For EVERYTHING! 🎯

Control Flow → RTL Functions

```cpp
// for...to
template<typename Func>
void ForLoop(Integer start, Integer end, Func body);

// for...downto
template<typename Func>
void ForLoopDownto(Integer start, Integer end, Func body);

// while...do
template<typename CondFunc, typename BodyFunc>
void WhileLoop(CondFunc condition, BodyFunc body);

// repeat...until
template<typename BodyFunc, typename CondFunc>
void RepeatUntil(BodyFunc body, CondFunc condition);
```

Why templates?
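The control-flow declarations above show signatures only. As a minimal sketch of how the three bodies not shown elsewhere in this post might look (assuming a 32-bit `Integer`; the actual RTL implementations may differ):

```cpp
#include <cstdint>

namespace np {
    using Integer = std::int32_t;  // Assumption: the real alias lives in runtime.h

    // for...downto: counts down, inclusive of both bounds.
    template<typename Func>
    void ForLoopDownto(Integer start, Integer end, Func body) {
        for (Integer i = start; i >= end; i--) {
            body(i);
        }
    }

    // while...do: the condition is checked before every iteration.
    template<typename CondFunc, typename BodyFunc>
    void WhileLoop(CondFunc condition, BodyFunc body) {
        while (condition()) {
            body();
        }
    }

    // repeat...until: the body runs at least once, and the loop ends
    // as soon as the condition becomes true.
    template<typename BodyFunc, typename CondFunc>
    void RepeatUntil(BodyFunc body, CondFunc condition) {
        do {
            body();
        } while (!condition());
    }
}
```

Because the condition and body are template parameters rather than `std::function` objects, each call site is instantiated with the concrete lambda type, which gives the optimizer a direct call to inline.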
Operators → RTL Functions

```cpp
// Delphi: x div y → C++: np::Div(x, y)
inline Integer Div(Integer a, Integer b) {
    return a / b;  // C++ / is integer division for ints
}

// Delphi: x mod y → C++: np::Mod(x, y)
inline Integer Mod(Integer a, Integer b) {
    return a % b;
}

// Delphi: x shl n → C++: np::Shl(x, n)
inline Integer Shl(Integer value, Integer shift) {
    return value << shift;
}

// Delphi: element in set → C++: np::In(element, set)
template<typename T, typename SetType>
bool In(const T& element, const SetType& set) {
    return set.contains(element);
}
```

Key point: Even simple operators get wrapped. Why?
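One reason a wrapper can earn its keep even for `div`: in Delphi, integer division by zero raises an exception (`EDivByZero`), whereas in C++ it is undefined behavior. The sketch below shows how a wrapper could close that gap in one place. `CheckedDiv` and `CheckedMod` are hypothetical names, `std::domain_error` is a stand-in exception type, and `Integer` is assumed to be 32-bit; none of this is taken from the actual RTL.

```cpp
#include <cstdint>
#include <stdexcept>

namespace np {
    using Integer = std::int32_t;  // Assumption: the real alias lives in runtime.h

    // Hypothetical checked variant of Div: turns C++ undefined behavior
    // into a catchable error.
    inline Integer CheckedDiv(Integer a, Integer b) {
        if (b == 0) {
            throw std::domain_error("Division by zero");
        }
        return a / b;  // C++ integer division truncates toward zero
    }

    // Hypothetical checked variant of Mod with the same guard.
    inline Integer CheckedMod(Integer a, Integer b) {
        if (b == 0) {
            throw std::domain_error("Division by zero");
        }
        return a % b;  // The result takes the sign of the dividend
    }
}
```

The generator would not change at all if the RTL swapped one implementation for the other, which is the practical payoff of wrapping even trivial operators.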
I/O → Variadic Templates

```cpp
// Handles any number and type of arguments!
template<typename... Args>
void WriteLn(Args&&... args) {
    (std::cout << ... << std::forward<Args>(args));  // C++17 fold expression
    std::cout << std::endl;
}

// Usage (code generator just emits this):
np::WriteLn("Count: ", i, " Value: ", x);
```

Technical benefits:
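To see the fold expression at work without touching `std::cout`, here is a minimal sketch of a variant that takes the target stream as a parameter. `WriteLnTo` and `FormatLine` are hypothetical names introduced for illustration; they are not part of the RTL shown in this post.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

namespace np {
    // Hypothetical variant of WriteLn that writes to any std::ostream.
    template<typename... Args>
    void WriteLnTo(std::ostream& out, Args&&... args) {
        // Binary left fold: expands to ((out << a1) << a2) << ...
        // With an empty pack it reduces to just `out`.
        (out << ... << std::forward<Args>(args));
        out << '\n';
    }
}

// Hypothetical helper: formats the arguments into a string instead of printing.
template<typename... Args>
std::string FormatLine(Args&&... args) {
    std::ostringstream buffer;
    np::WriteLnTo(buffer, std::forward<Args>(args)...);
    return buffer.str();
}
```

Each argument goes through the normal `operator<<` overload for its type, so anything streamable works without the generator knowing the argument types.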
Type System Mapping

We use fixed-size types for cross-platform consistency:
Why fixed-size types?
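A minimal sketch of what fixed-size aliases can look like, based on Delphi's documented sizes for these types. Apart from `np::Integer`, which appears throughout this post, the alias names here are assumptions and may not match the actual RTL.

```cpp
#include <cstdint>
#include <type_traits>

namespace np {
    using Integer  = std::int32_t;   // Delphi Integer: 32-bit signed
    using Cardinal = std::uint32_t;  // Delphi Cardinal: 32-bit unsigned
    using Int64    = std::int64_t;   // Delphi Int64: 64-bit signed
    using Byte     = std::uint8_t;   // Delphi Byte: 8-bit unsigned
    using Word     = std::uint16_t;  // Delphi Word: 16-bit unsigned
}

// With fixed-width aliases the sizes can be verified at compile time,
// so a mismatch fails the build instead of showing up at runtime.
static_assert(sizeof(np::Integer) == 4, "Integer must be 32 bits");
static_assert(sizeof(np::Cardinal) == 4, "Cardinal must be 32 bits");
static_assert(sizeof(np::Int64) == 8, "Int64 must be 64 bits");
static_assert(std::is_signed<np::Integer>::value, "Integer must be signed");
static_assert(std::is_unsigned<np::Word>::value, "Word must be unsigned");
```

By contrast, plain `int` and `long` have sizes that vary between platforms and compilers, which is the inconsistency fixed-width types avoid.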
Today's Major Win: UTF-8 ↔ UTF-16 Conversion! 🎉

One of today's technical challenges was handling string encoding. Delphi uses UTF-16 for strings (like Windows, Java, JavaScript), while most Unix tools use UTF-8.

The Challenge:
Our Solution:

```cpp
#include <cstdint>
#include <string>

namespace {
    std::u16string utf8_to_utf16(const std::string& utf8) {
        std::u16string result;
        size_t i = 0;

        while (i < utf8.size()) {
            uint32_t codepoint = 0;
            unsigned char ch = utf8[i];

            if (ch <= 0x7F) {
                // 1-byte sequence (ASCII)
                codepoint = ch;
                i++;
            } else if ((ch & 0xE0) == 0xC0) {
                // 2-byte sequence
                if (i + 1 >= utf8.size()) break;  // Truncated input: stop rather than loop forever
                codepoint = ((ch & 0x1F) << 6) | (utf8[i + 1] & 0x3F);
                i += 2;
            } else if ((ch & 0xF0) == 0xE0) {
                // 3-byte sequence
                if (i + 2 >= utf8.size()) break;  // Truncated input
                codepoint = ((ch & 0x0F) << 12) |
                            ((utf8[i + 1] & 0x3F) << 6) |
                            (utf8[i + 2] & 0x3F);
                i += 3;
            } else if ((ch & 0xF8) == 0xF0) {
                // 4-byte sequence (becomes surrogate pair in UTF-16)
                if (i + 3 >= utf8.size()) break;  // Truncated input
                codepoint = ((ch & 0x07) << 18) |
                            ((utf8[i + 1] & 0x3F) << 12) |
                            ((utf8[i + 2] & 0x3F) << 6) |
                            (utf8[i + 3] & 0x3F);
                i += 4;
            } else {
                // Invalid UTF-8, skip
                i++;
                continue;
            }

            // Convert codepoint to UTF-16
            if (codepoint <= 0xFFFF) {
                // BMP character (single UTF-16 code unit)
                result += static_cast<char16_t>(codepoint);
            } else {
                // Supplementary character (surrogate pair)
                codepoint -= 0x10000;
                result += static_cast<char16_t>(0xD800 + (codepoint >> 10));
                result += static_cast<char16_t>(0xDC00 + (codepoint & 0x3FF));
            }
        }

        return result;
    }
}
```

Technical highlights:
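The function above covers the UTF-8 to UTF-16 direction. For the other half of the round trip (what `ToStdString` would need), a minimal sketch might look like the following. The name `utf16_to_utf8` and the choice to replace unpaired surrogates with U+FFFD are assumptions for illustration, not the RTL's actual behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical counterpart to utf8_to_utf16: encodes UTF-16 code units as UTF-8.
inline std::string utf16_to_utf8(const std::u16string& utf16) {
    std::string out;
    for (std::size_t i = 0; i < utf16.size(); ++i) {
        std::uint32_t cp = utf16[i];

        // Combine a high surrogate with the low surrogate that follows it.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < utf16.size()) {
            std::uint32_t low = utf16[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;  // The low surrogate has been consumed
            }
        }
        // Assumption: an unpaired surrogate becomes U+FFFD (replacement character).
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }

        if (cp <= 0x7F) {
            out += static_cast<char>(cp);                         // 1 byte
        } else if (cp <= 0x7FF) {
            out += static_cast<char>(0xC0 | (cp >> 6));           // 2 bytes
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp <= 0xFFFF) {
            out += static_cast<char>(0xE0 | (cp >> 12));          // 3 bytes
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));          // 4 bytes
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Handling the surrogate pair first keeps the encoding step a plain range check on the code point, mirroring the structure of the decoder above.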
Memory Management

```cpp
// New/Dispose wrappers
template<typename T>
void New(T*& ptr) {
    ptr = new T();
}

template<typename T>
void Dispose(T*& ptr) {
    delete ptr;
    ptr = nullptr;  // Auto-nullify (a safety addition; Delphi's Dispose leaves the pointer undefined)
}
```

Why wrap these?
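A small usage sketch shows what the reset to `nullptr` buys in practice. `TPoint` and `DisposeResetsPointer` are hypothetical names used only for this demonstration.

```cpp
namespace np {
    // Same wrappers as above, repeated so the example stands alone.
    template<typename T>
    void New(T*& ptr) {
        ptr = new T();
    }

    template<typename T>
    void Dispose(T*& ptr) {
        delete ptr;
        ptr = nullptr;
    }
}

// Hypothetical record type, standing in for a Delphi record.
struct TPoint {
    int X;
    int Y;
};

// Returns true if the pointer is null after Dispose and a repeated
// Dispose on the same variable is harmless.
inline bool DisposeResetsPointer() {
    TPoint* p = nullptr;
    np::New(p);      // new T() value-initializes: X == 0, Y == 0
    p->X = 10;
    np::Dispose(p);  // Frees the record and sets p to nullptr
    np::Dispose(p);  // Safe: deleting a null pointer is a no-op
    return p == nullptr;
}
```

With raw `delete`, the second call would be a double free. With the wrapper, the stale pointer is gone after the first call.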
Why This Architecture Is Brilliant 🧠

1. Simple Code Generator

```pascal
// The ENTIRE for-loop code generator:
procedure TCodeGenerator.EmitFor(const ANode: TJSONObject);
var
  LStart, LEnd, LIterator: string;
begin
  LIterator := GetIteratorName(ANode);
  LStart := EmitExpression(GetStartNode(ANode));
  LEnd := EmitExpression(GetEndNode(ANode));
  EmitLine(Format('np::ForLoop(%s, %s, [&](int %s) {', [LStart, LEnd, LIterator]));
  IncIndent;
  EmitStatements(GetBodyNode(ANode));
  DecIndent;
  EmitLine('});');
end;
```

That's 12 lines. A traditional generator would be 200+ lines handling all the edge cases!

2. Correctness Guarantees
3. Performance
Example: Our

4. Maintainability

5. Extensibility

Want to add a new Delphi feature?

Traditional: Rewrite parts of the generator (risky!)

Our approach:
Example - adding

```cpp
// RTL (runtime.h):
inline void Inc(Integer& value, Integer amount = 1) {
    value += amount;
}
```

```pascal
// Generator (1 line):
'INC': EmitLine(Format('np::Inc(%s);', [GetVarName(ANode)]));
```

C++20 Features We're Using
The Complete Architecture

Current RTL Status 📊

Implemented (as of today!):
Coming Soon:
Real-World Example

This Delphi program:

```pascal
program test01;
var
  i: Integer;
  sum: Integer;
begin
  sum := 0;
  for i := 1 to 10 do
    sum := sum + i;
  WriteLn('Sum: ', sum);
end.
```

Generates this C++:

```cpp
#include "nitropascal_rtl.h"

int main() {
    np::Integer sum;
    sum = 0;
    np::ForLoop(1, 10, [&](np::Integer i) {
        sum = sum + i;
    });
    np::WriteLn("Sum: ", sum);
    return 0;
}
```

Compiles cleanly. Runs perfectly. Native speed. ✨

Performance Notes

Q: Doesn't all this function calling add overhead?

A: No! Modern C++ compilers are amazing:
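One way to check the overhead question yourself is to put the wrapper version next to a hand-written loop and compare the optimized output of both. A minimal sketch, assuming a 32-bit `Integer`; `SumWithRtl` and `SumByHand` are hypothetical names:

```cpp
#include <cstdint>

namespace np {
    using Integer = std::int32_t;  // Assumption: the real alias lives in the RTL header

    template<typename Func>
    void ForLoop(Integer start, Integer end, Func body) {
        for (Integer i = start; i <= end; i++) {
            body(i);
        }
    }
}

// The shape the code generator emits: loop body passed as a lambda.
inline np::Integer SumWithRtl(np::Integer n) {
    np::Integer sum = 0;
    np::ForLoop(1, n, [&](np::Integer i) {
        sum = sum + i;
    });
    return sum;
}

// The shape a hand-written translation would have.
inline np::Integer SumByHand(np::Integer n) {
    np::Integer sum = 0;
    for (np::Integer i = 1; i <= n; i++) {
        sum = sum + i;
    }
    return sum;
}
```

Both functions return the same value for any `n`. Building with optimizations enabled and comparing the generated assembly for the two shows whether your toolchain inlined the lambda call.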
Proof: Compile with

Why This Matters

This architecture means we can:
The Philosophy
This is the NitroPascal way - elegant simplicity through careful architecture.

Technical Resources

Want to dive deeper? Check out:
Everything is on GitHub (coming soon!) and the design is meticulously documented.

Questions? Comments? Technical discussions welcome! Have you worked with:
Let's discuss! Drop your thoughts below! 💬👇

TL;DR for the skimmers:
🚀 NitroPascal: Real Pascal. Real Performance. Real Simple (under the hood). 🚀