2010年04月07日
More looping
For context, see Nathan's blog.
[Edit: I've just learned on IRC from Maciej Katafiasz that iterate also has a feature built in which renders this hack unnecessary. Even more iterate awesomeness! I'll leave this post up though, because I still think it' s pretty cute that SBCL optimizes this case so nicely.]
MY-USER> (defun outer-collect-in-inner-loop () (declare (optimize speed (safety 0) (debug 0))) (iter (for outer from 0 below 10) (declare (type (integer -1 10) outer)) (flet ((outer-collect (x) (collect x))) (iter (for inner from 0 below 10) (declare (type (integer -1 10) inner)) (outer-collect (+ (* 10 outer) inner)))))) OUTER-COLLECT-IN-INNER-LOOPNote that the COLLECT in the outer loop is ITERATE's collecting feature. The trick is to use it in a FLET before starting the inner loop, thereby making it available further down. Declarations added only to get a nicer disassembly.
MY-USER> (OUTER-COLLECT-IN-INNER-LOOP) (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99)Okay, it works. But how is it being compiled?
MY-USER> (disassemble 'OUTER-COLLECT-IN-INNER-LOOP) ; disassembly for OUTER-COLLECT-IN-INNER-LOOP ; 069222FF: 31C9 XOR ECX, ECX ; no-arg-parsing entry point ; 301: BE17001020 MOV ESI, 537919511 ; 306: BF17001020 MOV EDI, 537919511 ; 30B: BA17001020 MOV EDX, 537919511 ; 310: 48C7C1F8FFFFFF MOV RCX, -8 ; 317: E977000000 JMP L5 ; 31C: 90 NOP ; 31D: 90 NOP ; 31E: 90 NOP ; 31F: 90 NOP ; 320: L0: 31DB XOR EBX, EBX ; 322: 48C7C3F8FFFFFF MOV RBX, -8 ; 329: EB5B JMP L4 ; 32B: 90 NOP ; 32C: 90 NOP ; 32D: 90 NOP ; 32E: 90 NOP ; 32F: 90 NOP ; 330: L1: 488BD1 MOV RDX, RCX ; 333: 486BD20A IMUL RDX, RDX, 10 ; 337: 4C8BCB MOV R9, RBX ; 33A: 4E8D040A LEA R8, [RDX+R9] ; 33E: 4989AC24B8000000 MOV [R12+184], RBP ; 346: 4D8B5C2468 MOV R11, [R12+104] ; 34B: 498D5310 LEA RDX, [R11+16] ; 34F: 4939542470 CMP [R12+112], RDX ; 354: 765C JBE L7 ; 356: 4989542468 MOV [R12+104], RDX ; 35B: 498D5307 LEA RDX, [R11+7] ; 35F: L2: 4931AC24B8000000 XOR [R12+184], RBP ; 367: 7402 JEQ L3 ; 369: CC09 BREAK 9 ; pending interrupt trap ; 36B: L3: 4C8942F9 MOV [RDX-7], R8 ; 36F: 48C7420117001020 MOV QWORD PTR [RDX+1], 537919511 ; 377: 4881FE17001020 CMP RSI, 537919511 ; 37E: 7529 JNE L6 ; 380: 488BF2 MOV RSI, RDX ; 383: 488BFA MOV RDI, RDX ; 386: L4: 4883C308 ADD RBX, 8 ; 38A: 488BD3 MOV RDX, RBX ; 38D: 4883FA50 CMP RDX, 80 ; 391: 7C9D JL L1 ; 393: L5: 4883C108 ADD RCX, 8 ; 397: 488BD1 MOV RDX, RCX ; 39A: 4883FA50 CMP RDX, 80 ; 39E: 7C80 JL L0 ; 3A0: 488BD6 MOV RDX, RSI ; 3A3: 488BE5 MOV RSP, RBP ; 3A6: F8 CLC ; 3A7: 5D POP RBP ; 3A8: C3 RET ; 3A9: L6: 48895701 MOV [RDI+1], RDX ; 3AD: 488BFA MOV RDI, RDX ; 3B0: EBD4 JMP L4 ; 3B2: L7: 6A10 PUSH 16 ; 3B4: 4C8D1C25601B4200 LEA R11, [#x421B60] ; alloc_tramp ; 3BC: 41FFD3 CALL R11 ; 3BF: 5A POP RDX ; 3C0: 488D5207 LEA RDX, [RDX+7] ; 3C4: EB99 JMP L2
Subscribe to:
Comments (Atom)