Challenge:
Your challenge is to write an interpreter for Whitespace. Given a string consisting of spaces, tabs, newlines, and potentially other characters, as well as possible inputs for the Whitespace program itself, output the result of the given Whitespace program.
Here is an overview of the Whitespace language and its builtins:
Whitespace is a stack-based language which uses only three characters: spaces (ASCII codepoint 32); tabs (ASCII codepoint 9); and newlines (ASCII codepoint 10); all other characters are ignored.
It only has a couple of basic builtins, which I will go over below. Whitespace has a stack, which can only consist of integers, as well as a heap, which is a map of integers (both the key and value).
From here on out I will be using S for spaces, T for tabs, and N for newlines to make the text more compact.
Stack Manipulation (always starts with leading S):
- Push a number to the stack:
SS, followed by either anS/Tfor positive/negative respectively, followed by someSand/orTwhich is the binary representation of the number (S=0;T=1), followed by a trailing newlineN. Some examples:SSSTNpushes the number1; a positive integer with binary1.SSTTSNpushes the number-2; a negative integer with binary10.SSSTSTSNpushes the number10; a positive integer with binary1010.SSTTTSSTSSNpushes the number-100; a negative integer with binary1100100.- Pushing number
0is an edge case, since it can be done in multiple ways. Some examples:SSSN: push a positive integer without any binary digits.SSTN: push a negative integer without any binary digits.SSSSN: push a positive integer with binary0.SSTSSSN: push a negative integer with binary000.
- Duplicate the top of the stack:
SNS. - Copy the 0-based \$n\$th item from the top of the stack to the top of the stack:
STSfollowed by a number similar as mentioned earlier (excluding the leadingSS). I.e. let's say the stack currently contains the integers[47,12,0,55], then we could useSTSSTSNto copy the 0-based 2nd item (which is the12in this case) to the top. So the stack becomes:[47,12,0,55,12].- NOTE: This index may not be negative. On TIO this would result in a negative index error, but that same program would push a
0in the vii5ard interpreter, and it could even be different in yet another interpreter. For the sake of this challenge, you can therefore assume a given copy will never be negative. So the copy will always start withSTSS, followed by the binary of the top-to-bottom index, followed by a trailingN.
- NOTE: This index may not be negative. On TIO this would result in a negative index error, but that same program would push a
- Swap the top two items on the stack:
SNT. - Discard the top item of the stack:
SNN. - Discard/slice \$n\$ items from the top of the stack, but keep the top item:
STN, followed by a number similar as mentioned earlier (excluding the leadingSS). I.e. let's say the stack currently contains the integers[1,2,3,4,5,6,7], then we could useSTNSTTNto discard 3 items from the stack (except the top one). So the stack becomes:[1,2,3,7].- You can again assume no negative slice values will be used, so this will always start with
STNS, followed by the amount to slice, followed by a trailingN.
- You can again assume no negative slice values will be used, so this will always start with
Arithmetic (always starts with leading TS):
- Addition; add the top two items on the stack together:
TSSS. - Subtraction; subtract the top item of the stack from the (top-1)th item of the stack:
TSST. - Multiplication; multiply the top two items on the stack:
TSSN. - Integer division; integer divide the (top-1)th item of the stack by the top item of the stack:
TSTS. (NOTE: Since Whitespace only has integers, this will always be integer division and never result in decimal values.) - Modulo; take the modulo of the (top-1)th item of the stack with the top item of the stack:
TSTT.- You can assume no negative values will be used for the modulo.
For each of these two argument builtins the same applies: if none or only a single integer is on the stack, this will result in an error. How you implement this error in your interpreter is up to you, as long as the program stops when this occurs. I thought about not allowing it for the sake of this challenge, but decided not to because it's a commonly used strategy to stop a program with an error when printing hardcoded texts, using the approach explained in this Whitespace codegolfing tip.
Heap Access (always starts with leading TT):
- Pop the top two items of the stack, and store the top item in the (top-1)th address of the heap:
TTS. I.e. let's say the stack contains the integers[1,2,3,4,5]and the heap already contains[{2:10}]. When you use this store builtin twice in a row, the stack would contain[1]and the heap will contain[{2:3},{4:5}](note how the{2:10}has been replaced with the{2:3}).- NOTE: Just like with the Arithmetic builtins, if no or a single argument is given, it will cause an error. But for the sake of this challenge you can assume this will never be given for this builtin.
- Pop the top item of the stack, and push the item corresponding with that heap address to the top of the stack:
TTT. I.e. let's say the stack contains the integers[1,4]and the heap contains[{2:3},{4:5}]. If you now use the retrieve builtin once, the stack would become[1,5](and the heap will remain the same).- NOTE: If you use an address that isn't in the heap yet (or the heap is empty), it will push a
0to the stack instead. But for the sake of this challenge you can also ignore this.
- NOTE: If you use an address that isn't in the heap yet (or the heap is empty), it will push a
Flow Control (always starts with leading N):
- Mark a location in the program with a label:
NSS, followed by some (optional)S/Twhich aren't used by other labels/subroutines, followed by anN. I.e. if you're only using a single label in your full program,NSSNwould be what to use when code-golfing. If you need two or three labels, you can addNSSSNand/orNSSTN.- Although it is possible to have multiple of the same labels in the TIO and vii5args interpreters, it will cause issues, so we assume the input will always only create a label/subroutine once.
- Also, although
NSSNwould be a logical first label to use, it's completely valid to use a labelNSSTSTSTTTSNinstead as only label in the program.
- Call a subroutine with the given label:
NST, followed by some (optional)S/Twhich aren't used by other labels/subroutines, followed by anN. I.e.NSTTSTSTTTSNwould jump to the labelTSTSTTTSas subroutine. - Jump unconditionally to a label:
NSN, followed by some (optional)S/T, followed by anN. I.e.NSNNwould jump to the (empty) labelNand continue the program flow from there. - Pop the top integer, and jump to a label if it is exactly 0:
NTS, followed by some (optional)S/T, followed by anN. I.e. if the stack is currently[4,1,0]and we'd useNTSSN, it would jump to the labelSNand continue the program flow from there (with stack[4,1]). If instead the stack is currently[4,1]and we'd use theNTSSN, it would jump past it to the next builtin below it (with stack[4]). - Pop the top integer, and jump to a label if it is negative:
NTT, followed by some (optional)S/T, followed by anN. I.e. if the stack is currently[4,1,-10]and we'd useNTTTN, it would jump to the labelTNand continue the program flow from there (with stack[4,1]). If instead the stack is currently[4,1]and we'd use theNTTTN, it would jump past it to the next builtin below it (with stack[4]).- Minor note: There is no Jump to label if positive builtin available in Whitespace.
- End a subroutine, and go back to the caller (a.k.a. return):
NTN. - End the entire program:
NNN(everything after that becomes no-ops).
I/O (always starts with leading TN):
- Pop the top integer, and print as character with that codepoint to STDOUT:
TNSS. I.e. if the stack is currently[10,101]and we'd call theTNSStwice, it will output a lowercaseefollowed by a newline to STDOUT. - Pop the top integer, and print as integer to STDOUT:
TNST. - Pop the top integer, and read a character from STDIN, for which its codepoint-integer will be stored in the heap with the popped integer as address:
TNTS. I.e. if the stack is[0,0], the heap is empty, and STDIN contains the capital letterI, and we'd useTNTS. The stack will become[0]and the heap[{0:73}]. (After which we could use the retrieve builtinTTTto put this input on the stack.) - Pop the top integer, and read an integer from STDIN, which will be stored in the heap with the popped integer as address:
TNTT.
Challenge rules:
- You can assume the Whitespace input is always valid with the builtins above. So the compilation phase would always succeed.
- You can assume executing the Whitespace input will never result in any errors, except when the Arithmetic builtins will only get 0 or 1 stack-arguments instead of the required 2. Although there are loads of other possible errors when executing a Whitespace program, like negative indices, jumps to labels that doesn't exist, read from STDIN when it's empty, etc. For the sake of this challenge you can assume none of those kind of errors will occur, and the input-program is error-free (except for the Arithmetic builtins).
- You can assume the additional inputs given will be valid as well. So an integer when we want to read an integer or a character / string of multiple characters if we want to read those.
- The Whitespace program-input can be taken in any reasonable format. Could be a string, list of characters, list of codepoint-integers, etc. Same applies to the other inputs.
- Any non-whitespace character in the Whitespace-program input is ignored. You can assume the no-op characters will only be printable ASCII. So the Whitespace input will only contain the UTF-8/ASCII characters with the codepoints [9, 10, 32..126].
- Whitespace numbers can in theory be as large as you want, but for the sake of this challenge you can assume the integers used will be in the range [-9223372036854775808, 9223372036854775807] (signed 64 bit integer).
General rules:
- This is code-golf, so shortest answer in bytes wins.
Don't let code-golf languages discourage you from posting answers with non-codegolfing languages. Try to come up with an as short as possible answer for 'any' programming language. - Standard rules apply for your answer with default I/O rules, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters and return-type, full programs. Your call.
- Default Loopholes are forbidden.
- If possible, please add a link with a test for your code (i.e. TIO).
- Also, adding an explanation for your answer is highly recommended.
Test cases:
You should copy the actual test cases from the Try it online links to verify your interpreter.
Input(s):
SSTTSTTTTNSSSTSNSSSTNSTSSTNSSTTSTNSSTTSNSNSSSSTNSTSSTNSSTTSSTTTSNSSTTTSSTTNSSSNSSTTSTNSSTTTNSSSTSNSSTTNSSSTTTNSSSTSNSSTTSSTTTSNSSSTSNSSTTNSSSTTTNSSTTSNSSSTSNSSTTSSTTTSNSSTTTSSTTNSSTTTNSSTTSNSSTTSTNSSTTNSSTTSSTTTSNSSTTNSSSTTTNSSTTSTNSSSNSSTTSTNSSTTSNSNSSSSTNSSTTTTTSNNSSNSSSTTSTTTSNTSSSTNSSNSNN
Builtins used:
Push num; duplicate; copy; add; create label; jump to label; print as char
Output:
Pollinium milk; plump pumpkin; lollipop?
Try it online as actual program using raw spaces, tabs, newlines.
See this Whitespace answer of mine for an explanation.
Input(s):
SSTTSNSNSTSSNTNST
Builtins used:
Push num; duplicate; multiply; print as int
Output:
4
Try it online as actual program using spaces, tabs, newlines, and comments.
Input(s):
SSTTNTNSTNNNTSNTNTSSS
Builtins used:
Push num; print as int; stop program; (no-ops)
Output:
-1
Try it online as actual program using raw spaces, tabs, newlines.
See this Whitespace answer of mine for an explanation.
Input(s):
SSSNSNSSNSTNTTTTTNSSNSNSTNSTSSSTSTSNTNSSSNTSSSTNTSSSSTSSTNSTSSTNSNSSSSTSSNTSTTSNSNTSSNSSSTNTSSTSNSNTSTNSSSTNTSSTNTSSSNSNTSTSSTNTSSTNSNNNSSSNSNNTSTSSTSSTSNSTSSTSNTSTTNTSNNNNNSSTNSNNTSSNNSNNNSSSSNTSSSNSNN
21
Builtins used:
Push num; duplicate; swap; copy; discard; add; subtract; multiply; divide; modulo; retrieve; create label; jump to label; jump to label if 0; stop program; read as int; print as char; print as int
Output:
21
21
23
20
5
25
31
24
3
27
37
26
Try it online as actual program using spaces, tabs, newlines, and comments.
See this Whitespace answer of mine for an explanation.
Input(s):
NSSNSSSNSNSTNTSTTTSNSSSSTSTSNTSSTNTSSNSSSNSNSTNTSTTTTSSTNTSNSSSNTNSTNSSSN
TThhiiss iiss ddoouubblee ssppeeaakk!!\n
Builtins used:
Push num; duplicate; subtract; retrieve; create label; jump to label if 0; read as char; read as int
Output:
0
Try it online as actual program using spaces, tabs, newlines, and comments.
See this Whitespace answer of mine for an explanation.
Inputs(s):
SSSNSNSTNTTTTTSSSNSNSTNTTTTTTSSSTNST
-3
5
Builtins used:
Push num; duplicate; add; retrieve; read as int; print as int
Output:
2
Try it online as actual program using spaces, tabs, newlines, and comments.
See this Whitespace answer of mine for an explanation.
Input(s):
SSSTSSSTTTTTSSSSSSSTTSSSTSTTTTTSSTTSNTNST
Builtins used:
Push num; print as int
Output:
4815162342
Try it online as actual program using spaces, tabs, newlines, and comments.
See this Whitespace answer of mine for an explanation.
Input(s):
SSSTTTNTNTSSSSTSSSSTNSSSTSTSNNSSTTNSSSTNTSSTSNSNTTSNSTSSTNSNTNSNTTNNSSSNSSSTTTNTTTSTNSSSTSTNTNSTSSSTSNNSSTNSNTSNNSNTTNSSSSSTSNTSTSSNSNTSSSNNSNTNNSSSSNNNN
卄
Builtins used:
Push num; duplicate; swap; discard; copy; slice; retrieve; create label; jump to label; jump to label if 0; jump to label if negative; exit program; read as char; print as char; print as int
Output:
12345!!
Try it online as actual program using spaces, tabs, newlines, and comments.
Inputs(s):
SSSTTNSNSSNSSNSTTSSSSTTTNTTSNSTNTNSSNNNSSTNNSSNTTTNTNSSTN
Builtins used:
Push num; duplicate; store; retrieve; call subroutine; return; exit program; print as char; (no-ops)
Output:
(character with unicode value 7)
Try it online as actual program using spaces, tabs, newlines, and comments.
3 Answers 3
Vim, 2366 bytes
:let@z=''
:%s/\S//eg
:%s/ /S/&
:%s/ /T/&
:%s/\n/N/&
$xo"Zyl:if"<C-v><C-r>z"=="SS"||"<C-v><C-r>z"=="STS"||"<C-v><C-r>z"=="STN"||"<C-v><C-r>z"=="NST"||"<C-v><C-r>z"=="NSN"||"<C-v><C-r>z"=="NTS"||"<C-v><C-r>z"=="NTT"<C-v>
norm fN<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="SNS"||"<C-v><C-r>z"=="SNT"||"<C-v><C-r>z"=="SNN"||"<C-v><C-r>z"=="TSSS"||"<C-v><C-r>z"=="TSST"||"<C-v><C-r>z"=="TSSN"||"<C-v><C-r>z"=="TSTS"||"<C-v><C-r>z"=="TSTT"||"<C-v><C-r>z"=="TTS"||"<C-v><C-r>z"=="TTT"||"<C-v><C-r>z"=="NTN"||"<C-v><C-r>z"=="TNSS"||"<C-v><C-r>z"=="TNST"||"<C-v><C-r>z"=="TNTS"||"<C-v><C-r>z"=="TNTT"||"<C-v><C-r>z"=="NNN"<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="NSS"<C-v>
norm hhxxrmfN<C-v>
let@z=''<C-v>
en<C-v>
l@t<Esc>^"tC"Zyl:if"<C-v><C-r>z"=="SS"<C-v>
norm @a<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="STS"<C-v>
norm @b<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="STN"<C-v>
norm @c<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="SNS"<C-v>
norm @d<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="SNT"<C-v>
norm @e<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="SNN"<C-v>
norm @f<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TSSS"<C-v>
norm @g<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TSST"<C-v>
norm @h<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TSSN"<C-v>
norm @i<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TSTS"<C-v>
norm @j<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TSTT"<C-v>
norm @k<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TTS"<C-v>
norm @l<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TTT"<C-v>
norm @m<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="NST"<C-v>
norm @o<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="NSN"<C-v>
norm @p<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="NTS"<C-v>
norm @q<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="NTT"<C-v>
norm @r<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="NTN"<C-v>
norm @s<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TNSS"<C-v>
norm @u<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TNST"<C-v>
norm @v<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TNTS"<C-v>
norm @w<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="TNTT"<C-v>
norm @x<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="m"<C-v>
norm fNl<C-v>
let@z=''<C-v>
elsei"<C-v><C-r>z"=="NNN"<C-v>
k<C-v>
el<C-v>
norm l<C-v>
en<C-v>
@n<Esc>^"nClylmc'sO<C-v><Esc>p:s/T/-/e<C-v>
:s/S/+/e<C-v>
"9xms`cytN'sp^x@y^"9PC<C-v><C-r>=<C-v><C-r>"<C-v>
<C-v><Esc>ms`cfNl<Esc>^"aCllytNmc'sO<C-v><Esc>p@y<C-v><C-a>^Dms@"jy$'spms`cfNl<Esc>^"bCllytNmc'so<C-v><Esc>p@y^DJ@"dd`cfNl<Esc>^"cCmc'sy$O<C-v><C-r>"<C-v><Esc>ms`cl<Esc>^"dCmc'sddpkms`cl<Esc>^"eCmc'sddms`cl<Esc>^"fCmc'sy$ddmsC<C-v><C-r>=<C-v><C-r>0+<C-v><C-r>"<C-v>
<C-v><Esc>`cl<Esc>^"gCmc'sy$ddmsC<C-v><C-r>=-<C-v><C-r>0+<C-v><C-r>"<C-v>
<C-v><Esc>`cl<Esc>^"hCmc'sy$ddmsC<C-v><C-r>=<C-v><C-r>"*<C-v><C-r>0<C-v>
<C-v><Esc>`cl<Esc>^"iCmc'sy$ddmsC<C-v><C-r>=<C-v><C-r>"/<C-v><C-r>0<C-v>
<C-v><Esc>`cl<Esc>^"jCmc'sy$ddmsC<C-v><C-r>=<C-v><C-r>"%<C-v><C-r>0<C-v>
<C-v><Esc>`cl<Esc>^"kCmc'sy$ddDJms'h:s/ <C-v><C-r>"(\d*)//eg<C-v>
A <C-v><C-r>"(<C-v><C-r>0)<C-v><Esc>`cl<Esc>^"lCmc'sy$'h/ <C-v><C-r>"(<C-v>
%yT('sC<C-v><C-r>0<C-v><Esc>`cl<Esc>^"mCfN:let@9=col('.')+1<C-v>
vTNy/m<C-v><C-r>"<C-v>
fNlmc'xO<C-v><Esc>"9pmx`c<Esc>^"oCfNvTNy/m<C-v><C-r>"<C-v>
fNl<Esc>^"pCmc'sDJ`c:if<C-v><C-r>"==0<C-v>
norm fNvTNlly/m<C-v><C-v><C-v><C-r>"<C-v><C-v><C-v>
fNl<C-v>
el<C-v>
norm fNl<C-v>
en<C-v>
<Esc>^"qC
mc'sDJ`c:if<C-v><C-r>"<0<C-v>
norm fNvTNlly/m<C-v><C-v><C-v><C-r>"<C-v><C-v><C-v>
fNl<C-v>
el<C-v>
norm fNl<C-v>
en<C-v>
<Esc>^"rCmc'xDJmx`c@"|<Esc>^"sCmc'sDJms`oA<C-v><C-r>=nr2char(<C-v><C-r>-)<C-v>
<C-v><Esc>mo`cl<Esc>^"uCmc'sDJms`o$"-pmo`cl<Esc>^"vCmc'iDJ'sO<C-v><C-r>=char2nr('<C-v><C-r>"')<C-v>
<C-v><Esc>ms`c@l<Esc>^"wCmc'iDJ'sO<C-v><C-r>"<C-v><Esc>ms`c@l<Esc>^"xC0i0<C-v><Esc>y$@=len("<C-v><C-r>"")<C-v>
I(<C-v><Esc>^x:s/S/*2)/eg<C-v>
:s/T/*2+1)/&<C-v>
^C<C-v><C-r>=<C-v><C-r>"<C-v>
<C-v><Esc><Esc>^"yDo<Esc>mxo
Stack:
<Esc>mso
Heap:
<Esc>mho
Input:
<Esc>mio
Output:
<Esc>mo:let@z=''
I sure do love me a thicc Vim program.
This program replaces all of the spaces, tabs, and newlines with the letters STN. This is only necessary for the newlines, because Vim is primarily a text editor and treats them differently, but it does it to each character for consistency. It then takes each Whitespace command and puts it into its own macro. Here's a list of all of the macros, and what command they map to. After creating all of the macros, it goes through a compilation phase, followed by an execution phase.
The compilation phase goes through the program, and for every NSS command, it replaces the NSS with an m, to be able to easily jump to it later.
The execution phase collects characters into the "z register, and once the register matches a command, the macro for that command is executed, and the "z register is cleared.
The stack is simply a bunch of lines containing a number, with the top of the stack having the label s. When we want to access the top of the stack, we just jump to the label, then we can use the c label to jump back to the code.
The heap is a line that stores many values in the form key(value). To set a value, the program deletes the matching key/value pair if it exists, then adds the key/value to the end of the heap. To get the value, it just does / key(, which finds the key, then it uses %yT( to copy only the value.
Input can be passed to the program by adding `ii<input> to the beginning of the footer. Each value must be on a separate line to work: Try it online! The gg@tgg@n at the end of the footer compiles and runs the program. To view only the output of the Whitespace program, add `idH to the end of the footer.
JavaScript (Node.js), (削除) 608 597 586 (削除ここまで) 551 bytes
Takes input as (source)(input), where input is an array containing single characters and/or integers.
s=>I=>eval([..."@CDEJKLMNQTUWYZ^_bkmqw"].reduce((s,c)=>(a=s.split(c)).join(a.pop()),"(R=X=>(H={},o=P=[],S=[],z=x=p=i=0,gUs[p]?~(j=` \n`.indexOf(s[p++]N?j:gK:3,hUgK<2?Mn*2+j):V=n,GUx=z--&&S.pop(y=x),FUg(x=`mbQQ$mS[z+~(b])TSCce((z-=V=b-1,V)$mkJLmyJkTTLH[x]=y$mHD)qR[M1)]=_?R[P@p),V]:_Y_&!kY_&k<0Yp$E?P.popK:pT$^QQQQQqW+yw-yw*yw/y|0w%yN:^QTTo+=Buffer(D)$o+=k$Z.codePointAtKZ`Ct`$`[n-9],x&&eval(x,n=1N<3&&F(n*3+jN(1N(0)||R(1)||owN:^$WqTT$mz=S@kGKbgK?-M0):M0N_p$M1),E^E?g:pZ$HD=I[i++]Y?R[V]:Wz>1?(LmxU=n=>T$$QqqN))Mh(Lk,k,K()J),mx)$Ep=XD[k][email protected]"))
Try it online! (all test cases without comments)
Try it online! (an example with a commented source code as input)
How?
Because the code is packed with RegPack, it is more efficient to repeat the same pieces of code over and over rather than defining too many helper functions.
The sequences of spaces, tabs and linefeeds are converted to an integer in base 3 with a leading \1ドル\$ until they match a known value:
seq. | mnemonic | base 3 | decimal
------+----------+--------+---------
SS | PSH | 100 | 9
STS | CPY | 1010 | 30
STN | SLI | 1012 | 32
SNS | DUP | 1020 | 33
SNT | SWP | 1021 | 34
SNN | DIS | 1022 | 35
------+----------+--------+---------
TTS | PUT | 1110 | 39
TTT | GET | 1111 | 40
------+----------+--------+---------
NSS | LBL | 1200 | 45
NST | JSR | 1201 | 46
NSN | JMP | 1202 | 47
NTS | JZE | 1210 | 48
NTT | JMI | 1211 | 49
NTN | RTN | 1212 | 50
NNN | END | 1222 | 53
------+----------+--------+---------
TSSS | ADD | 11000 | 108
TSST | SUB | 11001 | 109
TSSN | MUL | 11002 | 110
TSTS | DIV | 11010 | 111
TSTT | MOD | 11011 | 112
------+----------+--------+---------
TNSS | CHR | 11200 | 126
TNST | INT | 11201 | 127
TNTS | RCH | 11210 | 129
TNTT | RNU | 11211 | 130
The commands are simply stored in a lookup table and executed with eval().
Everything is wrapped within the function \$R\$ which is called twice:
- First pass (\$X=0\$): the jumps and the stack errors are ignored so that the whole code is parsed linearly and all labels are stored
- Second pass (\$X=1\$): the code is executed normally
Unpacked and formatted
s => I => (
R = X => (
H = {},
o = P = [],
S = [],
z = x = p = i = 0,
g = n => s[p] ? ~(j = ` \t\n`.indexOf(s[p++])) ? j : g() : 3,
h = n => g() < 2 ? h(n * 2 + j) : V = n,
G = n => x = z-- && S.pop(y = x),
F = n => g(
x = [
/* PSH */ "z=S.push(g()?-h(0):h(0))",,,,,,,,,,,,,,,,,,,,,
/* CPY */ "z=S.push(S[z+~(g()?-h(0):h(0))])",,
/* SLI */ "S.splice((z-=V=g()?-h(0):h(0))-1,V)",
/* DUP */ "z=S.push(G()),z=S.push(x)",
/* SWP */ "G(),G(),z=S.push(y),z=S.push(x)",
/* DIS */ "G()",,,,
/* PUT */ "G(),G(),H[x]=y",
/* GET */ "z=S.push(H[G()])",,,,,
/* LBL */ "R[h(1)]=p",
/* JSR */ "h(1),p=X?R[P.push(p),V]:p",
/* JMP */ "h(1),p=X?R[V]:p",
/* JZE */ "h(1),p=X&!G()?R[V]:p",
/* JMI */ "h(1),p=X&G()<0?R[V]:p",
/* RTN */ "p=X?P.pop():p",,,
/* END */ "p=X?g:p",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
/* ADD */ "z>1?(G(),G(),z=S.push(x+y)):p=X?g:p",
/* SUB */ "z>1?(G(),G(),z=S.push(x-y)):p=X?g:p",
/* MUL */ "z>1?(G(),G(),z=S.push(x*y)):p=X?g:p",
/* DIV */ "z>1?(G(),G(),z=S.push(x/y|0)):p=X?g:p",
/* MOD */ "z>1?(G(),G(),z=S.push(x%y)):p=X?g:p",,,,,,,,,,,,,,
/* CHR */ "o+=Buffer([G()])",
/* INT */ "o+=G()",,
/* RCH */ "H[G()]=I[i++].codePointAt()",
/* RNU */ "H[G()]=I[i++]"
][n - 9],
x && eval(x, n = 1)
) < 3 && F(n * 3 + j)
)(1)
)(0) || R(1) || o
-
\$\begingroup\$ This is awesome :D \$\endgroup\$RGS– RGS2020年02月02日 23:27:07 +00:00Commented Feb 2, 2020 at 23:27
Pip Classic, (削除) 441 (削除ここまで) (削除) 399 (削除ここまで) 377 bytes
,:mn:@>gs:lt:v**@_*FBTM_a:(Ja@wTR"
"012`(00|01.|20.|21[01]).*?2|1[02]..|...`)M{!a@<2?"lPU(t".a@>2.')a@<3=10?"lPUl@(t".a@>3.')a@<3=12?"l:@lALl@>++(t".a@>3.')2=@a?("7080I!370I3<z70i:POs00sPUi7"^0a@<3%13).a@>3a=20?"lPU@l"a=21?"S@ll@o"a=22?3a=110?"Y36"a=111?"lPUm@3"a@1?("O C30O30Y APO@60YPO6"^0a)98J['*x"//"'%'+'-]@a}Wi<#a{Va@iR9"Y3l?lPU3"R8"y;Vk"R7"i:a@?"R6"nm@3:y"R3C`POl`++i}
Takes the Whitespace code as its first command-line argument, and the arguments to the Whitespace program as the following command-line arguments. (A string of characters should be given as a single argument.) The heap has a fixed size of 1000 entries.
How?
The broad-strokes version:
- Strip non-whitespace characters from code
- Transliterate space, tab, and newline to 0, 1, and 2
- Use regex to break the program into a list of valid statements
- Translate each statement into Pip code
- While the program counter (initially set to 0) is less than the length of the program, execute the statement at that index and increment the program counter
When the program needs to halt, either because the Halt instruction was executed or because there were not enough operands for an arithmetic operation, it does so by throwing an error using the idiom Vk (explained in this tip).
Ungolfed & commented
Here's the ungolfed version that I started from. Note: this is written in modern Pip, so it won't work on TIO. I've restructured the golfed version a bit, particularly to compress some of the cases in the translation section, but the basic approach is the same.
;;;;;;;;;;;
;; Setup ;;
;;;;;;;;;;;
; Use S/T/N temporarily for ease of development
$codepage : "STN"
; Stack starts out empty
$stack : []
; Heap size is 1000 (initial values don't matter)
$heap : u RL 1000
; Whitespace code is first command-line argument
$code : a
; Program inputs are remaining command-line args
$inputs : g @> 1
; Program counter starts at 0
$pc : 0
; To handle subroutines, we need to store a stack of return-to line numbers
$return_addrs : []
; Function to decode number literals
$to_int : {
$magnitude : FB TM a
$sign : (-1) E @a
$sign * $magnitude
}
;;;;;;;;;;;;;
;; Parsing ;;
;;;;;;;;;;;;;
; Remove characters from code that aren't in codepage
$code FI: _ N $codepage
; Transliterate code from codepage to 0 (space), 1 (tab), and 2 (newline)
$code TR: $codepage 012
; Split code into separate instructions
$command_rgx : `(00|01.|20.|21[01]).*?2|1[02]..|...`
P $code : (J $code) @ $command_rgx
; Translate each instruction into Pip commands
$program : $code M {
$instr : a
; Push a number
$instr @< 2 Q 00 ? "$stack PU " . ($to_int $instr @> 2)
; Copy nth number on stack
$instr @< 3 Q 010 ? "$stack PU $stack @ " . ($to_int $instr @> 3)
; Slide n numbers off from under top of stack
$instr @< 3 Q 012 ? "$stack : $stack @ 0 AL $stack @> " . 1 + ($to_int $instr @> 3)
; Label (no-op)
$instr @< 3 Q 200 ? $instr @> 3
; Gosub label
$instr @< 3 Q 201 ? "$return_addrs PU $pc; $pc : $program @? " . $instr @> 3
; Goto label
$instr @< 3 Q 202 ? "$pc : $program @? " . $instr @> 3
; If top of stack is zero, goto label
$instr @< 3 Q 210 ? "I PO $stack = 0 $pc : $program @? " . $instr @> 3
; If top of stack is negative, goto label
$instr @< 3 Q 211 ? "I PO $stack < 0 $pc : $program @? " . $instr @> 3
; Dup
$instr Q 020 ? "$stack PU $stack @ 0"
; Swap
$instr Q 021 ? "$stack @ 0 :: $stack @ 1"
; Drop
$instr Q 022 ? "PO $stack"
; Add
$instr Q 1000 ? "Y PO $stack; $stack ? $stack PU (PO $stack) + y; V k"
; Subtract
$instr Q 1001 ? "Y PO $stack; $stack ? $stack PU (PO $stack) - y; V k"
; Multiply
$instr Q 1002 ? "Y PO $stack; $stack ? $stack PU (PO $stack) * y; V k"
; Divide
$instr Q 1010 ? "Y PO $stack; $stack ? $stack PU (PO $stack) // y; V k"
; Modulo
$instr Q 1011 ? "Y PO $stack; $stack ? $stack PU (PO $stack) % y; V k"
; Store top of stack on heap at second-on-stack address
$instr Q 110 ? "Y PO $stack; $addr : PO $stack; $heap @ $addr : y"
; Recall number from heap at top-of-stack address
$instr Q 111 ? "$stack PU $heap @ PO $stack"
; Print as char
$instr Q 1200 ? "O C PO $stack"
; Print as number
$instr Q 1201 ? "O PO $stack"
; Read char and store it on heap at top-of-stack address
$instr Q 1210 ? "Y A PO $inputs @ 0; $addr : PO $stack; $heap @ $addr : y"
; Read number and store it on heap at top-of-stack address
$instr Q 1211 ? "Y PO $inputs; $addr : PO $stack; $heap @ $addr : y"
; Return from subroutine
$instr Q 212 ? "$pc : PO $return_addrs"
; Halt
$instr Q 222 ? "V k"
; Unrecognized instruction
""
}
P $program
P "------------"
;;;;;;;;;;;;;;;
;; Execution ;;
;;;;;;;;;;;;;;;
; Loop while the program counter hasn't gone past the end of the program
W $pc < # $program {
; Eval the current statement
V $program @ $pc
; Increment the program counter
U $pc
}
Explore related questions
See similar questions with these tags.
0is just the shortest, and in most cases you don't need to re-use the heap too much. \$\endgroup\$