15
\$\begingroup\$

What general tips do you have for golfing in AWK? I'm looking for ideas that can be applied to code golf problems in general that are at least somewhat specific to AWK. Please post one tip per answer.

asked Oct 26, 2013 at 5:01
\$\endgroup\$
0

6 Answers

13
\$\begingroup\$

These are not in any particular order, and some of them might even apply to other languages too, but not to a lot of them I think.

  • You can sometimes use awk's simple string concatenation to assemble numbers

    z=x*10+y and a=b*10

    become

    z=x y and a=b 0

  • You can use the ~ operator (match pattern on the right side) to compare numbers or strings if you know that the two parts meet certain conditions

    for instance if you know that a will always be smaller than or equal to b, you can replace

    a==b

    with

    a~b

    or if you want to check for an empty string s (which means that s isn't allowed to be "0" either), and you haven't changed the variable X before, so that it is still undefined, you could use

    X~s

    instead of

    s==""

    if you want to check if i is an integer you can use

    i!~/\./ (i doesn't contain a dot)

    instead of

    i==int(i)

  • You can swap two numbers in a single command

    t=a;a=b;b=t

    becomes

    a+=b-(b=a)

    saving one character

  • If you don't need them as input anymore, you can use the input variables $1, $2, $3, ... as an array. So instead of writing a[n] you can just write $n. Sometimes, even while still reading the input from these, you can use the already handled ones as a stack for something.

  • You can make good use of the built-in separators FS (space, if not changed by you) and RS (newline) when concatenating strings

    a=$1" "$2

    becomes

    a=$1FS$2

    or even better

    a=$1"\n"$2

    becomes

    a=$1RS$2

  • If there is no input and you use the BEGIN block, you can try the END block instead, which is two characters shorter. Whether this works depends on the judge. If you test it interactively, you have to press Ctrl-D (end of input) to reach the END block.

  • If you want to skip the first line, you can use

    C++

    instead of

    NR>1

    if you won't be using C anywhere in your program.
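
A few of the tips above can be checked straight from a shell. This is only a minimal sketch: the variable values and sample lines are made up for illustration, and it assumes a POSIX-style awk such as gawk or mawk.

```shell
# Concatenation instead of arithmetic: x y equals x*10+y when y is a single digit.
awk 'BEGIN{x=4;y=2;print x y,x*10+y}'

# C++ is 0 (false) on the first record and non-zero afterwards, so the header is skipped.
printf 'header\nrow1\nrow2\n' | awk 'C++'

# FS and RS stand in for " " and "\n" without needing quotes.
echo 'foo bar' | awk '{print $1FS$2;print $1RS$2}'
```

The first command should print the same number twice, the second should drop only the header line, and the third should print the two words once on one line and once on two lines.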

answered Sep 5, 2015 at 9:59
\$\endgroup\$
5
\$\begingroup\$

Techniques

  • Thanks to Peter Taylor for pointing out the language-generic version of this question for which I did not bother to search. I'll try to limit this answer to things that are not available in other C-like languages.

  • The frequently accessed built-in variables are 2 characters long. Find the break-even point for assigning one to a temp variable: a=$1; costs five characters and every later use of a saves one, so it breaks even after five uses of $1.

  • Remember that the default action for a matching pattern is print $0, so if you get what you want printed into $0, just use

    1
    
  • Getting creative with operator ordering can save you some parentheses or extra statements

    for(i=1;i<=NF;i++)print$i;
    

    vs

    for(;i<NF;)print$++i;
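
The last two points can be tried on a throwaway line. A minimal sketch, assuming gawk or mawk; the input strings are invented for the demonstration.

```shell
# A bare 1 is an always-true pattern with the default action, i.e. print $0.
echo 'x y' | awk '{$2="z"}1'

# Pre-increment inside the field reference replaces the explicit counter loop.
echo 'a b c' | awk '{for(;i<NF;)print$++i}'
```

Note that i is never reset in the short loop, so as written it only walks the fields of the first record; that is fine for single-line input but worth keeping in mind otherwise.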
    

Examples

  • A solution to Do We Sink or Swim? in 70 characters:

    n{for(;NF;NF--)s+=$NF;n--}NR==1{n=$1;p=$3}END{print p<s?"Swim":"Sink"}
    
  • A solution to The Floating Horde in 44 characters:

    {for(i=NF;i>1;){n=int($i/2);$i%=2;$--i+=n}}1
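
To see what the second one-liner does to a record, here is a trace on a hypothetical input line (1 2 3 is an assumption for illustration; the challenge's real inputs are not shown here). Working from the last field to the second, each field keeps its remainder mod 2 and passes half of its value to the field on its left.

```shell
# i=3: n=1, $3 becomes 1, $2 becomes 3
# i=2: n=1, $2 becomes 1, $1 becomes 2
echo '1 2 3' | awk '{for(i=NF;i>1;){n=int($i/2);$i%=2;$--i+=n}}1'
```

The $--i+=n part is where the operator ordering pays off: the decrement, the field lookup, and the addition all happen in one expression.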
    

Summary

awk will probably never outgolf APL, Golfscript, J, or K, but you can quite consistently beat other high level languages.

answered Mar 17, 2014 at 17:50
\$\endgroup\$
5
  • 3
    \$\begingroup\$ The question asks for tips which are somewhat specific to Awk. The ternary operator is included in the tips for all languages. \$\endgroup\$ Commented Mar 17, 2014 at 20:35
  • \$\begingroup\$ @laindir : for(;i<NF;)print$++i; can become NF=NF OFS='\n' \$\endgroup\$ Commented Nov 1, 2022 at 14:13
  • 1
    \$\begingroup\$ @laindir : '{for(_=NF;1<_;)$--_+=($_-($_%=2))/2}_' i got the floating horde one down to 37 bytes \$\endgroup\$ Commented May 6, 2024 at 22:00
  • \$\begingroup\$ as for the sequential printing one, I think '{while(_++<NF)print$_}' is same # bytes but more elegant \$\endgroup\$ Commented May 6, 2024 at 22:04
  • \$\begingroup\$ for(;i<NF;)print$++i; - that's kinda verbose for basically doing awk 'NF+=OFS=RS' \$\endgroup\$ Commented Mar 29 at 0:44
4
\$\begingroup\$

One of the great things about awk's $ sigil is that string concat can be condensed even further than by using the built-ins as buffer zones. Say you want to prepend a zero to the full row:

 _=0
$_=_$_
jot -c 8 75 | gawk '$_=+_$_' 
0K
0L
0M
0N
0O
0P
0Q
0R

And if you want to make patterns out of it:

jot -c 8 75 | gawk '$_=_++$_' # integers
0K
L 1
M 2
N 3
O 4
P 5
Q 6
R 7
jot -c 8 75 | gawk '$_=_++$++_' # even numbers
K 0
L 2
M 4
N 6
O 8
P 10
Q 12
R 14
jot -c 8 75 | gawk '$_=++_$++_' # odd numbers
K 1
L 3
M 5
N 7
O 9
P 11
Q 13
R 15

And honestly, what language can repeat strings THIS easily:

jot 20 | mawk 'NF=OFS=$_'
1
22
333
4444
55555
666666
7777777
88888888
999999999
10101010101010101010
1111111111111111111111
121212121212121212121212
13131313131313131313131313
1414141414141414141414141414
151515151515151515151515151515
16161616161616161616161616161616
1717171717171717171717171717171717
181818181818181818181818181818181818
19191919191919191919191919191919191919
2020202020202020202020202020202020202020
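
As far as I can tell, the repetition works like this: _ is unset, so $_ is $0; OFS is set to the line itself, and assigning that same value to NF makes awk rebuild the record with that many fields joined by OFS. The extra fields are empty, so the result is just the line repeated. A smaller sketch of the same trick, using printf in place of jot (an assumption, for systems without jot) and tried with gawk and mawk in mind:

```shell
# For the line "3": OFS="3", NF=3, so $0 is rebuilt as "3" OFS "" OFS "" = "333".
printf '1\n2\n3\n' | awk 'NF=OFS=$_'
```

The assignment also serves as the pattern, so the rebuilt line is printed whenever the value is non-zero.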

Or decode arbitrary-precision hex with this few keystrokes:

echo 0xEDCFAB12EDCFAB127659438976594389EDCFAB | 
gawk -nM '$_=+$_'
5303367068685265828195859270035065456131166123
answered Nov 1, 2022 at 11:41
\$\endgroup\$
3
\$\begingroup\$

To read and process a number on each line:

{
 n=$1;
 print(n*n);
 # OR printf("%d\n",n*n);
}

Compressed form (Length = 14):

{print($1*$1)} # thanks due to @manatwork

Shorter Code (Length = 7)

1,$0^=2 # thanks due to @llhuii

When run in gawk with inputs:

1
2
3

Output:

1
4
9
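
The 7-character version relies on a range pattern: the start expression 1 is always true, and the end expression $0^=2 squares the line as a side effect. Whether or not the end expression is truthy, the current line is still inside the range and gets printed. A minimal sketch comparing it with the bare assignment, on made-up input that includes a zero:

```shell
# Range pattern: every line is printed, including the one that squares to 0.
printf '3\n0\n2\n' | awk '1,$0^=2'

# Bare assignment as the pattern: a result of 0 is falsey, so that line is dropped.
printf '3\n0\n2\n' | awk '$0^=2'
```

So the two extra characters in 1, only matter when the input may contain zeros or blank lines.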
answered Oct 26, 2013 at 5:03
\$\endgroup\$
6
  • 2
    \$\begingroup\$ Why the variable? And why the parenthesis? {print$1*$1} is shorter. \$\endgroup\$ Commented Oct 26, 2013 at 11:02
  • 6
    \$\begingroup\$ 1,$0^=2 is shorter \$\endgroup\$ Commented Dec 8, 2013 at 7:12
  • \$\begingroup\$ For any input different from 0 or null, $0^=2 does the same. Careful that when using 1,$0^=2, blank lines return 0. \$\endgroup\$ Commented Dec 26, 2020 at 16:11
  • \$\begingroup\$ @PedroMaimere : this solves it echo "1\n2\n3\n\n4\n5" | mawk '!NF || ($!_^=2)^_' 1 4 9 16 25 .... or not write any numbers at all mawk '!NF || ($!_*=$!_)^_' \$\endgroup\$ Commented Nov 1, 2022 at 11:46
  • \$\begingroup\$ @PedroMaimere : better yet : echo "0\n1\n2\n3\n\n4\n5" | mawk '!NF || ($!_^=++NF)^_' 0 1 4 9 16 25 - actual input lines containing a zero will print out zero squared, while blank lines remain as such. u can directly square hexes with mawk : echo 0x4F0FFF9 | mawk '($!_^=++NF)^_' CONVFMT='%.f' 6872912880599089 \$\endgroup\$ Commented Nov 1, 2022 at 11:49
2
\$\begingroup\$

Regarding string concat freebies in awk, there are 5 scenarios where gapless concat is guaranteed to be safe (the first 4 examples prepend or append an arbitrary string held in the awk variable __):

  1. Immediately trailing numbers: e.g. 367 __ -> 367__

    - ditto for fields referenced by digits, e.g. $19 __ -> $19__

  2. Immediately before fields/sigils: e.g. __ $NF -> __$NF

    - to concatenate with ++$NF or --$NF, write (__)++$NF instead, since __++$NF would parse as a post-increment of __.

  3. Immediately trailing array cells: e.g. ___[_] __ -> ___[_]__

    - this is mostly useful when performing an array join with separators

  4. Immediately trailing a closing parenthesis ) (grouping pairs or function calls): e.g. split(...) __ -> split(...)__, or (sumN - cntN) __ -> (sumN - cntN)__

    - the primary use case is to concat an empty string after the grouping, which force-converts a numeric zero (0) to the string zero ("0") so that it evaluates to TRUE in any boolean context or pattern space. The alternative would be extra logic and verbosity to handle the edge case.

  5. The strangest one: conjuring up an arbitrary chain of digits by concatenating the same variable repeatedly while altering its value along the way:
gawk -p- -be 'BEGIN { print _++_!_--_!_++_++_^_^_^_, _ }' 
 
01001165536 2
 # gawk profile, created Fri Mar 28 21:29:37 2025
 # BEGIN rule(s)
 BEGIN {
 1 print _++ _ !_-- _ !_++ _++ _^_^_^_, _
 }

— By end of the sequence, _ only has a measly value of 2, since it never stored 65,536 back into itself
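
The zero-to-string conversion from scenario 4 is easy to check in isolation. A minimal sketch with a made-up variable x; it uses a literal "" where the list above uses a variable.

```shell
# (x-5) is numeric 0 and therefore false; (x-5)"" is the string "0" and therefore true.
awk 'BEGIN{x=5;if(x-5)print"numeric";if((x-5)"")print"string"}'
```

Only the second branch fires, which is what lets a legitimate result of 0 survive a truthiness test.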

answered Mar 29 at 2:27
\$\endgroup\$
1
\$\begingroup\$

Truthy and falsey values

Boolean evaluation is somewhat flexible in AWK, and this is awesome. Remember: AWK's basis is pattern{action}; when pattern is true, it executes action.

Besides doing their business, some built-in functions return values, e.g., split(), gsub(). They are also useful as a pattern when manipulating the input.
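
For instance, gsub() returns the number of replacements it made, so the call itself can serve as the pattern. A minimal sketch with invented input lines:

```shell
# Lines where at least one replacement happened are printed (already modified);
# lines with no match return 0 and are skipped.
printf 'banana\ncherry\n' | awk 'gsub(/a/,"o")'
```

This folds the test, the edit, and the print into a single expression.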

Examples of truthy and falsey variables

I tried to come up with valuable examples of how to exploit truthy/falsey variables. This is non-exhaustive. Anyone is encouraged to share more examples, and I would add them to this list.


"strings are truthy"{print 1} # truthy; strings are always truthy, except for the null string
-0xF3e10{print 2} # truthy; numbers different from zero are truthy
0{print 3} # falsey; zeroes are false: 0, -0, +0, 0x0, 0000... except for "0", which is a string
0 b{print 4} # truthy; now the number 0 is concatenated to a null string (a variable still not assigned), thus converting it to "0"
""{print 5} # falsey; null string
b{print 6} # falsey; b is null (a variable not assigned)
a=@/x/{print 7} # truthy; defining a strongly typed regex constant. Different from a /x/ pattern
/x/{print 8} # falsey; /x/ does not match the input
c["test"]{print 9} # falsey; undefined item of array, null
c["test"]++{print 10} # falsey; variable is evaluated before increment
c["test"]{print 11} # truthy; now, c["test"] equals 1
c["test2"]{print 12} # falsey; although the c array now exists, the "test2" element does not

Result:

1
2
4
7
11
answered May 22, 2021 at 7:06
\$\endgroup\$
