Skip to main content
We’ve updated our Terms of Service. A new AI Addendum clarifies how Stack Overflow utilizes AI interactions.
Code Golf

You are not logged in. Your edit will be placed in a queue until it is peer reviewed.

We welcome edits that make the post easier to understand and more valuable for readers. Because community members review edits, please try to make the post substantially better than how you found it, for example, by fixing grammar or adding additional resources and hyperlinks.

Required fields*

Required fields*

High throughput Fizz Buzz

Fizz Buzz is a common challenge given during interviews. The challenge goes something like this:

Write a program that prints the numbers from 1 to n. If a number is divisible by 3, write Fizz instead. If a number is divisible by 5, write Buzz instead. However, if the number is divisible by both 3 and 5, write FizzBuzz instead.

The goal of this question is to write a FizzBuzz implementation that goes from 1 to infinity (or some arbitrary very very large number), and that implementation should do it as fast as possible.

Checking throughput

Write your fizz buzz program. Run it. Pipe the output through <your_program> | pv > /dev/null. The higher the throughput, the better you did.

Example

A naive implementation written in C gets you about 170MiB/s on an average machine:

#include <stdio.h>
int main() {
 for (int i = 1; i < 1000000000; i++) {
 if ((i % 3 == 0) && (i % 5 == 0)) {
 printf("FizzBuzz\n");
 } else if (i % 3 == 0) {
 printf("Fizz\n");
 } else if (i % 5 == 0) {
 printf("Buzz\n");
 } else {
 printf("%d\n", i);
 }
 }
}

There is a lot of room for improvement here. In fact, I've seen an implementation that can get more than 3GiB/s on the same machine.

I want to see what clever ideas the community can come up with to push the throughput to its limits.

Rules

  • All languages are allowed.
  • The program output must be exactly valid fizzbuzz. No playing tricks such as writing null bytes in between the valid output - null bytes that don't show up in the console but do count towards pv throughput.

Here's an example of valid output:

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz
# ... and so on

Valid output must be simple ASCII, single-byte per character, new lines are a single \n and not \r\n. The numbers and strings must be correct as per the FizzBuzz requirements. The output must go on forever (or a very high astronomical number, at-least 2^58) and not halt / change prematurely.

  • Parallel implementations are allowed (and encouraged).

  • Architecture specific optimizations / assembly is also allowed. This is not a real contest - I just want to see how people push fizz buzz to its limit - even if it only works in special circumstances/platforms.

Scores

Scores are from running on my desktop with an AMD 5950x CPU (16C / 32T). I have 64GB of 3200Mhz RAM. CPU mitigations are disabled.

**By far the best score so far is in C++ by @David Frank - generating FizzBuzz at around 1.7 Terrabit/s

At the second place - @ais523 - generating FizzBuzz at 61GiB/s with x86 assembly.

Results for java:

Author Throughput
ioan2 2.6 GiB/s
randy 0.6 GiB/s
randy_ioan 3.3 GiB/s
ioan 4.6 GiB/s
olivier 5.8 GiB/s

Results for python:

Author Throughput
bconstanzo 0.1 GiB/s
michal 0.1 GiB/s
ksousa_chunking 0.1 GiB/s
ksousa_initial 0.0 GiB/s
arrmansa 0.2 GiB/s
antoine 0.5 GiB/s

Results for pypy:

Author Throughput
bconstanzo_pypy 0.3 GiB/s

Results for rust:

Author Throughput
aiden 3.4 GiB/s
xavier 0.9 GiB/s

Results for ruby:

Author Throughput
lonelyelk 0.1 GiB/s
dgutov 1.7 GiB/s

Results for asm:

Author Throughput
ais523 60.8 GiB/s
paolo 39.0 GiB/s

Results for julia:

Author Throughput
marc 0.7 GiB/s
tkluck 15.5 GiB/s

Results for c:

Author Throughput
random832 0.5 GiB/s
neil 8.4 GiB/s
kamila 8.4 GiB/s
xiver77 20.9 GiB/s
isaacg 5.7 GiB/s

Results for cpp:

Author Throughput
jdt 4.8 GiB/s
tobycpp 5.4 GiB/s
david 208.3 GiB/s

Results for numba:

Author Throughput
arrmansa 0.1 GiB/s

Results for numpy:

Author Throughput
arrmansa 0.3 GiB/s
arrmansa-multiprocessing 0.7 GiB/s
arrmansa-multiprocessing-2 0.7 GiB/s

Results for go:

Author Throughput
Bysmyyr 3.7 GiB/s
psaikko 6.8 GiB/s

Results for php:

Author Throughput
no_gravity 0.5 GiB/s

Results for elixir:

Author Throughput
technusm1 0.3 GiB/s

Results for csharp:

Author Throughput
neon-sunset 1.2 GiB/s

asm submissions java submissions pypy submissions python submissions ruby submissions rust submissions c submissions cpp submissions julia submissions numba submissions numpy submissions go submissions php submissions csharp submissions

Plots generated using https://github.com/omertuc/fizzgolf

Answer*

Draft saved
Draft discarded
Cancel
7
  • \$\begingroup\$ Wouldn't memcpy(p,"Fizz\n",6); p+=5 be faster than sprintf? \$\endgroup\$ Commented Jan 19, 2021 at 22:05
  • 1
    \$\begingroup\$ Optimal with OMP_NUM_THREADS=1 🤔. @EasyasPi tried that, didn't make a noticeable difference \$\endgroup\$ Commented Jan 19, 2021 at 22:22
  • \$\begingroup\$ @EasyasPi, ITYM memcpy(p,"Fizz\n",5); to avoid pointlessly writing the null character each time? memcpy/sprintf tuning makes no measurable difference, as that's only the setup, outside the main loop. sprint() makes for more maintainable code. (I'm new to programming for all-out speed; in my day job that comes in third, after robustness and maintainability) \$\endgroup\$ Commented Jan 20, 2021 at 7:24
  • \$\begingroup\$ @Omer: yes, the poor parallelisation was a disappointment. I tried some other parallelisations, too (separate array of formatted numbers, and putting the serialisation and writing into its own thread). If I'm to get any real benefit, I might have to go low-level and hand-craft the threads and their synchronisation (probably switch to C++ for that). \$\endgroup\$ Commented Jan 20, 2021 at 7:31
  • \$\begingroup\$ I think I manage slightly better if I have atomic_int shuffle instead of the reduction. (time passes...) Yes, and I've updated to code that actually works faster in parallel, at last! \$\endgroup\$ Commented Jan 20, 2021 at 7:39

AltStyle によって変換されたページ (->オリジナル) /