I know that some languages like APL have a dedicated NAND operator, but I'm thinking about languages like C, C++, Java, Rust, Go, Swift, Kotlin, even instruction sets, etc. since these are the languages which actually implement golfing languages.
As far as I know, modern processors are made up of NAND gates, since any logic gate can be implemented as a combination of NAND gates. For example, processors compose NAND gates like this to create these operators:
A AND B == ( A NAND B ) NAND ( A NAND B )
A OR B == ( A NAND A ) NAND ( B NAND B )
NOT A == ( A NAND A )
So, it just makes sense to me that it would be really efficient for a program to say "Hey, I have a NAND operation for you; please pass it through only one gate and store the result in a register." However, as far as I'm aware, no such operator exists in actual compiled languages.
I've seen arguments saying that these languages don't contain a dedicated NAND operator since C = A NAND B can be expressed as these:
C = NOT ( A AND B )
C = ( NOT A ) OR ( NOT B )
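For concreteness, here is a minimal C sketch of those two spellings using bitwise operators (the function names are just illustrative):

#include <stdint.h>

// NAND spelled as NOT ( A AND B )
uint32_t nand_via_and(uint32_t a, uint32_t b) {
    return ~(a & b);
}

// NAND spelled as ( NOT A ) OR ( NOT B ), the other De Morgan form
uint32_t nand_via_or(uint32_t a, uint32_t b) {
    return ~a | ~b;
}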
The problem I see is that, since each of these operators is implemented in the processor as a combination of NANDs, expressing NAND through these compositions might have them executed like this:
; C = NOT ( A AND B )
X = A NAND B
X = X NAND X
C = X NAND X
; alternatively:
C = ( ( A NAND B ) NAND ( A NAND B ) ) NAND ( ( A NAND B ) NAND ( A NAND B ) )
; C = ( NOT A ) OR ( NOT B )
X = A NAND A
Y = B NAND B
C = ( X NAND X ) NAND ( Y NAND Y )
; alternatively:
C = ( ( A NAND A ) NAND ( A NAND A ) ) NAND ( ( B NAND B ) NAND ( B NAND B ) )
This seems silly and inefficient to me! Just look at the first line of the implementation of C = NOT ( A AND B ): it gives us the result right away! But we toss it back and forth a few times after that anyway to return to the result we started with.
I know each of these gates is a single processor instruction, but wouldn't that instruction be so much more efficient if it only passed through one gate rather than several?
I assume and pray that some compiler or something is smart enough to optimize this (after all, I see NAND in some IBM instruction sets), but I haven't heard of higher-level languages having an operator that explicitly invokes it. My question is why not just let the dev specify "Here is a NAND operation; just do that"?
--
Related:
- Why is there no nand instruction in modern CPUs? (about the circuits within a CPU, not languages)
- Why do higher-level languages have neither xor nor nand short-circuit operators? (about missing operators in general, not this specific one; also about short-circuiting)
5 Answers
There are so many levels in between the source code of some program and the circuits on a chip. While some microarchitecture details definitely have a high-level performance impact, the ability to specify a NAND operation wouldn't be one of them: high-level programs have little connection to the circuits on a chip.
First, our program must be converted to machine code. This might happen through an ahead-of-time compiler, or through a JIT compiler immediately before execution, or the program might be interpreted without compilation. All of these approaches already introduce substantial overhead that goes beyond what would be saved by a simpler circuit.
When the CPU executes machine code, the machine code is handled in a pipeline. Instructions in this pipeline are decoded and partially executed ahead of time. Actually, the machine code instructions are usually decoded into microcode, a CPU-internal instruction set. And then finally the microcode is dispatched on specialized hardware circuits. This decoding is controlled largely by firmware, i.e. the CPU performs a hardware-assisted just-in-time re-compilation. The point is that every instruction is touched by millions of transistors before it is actually executed.
Per the instruction set specification, the CPU has a certain number of cycles to execute a machine code operation. This number depends on the complexity of the operation. On this scale, an AND and NAND operation would likely take the same number of cycles.
However, adding a NAND instruction could still be worth it e.g. if this would save power or lead to more compact programs. But those are fairly big assumptions that have to be balanced with the complexity of the instruction set. Every instruction complicates the decoding pipeline. Adding very small instructions is not generally worth it.
Of course, the decoding pipeline might recognize a pattern like and a, b; not a and issue a NAND microcode instruction in its place. But again: this likely wouldn't save any cycles, it would make the CPU more complicated, and in any case it would be unobservable from the outside. I don't know if modern CPUs have such microcode, but I'd bet they have better uses for that die space than a somewhat rare operation that can be easily emulated through other logical operations.
- Worth mentioning that this is true for most commercial CPUs, i.e. those targeting mainstream consumer and industry needs. But systems implementing this operator (and more complex ones) aren't that rare. During the past 5 years we have seen ASICs increase in popularity due to the cryptos, but they have been there for a while. As this answer points out, efficiency and power saving are among the driving forces that make designers integrate this sort of circuit in their hardware and firmware. For most tasks, though, a regular CPU architecture is more than enough. – Laiv, Dec 5, 2022 at 13:51
I have seen only a few languages that implemented a NAND keyword.
Remember that source-code is designed to be expressive of the logic that is needed to solve the problem at hand. It is entirely up to the [optimizing ...] compiler to decide what machine instructions to generate to correspond to what that source-code says. If the chip has a useful NAND instruction, the compiler might decide to use it.
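As a sketch of what that looks like in practice (assuming C and an optimizing compiler; the function name is made up): you write the logical intent, and instruction selection is the compiler's job. On a target whose ISA has a NAND instruction, such as POWER's nand, the body below can become a single instruction; on x86 it would typically be an AND followed by a NOT.

#include <stdint.h>

// Express the intent; let the code generator decide whether a NAND
// instruction exists on the target and whether to use it.
uint32_t nand32(uint32_t a, uint32_t b) {
    return ~(a & b);
}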
- Indeed, the OP gives the example of APL, but it was likely included there more for completeness of coverage of Boolean operators and their complements than due to popular demand. Its popularity in electronics - to which the OP alludes - is more by virtue of its ability to form other gates from assemblies consisting purely of wires and copies of the same NAND component than for its conceptual simplicity or computational efficiency. – Steve, Feb 13, 2021 at 19:31
NAND doesn’t match the mental image of logic that normal people, including software developers, have. A nand B doesn’t make sense to a normal person. It’s absolutely fine in an instruction set, but not in a programming language. And compilers have no problem at all converting "not (a and b)" or "not a or not b" into a nand.
Where it gets really bad is when you realise that most uses of and are for program flow, i.e. short-circuited operations.
- Gosh, I was with you until you implied that { Normal People ∩ Software Developers } ≠ ∅. (Speaking as a software developer.) Maybe it's not actually null, but merely of such low cardinality that you never meet one. – davidbak, Feb 13, 2021 at 0:37
- I hadn't heard of NAND before this. Or maybe it's been 20 years since my discrete mathematics course in college, coupled with the fact I haven't seen this in 20 years of programming. To be honest, it just doesn't read well, and it would take me a moment every time I came across this operator to convert it to "not a and b". I have a hard enough time with people inserting 8 unnecessary matching parentheses in boolean conditions, much less jumping through the mental hoops you need to jump through with the NAND logic. – Greg Burghardt, Feb 13, 2021 at 0:55
- Greg, nand is important on the lowest hardware level, where, just because of the way physics works, either nand with 1, 2, and possibly more inputs is the only logic available, or nor with similar inputs. There is no way to build an AND other than from a NAND with two inputs, used as input to a NAND with one input, aka "not". That’s the level where you need NAND. Any level above that you don’t need it and avoid it. – gnasher729, Feb 13, 2021 at 11:17
- @davidbak Sorry to hear that. – gnasher729, Feb 13, 2021 at 11:18
- @gnasher729, indeed. The only sensible implementation of such 3-input gates must be as NOT(a AND b AND c) (or NAND(a, b, c)) - meaning "not all" or "not every" input is true - rather than as (a NAND b) NAND c (or NAND(NAND(a, b), c)), which is how it would most typically be parsed in existing computer languages. Hence why languages tend not to provide a NAND infix operator, and simply let the programmer compose it from AND and NOT operators when required. – Steve, Feb 14, 2021 at 14:44
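To make the previous comment concrete, here is a small C check (truth values as 0/1; the helper names are made up for illustration) showing that folding a three-input NAND pairwise is not the same thing as "not all inputs are true":

#include <stdio.h>

// "not all of a, b, c are true", i.e. NOT (a AND b AND c)
static int nand3(int a, int b, int c) { return !(a && b && c); }

// left-associated pairwise NAND: (a NAND b) NAND c
static int nand_pairwise(int a, int b, int c) { return !(!(a && b) && c); }

int main(void) {
    // The two disagree for a = b = c = 1: nand3 gives 0, nand_pairwise gives 1.
    printf("%d %d\n", nand3(1, 1, 1), nand_pairwise(1, 1, 1));
    return 0;
}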
There is absolutely no reason to minimize the number of gates involved in executing an instruction. The number of gates used to read the instruction from memory, figure out what it does, and route the input and output values far outweighs the number of gates involved in actually executing an instruction like AND.
Imagine it takes something like 50000 NAND gates to read and decode an instruction. Then 96 NAND gates to calculate the AND of the two values.
That means a NAND instruction would use 50000 gates to read and decode the instruction, and 32 gates to calculate the NAND of the two values. Why bother?
And you'd need three of them to calculate AND, so it would take three times as long.
That said, very old processors really did care about minimizing the total number of gates in the processor (not just the number that are used). For example, look at the PDP-8. It has AND and CMA (NOT) instructions, but it doesn't have an OR instruction. If you want to calculate A OR B, you have to calculate NOT ((NOT A) AND (NOT B)). Why? Well, to keep the number of gates down! Of course, there is no subtraction either. Try INCREMENT(NOT(B)) + A.
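For illustration, here is a hedged C sketch of the same identities over ordinary unsigned integers (this only demonstrates the algebra; it is not PDP-8 code, which works on a 12-bit accumulator):

#include <stdint.h>

// A OR B == NOT ((NOT A) AND (NOT B)), De Morgan's law
uint32_t or_from_and_not(uint32_t a, uint32_t b) {
    return ~(~a & ~b);
}

// A - B == A + INCREMENT(NOT B), i.e. two's-complement subtraction
uint32_t sub_from_add(uint32_t a, uint32_t b) {
    return a + (~b + 1u);
}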
- My thoughts involve the industry's recent struggles with optimization, for instance those which led Intel to cut security corners in the name of squeezing out a little more performance after they hit their GHz limits, and which caused AMD to keep adding more cores to solve that same problem. Your example is wonderful and puts it into perspective, but I am still tickled that a CISC processor that is desperate for performance is OK with using clusters of 3 to 7 gates where 1 will do. – Ky -, Feb 16, 2021 at 16:20
- Also, I think your example of the PDP-8's OR approach was supposed to have AND instead of OR in it. – Ky -, Feb 16, 2021 at 16:20
- @BenLeggiero Those 50000 gates? Those are there so the processor can read 64 bytes of instructions and execute 4 instructions every single clock cycle. (Note that 50000 is a number I pulled out of my ass; it could be 5000000, but either way it's not a small number.) The result of the AND has to wait for the clock cycle anyway; the rest of the processor is doing multiplies and divisions and memory loads and stores. You think the clock cycle could be shorter in order to speed up the AND, without breaking all that other stuff? Most instructions are deeper than 1 gate! – Stack Exchange Broke The Law, Feb 16, 2021 at 16:24
- @BenLeggiero Ah, I thought you were asking why they don't only have NAND. If you're asking why they don't include a NAND as well, it's because of instruction encoding, and just complexity in general. It's a waste of opcodes, and a waste of gates to decode the instruction (not to execute it; that part is trivial). – Stack Exchange Broke The Law, Feb 16, 2021 at 16:32
- They do add new instructions to speed up code - that's what SSE and AVX are - but they only do it for a big speedup, not a tiny speedup. Almost nobody actually needs a NAND instruction. And if they did, Intel could put a special case in their instruction decoder, so that AND followed by NOT does a NAND in 1 cycle. – Stack Exchange Broke The Law, Feb 16, 2021 at 16:33
As already mentioned in another answer, NAND and NOR are hardly semantically helpful to a coder, so they would just be noise in the feature set of a high-level language. But let's focus on your understanding of efficiency.
One CPU instruction is implemented as a number of steps that require so-called cycles to execute. A cycle is typically a period or half a period of an oscillator's electrical signal. Simple instructions require only one cycle to execute, although there may be a network of tens or hundreds of NAND gates involved. The NAND gates that implement the instruction are hardwired and operate at the speed at which the electrons flow. So a couple more NAND gates to realize the same one-cycle instruction do not hurt performance at all.
- I would argue that there are a lot of obscure language features that coders have no use for, like C's VLA typedefs, trigraphs, array designators, accessing an index of an int, etc. JavaScript notoriously has silly, unintuitive equality properties, like false != "false" but false == "0". I can't ever imagine using these features nor understanding code that uses them, but folks do both. I feel like foo !& bar or whatever would be simple enough to understand compared to those. – Ky -, Feb 16, 2021 at 16:45
- Also, speaking of CPU instructions, coders have no use for most of them, like FISUBR or MADD. But the CPUs include these so compilers can optimize code to take advantage of processor instructions. If there were a NAND instruction, then a compiler could take any !(foo && bar) or !hoge || !qux and translate it to use NAND. – Ky -, Feb 16, 2021 at 16:46
- @Ky, first, this wouldn't work, because && is a short-circuit operator (the second part is only evaluated if the first part is true), and for NAND you always need both inputs to produce the output. Second, a bitwise AND is implemented using ONE single-cycle instruction that uses, say, one third of the time that single-cycle instructions can use. A single NAND is a single-cycle instruction that uses, say, one tenth of that time. That doesn't help you one bit. It takes one cycle. The !(foo && bar) is 99.9% evaluated through conditional branch instructions. – gnasher729, Dec 6, 2022 at 0:05
- @gnasher729 Thank you! I was thinking about efficiency (performance per watt). The fewer gates a process needs, the more efficient it is, so a signal which takes advantage of implementation details like this would be more efficient... right? – Ky -, Dec 30, 2022 at 20:46