Assuming that Machine-1 accesses data in the memory of Machine-2 via RDMA, will this action incur memory bandwidth overhead on both Machine-1 and Machine-2?
If a regular network card is used for data transmission via TCP, both Machine-1 and Machine-2 will experience DMA operations by the NIC and data processing by the kernel network stack, both of which incur memory bandwidth overhead.
If RDMA is used, can it bypass the kernel network stack? Is it still not possible to bypass the action of reading/writing data?
1 Answer
can it bypass the kernel network stack?
According to Wikipedia: "Remote direct memory access":
In computing, remote direct memory access is a direct memory access from the memory of one computer into that of another without involving either one's operating system.
So, according to its definition, it bypasses the kernel network stack.
However, I suspect that this definition is misleading: these operations cannot possibly refrain from involving the operating system; what the definition is trying to say is that the operating system is only involved to set up the transfer, and from that moment on the transfer will proceed without the operating system acting as an intermediary. Otherwise, the whole concept would be a gaping security hole.
Is it still not possible to bypass the action of reading/writing data?
It depends on what you mean by "the action of reading/writing data". DMA stands for Direct Memory Access. Obviously, its job is to access memory. When we access memory, we do not do it just to check that it is doing okay; we access it to either read it or write it. So there will be reading and writing involved. It is just that the CPU does not have to do it.
does it incur memory bandwidth overhead [?]
Again, it depends on what you mean by "memory bandwidth overhead". At the very least, each RDMA transfer will need to be set up; that's overhead. But most importantly, during any DMA transfer, the memory banks are required to handle memory read and write requests by the DMA circuitry every few clock cycles, which means that some requests by the CPU (which is free to do other stuff) often get delayed by a clock cycle or so. This means that there will be a slow-down.
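How much this contention matters depends on how the NIC's data rate compares with the machine's memory bandwidth. Here is a back-of-envelope sketch; the DDR4-3200 figures are assumptions, so substitute the numbers for your own hardware:

```python
# Back-of-envelope: what fraction of memory bandwidth can a
# 100 Gbit/s NIC consume? (DDR4-3200 figures are assumptions.)

NIC_GBIT_PER_S = 100                          # 100 GbE line rate
nic_bytes_per_s = NIC_GBIT_PER_S * 1e9 / 8    # 12.5 GB/s

# One DDR4-3200 channel: 3200 MT/s x 8 bytes per transfer.
ddr4_channel_bytes_per_s = 3200e6 * 8         # 25.6 GB/s
dual_channel = 2 * ddr4_channel_bytes_per_s   # 51.2 GB/s

print(f"NIC: {nic_bytes_per_s / 1e9:.1f} GB/s")
print(f"Share of one channel:  {nic_bytes_per_s / ddr4_channel_bytes_per_s:.0%}")
print(f"Share of two channels: {nic_bytes_per_s / dual_channel:.0%}")
```

Under these assumptions, a saturated 100 GbE link consumes roughly half of one DDR4-3200 channel, or about a quarter of a dual-channel configuration, so the overhead is not automatically negligible.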
I do not know what percentage this overhead represents. It may be negligible, and then again it may not be, especially if we are talking about 100 GBit/s Ethernet, where the network bandwidth is an appreciable percentage of the actual memory bandwidth. If your budgets are so tight that you have to be asking the question, then do not rely on anyone's answer; do your own benchmarking instead.
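As a starting point for such benchmarking, a crude single-threaded probe of memory copy bandwidth can be sketched as below. This is only a sketch: real memory benchmarks (STREAM, for example) control for caching, NUMA placement, and compiler effects far more carefully, and Python adds its own overhead:

```python
import time

# Crude memory-bandwidth probe: time repeated copies of a large
# buffer. Each copy reads the source and writes the destination,
# so count 2x the buffer size as memory traffic per pass.

N = 50_000_000          # ~50 MB buffer (assumption; size to taste)
reps = 10
src = bytearray(N)

t0 = time.perf_counter()
for _ in range(reps):
    dst = bytes(src)    # one read pass + one write pass
t1 = time.perf_counter()

gb_moved = 2 * N * reps / 1e9
print(f"~{gb_moved / (t1 - t0):.1f} GB/s (single-threaded copy)")
```

Run it while an RDMA transfer is in flight and compare against the idle number to estimate how much bandwidth the NIC's DMA traffic is actually taking from the CPU.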
on both side[s]?
Anything that happens on one side will necessarily have to happen on the other side, too. Otherwise, one of the sides would be working by magic, and we have not invented that yet. "Accessing memory on another machine" means that the memory of the other machine gets copied to this machine. This requires the RDMA hardware on the other machine to read memory and send it over the network, and the RDMA hardware on this machine to receive it from the network and write it to memory.