Block nested loop
A block-nested loop (BNL) is an algorithm used to join two relations in a relational database.[1]
This algorithm[2] is a variation of the simple nested loop join and joins two relations {\displaystyle R} and {\displaystyle S} (the "outer" and "inner" join operands, respectively). Suppose {\displaystyle |R|<|S|}. In a traditional nested loop join, {\displaystyle S} will be scanned once for every tuple of {\displaystyle R}. If there are many qualifying {\displaystyle R} tuples, and particularly if there is no applicable index for the join key on {\displaystyle S}, this operation will be very expensive.
The block nested loop join algorithm improves on the simple nested loop join by only scanning {\displaystyle S} once for every group of {\displaystyle R} tuples. Here groups are disjoint sets of tuples in {\displaystyle R} and the union of all groups has the same tuples as {\displaystyle R}. For example, one variant of the block nested loop join reads an entire page of {\displaystyle R} tuples into memory and loads them into a hash table. It then scans {\displaystyle S}, and probes the hash table to find {\displaystyle S} tuples that match any of the tuples in the current page of {\displaystyle R}. This reduces the number of scans of {\displaystyle S} that are necessary.
algorithm block_nested_loop_join is for each page pr in R do for each page ps in S do for each tuple r in pr do for each tuple s in ps do if r and s satisfy the join condition then yield tuple <r,s>
A more aggressive variant of this algorithm loads as many pages of {\displaystyle R} as can be fit in the available memory, loading all such tuples into a hash table, and then repeatedly scans {\displaystyle S}. This further reduces the number of scans of {\displaystyle S} that are necessary. In fact, this algorithm is essentially a special-case of the classic hash join algorithm.[citation needed ]
The block nested loop runs in {\displaystyle O(P_{r}P_{s}/M)} I/Os where {\displaystyle M} is the number of available pages of internal memory and {\displaystyle P_{r}} and {\displaystyle P_{s}} is size of {\displaystyle R} and {\displaystyle S} respectively in pages. Note that block nested loop runs in {\displaystyle O(P_{r}+P_{s})} I/Os if {\displaystyle R} fits in the available internal memory.
References
[edit ]- ^ "8.2.1.14 Block Nested-Loop and Batched Key Access Joins". MySQL 5.6 Reference Manual. Oracle Corporation. Retrieved 2 August 2015.
- ^ "Block Nested Loop Join". MariaDB. MariaDB Corporation Ab. Retrieved 2 August 2015.