Interface HashJoinMemoryCalculator
- All Superinterfaces:
HashJoinStateCalculator<HashJoinMemoryCalculator.BuildSidePartitioning>
- All Known Implementing Classes:
HashJoinMechanicalMemoryCalculator
,HashJoinMemoryCalculatorImpl
This class is responsible for managing the memory calculations for the HashJoin operator.
Since the HashJoin operator has different phases of execution, this class needs to perform
different memory calculations at each phase. The phases of execution have been broken down
into an explicit state machine diagram below. What ocurrs in each state is described in
the documentation of the HashJoinState
class below. Note: the transition from Probing
and Partitioning back to Build Side Partitioning. This happens when we had to spill probe side
partitions and we needed to recursively process spilled partitions. This recursion is
described in more detail in the example below.
+--------------+ <-------+ | Build Side | | | Partitioning| | | | | +------+-------+ | | | | | v | +--------------+ | |Probing and | | |Partitioning | | | | | +--------------+ | | | +----------------+ | v Done
An overview of how these states interact can be summarized with the following example.
Consider the case where we have 4 partition configured initially.
- We first start consuming build side batches and putting their records into one of 4 build side partitions.
- Once we run out of memory we start spilling build side partition one by one
- We keep partitioning build side batches until all the build side batches are consumed.
- After we have consumed the build side we prepare to probe by building hashtables for the partitions we have in memory. If we don't have enough room for all the hashtables in memory we spill build side partitions until we do have enough room.
- We now start processing the probe side. For each probe record we determine its build partition. If the build partition is in memory we do the join for the record and emit it. If the build partition is not in memory we spill the probe record. We continue this process until all the probe side records are consumed.
- If we didn't spill any probe side partitions because all the build side partition were in memory, our join operation is done. If we did spill probe side partitions we have to recursively repeat this whole process for each spilled probe and build side partition pair.
-
Nested Class Summary
Modifier and TypeInterfaceDescriptionstatic class
static interface
The interface representing theHashJoinStateCalculator
corresponding to theHashJoinState.BUILD_SIDE_PARTITIONING
state.static interface
static class
This class represents the memory size statistics for an entire set of partitions.static interface
The interface representing theHashJoinStateCalculator
corresponding to theHashJoinState.POST_BUILD_CALCULATIONS
state. -
Method Summary
Methods inherited from interface org.apache.drill.exec.physical.impl.join.HashJoinStateCalculator
getState, next
-
Method Details
-
initialize
void initialize(boolean doMemoryCalc)
-