Class HashPartition
java.lang.Object
org.apache.drill.exec.physical.impl.common.HashPartition
- All Implemented Interfaces:
HashJoinMemoryCalculator.PartitionStat
Overview
Represents an active partition of the Hash-Join operator. A partition is active while it is receiving data or while its data is being probed, as opposed to a fully spilled partition. Once all the build (inner) data for the partition has been read, one of two things happens. If all of that data is in memory, a hash table and a join helper are created, and the data is probed later. If all of the partition's build data was spilled, the partition starts working as an outer partition (see the flag "processingOuter") and reuses some of its fields for the outer side (e.g., currentBatch, currHVVector, writer, spillFile, partitionBatchesCount).
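The two outcomes described above can be pictured with a small, self-contained toy model. This is only a sketch and not Drill code: the class PartitionLifecycleSketch, its State enum, and its methods are hypothetical names, and restarting the batch counter when the partition switches to the outer side is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the partition lifecycle; none of these names are Drill APIs. */
class PartitionLifecycleSketch {
    enum State { BUILDING, IN_MEMORY_READY, PROCESSING_OUTER }

    private final List<int[]> inMemoryBatches = new ArrayList<>(); // stand-in for build batches
    private boolean spilled = false;          // set once build data has been written out
    private boolean processingOuter = false;  // mirrors the "processingOuter" flag
    private int partitionBatchesCount = 0;    // one counter, reused for the outer side
    private State state = State.BUILDING;

    /** Receives one build-side batch while the partition is still being built. */
    void addBuildBatch(int[] batch) {
        if (state != State.BUILDING) {
            throw new IllegalStateException("build side already finished");
        }
        if (!spilled) {
            inMemoryBatches.add(batch);
        }
        partitionBatchesCount++;
    }

    /** Hypothetical spill: drops the in-memory batches as if they went to a file. */
    void spill() {
        inMemoryBatches.clear();
        spilled = true;
    }

    /** Called after all build data was read; picks one of the two paths. */
    State finishBuild() {
        if (spilled) {
            processingOuter = true;
            partitionBatchesCount = 0; // assumption: the counter restarts for the outer side
            state = State.PROCESSING_OUTER;
        } else {
            // In the real operator a hash table and join helper would be created here.
            state = State.IN_MEMORY_READY;
        }
        return state;
    }

    boolean isProcessingOuter() { return processingOuter; }

    int getPartitionBatchesCount() { return partitionBatchesCount; }
}
```

A partition that never spills ends in IN_MEMORY_READY, while one that spilled ends in PROCESSING_OUTER with the shared counter available for the outer side.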
-
Field Summary
-
Constructor Summary
Constructor | Description
HashPartition(FragmentContext context, BufferAllocator allocator, ChainedHashTable baseHashTable, RecordBatch buildBatch, RecordBatch probeBatch, boolean semiJoin, int recordsPerBatch, SpillSet spillSet, int partNum, int cycleNum, int numPartitions) |
-
Method Summary
Modifier and Type | Method | Description
void | allocateNewCurrentBatchAndHV() | Allocate a new current Vector Container and current HV vector
void | appendBatch(VectorAccessible batch) | Append the incoming batch (actually only the vectors of that batch) into the tmp list
void | appendInnerRow(VectorContainer buildContainer, int ind, int hashCode, HashJoinMemoryCalculator.BuildSidePartitioning calc) | Spills if needed
void | appendOuterRow(int hashCode, int recordsProcessed) | Outer always spills when batch is full
void | buildContainersHashTableAndHelper() | Creates the hash table and join helper for this partition.
void | cleanup(boolean deleteFile) | Free all in-memory allocated structures.
void | close() |
void | closeWriter() | Close the writer without deleting the spill file
void | completeAnInnerBatch(boolean toInitialize, boolean needsSpill) |
void | completeAnOuterBatch(boolean toInitialize) |
void | decreaseRecordNumForKey(int currentIndex) |
int | getBuildHashCode(int ind) |
 | getContainers() |
 | getInMemoryBatches() |
long | getInMemorySize() |
int | getNextIndex(int compositeIndex) |
com.carrotsearch.hppc.IntArrayList | getNextUnmatchedIndex() |
int | getNumInMemoryBatches() |
long | getNumInMemoryRecords() |
int | getPartitionBatchesCount() |
int | getPartitionNum() |
int | getProbeHashCode(int ind) |
int | getRecordNumForKey(int currentIndex) |
 | getSpillFile() |
 | getStartIndex(int probeIndex) |
void | getStats(HashTableStats newStats) |
boolean | isSpilled() |
String | makeDebugString() | Creates a debugging string containing information about memory usage.
org.apache.commons.lang3.tuple.Pair<VectorContainer,Integer> | nextBatch() |
int | probeForKey(int recordsProcessed, int hashCode) |
boolean | setRecordMatched(int compositeIndex) |
void | setRecordNumForKey(int currentIndex, int num) |
void | spillThisPartition() |
void | updateBatches() |
void | updateProbeRecordsPerBatch(int newRecordsPerBatch) | Configure a different temporary batch size when spilling probe batches.
-
Field Details
-
HASH_VALUE_COLUMN_NAME
- See Also:
-
HVtype
-
-
Constructor Details
-
HashPartition
public HashPartition(FragmentContext context, BufferAllocator allocator, ChainedHashTable baseHashTable, RecordBatch buildBatch, RecordBatch probeBatch, boolean semiJoin, int recordsPerBatch, SpillSet spillSet, int partNum, int cycleNum, int numPartitions)
-
-
Method Details
-
updateProbeRecordsPerBatch
public void updateProbeRecordsPerBatch(int newRecordsPerBatch)
Configure a different temporary batch size when spilling probe batches.
- Parameters:
newRecordsPerBatch - The new temporary batch size to use.
-
allocateNewCurrentBatchAndHV
public void allocateNewCurrentBatchAndHV()
Allocates a new current vector container and a new current HV (hash value) vector.
-
appendInnerRow
public void appendInnerRow(VectorContainer buildContainer, int ind, int hashCode, HashJoinMemoryCalculator.BuildSidePartitioning calc)
Appends a row to the inner (build) side of this partition, spilling if needed.
-
appendOuterRow
public void appendOuterRow(int hashCode, int recordsProcessed)
Appends a row to the outer side of this partition. The outer side always spills when its batch is full.
-
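The descriptions of appendInnerRow and appendOuterRow point to two different spill policies: the inner side spills only when needed, while the outer side spills every full batch. The sketch below contrasts the two under simplifying assumptions. It is not Drill code: SpillPolicySketch, the fixed bytesPerBatch estimate, and the memoryBudgetBytes threshold are hypothetical stand-ins for whatever the real memory calculator decides.

```java
/** Toy contrast of the two spill policies; all names here are hypothetical. */
class SpillPolicySketch {
    private final int recordsPerBatch;     // rows that fill one batch
    private final long bytesPerBatch;      // assumed fixed size of a full batch
    private final long memoryBudgetBytes;  // assumed limit for in-memory inner batches

    private int rowsInCurrentBatch = 0;
    private int inMemoryBatches = 0;
    private int spilledBatches = 0;

    SpillPolicySketch(int recordsPerBatch, long bytesPerBatch, long memoryBudgetBytes) {
        this.recordsPerBatch = recordsPerBatch;
        this.bytesPerBatch = bytesPerBatch;
        this.memoryBudgetBytes = memoryBudgetBytes;
    }

    /** Inner side: a full batch stays in memory unless the budget is exceeded. */
    void appendInnerRow() {
        rowsInCurrentBatch++;
        if (rowsInCurrentBatch == recordsPerBatch) {
            rowsInCurrentBatch = 0;
            inMemoryBatches++;
            if (inMemoryBatches * bytesPerBatch > memoryBudgetBytes) {
                // "Spills if needed": move everything held in memory to the spill count.
                spilledBatches += inMemoryBatches;
                inMemoryBatches = 0;
            }
        }
    }

    /** Outer side: every full batch is spilled immediately. */
    void appendOuterRow() {
        rowsInCurrentBatch++;
        if (rowsInCurrentBatch == recordsPerBatch) {
            rowsInCurrentBatch = 0;
            spilledBatches++;
        }
    }

    int getInMemoryBatches() { return inMemoryBatches; }

    int getSpilledBatches() { return spilledBatches; }
}
```

In this model the inner side accumulates batches until the assumed budget is crossed, whereas the outer side never holds a completed batch in memory.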
completeAnOuterBatch
public void completeAnOuterBatch(boolean toInitialize)
-
completeAnInnerBatch
public void completeAnInnerBatch(boolean toInitialize, boolean needsSpill)
-
appendBatch
public void appendBatch(VectorAccessible batch)
Append the incoming batch (actually only the vectors of that batch) into the temporary list.
-
spillThisPartition
public void spillThisPartition()
-
probeForKey
public int probeForKey(int recordsProcessed, int hashCode) throws SchemaChangeException
- Throws:
SchemaChangeException
-
getRecordNumForKey
public int getRecordNumForKey(int currentIndex)
-
setRecordNumForKey
public void setRecordNumForKey(int currentIndex, int num)
-
decreaseRecordNumForKey
public void decreaseRecordNumForKey(int currentIndex)
-
getStartIndex
-
getNextIndex
public int getNextIndex(int compositeIndex)
-
setRecordMatched
public boolean setRecordMatched(int compositeIndex)
-
getNextUnmatchedIndex
public com.carrotsearch.hppc.IntArrayList getNextUnmatchedIndex()
-
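getBuildHashCode
public int getBuildHashCode(int ind) throws SchemaChangeException
- Throws:
SchemaChangeException
-
getProbeHashCode
public int getProbeHashCode(int ind) throws SchemaChangeException
- Throws:
SchemaChangeException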
-
getContainers
-
updateBatches
public void updateBatches() throws SchemaChangeException
- Throws:
SchemaChangeException
-
nextBatch
public org.apache.commons.lang3.tuple.Pair<VectorContainer,Integer> nextBatch()
-
getInMemoryBatches
- Specified by:
getInMemoryBatches in interface HashJoinMemoryCalculator.PartitionStat
-
getNumInMemoryBatches
public int getNumInMemoryBatches()
- Specified by:
getNumInMemoryBatches in interface HashJoinMemoryCalculator.PartitionStat
-
isSpilled
public boolean isSpilled()
- Specified by:
isSpilled in interface HashJoinMemoryCalculator.PartitionStat
-
getNumInMemoryRecords
public long getNumInMemoryRecords()
- Specified by:
getNumInMemoryRecords in interface HashJoinMemoryCalculator.PartitionStat
-
getInMemorySize
public long getInMemorySize()
- Specified by:
getInMemorySize in interface HashJoinMemoryCalculator.PartitionStat
-
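The four accessors above with visible signatures (getNumInMemoryBatches, isSpilled, getNumInMemoryRecords, getInMemorySize) are the kind of per-partition statistics a memory calculator can aggregate. The sketch below shows one way such a consumer might look. It deliberately avoids the real HashJoinMemoryCalculator.PartitionStat type: PartitionStatSketch, FixedStat, and PartitionStatUsage are hypothetical, and choosing the largest in-memory partition as the spill candidate is an assumed policy, not a description of Drill's behavior.

```java
import java.util.List;

/** Hypothetical mirror of the four accessors listed above; not the Drill interface. */
interface PartitionStatSketch {
    int getNumInMemoryBatches();
    boolean isSpilled();
    long getNumInMemoryRecords();
    long getInMemorySize();
}

/** Immutable stat holder used only to feed the example. */
final class FixedStat implements PartitionStatSketch {
    private final int batches;
    private final boolean spilled;
    private final long records;
    private final long sizeBytes;

    FixedStat(int batches, boolean spilled, long records, long sizeBytes) {
        this.batches = batches;
        this.spilled = spilled;
        this.records = records;
        this.sizeBytes = sizeBytes;
    }

    @Override public int getNumInMemoryBatches() { return batches; }
    @Override public boolean isSpilled() { return spilled; }
    @Override public long getNumInMemoryRecords() { return records; }
    @Override public long getInMemorySize() { return sizeBytes; }
}

final class PartitionStatUsage {
    private PartitionStatUsage() { }

    /** Sums the in-memory footprint over all partitions. */
    static long totalInMemorySize(List<? extends PartitionStatSketch> partitions) {
        long total = 0;
        for (PartitionStatSketch p : partitions) {
            total += p.getInMemorySize();
        }
        return total;
    }

    /** Returns the index of the largest non-spilled partition, or -1 if none holds data. */
    static int largestInMemoryPartition(List<? extends PartitionStatSketch> partitions) {
        int best = -1;
        long bestSize = 0;
        for (int i = 0; i < partitions.size(); i++) {
            PartitionStatSketch p = partitions.get(i);
            if (!p.isSpilled() && p.getInMemorySize() > bestSize) {
                best = i;
                bestSize = p.getInMemorySize();
            }
        }
        return best;
    }
}
```

Because the consumer depends only on the accessor interface, it can be exercised with fixed values, without a running join.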
getSpillFile
-
getPartitionBatchesCount
public int getPartitionBatchesCount()
-
getPartitionNum
public int getPartitionNum()
-
closeWriter
public void closeWriter()
Close the writer without deleting the spill file.
-
buildContainersHashTableAndHelper
public void buildContainersHashTableAndHelper() throws SchemaChangeException
Creates the hash table and join helper for this partition. This method should only be called after all the build side records have been consumed.
- Throws:
SchemaChangeException
-
getStats
public void getStats(HashTableStats newStats)
-
cleanup
public void cleanup(boolean deleteFile)
Free all in-memory allocated structures.
- Parameters:
deleteFile - whether to delete the spill file or not
-
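The difference between closeWriter (close the writer, keep the spill file) and cleanup(deleteFile) (free in-memory structures, optionally delete the file) can be illustrated with a temporary file. This is a sketch under assumptions, not the Drill implementation: SpillFileCleanupSketch is a hypothetical class, the byte array stands in for the in-memory structures, and having cleanup close the writer itself is assumed here for safety.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy owner of a spill file; names and behavior are illustrative assumptions. */
class SpillFileCleanupSketch {
    private final Path spillFile;
    private OutputStream writer;
    private byte[] inMemory = new byte[1024]; // stand-in for in-memory structures

    SpillFileCleanupSketch() {
        try {
            spillFile = Files.createTempFile("spill-sketch", ".bin");
            writer = Files.newOutputStream(spillFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the writer but leaves the spill file on disk. */
    void closeWriter() {
        if (writer == null) {
            return; // already closed
        }
        try {
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer = null;
        }
    }

    /** Frees in-memory state; deletes the spill file only when asked to. */
    void cleanup(boolean deleteFile) {
        inMemory = null;
        closeWriter(); // assumption: cleanup also closes any open writer
        if (deleteFile) {
            try {
                Files.deleteIfExists(spillFile);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    boolean holdsMemory() { return inMemory != null; }

    Path getSpillFile() { return spillFile; }
}
```

Calling cleanup(false) keeps the file available for a later read, while cleanup(true) removes it once the partition's data is no longer needed.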
close
public void close()
-
makeDebugString
public String makeDebugString()
Creates a debugging string containing information about memory usage.
- Returns:
A debugging string.
-