java.lang.Object

org.apache.drill.exec.physical.resultSet.impl.ResultSetLoaderImpl

All Implemented Interfaces: ResultSetLoader

public class ResultSetLoaderImpl extends Object implements ResultSetLoader

Implementation of the result set loader. Caches vectors for a row or map.

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static class

ResultSetLoaderImpl.ResultSetOptions

Read-only set of options for the result set loader.
Field Summary

Fields

Modifier and Type

Field

Description

protected int

accumulatedBatchSize

Total bytes allocated to the current batch.

protected static final org.slf4j.Logger

logger

Fields inherited from interface org.apache.drill.exec.physical.resultSet.ResultSetLoader
DEFAULT_ROW_COUNT
Constructor Summary

Constructors

Constructor

Description

ResultSetLoaderImpl(BufferAllocator allocator)

ResultSetLoaderImpl(BufferAllocator allocator, ResultSetLoaderImpl.ResultSetOptions options)
Method Summary

Modifier and Type

Method

Description

TupleMetadata

activeSchema()

Returns the active output schema; the schema used by the writers, minus any unprojected columns.

int

activeSchemaVersion()

BufferAllocator

allocator()

boolean

atLimit()

After a ResultSetLoader.harvest() call, use this method to determine whether the scan limit has been hit.

int

batchCount()

Total number of batches created.

int

bumpVersion()

boolean

canExpand(int delta)

void

close()

Called after all rows are returned, whether because no more data is available, or the caller wishes to cancel the current row batch and complete.

ColumnBuilder

columnBuilder()

void

dump(HierarchicalFormatter format)

CustomErrorContext

errorContext()

Context for error messages.

VectorContainer

harvest()

Harvest the current row batch, and reset the mutator to the start of the next row batch (which may already contain an overflow row).

boolean

hasOverflow()

boolean

hasRows()

Report whether the loader currently holds rows.

protected boolean

isFull()

Implementation of RowSetLoader.isFull().

boolean

isProjectionEmpty()

Reports if this is an empty projection such as occurs in a SELECT COUNT(*) query.

int

maxBatchSize()

The maximum number of rows for the present batch.

VectorContainer

outputContainer()

Returns the output container which holds (or will hold) batches from this loader.

TupleMetadata

outputSchema()

The schema of the harvested batch.

void

overflowed()

ProjectionFilter

projectionSet()

TupleState.RowState

rootState()

protected int

rowCount()

Implementation of RowSetLoader.rowCount().

int

rowIndex()

protected void

saveRow()

Finalize the current row.

int

schemaVersion()

Current schema version.

ResultSetLoader

setRow(Object... values)

Load a row using column values passed as variable-length arguments.

void

setTargetRowCount(int rowCount)

Adjust the number of rows to produce in the next batch.

int

skipRows(int requestedCount)

Requests to skip the given number of rows.

boolean

startBatch()

Start a new row batch.

boolean

startBatch(boolean schemaOnly)

boolean

startEmptyBatch()

Start a batch to report only schema without data.

protected void

startRow()

Called before writing a new row.

void

tallyAllocations(int allocationBytes)

int

targetRowCount()

The target number of rows per batch produced by this loader (as configured in the loader options).

int

targetVectorSize()

The largest vector size produced by this loader (as specified by the value vector limit.)

long

totalRowCount()

Total number of rows loaded for all previous batches and the current batch.

ResultVectorCache

vectorCache()

Peek at the internal vector cache for readers that need a bit of help resolving types based on what was previously seen.

boolean

writeable()

Reports whether the loader is in a writable state.

RowSetLoader

writer()

Writer for the top-level tuple (the entire row).

protected org.apache.drill.exec.physical.resultSet.impl.WriterIndexImpl

writerIndex()

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- logger
  
  protected static final org.slf4j.Logger logger
- accumulatedBatchSize
  
  protected int accumulatedBatchSize
  
  Total bytes allocated to the current batch.
Constructor Details
- ResultSetLoaderImpl
  
  public ResultSetLoaderImpl(BufferAllocator allocator, ResultSetLoaderImpl.ResultSetOptions options)
- ResultSetLoaderImpl
  
  public ResultSetLoaderImpl(BufferAllocator allocator)
Method Details
- projectionSet
  
  public ProjectionFilter projectionSet()
- allocator
  
  public BufferAllocator allocator()
- bumpVersion
  
  public int bumpVersion()
- activeSchemaVersion
  
  public int activeSchemaVersion()
- schemaVersion
  
  public int schemaVersion()
  
  Description copied from interface: ResultSetLoader
  
  Current schema version. The version increments by one each time a column is added.
  
  Specified by:
  
  schemaVersion in interface ResultSetLoader
  
  Returns:
  
  the current schema version
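
Because the version only ever moves forward by one per added column, a caller can detect a schema change by remembering the last version it saw and comparing. The sketch below is a toy model of that idea, not Drill code: `VersionedSchema`, `addColumn` and `changedSince` are hypothetical names, and the starting version of 0 is an assumption of the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a loader's schema bookkeeping; not a Drill class.
final class VersionedSchema {
  private final List<String> columns = new ArrayList<>();
  private int version; // starts at 0 in this sketch (an assumption)

  // Adding a column bumps the version by exactly one and returns the new version.
  int addColumn(String name) {
    columns.add(name);
    return ++version;
  }

  int schemaVersion() {
    return version;
  }
}

final class SchemaVersionDemo {
  // Returns true when the schema changed since the caller last looked.
  static boolean changedSince(VersionedSchema schema, int lastSeenVersion) {
    return schema.schemaVersion() != lastSeenVersion;
  }

  public static void main(String[] args) {
    VersionedSchema schema = new VersionedSchema();
    schema.addColumn("a");
    int seen = schema.schemaVersion();
    System.out.println(changedSince(schema, seen)); // nothing added since "seen"
    schema.addColumn("b");
    System.out.println(changedSince(schema, seen)); // a column was added
  }
}
```

Comparing a single integer is cheaper than comparing two schemas column by column, which is presumably why a version counter is exposed at all.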
- startBatch
  
  public boolean startBatch()
  
  Description copied from interface: ResultSetLoader
  
  Start a new row batch. Valid only when first started, or after the previous batch has been harvested.
  
  Specified by:
  
  startBatch in interface ResultSetLoader
  
  Returns:
  
  true if another batch can be read, false if the reader has reached the given scan limit.
- startEmptyBatch
  
  public boolean startEmptyBatch()
  
  Start a batch to report only schema without data.
- startBatch
  
  public boolean startBatch(boolean schemaOnly)
- hasRows
  
  public boolean hasRows()
  
  Description copied from interface: ResultSetLoader
  
  Report whether the loader currently holds rows. If within a batch, reports if at least one row has been read (which might be a look-ahead row.) If between batches, reports if a look-ahead row is available.
  
  Specified by:
  
  hasRows in interface ResultSetLoader
  
  Returns:
  
  true if at least one row is available to harvest, false otherwise
- writer
  
  public RowSetLoader writer()
  
  Description copied from interface: ResultSetLoader
  
  Writer for the top-level tuple (the entire row). Valid only when the mutator is actively writing a batch (after startBatch() but before harvest().)
  
  Specified by:
  
  writer in interface ResultSetLoader
  
  Returns:
  
  writer for the top-level columns
- setRow
  
  public ResultSetLoader setRow(Object... values)
  
  Description copied from interface: ResultSetLoader
  
  Load a row using column values passed as variable-length arguments. Expects map values to be represented as an array. A schema of (a:int, b:map(c:varchar)) would be set as
  setRow(10, new Object[] {"foo"});
  Values of arrays can be expressed as a Java array. A schema of (a:int, b:int[]) can be set as
  setRow(10, new int[] {100, 200});
  Primarily for testing; too slow for production code.
  If the row consists of a single map or list, then the one value will be an Object array, creating an ambiguity. Use writer().set(0, value); in this case.
  
  Specified by:
  
  setRow in interface ResultSetLoader
  
  Parameters:
  
  values - column values in column index order
  
  Returns:
  
  this loader
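
The single-map ambiguity described above is a general property of Java varargs rather than anything specific to this loader: a lone `Object[]` argument is taken as the whole argument list. A minimal sketch, using a hypothetical `countValues` helper that only shares the `Object...` parameter shape with `setRow`:

```java
// Hypothetical helper with the same parameter shape as setRow(Object... values).
final class VarargsAmbiguity {
  // Returns how many "column values" the method actually received.
  static int countValues(Object... values) {
    return values.length;
  }

  public static void main(String[] args) {
    Object[] mapValue = {"foo", "bar"}; // intended as ONE map-typed column

    // Passed directly, the array becomes the varargs array itself: two values.
    System.out.println(countValues(mapValue));

    // Casting to Object forces the array to be treated as a single value.
    System.out.println(countValues((Object) mapValue));

    // With a leading scalar there is no ambiguity: two values, the second an array.
    System.out.println(countValues(10, mapValue));
  }
}
```

The cast is shown only to make the language rule visible; the workaround this page documents is to go through `writer()` instead.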
- startRow
  
  protected void startRow()
  
  Called before writing a new row. Implementation of RowSetLoader.start().
- saveRow
  
  protected void saveRow()
  
  Finalize the current row. Implementation of RowSetLoader.save().
- isFull
  
  protected boolean isFull()
  
  Implementation of RowSetLoader.isFull().
  
  Returns:
  
  true if the batch is full (reached vector capacity or the row count limit), false if more rows can be added
- writeable
  
  public boolean writeable()
  
  Description copied from interface: ResultSetLoader
  
  Reports whether the loader is in a writable state. The writable state occurs only when a batch has been started, and before that batch becomes full.
  
  Specified by:
  
  writeable in interface ResultSetLoader
  
  Returns:
  
  true if the client can add a row to the loader, false if not
- rowCount
  
  protected int rowCount()
  
  Implementation of RowSetLoader.rowCount().
  
  Returns:
  
  the number of rows to be sent downstream for this batch. Does not include the overflow row.
- writerIndex
  
  protected org.apache.drill.exec.physical.resultSet.impl.WriterIndexImpl writerIndex()
- setTargetRowCount
  
  public void setTargetRowCount(int rowCount)
  
  Description copied from interface: ResultSetLoader
  
  Adjust the number of rows to produce in the next batch. Takes effect after the next call to ResultSetLoader.startBatch().
  
  Specified by:
  
  setTargetRowCount in interface ResultSetLoader
  
  Parameters:
  
  rowCount - target batch row count
- targetRowCount
  
  public int targetRowCount()
  
  Description copied from interface: ResultSetLoader
  
  The target number of rows per batch produced by this loader (as configured in the loader options).
  
  Specified by:
  
  targetRowCount in interface ResultSetLoader
  
  Returns:
  
  the target row count for batches that this loader produces
- targetVectorSize
  
  public int targetVectorSize()
  
  Description copied from interface: ResultSetLoader
  
  The largest vector size produced by this loader (as specified by the value vector limit.)
  
  Specified by:
  
  targetVectorSize in interface ResultSetLoader
  
  Returns:
  
  the largest vector size. Attempting to extend a vector beyond this limit causes automatic vector overflow and terminates the in-flight batch, even if the batch has not yet reached the target row count
- maxBatchSize
  
  public int maxBatchSize()
  
  Description copied from interface: ResultSetLoader
  The maximum number of rows for the present batch. Will be the lesser of targetRowCount() and the overall scan limit remaining.
  Specified by:
  
  maxBatchSize in interface ResultSetLoader
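
The "lesser of" rule reads as simple arithmetic. The sketch below is an illustration only: it assumes the scan limit is tracked as a total row budget and that a negative limit means "no limit". Both conventions, and the names `MaxBatchSizeSketch`, `scanLimit` and `rowsSoFar`, are assumptions of this example rather than documented behavior.

```java
final class MaxBatchSizeSketch {
  // targetRowCount: configured rows per batch.
  // scanLimit: total rows the scan may return; negative means unlimited (assumed convention).
  // rowsSoFar: rows already returned in earlier batches.
  static int maxBatchSize(int targetRowCount, long scanLimit, long rowsSoFar) {
    if (scanLimit < 0) {
      return targetRowCount;
    }
    long remaining = Math.max(0L, scanLimit - rowsSoFar);
    return (int) Math.min((long) targetRowCount, remaining);
  }

  public static void main(String[] args) {
    System.out.println(maxBatchSize(4096, -1, 0));          // no limit: full target
    System.out.println(maxBatchSize(4096, 10_000, 8_192));  // 1808 rows left in the budget
    System.out.println(maxBatchSize(4096, 10_000, 10_000)); // budget exhausted
  }
}
```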
- skipRows
  
  public int skipRows(int requestedCount)
  
  Description copied from interface: ResultSetLoader
  
  Requests to skip the given number of rows. Returns the number of rows actually skipped (which is limited by the batch row count).
  Used in SELECT COUNT(*) style queries when the downstream operators want just the record count, but no actual rows.
  Also used to fill in a batch of only null values (such as filling in a set of null vectors for unprojected columns).
  
  Specified by:
  
  skipRows in interface ResultSetLoader
  
  Parameters:
  
  requestedCount - the number of rows to skip
  
  Returns:
  
  the actual number of rows skipped, which may be less than the requested amount. If less, the client should call this method for multiple batches until the requested count is reached
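
The "call this method for multiple batches" contract can be pictured as a client loop. The following is a toy model, not Drill code: `SkippingLoader` is a hypothetical class whose only job is to cap each skip at a fixed batch capacity, so the loop shape is visible.

```java
// Toy model of the skip contract; SkippingLoader is hypothetical, not a Drill class.
final class SkippingLoader {
  private final int batchCapacity;
  private int rowsInBatch;
  private int batchesHarvested;

  SkippingLoader(int batchCapacity) {
    if (batchCapacity <= 0) {
      throw new IllegalArgumentException("batchCapacity must be positive");
    }
    this.batchCapacity = batchCapacity;
  }

  void startBatch() {
    rowsInBatch = 0;
  }

  // Skips as many rows as fit in the current batch and reports how many that was.
  int skipRows(int requestedCount) {
    int skipped = Math.min(requestedCount, batchCapacity - rowsInBatch);
    rowsInBatch += skipped;
    return skipped;
  }

  // Returns the row count of the batch that would be sent downstream.
  int harvest() {
    batchesHarvested++;
    return rowsInBatch;
  }

  int batchesHarvested() {
    return batchesHarvested;
  }
}

final class SkipRowsDemo {
  // Client loop: keep skipping across batches until the full count is covered.
  static long skipAll(SkippingLoader loader, int totalToSkip) {
    long reported = 0;
    int remaining = totalToSkip;
    while (remaining > 0) {
      loader.startBatch();
      remaining -= loader.skipRows(remaining);
      reported += loader.harvest();
    }
    return reported;
  }

  public static void main(String[] args) {
    SkippingLoader loader = new SkippingLoader(4096);
    System.out.println(skipAll(loader, 10_000));   // total rows reported downstream
    System.out.println(loader.batchesHarvested()); // batches needed to cover them
  }
}
```

With a capacity of 4096, skipping 10,000 rows takes three batches (4096, 4096 and 1808 rows), and the row counts reported downstream still add up to the requested total.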
- isProjectionEmpty
  
  public boolean isProjectionEmpty()
  
  Description copied from interface: ResultSetLoader
  
  Reports if this is an empty projection such as occurs in a SELECT COUNT(*) query. If the projection is empty, then the downstream needs only the row count set in each batch, but no actual vectors will be created. In this case, the client can do the work to populate rows (the data will be discarded), or can call ResultSetLoader.skipRows(int) to skip over the number of rows that would have been read if any data had been projected.
  Note that the empty schema case can also occur if the project list from the SELECT clause is disjoint from the table schema. For example, SELECT a, b from a table with schema (c, d).
  
  Specified by:
  
  isProjectionEmpty in interface ResultSetLoader
  
  Returns:
  
  true if no columns are actually projected, false if at least one column is projected
- overflowed
  
  public void overflowed()
- hasOverflow
  
  public boolean hasOverflow()
- outputContainer
  
  public VectorContainer outputContainer()
  
  Description copied from interface: ResultSetLoader
  
  Returns the output container which holds (or will hold) batches from this loader. For use when the container is needed prior to "harvesting" a batch. The data is not valid until ResultSetLoader.harvest() is called, and is no longer valid once ResultSetLoader.startBatch() is called.
  
  Specified by:
  
  outputContainer in interface ResultSetLoader
  
  Returns:
  
  container used to publish results from this loader
- harvest
  
  public VectorContainer harvest()
  
  Description copied from interface: ResultSetLoader
  Harvest the current row batch, and reset the mutator to the start of the next row batch (which may already contain an overflow row).
  The schema of the returned container is defined as:
  
  - The schema as passed in via the loader options, plus
  - Columns added dynamically during write, minus
  - Any columns not included in the project list, minus
  - Any columns added in the overflow row.
  
  That is, column order is as defined by the initial schema and column additions. In particular, the schema order is not defined by the projection list. (Another mechanism is required to reorder columns for the actual projection.)
  Specified by:
  
  harvest in interface ResultSetLoader
  
  Returns:
  
  the row batch to send downstream
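
The plus/minus rules for the harvested schema amount to an ordered filter. The sketch below models columns as plain strings to show the resulting order; `HarvestSchemaSketch` and its parameters are hypothetical and stand in for metadata the real loader tracks internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class HarvestSchemaSketch {
  // Computes the column order of a harvested batch under the rules described above.
  static List<String> harvestSchema(List<String> initial,
                                    List<String> addedDuringWrite,
                                    Set<String> projected,
                                    Set<String> addedInOverflowRow) {
    List<String> all = new ArrayList<>(initial);
    all.addAll(addedDuringWrite); // additions follow the initial schema, in arrival order
    List<String> result = new ArrayList<>();
    for (String col : all) {
      // Keep only projected columns that did not first appear in the overflow row.
      if (projected.contains(col) && !addedInOverflowRow.contains(col)) {
        result.add(col);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> schema = harvestSchema(
        Arrays.asList("a", "b", "c"),                      // initial schema from the loader options
        Arrays.asList("d", "e"),                           // columns discovered while writing
        new HashSet<>(Arrays.asList("e", "c", "a", "d")),  // project list; its order is irrelevant
        new HashSet<>(Arrays.asList("e")));                // "e" first appeared in the overflow row
    System.out.println(schema); // prints [a, c, d]
  }
}
```

Column `b` drops out because it is not projected, and `e` is held back for the next batch. The projection set is deliberately unordered here to reflect that it has no say in the column order.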
- atLimit
  
  public boolean atLimit()
  
  Description copied from interface: ResultSetLoader
  
  After a ResultSetLoader.harvest() call, use this method to determine whether the scan limit has been hit. If so, treat this as the final batch for the reader, even if more data is available to read.
  
  Specified by:
  
  atLimit in interface ResultSetLoader
  
  Returns:
  
  true if the scan has reached a set scan row limit, false if there is no limit, or more rows can be read.
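
Taken together with startBatch() and harvest(), this check suggests a reader loop of the following shape. The code is a toy model under stated assumptions, not the Drill implementation: `LimitedLoader` is a hypothetical class that counts rows instead of writing vectors, and it ignores overflow rows entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the batch lifecycle with a scan limit; LimitedLoader is hypothetical.
final class LimitedLoader {
  private final int targetRowCount;
  private final long scanLimit;
  private long totalRows;
  private int rowsInBatch;

  LimitedLoader(int targetRowCount, long scanLimit) {
    this.targetRowCount = targetRowCount;
    this.scanLimit = scanLimit;
  }

  // False once the limit is reached: no further batch may be read.
  boolean startBatch() {
    rowsInBatch = 0;
    return totalRows < scanLimit;
  }

  // Full when the batch hits its target size or the scan limit, whichever comes first.
  boolean isFull() {
    return rowsInBatch >= targetRowCount || totalRows + rowsInBatch >= scanLimit;
  }

  void addRow() {
    rowsInBatch++;
  }

  // Returns the number of rows in the harvested batch.
  int harvest() {
    totalRows += rowsInBatch;
    return rowsInBatch;
  }

  boolean atLimit() {
    return totalRows >= scanLimit;
  }
}

final class ScanLoopDemo {
  // Reads from a source holding `available` rows; returns the size of each batch produced.
  static List<Integer> scan(LimitedLoader loader, long available) {
    List<Integer> batchSizes = new ArrayList<>();
    long remaining = available;
    while (remaining > 0 && loader.startBatch()) {
      while (remaining > 0 && !loader.isFull()) {
        loader.addRow();
        remaining--;
      }
      batchSizes.add(loader.harvest());
      if (loader.atLimit()) {
        break; // final batch, even though the source may still have rows
      }
    }
    return batchSizes;
  }

  public static void main(String[] args) {
    // Target of 4 rows per batch, limit of 10 rows, source with 100 rows.
    System.out.println(scan(new LimitedLoader(4, 10), 100)); // prints [4, 4, 2]
  }
}
```

The last batch is short because the limit, not the target row count, ends it. The reader then stops without draining the source.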
- outputSchema
  
  public TupleMetadata outputSchema()
  
  Description copied from interface: ResultSetLoader
  
  The schema of the harvested batch. Valid until the start of the next batch.
  
  Specified by:
  
  outputSchema in interface ResultSetLoader
  
  Returns:
  
  the extended schema of the harvested batch which includes any allocation hints used when creating the batch
- activeSchema
  
  public TupleMetadata activeSchema()
  
  Description copied from interface: ResultSetLoader
  
  Returns the active output schema; the schema used by the writers, minus any unprojected columns. This is usually the same as the output schema, but may differ if the writer adds columns during an overflow row. Unlike the output schema, this schema is defined as long as the loader is open.
  
  Specified by:
  
  activeSchema in interface ResultSetLoader
- close
  
  public void close()
  
  Description copied from interface: ResultSetLoader
  
  Called after all rows are returned, whether because no more data is available, or the caller wishes to cancel the current row batch and complete.
  
  Specified by:
  
  close in interface ResultSetLoader
- batchCount
  
  public int batchCount()
  
  Description copied from interface: ResultSetLoader
  
  Total number of batches created. Includes the current batch if the row count in this batch is non-zero.
  
  Specified by:
  
  batchCount in interface ResultSetLoader
  
  Returns:
  
  the number of batches produced including the current one
- totalRowCount
  
  public long totalRowCount()
  
  Description copied from interface: ResultSetLoader
  
  Total number of rows loaded for all previous batches and the current batch.
  
  Specified by:
  
  totalRowCount in interface ResultSetLoader
  
  Returns:
  
  total row count
- rootState
  
  public TupleState.RowState rootState()
- canExpand
  
  public boolean canExpand(int delta)
- tallyAllocations
  
  public void tallyAllocations(int allocationBytes)
- dump
  
  public void dump(HierarchicalFormatter format)
- vectorCache
  
  public ResultVectorCache vectorCache()
  
  Description copied from interface: ResultSetLoader
  
  Peek at the internal vector cache for readers that need a bit of help resolving types based on what was previously seen.
  
  Specified by:
  
  vectorCache in interface ResultSetLoader
  
  Returns:
  
  real or dummy vector cache
- rowIndex
  
  public int rowIndex()
- columnBuilder
  
  public ColumnBuilder columnBuilder()
- errorContext
  
  public CustomErrorContext errorContext()
  
  Description copied from interface: ResultSetLoader
  
  Context for error messages.
  
  Specified by:
  
  errorContext in interface ResultSetLoader
