public class ScanLifecycle extends Object
The options provided in ScanLifecycleBuilder are
sufficient to drive the entire scan operator functionality.
Schema resolution and projection are done generically and are the same for all
data sources. Only the
reader (created via the factory class) differs from one type of file to
another.
The framework achieves the work described below by composing a set of fine-grained classes, each of which performs one specific task. This structure leaves the reader free to simply infer the schema and read data.
A reader may be "late schema", that is, true "schema on read." In this case, the reader simply tells the result set loader to create a new column writer on the fly. The framework works out whether that new column is projected and returns either a real column writer (projected column) or a dummy column writer (unprojected column).
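The late-schema behavior above can be pictured with a small, self-contained sketch. It is a toy model only: `ToyLoader`, `ToyColumnWriter`, `RealWriter`, and `DummyWriter` are hypothetical names invented for illustration and are not Drill classes. The assumption is simply that the projection list is known before the reader discovers its columns.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these types are part of Drill. */
public class LateSchemaSketch {

  /** Minimal stand-in for a column writer. */
  interface ToyColumnWriter {
    void setString(String value);
    boolean isProjected();
  }

  /** Writer for a projected column: values are kept. */
  static final class RealWriter implements ToyColumnWriter {
    final List<String> values = new ArrayList<>();
    @Override public void setString(String value) { values.add(value); }
    @Override public boolean isProjected() { return true; }
  }

  /** Writer for an unprojected column: values are silently discarded. */
  static final class DummyWriter implements ToyColumnWriter {
    @Override public void setString(String value) { /* discard */ }
    @Override public boolean isProjected() { return false; }
  }

  /** Stand-in for the loader: decides projection when a column first appears. */
  static final class ToyLoader {
    private final Set<String> projection;
    private final Map<String, ToyColumnWriter> writers = new LinkedHashMap<>();

    ToyLoader(Set<String> projection) { this.projection = projection; }

    /** Called by the reader when it discovers a column in the middle of a read. */
    ToyColumnWriter addColumn(String name) {
      ToyColumnWriter writer = writers.get(name);
      if (writer == null) {
        writer = projection.contains(name) ? new RealWriter() : new DummyWriter();
        writers.put(name, writer);
      }
      return writer;
    }
  }

  public static void main(String[] args) {
    ToyLoader loader = new ToyLoader(Set.of("a"));   // only column "a" is projected
    ToyColumnWriter a = loader.addColumn("a");
    ToyColumnWriter b = loader.addColumn("b");
    a.setString("kept");
    b.setString("dropped");                          // safe: the dummy writer ignores it
    System.out.println(a.isProjected() + " " + b.isProjected());
  }
}
```

Running `main` prints `true false`. The point of the design is that the reader writes to every column it finds in the same way and never needs to check the projection list itself.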
The ScanSchemaTracker resolves the scan schema over the lifetime of the
scan. See that class for details about how the scan schema evolves over
the scan lifecycle.
Implicit columns are unique to each storage plugin. At present, they
are defined only for the file system plugin. To handle such variation,
each extension defines a subclass of the ScanLifecycleBuilder class to
create the implicit columns manager (and schema negotiator) unique to
a certain kind of scan.
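The extension pattern described above, in which a builder subclass supplies the scan-specific implicit columns, might be sketched as follows. Everything here is hypothetical: `ToyScanBuilder`, `ToyFileScanBuilder`, `ToyImplicitColumns`, and `outputColumns` are illustrative stand-ins, not the real ScanLifecycleBuilder API, and the column names are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: these types merely mimic the subclass-per-scan-kind idea. */
public class BuilderExtensionSketch {

  /** Stand-in for an implicit columns manager. */
  interface ToyImplicitColumns {
    List<String> columnNames();
  }

  /** Base builder: a generic scan contributes no implicit columns. */
  static class ToyScanBuilder {
    ToyImplicitColumns newImplicitColumns() {
      return () -> List.of();
    }
  }

  /** File-scan extension: overrides the factory method to add file columns. */
  static class ToyFileScanBuilder extends ToyScanBuilder {
    @Override
    ToyImplicitColumns newImplicitColumns() {
      return () -> List.of("filename", "filepath"); // placeholder names
    }
  }

  /** Generic scan code: sees only the base builder type. */
  static List<String> outputColumns(ToyScanBuilder builder, List<String> readerColumns) {
    List<String> out = new ArrayList<>(readerColumns);
    out.addAll(builder.newImplicitColumns().columnNames());
    return out;
  }

  public static void main(String[] args) {
    System.out.println(outputColumns(new ToyScanBuilder(), List.of("a", "b")));
    System.out.println(outputColumns(new ToyFileScanBuilder(), List.of("a", "b")));
  }
}
```

In this sketch the generic code path never changes; a new kind of scan only has to provide its own builder subclass.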
Each reader is tracked by a ReaderLifecycle which handles:
ResultSetLoader for the reader.

| Constructor and Description |
|---|
| ScanLifecycle(OperatorContext context, ScanLifecycleBuilder builder) |

| Modifier and Type | Method and Description |
|---|---|
| BufferAllocator | allocator() |
| int | batchCount() |
| void | close() |
| OperatorContext | context() |
| CustomErrorContext | errorContext() |
| boolean | hasOutputSchema() |
| protected SchemaNegotiatorImpl | newNegotiator(ReaderLifecycle readerLifecycle) |
| RowBatchReader | nextReader() |
| ScanLifecycleBuilder | options() |
| TupleMetadata | outputSchema() |
| ReaderFactory<?> | readerFactory() |
| ScanSchemaTracker | schemaTracker() |
| void | tallyBatch() |
| ResultVectorCacheImpl | vectorCache() |
public ScanLifecycle(OperatorContext context, ScanLifecycleBuilder builder)
public OperatorContext context()
public ScanLifecycleBuilder options()
public ScanSchemaTracker schemaTracker()
public ResultVectorCacheImpl vectorCache()
public ReaderFactory<?> readerFactory()
public boolean hasOutputSchema()
public CustomErrorContext errorContext()
public BufferAllocator allocator()
public void tallyBatch()
public int batchCount()
public RowBatchReader nextReader()
protected SchemaNegotiatorImpl newNegotiator(ReaderLifecycle readerLifecycle)
public TupleMetadata outputSchema()
public void close()
Copyright © 2021 The Apache Software Foundation. All rights reserved.