public class ShimBatchReader extends Object implements RowBatchReader, SchemaNegotiatorImpl.NegotiatorListener
Provides the row set loader used to construct record batches.
The idea of this class is that schema construction is complex, and varies depending on the kind of reader. Rather than pack that logic into the scan operator and scan-level reader state, this class abstracts out the schema logic. This allows a variety of solutions as needed for different readers.
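The separation described above can be pictured with a small, self-contained sketch. Every type below (`BatchSource`, `LoaderListener`, `Negotiator`, `RowLoader`, `SchemaAwareReader`, `ReaderShim`, `TwoColumnReader`) is hypothetical and only mirrors the shape of the API on this page; none of it is Drill code. It assumes that the scan loop sees only a narrow batch interface, while the shim hands the reader a negotiator and builds the loader when the negotiator calls back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical: what the scan loop sees. No schema-related methods at all.
interface BatchSource {
  boolean open();
  boolean next();
}

// Hypothetical: callback through which the negotiator asks its owner
// to build the loader once the reader has declared its schema.
interface LoaderListener {
  RowLoader build(Negotiator negotiator);
}

// Hypothetical: collects declared columns, then hands off to the listener.
class Negotiator {
  final List<String> columns = new ArrayList<>();
  private final LoaderListener listener;

  Negotiator(LoaderListener listener) { this.listener = listener; }

  void addColumn(String name) { columns.add(name); }

  RowLoader build() { return listener.build(this); }
}

// Hypothetical: minimal loader that remembers the schema and counts rows.
class RowLoader {
  final List<String> schema;
  int rowCount;

  RowLoader(List<String> schema) { this.schema = schema; }

  void writeRow() { rowCount++; }
}

// Hypothetical: a format-specific reader declares its schema in open().
interface SchemaAwareReader {
  boolean open(Negotiator negotiator);
  boolean next();
}

// The shim adapts a SchemaAwareReader to BatchSource and owns the schema wiring.
class ReaderShim implements BatchSource, LoaderListener {
  private final SchemaAwareReader reader;
  RowLoader loader;

  ReaderShim(SchemaAwareReader reader) { this.reader = reader; }

  public boolean open() { return reader.open(new Negotiator(this)); }

  public boolean next() { return reader.next(); }

  public RowLoader build(Negotiator negotiator) {
    loader = new RowLoader(new ArrayList<>(negotiator.columns));
    return loader;
  }
}

// Example reader: fixed two-column schema and a single one-row batch.
// In this sketch next() returns true when it produced a batch.
class TwoColumnReader implements SchemaAwareReader {
  private RowLoader loader;
  private boolean done;

  public boolean open(Negotiator negotiator) {
    negotiator.addColumn("id");
    negotiator.addColumn("name");
    loader = negotiator.build();
    return true;
  }

  public boolean next() {
    if (done) { return false; }
    loader.writeRow();
    done = true;
    return true;
  }
}
```

In this sketch a different reader could declare its columns in another way, or only after reading data, without any change to the code that drives `BatchSource`.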
| Modifier and Type | Field and Description |
|---|---|
| protected ManagedScanFramework | framework |
| protected ManagedReader<? extends SchemaNegotiator> | reader |
| protected ReaderSchemaOrchestrator | readerOrchestrator |
| protected SchemaNegotiatorImpl | schemaNegotiator |
| protected ResultSetLoader | tableLoader |
| Constructor and Description |
|---|
| ShimBatchReader(ManagedScanFramework manager, ManagedReader<? extends SchemaNegotiator> reader) |
| Modifier and Type | Method and Description |
|---|---|
| ResultSetLoader | build(SchemaNegotiatorImpl schemaNegotiator) |
| void | close() Release resources. |
| boolean | defineSchema() Called for the first reader within a scan. |
| String | name() Name used when reporting errors. |
| boolean | next() Read the next batch. |
| boolean | open() Set up the record reader. |
| VectorContainer | output() Return the container with the reader's output. |
| ManagedReader<? extends SchemaNegotiator> | reader() |
| int | schemaVersion() Return the version of the schema returned by RowBatchReader.output(). |
protected final ManagedScanFramework framework
protected final ManagedReader<? extends SchemaNegotiator> reader
protected final ReaderSchemaOrchestrator readerOrchestrator
protected SchemaNegotiatorImpl schemaNegotiator
protected ResultSetLoader tableLoader
public ShimBatchReader(ManagedScanFramework manager, ManagedReader<? extends SchemaNegotiator> reader)
public String name()
Description copied from interface: RowBatchReader
Name used when reporting errors.
Specified by: name in interface RowBatchReader
public ManagedReader<? extends SchemaNegotiator> reader()
public boolean open()
Description copied from interface: RowBatchReader
Set up the record reader.
Specified by: open in interface RowBatchReader
public boolean defineSchema()
Description copied from interface: RowBatchReader
Called for the first reader within a scan. This step is optional and is purely for performance.
Specified by: defineSchema in interface RowBatchReader
public boolean next()
Description copied from interface: RowBatchReader
Read the next batch. The protocol is somewhat complex, but it avoids the need to allocate a final batch just to find out that no more data is available: EOF can be returned along with the final batch.
Specified by: next in interface RowBatchReader
public VectorContainer output()
Description copied from interface: RowBatchReader
Return the container with the reader's output. The output is retrieved at two points. After RowBatchReader.open(): if the data source can provide a schema at open time, then the reader should provide an empty batch with the schema set. The scanner will return this schema downstream to inform other operators of the schema. After RowBatchReader.next(): to retrieve the batch produced by that call. (No call is made if next() returns false.)
Specified by: output in interface RowBatchReader
public void close()
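Description copied from interface: RowBatchReader
Release resources.
Specified by: close in interface RowBatchReader
public int schemaVersion()
Description copied from interface: RowBatchReader
Return the version of the schema returned by RowBatchReader.output(). The schema is assumed to start at -1 (no schema). The reader is free to use any numbering system it likes as long as: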
If the reader can return a schema on open (so-called "early schema"), then this method must return a non-negative version number, even if the schema happens to be empty (such as when reading an empty file).
However, if the reader cannot return a schema on open (so-called "late schema"), then this method must return -1 (and output() must return null) to indicate that no schema is available when called before the first call to next().
No calls will be made to this method before open(), after close(), or after next() returns false. The implementation is thus not required to handle these cases.
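The early-schema and late-schema rules above can be illustrated with a minimal sketch. The `VersionedReader` interface and both reader classes are hypothetical stand-ins, not Drill classes, and a `List<String>` of column names stands in for `VectorContainer`. The sketch assumes only the contract stated here: an early-schema reader reports a non-negative version right after `open()`, while a late-schema reader reports -1 and a null output until its first `next()`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the part of RowBatchReader discussed above.
// A list of column names stands in for VectorContainer.
interface VersionedReader {
  boolean open();
  boolean next();
  List<String> output();
  int schemaVersion();
}

// Early schema: the schema is known at open time, so the version is
// non-negative immediately, before any rows have been read.
class EarlySchemaReader implements VersionedReader {
  private List<String> batch;
  private int version = -1;
  private boolean done;

  public boolean open() {
    batch = Arrays.asList("id", "name"); // empty batch carrying only the schema
    version = 0;
    return true;
  }

  public boolean next() {
    if (done) { return false; }
    done = true;      // one batch, same schema, so the version is unchanged
    return true;
  }

  public List<String> output() { return batch; }

  public int schemaVersion() { return version; }
}

// Late schema: nothing is known until the first batch has been read.
class LateSchemaReader implements VersionedReader {
  private List<String> batch; // stays null until the first next()
  private int version = -1;
  private boolean done;

  public boolean open() { return true; }

  public boolean next() {
    if (done) { return false; }
    batch = Arrays.asList("id", "name"); // schema discovered from the data
    version = 0;
    done = true;
    return true;
  }

  public List<String> output() { return batch; }

  public int schemaVersion() { return version; }
}

class SchemaVersionDemo {
  public static void main(String[] args) {
    for (VersionedReader r : new VersionedReader[] {
        new EarlySchemaReader(), new LateSchemaReader() }) {
      r.open();
      System.out.println(r.getClass().getSimpleName()
          + " after open: version=" + r.schemaVersion()
          + ", output=" + r.output());
    }
  }
}
```

A caller can therefore tell the two cases apart right after `open()` by checking whether `schemaVersion()` is negative, without inspecting the batch itself.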
Specified by: schemaVersion in interface RowBatchReader
public ResultSetLoader build(SchemaNegotiatorImpl schemaNegotiator)
Specified by: build in interface SchemaNegotiatorImpl.NegotiatorListener
Copyright © 2021 The Apache Software Foundation. All rights reserved.