Class ParquetRecordReader

java.lang.Object
  org.apache.drill.exec.store.AbstractRecordReader
    org.apache.drill.exec.store.CommonParquetRecordReader
      org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader

All Implemented Interfaces:
  AutoCloseable, RecordReader
Nested Class Summary

Nested classes/interfaces inherited from class org.apache.drill.exec.store.CommonParquetRecordReader:
  CommonParquetRecordReader.Metric

Field Summary

Fields inherited from class org.apache.drill.exec.store.CommonParquetRecordReader:
  footer, fragmentContext, NUM_RECORDS_TO_READ_NOT_SPECIFIED, operatorContext, parquetReaderStats

Fields inherited from class org.apache.drill.exec.store.AbstractRecordReader:
  DEFAULT_TEXT_COLS_TO_READ

Fields inherited from interface org.apache.drill.exec.store.RecordReader:
  ALLOCATOR_INITIAL_RESERVATION, ALLOCATOR_MAX_RESERVATION
Constructor Summary

ParquetRecordReader(FragmentContext fragmentContext, long numRecordsToRead, org.apache.hadoop.fs.Path path, int rowGroupIndex, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)

ParquetRecordReader(FragmentContext fragmentContext, org.apache.hadoop.fs.Path path, int rowGroupIndex, long numRecordsToRead, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)

ParquetRecordReader(FragmentContext fragmentContext, org.apache.hadoop.fs.Path path, int rowGroupIndex, org.apache.hadoop.fs.FileSystem fs, org.apache.parquet.compression.CompressionCodecFactory codecFactory, org.apache.parquet.hadoop.metadata.ParquetMetadata footer, List<SchemaPath> columns, ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
Method Summary

void allocate(Map<String, ValueVector> vectorMap)
void close()
org.apache.parquet.compression.CompressionCodecFactory getCodecFactory()
getDateCorruptionStatus()
    Flag indicating if the old non-standard data format appears in this file; see DRILL-4203.
protected List<SchemaPath> getDefaultColumnsToRead()
org.apache.hadoop.fs.FileSystem getFileSystem()
org.apache.hadoop.fs.Path getHadoopPath()
int getRowGroupIndex()
int next()
    Read the next record batch from the file using the reader and read state created previously.
void setup(OperatorContext operatorContext, OutputMutator output)
    Prepare the Parquet reader.
String toString()
boolean useBulkReader()

Methods inherited from class org.apache.drill.exec.store.CommonParquetRecordReader:
  closeStats, handleAndRaise, initNumRecordsToRead, updateRowGroupsStats

Methods inherited from class org.apache.drill.exec.store.AbstractRecordReader:
  getColumns, hasNext, isSkipQuery, isStarQuery, setColumns, transformColumns
Constructor Details

ParquetRecordReader

public ParquetRecordReader(FragmentContext fragmentContext,
                           org.apache.hadoop.fs.Path path,
                           int rowGroupIndex,
                           long numRecordsToRead,
                           org.apache.hadoop.fs.FileSystem fs,
                           org.apache.parquet.compression.CompressionCodecFactory codecFactory,
                           org.apache.parquet.hadoop.metadata.ParquetMetadata footer,
                           List<SchemaPath> columns,
                           ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)

ParquetRecordReader

public ParquetRecordReader(FragmentContext fragmentContext,
                           org.apache.hadoop.fs.Path path,
                           int rowGroupIndex,
                           org.apache.hadoop.fs.FileSystem fs,
                           org.apache.parquet.compression.CompressionCodecFactory codecFactory,
                           org.apache.parquet.hadoop.metadata.ParquetMetadata footer,
                           List<SchemaPath> columns,
                           ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)

ParquetRecordReader

public ParquetRecordReader(FragmentContext fragmentContext,
                           long numRecordsToRead,
                           org.apache.hadoop.fs.Path path,
                           int rowGroupIndex,
                           org.apache.hadoop.fs.FileSystem fs,
                           org.apache.parquet.compression.CompressionCodecFactory codecFactory,
                           org.apache.parquet.hadoop.metadata.ParquetMetadata footer,
                           List<SchemaPath> columns,
                           ParquetReaderUtility.DateCorruptionStatus dateCorruptionStatus)
Method Details

getDateCorruptionStatus

Flag indicating if the old non-standard data format appears in this file; see DRILL-4203.

Returns:
  true if the dates are corrupted and need to be corrected
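For context, DRILL-4203 reported that affected writers stored each date shifted by twice the Julian day number of the Unix epoch. A minimal sketch of undoing that shift, assuming that offset (the class, constant names, and helper below are illustrative, not Drill's actual API):

```java
// Illustrative sketch of the DRILL-4203 date corruption fix.
// Assumption: corrupt writers added 2 * 2440588 (twice the Julian day
// number of 1970-01-01) to each date stored as days since the Unix epoch.
public class CorruptDateSketch {
    static final int JULIAN_DAY_OF_EPOCH = 2440588;   // Julian day of 1970-01-01
    static final int CORRUPT_DATE_SHIFT = 2 * JULIAN_DAY_OF_EPOCH; // 4881176

    // Undo the corruption for one date value (days since the Unix epoch).
    static int correctCorruptDate(int corruptDays) {
        return corruptDays - CORRUPT_DATE_SHIFT;
    }

    public static void main(String[] args) {
        // A corrupt file stores 1970-01-01 as 4881176 days.
        System.out.println(correctCorruptDate(4881176)); // prints 0
    }
}
```

A reader that sees the corruption status flagged would apply such a shift to every value in a date column before handing the batch downstream.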
getCodecFactory

public org.apache.parquet.compression.CompressionCodecFactory getCodecFactory()

getHadoopPath

public org.apache.hadoop.fs.Path getHadoopPath()

getFileSystem

public org.apache.hadoop.fs.FileSystem getFileSystem()

getRowGroupIndex

public int getRowGroupIndex()

getBatchSizesMgr

getOperatorContext

getFragmentContext

useBulkReader

public boolean useBulkReader()

Returns:
  true if Parquet reader bulk processing is enabled; false otherwise

getReadState
setup

public void setup(OperatorContext operatorContext, OutputMutator output)
            throws ExecutionSetupException

Prepare the Parquet reader. First, determine the set of columns to read (the schema for this read). Then create a state object to track the read across calls to the reader's next() method. Finally, create one of three readers to read batches, depending on whether this scan covers only fixed-width fields, contains at least one variable-width field, or is a "mock" scan consisting only of null fields (fields in the SELECT clause but not in the Parquet file).

Parameters:
  operatorContext - operator context for the reader
  output - the place where output for a particular scan should be written. The record reader is responsible for mutating the set of schema values for that particular record.

Throws:
  ExecutionSetupException
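The three-way reader selection described for setup() can be sketched as follows; the class, method, and return labels here are illustrative stand-ins, not Drill's internal reader classes:

```java
// Hypothetical sketch of setup()'s three-way reader choice.
public class ReaderChoiceSketch {
    // allColumnsMissing: every SELECT column is absent from the Parquet file.
    // anyVariableWidth: at least one projected column is variable-width.
    static String chooseReader(boolean allColumnsMissing, boolean anyVariableWidth) {
        if (allColumnsMissing) {
            return "mock";     // produce null-filled batches only
        }
        if (anyVariableWidth) {
            return "variable"; // at least one variable-width field
        }
        return "fixed";        // fixed-width fields only
    }

    public static void main(String[] args) {
        System.out.println(chooseReader(false, true)); // prints "variable"
    }
}
```

The fixed-width-only case allows the reader to compute batch sizes up front, which is presumably why it is split out from the variable-width path.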
allocate

public void allocate(Map<String, ValueVector> vectorMap)
            throws OutOfMemoryException

Specified by:
  allocate in interface RecordReader
Overrides:
  allocate in class AbstractRecordReader
Throws:
  OutOfMemoryException
next

public int next()

Read the next record batch from the file using the reader and read state created previously.

Returns:
  the number of additional records added to the output
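Since next() returns the number of records added per batch, a caller drains the reader by looping until it returns 0. A self-contained sketch of that loop, using an illustrative stand-in interface rather than Drill's RecordReader:

```java
// Minimal stand-in for the batch-read contract: next() returns the number of
// records added to the output, and 0 signals end of data. BatchReader is an
// illustrative interface, not Drill's.
interface BatchReader {
    int next();
}

public class ReadLoopSketch {
    // Drain a reader batch by batch, returning the total record count.
    static int readAll(BatchReader reader) {
        int total = 0;
        int batch;
        while ((batch = reader.next()) > 0) {
            total += batch; // a real consumer would process the output vectors here
        }
        return total;
    }

    public static void main(String[] args) {
        // Fake reader producing batches of 3 and 2 records, then end-of-data.
        java.util.ArrayDeque<Integer> batches =
            new java.util.ArrayDeque<>(java.util.List.of(3, 2, 0));
        System.out.println(readAll(batches::poll)); // prints 5
    }
}
```

In Drill itself the loop is driven by the scan operator, which calls setup() once, next() until exhaustion, and then close().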
close

public void close()

getDefaultColumnsToRead

Overrides:
  getDefaultColumnsToRead in class AbstractRecordReader

toString

Overrides:
  toString in class AbstractRecordReader