Class BaseParquetMetadataProvider
java.lang.Object
org.apache.drill.exec.store.parquet.BaseParquetMetadataProvider
- All Implemented Interfaces:
ParquetMetadataProvider, TableMetadataProvider
- Direct Known Subclasses:
DeltaParquetTableMetadataProvider, HiveParquetTableMetadataProvider, ParquetTableMetadataProviderImpl
Implementation of ParquetMetadataProvider which contains base methods for obtaining metadata from parquet statistics.
-
Nested Class Summary
Modifier and Type, Class, Description
static class
-
Field Summary
Modifier and Type, Field, Description
protected final List<ReadEntryWithPath> entries
protected Set<org.apache.hadoop.fs.Path> fileSet
static final Object NULL_VALUE
HashBasedTable cannot contain nulls; this object is used to represent null values.
protected MetadataBase.ParquetTableMetadataBase parquetTableMetadata
protected final ParquetReaderConfig readerConfig
protected TupleMetadata schema
protected DrillStatsTable statsTable
protected final org.apache.hadoop.fs.Path tableLocation
protected final String tableName
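The NULL_VALUE field reflects a common pattern: when a container rejects nulls, a dedicated marker object is stored in their place and translated back on read. The following is a minimal sketch of that pattern, not the class's actual code. It uses java.util.concurrent.ConcurrentHashMap as a stand-in because it also rejects null values and needs no extra dependency; the class name NullSentinelSketch and its methods are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the null-sentinel pattern. ConcurrentHashMap is a stand-in here:
// like Guava's HashBasedTable, it does not accept null values.
final class NullSentinelSketch {
    // Plays the role of NULL_VALUE: a unique marker object meaning "null".
    static final Object NULL_VALUE = new Object();

    private final Map<String, Object> values = new ConcurrentHashMap<>();

    // Stores a possibly-null value by swapping null for the sentinel.
    void put(String key, Object value) {
        values.put(key, value == null ? NULL_VALUE : value);
    }

    // Reads a value back, translating the sentinel to null again.
    Object get(String key) {
        Object stored = values.get(key);
        return stored == NULL_VALUE ? null : stored;
    }

    // Distinguishes "stored as null" from "never stored".
    boolean contains(String key) {
        return values.containsKey(key);
    }

    public static void main(String[] args) {
        NullSentinelSketch stats = new NullSentinelSketch();
        stats.put("minValue", 10);
        // Putting a raw null into the underlying map would throw NullPointerException.
        stats.put("maxValue", null);
        System.out.println(stats.get("minValue"));
        System.out.println(stats.get("maxValue"));
        System.out.println(stats.contains("maxValue"));
    }
}
```

Comparing with `==` is deliberate: the sentinel is identified by reference, so no legitimate stored value can be mistaken for it.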
-
Constructor Summary
Modifier, Constructor, Description
protected
-
Method Summary
Modifier and Type, Method, Description
boolean checkMetadataVersion()
Whether metadata should be checked for being up to date.
getEntries()
Returns list of ReadEntryWithPath instances which represent paths to files to be scanned.
getFileMetadata(org.apache.hadoop.fs.Path location)
Returns FileMetadata instance which corresponds to metadata of file for specified location.
Set<org.apache.hadoop.fs.Path> getFileSet()
Returns list of file locations for table.
getFilesForPartition(PartitionMetadata partition)
Returns list of FileMetadata instances which belong to specified partitions.
Map<org.apache.hadoop.fs.Path, FileMetadata> getFilesMetadataMap()
Returns map of FileMetadata instances which provide metadata for specific file and its columns.
List<org.apache.hadoop.fs.Path> getLocations()
Returns list of file paths which belong to current table.
getNonInterestingColumnsMetadata()
Returns NonInterestingColumnsMetadata instance which provides metadata for non-interesting columns.
getPartitionColumns()
Returns list of partition columns for table from this TableMetadataProvider.
getPartitionMetadata(SchemaPath columnName)
Returns list of PartitionMetadata instances which correspond to partitions for specified column and provide metadata for specific partitions and their columns.
getPartitionsMetadata()
Returns list of PartitionMetadata instances which provide metadata for specific partitions and their columns.
getRowGroupsMeta()
Returns list of RowGroupMetadata instances which provide metadata for specific row group and its columns.
org.apache.drill.shaded.guava.com.google.common.collect.Multimap<org.apache.hadoop.fs.Path, RowGroupMetadata> getRowGroupsMetadataMap()
Returns multimap of RowGroupMetadata instances which provide metadata for specific row group and its columns mapped to their locations.
Map<org.apache.hadoop.fs.Path, SegmentMetadata> getSegmentsMetadataMap()
Returns map of SegmentMetadata instances which provide metadata for segment and its columns.
getTableMetadata()
Returns TableMetadata instance which provides metadata for the table and its columns.
protected void init(BaseParquetMetadataProvider metadataProvider)
void initializeMetadata()
Method which initializes all metadata kinds to get rid of parquetTableMetadata.
protected abstract void initInternal()
-
Field Details
-
NULL_VALUE
HashBasedTable cannot contain nulls; this object is used to represent null values.
-
entries
-
readerConfig
-
tableName
-
tableLocation
protected final org.apache.hadoop.fs.Path tableLocation
-
parquetTableMetadata
-
fileSet
-
schema
-
statsTable
-
-
Constructor Details
-
BaseParquetMetadataProvider
-
-
Method Details
-
init
- Throws:
IOException
-
initializeMetadata
public void initializeMetadata()
Method which initializes all metadata kinds to get rid of parquetTableMetadata. Once deserialization and serialization from/into metastore classes is done, this method should be removed to allow lazy initialization.
-
getNonInterestingColumnsMetadata
Description copied from interface: TableMetadataProvider
Returns NonInterestingColumnsMetadata instance which provides metadata for non-interesting columns.
- Specified by:
getNonInterestingColumnsMetadata in interface TableMetadataProvider
- Returns:
NonInterestingColumnsMetadata instance
-
getTableMetadata
Description copied from interface: TableMetadataProvider
Returns TableMetadata instance which provides metadata for the table and its columns.
- Specified by:
getTableMetadata in interface TableMetadataProvider
- Returns:
TableMetadata instance
-
getPartitionColumns
Description copied from interface: TableMetadataProvider
Returns list of partition columns for table from this TableMetadataProvider.
- Specified by:
getPartitionColumns in interface TableMetadataProvider
- Returns:
list of partition columns
-
getPartitionsMetadata
Description copied from interface: TableMetadataProvider
Returns list of PartitionMetadata instances which provide metadata for specific partitions and their columns.
- Specified by:
getPartitionsMetadata in interface TableMetadataProvider
- Returns:
list of PartitionMetadata instances
-
getPartitionMetadata
Description copied from interface: TableMetadataProvider
Returns list of PartitionMetadata instances which correspond to partitions for specified column and provide metadata for specific partitions and their columns.
- Specified by:
getPartitionMetadata in interface TableMetadataProvider
- Returns:
list of PartitionMetadata instances which correspond to partitions for specified column
-
getFileMetadata
Description copied from interface: TableMetadataProvider
Returns FileMetadata instance which corresponds to metadata of file for specified location.
- Specified by:
getFileMetadata in interface TableMetadataProvider
- Parameters:
location - location of the file
- Returns:
FileMetadata instance which corresponds to metadata of file for specified location
-
getFilesForPartition
Description copied from interface: TableMetadataProvider
Returns list of FileMetadata instances which belong to specified partitions.
- Specified by:
getFilesForPartition in interface TableMetadataProvider
- Parameters:
partition - partition which
- Returns:
list of FileMetadata instances which belong to specified partitions
-
getSegmentsMetadataMap
Description copied from interface: TableMetadataProvider
Returns map of SegmentMetadata instances which provide metadata for segment and its columns.
- Specified by:
getSegmentsMetadataMap in interface TableMetadataProvider
- Returns:
map of SegmentMetadata instances
-
getFilesMetadataMap
Description copied from interface: TableMetadataProvider
Returns map of FileMetadata instances which provide metadata for specific file and its columns.
- Specified by:
getFilesMetadataMap in interface TableMetadataProvider
- Returns:
map of FileMetadata instances
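To illustrate how a path-keyed map of file metadata relates to the per-file and per-partition accessors described here, the following sketch uses plain Java collections. It is an assumption-laden illustration, not the class's implementation: FileInfo and PartitionInfo are hypothetical stand-ins for FileMetadata and PartitionMetadata, String paths stand in for org.apache.hadoop.fs.Path, and the idea that a partition exposes the locations of its files is assumed for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all types here are hypothetical stand-ins.
final class FileLookupSketch {
    // Stand-in for FileMetadata: a path plus one statistic.
    static final class FileInfo {
        final String path;
        final long rowCount;

        FileInfo(String path, long rowCount) {
            this.path = path;
            this.rowCount = rowCount;
        }
    }

    // Stand-in for PartitionMetadata: assumed to know its file locations.
    static final class PartitionInfo {
        final List<String> locations;

        PartitionInfo(List<String> locations) {
            this.locations = locations;
        }
    }

    // Plays the role of the map returned by getFilesMetadataMap().
    private final Map<String, FileInfo> filesByPath = new LinkedHashMap<>();

    void addFile(FileInfo file) {
        filesByPath.put(file.path, file);
    }

    // Single-file lookup, in the spirit of getFileMetadata(location).
    // Returns null when the location is unknown.
    FileInfo fileFor(String location) {
        return filesByPath.get(location);
    }

    // Files of one partition, in the spirit of getFilesForPartition(partition).
    // Locations without a known file are skipped.
    List<FileInfo> filesFor(PartitionInfo partition) {
        List<FileInfo> result = new ArrayList<>();
        for (String location : partition.locations) {
            FileInfo file = filesByPath.get(location);
            if (file != null) {
                result.add(file);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FileLookupSketch provider = new FileLookupSketch();
        provider.addFile(new FileInfo("/table/2019/a.parquet", 100));
        provider.addFile(new FileInfo("/table/2019/b.parquet", 50));
        provider.addFile(new FileInfo("/table/2020/c.parquet", 70));

        PartitionInfo year2019 = new PartitionInfo(
            Arrays.asList("/table/2019/a.parquet", "/table/2019/b.parquet"));

        System.out.println(provider.fileFor("/table/2020/c.parquet").rowCount);
        System.out.println(provider.filesFor(year2019).size());
    }
}
```

Keying by location keeps the single-file lookup a constant-time map access, and the per-partition view is derived from the same map rather than stored separately.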
-
getEntries
Description copied from interface: ParquetMetadataProvider
Returns list of ReadEntryWithPath instances which represent paths to files to be scanned.
- Specified by:
getEntries in interface ParquetMetadataProvider
- Returns:
list of ReadEntryWithPath instances with file paths
-
getFileSet
Description copied from interface: ParquetMetadataProvider
Returns list of file locations for table.
- Specified by:
getFileSet in interface ParquetMetadataProvider
- Returns:
list of file locations for table
-
getRowGroupsMeta
Description copied from interface: ParquetMetadataProvider
Returns list of RowGroupMetadata instances which provide metadata for specific row group and its columns.
- Specified by:
getRowGroupsMeta in interface ParquetMetadataProvider
- Returns:
list of RowGroupMetadata instances
-
getLocations
Description copied from interface: ParquetMetadataProvider
Returns list of file paths which belong to current table.
- Specified by:
getLocations in interface ParquetMetadataProvider
- Returns:
list of file paths
-
getRowGroupsMetadataMap
public org.apache.drill.shaded.guava.com.google.common.collect.Multimap<org.apache.hadoop.fs.Path,RowGroupMetadata> getRowGroupsMetadataMap()
Description copied from interface: ParquetMetadataProvider
Returns multimap of RowGroupMetadata instances which provide metadata for specific row group and its columns mapped to their locations.
- Specified by:
getRowGroupsMetadataMap in interface ParquetMetadataProvider
- Returns:
multimap of RowGroupMetadata instances
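A multimap is needed here because one Parquet file can contain several row groups, so a single location maps to many metadata entries. The sketch below shows that shape using only the standard library: a Map from path to List stands in for Guava's Multimap, and RowGroupInfo is a hypothetical stand-in for RowGroupMetadata. It is meant as an illustration of the data structure, not of how this class builds it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; Map<String, List<...>> stands in for Multimap.
final class RowGroupIndexSketch {
    // Hypothetical stand-in for RowGroupMetadata.
    static final class RowGroupInfo {
        final String path;
        final int rowGroupIndex;
        final long rowCount;

        RowGroupInfo(String path, int rowGroupIndex, long rowCount) {
            this.path = path;
            this.rowGroupIndex = rowGroupIndex;
            this.rowCount = rowCount;
        }
    }

    // Groups row groups by the location of the file that contains them.
    static Map<String, List<RowGroupInfo>> groupByLocation(List<RowGroupInfo> rowGroups) {
        Map<String, List<RowGroupInfo>> byLocation = new LinkedHashMap<>();
        for (RowGroupInfo rowGroup : rowGroups) {
            byLocation.computeIfAbsent(rowGroup.path, key -> new ArrayList<>()).add(rowGroup);
        }
        return byLocation;
    }

    // Sums the row counts of all row groups stored at one location.
    static long rowCountFor(Map<String, List<RowGroupInfo>> byLocation, String path) {
        long total = 0;
        for (RowGroupInfo rowGroup : byLocation.getOrDefault(path, Collections.emptyList())) {
            total += rowGroup.rowCount;
        }
        return total;
    }

    public static void main(String[] args) {
        List<RowGroupInfo> rowGroups = Arrays.asList(
            new RowGroupInfo("/table/a.parquet", 0, 1000),
            new RowGroupInfo("/table/a.parquet", 1, 500),
            new RowGroupInfo("/table/b.parquet", 0, 200));

        Map<String, List<RowGroupInfo>> byLocation = groupByLocation(rowGroups);
        System.out.println(byLocation.get("/table/a.parquet").size());
        System.out.println(rowCountFor(byLocation, "/table/a.parquet"));
    }
}
```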
-
checkMetadataVersion
public boolean checkMetadataVersion()
Description copied from interface: TableMetadataProvider
Whether metadata should be checked for being up to date.
- Specified by:
checkMetadataVersion in interface TableMetadataProvider
- Returns:
true if metadata should be checked for being up to date
-
initInternal
- Throws:
IOException
-