Class DrillStatsTable
java.lang.Object
org.apache.drill.exec.planner.common.DrillStatsTable
Wraps the stats table information, including the schema name and table name. Also materializes stats from storage
and keeps them in memory.
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class
static class
static class
static class
static class
static class
static class
    Struct which contains the statistics for the entire directory structure
static enum
static class
-
Field Summary
Fields -
Constructor Summary
Constructors
Constructor, Description
DrillStatsTable(DrillStatsTable.TableStatistics statistics)
DrillStatsTable(DrillTable table, String schemaName, String tableName, org.apache.hadoop.fs.Path tablePath, org.apache.hadoop.fs.FileSystem fs)
-
Method Summary
Modifier and Type, Method, Description
static PhysicalPlan direct(QueryContext context, boolean outcome, String message, Object... values)
static DrillStatsTable.TableStatistics generateDirectoryStructure(String dirComputedTime, List<DrillStatsTable.ColumnStatistics> columnStatisticsList)
static List<StatisticsHolder<?>> getEstimatedColumnStats(DrillStatsTable statsProvider, SchemaPath fieldName)
    Returns list of StatisticsKind and statistics values obtained from the specified DrillStatsTable for the specified column.
static List<StatisticsHolder<?>> getEstimatedTableStats(DrillStatsTable statsProvider)
    Returns list of StatisticsKind and statistics values obtained from the specified DrillStatsTable.
getHistogram(SchemaPath column)
    Get the histogram of a given column.
static com.fasterxml.jackson.databind.ObjectMapper getMapper()
    Returns the statistics (de)serializer, which can be used to (de)serialize DrillStatsTable.TableStatistics from/to JSON.
getNdv(SchemaPath col)
    Get the approximate number of distinct values of the given column.
getNNRowCount(SchemaPath col)
    Get the non-null row count for the column. If stats are not present for the given column, null is returned.
getRowCount()
    Get the row count of the table.
boolean isMaterialized()
void materialize()
    Read the stats from storage and keep them in memory.
static PhysicalPlan notRequired(QueryContext context, String tbl)
static PhysicalPlan notSupported(QueryContext context, String tbl)
-
Field Details
-
CURRENT_VERSION
-
NUM_HISTOGRAM_BUCKETS
public static final int NUM_HISTOGRAM_BUCKETS
-
Constructor Details
-
DrillStatsTable
public DrillStatsTable(DrillTable table, String schemaName, String tableName, org.apache.hadoop.fs.Path tablePath, org.apache.hadoop.fs.FileSystem fs)
-
DrillStatsTable
-
-
Method Details
-
getSchemaName
-
getTableName
-
isMaterialized
public boolean isMaterialized()
-
getNdv
Get the approximate number of distinct values of the given column. If stats are not present for the given column, null is returned. Note: the returned data may not be accurate. Accuracy depends on whether the table data has changed since the stats were computed.
Parameters:
col - column for which the approximate distinct count is desired
Returns:
approximate distinct count of the column, if available; null otherwise.
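Because the lookup can return null, callers need a guard before using the value. The following is a minimal sketch, not Drill code: NdvSource is a hypothetical stand-in for this class, the boxed Double return type is an assumption (the return type is not shown in this listing), and a plain String replaces SchemaPath to keep the example self-contained.

```java
import java.util.Map;

// Hypothetical stand-in for the getNdv lookup; not part of Drill.
interface NdvSource {
  Double getNdv(String col); // assumed to return null when no stats exist
}

class NdvFallbackSketch {
  // Returns the stored NDV, or the caller-supplied default when stats are absent.
  static double ndvOrDefault(NdvSource stats, String col, double defaultNdv) {
    Double ndv = stats.getNdv(col);
    return ndv != null ? ndv : defaultNdv;
  }

  public static void main(String[] args) {
    // Illustrative column names and values only.
    Map<String, Double> stored = Map.of("o_custkey", 1500.0);
    NdvSource stats = stored::get;
    System.out.println(ndvOrDefault(stats, "o_custkey", 100.0)); // stats present
    System.out.println(ndvOrDefault(stats, "o_comment", 100.0)); // falls back
  }
}
```

Keeping the fallback at the call site makes the "no stats" case explicit instead of risking a NullPointerException on unboxing.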
-
getColumns
-
getRowCount
Get the row count of the table. Returns null if stats are not present. Note: the returned data may not be accurate. Accuracy depends on whether the table data has changed since the stats were computed.
Returns:
row count for the table, if available; null otherwise.
-
getNNRowCount
Get the non-null row count for the column. If stats are not present for the given column, null is returned. Note: the returned data may not be accurate. Accuracy depends on whether the table data has changed since the stats were computed.
Parameters:
col - column for which the non-null row count is desired
Returns:
non-null row count of the column, if available; null otherwise.
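One possible use of the table row count together with a column's non-null row count is to derive the fraction of null values in that column. This is a sketch under stated assumptions: NullFractionSketch and nullFraction are hypothetical names, and both inputs are modeled as nullable Double values because the return types are not shown in this listing.

```java
import java.util.OptionalDouble;

// Hypothetical helper; not part of Drill.
class NullFractionSketch {
  // Both arguments mirror the nullable values described above (Double is an assumed type).
  static OptionalDouble nullFraction(Double rowCount, Double nnRowCount) {
    if (rowCount == null || nnRowCount == null || rowCount <= 0) {
      return OptionalDouble.empty(); // stats missing or table reported as empty
    }
    double fraction = 1.0 - nnRowCount / rowCount;
    // Clamp defensively into [0, 1] in case the two values are inconsistent.
    return OptionalDouble.of(Math.max(0.0, Math.min(1.0, fraction)));
  }
}
```

Returning OptionalDouble rather than a sentinel keeps "no statistics" distinct from "no nulls".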
-
getHistogram
Get the histogram of a given column. If stats are not present for the given column, null is returned. Note: the returned data may not be accurate. Accuracy depends on whether the table data has changed since the stats were computed.
Parameters:
column - path to the column whose histogram should be obtained
Returns:
histogram for this column
-
materialize
public void materialize()
Read the stats from storage and keep them in memory.
-
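The relationship between isMaterialized() and materialize() can be pictured as a load-once cache. The sketch below only illustrates that pattern and is not Drill's implementation: LazyStatsSketch is a hypothetical class, the Supplier stands in for reading the stats from storage, and the Double row count is an assumed type.

```java
import java.util.function.Supplier;

// Hypothetical illustration of load-once caching; not Drill's implementation.
class LazyStatsSketch {
  private final Supplier<Double> loader; // stands in for reading stats from storage
  private boolean materialized = false;
  private Double rowCount;               // null until loaded, or when stats are absent

  LazyStatsSketch(Supplier<Double> loader) {
    this.loader = loader;
  }

  boolean isMaterialized() {
    return materialized;
  }

  // Reads from the backing source at most once and caches the result in memory.
  void materialize() {
    if (materialized) {
      return;
    }
    rowCount = loader.get();
    materialized = true;
  }

  Double getRowCount() {
    return rowCount;
  }
}
```

In this sketch, getters called before materialize() return null, which matches the "stats are not present" behavior described for the getters on this page.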
generateDirectoryStructure
public static DrillStatsTable.TableStatistics generateDirectoryStructure(String dirComputedTime, List<DrillStatsTable.ColumnStatistics> columnStatisticsList)
-
direct
public static PhysicalPlan direct(QueryContext context, boolean outcome, String message, Object... values)
-
notSupported
-
notRequired
-
getMapper
public static com.fasterxml.jackson.databind.ObjectMapper getMapper()
Returns the statistics (de)serializer, which can be used to (de)serialize DrillStatsTable.TableStatistics from/to JSON.
-
getEstimatedTableStats
Returns list of StatisticsKind and statistics values obtained from the specified DrillStatsTable.
Parameters:
statsProvider - the source of statistics
Returns:
list of StatisticsKind and statistics values
-
getEstimatedColumnStats
public static List<StatisticsHolder<?>> getEstimatedColumnStats(DrillStatsTable statsProvider, SchemaPath fieldName)
Returns list of StatisticsKind and statistics values obtained from the specified DrillStatsTable for the specified column.
Parameters:
statsProvider - the source of statistics
fieldName - name of the column whose statistics should be obtained
Returns:
list of StatisticsKind and statistics values
-