Class TextFormatPlugin
java.lang.Object
org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin<TextFormatConfig>
org.apache.drill.exec.store.easy.text.TextFormatPlugin
- All Implemented Interfaces:
FormatPlugin
Text format plugin for CSV and other delimited text formats.
Allows use of a "provided schema", including table properties
on that schema that override the "static" (or "default") properties
defined in the plugin config. This allows, say, a set of ".csv" files
in which some have no headers (the default) and some do have
headers (as specified via table properties in the provided schema).
Uses the scan framework and the result set loader mechanism to allow tight control over the size of produced batches, as well as to support the provided schema.
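The override behavior described above can be pictured with a minimal, self-contained sketch. It does not use Drill's classes: `TextPropsSketch`, `resolve`, and the property keys `hasHeaders` and `delimiter` are hypothetical names chosen for illustration, and the actual property names are presumably those held by the `*_PROP` constants of this class. The sketch assumes table properties are plain string key/value pairs that take precedence over the plugin config defaults.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: this is not Drill's implementation.
class TextPropsSketch {

  // Returns the effective settings: plugin defaults, overridden by any
  // table properties that are present. Neither input map is modified.
  static Map<String, String> resolve(Map<String, String> pluginDefaults,
                                     Map<String, String> tableProps) {
    Map<String, String> effective = new HashMap<>(pluginDefaults);
    effective.putAll(tableProps);
    return effective;
  }

  public static void main(String[] args) {
    // Plugin config defaults (hypothetical keys): no headers, comma-delimited.
    Map<String, String> defaults = new HashMap<>();
    defaults.put("hasHeaders", "false");
    defaults.put("delimiter", ",");

    // Table properties from a provided schema for one particular table.
    Map<String, String> tableProps = new HashMap<>();
    tableProps.put("hasHeaders", "true");

    Map<String, String> effective = resolve(defaults, tableProps);
    System.out.println("hasHeaders = " + effective.get("hasHeaders"));
    System.out.println("delimiter = " + effective.get("delimiter"));
  }
}
```

In this sketch a table without a provided schema would pass an empty property map and simply get the plugin defaults back.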
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin
EasyFormatPlugin.EasyFormatConfig, EasyFormatPlugin.EasyFormatConfigBuilder, EasyFormatPlugin.ScanFrameworkVersion
-
Field Summary
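Fields
Modifier and Type     Field
static final String   COMMENT_CHAR_PROP
static final String   DELIMITER_PROP
static final String   HAS_HEADERS_PROP
static final String   LINE_DELIM_PROP
static final int      MAX_CHARS_PER_COLUMN
static final int      MAXIMUM_NUMBER_COLUMNS
static final char     NULL_CHAR
static final String   PARSE_UNESCAPED_QUOTES_PROP
static final String   QUOTE_ESCAPE_PROP
static final String   QUOTE_PROP
static final String   SKIP_FIRST_LINE_PROP
static final String   TEXT_PREFIX
static final String   TRIM_WHITESPACE_PROP
static final String   WRITER_OPERATOR_TYPE
Fields inherited from class org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin
formatConfig
-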
Constructor Summary
Constructors
Constructor
TextFormatPlugin(String name, DrillbitContext context, org.apache.hadoop.conf.Configuration fsConf, StoragePluginConfig config, TextFormatConfig formatPluginConfig)
-
Method Summary
Modifier and Type, Method, Description
protected void configureScan(FileScanLifecycleBuilder builder, EasySubScan scan)
    Configure an EVF (v2) scan, which must at least include the factory to create readers.
AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, MetadataProviderManager metadataProviderManager)
AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, OptionManager options, MetadataProviderManager metadataProviderManager)
getRecordWriter(FragmentContext context, EasyWriter writer)
protected ScanStats getScanStats(PlannerSettings settings, EasyGroupScan scan)
Methods inherited from class org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin
easyConfig, frameworkBuilder, getConfig, getContext, getFsConf, getGroupScan, getMatcher, getName, getOptimizerRules, getReaderBatch, getReaderOperatorType, getRecordReader, getStatisticsRecordWriter, getStorageConfig, getWriter, getWriterBatch, getWriterOperatorType, initScanBuilder, isBlockSplittable, isCompressible, isStatisticsRecordWriter, newBatchReader, readStatistics, scanVersion, supportsAutoPartitioning, supportsFileImplicitColumns, supportsLimitPushdown, supportsPushDown, supportsRead, supportsStatistics, supportsWrite, writeStatistics
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.drill.exec.store.dfs.FormatPlugin
getGroupScan, getOptimizerRules
-
Field Details
-
MAXIMUM_NUMBER_COLUMNS
public static final int MAXIMUM_NUMBER_COLUMNS
-
MAX_CHARS_PER_COLUMN
public static final int MAX_CHARS_PER_COLUMN
-
NULL_CHAR
public static final char NULL_CHAR
-
TEXT_PREFIX
-
HAS_HEADERS_PROP
-
SKIP_FIRST_LINE_PROP
-
DELIMITER_PROP
-
COMMENT_CHAR_PROP
-
QUOTE_PROP
-
QUOTE_ESCAPE_PROP
-
LINE_DELIM_PROP
-
TRIM_WHITESPACE_PROP
-
PARSE_UNESCAPED_QUOTES_PROP
-
WRITER_OPERATOR_TYPE
-
-
Constructor Details
-
TextFormatPlugin
public TextFormatPlugin(String name, DrillbitContext context, org.apache.hadoop.conf.Configuration fsConf, StoragePluginConfig config, TextFormatConfig formatPluginConfig)
-
-
Method Details
-
getGroupScan
public AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, MetadataProviderManager metadataProviderManager) throws IOException
- Specified by:
getGroupScan in interface FormatPlugin
- Overrides:
getGroupScan in class EasyFormatPlugin<TextFormatConfig>
- Throws:
IOException
-
getGroupScan
public AbstractGroupScan getGroupScan(String userName, FileSelection selection, List<SchemaPath> columns, OptionManager options, MetadataProviderManager metadataProviderManager) throws IOException
- Throws:
IOException
-
configureScan
Description copied from class: EasyFormatPlugin
Configure an EVF (v2) scan, which must at least include the factory to create readers.
- Overrides:
configureScan in class EasyFormatPlugin<TextFormatConfig>
- Parameters:
builder - the builder with default options already set, and which allows the plugin implementation to set others
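As a rough illustration of the contract described above (the builder arrives with defaults already set, and the plugin must at least supply a factory that creates readers), the following sketch uses stand-in types. `SketchReader`, `SketchTextReader`, `SketchScanBuilder`, `SketchTextPlugin`, and all of their methods are hypothetical; they are not the API of `FileScanLifecycleBuilder` or any other Drill class.

```java
import java.util.function.Supplier;

// Stand-in types for illustration only; none of these are Drill classes.
interface SketchReader {
  String describe();
}

class SketchTextReader implements SketchReader {
  @Override
  public String describe() {
    return "text reader";
  }
}

class SketchScanBuilder {
  // The one setting a plugin is required to provide in this sketch.
  private Supplier<SketchReader> readerFactory;

  SketchScanBuilder readerFactory(Supplier<SketchReader> factory) {
    this.readerFactory = factory;
    return this;
  }

  // Fails fast if the plugin never registered a reader factory.
  SketchReader buildReader() {
    if (readerFactory == null) {
      throw new IllegalStateException("reader factory is required");
    }
    return readerFactory.get();
  }
}

class SketchTextPlugin {
  // Mirrors the shape of configureScan: accept a pre-populated builder
  // and add what this format needs, here only the reader factory.
  static void configureScan(SketchScanBuilder builder) {
    builder.readerFactory(SketchTextReader::new);
  }
}
```

The fail-fast check in `buildReader` is a design choice of the sketch, meant to make the "must at least include the factory" requirement visible.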
-
getRecordWriter
- Overrides:
getRecordWriter in class EasyFormatPlugin<TextFormatConfig>
- Throws:
IOException
-
getScanStats
- Overrides:
getScanStats in class EasyFormatPlugin<TextFormatConfig>
-