java.lang.Object

org.apache.drill.exec.record.BatchSchema

All Implemented Interfaces: Iterable<MaterializedField>

public class BatchSchema extends Object implements Iterable<MaterializedField>

Historically, BatchSchema has been used to represent the schema of a record batch. However, it does not handle complex types well. If you have a choice, use TupleMetadata instead.

Nested Class Summary

Nested Classes

| Modifier and Type | Class | Description |
|---|---|---|
| static enum | BatchSchema.SelectionVectorMode | |

Constructor Summary

Constructors

| Constructor | Description |
|---|---|
| BatchSchema(BatchSchema.SelectionVectorMode selectionVector, List<MaterializedField> fields) | |

Method Summary

| Modifier and Type | Method | Description |
|---|---|---|
| BatchSchema | clone() | |
| boolean | equals(Object obj) | DRILL-5525: the semantics of this method are badly broken. |
| String | format() | Format the schema into a multi-line format. |
| MaterializedField | getColumn(int index) | |
| int | getFieldCount() | |
| BatchSchema.SelectionVectorMode | getSelectionVectorMode() | |
| int | hashCode() | |
| boolean | isEquivalent(BatchSchema other) | Checks whether two schemas are identical according to the rules defined in MaterializedField.isEquivalent(MaterializedField). |
| Iterator<MaterializedField> | iterator() | |
| BatchSchema | merge(BatchSchema otherSchema) | Merge two schemas to produce a new, merged schema. |
| static SchemaBuilder | newBuilder() | |
| String | toString() | |

Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait

Methods inherited from interface java.lang.Iterable
forEach, spliterator

Constructor Details
- BatchSchema
  
  public BatchSchema(BatchSchema.SelectionVectorMode selectionVector, List<MaterializedField> fields)

Method Details
- newBuilder
  
  public static SchemaBuilder newBuilder()
- getFieldCount
  
  public int getFieldCount()
- getColumn
  
  public MaterializedField getColumn(int index)
- iterator
  
  public Iterator<MaterializedField> iterator()
  
  Specified by:
  
  iterator in interface Iterable<MaterializedField>
- getSelectionVectorMode
  
  public BatchSchema.SelectionVectorMode getSelectionVectorMode()
- clone
  
  public BatchSchema clone()
  
  Overrides:
  
  clone in class Object
- toString
  
  public String toString()
  
  Overrides:
  
  toString in class Object
- hashCode
  
  public int hashCode()
  
  Overrides:
  
  hashCode in class Object
- equals
  
  public boolean equals(Object obj)
  
  DRILL-5525: the semantics of this method are badly broken. Caveat emptor.

  This check is used to detect an actual schema change inside an operator's record batch, but it does not work for AbstractContainerVectors (such as MapVector).

  Each record batch stores a reference to the incoming batch schema (say S:{a: int}), and equals is then called on that stored reference and the current incoming batch schema. Internally, a schema object holds references to the MaterializedFields of the vectors in the container. If the incoming batch schema changes, the upstream operator creates a new ValueVector of the newly detected type in its output container, and that vector has a new MaterializedField instance. A new BatchSchema object is then created for the new incoming batch (say S":{a":varchar}). The operator calling equals still holds a reference to the old schema object (S), so the first check is not satisfied and equals is called on each MaterializedField (a.equals(a")). Because a new MaterializedField was created for the new vector, that check returns false and the schema change is detected.

  Now suppose that instead of an int vector there is a MapVector, with initial schema S:{a:{b:int, c:int}}, and that the schema of map field c later changes. The MapVector is still found in the container, but the child vector for field c is replaced. The new schema object is created as S":{a:{b:int, c":varchar}}. When S.equals(S") is called, it eventually calls a.equals(a), which returns true even though the schema of child value vector c has changed. No new vector is created for field a, so its MaterializedField object reference is unchanged, and that same reference appears in both the old and the new schema instances.

  Use isEquivalent(BatchSchema) instead, since MaterializedField.isEquivalent(MaterializedField) has been updated to remove the reference check.
  
  Overrides:
  
  equals in class Object
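The two scenarios described for equals can be reproduced with a small, self-contained model. This is only a sketch under stated assumptions: `Field`, `schemaEquals`, and the string type labels are hypothetical stand-ins, not Drill's MaterializedField or BatchSchema, and the stand-in's `equals` is deliberately simple. It shows why a comparison that starts with an identity check cannot see a change made inside a field object that both schemas share.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical model of the pitfall; none of these types are Drill classes.
class EqualsPitfallSketch {

  /** Stand-in for a field: a name, a type label, and replaceable children. */
  static final class Field {
    final String name;
    final String type;
    final List<Field> children = new ArrayList<>();

    Field(String name, String type, Field... kids) {
      this.name = name;
      this.type = type;
      children.addAll(Arrays.asList(kids));
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) {
        return true; // identity shortcut: a shared instance always matches itself
      }
      if (!(o instanceof Field)) {
        return false;
      }
      Field f = (Field) o;
      return name.equals(f.name) && type.equals(f.type) && children.equals(f.children);
    }

    @Override
    public int hashCode() {
      return Objects.hash(name, type, children);
    }
  }

  /** Compares two schemas field by field, in order. */
  static boolean schemaEquals(List<Field> s1, List<Field> s2) {
    return s1.equals(s2);
  }

  /** Top-level change: upstream builds a new field object, so the change is seen. */
  static boolean topLevelChangeDetected() {
    List<Field> stored = Arrays.asList(new Field("a", "int"));
    List<Field> incoming = Arrays.asList(new Field("a", "varchar"));
    return !schemaEquals(stored, incoming);
  }

  /** Nested change: the map field object is reused and only its child is replaced. */
  static boolean nestedChangeDetected() {
    Field a = new Field("a", "map", new Field("b", "int"), new Field("c", "int"));
    List<Field> stored = new ArrayList<>(Arrays.asList(a));
    a.children.set(1, new Field("c", "varchar")); // child c replaced in place
    List<Field> incoming = new ArrayList<>(Arrays.asList(a)); // same instance of a
    return !schemaEquals(stored, incoming);
  }

  public static void main(String[] args) {
    System.out.println("top-level change detected: " + topLevelChangeDetected()); // true
    System.out.println("nested change detected: " + nestedChangeDetected());      // false
  }
}
```

In the nested case both lists hold the very same `Field` instance, so the comparison has nothing to contrast the new state against.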
- isEquivalent
  
  public boolean isEquivalent(BatchSchema other)
  
  Checks whether two schemas are identical according to the rules defined in MaterializedField.isEquivalent(MaterializedField). In particular, this method requires that the fields have a 1:1 ordered correspondence in the two schemas.
  
  Parameters:
  
  other - another non-null batch schema
  
  Returns:
  
  true if the two schemas are equivalent according to the MaterializedField.isEquivalent(MaterializedField) rules, false otherwise
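The "1:1 ordered correspondence" requirement can be illustrated with a minimal sketch. The `Field` class, its `isEquivalent` rule (matching name, type label, and children), and the `field`/`schema` helpers are assumptions made for illustration. The actual rules live in MaterializedField.isEquivalent(MaterializedField) and are not reproduced here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins; not Drill's MaterializedField or BatchSchema.
class OrderedEquivalenceSketch {

  static final class Field {
    final String name;
    final String type;
    final List<Field> children;

    Field(String name, String type, Field... kids) {
      this.name = name;
      this.type = type;
      this.children = Arrays.asList(kids);
    }

    /** Structural comparison with no identity shortcut (assumed rule set). */
    boolean isEquivalent(Field other) {
      return name.equals(other.name)
          && type.equals(other.type)
          && allEquivalent(children, other.children);
    }
  }

  /** Fields must match pairwise, position by position, with equal counts. */
  static boolean allEquivalent(List<Field> left, List<Field> right) {
    if (left.size() != right.size()) {
      return false;
    }
    for (int i = 0; i < left.size(); i++) {
      if (!left.get(i).isEquivalent(right.get(i))) {
        return false;
      }
    }
    return true;
  }

  static Field field(String name, String type, Field... kids) {
    return new Field(name, type, kids);
  }

  static List<Field> schema(Field... fields) {
    return Arrays.asList(fields);
  }

  public static void main(String[] args) {
    List<Field> ab = schema(field("a", "int"), field("b", "varchar"));
    List<Field> abCopy = schema(field("a", "int"), field("b", "varchar"));
    List<Field> ba = schema(field("b", "varchar"), field("a", "int"));

    System.out.println(allEquivalent(ab, abCopy)); // true: distinct objects, same structure
    System.out.println(allEquivalent(ab, ba));     // false: same fields, different order
  }
}
```

Because the comparison is positional, two schemas with the same columns in a different order are not equivalent under this sketch.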
- merge
  
  public BatchSchema merge(BatchSchema otherSchema)
  
  Merge two schemas to produce a new, merged schema. The caller is responsible for ensuring that column names are unique. The order of the fields in the new schema is the same as that of this schema, with the other schema's fields appended in the order defined in the other schema.

  Merging data with selection vectors is unlikely to be useful or to work well. With a selection vector, the two record batches would have to be correlated both in their selection vectors and in the underlying vectors. Such a use case is hard to imagine. So, for now, this method forbids merging schemas if either of them carries a selection vector. If we discover a meaningful use case, we can revisit the issue.
  
  Parameters:
  
  otherSchema - the schema to merge with this one
  
  Returns:
  
  the new, merged, schema
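The merge contract (this schema's fields first, the other schema's fields appended in their own order, and no merging when a selection vector is present) can be sketched as follows. `Schema`, `Mode`, the use of plain column-name strings, and the choice of IllegalArgumentException are all assumptions for illustration, not Drill's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins; not Drill's BatchSchema or SelectionVectorMode.
class MergeSketch {

  enum Mode { NONE, WITH_SELECTION_VECTOR }

  static final class Schema {
    final Mode mode;
    final List<String> columns;

    Schema(Mode mode, String... columns) {
      this(mode, Arrays.asList(columns));
    }

    Schema(Mode mode, List<String> columns) {
      this.mode = mode;
      this.columns = Collections.unmodifiableList(new ArrayList<>(columns));
    }

    /** Returns a new schema: this schema's columns, then the other schema's columns. */
    Schema merge(Schema other) {
      if (mode != Mode.NONE || other.mode != Mode.NONE) {
        // Exception type is an assumption; the source only says merging is forbidden.
        throw new IllegalArgumentException("cannot merge schemas that carry a selection vector");
      }
      // Name uniqueness is left to the caller, as the description states.
      List<String> merged = new ArrayList<>(columns);
      merged.addAll(other.columns);
      return new Schema(Mode.NONE, merged);
    }

    String describe() {
      return String.join(",", columns);
    }
  }

  /** Reports whether merging the two schemas is refused. */
  static boolean mergeRejected(Schema left, Schema right) {
    try {
      left.merge(right);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Schema left = new Schema(Mode.NONE, "a", "b");
    Schema right = new Schema(Mode.NONE, "c", "d");
    System.out.println(left.merge(right).describe()); // a,b,c,d

    Schema filtered = new Schema(Mode.WITH_SELECTION_VECTOR, "e");
    System.out.println(mergeRejected(left, filtered)); // true
  }
}
```

Neither input is modified; the merged result is a separate object, which matches the "new, merged schema" wording above.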
- format
  
  public String format()
  
  Format the schema into a multi-line format. Useful when debugging a query with a very wide schema as the usual single-line format is far too hard to read.
