Class TableSplit

java.lang.Object
org.apache.hadoop.mapreduce.InputSplit
org.apache.hadoop.hbase.mapreduce.TableSplit
All Implemented Interfaces:
Comparable<TableSplit>, org.apache.hadoop.io.Writable

@Public public class TableSplit extends org.apache.hadoop.mapreduce.InputSplit implements org.apache.hadoop.io.Writable, Comparable<TableSplit>
A table split corresponds to a key range (low, high) and an optional scanner. All references to row below refer to the key of the row.
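The key range described above can be illustrated with a small sketch that uses only the JDK. `KeyRangeSketch` and its methods are hypothetical and not part of HBase. The sketch assumes row keys are compared as unsigned bytes in lexicographic order, that the start row is inclusive and the end row exclusive, and that an empty end row means "no upper bound".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a split's key range; not an HBase class. */
final class KeyRangeSketch {
    private final byte[] startRow; // assumed inclusive lower bound
    private final byte[] endRow;   // assumed exclusive upper bound; empty = unbounded

    KeyRangeSketch(byte[] startRow, byte[] endRow) {
        this.startRow = startRow.clone();
        this.endRow = endRow.clone();
    }

    /** Returns true when the row key falls inside [startRow, endRow). */
    boolean contains(byte[] row) {
        boolean atOrAfterStart = Arrays.compareUnsigned(row, startRow) >= 0;
        boolean beforeEnd = endRow.length == 0 || Arrays.compareUnsigned(row, endRow) < 0;
        return atOrAfterStart && beforeEnd;
    }

    /** Helper that turns a string into a UTF-8 row key. */
    static byte[] key(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        KeyRangeSketch range = new KeyRangeSketch(key("b"), key("m"));
        System.out.println(range.contains(key("c"))); // true
        System.out.println(range.contains(key("m"))); // false: end row is exclusive
    }
}
```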
  • Field Details

    • LOG

      private static final org.slf4j.Logger LOG
    • VERSION

      private static final TableSplit.Version VERSION
    • tableName

    • startRow

      private byte[] startRow
    • endRow

      private byte[] endRow
    • regionLocation

    • encodedRegionName

    • scan

      private String scan
      The scan object may be null, but the serialized form of the scan is never null or empty, because a scan object with default values is serialized in that case. Having no scanner in a TableSplit does not necessarily mean that the MapReduce job has no scanner; it only means that the scanner does not need to be set for each split. For example, TableInputFormatBase does not require a scan object per split, since it uses the scan from the job configuration and the scanner is expected to be the same for all splits of a table.
    • length

      private long length
  • Constructor Details

    • TableSplit

      public TableSplit()
      Default constructor.
    • TableSplit

      public TableSplit(TableName tableName, Scan scan, byte[] startRow, byte[] endRow, String location)
      Creates a new instance while assigning all variables. The length of the region is set to 0 and the encoded name of the region is set to blank.
      Parameters:
      tableName - The name of the current table.
      scan - The scan associated with this split.
      startRow - The start row of the split.
      endRow - The end row of the split.
      location - The location of the region.
    • TableSplit

      public TableSplit(TableName tableName, Scan scan, byte[] startRow, byte[] endRow, String location, long length)
      Creates a new instance while assigning all variables. The encoded name of the region is set to blank.
      Parameters:
      tableName - The name of the current table.
      scan - The scan associated with this split.
      startRow - The start row of the split.
      endRow - The end row of the split.
      location - The location of the region.
      length - Size of region in bytes
    • TableSplit

      public TableSplit(TableName tableName, Scan scan, byte[] startRow, byte[] endRow, String location, String encodedRegionName, long length)
      Creates a new instance while assigning all variables.
      Parameters:
      tableName - The name of the current table.
      scan - The scan associated with this split.
      startRow - The start row of the split.
      endRow - The end row of the split.
      location - The location of the region.
      encodedRegionName - The region ID.
      length - Size of region in bytes
    • TableSplit

      public TableSplit(TableName tableName, byte[] startRow, byte[] endRow, String location)
      Creates a new instance without a scanner. The length of the region is set to 0.
      Parameters:
      tableName - The name of the current table.
      startRow - The start row of the split.
      endRow - The end row of the split.
      location - The location of the region.
    • TableSplit

      public TableSplit(TableName tableName, byte[] startRow, byte[] endRow, String location, long length)
      Creates a new instance without a scanner.
      Parameters:
      tableName - The name of the current table.
      startRow - The start row of the split.
      endRow - The end row of the split.
      location - The location of the region.
      length - Size of region in bytes
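The constructor overloads above differ mainly in which values they default: a length of 0 and a blank encoded region name. One common way to express such defaults is constructor chaining, sketched below with a hypothetical `SplitDefaultsSketch` class that uses a plain `String` for the table name. This illustrates the documented defaults only and assumes nothing about how TableSplit is actually implemented.

```java
/** Hypothetical illustration of the documented defaults; not an HBase class. */
final class SplitDefaultsSketch {
    final String table;
    final byte[] startRow;
    final byte[] endRow;
    final String location;
    final String encodedRegionName;
    final long length;

    /** Length defaults to 0, encoded region name to blank. */
    SplitDefaultsSketch(String table, byte[] startRow, byte[] endRow, String location) {
        this(table, startRow, endRow, location, 0L);
    }

    /** Encoded region name defaults to blank. */
    SplitDefaultsSketch(String table, byte[] startRow, byte[] endRow, String location,
                        long length) {
        this(table, startRow, endRow, location, "", length);
    }

    /** Assigns every field explicitly. */
    SplitDefaultsSketch(String table, byte[] startRow, byte[] endRow, String location,
                        String encodedRegionName, long length) {
        this.table = table;
        this.startRow = startRow.clone();
        this.endRow = endRow.clone();
        this.location = location;
        this.encodedRegionName = encodedRegionName;
        this.length = length;
    }
}
```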
  • Method Details

    • getScan

      public Scan getScan() throws IOException
      Returns a Scan object from the stored string representation.
      Returns:
      A Scan object based on the stored scanner.
      Throws:
      IOException - If deserialization fails.
    • getScanAsString

      @Private public String getScanAsString()
      Returns the scan as a string.
      Returns:
      The scan as a string. Note that this is not the same as getScan().toString(): deserializing an empty scan string yields a Scan object with default values, so getScan().toString() can never be empty.
    • getTableName

      public byte[] getTableName()
      Returns the table name converted to a byte array.
      Returns:
      The table name.
    • getTable

      public TableName getTable()
      Returns the table name.
      Returns:
      The table name.
    • getStartRow

      public byte[] getStartRow()
      Returns the start row.
      Returns:
      The start row.
    • getEndRow

      public byte[] getEndRow()
      Returns the end row.
      Returns:
      The end row.
    • getRegionLocation

      Returns the region location.
      Returns:
      The region's location.
    • getLocations

      public String[] getLocations()
      Returns the region's location as an array.
      Specified by:
      getLocations in class org.apache.hadoop.mapreduce.InputSplit
      Returns:
      The array containing the region location.
      See Also:
      • InputSplit.getLocations()
    • getEncodedRegionName

      Returns the region's encoded name.
      Returns:
      The region's encoded name.
    • getLength

      public long getLength()
      Returns the length of the split.
      Specified by:
      getLength in class org.apache.hadoop.mapreduce.InputSplit
      Returns:
      The length of the split.
      See Also:
      • InputSplit.getLength()
    • readFields

      public void readFields(DataInput in) throws IOException
      Reads the values of each field.
      Specified by:
      readFields in interface org.apache.hadoop.io.Writable
      Parameters:
      in - The input to read from.
      Throws:
      IOException - When reading the input fails.
    • write

      public void write(DataOutput out) throws IOException
      Writes the field values to the output.
      Specified by:
      write in interface org.apache.hadoop.io.Writable
      Parameters:
      out - The output to write to.
      Throws:
      IOException - When writing the values to the output fails.
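The readFields and write methods above form a symmetric pair: whatever write emits, readFields must consume in the same order. The sketch below shows that pattern with only `java.io`. `SplitRecordSketch`, its fields, and its field order and encoding are all hypothetical; this is not TableSplit's actual wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical Writable-style record; the layout is an assumption, not HBase's. */
final class SplitRecordSketch {
    byte[] startRow = new byte[0];
    byte[] endRow = new byte[0];
    String location = "";
    long length;

    /** Writes every field in a fixed order. */
    void write(DataOutput out) throws IOException {
        writeBytes(out, startRow);
        writeBytes(out, endRow);
        out.writeUTF(location);
        out.writeLong(length);
    }

    /** Reads the fields back in exactly the order write emitted them. */
    void readFields(DataInput in) throws IOException {
        startRow = readBytes(in);
        endRow = readBytes(in);
        location = in.readUTF();
        length = in.readLong();
    }

    // Length-prefixed byte arrays so the reader knows how many bytes to consume.
    private static void writeBytes(DataOutput out, byte[] b) throws IOException {
        out.writeInt(b.length);
        out.write(b);
    }

    private static byte[] readBytes(DataInput in) throws IOException {
        byte[] b = new byte[in.readInt()];
        in.readFully(b);
        return b;
    }

    /** Serializes to an in-memory buffer and deserializes into a fresh instance. */
    static SplitRecordSketch roundTrip(SplitRecordSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            SplitRecordSketch copy = new SplitRecordSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Starting from a no-argument instance and then calling readFields mirrors how a default constructor is typically used with this kind of serialization.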
    • toString

      public String toString()
      Returns the details about this instance as a string.
      Overrides:
      toString in class Object
      Returns:
      The values of this instance as a string.
    • compareTo

      public int compareTo(TableSplit split)
      Compares this split against the given one.
      Specified by:
      compareTo in interface Comparable<TableSplit>
      Parameters:
      split - The split to compare to.
      Returns:
      The result of the comparison.
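The documentation above does not state which fields compareTo uses. As an illustration only, the sketch below assumes an ordering by table name first and then by start row, with row keys compared as unsigned bytes. `OrderedSplitSketch` is hypothetical and uses a plain `String` for the table name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical comparable split; the ordering is an assumption, not HBase's. */
final class OrderedSplitSketch implements Comparable<OrderedSplitSketch> {
    final String table;
    final byte[] startRow;

    OrderedSplitSketch(String table, byte[] startRow) {
        this.table = table;
        this.startRow = startRow.clone();
    }

    /** Orders by table name, then by start row using unsigned byte comparison. */
    @Override
    public int compareTo(OrderedSplitSketch other) {
        int byTable = table.compareTo(other.table);
        return byTable != 0 ? byTable : Arrays.compareUnsigned(startRow, other.startRow);
    }

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<OrderedSplitSketch> sorted(List<OrderedSplitSketch> splits) {
        List<OrderedSplitSketch> copy = new ArrayList<>(splits);
        Collections.sort(copy);
        return copy;
    }
}
```

Unsigned comparison matters for keys containing bytes above 0x7f: with signed comparison, a start row of 0x80 would sort before 0x7f.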
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object