Class TableInputFormatBase

java.lang.Object
org.apache.hadoop.hbase.mapred.TableInputFormatBase
All Implemented Interfaces:
org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
Direct Known Subclasses:
TableInputFormat

@Public
public abstract class TableInputFormatBase
extends Object
implements org.apache.hadoop.mapred.InputFormat<ImmutableBytesWritable,Result>
A base class for TableInputFormats. Receives a Table, a byte[][] of input columns, and optionally a Filter. Subclasses may use other TableRecordReader implementations.

Subclasses MUST ensure that initializeTable(Connection, TableName) is called for an instance to function properly. Each of the entry points to this class used by the MapReduce framework, getRecordReader(InputSplit, JobConf, Reporter) and getSplits(JobConf, int), calls initialize(JobConf) as a convenient, centralized place to retrieve the necessary configuration information. If your subclass overrides either of these methods, either call the parent version or call initialize yourself.

An example of a subclass:

   class ExampleTIF extends TableInputFormatBase {

     @Override
     protected void initialize(JobConf job) throws IOException {
       // We are responsible for the lifecycle of this connection until we hand it over in
       // initializeTable.
       Connection connection =
          ConnectionFactory.createConnection(HBaseConfiguration.create(job));
       TableName tableName = TableName.valueOf("exampleTable");
       // mandatory. once passed here, TableInputFormatBase will handle closing the connection.
       initializeTable(connection, tableName);
       byte[][] inputColumns = new byte[][] { Bytes.toBytes("columnA"),
         Bytes.toBytes("columnB") };
       // mandatory
       setInputColumns(inputColumns);
       // optional, by default we'll get everything for the given columns.
       Filter exampleFilter = new RowFilter(CompareOp.EQUAL, new RegexStringComparator("aa.*"));
       setRowFilter(exampleFilter);
     }
   }
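
If a subclass also overrides getSplits(JobConf, int) or getRecordReader(InputSplit, JobConf, Reporter), it still has to honor the contract above by calling the parent version or by calling initialize(JobConf) directly. A minimal sketch of the first option, building on the hypothetical ExampleTIF above (the class name and the split post-processing comment are illustrative, not part of the API):

   class ExampleSplitTIF extends ExampleTIF {

     @Override
     public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
       // Delegating to the parent version ensures initialize(JobConf) still runs, so the
       // table handed over in initializeTable is set up before the splits are computed.
       InputSplit[] splits = super.getSplits(job, numSplits);
       // Any subclass-specific adjustment of the splits could happen here.
       return splits;
     }
   }

The subclass is then registered on the job in the usual way for the old mapred API, e.g. jobConf.setInputFormat(ExampleSplitTIF.class) on the driver's JobConf; no extra initialization call is needed from driver code.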