Class GroupingTableMap

java.lang.Object
org.apache.hadoop.mapred.MapReduceBase
org.apache.hadoop.hbase.mapred.GroupingTableMap
All Implemented Interfaces:
Closeable, AutoCloseable, TableMap<ImmutableBytesWritable,Result>, org.apache.hadoop.io.Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.Mapper<ImmutableBytesWritable,Result,ImmutableBytesWritable,Result>

@Public public class GroupingTableMap extends org.apache.hadoop.mapred.MapReduceBase implements TableMap<ImmutableBytesWritable,Result>
Extracts the grouping columns from each input record.
  • Field Details

  • Constructor Details

  • Method Details

    • initJob

      public static void initJob(String table, String columns, String groupColumns, Class<? extends TableMap> mapper, org.apache.hadoop.mapred.JobConf job)
      Use this before submitting a TableMap job. It will appropriately set up the JobConf.
      Parameters:
      table - table to be processed
      columns - space-separated list of columns to fetch
      groupColumns - space-separated list of columns used to form the key passed to collect
      mapper - map class
      job - job configuration object
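      To illustrate the shape of the two list arguments, the sketch below splits a space-separated column string into individual column names using only the standard library. The class ColumnListSketch, its method parseColumns, and the example column names are hypothetical. The family:qualifier naming follows the usual HBase convention. This is not the parsing code used by this class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: shows the shape of the
// space-separated strings passed as `columns` and `groupColumns`.
class ColumnListSketch {
    // Splits on runs of whitespace and drops empty entries.
    static List<String> parseColumns(String spec) {
        List<String> out = new ArrayList<>();
        if (spec == null) {
            return out;
        }
        for (String s : spec.trim().split("\\s+")) {
            if (!s.isEmpty()) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Example values only: fetch three columns, group by two of them.
        System.out.println(parseColumns("info:city info:state info:zip"));
        System.out.println(parseColumns("info:city info:state"));
    }
}
```

      In this example the grouping columns are a subset of the fetched columns, which is the typical arrangement since a grouping column that is never fetched cannot be found in the record.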
    • configure

      public void configure(org.apache.hadoop.mapred.JobConf job)
      Specified by:
      configure in interface org.apache.hadoop.mapred.JobConfigurable
      Overrides:
      configure in class org.apache.hadoop.mapred.MapReduceBase
    • map

      public void map(ImmutableBytesWritable key, Result value, org.apache.hadoop.mapred.OutputCollector<ImmutableBytesWritable,Result> output, org.apache.hadoop.mapred.Reporter reporter) throws IOException
      Extract the grouping columns from value to construct a new key. Pass the new key and value to reduce. If any of the grouping columns are not found in the value, the record is skipped.
      Specified by:
      map in interface org.apache.hadoop.mapred.Mapper<ImmutableBytesWritable,Result,ImmutableBytesWritable,Result>
      Throws:
      IOException
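      The behavior described for map can be sketched without any HBase dependency. In the sketch below a record is modelled as a Map from column name to string value, and the output collector as a plain list of (key, record) pairs. GroupingMapSketch, mapAll, and record are hypothetical names, and joining the values with a single space is an assumption of this sketch rather than a documented guarantee.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-alone model of the map step: build a group key per record,
// skipping records that lack any grouping column.
class GroupingMapSketch {
    // Returns the (groupKey, record) pairs that would be collected.
    static List<Map.Entry<String, Map<String, String>>> mapAll(
            List<Map<String, String>> records, List<String> groupColumns) {
        List<Map.Entry<String, Map<String, String>>> collected = new ArrayList<>();
        for (Map<String, String> record : records) {
            List<String> parts = new ArrayList<>();
            for (String col : groupColumns) {
                String v = record.get(col);
                if (v != null) {
                    parts.add(v);
                }
            }
            // Skip the record unless every grouping column was present.
            if (parts.size() != groupColumns.size()) {
                continue;
            }
            collected.add(new AbstractMap.SimpleEntry<>(String.join(" ", parts), record));
        }
        return collected;
    }

    // Small helper to build a record from alternating column/value strings.
    static Map<String, String> record(String... kv) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = Arrays.asList(
            record("info:city", "Austin", "info:state", "TX"),
            record("info:city", "Boston"), // no info:state, so this row is skipped
            record("info:city", "Dallas", "info:state", "TX"));
        for (Map.Entry<String, Map<String, String>> e :
                mapAll(rows, Arrays.asList("info:city", "info:state"))) {
            System.out.println(e.getKey());
        }
    }
}
```

      Note that the original record is passed through unchanged as the value; only the key is replaced.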
    • extractKeyValues

      protected byte[][] extractKeyValues(Result r)
      Extract column values from the current record. This method returns null if any of the columns are not found. Override this method to handle missing columns differently.
      Returns:
      array of column values, each a byte array, or null if any column is missing
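      As a sketch of the override point mentioned above, the following stand-alone classes model a record as a Map from column name to byte[]. ExtractSketch mirrors the documented default (null when any column is missing), and LenientExtractSketch shows one alternative policy that substitutes an empty value. Both class names and the Map-based record type are hypothetical; a real subclass would override extractKeyValues(Result r) instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the mapper: records are Map<String, byte[]>.
class ExtractSketch {
    protected final String[] groupColumns;

    ExtractSketch(String... groupColumns) {
        this.groupColumns = groupColumns;
    }

    // Documented default policy: null if any grouping column is missing.
    protected byte[][] extractKeyValues(Map<String, byte[]> record) {
        byte[][] vals = new byte[groupColumns.length][];
        for (int i = 0; i < groupColumns.length; i++) {
            byte[] v = record.get(groupColumns[i]);
            if (v == null) {
                return null;
            }
            vals[i] = v;
        }
        return vals;
    }
}

// Alternative policy: treat a missing column as an empty value,
// so the record is kept instead of skipped.
class LenientExtractSketch extends ExtractSketch {
    LenientExtractSketch(String... groupColumns) {
        super(groupColumns);
    }

    @Override
    protected byte[][] extractKeyValues(Map<String, byte[]> record) {
        byte[][] vals = new byte[groupColumns.length][];
        for (int i = 0; i < groupColumns.length; i++) {
            byte[] v = record.get(groupColumns[i]);
            vals[i] = (v == null) ? new byte[0] : v;
        }
        return vals;
    }

    public static void main(String[] args) {
        Map<String, byte[]> rec = new HashMap<>();
        rec.put("info:city", new byte[] {65});
        // Strict policy yields null; lenient policy yields two values.
        System.out.println(new ExtractSketch("info:city", "info:state").extractKeyValues(rec) == null);
        System.out.println(new LenientExtractSketch("info:city", "info:state").extractKeyValues(rec).length);
    }
}
```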
    • createGroupKey

      protected ImmutableBytesWritable createGroupKey(byte[][] vals)
      Create a key by concatenating multiple column values. Override this method to produce different types of keys.
      Returns:
      key generated by concatenating multiple column values
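      A minimal sketch of key concatenation, using only the standard library, is shown below. GroupKeySketch is a hypothetical class that returns a plain byte[] rather than an ImmutableBytesWritable, and the single-space separator and the null pass-through are assumptions of this sketch, not behavior stated in this documentation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: joins column values into one key, separated by a space.
class GroupKeySketch {
    static byte[] createGroupKey(byte[][] vals) {
        // Assumption: a null input (no values extracted) yields a null key.
        if (vals == null) {
            return null;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < vals.length; i++) {
            if (i > 0) {
                out.write(' '); // separator between consecutive values
            }
            out.write(vals[i], 0, vals[i].length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[][] vals = {
            "Austin".getBytes(StandardCharsets.UTF_8),
            "TX".getBytes(StandardCharsets.UTF_8)
        };
        System.out.println(new String(createGroupKey(vals), StandardCharsets.UTF_8));
    }
}
```

      With any fixed separator, values that themselves contain the separator can produce colliding keys. A length-prefixed encoding is one option when overriding this method for such data.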