java.lang.Object

org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,KEYOUT,VALUEOUT>

org.apache.hadoop.hbase.mapreduce.TableMapper<ImmutableBytesWritable,Result>

org.apache.hadoop.hbase.mapreduce.GroupingTableMapper

All Implemented Interfaces: org.apache.hadoop.conf.Configurable

@Public public class GroupingTableMapper extends TableMapper<ImmutableBytesWritable,Result> implements org.apache.hadoop.conf.Configurable

Extract grouping columns from input record.

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
org.apache.hadoop.mapreduce.Mapper.Context
Field Summary

Fields

Modifier and Type

Field

Description

protected byte[][]

columns

The grouping columns.

private org.apache.hadoop.conf.Configuration

conf

The current configuration.

static final String

GROUP_COLUMNS

JobConf parameter to specify the columns used to produce the key passed to collect from the map phase.
Constructor Summary

Constructors

Constructor

Description

GroupingTableMapper()
Method Summary

Modifier and Type

Method

Description

protected ImmutableBytesWritable

createGroupKey(byte[][] vals)

Create a key by concatenating multiple column values.

protected byte[][]

extractKeyValues(Result r)

Extract column values from the current record.

org.apache.hadoop.conf.Configuration

getConf()

Returns the current configuration.

static void

initJob(String table, Scan scan, String groupColumns, Class<? extends TableMapper> mapper, org.apache.hadoop.mapreduce.Job job)

Use this before submitting a TableMap job.

void

map(ImmutableBytesWritable key, Result value, org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,ImmutableBytesWritable,Result>.Context context)

Extract the grouping columns from value to construct a new key.

void

setConf(org.apache.hadoop.conf.Configuration configuration)

Sets the configuration.

Methods inherited from class org.apache.hadoop.mapreduce.Mapper
cleanup, run, setup

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- GROUP_COLUMNS
  
  public static final String GROUP_COLUMNS
  
  JobConf parameter to specify the columns used to produce the key passed to collect from the map phase.
  See Also:
  
  Constant Field Values
- columns
  
  protected byte[][] columns
  
  The grouping columns.
- conf
  
  private org.apache.hadoop.conf.Configuration conf
  
  The current configuration.
Constructor Details
- GroupingTableMapper
  
  public GroupingTableMapper()
Method Details
- initJob
  
  public static void initJob(String table, Scan scan, String groupColumns, Class<? extends TableMapper> mapper, org.apache.hadoop.mapreduce.Job job) throws IOException
  
  Use this before submitting a TableMap job. It will appropriately set up the job.
  
  Parameters:
  
  table - The table to be processed.
  
  scan - The scan with the columns etc.
  
  groupColumns - A space-separated list of columns used to form the key used in collect.
  
  mapper - The mapper class.
  
  job - The current job.
  
  Throws:
  
  IOException - When setting up the job fails.
- map
  
  public void map(ImmutableBytesWritable key, Result value, org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,ImmutableBytesWritable,Result>.Context context) throws IOException, InterruptedException
  
  Extract the grouping columns from value to construct a new key. Pass the new key and value to reduce. If any of the grouping columns are not found in the value, the record is skipped.
  
  Overrides:
  
  map in class org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,ImmutableBytesWritable,Result>
  
  Parameters:
  
  key - The current key.
  
  value - The current value.
  
  context - The current context.
  
  Throws:
  
  IOException - When writing the record fails.
  
  InterruptedException - When the job is aborted.
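
  The flow described for map (build a key from the grouping columns, emit the record under that key, skip records that lack a grouping column) can be sketched without any HBase dependency. This is an illustration only: the class GroupingMapSketch and its methods are hypothetical, plain String maps stand in for Result, and the single-space separator between values is an assumption of the sketch.

  ```java
  import java.util.AbstractMap.SimpleEntry;
  import java.util.ArrayList;
  import java.util.List;
  import java.util.Map;

  /** Hypothetical stand-in for the map step; not part of HBase. */
  final class GroupingMapSketch {

    /** Builds the group key, or returns null if any grouping column is missing. */
    static String groupKey(List<String> groupColumns, Map<String, String> record) {
      List<String> parts = new ArrayList<>();
      for (String column : groupColumns) {
        String value = record.get(column);
        if (value == null) {
          return null; // a grouping column is absent from this record
        }
        parts.add(value);
      }
      // Assumption: values are joined with a single space.
      return String.join(" ", parts);
    }

    /** Emits (groupKey, record) pairs, skipping records without a complete key. */
    static List<Map.Entry<String, Map<String, String>>> mapAll(
        List<String> groupColumns, List<Map<String, String>> records) {
      List<Map.Entry<String, Map<String, String>>> emitted = new ArrayList<>();
      for (Map<String, String> record : records) {
        String key = groupKey(groupColumns, record);
        if (key != null) {
          emitted.add(new SimpleEntry<>(key, record));
        }
      }
      return emitted;
    }
  }
  ```

  Calling mapAll with the grouping columns "info:city" and "info:zip" on one record that has both columns and one that has only "info:city" emits a single pair; the incomplete record is dropped, mirroring the skip behavior described above.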
- extractKeyValues
  
  protected byte[][] extractKeyValues(Result r)
  
  Extract column values from the current record. This method returns null if any of the columns are not found.
  Override this method if you want to deal with nulls differently.
  
  Parameters:
  
  r - The current values.
  
  Returns:
  
  Array of byte values.
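
  The note about overriding suggests that a subclass may want a different policy for missing columns. A minimal sketch of two such policies follows, assuming a plain Map from column name to value as a stand-in for Result. The class KeyValueExtractor and its fallback parameter are hypothetical and not part of HBase.

  ```java
  import java.util.Map;

  /** Hypothetical helper illustrating null-handling policies; not part of HBase. */
  final class KeyValueExtractor {

    /**
     * Collects one value per grouping column, in column order.
     * With a null fallback, a missing column makes the whole result null
     * (the behavior described above). With a non-null fallback, the
     * fallback is substituted for each missing column instead.
     */
    static byte[][] extract(String[] columns, Map<String, byte[]> record, byte[] fallback) {
      byte[][] values = new byte[columns.length][];
      for (int i = 0; i < columns.length; i++) {
        byte[] value = record.get(columns[i]);
        if (value == null) {
          if (fallback == null) {
            return null; // strict policy: give up on this record
          }
          value = fallback; // lenient policy: keep the record
        }
        values[i] = value;
      }
      return values;
    }
  }
  ```

  Passing an empty array as the fallback keeps records with missing columns in the output at the cost of less distinctive keys; whether that trade-off is acceptable depends on the job.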
- createGroupKey
  
  protected ImmutableBytesWritable createGroupKey(byte[][] vals)
  
  Create a key by concatenating multiple column values.
  Override this function in order to produce different types of keys.
  
  Parameters:
  
  vals - The current key/values.
  
  Returns:
  
  A key generated by concatenating multiple column values.
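
  Concatenating column values into one key can be sketched at the byte level as shown below. The text above only says the values are concatenated; the separator byte between values is an assumption of this sketch, and GroupKeys is a hypothetical helper that returns a raw byte[] rather than an ImmutableBytesWritable.

  ```java
  import java.io.ByteArrayOutputStream;

  /** Hypothetical helper; not part of HBase. */
  final class GroupKeys {

    /**
     * Joins the given values into one byte array, writing the separator
     * between consecutive values. Returns null when vals is null, so a
     * "no key" result from extraction can be passed straight through.
     * Individual values are assumed to be non-null.
     */
    static byte[] concat(byte[][] vals, byte separator) {
      if (vals == null) {
        return null;
      }
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      for (int i = 0; i < vals.length; i++) {
        if (i > 0) {
          out.write(separator);
        }
        out.write(vals[i], 0, vals[i].length);
      }
      return out.toByteArray();
    }
  }
  ```

  A separator keeps keys such as ("ab", "c") and ("a", "bc") distinct, which plain concatenation would not; it only helps if the separator byte cannot occur inside the values themselves.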
- getConf
  
  public org.apache.hadoop.conf.Configuration getConf()
  
  Returns the current configuration.
  Specified by:
  
  getConf in interface org.apache.hadoop.conf.Configurable
  
  Returns:
  
  The current configuration.
  
  See Also:
  
  Configurable.getConf()
- setConf
  
  public void setConf(org.apache.hadoop.conf.Configuration configuration)
  
  Sets the configuration. This is used to set up the grouping details.
  Specified by:
  
  setConf in interface org.apache.hadoop.conf.Configurable
  
  Parameters:
  
  configuration - The configuration to set.
  
  See Also:
  
  Configurable.setConf(org.apache.hadoop.conf.Configuration)
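
  Setting up the grouping details amounts to turning the configured column list into the byte[][] held in the columns field. A minimal sketch of that parsing step follows, assuming the GROUP_COLUMNS value uses the same space-separated format described for the groupColumns parameter of initJob and that names are encoded as UTF-8. GroupColumnsParser is a hypothetical helper and does not read a Hadoop Configuration.

  ```java
  import java.nio.charset.StandardCharsets;

  /** Hypothetical helper; not part of HBase. */
  final class GroupColumnsParser {

    /** Splits a space-separated column list into one byte[] per column name. */
    static byte[][] parse(String groupColumns) {
      String trimmed = groupColumns == null ? "" : groupColumns.trim();
      if (trimmed.isEmpty()) {
        return new byte[0][]; // nothing configured
      }
      // Tolerates repeated whitespace between names (a choice of this sketch).
      String[] names = trimmed.split("\\s+");
      byte[][] columns = new byte[names.length][];
      for (int i = 0; i < names.length; i++) {
        columns[i] = names[i].getBytes(StandardCharsets.UTF_8);
      }
      return columns;
    }
  }
  ```

  For example, parse("info:city info:zip") yields two entries, one per column name, in the order given.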
