Class HRegionPartitioner<KEY,VALUE>

java.lang.Object
org.apache.hadoop.mapreduce.Partitioner<ImmutableBytesWritable,VALUE>
org.apache.hadoop.hbase.mapreduce.HRegionPartitioner<KEY,VALUE>
Type Parameters:
KEY - The type of the key.
VALUE - The type of the value.
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable

@Public public class HRegionPartitioner<KEY,VALUE> extends org.apache.hadoop.mapreduce.Partitioner<ImmutableBytesWritable,VALUE> implements org.apache.hadoop.conf.Configurable
Partitions the output keys into groups, one group per region that currently exists in the target table. Each reducer then writes to a single region, which spreads the write load across regions.

This class is not suitable as a partitioner for creating HFiles for incremental bulk loads, because the region boundaries will likely change between the time the HFiles are created and the time they are loaded. See LoadIncrementalHFiles and Bulk Load.
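The grouping of keys by region described above can be illustrated with a small, standalone sketch. It is not the HBase implementation: the class name `RegionPartitionSketch`, the in-memory array of start keys, the binary search, and the modulo fallback used when there are fewer reducers than regions are all assumptions made for illustration. The real class obtains region start keys from the table named in the job configuration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of the HBase API.
class RegionPartitionSketch {
    // Region start keys in ascending byte order; the first region starts at the empty key.
    private final byte[][] startKeys;

    public RegionPartitionSketch(byte[][] sortedStartKeys) {
        this.startKeys = sortedStartKeys;
    }

    // Returns the index of the last region whose start key is <= rowKey.
    public int regionIndex(byte[] rowKey) {
        int lo = 0;
        int hi = startKeys.length - 1;
        int found = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            // Unsigned lexicographic comparison, the usual ordering for row keys.
            if (Arrays.compareUnsigned(startKeys[mid], rowKey) <= 0) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    // Maps a row key to a partition: one partition per region where possible.
    public int getPartition(byte[] rowKey, int numPartitions) {
        if (startKeys.length == 1) {
            return 0; // a single region means a single partition
        }
        int region = regionIndex(rowKey);
        if (region >= numPartitions) {
            // Assumed fallback for this sketch: fold surplus regions onto existing partitions.
            return region % numPartitions;
        }
        return region;
    }

    private static byte[] b(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        RegionPartitionSketch p =
            new RegionPartitionSketch(new byte[][] { b(""), b("g"), b("n"), b("t") });
        for (String row : new String[] { "apple", "grape", "zebra" }) {
            System.out.println(row + " -> partition " + p.getPartition(b(row), 4));
        }
    }
}
```

With as many reducers as regions, every partition corresponds to exactly one region. If the job runs fewer reducers than there are regions, some reducers in this sketch receive keys for more than one region, so the one-reducer-per-region property no longer holds.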

  • Method Details

    • getPartition

      public int getPartition(ImmutableBytesWritable key, VALUE value, int numPartitions)
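      Gets the partition number for a given key (and hence record), given the total number of partitions, i.e. the number of reduce tasks for the job.

      Partitioners typically apply a hash function to all or a subset of the key; this class instead groups keys according to the regions that currently exist.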

      Specified by:
      getPartition in class org.apache.hadoop.mapreduce.Partitioner<ImmutableBytesWritable,VALUE>
      Parameters:
      key - The key to be partitioned.
      value - The entry value.
      numPartitions - The total number of partitions.
      Returns:
      The partition number for the key.
      See Also:
      • Partitioner.getPartition(java.lang.Object, java.lang.Object, int)
    • getConf

      public org.apache.hadoop.conf.Configuration getConf()
      Returns the current configuration.
      Specified by:
      getConf in interface org.apache.hadoop.conf.Configurable
      Returns:
      The current configuration.
      See Also:
      • Configurable.getConf()
    • setConf

      public void setConf(org.apache.hadoop.conf.Configuration configuration)
      Sets the configuration. This is used to determine the start keys for the given table.
      Specified by:
      setConf in interface org.apache.hadoop.conf.Configurable
      Parameters:
      configuration - The configuration to set.
      See Also:
      • Configurable.setConf(org.apache.hadoop.conf.Configuration)