Package org.apache.hadoop.hbase.master.normalizer
The Region Normalizer subsystem is responsible for coaxing all the regions in a table toward
a "normal" size, according to their storefile size. It does this by splitting regions that
are significantly larger than the norm, and merging regions that are significantly smaller than
the norm.
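The split-and-merge idea above can be sketched in a few lines. This is a hypothetical illustration, not the code of SimpleRegionNormalizer: the class name NormalizeSketch, the factor of 2 used for "significantly larger", and the rule "merge two adjacent regions whose combined size is below the average" are all assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of size-based normalization; not HBase's implementation. */
final class NormalizeSketch {
  // Assumed threshold for "significantly larger than the norm"; illustrative only.
  static final double SPLIT_FACTOR = 2.0;

  /** Returns plan descriptions for regions given in key order, with sizes in MB. */
  static List<String> plan(long[] sizesMb) {
    List<String> plans = new ArrayList<>();
    if (sizesMb.length == 0) {
      return plans;
    }
    long total = 0;
    for (long s : sizesMb) {
      total += s;
    }
    // The "norm" in this sketch is the mean region size of the table.
    double avg = (double) total / sizesMb.length;

    int i = 0;
    while (i < sizesMb.length) {
      if (sizesMb[i] > SPLIT_FACTOR * avg) {
        plans.add("SPLIT " + i);
        i++;
      } else if (i + 1 < sizesMb.length && sizesMb[i] + sizesMb[i + 1] < avg) {
        // Assumed merge rule: two adjacent regions that together stay below the norm.
        plans.add("MERGE " + i + "," + (i + 1));
        i += 2; // skip the neighbor that was just merged
      } else {
        i++;
      }
    }
    return plans;
  }

  public static void main(String[] args) {
    // Average is 122 MB: regions 0 and 1 merge, region 2 splits.
    System.out.println(plan(new long[] {10, 5, 400, 100, 95})); // prints [MERGE 0,1, SPLIT 2]
  }
}
```

The sketch only emits descriptions of the actions; in the real subsystem the corresponding plan objects are executed separately.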
The public interface to the Region Normalizer subsystem is limited to the following classes:
- The RegionNormalizerFactory provides an entry point for creating an instance of the RegionNormalizerManager.
- The RegionNormalizerManager encapsulates the whole Region Normalizer subsystem. You'll find one of these hanging off of the HMaster, which uses it to delegate API calls. There is usually only a single instance of this class.
- Various configuration points that share the common prefix of hbase.normalizer.
  - Whether to split a region as part of normalization. Configuration: "hbase.normalizer.split.enabled", default: true.
  - Whether to merge a region as part of normalization. Configuration: "hbase.normalizer.merge.enabled", default: true.
  - The minimum number of regions in a table to consider it for merge normalization. Configuration: "hbase.normalizer.merge.min.region.count", default: 3.
  - The minimum age for a region to be considered for a merge, in days. Configuration: "hbase.normalizer.merge.min_region_age.days", default: 3.
  - The minimum size for a region to be considered for a merge, in whole MBs. Configuration: "hbase.normalizer.merge.min_region_size.mb", default: 0.
  - The limit on total throughput of the Region Normalizer's actions, in whole MBs. Configuration: "hbase.normalizer.throughput.max_bytes_per_sec", default: unlimited.
To see detailed logging of the application of these configuration values, set the log level for this package to TRACE.
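As a rough illustration of how the keys and defaults listed above fit together, the following sketch reads them from a plain Map. The class NormalizerSettingsSketch and its field names are hypothetical, and the Map merely stands in for the real configuration object; the throughput limit is left out because its value format is not described here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the documented keys and defaults; not an HBase class. */
final class NormalizerSettingsSketch {
  final boolean splitEnabled;
  final boolean mergeEnabled;
  final int mergeMinRegionCount;
  final int mergeMinRegionAgeDays;
  final long mergeMinRegionSizeMb;

  /** Reads each key from the map, falling back to the documented default. */
  NormalizerSettingsSketch(Map<String, String> conf) {
    splitEnabled = Boolean.parseBoolean(
        conf.getOrDefault("hbase.normalizer.split.enabled", "true"));
    mergeEnabled = Boolean.parseBoolean(
        conf.getOrDefault("hbase.normalizer.merge.enabled", "true"));
    mergeMinRegionCount = Integer.parseInt(
        conf.getOrDefault("hbase.normalizer.merge.min.region.count", "3"));
    mergeMinRegionAgeDays = Integer.parseInt(
        conf.getOrDefault("hbase.normalizer.merge.min_region_age.days", "3"));
    mergeMinRegionSizeMb = Long.parseLong(
        conf.getOrDefault("hbase.normalizer.merge.min_region_size.mb", "0"));
  }

  public static void main(String[] args) {
    Map<String, String> site = new HashMap<>();
    // Example overrides: disable splits and only merge regions older than a week.
    site.put("hbase.normalizer.split.enabled", "false");
    site.put("hbase.normalizer.merge.min_region_age.days", "7");

    NormalizerSettingsSketch s = new NormalizerSettingsSketch(site);
    System.out.println(s.splitEnabled + " " + s.mergeMinRegionAgeDays + " "
        + s.mergeMinRegionCount); // prints false 7 3
  }
}
```

Keys that are not overridden keep the defaults given in the list, which is why the region count still reads 3 in the example.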
- The RegionNormalizerStateStore provides a system by which the Normalizer can be disabled at runtime. It currently does this by storing the state in the master local region, but this is an implementation detail.
- The RegionNormalizerWorkQueue is a Set-like Queue that permits a single copy of a given work item to exist in the queue at one time. It also provides a facility for a producer to add an item to the front of the line. Consumers are blocked waiting for new work.
- The RegionNormalizerChore wakes up periodically and schedules new normalization work, adding targets to the queue.
- The RegionNormalizerWorker runs in a daemon thread, grabbing work off the queue as it becomes available.
- The SimpleRegionNormalizer implements the logic for calculating target region sizes and emitting a list of corresponding NormalizationPlan objects.
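The Set-like Queue behavior described for the RegionNormalizerWorkQueue can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual class: the name DedupWorkQueueSketch, the use of a LinkedHashSet guarded by intrinsic locking, and the choice to move an already-queued item to the front on putFirst are all hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;

/** Hypothetical Set-like blocking queue; not the real RegionNormalizerWorkQueue. */
final class DedupWorkQueueSketch<E> {
  // Insertion-ordered set: at most one copy of each item, FIFO iteration order.
  private LinkedHashSet<E> items = new LinkedHashSet<>();

  /** Adds to the back of the line; a duplicate of a queued item is ignored. */
  synchronized void put(E e) {
    if (items.add(e)) {
      notifyAll();
    }
  }

  /** Adds to the front of the line; an already-queued item is moved forward (assumed behavior). */
  synchronized void putFirst(E e) {
    LinkedHashSet<E> reordered = new LinkedHashSet<>();
    reordered.add(e);
    reordered.addAll(items); // a second copy of e is skipped by the set
    items = reordered;
    notifyAll();
  }

  /** Blocks until an item is available, then removes and returns the head. */
  synchronized E take() throws InterruptedException {
    while (items.isEmpty()) {
      wait();
    }
    Iterator<E> it = items.iterator();
    E head = it.next();
    it.remove();
    return head;
  }

  synchronized int size() {
    return items.size();
  }

  public static void main(String[] args) throws InterruptedException {
    DedupWorkQueueSketch<String> queue = new DedupWorkQueueSketch<>();
    queue.put("table1");
    queue.put("table2");
    queue.put("table1");      // duplicate, ignored
    queue.putFirst("table3"); // jumps the line
    System.out.println(queue.take() + " " + queue.take() + " " + queue.take());
    // prints table3 table1 table2
  }
}
```

In this sketch a producer such as a periodic chore would call put, an on-demand request could call putFirst, and a worker running in a daemon thread would loop on take, which blocks while the queue is empty.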
Class descriptions:
- Normalization plan to merge adjacent regions.
- A helper for constructing instances of MergeNormalizationPlan.
- A NormalizationPlan describes some modification to region split points as identified by an instance of RegionNormalizer.
- A POJO that carries details about a region selected for normalization through the pipeline.
- Performs "normalization" of the regions of a table, ensuring that a suboptimal choice of split keys doesn't leave the cluster with some regions substantially larger than others for a considerable amount of time.
- Chore that will periodically call HMaster.normalizeRegions(NormalizeTableFilterParams, boolean).
- Factory to create an instance of RegionNormalizer as configured.
- This class encapsulates the details of the RegionNormalizer subsystem.
- Stores region normalizer state.
- Consumes normalization request targets (TableNames) off the RegionNormalizerWorkQueue, dispatches them to the RegionNormalizer, and executes the resulting NormalizationPlans.
- A specialized collection that holds pending work for the RegionNormalizerWorker.
- Simple implementation of region normalizer.
- Holds the configuration values read from Configuration.
- Normalization plan to split a region.