Package org.apache.hadoop.hbase.util
Interface RegionSplitter.SplitAlgorithm
- All Known Implementing Classes:
RegionSplitter.DecimalStringSplit, RegionSplitter.HexStringSplit, RegionSplitter.NumberStringSplit, RegionSplitter.UniformSplit
- Enclosing class:
- RegionSplitter
public static interface RegionSplitter.SplitAlgorithm
A generic interface that the RegionSplitter code uses for all of its functionality. Note that
the original authors of this code use RegionSplitter.HexStringSplit to partition their table and set
it as the default, but provide this interface so that you can supply a custom algorithm. To use it,
create a new class that implements this interface and call
RegionSplitter.createPresplitTable(org.apache.hadoop.hbase.TableName, org.apache.hadoop.hbase.util.RegionSplitter.SplitAlgorithm, java.lang.String[], org.apache.hadoop.conf.Configuration)
or RegionSplitter.rollingSplit(TableName, SplitAlgorithm, Configuration), with the argument
splitClassName giving the name of your class.
-
Method Summary
Modifier and Type  Method  Description
byte[]  firstRow()  In HBase, the first row is represented by an empty byte array.
byte[]  lastRow()  In HBase, the last row is represented by an empty byte array.
String  rowToStr(byte[] row)  Returns the String to use for debug and file printing of the given row.
String  separator()  Returns the separator character to use when storing / printing the row.
void  setFirstRow(byte[] userInput)  Set the first row.
void  setFirstRow(String userInput)  In HBase, the first row is represented by an empty byte array.
void  setLastRow(byte[] userInput)  Set the last row.
void  setLastRow(String userInput)  In HBase, the last row is represented by an empty byte array.
byte[]  split(byte[] start, byte[] end)  Split a pre-existing region into 2 regions.
byte[][]  split(byte[] start, byte[] end, int numSplits, boolean inclusive)  Some MapReduce jobs may want to run multiple mappers per region; this is intended for such use cases.
byte[][]  split(int numRegions)  Split an entire table.
byte[]  strToRow(String input)  Returns the byte array representation of a row given as user or file input.
-
Method Details
-
split
byte[] split(byte[] start, byte[] end)
Split a pre-existing region into 2 regions.
- Parameters:
start
- first row (inclusive)
end
- last row (exclusive)
- Returns:
- the split row to use
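One way to picture a two-way split is to take the midpoint of the two keys. The following is a minimal sketch, not the code of any shipped implementation; it assumes both keys have the same length and can be compared as unsigned big-endian integers. The class MidpointSplitSketch and its helper toFixedWidth are hypothetical names introduced only for illustration.

```java
import java.math.BigInteger;

// Hypothetical helper class for illustration; not part of HBase.
final class MidpointSplitSketch {

    /** Returns the key halfway between start (inclusive) and end (exclusive). */
    static byte[] split(byte[] start, byte[] end) {
        if (start.length != end.length) {
            throw new IllegalArgumentException("start and end must have the same length");
        }
        // Signum 1 makes BigInteger read the bytes as an unsigned magnitude.
        BigInteger lo = new BigInteger(1, start);
        BigInteger hi = new BigInteger(1, end);
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        return toFixedWidth(lo.add(hi).shiftRight(1), start.length);
    }

    /** Right-aligns the magnitude of value in a zero-padded array of the given width. */
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray(); // may carry a leading sign byte or be shorter than width
        byte[] out = new byte[width];
        int n = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - n, out, width - n, n);
        return out;
    }
}
```

With the two-byte keys 0x0000 and 0xFFFF, this sketch yields 0x7FFF as the split row.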
-
split
byte[][] split(int numRegions)
Split an entire table. User input is validated at this time, and a runtime exception may be thrown in response to a parse failure.
- Parameters:
numRegions
- number of regions to split the table into
- Returns:
- array of split keys for the initial regions of the table. The length of the returned array should be numRegions-1.
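The numRegions-1 relationship can be illustrated with a small sketch that divides a key space evenly. It assumes 8-character hexadecimal row keys ranging from 00000000 to FFFFFFFF; that range, the lowercase output, and the class name EvenHexSplitSketch are assumptions made for the example, not a description of how any particular implementation computes its keys.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of HBase.
final class EvenHexSplitSketch {
    static final BigInteger FIRST = BigInteger.ZERO;                 // assumed first row
    static final BigInteger LAST = new BigInteger("FFFFFFFF", 16);   // assumed last row
    static final int WIDTH = 8;                                      // hex digits per key

    /** Returns numRegions-1 evenly spaced split keys as ASCII hex strings. */
    static byte[][] split(int numRegions) {
        if (numRegions < 1) {
            throw new IllegalArgumentException("numRegions must be at least 1");
        }
        BigInteger range = LAST.subtract(FIRST).add(BigInteger.ONE);
        BigInteger n = BigInteger.valueOf(numRegions);
        byte[][] keys = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // The i-th boundary sits at i/numRegions of the way through the range.
            BigInteger key = FIRST.add(range.multiply(BigInteger.valueOf(i)).divide(n));
            keys[i - 1] = toHex(key).getBytes(StandardCharsets.US_ASCII);
        }
        return keys;
    }

    /** Left-pads the lowercase hex form of v with zeros to WIDTH digits. */
    static String toHex(BigInteger v) {
        String s = v.toString(16);
        StringBuilder sb = new StringBuilder();
        for (int i = s.length(); i < WIDTH; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }
}
```

Under these assumptions, asking for 4 regions produces the 3 keys 40000000, 80000000 and c0000000.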
-
split
byte[][] split(byte[] start, byte[] end, int numSplits, boolean inclusive)
Some MapReduce jobs may want to run multiple mappers per region; this method is intended for such use cases.
- Parameters:
start
- first row (inclusive)
end
- last row (exclusive)
numSplits
- number of splits to generate
inclusive
- whether start and end are returned as split points
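The effect of the inclusive flag can be sketched as follows. This example assumes that numSplits is the number of sub-ranges to produce, so that there are numSplits-1 interior split points and, when inclusive is true, start and end are added at either side. That reading is an assumption of the sketch, as are the equal-length unsigned keys and the class name SubRangeSplitSketch; check the implementation you use for its exact behavior.

```java
import java.math.BigInteger;

// Hypothetical helper class for illustration; not part of HBase.
final class SubRangeSplitSketch {

    /** Cuts [start, end) into numSplits sub-ranges and returns the boundaries. */
    static byte[][] split(byte[] start, byte[] end, int numSplits, boolean inclusive) {
        if (start.length != end.length) {
            throw new IllegalArgumentException("start and end must have the same length");
        }
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be at least 1");
        }
        BigInteger lo = new BigInteger(1, start);
        BigInteger hi = new BigInteger(1, end);
        if (hi.compareTo(lo) <= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        BigInteger step = hi.subtract(lo).divide(BigInteger.valueOf(numSplits));
        int interior = numSplits - 1;
        byte[][] out = new byte[inclusive ? interior + 2 : interior][];
        int pos = 0;
        if (inclusive) {
            out[pos++] = start.clone(); // start is reported as the first split point
        }
        for (int i = 1; i <= interior; i++) {
            BigInteger point = lo.add(step.multiply(BigInteger.valueOf(i)));
            out[pos++] = toFixedWidth(point, start.length);
        }
        if (inclusive) {
            out[pos] = end.clone(); // end is reported as the last split point
        }
        return out;
    }

    /** Right-aligns the magnitude of value in a zero-padded array of the given width. */
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int n = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - n, out, width - n, n);
        return out;
    }
}
```

For the one-byte keys 0x00 and 0x40 with numSplits set to 4, the sketch returns 0x10, 0x20, 0x30 when inclusive is false, and 0x00, 0x10, 0x20, 0x30, 0x40 when it is true. A MapReduce job could then assign one mapper to each adjacent pair of boundaries.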
-
firstRow
byte[] firstRow()
In HBase, the first row is represented by an empty byte array. This might cause problems with your split algorithm or row printing. All your APIs will be passed firstRow() instead of an empty array.
- Returns:
- your representation of your first row
-
lastRow
byte[] lastRow()
In HBase, the last row is represented by an empty byte array. This might cause problems with your split algorithm or row printing. All your APIs will be passed lastRow() instead of an empty array.
- Returns:
- your representation of your last row
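The substitution described for firstRow() and lastRow() can be sketched as a small normalization step: an empty boundary key is swapped for a concrete representation before any arithmetic or printing is done on it. The values 00000000 and FFFFFFFF and the names BoundaryRowSketch, normalizeStart and normalizeEnd are illustrative assumptions, not part of the interface.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of HBase.
final class BoundaryRowSketch {
    // Assumed concrete representations of the open-ended table boundaries.
    static final byte[] FIRST_ROW = "00000000".getBytes(StandardCharsets.US_ASCII);
    static final byte[] LAST_ROW = "FFFFFFFF".getBytes(StandardCharsets.US_ASCII);

    /** Replaces an empty region start key with the concrete first row. */
    static byte[] normalizeStart(byte[] regionStartKey) {
        return regionStartKey.length == 0 ? FIRST_ROW.clone() : regionStartKey;
    }

    /** Replaces an empty region end key with the concrete last row. */
    static byte[] normalizeEnd(byte[] regionEndKey) {
        return regionEndKey.length == 0 ? LAST_ROW.clone() : regionEndKey;
    }
}
```

Non-empty keys pass through unchanged, so only the first and last regions of a table are affected.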
-
setFirstRow
void setFirstRow(String userInput)
In HBase, the first row is represented by an empty byte array. Set this value to help the split code understand how to evenly divide the first region.
- Parameters:
userInput
- raw user input (may throw RuntimeException on parse failure)
-
setLastRow
void setLastRow(String userInput)
In HBase, the last row is represented by an empty byte array. Set this value to help the split code understand how to evenly divide the last region. Note that this last row is inclusive for all rows sharing the same prefix.
- Parameters:
userInput
- raw user input (may throw RuntimeException on parse failure)
-
strToRow
byte[] strToRow(String input)
- Parameters:
input
- user or file input for row
- Returns:
- byte array representation of this row for HBase
-
rowToStr
String rowToStr(byte[] row)
- Parameters:
row
- byte array representing a row in HBase
- Returns:
- String to use for debug & file printing
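strToRow and rowToStr are expected to be inverses of each other, so that a row written to a file can be read back unchanged. The sketch below shows one possible pairing that writes each byte as two lowercase hex digits. This encoding and the class name RowTextCodecSketch are assumptions made for the example; the shipped implementations may use a different text form.

```java
// Hypothetical helper class for illustration; not part of HBase.
final class RowTextCodecSketch {

    /** Parses a string of hex digit pairs into the bytes of a row key. */
    static byte[] strToRow(String input) {
        String s = input.trim();
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of hex digits: " + s);
        }
        byte[] row = new byte[s.length() / 2];
        for (int i = 0; i < row.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                // A parse failure surfaces as a runtime exception.
                throw new IllegalArgumentException("not a hex string: " + s);
            }
            row[i] = (byte) ((hi << 4) | lo);
        }
        return row;
    }

    /** Formats a row key as lowercase hex, two digits per byte. */
    static String rowToStr(byte[] row) {
        StringBuilder sb = new StringBuilder(row.length * 2);
        for (byte b : row) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }
}
```

In this sketch, strToRow("0a1b") gives the two bytes 0x0A and 0x1B, and rowToStr turns them back into "0a1b".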
-
separator
String separator()
Returns the separator character to use when storing / printing the row.
-
setFirstRow
void setFirstRow(byte[] userInput)
Set the first row.
- Parameters:
userInput
- byte array of the row key.
-
setLastRow
void setLastRow(byte[] userInput)
Set the last row.
- Parameters:
userInput
- byte array of the row key.
-