org.apache.hadoop.hbase.util.OrderedBytes

@Public public class OrderedBytes extends Object

Utility class that handles ordered byte arrays. That is, unlike Bytes, these methods produce byte arrays which maintain the sort order of the original values.

Encoding Format summary

Each value is encoded as one or more bytes. The first byte of the encoding, its meaning, and a terse description of the bytes that follow are given in the following table:

Content Type	Encoding
NULL	0x05
negative infinity	0x07
negative large	0x08, ~E, ~M
negative medium	0x13-E, ~M
negative small	0x14, -E, ~M
zero	0x15
positive small	0x16, ~-E, M
positive medium	0x17+E, M
positive large	0x22, E, M
positive infinity	0x23
NaN	0x25
fixed-length 32-bit integer	0x27, I
fixed-length 64-bit integer	0x28, I
fixed-length 8-bit integer	0x29
fixed-length 16-bit integer	0x2a
fixed-length 32-bit float	0x30, F
fixed-length 64-bit float	0x31, F
TEXT	0x33, T
variable length BLOB	0x37, B
byte-for-byte BLOB	0x38, X

Null Encoding

Each value that is a NULL encodes as a single byte of 0x05. Since every other value encoding begins with a byte greater than 0x05, this forces NULL values to sort first.

Text Encoding

Each text value begins with a single byte of 0x33 and ends with a single byte of 0x00. There are zero or more intervening bytes that encode the text value. The intervening bytes are chosen so that the encoding will sort in the desired collating order. The intervening bytes may not contain a 0x00 character; the only 0x00 byte allowed in a text encoding is the final byte.

The text encoding ends in 0x00 to ensure that when one string is a prefix of another, the shorter string sorts first.
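
As a rough illustration of the prefix property, the sketch below builds the byte layout described above by hand, using only the standard library. It is not the OrderedBytes implementation: the class TextEncodingSketch and its helpers are hypothetical, the header value 0x33 is the one listed in the table above, UTF-8 is assumed for the intervening bytes, and only ascending order is covered.

```java
import java.nio.charset.StandardCharsets;

final class TextEncodingSketch {
    static final byte TEXT_HEADER = 0x33; // header value as listed in the table above
    static final byte TERMINATOR = 0x00;

    // Lays out header + UTF-8 bytes + 0x00 terminator (ascending order only).
    static byte[] encode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        for (byte b : utf8) {
            if (b == 0x00) {
                throw new IllegalArgumentException("0x00 is reserved for the terminator");
            }
        }
        byte[] out = new byte[utf8.length + 2];
        out[0] = TEXT_HEADER;
        System.arraycopy(utf8, 0, out, 1, utf8.length);
        out[out.length - 1] = TERMINATOR;
        return out;
    }

    // Unsigned lexicographic comparison, the order in which raw byte keys sort.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // "ab" is a prefix of "abc"; its terminator 0x00 sorts below the byte for 'c'.
        System.out.println(compare(encode("ab"), encode("abc")) < 0);
    }
}
```

Because the terminator is the smallest possible byte, the comparison is decided at the position where the shorter string ends, even when further fields follow the text value in a compound key.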

Binary Encoding

There are two encoding strategies for binary fields, referred to as "BlobVar" and "BlobCopy". BlobVar is less efficient in both space and encoding time. It has no limitations on the range of encoded values. BlobCopy is a byte-for-byte copy of the input data followed by a termination byte. It is extremely fast to encode and decode. It carries the restriction of not allowing a 0x00 value in the input byte[] as this value is used as the termination byte.

BlobVar

"BlobVar" encodes the input byte[] in a manner similar to a variable length integer encoding. As with the other OrderedBytes encodings, the first encoded byte is used to indicate what kind of value follows. This header byte is 0x37 for BlobVar encoded values. As with the traditional varint encoding, the most significant bit of each subsequent encoded byte is used as a continuation marker. The 7 remaining bits contain the 7 most significant bits of the first unencoded byte. The next encoded byte starts with a continuation marker in the MSB. The least significant bit from the first unencoded byte follows, and the remaining 6 bits contain the 6 MSBs of the second unencoded byte. The encoding continues, encoding 7 bytes on to 8 encoded bytes. The MSB of the final encoded byte contains a termination marker rather than a continuation marker, and any remaining bits from the final input byte. Any trailing bits in the final encoded byte are zeros.

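The bit packing described for BlobVar can be sketched with the standard library alone. This is an illustrative reading of the paragraph above, not the OrderedBytes code: the class BlobVarSketch and its method names are hypothetical, the header 0x37 is the value quoted in the text, the empty-input case follows the encodeBlobVar description (header byte followed by 0x00), and only ascending order is handled.

```java
final class BlobVarSketch {
    static final byte HEADER = 0x37; // header value quoted in the text above

    // Packs the input as a bit stream, 7 payload bits per encoded byte.
    // The top bit of each encoded byte is 1 (continuation) except on the last byte.
    static byte[] encode(byte[] val) {
        if (val.length == 0) {
            return new byte[] {HEADER, 0x00}; // header followed by a bare termination byte
        }
        int totalBits = val.length * 8;
        int groups = (totalBits + 6) / 7; // ceil(totalBits / 7)
        byte[] out = new byte[1 + groups];
        out[0] = HEADER;
        for (int g = 0; g < groups; g++) {
            int b = 0;
            for (int k = 0; k < 7; k++) {
                int bit = g * 7 + k;
                b <<= 1;
                if (bit < totalBits) {
                    // Take input bits most significant first; missing bits stay zero.
                    b |= (val[bit / 8] >> (7 - bit % 8)) & 1;
                }
            }
            if (g < groups - 1) {
                b |= 0x80; // continuation marker
            }
            out[1 + g] = (byte) b;
        }
        return out;
    }

    // Encoded length, header included, implied by the packing above.
    static int encodedLength(int len) {
        return len == 0 ? 2 : 1 + (len * 8 + 6) / 7;
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[7]).length == encodedLength(7));
    }
}
```

Seven input bytes carry 56 bits, which fill exactly eight 7-bit payloads, so the sketch reproduces the 7-to-8 expansion mentioned above.
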
BlobCopy

"BlobCopy" is a simple byte-for-byte copy of the input data. It uses 0x38 as the header byte, and is terminated by 0x00 in the DESCENDING case. This alternative encoding is faster and more space-efficient, but it cannot accept values containing a 0x00 byte in DESCENDING order.

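One way to see why the terminator matters only in the DESCENDING case is the sketch below. It is a hand-written illustration rather than the library code: the class BlobCopySketch is hypothetical, the header 0x38 is the value quoted above, and the sketch assumes that descending order is realised by complementing every encoded byte, which the text above does not spell out.

```java
final class BlobCopySketch {
    static final byte HEADER = 0x38; // header value quoted in the text above

    static byte[] encode(byte[] val, boolean descending) {
        if (!descending) {
            // Ascending: header followed by a plain copy of the input.
            byte[] out = new byte[val.length + 1];
            out[0] = HEADER;
            System.arraycopy(val, 0, out, 1, val.length);
            return out;
        }
        // Descending: header, copy, 0x00 terminator; input must not contain 0x00.
        byte[] out = new byte[val.length + 2];
        out[0] = HEADER;
        for (int i = 0; i < val.length; i++) {
            if (val[i] == 0x00) {
                throw new IllegalArgumentException("0x00 not allowed in DESCENDING order");
            }
            out[i + 1] = val[i];
        }
        out[out.length - 1] = 0x00;
        // Assumption: descending order is obtained by complementing every byte.
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) ~out[i];
        }
        return out;
    }

    // Unsigned lexicographic comparison, the order in which raw byte keys sort.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // In descending order the longer value {1} must sort before the empty value.
        System.out.println(compare(encode(new byte[] {1}, true), encode(new byte[0], true)) < 0);
    }
}
```

Under that assumption the complemented terminator becomes 0xFF, which is larger than any complemented data byte, so a value sorts before each of its own prefixes, as descending order requires.
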
Variable-length Numeric Encoding

Numeric values must be coded so as to sort in numeric order. We assume that numeric values can be both integer and floating point values. Clients must be careful to use inspection methods for encoded values (such as isNumericInfinite(PositionedByteRange) and isNumericNaN(PositionedByteRange)) to protect against decoding values into objects which do not support these numeric concepts (such as Long and BigDecimal).

Simplest cases first: If the numeric value is a NaN, then the encoding is a single byte of 0x25. This causes NaN values to sort after every other numeric value.

If the numeric value is a negative infinity then the encoding is a single byte of 0x07. Since every other numeric value has a larger initial byte, this encoding ensures that negative infinity will sort prior to every other numeric value.

If the numeric value is a positive infinity then the encoding is a single byte of 0x23. Every other numeric value encoding except NaN begins with a smaller byte, ensuring that positive infinity sorts after every numeric value other than NaN. Both 0x23 and the NaN byte 0x25 are smaller than 0x33, the initial byte of a text value, ensuring that every numeric value sorts before every text value.

If the numeric value is exactly zero then it is encoded as a single byte of 0x15. Finite negative values will have initial bytes of 0x08 through 0x14 and finite positive values will have initial bytes of 0x16 through 0x22.

For all numeric values, we compute a mantissa M and an exponent E. The mantissa is a base-100 representation of the value. The exponent E determines where to put the decimal point.

Each centimal digit of the mantissa is stored in a byte. If the value of the centimal digit is X (hence X≥0 and X≤99) then the byte value will be 2*X+1 for every byte of the mantissa, except for the last byte which will be 2*X+0. The mantissa must be the minimum number of bytes necessary to represent the value; trailing X==0 digits are omitted. This means that the mantissa will never contain a byte with the value 0x00.
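
The byte mapping for mantissa digits can be sketched as follows. The class CentimalMantissaSketch is hypothetical and takes the centimal digits as an already-computed int array; it only illustrates the 2*X+1 / 2*X rule stated above.

```java
final class CentimalMantissaSketch {
    // Maps centimal digits (each 0..99, last digit non-zero) to mantissa bytes.
    static byte[] encode(int[] digits) {
        if (digits.length == 0 || digits[digits.length - 1] == 0) {
            throw new IllegalArgumentException("trailing zero digits must be omitted");
        }
        byte[] out = new byte[digits.length];
        for (int i = 0; i < digits.length; i++) {
            int x = digits[i];
            if (x < 0 || x > 99) {
                throw new IllegalArgumentException("centimal digit out of range: " + x);
            }
            boolean last = i == digits.length - 1;
            // Every byte but the last has its low bit set; the last byte has it clear.
            out[i] = (byte) (last ? 2 * x : 2 * x + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        // 0.123450 has centimal digits 12, 34, 50.
        byte[] m = encode(new int[] {12, 34, 50});
        System.out.println(m.length);
    }
}
```

Since the final digit is at least 1, the final byte is at least 2, and every earlier byte is odd, so no mantissa byte can be 0x00. The set low bit on non-final bytes also makes a mantissa sort before any longer mantissa that extends it.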

If we assume all digits of the mantissa occur to the right of the decimal point, then the exponent E is the power of one hundred by which one must multiply the mantissa to recover the original value.

Values are classified as large, medium, or small according to the value of E. If E is 11 or more, the value is large. For E between 0 and 10, the value is medium. For E less than zero, the value is small.
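
A minimal sketch of how E might be derived and a value classified, using BigDecimal from the standard library. The class NumericExponentSketch and its methods are hypothetical and follow the paragraphs above literally. Under that literal reading a value in [0.01, 1) gets E = 0; the encodeNumericSmall description suggests the actual code routes such values to the small form, so treat that boundary as unverified.

```java
import java.math.BigDecimal;

final class NumericExponentSketch {
    private static final BigDecimal HUNDREDTH = new BigDecimal("0.01");

    // Returns E such that abs(val) = M * 100^E with 0.01 <= M < 1.
    static int exponent(BigDecimal val) {
        BigDecimal m = val.abs();
        if (m.signum() == 0) {
            throw new IllegalArgumentException("zero has its own single-byte encoding");
        }
        int e = 0;
        while (m.compareTo(BigDecimal.ONE) >= 0) {
            m = m.movePointLeft(2); // divide by 100
            e++;
        }
        while (m.compareTo(HUNDREDTH) < 0) {
            m = m.movePointRight(2); // multiply by 100
            e--;
        }
        return e;
    }

    // Applies the thresholds quoted above: E >= 11 large, 0..10 medium, E < 0 small.
    static String classify(BigDecimal val) {
        int e = exponent(val);
        if (e >= 11) {
            return "large";
        }
        return e >= 0 ? "medium" : "small";
    }

    public static void main(String[] args) {
        BigDecimal v = new BigDecimal("1234.5"); // 0.12345 * 100^2
        System.out.println(exponent(v) + " " + classify(v));
    }
}
```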

Large positive values are encoded as a single byte 0x22 followed by E as a varint and then M. Medium positive values are a single byte of 0x17+E followed by M. Small positive values are encoded as a single byte 0x16 followed by the ones-complement of the varint for -E followed by M.

Small negative values are encoded as a single byte 0x14 followed by -E as a varint and then the ones-complement of M. Medium negative values are encoded as a byte 0x13-E followed by the ones-complement of M. Large negative values consist of the single byte 0x08 followed by the ones-complement of the varint encoding of E followed by the ones-complement of M.
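
To make the medium forms concrete, the sketch below encodes values with abs(val) >= 1 and E <= 10 by hand and compares the resulting keys. It is an illustration under stated assumptions, not the library routine: the class MediumNumericSketch is hypothetical, the header bytes 0x17+E and 0x13-E come from the text above, and the large and small forms (which need the varint exponent) are deliberately left out.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigDecimal;

final class MediumNumericSketch {
    static byte[] encode(BigDecimal val) {
        BigDecimal m = val.abs();
        if (m.compareTo(BigDecimal.ONE) < 0) {
            throw new IllegalArgumentException("sketch covers abs(val) >= 1 only");
        }
        int e = 0;
        while (m.compareTo(BigDecimal.ONE) >= 0) {
            m = m.movePointLeft(2); // scale M into [0.01, 1)
            e++;
        }
        if (e > 10) {
            throw new IllegalArgumentException("large values need the varint exponent form");
        }
        boolean negative = val.signum() < 0;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(negative ? 0x13 - e : 0x17 + e); // header byte carries E
        while (m.signum() != 0) {
            m = m.movePointRight(2);
            int x = m.intValue(); // next centimal digit, 0..99
            m = m.subtract(BigDecimal.valueOf(x));
            int b = m.signum() != 0 ? 2 * x + 1 : 2 * x; // last digit has low bit clear
            out.write(negative ? ~b & 0xff : b); // negatives store the ones-complement
        }
        return out.toByteArray();
    }

    // Unsigned lexicographic comparison, the order in which raw byte keys sort.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] a = encode(new BigDecimal("-1.5"));
        byte[] b = encode(new BigDecimal("1.5"));
        System.out.println(compare(a, b) < 0);
    }
}
```

For negative values both the header (0x13-E shrinks as E grows) and the complemented mantissa reverse the byte order, which is what makes larger magnitudes sort first.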

Fixed-length Integer Encoding

All 4-byte integers are serialized to a 5-byte, fixed-width, sortable byte format. All 8-byte integers are serialized to the equivalent 9-byte format. Serialization is performed by writing a header byte, inverting the integer sign bit, and writing the resulting bytes to the byte array in big-endian order.
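
The sign-bit inversion can be sketched in a few lines of plain Java. The class FixedInt32Sketch is hypothetical, the header 0x27 is the value the table above lists for 32-bit integers, and only ascending order is shown.

```java
final class FixedInt32Sketch {
    static final byte HEADER = 0x27; // value listed for 32-bit integers in the table above

    // Header byte, then the sign-flipped value as 4 big-endian bytes.
    static byte[] encode(int val) {
        int flipped = val ^ Integer.MIN_VALUE; // invert the sign bit
        return new byte[] {
            HEADER,
            (byte) (flipped >>> 24),
            (byte) (flipped >>> 16),
            (byte) (flipped >>> 8),
            (byte) flipped
        };
    }

    // Inverse of encode: reassemble the big-endian int and flip the sign bit back.
    static int decode(byte[] enc) {
        int flipped = ((enc[1] & 0xff) << 24)
            | ((enc[2] & 0xff) << 16)
            | ((enc[3] & 0xff) << 8)
            | (enc[4] & 0xff);
        return flipped ^ Integer.MIN_VALUE;
    }

    // Unsigned lexicographic comparison, the order in which raw byte keys sort.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        System.out.println(compare(encode(-1), encode(1)) < 0);
    }
}
```

Flipping the sign bit maps Integer.MIN_VALUE to all-zero bytes and Integer.MAX_VALUE to all-one bytes, so unsigned byte comparison agrees with signed integer comparison.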

Fixed-length Floating Point Encoding

32-bit and 64-bit floating point numbers are encoded to a 5-byte and 9-byte encoding format, respectively. The format is identical, save for the precision respected in each step of the operation.

This format ensures the following total ordering of floating point values: Float.NEGATIVE_INFINITY < -Float.MAX_VALUE < ... < -Float.MIN_VALUE < -0.0 < +0.0 < Float.MIN_VALUE < ... < Float.MAX_VALUE < Float.POSITIVE_INFINITY < Float.NaN

Floating point numbers are encoded as specified in IEEE 754. A 32-bit single precision float consists of a sign bit, an 8-bit unsigned exponent encoded in offset-127 notation, and a 23-bit significand. The format is described further in the Single Precision Floating Point Wikipedia page.

The value of a normal float is (-1)^{sign bit} × 2^{exponent - 127} × 1.significand

The IEEE 754 floating point format already preserves sort ordering for positive floating point numbers when the raw bytes are compared in most significant byte order. This is discussed further at http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

Thus, we need only ensure that negative numbers sort in the exact opposite order as positive numbers (so that, say, negative infinity is less than negative 1), and that all negative numbers compare less than any positive number. To accomplish this, we invert the sign bit of all floating point numbers, and we also invert the exponent and significand bits if the floating point number was negative.

More specifically, we first store the floating point bits into a 32-bit int j using Float.floatToIntBits(float). This method collapses all NaNs into a single, canonical NaN value but otherwise leaves the bits unchanged. We then compute

 j ^= (j >> (Integer.SIZE - 1)) | Integer.MIN_VALUE

which inverts the sign bit and XORs all other bits with the sign bit itself. Comparing the raw bytes of j in most significant byte order is equivalent to performing a single precision floating point comparison on the underlying bits (ignoring NaN comparisons, as NaNs don't compare equal to anything when performing floating point comparisons).

The resulting integer is then converted into a byte array by serializing the integer one byte at a time in most significant byte order. The serialized integer is prefixed by a single header byte. All serialized values are 5 bytes in length.
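
The steps above can be sketched with the standard library only. The class SortableFloat32Sketch is hypothetical, the header 0x30 is the value the table above lists for 32-bit floats, only ascending order is shown, and the decode method is simply the algebraic inverse of the transform, added here for illustration.

```java
final class SortableFloat32Sketch {
    static final byte HEADER = 0x30; // value listed for 32-bit floats in the table above

    static byte[] encode(float val) {
        int j = Float.floatToIntBits(val); // collapses all NaNs to one canonical value
        // Flip the sign bit; if the value was negative, flip every other bit too.
        j ^= (j >> (Integer.SIZE - 1)) | Integer.MIN_VALUE;
        return new byte[] {
            HEADER, (byte) (j >>> 24), (byte) (j >>> 16), (byte) (j >>> 8), (byte) j
        };
    }

    // Inverse transform: a cleared top bit marks a value that was negative.
    static float decode(byte[] enc) {
        int j = ((enc[1] & 0xff) << 24)
            | ((enc[2] & 0xff) << 16)
            | ((enc[3] & 0xff) << 8)
            | (enc[4] & 0xff);
        j ^= (~j >> (Integer.SIZE - 1)) | Integer.MIN_VALUE;
        return Float.intBitsToFloat(j);
    }

    // Unsigned lexicographic comparison, the order in which raw byte keys sort.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        System.out.println(compare(encode(-0.0f), encode(0.0f)) < 0);
    }
}
```

The arithmetic shift `j >> 31` yields all ones for a negative input and all zeros otherwise, so a single XOR covers both cases without a branch.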

OrderedBytes encodings are heavily influenced by the SQLite4 Key Encoding. Slight deviations are made in the interest of order correctness and user extensibility. Fixed-width Long and Double encodings are based on implementations from the now defunct Orderly library.

Field Summary

Fields

Modifier and Type

Field

Description

private static final byte

BLOB_COPY

private static final byte

BLOB_VAR

static final MathContext

DEFAULT_MATH_CONTEXT

The context used to normalize BigDecimal values.

private static final byte

FIXED_FLOAT32

private static final byte

FIXED_FLOAT64

private static final byte

FIXED_INT16

private static final byte

FIXED_INT32

private static final byte

FIXED_INT64

private static final byte

FIXED_INT8

static final int

MAX_PRECISION

Max precision guaranteed to fit into a long.

private static final byte

NAN

private static final byte

NEG_INF

private static final byte

NEG_LARGE

private static final byte

NEG_MED_MAX

private static final byte

NEG_MED_MIN

private static final byte

NEG_SMALL

private static final byte

NULL

private static final byte

POS_INF

private static final byte

POS_LARGE

private static final byte

POS_MED_MAX

private static final byte

POS_MED_MIN

private static final byte

POS_SMALL

private static final byte

TERM

private static final byte

TEXT

static final Charset

UTF8

private static final byte

ZERO
Constructor Summary

Constructors

Constructor

Description

OrderedBytes()
Method Summary

Modifier and Type

Method

Description

(package private) static int

blobVarDecodedLength(int len)

Calculate the expected BlobVar decoded length based on encoded length.

static int

blobVarEncodedLength(int len)

Calculate the expected BlobVar encoded length based on unencoded length.

static byte[]

decodeBlobCopy(PositionedByteRange src)

Decode a Blob value, byte-for-byte copy.

static byte[]

decodeBlobVar(PositionedByteRange src)

Decode a blob value that was encoded using BlobVar encoding.

static float

decodeFloat32(PositionedByteRange src)

Decode a 32-bit floating point value using the fixed-length encoding.

static double

decodeFloat64(PositionedByteRange src)

Decode a 64-bit floating point value using the fixed-length encoding.

static short

decodeInt16(PositionedByteRange src)

Decode an int16 value.

static int

decodeInt32(PositionedByteRange src)

Decode an int32 value.

static long

decodeInt64(PositionedByteRange src)

Decode an int64 value.

static byte

decodeInt8(PositionedByteRange src)

Decode an int8 value.

static BigDecimal

decodeNumericAsBigDecimal(PositionedByteRange src)

Decode a BigDecimal value from the variable-length encoding.

static double

decodeNumericAsDouble(PositionedByteRange src)

Decode a primitive double value from the Numeric encoding.

static long

decodeNumericAsLong(PositionedByteRange src)

Decode a primitive long value from the Numeric encoding.

private static BigDecimal

decodeNumericValue(PositionedByteRange src)

Decode a BigDecimal from src.

private static BigDecimal

decodeSignificand(PositionedByteRange src, int e, boolean comp)

Read significand digits from src according to the magnitude of e.

static String

decodeString(PositionedByteRange src)

Decode a String value.

static int

encodeBlobCopy(PositionedByteRange dst, byte[] val, int voff, int vlen, Order ord)

Encode a Blob value as a byte-for-byte copy.

static int

encodeBlobCopy(PositionedByteRange dst, byte[] val, Order ord)

Encode a Blob value as a byte-for-byte copy.

static int

encodeBlobVar(PositionedByteRange dst, byte[] val, int voff, int vlen, Order ord)

Encode a Blob value using a modified varint encoding scheme.

static int

encodeBlobVar(PositionedByteRange dst, byte[] val, Order ord)

Encode a blob value using a modified varint encoding scheme.

static int

encodeFloat32(PositionedByteRange dst, float val, Order ord)

Encode a 32-bit floating point value using the fixed-length encoding.

static int

encodeFloat64(PositionedByteRange dst, double val, Order ord)

Encode a 64-bit floating point value using the fixed-length encoding.

static int

encodeInt16(PositionedByteRange dst, short val, Order ord)

Encode an int16 value using the fixed-length encoding.

static int

encodeInt32(PositionedByteRange dst, int val, Order ord)

Encode an int32 value using the fixed-length encoding.

static int

encodeInt64(PositionedByteRange dst, long val, Order ord)

Encode an int64 value using the fixed-length encoding.

static int

encodeInt8(PositionedByteRange dst, byte val, Order ord)

Encode an int8 value using the fixed-length encoding.

static int

encodeNull(PositionedByteRange dst, Order ord)

Encode a null value.

static int

encodeNumeric(PositionedByteRange dst, double val, Order ord)

Encode a numerical value using the variable-length encoding.

static int

encodeNumeric(PositionedByteRange dst, long val, Order ord)

Encode a numerical value using the variable-length encoding.

static int

encodeNumeric(PositionedByteRange dst, BigDecimal val, Order ord)

Encode a numerical value using the variable-length encoding.

private static int

encodeNumericLarge(PositionedByteRange dst, BigDecimal val)

Encode the large magnitude floating point number val using the key encoding.

private static int

encodeNumericSmall(PositionedByteRange dst, BigDecimal val)

Encode the small magnitude floating point number val using the key encoding.

static int

encodeString(PositionedByteRange dst, String val, Order ord)

Encode a String value.

private static void

encodeToCentimal(PositionedByteRange dst, BigDecimal val)

Encode a value val in [0.01, 1.0) into Centimals.

(package private) static long

getVaruint64(PositionedByteRange src, boolean comp)

Decode a sequence of bytes in src as a varuint64.

static boolean

isBlobCopy(PositionedByteRange src)

Return true when the next encoded value in src uses BlobCopy encoding, false otherwise.

static boolean

isBlobVar(PositionedByteRange src)

Return true when the next encoded value in src uses BlobVar encoding, false otherwise.

static boolean

isEncodedValue(PositionedByteRange src)

Returns true when src appears to be positioned at an encoded value, false otherwise.

static boolean

isFixedFloat32(PositionedByteRange src)

Return true when the next encoded value in src uses fixed-width Float32 encoding, false otherwise.

static boolean

isFixedFloat64(PositionedByteRange src)

Return true when the next encoded value in src uses fixed-width Float64 encoding, false otherwise.

static boolean

isFixedInt16(PositionedByteRange src)

Return true when the next encoded value in src uses fixed-width Int16 encoding, false otherwise.

static boolean

isFixedInt32(PositionedByteRange src)

Return true when the next encoded value in src uses fixed-width Int32 encoding, false otherwise.

static boolean

isFixedInt64(PositionedByteRange src)

Return true when the next encoded value in src uses fixed-width Int64 encoding, false otherwise.

static boolean

isFixedInt8(PositionedByteRange src)

Return true when the next encoded value in src uses fixed-width Int8 encoding, false otherwise.

static boolean

isNull(PositionedByteRange src)

Return true when the next encoded value in src is null, false otherwise.

static boolean

isNumeric(PositionedByteRange src)

Return true when the next encoded value in src uses Numeric encoding, false otherwise.

static boolean

isNumericInfinite(PositionedByteRange src)

Return true when the next encoded value in src uses Numeric encoding and is Infinite, false otherwise.

static boolean

isNumericNaN(PositionedByteRange src)

Return true when the next encoded value in src uses Numeric encoding and is NaN, false otherwise.

static boolean

isNumericZero(PositionedByteRange src)

Return true when the next encoded value in src uses Numeric encoding and is 0, false otherwise.

static boolean

isText(PositionedByteRange src)

Return true when the next encoded value in src uses Text encoding, false otherwise.

static int

length(PositionedByteRange buff)

Return the number of encoded entries remaining in buff.

(package private) static int

lengthVaruint64(PositionedByteRange src, boolean comp)

Inspect src for an encoded varuint64 for its length in bytes.

(package private) static BigDecimal

normalize(BigDecimal val)

Strip all trailing zeros to ensure that no digit will be zero and round using our default context to ensure precision doesn't exceed max allowed.

private static int

putUint32(PositionedByteRange dst, int val)

Write a 32-bit unsigned integer to dst as 4 big-endian bytes.

(package private) static int

putVaruint64(PositionedByteRange dst, long val, boolean comp)

Encode an unsigned 64-bit integer val into dst.

static int

skip(PositionedByteRange src)

Skip src's position forward over one encoded value.

private static int

skipSignificand(PositionedByteRange src, boolean comp)

Skip src over the significand bytes.

(package private) static int

skipVaruint64(PositionedByteRange src, boolean cmp)

Skip src over the encoded varuint64.

private static IllegalArgumentException

unexpectedHeader(byte header)

Creates the standard exception when the encoded header byte is unexpected for the decoding context.

private static int

unsignedCmp(long x1, long x2)

Perform unsigned comparison between two long values.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- NULL
  
  private static final byte NULL
  See Also:
  
  Constant Field Values
- NEG_INF
  
  private static final byte NEG_INF
  See Also:
  
  Constant Field Values
- NEG_LARGE
  
  private static final byte NEG_LARGE
  See Also:
  
  Constant Field Values
- NEG_MED_MIN
  
  private static final byte NEG_MED_MIN
  See Also:
  
  Constant Field Values
- NEG_MED_MAX
  
  private static final byte NEG_MED_MAX
  See Also:
  
  Constant Field Values
- NEG_SMALL
  
  private static final byte NEG_SMALL
  See Also:
  
  Constant Field Values
- ZERO
  
  private static final byte ZERO
  See Also:
  
  Constant Field Values
- POS_SMALL
  
  private static final byte POS_SMALL
  See Also:
  
  Constant Field Values
- POS_MED_MIN
  
  private static final byte POS_MED_MIN
  See Also:
  
  Constant Field Values
- POS_MED_MAX
  
  private static final byte POS_MED_MAX
  See Also:
  
  Constant Field Values
- POS_LARGE
  
  private static final byte POS_LARGE
  See Also:
  
  Constant Field Values
- POS_INF
  
  private static final byte POS_INF
  See Also:
  
  Constant Field Values
- NAN
  
  private static final byte NAN
  See Also:
  
  Constant Field Values
- FIXED_INT8
  
  private static final byte FIXED_INT8
  See Also:
  
  Constant Field Values
- FIXED_INT16
  
  private static final byte FIXED_INT16
  See Also:
  
  Constant Field Values
- FIXED_INT32
  
  private static final byte FIXED_INT32
  See Also:
  
  Constant Field Values
- FIXED_INT64
  
  private static final byte FIXED_INT64
  See Also:
  
  Constant Field Values
- FIXED_FLOAT32
  
  private static final byte FIXED_FLOAT32
  See Also:
  
  Constant Field Values
- FIXED_FLOAT64
  
  private static final byte FIXED_FLOAT64
  See Also:
  
  Constant Field Values
- TEXT
  
  private static final byte TEXT
  See Also:
  
  Constant Field Values
- BLOB_VAR
  
  private static final byte BLOB_VAR
  See Also:
  
  Constant Field Values
- BLOB_COPY
  
  private static final byte BLOB_COPY
  See Also:
  
  Constant Field Values
- UTF8
  
  public static final Charset UTF8
- TERM
  
  private static final byte TERM
  See Also:
  
  Constant Field Values
- MAX_PRECISION
  
  public static final int MAX_PRECISION
  
  Max precision guaranteed to fit into a long.
  See Also:
  
  Constant Field Values
- DEFAULT_MATH_CONTEXT
  
  public static final MathContext DEFAULT_MATH_CONTEXT
  
  The context used to normalize BigDecimal values.
Constructor Details
- OrderedBytes
  
  public OrderedBytes()
Method Details
- unexpectedHeader
  
  private static IllegalArgumentException unexpectedHeader(byte header)
  
  Creates the standard exception when the encoded header byte is unexpected for the decoding context.
  
  Parameters:
  
  header - value used in error message.
- unsignedCmp
  
  private static int unsignedCmp(long x1, long x2)
  
  Perform unsigned comparison between two long values. Conforms to the same interface as CellComparator.
- putUint32
  
  private static int putUint32(PositionedByteRange dst, int val)
  
  Write a 32-bit unsigned integer to dst as 4 big-endian bytes.
  
  Returns:
  
  number of bytes written.
- putVaruint64
  
  @Private static int putVaruint64(PositionedByteRange dst, long val, boolean comp)
  
  Encode an unsigned 64-bit integer val into dst.
  
  Parameters:
  
  dst - The destination to which encoded bytes are written.
  
  val - The value to write.
  
  comp - Complement the encoded value when comp is true.
  
  Returns:
  
  number of bytes written.
- lengthVaruint64
  
  @Private static int lengthVaruint64(PositionedByteRange src, boolean comp)
  
  Inspect src for an encoded varuint64 for its length in bytes. Preserves the state of src.
  
  Parameters:
  
  src - source buffer
  
  comp - if true, parse the complement of the value.
  
  Returns:
  
  the number of bytes consumed by this value.
- skipVaruint64
  
  @Private static int skipVaruint64(PositionedByteRange src, boolean cmp)
  
  Skip src over the encoded varuint64.
  
  Parameters:
  
  src - source buffer
  
  cmp - if true, parse the complement of the value.
  
  Returns:
  
  the number of bytes skipped.
- getVaruint64
  
  @Private static long getVaruint64(PositionedByteRange src, boolean comp)
  
  Decode a sequence of bytes in src as a varuint64. Complement the encoded value when comp is true.
  
  Returns:
  
  the decoded value.
- normalize
  
  @Private static BigDecimal normalize(BigDecimal val)
  
  Strip all trailing zeros to ensure that no digit will be zero and round using our default context to ensure precision doesn't exceed max allowed. From Phoenix's NumberUtil.
  
  Returns:
  
  new BigDecimal instance
- decodeSignificand
  
  private static BigDecimal decodeSignificand(PositionedByteRange src, int e, boolean comp)
  
  Read significand digits from src according to the magnitude of e.
  
  Parameters:
  
  src - The source from which to read encoded digits.
  
  e - The magnitude of the first digit read.
  
  comp - Treat encoded bytes as complements when comp is true.
  
  Returns:
  
  The decoded value.
  
  Throws:
  
  IllegalArgumentException - when read exceeds the remaining length of src.
- skipSignificand
  
  private static int skipSignificand(PositionedByteRange src, boolean comp)
  
  Skip src over the significand bytes.
  
  Parameters:
  
  src - The source from which to read encoded digits.
  
  comp - Treat encoded bytes as complements when comp is true.
  
  Returns:
  
  the number of bytes skipped.
- encodeNumericSmall
  
  private static int encodeNumericSmall(PositionedByteRange dst, BigDecimal val)
  Encode the small magnitude floating point number val using the key encoding. The caller guarantees that 1.0 > abs(val) > 0.0.
  
  A floating point value is encoded as an integer exponent E and a mantissa M. The original value is equal to (M * 100^E). E is set to the smallest value possible without making M greater than or equal to 1.0.
  
  For this routine, E will always be zero or negative, since the original value is less than one. The encoding written by this routine is the ones-complement of the varint of the negative of E followed by the mantissa:
  Encoding: ~-E M
  Parameters:
  
  dst - The destination to which encoded digits are written.
  
  val - The value to encode.
  
  Returns:
  
  the number of bytes written.
- encodeNumericLarge
  
  private static int encodeNumericLarge(PositionedByteRange dst, BigDecimal val)
  Encode the large magnitude floating point number val using the key encoding. The caller guarantees that val will be finite and abs(val) >= 1.0.
  A floating point value is encoded as an integer exponent E and a mantissa M. The original value is equal to (M * 100^E). E is set to the smallest value possible without making M greater than or equal to 1.0.
  
  Each centimal digit of the mantissa is stored in a byte. If the value of the centimal digit is X (hence X>=0 and X<=99) then the byte value will be 2*X+1 for every byte of the mantissa, except for the last byte which will be 2*X+0. The mantissa must be the minimum number of bytes necessary to represent the value; trailing X==0 digits are omitted. This means that the mantissa will never contain a byte with the value 0x00.
  
  If E > 10, then this routine writes E as a varint followed by the mantissa as described above. Otherwise, if E <= 10, this routine only writes the mantissa and leaves the E value to be encoded as part of the opening byte of the field by the calling function.
  Encoding: M (if E <= 10), or E M (if E > 10)
  Parameters:
  
  dst - The destination to which encoded digits are written.
  
  val - The value to encode.
  
  Returns:
  
  the number of bytes written.
- encodeToCentimal
  
  private static void encodeToCentimal(PositionedByteRange dst, BigDecimal val)
  
  Encode a value val in [0.01, 1.0) into Centimals. Util function for encodeNumericLarge(PositionedByteRange, BigDecimal) and encodeNumericSmall(PositionedByteRange, BigDecimal)
  
  Parameters:
  
  dst - The destination to which encoded digits are written.
  
  val - A BigDecimal after the normalization. The value must be in [0.01, 1.0).
- encodeNumeric
  
  public static int encodeNumeric(PositionedByteRange dst, long val, Order ord)
  
  Encode a numerical value using the variable-length encoding.
  
  Parameters:
  
  dst - The destination to which encoded digits are written.
  
  val - The value to encode.
  
  ord - The Order to respect while encoding val.
  
  Returns:
  
  the number of bytes written.
- encodeNumeric
  
  public static int encodeNumeric(PositionedByteRange dst, double val, Order ord)
  
  Encode a numerical value using the variable-length encoding.
  
  Parameters:
  
  dst - The destination to which encoded digits are written.
  
  val - The value to encode.
  
  ord - The Order to respect while encoding val.
  
  Returns:
  
  the number of bytes written.
- encodeNumeric
  
  public static int encodeNumeric(PositionedByteRange dst, BigDecimal val, Order ord)
  
  Encode a numerical value using the variable-length encoding. If the number of significant digits of the value exceeds the MAX_PRECISION, the exceeding part will be lost.
  
  Parameters:
  
  dst - The destination to which encoded digits are written.
  
  val - The value to encode.
  
  ord - The Order to respect while encoding val.
  
  Returns:
  
  the number of bytes written.
- decodeNumericValue
  
  private static BigDecimal decodeNumericValue(PositionedByteRange src)
  
  Decode a BigDecimal from src. Assumes src encodes a value in Numeric encoding and is within the valid range of BigDecimal values. BigDecimal does not support NaN or Infinite values.
  See Also:
  
  decodeNumericAsDouble(PositionedByteRange)
- decodeNumericAsDouble
  
  public static double decodeNumericAsDouble(PositionedByteRange src)
  
  Decode a primitive double value from the Numeric encoding. Numeric encoding is based on BigDecimal; in the event the encoded value is larger than can be represented in a double, this method performs an implicit narrowing conversion as described in BigDecimal.doubleValue().
  Throws:
  
  NullPointerException - when the encoded value is NULL.
  
  IllegalArgumentException - when the encoded value is not a Numeric.
  
  See Also:
  
  encodeNumeric(PositionedByteRange, double, Order)
  
  BigDecimal.doubleValue()
- decodeNumericAsLong
  
  public static long decodeNumericAsLong(PositionedByteRange src)
  
  Decode a primitive long value from the Numeric encoding. Numeric encoding is based on BigDecimal; in the event the encoded value is larger than can be represented in a long, this method performs an implicit narrowing conversion as described in BigDecimal.longValue().
  Throws:
  
  NullPointerException - when the encoded value is NULL.
  
  IllegalArgumentException - when the encoded value is not a Numeric.
  
  See Also:
  
  encodeNumeric(PositionedByteRange, long, Order)
  
  BigDecimal.longValue()
- decodeNumericAsBigDecimal
  
  public static BigDecimal decodeNumericAsBigDecimal(PositionedByteRange src)
  
  Decode a BigDecimal value from the variable-length encoding.
  Throws:
  
  IllegalArgumentException - when the encoded value is not a Numeric.
  
  See Also:
  
  encodeNumeric(PositionedByteRange, BigDecimal, Order)
- encodeString
  
  public static int encodeString(PositionedByteRange dst, String val, Order ord)
  
  Encode a String value. String encoding is 0x00-terminated and so it does not support \u0000 codepoints in the value.
  
  Parameters:
  
  dst - The destination to which the encoded value is written.
  
  val - The value to encode.
  
  ord - The Order to respect while encoding val.
  
  Returns:
  
  the number of bytes written.
  
  Throws:
  
  IllegalArgumentException - when val contains a \u0000.
- decodeString
  
  public static String decodeString(PositionedByteRange src)
  
  Decode a String value.
- blobVarEncodedLength
  
  public static int blobVarEncodedLength(int len)
  
  Calculate the expected BlobVar encoded length based on unencoded length.
- blobVarDecodedLength
  
  @Private static int blobVarDecodedLength(int len)
  
  Calculate the expected BlobVar decoded length based on encoded length.
- encodeBlobVar
  
  public static int encodeBlobVar(PositionedByteRange dst, byte[] val, int voff, int vlen, Order ord)
  
  Encode a Blob value using a modified varint encoding scheme.
  This format encodes a byte[] value such that no limitations on the input value are imposed. The first byte encodes the encoding scheme that follows, BLOB_VAR. Each encoded byte thereafter consists of a header bit followed by 7 bits of payload. A header bit of '1' indicates continuation of the encoding. A header bit of '0' indicates this byte contains the last of the payload. An empty input value is encoded as the header byte immediately followed by a termination byte 0x00. This is not ambiguous with the encoded value of [0x00], which results in [0x80, 0x00].
  
  Returns:
  
  the number of bytes written.
- encodeBlobVar
  
  public static int encodeBlobVar(PositionedByteRange dst, byte[] val, Order ord)
  
  Encode a blob value using a modified varint encoding scheme.
  Returns:
  
  the number of bytes written.
  
  See Also:
  
  encodeBlobVar(PositionedByteRange, byte[], int, int, Order)
- decodeBlobVar
  
  public static byte[] decodeBlobVar(PositionedByteRange src)
  
  Decode a blob value that was encoded using BlobVar encoding.
- encodeBlobCopy
  
  public static int encodeBlobCopy(PositionedByteRange dst, byte[] val, int voff, int vlen, Order ord)
  
  Encode a Blob value as a byte-for-byte copy. BlobCopy encoding in DESCENDING order is NULL terminated so as to preserve proper sorting of [] and so it does not support 0x00 in the value.
  
  Returns:
  
  the number of bytes written.
  
  Throws:
  
  IllegalArgumentException - when ord is DESCENDING and val contains a 0x00 byte.
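
A rough sketch of the BlobCopy rule stated above: copy the bytes verbatim, and in descending order append a terminator and reject embedded 0x00 bytes. The class `BlobCopySketch` is hypothetical and the header value 0x36 comes from the summary table in this document. It also assumes that descending order is realized by inverting every encoded byte, which is an assumption of this sketch and is not stated in this section.

```java
import java.util.Arrays;

final class BlobCopySketch {
  // Header value taken from this document's summary table (assumption).
  static final byte BLOB_COPY = 0x36;
  static final byte TERM = 0x00;

  static byte[] encode(byte[] val, boolean descending) {
    if (descending) {
      for (byte b : val) {
        if (b == TERM) {
          throw new IllegalArgumentException("0x00 not allowed in DESCENDING BlobCopy");
        }
      }
    }
    byte[] out = new byte[val.length + (descending ? 2 : 1)];
    out[0] = BLOB_COPY;
    System.arraycopy(val, 0, out, 1, val.length);
    if (descending) {
      out[out.length - 1] = TERM;
      // Assumption: descending order inverts every byte, including header and terminator.
      for (int i = 0; i < out.length; i++) {
        out[i] = (byte) ~out[i];
      }
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] shorter = encode(new byte[] {1}, true);
    byte[] longer = encode(new byte[] {1, 2}, true);
    // In descending order the longer value is expected to sort first.
    System.out.println(Arrays.compareUnsigned(longer, shorter) < 0);
  }
}
```

Under that assumption the inverted terminator becomes 0xff, which is larger than any inverted payload byte, so a value sorts after every longer value it is a prefix of, as a descending order requires.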
- encodeBlobCopy
  
  public static int encodeBlobCopy(PositionedByteRange dst, byte[] val, Order ord)
  
  Encode a Blob value as a byte-for-byte copy. BlobCopy encoding in DESCENDING order is NULL terminated so as to preserve proper sorting of [] and so it does not support 0x00 in the value.
  Returns:
  
  the number of bytes written.
  
  Throws:
  
  IllegalArgumentException - when ord is DESCENDING and val contains a 0x00 byte.
  
  See Also:
  
  encodeBlobCopy(PositionedByteRange, byte[], int, int, Order)
- decodeBlobCopy
  
  public static byte[] decodeBlobCopy(PositionedByteRange src)
  
  Decode a Blob value, byte-for-byte copy.
  See Also:
  
  encodeBlobCopy(PositionedByteRange, byte[], int, int, Order)
- encodeNull
  
  public static int encodeNull(PositionedByteRange dst, Order ord)
  
  Encode a null value.
  
  Parameters:
  
  dst - The destination to which the encoded value is written.
  
  ord - The Order to respect while encoding the null value.
  
  Returns:
  
  the number of bytes written.
- encodeInt8
  
  public static int encodeInt8(PositionedByteRange dst, byte val, Order ord)
  
  Encode an int8 value using the fixed-length encoding.
  Returns:
  
  the number of bytes written.
  
  See Also:
  
  encodeInt64(PositionedByteRange, long, Order)
  
  decodeInt8(PositionedByteRange)
- decodeInt8
  
  public static byte decodeInt8(PositionedByteRange src)
  
  Decode an int8 value.
  See Also:
  
  encodeInt8(PositionedByteRange, byte, Order)
- encodeInt16
  
  public static int encodeInt16(PositionedByteRange dst, short val, Order ord)
  
  Encode an int16 value using the fixed-length encoding.
  Returns:
  
  the number of bytes written.
  
  See Also:
  
  encodeInt64(PositionedByteRange, long, Order)
  
  decodeInt16(PositionedByteRange)
- decodeInt16
  
  public static short decodeInt16(PositionedByteRange src)
  
  Decode an int16 value.
  See Also:
  
  encodeInt16(PositionedByteRange, short, Order)
- encodeInt32
  
  public static int encodeInt32(PositionedByteRange dst, int val, Order ord)
  
  Encode an int32 value using the fixed-length encoding.
  Returns:
  
  the number of bytes written.
  
  See Also:
  
  encodeInt64(PositionedByteRange, long, Order)
  
  decodeInt32(PositionedByteRange)
- decodeInt32
  
  public static int decodeInt32(PositionedByteRange src)
  
  Decode an int32 value.
  See Also:
  
  encodeInt32(PositionedByteRange, int, Order)
- encodeInt64
  
  public static int encodeInt64(PositionedByteRange dst, long val, Order ord)
  Encode an int64 value using the fixed-length encoding.
  This format ensures that all longs sort in their natural order, as they would sort when using signed long comparison.
  
  All Longs are serialized to an 8-byte, fixed-width sortable byte format. Serialization is performed by inverting the integer sign bit and writing the resulting bytes to the byte array in big endian order. The encoded value is prefixed by the FIXED_INT64 header byte. This encoding is designed to handle java language primitives and so Null values are NOT supported by this implementation.
  
  For example:
  
  Input: 0x0000000000000005 (5)
  Result: 0x288000000000000005
  
  Input: 0xfffffffffffffffb (-5)
  Result: 0x287ffffffffffffffb
  
  Input: 0x7fffffffffffffff (Long.MAX_VALUE)
  Result: 0x28ffffffffffffffff
  
  Input: 0x8000000000000000 (Long.MIN_VALUE)
  Result: 0x280000000000000000
  
  This encoding format, and much of this documentation string, is based on Orderly's FixedIntWritableRowKey.
  Returns:
  
  the number of bytes written.
  
  See Also:
  
  decodeInt64(PositionedByteRange)
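
The serialization rule stated above (invert the sign bit, write big endian, prefix a header byte) is small enough to sketch directly. This is an illustration for ascending order only, not the library's code: the class `FixedInt64Sketch` and its helpers are hypothetical, and the header value 0x28 is the one used in this document.

```java
import java.util.Arrays;

final class FixedInt64Sketch {
  // Header value as used in this document (assumption).
  static final byte FIXED_INT64 = 0x28;

  static byte[] encode(long val) {
    byte[] out = new byte[9];
    out[0] = FIXED_INT64;
    long flipped = val ^ Long.MIN_VALUE; // invert only the sign bit
    for (int i = 0; i < 8; i++) {
      out[1 + i] = (byte) (flipped >>> (56 - 8 * i)); // big endian
    }
    return out;
  }

  static long decode(byte[] enc) {
    long flipped = 0;
    for (int i = 1; i <= 8; i++) {
      flipped = (flipped << 8) | (enc[i] & 0xff);
    }
    return flipped ^ Long.MIN_VALUE; // undo the sign-bit inversion
  }

  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder("0x");
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(hex(encode(5L))); // prints 0x288000000000000005
    // Unsigned byte-wise comparison agrees with signed long comparison.
    System.out.println(Arrays.compareUnsigned(encode(-5L), encode(5L)) < 0);
  }
}
```

Flipping the sign bit maps the signed range onto the unsigned range in the same order, so a plain unsigned byte comparison of the encodings agrees with signed comparison of the inputs.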
- decodeInt64
  
  public static long decodeInt64(PositionedByteRange src)
  
  Decode an int64 value.
  See Also:
  
  encodeInt64(PositionedByteRange, long, Order)
- encodeFloat32
  
  public static int encodeFloat32(PositionedByteRange dst, float val, Order ord)
  
  Encode a 32-bit floating point value using the fixed-length encoding. Encoding format is described at length in encodeFloat64(PositionedByteRange, double, Order).
  Returns:
  
  the number of bytes written.
  
  See Also:
  
  decodeFloat32(PositionedByteRange)
  
  encodeFloat64(PositionedByteRange, double, Order)
- decodeFloat32
  
  public static float decodeFloat32(PositionedByteRange src)
  
  Decode a 32-bit floating point value using the fixed-length encoding.
  See Also:
  
  encodeFloat32(PositionedByteRange, float, Order)
- encodeFloat64
  
  public static int encodeFloat64(PositionedByteRange dst, double val, Order ord)
  Encode a 64-bit floating point value using the fixed-length encoding.
  This format ensures the following total ordering of floating point values: Double.NEGATIVE_INFINITY < -Double.MAX_VALUE < ... < -Double.MIN_VALUE < -0.0 < +0.0 < Double.MIN_VALUE < ... < Double.MAX_VALUE < Double.POSITIVE_INFINITY < Double.NaN
  
  Floating point numbers are encoded as specified in IEEE 754. A 64-bit double precision float consists of a sign bit, 11-bit unsigned exponent encoded in offset-1023 notation, and a 52-bit significand. The format is described further in the Double Precision Floating Point Wikipedia page.
  
  The value of a normal float is (-1)^{sign bit} × 2^{exponent - 1023} × 1.significand
  
  The IEEE 754 floating point format already preserves sort ordering for positive floating point numbers when the raw bytes are compared in most significant byte order. This is discussed further at http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
  
  Thus, we need only ensure that negative numbers sort in the exact opposite order as positive numbers (so that, say, negative infinity is less than negative 1), and that all negative numbers compare less than any positive number. To accomplish this, we invert the sign bit of all floating point numbers, and we also invert the exponent and significand bits if the floating point number was negative.
  
  More specifically, we first store the floating point bits into a 64-bit long l using Double.doubleToLongBits(double). This method collapses all NaNs into a single, canonical NaN value but otherwise leaves the bits unchanged. We then compute
  
  l ^= (l >> (Long.SIZE - 1)) | Long.MIN_VALUE
  
  which inverts the sign bit and XORs all other bits with the sign bit itself. Comparing the raw bytes of l in most significant byte order is equivalent to performing a double precision floating point comparison on the underlying bits (ignoring NaN comparisons, as NaNs don't compare equal to anything when performing floating point comparisons).
  
  The resulting long integer is then converted into a byte array by serializing the long one byte at a time in most significant byte order. The serialized integer is prefixed by a single header byte. All serialized values are 9 bytes in length.
  
  This encoding format, and much of this highly detailed documentation string, is based on Orderly's DoubleWritableRowKey.
  Returns:
  
  the number of bytes written.
  
  See Also:
  
  decodeFloat64(PositionedByteRange)
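
The bit manipulation described above can be checked with a short standalone sketch. It is illustrative rather than the library's code: the class `FixedFloat64Sketch` is hypothetical, only ascending order is handled, and the header value 0x31 is copied from the summary table in this document. The decode step is the inverse transform derived from the encode step.

```java
import java.util.Arrays;

final class FixedFloat64Sketch {
  // Header value taken from this document's summary table (assumption).
  static final byte FIXED_FLOAT64 = 0x31;

  static byte[] encode(double val) {
    long l = Double.doubleToLongBits(val); // collapses all NaNs to one canonical NaN
    // Flip the sign bit; if the value was negative, flip every other bit as well.
    l ^= (l >> (Long.SIZE - 1)) | Long.MIN_VALUE;
    byte[] out = new byte[9];
    out[0] = FIXED_FLOAT64;
    for (int i = 0; i < 8; i++) {
      out[1 + i] = (byte) (l >>> (56 - 8 * i)); // most significant byte first
    }
    return out;
  }

  static double decode(byte[] enc) {
    long l = 0;
    for (int i = 1; i <= 8; i++) {
      l = (l << 8) | (enc[i] & 0xff);
    }
    // Inverse: an encoded top bit of 0 marks an originally negative value.
    l ^= (~l >> (Long.SIZE - 1)) | Long.MIN_VALUE;
    return Double.longBitsToDouble(l);
  }

  public static void main(String[] args) {
    // -0.0 is expected to sort before +0.0, and NaN after positive infinity.
    System.out.println(Arrays.compareUnsigned(encode(-0.0), encode(0.0)) < 0);
    System.out.println(Arrays.compareUnsigned(encode(Double.POSITIVE_INFINITY), encode(Double.NaN)) < 0);
  }
}
```

The arithmetic right shift smears the sign bit across the whole word, so a single XOR handles both the positive case (flip the sign bit only) and the negative case (flip everything).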
- decodeFloat64
  
  public static double decodeFloat64(PositionedByteRange src)
  
  Decode a 64-bit floating point value using the fixed-length encoding.
  See Also:
  
  encodeFloat64(PositionedByteRange, double, Order)
- isEncodedValue
  
  public static boolean isEncodedValue(PositionedByteRange src)
  
  Returns true when src appears to be positioned at an encoded value, false otherwise.
- isNull
  
  public static boolean isNull(PositionedByteRange src)
  
  Return true when the next encoded value in src is null, false otherwise.
- isNumeric
  
  public static boolean isNumeric(PositionedByteRange src)
  
  Return true when the next encoded value in src uses Numeric encoding, false otherwise. NaN, +/-Inf are valid Numeric values.
- isNumericInfinite
  
  public static boolean isNumericInfinite(PositionedByteRange src)
  
  Return true when the next encoded value in src uses Numeric encoding and is Infinite, false otherwise.
- isNumericNaN
  
  public static boolean isNumericNaN(PositionedByteRange src)
  
  Return true when the next encoded value in src uses Numeric encoding and is NaN, false otherwise.
- isNumericZero
  
  public static boolean isNumericZero(PositionedByteRange src)
  
  Return true when the next encoded value in src uses Numeric encoding and is 0, false otherwise.
- isFixedInt8
  
  public static boolean isFixedInt8(PositionedByteRange src)
  
  Return true when the next encoded value in src uses fixed-width Int8 encoding, false otherwise.
- isFixedInt16
  
  public static boolean isFixedInt16(PositionedByteRange src)
  
  Return true when the next encoded value in src uses fixed-width Int16 encoding, false otherwise.
- isFixedInt32
  
  public static boolean isFixedInt32(PositionedByteRange src)
  
  Return true when the next encoded value in src uses fixed-width Int32 encoding, false otherwise.
- isFixedInt64
  
  public static boolean isFixedInt64(PositionedByteRange src)
  
  Return true when the next encoded value in src uses fixed-width Int64 encoding, false otherwise.
- isFixedFloat32
  
  public static boolean isFixedFloat32(PositionedByteRange src)
  
  Return true when the next encoded value in src uses fixed-width Float32 encoding, false otherwise.
- isFixedFloat64
  
  public static boolean isFixedFloat64(PositionedByteRange src)
  
  Return true when the next encoded value in src uses fixed-width Float64 encoding, false otherwise.
- isText
  
  public static boolean isText(PositionedByteRange src)
  
  Return true when the next encoded value in src uses Text encoding, false otherwise.
- isBlobVar
  
  public static boolean isBlobVar(PositionedByteRange src)
  
  Return true when the next encoded value in src uses BlobVar encoding, false otherwise.
- isBlobCopy
  
  public static boolean isBlobCopy(PositionedByteRange src)
  
  Return true when the next encoded value in src uses BlobCopy encoding, false otherwise.
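
The predicates above all inspect the header byte of the next encoded value. A minimal sketch of that idea follows, assuming ascending order only. The class `HeaderSketch`, the `Kind` enum and `classify` are hypothetical names, and the header values are copied from the summary table in this document; they have not been checked against the library's constants.

```java
final class HeaderSketch {
  enum Kind {
    NULL, NUMERIC, FIXED_INT8, FIXED_INT16, FIXED_INT32, FIXED_INT64,
    FIXED_FLOAT32, FIXED_FLOAT64, TEXT, BLOB_VAR, BLOB_COPY, UNKNOWN
  }

  /** Classify an ascending-order header byte using the table in this document. */
  static Kind classify(byte header) {
    int h = header & 0xff;
    if (h == 0x05) {
      return Kind.NULL;
    }
    // Numeric encodings span negative infinity (0x07) to positive infinity (0x23), plus NaN (0x25).
    if ((h >= 0x07 && h <= 0x23) || h == 0x25) {
      return Kind.NUMERIC;
    }
    switch (h) {
      case 0x27: return Kind.FIXED_INT32;
      case 0x28: return Kind.FIXED_INT64;
      case 0x29: return Kind.FIXED_INT8;
      case 0x2a: return Kind.FIXED_INT16;
      case 0x30: return Kind.FIXED_FLOAT32;
      case 0x31: return Kind.FIXED_FLOAT64;
      case 0x33: return Kind.TEXT;
      case 0x35: return Kind.BLOB_VAR;
      case 0x36: return Kind.BLOB_COPY;
      default: return Kind.UNKNOWN;
    }
  }

  public static void main(String[] args) {
    System.out.println(classify((byte) 0x15)); // zero is a Numeric value
  }
}
```

A real reader would also have to recognize values written in descending order, which this sketch leaves out.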
- skip
  
  public static int skip(PositionedByteRange src)
  
  Skip src's position forward over one encoded value.
  
  Returns:
  
  number of bytes skipped.
- length
  
  public static int length(PositionedByteRange buff)
  
  Return the number of encoded entries remaining in buff. The state of buff is not modified through use of this method.
