Class HFileBlock

java.lang.Object
org.apache.hadoop.hbase.io.hfile.HFileBlock
All Implemented Interfaces:
HeapSize, Cacheable, HBaseReferenceCounted, org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
Direct Known Subclasses:
ExclusiveMemHFileBlock, SharedMemHFileBlock

@Private public class HFileBlock extends Object implements Cacheable
Cacheable Blocks of an HFile version 2 file. Version 2 was introduced in hbase-0.92.0.

Version 1 was the original file block. Version 2 was introduced when we changed the hbase file format to support multi-level block indexes and compound bloom filters (HBASE-3857). Support for Version 1 was removed in hbase-1.3.0.

HFileBlock: Version 2

In version 2, a block is structured as follows:
  • Header: See Writer#putHeader() for where header is written; header total size is HFILEBLOCK_HEADER_SIZE
    • 0. blockType: Magic record identifying the BlockType (8 bytes): e.g. DATABLK*
    • 1. onDiskSizeWithoutHeader: Compressed -- a.k.a. 'on disk' -- block size, excluding header, but including trailing checksum bytes (4 bytes)
    • 2. uncompressedSizeWithoutHeader: Uncompressed block size, excluding header, and excluding checksum bytes (4 bytes)
    • 3. prevBlockOffset: The offset of the previous block of the same type (8 bytes). This is used to navigate to the previous block without having to go to the block index
    • 4. For minorVersions >=1, the ordinal describing checksum type (1 byte)
    • 5. For minorVersions >=1, the number of data bytes/checksum chunk (4 bytes)
    • 6. onDiskDataSizeWithHeader: For minorVersions >=1, the size of data 'on disk', including header, excluding checksums (4 bytes)
  • Raw/Compressed/Encrypted/Encoded data: The compression algorithm is the same for all the blocks in an HFile. If compression is NONE, this is just raw, serialized Cells.
  • Tail: For minorVersions >=1, a series of 4 byte checksums, one each for the number of bytes specified by bytesPerChecksum.
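
As an illustration of the header layout listed above, here is a minimal sketch that decodes the seven fields from a plain java.nio.ByteBuffer. BlockHeaderSketch and its members are hypothetical names, not HBase classes; the sketch assumes big-endian encoding and a minor version >= 1 header, and it derives the header size by summing the field sizes in the list.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not an HBase class): decodes the version 2 header fields. */
final class BlockHeaderSketch {
  /** 8 + 4 + 4 + 8 + 1 + 4 + 4 bytes, summing the field sizes from the list. */
  static final int HEADER_SIZE_WITH_CHECKSUM = 8 + 4 + 4 + 8 + 1 + 4 + 4;

  final String magic;                      // field 0
  final int onDiskSizeWithoutHeader;       // field 1
  final int uncompressedSizeWithoutHeader; // field 2
  final long prevBlockOffset;              // field 3
  final byte checksumType;                 // field 4 (ordinal)
  final int bytesPerChecksum;              // field 5
  final int onDiskDataSizeWithHeader;      // field 6

  private BlockHeaderSketch(ByteBuffer in) {
    byte[] magicBytes = new byte[8];
    in.get(magicBytes);
    magic = new String(magicBytes, StandardCharsets.US_ASCII);
    onDiskSizeWithoutHeader = in.getInt();
    uncompressedSizeWithoutHeader = in.getInt();
    prevBlockOffset = in.getLong();
    checksumType = in.get();
    bytesPerChecksum = in.getInt();
    onDiskDataSizeWithHeader = in.getInt();
  }

  /** Parses from the buffer's current position without moving it. */
  static BlockHeaderSketch parse(ByteBuffer buf) {
    if (buf.remaining() < HEADER_SIZE_WITH_CHECKSUM) {
      throw new IllegalArgumentException("buffer too short for header");
    }
    // duplicate() shares content but has its own position, limit, and
    // big-endian byte order, so the caller's buffer state is untouched.
    return new BlockHeaderSketch(buf.duplicate());
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(HEADER_SIZE_WITH_CHECKSUM);
    buf.put("DATABLK*".getBytes(StandardCharsets.US_ASCII));
    // 96 data bytes + one 4-byte checksum = 100; header 33 + data 96 = 129.
    buf.putInt(100).putInt(250).putLong(-1L).put((byte) 1).putInt(16384).putInt(129);
    buf.flip();
    BlockHeaderSketch h = parse(buf);
    System.out.println(h.magic + " onDisk=" + h.onDiskSizeWithoutHeader
        + " uncompressed=" + h.uncompressedSizeWithoutHeader
        + " prev=" + h.prevBlockOffset
        + " onDiskDataWithHeader=" + h.onDiskDataSizeWithHeader);
  }
}
```

The real class reads these values through its own ByteBuff abstraction; the sketch only mirrors the field order and widths.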

Caching

Caches store whole blocks, with trailing checksums if any. We then tack on some metadata, the content of BLOCK_METADATA_SPACE: a flag that is set if we are doing 'hbase' checksums, followed by the offset into the file, which is needed to re-make the cache key when we return the block to the cache as 'done'. See Cacheable.serialize(ByteBuffer, boolean) and Cacheable.getDeserializer().

TODO: Should we cache the checksums? Down in Writer#getBlockForCaching(CacheConfig), where we make a block to cache-on-write, there is an attempt at turning off checksums. This is not the only place we get blocks to cache. We also cache the raw return from an hdfs read. In this case, the checksums may be present. If the cache is backed by something that doesn't do ECC, say an SSD, we might want to preserve checksums. For now this is an open question.

TODO: Over in BucketCache, we save a block allocation by doing a custom serialization. Be sure to change it if serialization changes in here. Could we add a method here that takes an IOEngine and that then serializes to it rather than expose our internals over in BucketCache? IOEngine is in the bucket subpackage. Pull it up? Then this class knows about bucketcache. Ugh.

  • Field Details

    • LOG

      private static final org.slf4j.Logger LOG
    • FIXED_OVERHEAD

      public static final long FIXED_OVERHEAD
    • blockType

      Type of block. Header field 0.
    • onDiskSizeWithoutHeader

      Size on disk excluding header, including checksum. Header field 1.
    • uncompressedSizeWithoutHeader

      Size of pure data. Does not include header or checksums. Header field 2.
    • prevBlockOffset

      private long prevBlockOffset
      The offset of the previous block on disk. Header field 3.
    • onDiskDataSizeWithHeader

      private final int onDiskDataSizeWithHeader
      Size on disk of header + data. Excludes checksum. Header field 6, OR calculated from onDiskSizeWithoutHeader when using HDFS checksum.
    • bufWithoutChecksum

      The in-memory representation of the hfile block. Can be on or offheap. Can be backed by a single ByteBuffer or by many. Make no assumptions.

      Be careful reading from this buf. Duplicate it and work on the duplicate; if you do not, be sure to reset position and limit afterwards, or there will be trouble down the road.

      TODO: Make this read-only once made.

      We are using the ByteBuff type. ByteBuffer is not extensible, yet we need a ByteBuffer-like API across multiple ByteBuffers when reading from a cache such as BucketCache. So, we have this ByteBuff type. Unfortunately, it is spread all about HFileBlock. It would be good if it could be confined to cache use only, but that is hard to do.

      NOTE: this ByteBuff includes the HFileBlock header and data, but excludes the checksum.
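
      The advice about duplicating before reading can be sketched with a plain java.nio.ByteBuffer standing in for the ByteBuff type. DuplicateReadSketch and sumRange are hypothetical names used only for illustration; the point is that a duplicate shares content but carries its own position and limit.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration: read through a duplicate so shared state is not disturbed. */
final class DuplicateReadSketch {
  /** Sums the bytes in [offset, offset + length) without touching the shared buffer's position or limit. */
  static int sumRange(ByteBuffer shared, int offset, int length) {
    ByteBuffer dup = shared.duplicate(); // same content, independent position and limit
    dup.limit(offset + length);          // set limit first so position(offset) stays in range
    dup.position(offset);
    int sum = 0;
    while (dup.hasRemaining()) {
      sum += dup.get();
    }
    return sum;
  }

  public static void main(String[] args) {
    ByteBuffer shared = ByteBuffer.allocate(10);
    for (int i = 0; i < 10; i++) {
      shared.put((byte) i);
    }
    shared.flip();
    shared.position(3);
    int sum = sumRange(shared, 2, 3);
    System.out.println("sum=" + sum + " position=" + shared.position() + " limit=" + shared.limit());
  }
}
```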

    • fileContext

      private final HFileContext fileContext
      Meta data that holds meta information on the hfileblock.
    • offset

      private long offset
      The offset of this block in the file. Populated by the reader for convenience of access. This offset is not part of the block header.
    • nextBlockOnDiskSize

      private int nextBlockOnDiskSize
      The on-disk size of the next block, including the header and checksums if present. UNSET if unknown. Blocks try to carry the size of the next block to read in this data member. Usually we get block sizes from the hfile index but sometimes the index is not available: e.g. when we read the indexes themselves (indexes are stored in blocks, we do not have an index for the indexes). Saves seeks especially around file open when there is a flurry of reading in hfile metadata.
    • allocator

    • CHECKSUM_VERIFICATION_NUM_IO_THRESHOLD

      On a checksum failure, do this many succeeding read requests using hdfs checksums before auto-reenabling hbase checksum verification.
    • UNSET

      private static int UNSET
    • FILL_HEADER

      public static final boolean FILL_HEADER
    • DONT_FILL_HEADER

      public static final boolean DONT_FILL_HEADER
    • MULTI_BYTE_BUFFER_HEAP_SIZE

      public static final int MULTI_BYTE_BUFFER_HEAP_SIZE
    • BLOCK_METADATA_SPACE

      public static final int BLOCK_METADATA_SPACE
      Space for metadata on a block that gets stored along with the block when we cache it. There are a few bytes stuck on the end of the HFileBlock that we pull in from HDFS. 8 bytes are for the offset of this block (long) in the file. Offset is important because it is used when we remake the CacheKey when we return the block to the cache when done. There is also a flag on whether checksumming is being done by hbase or not. See class comment for note on uncertain state of checksumming of blocks that come out of cache (should we or should we not?). Finally there are 4 bytes to hold the length of the next block, which can save a seek on occasion if available. (This EXTRA info came in with the original commit of the bucketcache, HBASE-7404. It was formerly known as EXTRA_SERIALIZATION_SPACE).
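
      A minimal sketch of how such a metadata tail could be written and read back, using java.nio.ByteBuffer. BlockMetadataSketch is a hypothetical class, not the HBase implementation. The description gives 8 bytes for the offset and 4 bytes for the next block length; the 1-byte width of the checksum flag and the field order (flag, offset, next block size) are assumptions made for this example.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of a metadata tail appended to a cached block. */
final class BlockMetadataSketch {
  /** Assumed layout: 1-byte checksum flag + 8-byte file offset + 4-byte next block size. */
  static final int METADATA_SPACE = 1 + Long.BYTES + Integer.BYTES;

  /** Appends the metadata at dest's current position; does not flip or reset. */
  static ByteBuffer append(ByteBuffer dest, boolean usesHBaseChecksum, long offset,
      int nextBlockOnDiskSize) {
    dest.put((byte) (usesHBaseChecksum ? 1 : 0));
    dest.putLong(offset);
    dest.putInt(nextBlockOnDiskSize);
    return dest;
  }

  /** Index of the first metadata byte, assuming the tail ends at buf.limit(). */
  static int tailStart(ByteBuffer buf) {
    return buf.limit() - METADATA_SPACE;
  }

  // Absolute gets leave the buffer's position untouched.
  static boolean usesHBaseChecksum(ByteBuffer buf) {
    return buf.get(tailStart(buf)) == 1;
  }

  static long offset(ByteBuffer buf) {
    return buf.getLong(tailStart(buf) + 1);
  }

  static int nextBlockOnDiskSize(ByteBuffer buf) {
    return buf.getInt(tailStart(buf) + 1 + Long.BYTES);
  }

  public static void main(String[] args) {
    int blockBytes = 20; // stand-in for header + data (+ checksums)
    ByteBuffer cached = ByteBuffer.allocate(blockBytes + METADATA_SPACE);
    cached.position(blockBytes); // pretend the block bytes are already written
    append(cached, true, 4096L, 65569);
    cached.flip();
    System.out.println("flag=" + usesHBaseChecksum(cached) + " offset=" + offset(cached)
        + " next=" + nextBlockOnDiskSize(cached));
  }
}
```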
    • CHECKSUM_SIZE

      static final int CHECKSUM_SIZE
      Each checksum value is an integer that can be stored in 4 bytes.
    • DUMMY_HEADER_NO_CHECKSUM

      static final byte[] DUMMY_HEADER_NO_CHECKSUM
    • BLOCK_DESERIALIZER

      Used when deserializing blocks from Cache.

        ++++++++++++++
        + HFileBlock +
        ++++++++++++++
        + Checksums  + <= Optional
        ++++++++++++++
        + Metadata!  + <= See note on BLOCK_METADATA_SPACE above.
        ++++++++++++++
    • DESERIALIZER_IDENTIFIER

      private static final int DESERIALIZER_IDENTIFIER
    • totalChecksumBytes

      private final int totalChecksumBytes
  • Constructor Details

    • HFileBlock

      public HFileBlock(BlockType blockType, int onDiskSizeWithoutHeader, int uncompressedSizeWithoutHeader, long prevBlockOffset, ByteBuff buf, boolean fillHeader, long offset, int nextBlockOnDiskSize, int onDiskDataSizeWithHeader, HFileContext fileContext, ByteBuffAllocator allocator)
      Creates a new HFile block from the given fields. This constructor is used only while writing blocks and caching, when the block is sitting in a byte buffer and we want to stuff it into the cache. See HFileBlock.Writer.getBlockForCaching(CacheConfig).

      TODO: The caller presumes no checksumming is required of this block instance since it is going into cache; the checksum was already verified on the underlying block data pulled in from the filesystem. Is that correct? What if the cache is SSD?

      TODO: Can the HFile block writer also go off-heap?
      Parameters:
      blockType - the type of this block, see BlockType
      onDiskSizeWithoutHeader - see onDiskSizeWithoutHeader
      uncompressedSizeWithoutHeader - see uncompressedSizeWithoutHeader
      prevBlockOffset - see prevBlockOffset
      buf - block buffer with header (HConstants.HFILEBLOCK_HEADER_SIZE bytes)
      fillHeader - when true, write the first 4 header fields into passed buffer.
      offset - the file offset the block was read from
      onDiskDataSizeWithHeader - see onDiskDataSizeWithHeader
      fileContext - HFile meta data
  • Method Details

    • createFromBuff

      static HFileBlock createFromBuff(ByteBuff buf, boolean usesHBaseChecksum, long offset, int nextBlockOnDiskSize, HFileContext fileContext, ByteBuffAllocator allocator) throws IOException
      Creates a block from an existing buffer starting with a header. Rewinds and takes ownership of the buffer. By definition of rewind, ignores the buffer position, but if you slice the buffer beforehand, it will rewind to that point.
      Parameters:
      buf - Has header, content, and trailing checksums if present.
      Throws:
      IOException
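
      The remark about rewinding and slicing can be illustrated with java.nio.ByteBuffer, which is used here as a stand-in for ByteBuff. SliceRewindSketch is a hypothetical name. Rewinding the whole buffer goes back to index 0, while rewinding a slice goes back to the point where the slice was taken.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of rewind on a whole buffer versus a slice. */
final class SliceRewindSketch {
  /** Rewinds the buffer, ignoring its current position, and returns the first byte. */
  static byte firstByteAfterRewind(ByteBuffer buf) {
    buf.rewind();
    return buf.get();
  }

  public static void main(String[] args) {
    ByteBuffer whole = ByteBuffer.wrap(new byte[] {10, 11, 12, 13, 14, 15});

    whole.position(4);
    System.out.println("whole buffer: " + firstByteAfterRewind(whole));

    whole.position(2);
    ByteBuffer slice = whole.slice(); // index 0 of the slice is index 2 of the whole buffer
    slice.position(3);
    System.out.println("slice: " + firstByteAfterRewind(slice));
  }
}
```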
    • getOnDiskSizeWithHeader

      private static int getOnDiskSizeWithHeader(ByteBuff headerBuf, boolean checksumSupport)
      Parse total on disk size including header and checksum.
      Parameters:
      headerBuf - Header ByteBuffer. Presumed exact size of header.
      checksumSupport - true if checksum verification is in use.
      Returns:
      Size of the block with header included.
    • getNextBlockOnDiskSize

      Returns:
      the on-disk size of the next block (including the header size and any checksums if present) read by peeking into the next block's header; use as a hint when doing a read of the next block when scanning or running over a file.
    • getBlockType

      Description copied from interface: Cacheable
      Returns the block type of this cached HFile block
      Specified by:
      getBlockType in interface Cacheable
    • refCnt

      public int refCnt()
      Description copied from interface: Cacheable
      Reference count of this Cacheable.
      Specified by:
      refCnt in interface Cacheable
      Specified by:
      refCnt in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
    • retain

      public HFileBlock retain()
      Description copied from interface: Cacheable
      Increases the reference count. The object's memory can be freed only once no references remain.
      Specified by:
      retain in interface Cacheable
      Specified by:
      retain in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
    • release

      public boolean release()
      Calls ByteBuff.release() to decrease the reference count. If no other reference remains, the ByteBuffer is returned to the ByteBuffAllocator.
      Specified by:
      release in interface Cacheable
      Specified by:
      release in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
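
      The retain/release contract described above can be sketched with a toy reference-counted object. RefCountedSketch and its recycler callback are hypothetical and much simpler than the Netty-based counting HFileBlock actually delegates to; the sketch only shows that the backing memory is handed back when the last reference is released.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical toy reference counter; not the HBase or Netty implementation. */
final class RefCountedSketch {
  private final AtomicInteger refCnt = new AtomicInteger(1); // the creator holds one reference
  private final Runnable recycler; // stands in for returning memory to an allocator

  RefCountedSketch(Runnable recycler) {
    this.recycler = recycler;
  }

  int refCnt() {
    return refCnt.get();
  }

  /** Adds a reference; fails if the object has already been fully released. */
  RefCountedSketch retain() {
    while (true) {
      int current = refCnt.get();
      if (current <= 0) {
        throw new IllegalStateException("already released");
      }
      if (refCnt.compareAndSet(current, current + 1)) {
        return this;
      }
    }
  }

  /** Drops a reference; returns true only when this call released the last one. */
  boolean release() {
    while (true) {
      int current = refCnt.get();
      if (current <= 0) {
        throw new IllegalStateException("already released");
      }
      if (refCnt.compareAndSet(current, current - 1)) {
        if (current == 1) {
          recycler.run();
          return true;
        }
        return false;
      }
    }
  }

  public static void main(String[] args) {
    RefCountedSketch block = new RefCountedSketch(() -> System.out.println("memory recycled"));
    block.retain();                      // a second holder, e.g. a reader
    System.out.println(block.release()); // one holder remains
    System.out.println(block.release()); // last holder gone, recycler runs
  }
}
```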
    • touch

      public HFileBlock touch()
      Calling this method in strategic locations where HFileBlocks are referenced may help diagnose potential buffer leaks. We pass the block itself as a default hint, but one can use touch(Object) to pass their own hint as well.
      Specified by:
      touch in interface HBaseReferenceCounted
      Specified by:
      touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
    • touch

      public HFileBlock touch(Object hint)
      Specified by:
      touch in interface HBaseReferenceCounted
      Specified by:
      touch in interface org.apache.hbase.thirdparty.io.netty.util.ReferenceCounted
    • getDataBlockEncodingId

      Returns the data block encoding id that was used to encode this block
    • getOnDiskSizeWithHeader

      Returns the on-disk size of header + data part + checksum.
    • getOnDiskSizeWithoutHeader

      Returns the on-disk size of the data part + checksum (header excluded).
    • getUncompressedSizeWithoutHeader

      Returns the uncompressed size of data part (header and checksum excluded).
    • getPrevBlockOffset

      Returns the offset of the previous block of the same type in the file, or -1 if unknown
    • overwriteHeader

      private void overwriteHeader()
      Rewinds buf and writes first 4 header fields. buf position is modified as side-effect.
    • getBufferWithoutHeader

      Returns a buffer that does not include the header and checksum.
      Returns:
      the buffer with header skipped and checksum omitted.
    • getBufferReadOnly

      Returns a read-only duplicate of the buffer this block stores internally, ready to be read. Clients must not modify the buffer object, though they may set position and limit on the returned buffer since we pass back a duplicate. This method has to be public because it is used in CompoundBloomFilter to avoid object creation on every Bloom filter lookup, but it has to be used with caution. The buffer holds the header and block content; checksums are not included.
      Returns:
      the buffer of this block for read-only operations; the buffer includes the header, but not the checksum.
    • getByteBuffAllocator

    • sanityCheckAssertion

      private void sanityCheckAssertion(long valueFromBuf, long valueFromField, String fieldName) throws IOException
      Throws:
      IOException
    • sanityCheckAssertion

      private void sanityCheckAssertion(BlockType valueFromBuf, BlockType valueFromField) throws IOException
      Throws:
      IOException
    • sanityCheck

      void sanityCheck() throws IOException
      Checks if the block is internally consistent, i.e. the first HConstants.HFILEBLOCK_HEADER_SIZE bytes of the buffer contain a valid header consistent with the fields. Assumes a packed block structure. This function is primarily for testing and debugging, and is not thread-safe, because it alters the internal buffer pointer. Used by tests only.
      Throws:
      IOException
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • unpack

      Retrieves the decompressed/decrypted view of this block. An encoded block remains in its encoded structure. Internal structures are shared between instances where applicable.
      Throws:
      IOException
    • allocateBufferForUnpacking

      Always allocates a new buffer of the correct size. Copies header bytes from the existing buffer. Does not change header fields. Reserve room to keep checksum bytes too.
    • isUnpacked

      public boolean isUnpacked()
      Returns true when this block's buffer has been unpacked, false otherwise. Note this is a calculated heuristic, not a tracked attribute of the block.
    • getOffset

      long getOffset()
      Cannot be UNSET. Must be a legitimate value. Used when re-making the BlockCacheKey when the block is returned to the cache.
      Returns:
      the offset of this block in the file it was read from
    • getByteStream

      Returns a byte stream reading the data (excluding header and checksum) of this block
    • heapSize

      public long heapSize()
      Description copied from interface: HeapSize
      Return the approximate 'exclusive deep size' of implementing object. Includes count of payload and hosting object sizings.
      Specified by:
      heapSize in interface HeapSize
    • isSharedMem

      public boolean isSharedMem()
      Overridden by SharedMemHFileBlock or ExclusiveMemHFileBlock. Returns true by default.
    • sanityCheckUncompressed

      An additional sanity-check in case no compression or encryption is being used.
      Throws:
      IOException
    • getSerializedLength

      public int getSerializedLength()
      Description copied from interface: Cacheable
      Returns the length of the ByteBuffer required to serialize the object. If the object cannot be serialized, it should return 0.
      Specified by:
      getSerializedLength in interface Cacheable
      Returns:
      int length in bytes of the serialized form or 0 if the object cannot be cached.
    • serialize

      public void serialize(ByteBuffer destination, boolean includeNextBlockMetadata)
      Description copied from interface: Cacheable
      Serializes its data into destination.
      Specified by:
      serialize in interface Cacheable
      Parameters:
      destination - Where to serialize to
      includeNextBlockMetadata - Whether to include nextBlockMetadata in the Cache block.
    • getMetaData

      For use by bucketcache. This exposes internals.
    • addMetaData

      private ByteBuffer addMetaData(ByteBuffer destination, boolean includeNextBlockMetadata)
      Adds metadata at current position (position is moved forward). Does not flip or reset.
      Returns:
      The passed destination with metadata added.
    • getDeserializer

      Description copied from interface: Cacheable
      Returns CacheableDeserializer instance which reconstructs original object from ByteBuffer.
      Specified by:
      getDeserializer in interface Cacheable
      Returns:
      CacheableDeserializer instance.
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • equals

      public boolean equals(Object comparison)
      Overrides:
      equals in class Object
    • getDataBlockEncoding

    • getChecksumType

    • getBytesPerChecksum

    • getOnDiskDataSizeWithHeader

      Returns the size of data on disk + header. Excludes checksum.
    • totalChecksumBytes

      Return the number of bytes required to store all the checksums for this block. Each checksum value is a 4 byte integer.
      NOTE: ByteBuff returned by getBufferWithoutHeader() and getBufferReadOnly() or DataInputStream returned by getByteStream() does not include checksum.
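
      The arithmetic behind the total can be sketched as follows. ChecksumSizeSketch is a hypothetical class. It assumes that the checksummed span is onDiskDataSizeWithHeader bytes (header plus data) and that a trailing partial chunk still gets its own 4-byte checksum; neither detail is stated in this description.

```java
/** Hypothetical illustration of the checksum size arithmetic. */
final class ChecksumSizeSketch {
  /** Each checksum value is a 4-byte integer. */
  static final int CHECKSUM_SIZE = 4;

  /** Number of chunks needed to cover dataSize bytes, rounding up for a partial last chunk. */
  static long numChunks(long dataSize, int bytesPerChecksum) {
    if (dataSize < 0 || bytesPerChecksum <= 0) {
      throw new IllegalArgumentException("dataSize must be >= 0 and bytesPerChecksum > 0");
    }
    return (dataSize + bytesPerChecksum - 1) / bytesPerChecksum;
  }

  /** Bytes needed to store one checksum per chunk. */
  static long totalChecksumBytes(long onDiskDataSizeWithHeader, int bytesPerChecksum) {
    return numChunks(onDiskDataSizeWithHeader, bytesPerChecksum) * CHECKSUM_SIZE;
  }

  public static void main(String[] args) {
    // 65569 bytes in 16384-byte chunks: four full chunks plus a 33-byte remainder.
    System.out.println(totalChecksumBytes(65569, 16384));
  }
}
```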
    • computeTotalChecksumBytes

    • headerSize

      public int headerSize()
      Returns the size of this block header.
    • headerSize

      public static int headerSize(boolean usesHBaseChecksum)
      Maps a minor version to the size of the header.
    • getDummyHeaderForVersion

      Return the appropriate DUMMY_HEADER for the minor version
    • getDummyHeaderForVersion

      private static byte[] getDummyHeaderForVersion(boolean usesHBaseChecksum)
      Return the appropriate DUMMY_HEADER for the minor version
    • getHFileContext

      Returns:
      This HFileBlock's fileContext, which will be a derivative of the fileContext for the file from which this block's data was originally read.
    • toStringHeader

      Convert the contents of the block header into a human readable string. This is mostly helpful for debugging. This assumes that the block has minor version > 0.
      Throws:
      IOException
    • createBuilder

      private static HFileBlockBuilder createBuilder(HFileBlock blk, ByteBuff newBuff)
      Creates a new HFileBlockBuilder from the existing block and a new ByteBuff. The builder will be loaded with all of the original fields from blk, except now using the newBuff and setting isSharedMem based on the source of the passed in newBuff. An existing HFileBlock may have been an ExclusiveMemHFileBlock, but the new buffer might call for a SharedMemHFileBlock. Or vice versa.
      Parameters:
      blk - the block to clone from
      newBuff - the new buffer to use
    • shallowClone

      private static HFileBlock shallowClone(HFileBlock blk, ByteBuff newBuf)
    • deepCloneOnHeap