org.apache.hadoop.hbase.io.hfile.HFileBlockIndex.BlockIndexWriter

All Implemented Interfaces: InlineBlockWriter

Enclosing class: HFileBlockIndex

public static class HFileBlockIndex.BlockIndexWriter extends Object implements InlineBlockWriter

Writes the block index into the output stream, generating the tree from the bottom up. The leaf level is written to disk as a sequence of inline blocks if it is larger than a certain number of bytes. If the leaf level is not large enough, all entries are written to the root level instead. After all leaf blocks have been written, we end up with an index referencing the resulting leaf index blocks. If that index is larger than the allowed root index size, the writer breaks it up into reasonably sized intermediate-level index block chunks, writes those chunks out, and creates another index referencing those chunks. This is repeated until the remaining index is small enough to become the root index. In most practical cases, however, there will be only leaf-level blocks and the root index, or just the root index.
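
The bottom-up construction described above can be illustrated with a small stand-alone sketch. It is not the HBase implementation: the class and method names (IndexTreeSketch, parentLevel, countLevels) are hypothetical, and a per-chunk entry limit stands in for the byte-size threshold the real writer uses.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of bottom-up index construction; not the HBase class. */
final class IndexTreeSketch {

  /**
   * Splits one level into chunks of at most maxEntries entries and returns the
   * parent level, which holds one entry (the chunk's first key) per chunk.
   */
  static List<String> parentLevel(List<String> level, int maxEntries) {
    List<String> parent = new ArrayList<>();
    for (int i = 0; i < level.size(); i += maxEntries) {
      parent.add(level.get(i));
    }
    return parent;
  }

  /** Returns the number of index levels needed, counting the root level. */
  static int countLevels(List<String> dataBlockFirstKeys, int maxEntries) {
    int levels = 1; // the root level always exists, even if it is empty
    List<String> current = dataBlockFirstKeys;
    // Keep adding a parent level until the remaining index fits into one chunk.
    while (current.size() > maxEntries) {
      current = parentLevel(current, maxEntries);
      levels++;
    }
    return levels;
  }

  public static void main(String[] args) {
    List<String> keys = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      keys.add(String.format("row-%04d", i));
    }
    System.out.println(countLevels(keys, 2000)); // root only
    System.out.println(countLevels(keys, 100));  // leaf level plus root
    System.out.println(countLevels(keys, 10));   // one intermediate level in between
  }
}
```

With 1000 data blocks, the three calls in main yield one, two, and three levels respectively, which mirrors the cases named in the description: just the root index, leaf-level blocks plus the root index, and an additional intermediate level.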

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| private HFileBlock.Writer | blockWriter | |
| private CacheConfig | cacheConf | CacheConfig, or null if cache-on-write is disabled |
| private BlockIndexChunk | curInlineChunk | Current leaf-level chunk. |
| private byte[] | firstKey | |
| private HFileIndexBlockEncoder | indexBlockEncoder | Type of encoding used for index blocks in HFile |
| private int | maxChunkSize | The maximum size guideline of all multi-level index blocks. |
| private int | minIndexNumEntries | The minimum number of entries in a multi-level index block. |
| private String | nameForCaching | Name to use for computing cache keys |
| private int | numLevels | The number of block index levels. |
| private BlockIndexChunk | rootChunk | While the index is being written, this represents the current block index referencing all leaf blocks, with one exception. |
| private boolean | singleLevelOnly | Whether we require this block index to always be single-level. |
| private long | totalBlockOnDiskSize | Total compressed size of all index blocks. |
| private long | totalBlockUncompressedSize | Total uncompressed size of all index blocks. |
| private long | totalNumEntries | The total number of leaf-level entries, i.e. entries referenced by leaf-level blocks. |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| BlockIndexWriter() | Creates a single-level block index writer. |
| BlockIndexWriter(HFileBlock.Writer blockWriter, CacheConfig cacheConf, String nameForCaching, HFileIndexBlockEncoder indexBlockEncoder) | Creates a multi-level block index writer. |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | addEntry(byte[] firstKey, long blockOffset, int blockDataSize) | Add one index entry to the current leaf-level block. |
| void | blockWritten(long offset, int onDiskSize, int uncompressedSize) | Called after an inline block has been written so that we can add an entry referring to that block to the parent-level index. |
| void | ensureSingleLevel() | |
| private void | expectNumLevels(int expectedNumLevels) | |
| boolean | getCacheOnWrite() | Returns true if inline blocks produced by this writer should be cached |
| BlockType | getInlineBlockType() | The type of blocks this block writer produces. |
| int | getNumLevels() | Returns the number of levels in this block index. |
| final int | getNumRootEntries() | Returns how many block index entries there are in the root level |
| long | getTotalUncompressedSize() | The total uncompressed size of the root index block, intermediate-level index blocks, and leaf-level index blocks. |
| void | setMaxChunkSize(int maxChunkSize) | |
| void | setMinIndexNumEntries(int minIndexNumEntries) | |
| boolean | shouldWriteBlock(boolean closing) | Whether there is an inline block ready to be written. |
| long | writeIndexBlocks(org.apache.hadoop.fs.FSDataOutputStream out) | Writes the root level and intermediate levels of the block index into the output stream, generating the tree from bottom up. |
| void | writeInlineBlock(DataOutput out) | Write out the current inline index block. |
| private void | writeIntermediateBlock(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk parent, BlockIndexChunk curChunk) | |
| private BlockIndexChunk | writeIntermediateLevel(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk currentLevel) | Split the current level of the block index into intermediate index blocks of permitted size and write those blocks to disk. |
| void | writeSingleLevelIndex(DataOutput out, String description) | Writes the block index data as a single level only. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- rootChunk
  
  private BlockIndexChunk rootChunk
  
  While the index is being written, this represents the current block index referencing all leaf blocks, with one exception. If the file is being closed and there are not enough blocks to complete even a single leaf block, no leaf blocks get written and this contains the entire block index. After all levels of the index have been written by writeIndexBlocks(FSDataOutputStream), this contains the final root-level index.
- curInlineChunk
  
  private BlockIndexChunk curInlineChunk
  
  Current leaf-level chunk. New entries referencing data blocks get added to this chunk until it grows large enough to be written to disk.
- numLevels
  
  private int numLevels
  
  The number of block index levels. This is one if there is only a root level (even an empty one), two if there is a leaf level and a root level, and higher if there are intermediate levels. The value is only final after writeIndexBlocks(FSDataOutputStream) has been called. The initial value accounts for the root level, and is increased to two as soon as blockWritten(long, int, int) finds out that there is a leaf level.
- blockWriter
  
  private HFileBlock.Writer blockWriter
- firstKey
  
  private byte[] firstKey
- totalNumEntries
  
  private long totalNumEntries
  
  The total number of leaf-level entries, i.e. entries referenced by leaf-level blocks. For the data block index this is equal to the number of data blocks.
- totalBlockOnDiskSize
  
  private long totalBlockOnDiskSize
  
  Total compressed size of all index blocks.
- totalBlockUncompressedSize
  
  private long totalBlockUncompressedSize
  
  Total uncompressed size of all index blocks.
- maxChunkSize
  
  private int maxChunkSize
  
  The maximum size guideline of all multi-level index blocks.
- minIndexNumEntries
  
  private int minIndexNumEntries
  
  The minimum number of entries in a multi-level index block.
- singleLevelOnly
  
  private boolean singleLevelOnly
  
  Whether we require this block index to always be single-level.
- cacheConf
  
  private CacheConfig cacheConf
  
  CacheConfig, or null if cache-on-write is disabled
- nameForCaching
  
  private String nameForCaching
  
  Name to use for computing cache keys
- indexBlockEncoder
  
  private HFileIndexBlockEncoder indexBlockEncoder
  
  Type of encoding used for index blocks in HFile
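
The interplay of rootChunk, curInlineChunk, and numLevels described in the field details can be traced with a small stand-alone model. This is a sketch under stated assumptions, not the HBase class: WriterStateSketch and its run helper are hypothetical, index entries are reduced to plain key strings, and an entry count (maxEntriesPerChunk) stands in for the byte-based maxChunkSize.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the writer's bookkeeping fields; not the HBase class. */
final class WriterStateSketch {
  List<String> rootChunk = new ArrayList<>();      // entries of the parent-level index
  List<String> curInlineChunk = new ArrayList<>(); // entries of the leaf chunk being filled
  int numLevels = 1;                               // the root level always exists
  long totalNumEntries = 0;                        // one entry per data block
  private String firstKey;                         // first key of the leaf block just written
  private final int maxEntriesPerChunk;            // assumption: stand-in for maxChunkSize

  WriterStateSketch(int maxEntriesPerChunk) {
    this.maxEntriesPerChunk = maxEntriesPerChunk;
  }

  void addEntry(String dataBlockFirstKey) {
    curInlineChunk.add(dataBlockFirstKey);
    totalNumEntries++;
  }

  boolean shouldWriteBlock(boolean closing) {
    if (curInlineChunk.isEmpty()) {
      return false;
    }
    if (closing) {
      if (rootChunk.isEmpty()) {
        // No leaf block was ever written: the pending entries become the whole index.
        rootChunk = curInlineChunk;
        curInlineChunk = new ArrayList<>();
        return false;
      }
      return true; // flush the final, partially filled leaf chunk
    }
    return curInlineChunk.size() >= maxEntriesPerChunk;
  }

  void writeInlineBlock() {
    firstKey = curInlineChunk.get(0); // remembered so the parent entry can reference it
    curInlineChunk = new ArrayList<>();
  }

  void blockWritten() {
    rootChunk.add(firstKey);
    firstKey = null;
    numLevels = 2; // a leaf level now exists below the root
  }

  /** Feeds n data-block keys ("k0", "k1", ...) through the model and then closes it. */
  static WriterStateSketch run(int n, int maxEntriesPerChunk) {
    WriterStateSketch w = new WriterStateSketch(maxEntriesPerChunk);
    for (int i = 0; i < n; i++) {
      w.addEntry("k" + i);
      if (w.shouldWriteBlock(false)) {
        w.writeInlineBlock();
        w.blockWritten();
      }
    }
    if (w.shouldWriteBlock(true)) {
      w.writeInlineBlock();
      w.blockWritten();
    }
    return w;
  }

  public static void main(String[] args) {
    WriterStateSketch small = run(1, 2);
    System.out.println(small.numLevels + " " + small.rootChunk);
    WriterStateSketch large = run(5, 2);
    System.out.println(large.numLevels + " " + large.rootChunk);
  }
}
```

In the first run no leaf block is completed, so rootChunk ends up holding the entire index and numLevels stays at one. In the second run three leaf chunks are flushed, rootChunk holds one entry per leaf block, and numLevels is two.
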
Constructor Details
- BlockIndexWriter
  
  public BlockIndexWriter()
  
  Creates a single-level block index writer
- BlockIndexWriter
  
  public BlockIndexWriter(HFileBlock.Writer blockWriter, CacheConfig cacheConf, String nameForCaching, HFileIndexBlockEncoder indexBlockEncoder)
  
  Creates a multi-level block index writer.
  
  Parameters:
  
  blockWriter - the block writer to use to write index blocks
  
  cacheConf - used to determine when and how a block should be cached-on-write.
Method Details
- setMaxChunkSize
  
  public void setMaxChunkSize(int maxChunkSize)
- setMinIndexNumEntries
  
  public void setMinIndexNumEntries(int minIndexNumEntries)
- writeIndexBlocks
  
  public long writeIndexBlocks(org.apache.hadoop.fs.FSDataOutputStream out) throws IOException
  
  Writes the root level and intermediate levels of the block index into the output stream, generating the tree from bottom up. Assumes that the leaf level has been inline-written to the disk if there is enough data for more than one leaf block. We iterate by breaking the current level of the block index, starting with the index of all leaf-level blocks, into chunks small enough to be written to disk, and generate its parent level, until we end up with a level small enough to become the root level. If the leaf level is not large enough, there is no inline block index anymore, so we only write that level of block index to disk as the root level.
  
  Parameters:
  
  out - FSDataOutputStream
  
  Returns:
  
  position at which we entered the root-level index.
  
  Throws:
  
  IOException
- writeSingleLevelIndex
  
  public void writeSingleLevelIndex(DataOutput out, String description) throws IOException
  
  Writes the block index data as a single level only. Does not do any block framing.
  
  Parameters:
  
  out - the buffered output stream to write the index to. Typically a stream writing into an HFile block.
  
  description - a short description of the index being written. Used in a log message.
  
  Throws:
  
  IOException
- writeIntermediateLevel
  
  private BlockIndexChunk writeIntermediateLevel(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk currentLevel) throws IOException
  
  Split the current level of the block index into intermediate index blocks of permitted size and write those blocks to disk. Return the next level of the block index referencing those intermediate-level blocks.
  
  Parameters:
  
  currentLevel - the current level of the block index, such as a chunk referencing all leaf-level index blocks
  
  Returns:
  
  the parent level block index, which becomes the root index after a few (usually zero) iterations
  
  Throws:
  
  IOException
- writeIntermediateBlock
  
  private void writeIntermediateBlock(org.apache.hadoop.fs.FSDataOutputStream out, BlockIndexChunk parent, BlockIndexChunk curChunk) throws IOException
  
  Throws:
  
  IOException
- getNumRootEntries
  
  public final int getNumRootEntries()
  
  Returns how many block index entries there are in the root level
- getNumLevels
  
  public int getNumLevels()
  
  Returns the number of levels in this block index.
- expectNumLevels
  
  private void expectNumLevels(int expectedNumLevels)
- shouldWriteBlock
  
  public boolean shouldWriteBlock(boolean closing)
  
  Whether there is an inline block ready to be written. In general, we write a leaf-level index block as an inline block as soon as its size, as serialized in the non-root format, reaches a certain threshold.
  
  Specified by:
  
  shouldWriteBlock in interface InlineBlockWriter
- writeInlineBlock
  
  public void writeInlineBlock(DataOutput out) throws IOException
  
  Write out the current inline index block. Inline blocks are non-root blocks, so the non-root index format is used.
  
  Specified by:
  
  writeInlineBlock in interface InlineBlockWriter
  
  Throws:
  
  IOException
- blockWritten
  
  public void blockWritten(long offset, int onDiskSize, int uncompressedSize)
  
  Called after an inline block has been written so that we can add an entry referring to that block to the parent-level index.
  
  Specified by:
  
  blockWritten in interface InlineBlockWriter
  
  Parameters:
  
  offset - the offset of the block in the stream
  
  onDiskSize - the on-disk size of the block
  
  uncompressedSize - the uncompressed size of the block
- getInlineBlockType
  
  public BlockType getInlineBlockType()
  
  Description copied from interface: InlineBlockWriter
  
  The type of blocks this block writer produces.
  
  Specified by:
  
  getInlineBlockType in interface InlineBlockWriter
- addEntry
  
  public void addEntry(byte[] firstKey, long blockOffset, int blockDataSize)
  
  Add one index entry to the current leaf-level block. When the leaf-level block gets large enough, it will be flushed to disk as an inline block.
  
  Parameters:
  
  firstKey - the first key of the data block
  
  blockOffset - the offset of the data block
  
  blockDataSize - the on-disk size of the data block (HFile format version 2), or the uncompressed size of the data block (HFile format version 1).
- ensureSingleLevel
  
  public void ensureSingleLevel() throws IOException
  
  Throws:
  
  IOException - if we happened to write a multi-level index.
- getCacheOnWrite
  
  public boolean getCacheOnWrite()
  
  Description copied from interface: InlineBlockWriter
  
  Returns true if inline blocks produced by this writer should be cached
  
  Specified by:
  
  getCacheOnWrite in interface InlineBlockWriter
  
  Returns:
  
  true if we are using cache-on-write. This is configured by the caller of the constructor by either passing a valid block cache or null.
- getTotalUncompressedSize
  
  public long getTotalUncompressedSize()
  
  The total uncompressed size of the root index block, intermediate-level index blocks, and leaf-level index blocks.
  
  Returns:
  
  the total uncompressed size of all index blocks
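
The splitting step that writeIntermediateLevel describes can be sketched as follows. This is a minimal stand-alone illustration, not the HBase code: IntermediateLevelSketch, its Entry type, and the serializedSize estimate are hypothetical, and the sketch assumes that minIndexNumEntries acts as a lower bound on the entries per block, which the name suggests but this page does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of splitting one index level into size-bounded blocks. */
final class IntermediateLevelSketch {

  /** Hypothetical index entry: first key plus location and size of the referenced block. */
  static final class Entry {
    final String firstKey;
    final long offset;
    final int onDiskSize;

    Entry(String firstKey, long offset, int onDiskSize) {
      this.firstKey = firstKey;
      this.offset = offset;
      this.onDiskSize = onDiskSize;
    }

    /** Assumed serialized size: key characters plus an 8-byte offset and a 4-byte size. */
    int serializedSize() {
      return firstKey.length() + Long.BYTES + Integer.BYTES;
    }
  }

  /**
   * Groups the entries of currentLevel into blocks and returns the parent level,
   * which holds one entry per block. A block is closed once it has reached
   * maxChunkSize bytes and contains at least minEntries entries.
   */
  static List<Entry> writeIntermediateLevel(
      List<Entry> currentLevel, int maxChunkSize, int minEntries, long startOffset) {
    List<Entry> parent = new ArrayList<>();
    List<Entry> chunk = new ArrayList<>();
    int chunkSize = 0;
    long offset = startOffset; // where the next block would land in the output
    for (Entry e : currentLevel) {
      chunk.add(e);
      chunkSize += e.serializedSize();
      if (chunkSize >= maxChunkSize && chunk.size() >= minEntries) {
        parent.add(new Entry(chunk.get(0).firstKey, offset, chunkSize));
        offset += chunkSize;
        chunk = new ArrayList<>();
        chunkSize = 0;
      }
    }
    if (!chunk.isEmpty()) {
      // The last block may be smaller than the others.
      parent.add(new Entry(chunk.get(0).firstKey, offset, chunkSize));
    }
    return parent;
  }

  public static void main(String[] args) {
    List<Entry> leafIndex = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      leafIndex.add(new Entry("k" + i, i * 100L, 100));
    }
    for (Entry e : writeIntermediateLevel(leafIndex, 40, 2, 1000L)) {
      System.out.println(e.firstKey + " @" + e.offset + " size=" + e.onDiskSize);
    }
  }
}
```

Because a block is only closed after the entry that pushes it over the limit has been added, a block can end up slightly larger than maxChunkSize. That is consistent with maxChunkSize being documented as a size guideline rather than a hard cap.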
