Package org.apache.hadoop.hbase.io.hfile
Class HFile
java.lang.Object
org.apache.hadoop.hbase.io.hfile.HFile
File format for HBase. A file of sorted key/value pairs. Both keys and values are byte arrays.
The memory footprint of an HFile includes the following (the list below is taken from the TFile documentation but also applies to HFile):
- Some constant overhead of reading or writing a compressed block.
- Each compressed block requires one compression/decompression codec for I/O.
- Temporary space to buffer the key.
- Temporary space to buffer the value.
- HFile index, which is proportional to the total number of Data Blocks. The total amount of memory needed to hold the index can be estimated as (56+AvgKeySize)*NumBlocks.
- Minimum block size. We recommend setting the minimum block size between 8KB and 1MB for general usage. A larger block size is preferred if files are primarily for sequential access. However, it leads to inefficient random access (because there is more data to decompress). Smaller blocks are good for random access, but require more memory to hold the block index, and may be slower to create (because we must flush the compressor stream at the conclusion of each data block, which leads to an FS I/O flush). Further, due to internal caching in the compression codec, the smallest possible block size is around 20KB-30KB.
- The current implementation does not offer true multi-threading for reading. The implementation uses FSDataInputStream seek()+read(), which has been shown to be much faster than the positioned-read call in single-threaded mode. However, it also means that if multiple threads attempt to access the same HFile (using multiple scanners) simultaneously, the actual I/O is carried out sequentially even if they access different DFS blocks (Reexamine! pread seems to be 10% faster than seek+read in my testing -- stack).
- Compression codec. Use "none" if the data is not very compressible (by compressible, I mean a compression ratio of at least 2:1). Generally, use "lzo" as the starting point for experimenting. "gz" offers a slightly better compression ratio than "lzo" but requires 4x CPU to compress and 2x CPU to decompress, compared to "lzo". A sketch of configuring the block size and codec follows this list.
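To make the trade-offs above concrete, here is a minimal sketch of fixing the block size and compression codec through HFileContextBuilder (assuming the builder API of recent HBase releases; the chosen values are illustrative, not recommendations):

    import org.apache.hadoop.hbase.io.compress.Compression;
    import org.apache.hadoop.hbase.io.hfile.HFileContext;
    import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;

    public class BlockTuningSketch {
      // Build an HFileContext that fixes the block size and compression codec.
      // 64KB sits inside the recommended 8KB-1MB range; GZ trades CPU for ratio.
      static HFileContext tunedContext() {
        return new HFileContextBuilder()
            .withBlockSize(64 * 1024)
            .withCompression(Compression.Algorithm.GZ)
            .build();
      }
    }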
The file is made of data blocks followed by meta data blocks (if any), a fileinfo block, a data block index, a meta data block index, and a fixed-size trailer which records the offsets at which the file changes content type.
<data blocks><meta blocks><fileinfo><data index><meta index><trailer>
Each block has a bit of magic at its start. Blocks are comprised of key/value pairs. In data blocks, both are byte arrays. In meta data blocks, the key is a String and the value is a byte array. An empty file looks like this:
<fileinfo><trailer>
That is, neither data nor meta blocks are present.
TODO: Do scanners need to be able to take a start and end row? TODO: Should BlockIndex know the name of its file? Should it have a Path that points at its file say for the case where an index lives apart from an HFile instance?
-
Nested Class Summary

static interface  HFile.CachingBlockReader
    An abstraction used by the block index.
static interface  HFile.Reader
    An interface used by clients to open and iterate an HFile.
static interface  HFile.Writer
    API required to write an HFile.
static class  HFile.WriterFactory
    This variety of ways to construct writers is used throughout the code, and we want to be able to swap writer implementations.
-
Field Summary
static final String  BLOOM_FILTER_DATA_KEY
    Meta data block name for bloom filter bits.
(package private) static final LongAdder  CHECKSUM_FAILURES
static final LongAdder  DATABLOCK_READ_COUNT
static final int  DEFAULT_BYTES_PER_CHECKSUM
    The number of bytes per checksum.
static final String  DEFAULT_COMPRESSION
    Default compression name: none.
static final Compression.Algorithm  DEFAULT_COMPRESSION_ALGORITHM
    Default compression: none.
static final String  FORMAT_VERSION_KEY
    The configuration key for the HFile version to use for new files.
(package private) static final org.slf4j.Logger  LOG
static final int  MAX_FORMAT_VERSION
    Maximum supported HFile format version.
static final int  MAXIMUM_KEY_LENGTH
    Maximum length of key in HFile.
static final int  MIN_FORMAT_VERSION
    Minimum supported HFile format version.
static final int  MIN_FORMAT_VERSION_WITH_TAGS
    Minimum HFile format version with support for persisting cell tags.
static final int  MIN_NUM_HFILE_PATH_LEVELS
    We assume that an HFile path ends with ROOT_DIR/TABLE_NAME/REGION_NAME/CF_NAME/HFILE, so it has at least this many levels of nesting.
-
Constructor Summary

private  HFile()
    Shutdown constructor.
-
Method Summary

static void  checkFormatVersion(int version)
    Checks the given HFile format version, and throws an exception if invalid.
static void  checkHFileVersion(org.apache.hadoop.conf.Configuration c)
static HFile.Reader  createReader(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
    Creates a reader with cache configuration disabled.
static HFile.Reader  createReader(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, CacheConfig cacheConf, boolean primaryReplicaReader, org.apache.hadoop.conf.Configuration conf)
static HFile.Reader  createReader(ReaderContext context, HFileInfo fileInfo, CacheConfig cacheConf, org.apache.hadoop.conf.Configuration conf)
    Method returns the reader given the specified arguments.
static final long  getAndResetChecksumFailuresCount()
    Number of checksum verification failures; also clears the counter.
static final long  getChecksumFailuresCount()
    Number of checksum verification failures.
static int  getFormatVersion(org.apache.hadoop.conf.Configuration conf)
static List<org.apache.hadoop.fs.Path>  getStoreFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir)
    Returns all HFiles belonging to the given region directory.
static String[]  getSupportedCompressionAlgorithms()
    Get names of supported compression algorithms.
static final HFile.WriterFactory  getWriterFactory(org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf)
    Returns the factory to be used to create HFile writers.
static final HFile.WriterFactory  getWriterFactoryNoCache(org.apache.hadoop.conf.Configuration conf)
    Returns the factory to be used to create HFile writers.
static boolean  isHFileFormat(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus)
    Returns true if the specified file has a valid HFile Trailer.
static boolean  isHFileFormat(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
    Returns true if the specified file has a valid HFile Trailer.
(package private) static int  longToInt(long l)
static void  main(String[] args)
static final void  updateReadLatency(long latencyMillis, boolean pread, boolean tooSlow)
static final void  updateWriteLatency(long latencyMillis)
-
Field Details
- LOG
- MAXIMUM_KEY_LENGTH
  Maximum length of key in HFile.
- DEFAULT_COMPRESSION_ALGORITHM
  Default compression: none.
- MIN_FORMAT_VERSION
  Minimum supported HFile format version.
- MAX_FORMAT_VERSION
  Maximum supported HFile format version.
- MIN_FORMAT_VERSION_WITH_TAGS
  Minimum HFile format version with support for persisting cell tags.
- DEFAULT_COMPRESSION
  Default compression name: none.
- BLOOM_FILTER_DATA_KEY
  Meta data block name for bloom filter bits.
- MIN_NUM_HFILE_PATH_LEVELS
  We assume that an HFile path ends with ROOT_DIR/TABLE_NAME/REGION_NAME/CF_NAME/HFILE, so it has at least this many levels of nesting. This is needed for identifying table and CF names from an HFile path.
- DEFAULT_BYTES_PER_CHECKSUM
  The number of bytes per checksum.
- CHECKSUM_FAILURES
- DATABLOCK_READ_COUNT
- FORMAT_VERSION_KEY
  The configuration key for the HFile version to use for new files.
-
Constructor Details
- HFile
  private HFile()
  Shutdown constructor.
-
Method Details
-
getAndResetChecksumFailuresCount
Number of checksum verification failures. It also clears the counter.
-
getChecksumFailuresCount
Number of checksum verification failures. Unlike getAndResetChecksumFailuresCount, this does not clear the counter.
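To illustrate the difference between the two counters, a hedged sketch of a periodic poll that drains the counter, so each report covers only the interval since the previous one (the reporting destination is a placeholder):

    import org.apache.hadoop.hbase.io.hfile.HFile;

    public class ChecksumPollSketch {
      // Drain the counter on each poll; a plain read would instead use
      // getChecksumFailuresCount() and leave the running total intact.
      static void report() {
        long failures = HFile.getAndResetChecksumFailuresCount();
        if (failures > 0) {
          System.err.println("HFile checksum failures since last poll: " + failures);
        }
      }
    }
-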
updateReadLatency
-
updateWriteLatency
-
getFormatVersion
-
getWriterFactoryNoCache
public static final HFile.WriterFactory getWriterFactoryNoCache(org.apache.hadoop.conf.Configuration conf)
Returns the factory to be used to create HFile writers. Disables block cache access for all writers created through the returned factory.
-
getWriterFactory
public static final HFile.WriterFactory getWriterFactory(org.apache.hadoop.conf.Configuration conf, CacheConfig cacheConf)
Returns the factory to be used to create HFile writers.
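A minimal sketch of obtaining a writer from this factory and producing a small HFile. It assumes the builder-style methods withPath, withFileContext and create on HFile.WriterFactory, as found in recent HBase releases; the path and cell contents are hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.io.hfile.CacheConfig;
    import org.apache.hadoop.hbase.io.hfile.HFile;
    import org.apache.hadoop.hbase.io.hfile.HFileContext;
    import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WriterSketch {
      static void writeSample(Configuration conf, FileSystem fs, Path path) throws IOException {
        HFileContext ctx = new HFileContextBuilder().build();
        try (HFile.Writer writer = HFile.getWriterFactory(conf, new CacheConfig(conf))
            .withPath(fs, path)
            .withFileContext(ctx)
            .create()) {
          // Cells must be appended in sorted key order.
          writer.append(new KeyValue(Bytes.toBytes("row1"), Bytes.toBytes("cf"),
              Bytes.toBytes("q"), Bytes.toBytes("v1")));
          writer.append(new KeyValue(Bytes.toBytes("row2"), Bytes.toBytes("cf"),
              Bytes.toBytes("q"), Bytes.toBytes("v2")));
        } // close() writes the fileinfo block, block indexes, and trailer
      }
    }

getWriterFactoryNoCache(conf) can be substituted where writers should not touch the block cache.
-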
createReader
public static HFile.Reader createReader(ReaderContext context, HFileInfo fileInfo, CacheConfig cacheConf, org.apache.hadoop.conf.Configuration conf) throws IOException
Method returns the reader given the specified arguments. TODO This is a bad abstraction. See HBASE-6635.
- Parameters:
  context - Reader context info
  fileInfo - HFile info
  cacheConf - Cache configuration values, cannot be null.
  conf - Configuration
- Returns:
  an appropriate instance of HFileReader
- Throws:
  IOException - If file is invalid, will throw CorruptHFileException flavored IOException
-
createReader
public static HFile.Reader createReader(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf) throws IOException
Creates a reader with cache configuration disabled.
- Parameters:
  fs - filesystem
  path - Path to file to read
  conf - Configuration
- Returns:
  an active Reader instance
- Throws:
  IOException - Will throw a CorruptHFileException (DoNotRetryIOException subtype) if the hfile is corrupt or invalid.
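A hedged sketch of this overload in use: open a reader and iterate the sorted cells with an HFileScanner. It assumes the getScanner overload that takes a Configuration, available in recent HBase releases:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.io.hfile.HFile;
    import org.apache.hadoop.hbase.io.hfile.HFileScanner;

    public class ReaderSketch {
      static void dump(Configuration conf, FileSystem fs, Path path) throws IOException {
        try (HFile.Reader reader = HFile.createReader(fs, path, conf)) {
          // cacheBlocks=false, pread=false: a plain sequential scan.
          HFileScanner scanner = reader.getScanner(conf, false, false);
          if (scanner.seekTo()) { // positions at the first cell; false if the file is empty
            do {
              Cell cell = scanner.getCell();
              System.out.println(cell);
            } while (scanner.next());
          }
        }
      }
    }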
-
createReader
public static HFile.Reader createReader(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, CacheConfig cacheConf, boolean primaryReplicaReader, org.apache.hadoop.conf.Configuration conf) throws IOException
- Parameters:
  fs - filesystem
  path - Path to file to read
  cacheConf - This must not be null.
  primaryReplicaReader - true if this is a reader for a primary replica
  conf - Configuration
- Returns:
  an active Reader instance
- Throws:
  IOException - Will throw a CorruptHFileException (DoNotRetryIOException subtype) if the hfile is corrupt or invalid.
-
isHFileFormat
public static boolean isHFileFormat(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path) throws IOException
Returns true if the specified file has a valid HFile Trailer.
- Parameters:
  fs - filesystem
  path - Path to file to verify
- Returns:
  true if the file has a valid HFile Trailer, otherwise false
- Throws:
  IOException - if failed to read from the underlying stream
-
isHFileFormat
public static boolean isHFileFormat(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus fileStatus) throws IOException
Returns true if the specified file has a valid HFile Trailer.
- Parameters:
  fs - filesystem
  fileStatus - the file to verify
- Returns:
  true if the file has a valid HFile Trailer, otherwise false
- Throws:
  IOException - if failed to read from the underlying stream
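A small usage sketch, assuming only the method shown above: validate the trailer cheaply before committing to a full reader open:

    import java.io.IOException;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.io.hfile.HFile;

    public class TrailerCheckSketch {
      // Cheap trailer validation; a full createReader() would also parse
      // the fileinfo and indexes and throw CorruptHFileException on damage.
      static boolean looksLikeHFile(FileSystem fs, Path path) throws IOException {
        return HFile.isHFileFormat(fs, path);
      }
    }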
-
getSupportedCompressionAlgorithms
Get names of supported compression algorithms. The names are acceptable to HFile.Writer.
- Returns:
  Array of strings, each representing a supported compression algorithm. Currently, the following compression algorithms are supported:
  - "none" - No compression.
  - "gz" - GZIP compression.
-
longToInt
-
getStoreFiles
public static List<org.apache.hadoop.fs.Path> getStoreFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir) throws IOException
Returns all HFiles belonging to the given region directory. Could return an empty list.
- Parameters:
  fs - The file system reference.
  regionDir - The region directory to scan.
- Returns:
  The list of files found.
- Throws:
  IOException - When scanning the files fails.
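A short usage sketch (the region directory argument is hypothetical):

    import java.io.IOException;
    import java.util.List;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.io.hfile.HFile;

    public class RegionFilesSketch {
      static void listRegionHFiles(FileSystem fs, Path regionDir) throws IOException {
        // May be empty if the region currently has no store files.
        List<Path> files = HFile.getStoreFiles(fs, regionDir);
        for (Path p : files) {
          System.out.println(p);
        }
      }
    }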
-
checkFormatVersion
Checks the given HFile format version, and throws an exception if invalid. Note that if the version number comes from an input file and has not been verified, the caller needs to re-throw an IOException to indicate that this is not a software error, but corrupted input (see the sketch below).
- Parameters:
  version - an HFile version
- Throws:
  IllegalArgumentException - if the version is invalid
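Following the note above, a sketch of the recommended re-throw when the version comes from an unverified file trailer:

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.io.hfile.HFile;

    public class VersionCheckSketch {
      // The version read from a file trailer is untrusted input, so the
      // IllegalArgumentException is converted into an IOException.
      static void checkVersionFromFile(int versionFromTrailer, Path path) throws IOException {
        try {
          HFile.checkFormatVersion(versionFromTrailer);
        } catch (IllegalArgumentException e) {
          throw new IOException("Invalid HFile version in " + path, e);
        }
      }
    }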
-
checkHFileVersion
-
main
- Throws:
Exception
-