Interface BufferedMutator

All Superinterfaces:
AutoCloseable, Closeable
All Known Implementing Classes:
BufferedMutatorOverAsyncBufferedMutator

@Public public interface BufferedMutator extends Closeable

Used to communicate with a single HBase table, similar to Table but meant for batched, asynchronous puts. Obtain an instance from a Connection and call close() afterwards. Customizations can be applied to the BufferedMutator via BufferedMutatorParams.

Exceptions are handled asynchronously via the BufferedMutator.ExceptionListener. The default implementation throws the exception upon receipt. This behavior can be overridden with a custom implementation, provided as a parameter with BufferedMutatorParams.listener(BufferedMutator.ExceptionListener).
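The listener pattern described above can be illustrated with a small, library-free sketch. This is a toy model and not HBase code: FailureListener, RETHROW, and flush are hypothetical names chosen for illustration, and failures are delivered inline rather than from a background thread to keep the example deterministic.

```java
import java.util.List;

/** Toy model of listener-based error delivery; this is not HBase code. */
class ListenerSketch {
    /** Hypothetical stand-in for a callback that receives failures from a flush. */
    interface FailureListener {
        void onFailure(RuntimeException failure);
    }

    /** Mirrors the documented default behavior: rethrow the failure upon receipt. */
    static final FailureListener RETHROW = failure -> { throw failure; };

    /**
     * Simulates a flush in which every record starting with "bad" fails.
     * Failures go to the listener; the return value counts successful sends.
     */
    static int flush(List<String> records, FailureListener listener) {
        int sent = 0;
        for (String record : records) {
            if (record.startsWith("bad")) {
                listener.onFailure(new IllegalStateException("failed: " + record));
            } else {
                sent++;
            }
        }
        return sent;
    }
}
```

With RETHROW the first failure aborts the flush, whereas a custom listener (for example, one that collects messages into a list) lets the remaining records proceed.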

Map/Reduce jobs are a good use case for BufferedMutator. Map/Reduce jobs benefit from batching, but have no natural flush point. BufferedMutator receives the puts from the M/R job, batches them based on some heuristic, such as the accumulated size of the puts, and submits the batches asynchronously so that the M/R logic can continue without interruption.
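To make the size-based heuristic concrete, here is a minimal sketch of a buffer that submits a batch once the accumulated payload size reaches a threshold. It is a simplified model under stated assumptions, not the HBase implementation: the class SizeTriggeredBuffer, its threshold parameter, and the in-memory list standing in for the network are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of size-triggered batching; this is not HBase code. */
class SizeTriggeredBuffer {
    private final long maxBufferedBytes;                 // hypothetical flush threshold
    private final List<byte[]> pending = new ArrayList<>();
    private final List<List<byte[]>> sentBatches = new ArrayList<>();
    private long bufferedBytes = 0;

    SizeTriggeredBuffer(long maxBufferedBytes) {
        this.maxBufferedBytes = maxBufferedBytes;
    }

    /** Buffers one record and submits a batch once the threshold is reached. */
    void add(byte[] record) {
        pending.add(record);
        bufferedBytes += record.length;
        if (bufferedBytes >= maxBufferedBytes) {
            flush();
        }
    }

    /** Submits whatever is buffered; a real client would send it over the wire. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sentBatches.add(new ArrayList<>(pending));
        pending.clear();
        bufferedBytes = 0;
    }

    int batchCount() { return sentBatches.size(); }

    int pendingCount() { return pending.size(); }
}
```

The caller only ever calls add; batch boundaries fall wherever the threshold is crossed, which is why a job with no natural flush point still needs a final flush (or close) to push out the remainder.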

BufferedMutator can also be used in more exotic circumstances. Map/Reduce batch jobs will have a single BufferedMutator per thread. A single BufferedMutator can also be used effectively in high-volume online systems to batch puts, with the caveat that extreme circumstances, such as JVM or machine failure, may cause some data loss.

NOTE: This class replaces the functionality that used to be available via HTable#setAutoFlush(boolean) set to false.

See also the BufferedMutatorExample in the hbase-examples module.

Since:
1.0.0
  • Field Details

    • CLASSNAME_KEY

      Deprecated.
      Since 3.0.0, will be removed in 4.0.0. For internal test use only, do not use it any more.
      Key used to set a non-default BufferedMutator implementation in the Configuration.

    • MIN_WRITE_BUFFER_PERIODIC_FLUSH_TIMERTICK_MS

      Having the timer tick run more often than once every 100 ms is needless and will probably cause too many timer events to fire, which has a negative impact on performance.
  • Method Details

    • getName

      Gets the fully qualified table name instance of the table that this BufferedMutator writes to.
    • getConfiguration

      org.apache.hadoop.conf.Configuration getConfiguration()
      Returns the Configuration object used by this instance.

      The reference returned is not a copy, so any change made to it will affect this instance.

    • mutate

      void mutate(Mutation mutation) throws IOException
      Sends a Mutation to the table. The mutations will be buffered and sent over the wire as part of a batch. Currently only supports Put and Delete mutations.
      Parameters:
      mutation - The data to send.
      Throws:
      IOException - if a remote or network exception occurs.
    • mutate

      void mutate(List<? extends Mutation> mutations) throws IOException
      Sends some Mutations to the table. The mutations will be buffered and sent over the wire as part of a batch. There is no guarantee that the entire contents of mutations will be sent in a single batch; the list will be broken up according to the write buffer capacity.
      Parameters:
      mutations - The data to send.
      Throws:
      IOException - if a remote or network exception occurs.
    • close

      void close() throws IOException
      Performs a flush() and releases any resources held.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException - if a remote or network exception occurs.
    • flush

      void flush() throws IOException
      Executes all the buffered, asynchronous Mutation operations and waits until they are done.
      Throws:
      IOException - if a remote or network exception occurs.
    • setWriteBufferPeriodicFlush

      default void setWriteBufferPeriodicFlush(long timeoutMs)
      Sets the maximum time before the buffer is automatically flushed, checking once per second.
      Parameters:
      timeoutMs - The maximum number of milliseconds that records may be buffered before they are flushed. Set to 0 to disable.
    • setWriteBufferPeriodicFlush

      default void setWriteBufferPeriodicFlush(long timeoutMs, long timerTickMs)
      Sets the maximum time before the buffer is automatically flushed.
      Parameters:
      timeoutMs - The maximum number of milliseconds that records may be buffered before they are flushed. Set to 0 to disable.
      timerTickMs - The number of milliseconds between successive checks of whether the timeout has been exceeded. Must be 100 ms (as defined in MIN_WRITE_BUFFER_PERIODIC_FLUSH_TIMERTICK_MS) or larger to avoid performance problems.
    • disableWriteBufferPeriodicFlush

      Disable periodic flushing of the write buffer.
    • getWriteBufferPeriodicFlushTimeoutMs

      Returns the current periodic flush timeout value in milliseconds.
      Returns:
      The maximum number of milliseconds that records may be buffered before they are flushed. The value 0 means periodic flushing is disabled.
    • getWriteBufferPeriodicFlushTimerTickMs

      Returns the current periodic flush timer tick interval in milliseconds.
      Returns:
      The number of milliseconds between successive checks of whether the timeout has been exceeded. This value is only meaningful if the timeout has been set to a value greater than 0.
    • getWriteBufferSize

      default long getWriteBufferSize()
      Returns the maximum size in bytes of the write buffer for this BufferedMutator.

      The default value comes from the configuration parameter hbase.client.write.buffer.

      Returns:
      The size of the write buffer in bytes.
    • setRpcTimeout

      @Deprecated default void setRpcTimeout(int timeout)
      Deprecated.
      Since 3.0.0, will be removed in 4.0.0. Please set this through the BufferedMutatorParams.
      Sets the RPC timeout for this mutator instance.
    • setOperationTimeout

      @Deprecated default void setOperationTimeout(int timeout)
      Deprecated.
      Since 3.0.0, will be removed in 4.0.0. Please set this through the BufferedMutatorParams.
      Sets the operation timeout for this mutator instance.
    • getRequestAttributes

      default Map<String,byte[]> getRequestAttributes()
      Returns the rpc request attributes.