Interface WAL

All Superinterfaces:
AutoCloseable, Closeable, WALFileLengthProvider
All Known Implementing Classes:
AbstractFSWAL, AsyncFSWAL, DisabledWALProvider.DisabledWAL, FSHLog

@Private @Evolving public interface WAL extends Closeable, WALFileLengthProvider
A Write Ahead Log (WAL) provides a service for reading and writing WALEdits. This interface provides APIs for WAL users (such as the RegionServer) to use the WAL (append, sync, etc.). Note that some internals, such as log rolling and performance evaluation tools, use WAL.equals to determine whether they have already seen a given WAL.
  • Method Details

    • registerWALActionsListener

      Registers WALActionsListener
    • unregisterWALActionsListener

      Unregisters WALActionsListener
    • rollWriter

      Roll the log writer. That is, start writing log messages to a new file.

      The implementation is synchronized to make sure that only one rollWriter is running at any given time.

      Returns:
      If there are too many logs, the regions whose stores should be flushed so that the logs can be cleaned on the next pass; null if there is nothing to flush. Names are actual region names as returned by RegionInfo.getEncodedName().
      Throws:
      FailedLogCloseException
      IOException
    • rollWriter

      Map<byte[],List<byte[]>> rollWriter(boolean force) throws IOException
      Roll the log writer. That is, start writing log messages to a new file.

      The implementation is synchronized to make sure that only one rollWriter is running at any given time.

      Parameters:
      force - If true, force creation of a new writer even if no entries have been written to the current writer.

      Returns:
      If there are too many logs, the regions whose stores should be flushed so that the logs can be cleaned on the next pass; null if there is nothing to flush. Names are actual region names as returned by RegionInfo.getEncodedName().
      Throws:
      IOException
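
The return contract above (a map that may be null) can be handled along these lines. This is a minimal sketch, not HBase code: `RollResultSketch` and `describeFlushTargets` are hypothetical names, and it assumes the map keys are encoded region names and the list values are store (family) names, both readable as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RollResultSketch {
    /**
     * Hypothetical helper: flattens a rollWriter-style result into
     * "region/family" strings. A null map is treated as "nothing to flush".
     */
    static List<String> describeFlushTargets(Map<byte[], List<byte[]>> regionsToFlush) {
        List<String> targets = new ArrayList<>();
        if (regionsToFlush == null) {
            return targets; // rollWriter documents null as "nothing to flush"
        }
        for (Map.Entry<byte[], List<byte[]>> entry : regionsToFlush.entrySet()) {
            String region = new String(entry.getKey(), StandardCharsets.UTF_8);
            for (byte[] family : entry.getValue()) {
                targets.add(region + "/" + new String(family, StandardCharsets.UTF_8));
            }
        }
        return targets;
    }
}
```

Because `byte[]` keys compare by identity in a plain `HashMap`, callers that need lookups by content would likely want a sorted map with a byte-wise comparator; iterating the entries, as done here, works with any map.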
    • shutdown

      void shutdown() throws IOException
      Stop accepting new writes. If we have unsynced writes still in buffer, sync them. Extant edits are left in place in backing storage to be replayed later.
      Throws:
      IOException
    • close

      void close() throws IOException
      Caller no longer needs any edits from this WAL. Implementers are free to reclaim underlying resources after this call; i.e. filesystem based WALs can archive or delete files.
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
    • appendData

      long appendData(RegionInfo info, WALKeyImpl key, WALEdit edits) throws IOException
      Append a set of data edits to the WAL. 'Data' here means that the content in the edits will also have transitioned through the memstore.

      The WAL is not flushed/synced after this transaction completes, but on return the edit must have its region edit/sequence id assigned; otherwise the unification of mvcc and sequence id breaks. On return, key will have the region edit/sequence id filled in.

      Parameters:
      info - The RegionInfo associated with the append.
      key - Modified by this call; the edit's region edit/sequence id is added to it.
      edits - Edits to append. MAY CONTAIN NO EDITS, for the case where we want to get an edit sequence id that is after all currently appended edits.
      Returns:
      A 'transaction id'. On return, key will have the region edit/sequence id in it.
      Throws:
      IOException
    • appendMarker

      long appendMarker(RegionInfo info, WALKeyImpl key, WALEdit edits) throws IOException
      Append an operational 'meta' event marker edit to the WAL. A marker meta edit could be a FlushDescriptor, a compaction marker, or a region event marker, e.g. region open or region close. The difference between a 'marker' append and a 'data' append, as in appendData(RegionInfo, WALKeyImpl, WALEdit), is that a marker will not have transitioned through the memstore.

      The WAL is not flushed/synced after this transaction completes, but on return the edit must have its region edit/sequence id assigned; otherwise the unification of mvcc and sequence id breaks. On return, key will have the region edit/sequence id filled in.

      Parameters:
      info - The RegionInfo associated with the append.
      key - Modified by this call; the edit's region edit/sequence id is added to it.
      edits - Edits to append. MAY CONTAIN NO EDITS, for the case where we want to get an edit sequence id that is after all currently appended edits.
      Returns:
      A 'transaction id'. On return, key will have the region edit/sequence id in it.
      Throws:
      IOException
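
The append contract shared by appendData and appendMarker (the key is stamped with a sequence id on return, a transaction id is returned, and nothing is synced yet) can be illustrated with a toy in-memory model. This is a sketch under stated assumptions: `AppendSketch` and its nested `Key` are hypothetical stand-ins, not the HBase `WALKeyImpl` or any real WAL implementation, and edits are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;

final class AppendSketch {
    /** Hypothetical stand-in for a WAL key: carries the id assigned by append. */
    static final class Key {
        long sequenceId = -1L; // unassigned until append returns
    }

    private final List<String> buffer = new ArrayList<>(); // appended, not yet synced
    private long nextSequenceId = 1L;
    private long nextTxid = 1L;

    /**
     * Appends the edits (the list may be empty), stamps the key with the next
     * sequence id, and returns a transaction id. No sync happens here.
     */
    synchronized long append(Key key, List<String> edits) {
        key.sequenceId = nextSequenceId++;
        buffer.addAll(edits);
        return nextTxid++;
    }

    /** Number of edits that have been appended but not synced. */
    synchronized int unsyncedEdits() {
        return buffer.size();
    }
}
```

Appending an empty edit list still advances the sequence id, which mirrors the documented use of an empty append to obtain a sequence id that is after all currently appended edits.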
    • updateStore

      void updateStore(byte[] encodedRegionName, byte[] familyName, Long sequenceid, boolean onlyIfGreater)
      Updates the sequence number of a specific store. Depending on the flag, the current sequence number is replaced either only if the given sequence id is bigger, or unconditionally, even if the given id is lower than the existing one.
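
The effect of the onlyIfGreater flag can be sketched with a small map-backed model. This is an illustration only: `StoreSeqSketch` is a hypothetical class, stores are keyed by a plain string rather than by region and family byte arrays, and storing the value when no sequence id is recorded yet is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class StoreSeqSketch {
    private final Map<String, Long> seqByStore = new HashMap<>();

    /**
     * Records sequenceId for the store. With onlyIfGreater set, an existing
     * value is replaced only by a bigger one; otherwise it is always replaced.
     */
    void updateStore(String store, long sequenceId, boolean onlyIfGreater) {
        Long current = seqByStore.get(store);
        // Assumption: a store with no recorded value always accepts the update.
        if (current == null || !onlyIfGreater || sequenceId > current) {
            seqByStore.put(store, sequenceId);
        }
    }

    /** Returns the recorded sequence id, or null if none is recorded. */
    Long get(String store) {
        return seqByStore.get(store);
    }
}
```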
    • sync

      void sync() throws IOException
      Sync what we have in the WAL.
      Throws:
      WALSyncTimeoutIOException - if the sync times out.
      IOException
    • sync

      void sync(long txid) throws IOException
      Sync the WAL if the given txid has not already been synced.
      Parameters:
      txid - Transaction id to sync to.
      Throws:
      WALSyncTimeoutIOException - if the sync times out.
      IOException
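
The "sync only if this txid is not already synced" behavior can be modeled as follows. This is a minimal sketch: `SyncSketch` is a hypothetical class that only counts syncs, and the choice to advance the synced mark to the latest appended txid on each sync is an assumption made for illustration, not a statement about any HBase implementation.

```java
final class SyncSketch {
    private long highestAppendedTxid = 0L;
    private long highestSyncedTxid = 0L;
    private int physicalSyncs = 0;

    /** Simulates an append and returns its transaction id. */
    synchronized long append() {
        return ++highestAppendedTxid;
    }

    /** Syncs everything appended so far, unless txid is already covered. */
    synchronized void sync(long txid) {
        if (txid <= highestSyncedTxid) {
            return; // an earlier sync already covered this transaction
        }
        highestSyncedTxid = highestAppendedTxid;
        physicalSyncs++;
    }

    /** Number of syncs that actually did work. */
    synchronized int physicalSyncs() {
        return physicalSyncs;
    }
}
```

In this model, two appends followed by `sync(t2)` and then `sync(t1)` perform a single sync, because the second call finds its txid already covered.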
    • sync

      default void sync(boolean forceSync) throws IOException
      Parameters:
      forceSync - Flag to force sync rather than flushing to the buffer. Example - Hadoop hflush vs hsync.
      Throws:
      WALSyncTimeoutIOException - if the sync times out.
      IOException
    • sync

      default void sync(long txid, boolean forceSync) throws IOException
      Parameters:
      txid - Transaction id to sync to.
      forceSync - Flag to force sync rather than flushing to the buffer. Example - Hadoop hflush vs hsync.
      Throws:
      WALSyncTimeoutIOException - if the sync times out.
      IOException
    • startCacheFlush

      Long startCacheFlush(byte[] encodedRegionName, Set<byte[]> families)
      The WAL keeps track of the sequence numbers that have not yet been flushed from the memstores, so that it can work out which WALs can be let go. This method tells the WAL that some region is about to flush. The flush can cover the whole region or only a subset of the region's column families.

      Currently, it is expected that the update lock is held for the region; i.e. no concurrent appends while we set up cache flush.

      Parameters:
      families - Families to flush. May be a subset of all families in the region.
      Returns:
      HConstants.NO_SEQNUM if we are flushing the whole region, or if we are flushing a subset of all families but there are no edits in the families not being flushed; in other words, the flush is effectively the same as a flush of the whole region even though we were passed a subset of families. Otherwise, the sequence id of the oldest/lowest outstanding edit.
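
The return rule can be worked through with a small model that looks only at the families left out of the flush. This is a sketch under assumptions: `FlushSeqSketch` is hypothetical, families are plain strings, the input map holds the lowest unflushed sequence id per family that has edits, and `NO_SEQNUM` is assumed to be -1 to mirror HConstants.NO_SEQNUM.

```java
import java.util.Map;
import java.util.Set;

final class FlushSeqSketch {
    static final long NO_SEQNUM = -1L; // assumed to mirror HConstants.NO_SEQNUM

    /**
     * Returns the lowest outstanding sequence id among families that are NOT
     * being flushed, or NO_SEQNUM if no such family has unflushed edits.
     */
    static long startCacheFlush(Map<String, Long> lowestUnflushedSeqByFamily,
                                Set<String> familiesToFlush) {
        long lowest = Long.MAX_VALUE;
        for (Map.Entry<String, Long> entry : lowestUnflushedSeqByFamily.entrySet()) {
            if (!familiesToFlush.contains(entry.getKey())) {
                lowest = Math.min(lowest, entry.getValue());
            }
        }
        return lowest == Long.MAX_VALUE ? NO_SEQNUM : lowest;
    }
}
```

With families `a` (lowest id 5) and `b` (lowest id 9), flushing both yields `NO_SEQNUM`, flushing only `a` yields 9, and flushing only `b` yields 5.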
    • startCacheFlush

      Long startCacheFlush(byte[] encodedRegionName, Map<byte[],Long> familyToSeq)
    • completeCacheFlush

      void completeCacheFlush(byte[] encodedRegionName, long maxFlushedSeqId)
      Complete the cache flush.
      Parameters:
      encodedRegionName - Encoded region name.
      maxFlushedSeqId - The maxFlushedSeqId for this flush. There is no edit in memory that is less than this sequence id.
    • abortCacheFlush

      void abortCacheFlush(byte[] encodedRegionName)
      Abort a cache flush. Call this if the flush fails. Note that the only recovery for an aborted flush is currently a restart of the regionserver, so that the snapshot content dropped by the failure gets restored to the memstore.
      Parameters:
      encodedRegionName - Encoded region name.
    • getCoprocessorHost

      Returns Coprocessor host.
    • getEarliestMemStoreSeqNum

      @Deprecated long getEarliestMemStoreSeqNum(byte[] encodedRegionName)
      Deprecated.
      Since version 1.2.0. To be removed because it is not used and exposes subtle internal workings. Use getEarliestMemStoreSeqNum(byte[], byte[]) instead.
      Gets the earliest unflushed sequence id in the memstore for the region.
      Parameters:
      encodedRegionName - The region to get the number for.
      Returns:
      The earliest/lowest/oldest sequence id if present, HConstants.NO_SEQNUM if absent.
    • getEarliestMemStoreSeqNum

      long getEarliestMemStoreSeqNum(byte[] encodedRegionName, byte[] familyName)
      Gets the earliest unflushed sequence id in the memstore for the store.
      Parameters:
      encodedRegionName - The region to get the number for.
      familyName - The family to get the number for.
      Returns:
      The earliest/lowest/oldest sequence id if present, HConstants.NO_SEQNUM if absent.
    • toString

      Human readable identifying information about the state of this WAL. Implementors are encouraged to include information appropriate for debugging. Consumers are advised not to rely on the details of the returned String; it does not have a defined structure.
      Overrides:
      toString in class Object
    • getTimestamp

      static long getTimestamp(String name)
      Split a WAL filename to get its start time. WALs usually have the time at which we started writing to them as part of their name, usually as the suffix. For example, a WAL might be named 10.20.20.171%3A60020.1277499063250, where 1277499063250 is the timestamp. Sometimes there is an extra suffix: a meta table WAL adds a '.meta' suffix, and a synchronous replication WAL adds a '.syncrep' suffix. The method checks for these. A file may also have no timestamp at all. For example, recovered.edits files are WALs but are named in ascending order, such as 0000000000000016310. The method allows for this.
      Parameters:
      name - Name of the WAL file.
      Returns:
      Timestamp or -1.
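
The naming rules described for getTimestamp can be approximated with a regular expression. This is a minimal sketch, not the HBase implementation: `WalNameSketch`, the pattern, and the `NO_TIMESTAMP` constant are illustrative assumptions. The pattern expects a prefix, a dot, a run of digits, and an optional single alphanumeric suffix such as `.meta` or `.syncrep`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WalNameSketch {
    static final long NO_TIMESTAMP = -1L;

    // Assumed shape: <prefix>.<digits>[.<suffix>]; illustrative only.
    private static final Pattern NAME =
        Pattern.compile("(.+)\\.(\\d+)(\\.[0-9A-Za-z]+)?");

    /** Returns the timestamp embedded in the name, or -1 if there is none. */
    static long getTimestamp(String name) {
        Matcher m = NAME.matcher(name);
        // A name with no dot, such as a recovered.edits file, does not match.
        return m.matches() ? Long.parseLong(m.group(2)) : NO_TIMESTAMP;
    }
}
```

For instance, `WalNameSketch.getTimestamp("10.20.20.171%3A60020.1277499063250.meta")` extracts 1277499063250, while a name such as `0000000000000016310` yields -1 because it contains no dot-separated timestamp field. The sketch does not guard against digit runs too long for a `long`.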