Class HBCKServerCrashProcedure

java.lang.Object
org.apache.hadoop.hbase.procedure2.Procedure<TEnvironment>
org.apache.hadoop.hbase.procedure2.StateMachineProcedure<MasterProcedureEnv,org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProcedureProtos.ServerCrashState>
org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure
org.apache.hadoop.hbase.master.procedure.HBCKServerCrashProcedure
All Implemented Interfaces:
Comparable<Procedure<MasterProcedureEnv>>, ServerProcedureInterface

@Private public class HBCKServerCrashProcedure extends ServerCrashProcedure
Behaves like the superclass in all cases except when no Regions are found in the current Master in-memory context. In that case, when the call to super#getRegionsOnCrashedServer returns nothing, this SCP scans hbase:meta for references to the passed ServerName and cleans up any it finds.

This version of SCP is for external invocation as part of fix-up (e.g. HBCK2's scheduleRecoveries); the superclass is used during normal recovery operations. It handles the case where hbase:meta has references to 'Unknown Servers': servers that appear in hbase:meta but are in neither the live-server nor the dead-server list, i.e. Master and hbase:meta content have diverged. This should never happen in a normally running cluster, but if the Master does drop its accounting of servers, a means of fix-up is needed. Eventually the CatalogJanitor, as part of its normal task, could repair these 'Unknown Servers' rather than just identify them, queuing something like this HBCKSCP to clean up and reassign so that Master and hbase:meta are aligned again.

NOTE: this SCP is costly to run; it does a full scan of hbase:meta.
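
The definition of an 'Unknown Server' above amounts to a set difference. The following is a minimal, self-contained sketch of that idea only; it does not use HBase classes, and the class name, method name, and the use of plain strings for server names are all hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; not part of HBase.
public class UnknownServersSketch {

    /**
     * Returns the servers referenced in (a simulated) hbase:meta that appear in
     * neither the live-server nor the dead-server set.
     */
    public static Set<String> unknownServers(Set<String> referencedInMeta,
                                             Set<String> liveServers,
                                             Set<String> deadServers) {
        // Copy first so the caller's set is not modified.
        Set<String> unknown = new TreeSet<>(referencedInMeta);
        unknown.removeAll(liveServers);
        unknown.removeAll(deadServers);
        return unknown;
    }

    public static void main(String[] args) {
        Set<String> inMeta = new TreeSet<>(Set.of("rs1,16020,1", "rs2,16020,1", "rs3,16020,1"));
        Set<String> live = Set.of("rs1,16020,1");
        Set<String> dead = Set.of("rs2,16020,1");
        // Only the server the Master has lost track of remains.
        System.out.println(unknownServers(inMeta, live, dead));
    }
}
```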

  • Field Details

    • LOG

      private static final org.slf4j.Logger LOG
  • Constructor Details

    • HBCKServerCrashProcedure

      public HBCKServerCrashProcedure(MasterProcedureEnv env, ServerName serverName, boolean shouldSplitWal, boolean carryingMeta)
      Parameters:
      serverName - Name of the crashed server.
      shouldSplitWal - True if we should split WALs as part of crashed server processing.
      carryingMeta - True if carrying hbase:meta table region.
    • HBCKServerCrashProcedure

      Used when deserializing from a procedure store; we'll construct one of these then call #deserializeStateData(InputStream). Do not use directly.
  • Method Details

    • getRegionsOnCrashedServer

      If no Regions are found in the Master context, search hbase:meta for references to the passed server. The operator may have passed the ServerName because they found references to 'Unknown Servers' and are using HBCKSCP to clear them out.
      Overrides:
      getRegionsOnCrashedServer in class ServerCrashProcedure
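
      The fallback described above can be sketched with plain collections standing in for the Master's in-memory state and for hbase:meta. This is an illustration under stated assumptions, not the actual implementation: the class, method, and parameter names are hypothetical, and the real procedure works with HBase types rather than strings and maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of HBase.
public class HbckScpFallbackSketch {

    /**
     * Returns the regions for the crashed server from the in-memory view if it
     * has any; otherwise walks every row of the simulated meta table looking
     * for references to that server.
     */
    public static List<String> regionsOnCrashedServer(
            String crashedServer,
            Map<String, List<String>> inMemoryRegionsByServer,
            Map<String, String> metaRegionToServer) {
        List<String> inMemory =
            inMemoryRegionsByServer.getOrDefault(crashedServer, Collections.emptyList());
        if (!inMemory.isEmpty()) {
            // Normal case: behave like the superclass.
            return inMemory;
        }
        // Fallback: full pass over meta, which is why this path is costly.
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, String> row : metaRegionToServer.entrySet()) {
            if (crashedServer.equals(row.getValue())) {
                found.add(row.getKey());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, List<String>> inMemory = new LinkedHashMap<>();
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("region-a", "rs1,16020,1");
        meta.put("region-b", "rs2,16020,1");
        meta.put("region-c", "rs1,16020,1");
        // The Master knows nothing about rs1, so the meta scan supplies the answer.
        System.out.println(regionsOnCrashedServer("rs1,16020,1", inMemory, meta));
    }
}
```

      Every row is visited in the fallback path, which mirrors the cost noted in the class description.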
    • isMatchingRegionLocation

      protected boolean isMatchingRegionLocation(RegionStateNode rsn)
      The RegionStateNode will not have a location if a confirm of an OPEN fails; on failure, the RegionStateNode regionLocation is set to null. This test is therefore 'looser' than the one done in the superclass. The HBCKSCP has been scheduled by an operator via hbck2, probably at the behest of a report of an 'Unknown Server' in the 'HBCK Report'. Let the operator's operation succeed even when the region location in the RegionStateNode is null.
      Overrides:
      isMatchingRegionLocation in class ServerCrashProcedure
      Returns:
      True if the region location in rsn matches that of this crashed server.
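
      The difference between a strict and a 'looser' location test can be sketched as follows. This is a hedged illustration, not the HBase source: the class and method names are hypothetical, server names are modeled as strings, and the strict variant is an assumption about what an exact-match test looks like rather than a copy of the superclass.

```java
// Hypothetical illustration; not part of HBase.
public class LooseLocationMatchSketch {

    /** Assumed strict test: the recorded location must equal the crashed server. */
    public static boolean strictMatch(String regionLocation, String crashedServer) {
        // equals(null) is false, so a region with no location never matches.
        return crashedServer.equals(regionLocation);
    }

    /** Looser test: a null location (e.g. after a failed OPEN confirm) also matches. */
    public static boolean looseMatch(String regionLocation, String crashedServer) {
        return regionLocation == null || strictMatch(regionLocation, crashedServer);
    }

    public static void main(String[] args) {
        String crashed = "rs1,16020,1";
        System.out.println("strict, null location: " + strictMatch(null, crashed));
        System.out.println("loose, null location:  " + looseMatch(null, crashed));
        System.out.println("loose, other server:   " + looseMatch("rs2,16020,1", crashed));
    }
}
```

      The looser test still rejects a region whose location names a different server; it only widens the match to regions with no recorded location.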