Class Procedure<TEnvironment>
java.lang.Object
org.apache.hadoop.hbase.procedure2.Procedure<TEnvironment>
- All Implemented Interfaces:
Comparable<Procedure<TEnvironment>>
- Direct Known Subclasses:
ClaimReplicationQueuesProcedure
,FailedProcedure
,FlushRegionProcedure
,LockProcedure
,OnePhaseProcedure
,ProcedureInMemoryChore
,RegionRemoteProcedureBase
,RegionTransitionProcedure
,SequentialProcedure
,ServerRemoteProcedure
,SnapshotRegionProcedure
,StateMachineProcedure
,TwoPhaseProcedure
@Private
public abstract class Procedure<TEnvironment>
extends Object
implements Comparable<Procedure<TEnvironment>>
Base Procedure class responsible for Procedure Metadata; e.g. state, submittedTime, lastUpdate,
stack-indexes, etc.
Procedures are run by a
ProcedureExecutor
instance. They are submitted and then the
ProcedureExecutor keeps calling execute(Object)
until the Procedure is done. Execute may
be called multiple times in the case of failure or a restart, so code must be idempotent. The
return from an execute call is either: null to indicate we are done; ourself if there is more to
do; or, a set of sub-procedures that need to be run to completion before the framework resumes
our execution.
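The return contract of execute can be illustrated with a small stand-alone sketch. Everything here (MiniProcedure, CountingStep, Parent, and the toy run loop) is a hypothetical stand-in written for illustration; it is not the HBase Procedure or ProcedureExecutor API, and it ignores persistence, locking, and failure handling.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the execute() return contract; none of these types are HBase classes. */
class ExecuteContractSketch {

  /** Hypothetical, simplified procedure with no environment or store. */
  abstract static class MiniProcedure {
    final String name;
    MiniProcedure(String name) { this.name = name; }
    /** Returns null when done, {this} when there is more to do, or children to run first. */
    abstract MiniProcedure[] execute(List<String> log);
  }

  /** A leaf procedure that needs a fixed number of execute() calls. */
  static class CountingStep extends MiniProcedure {
    private int remaining;
    CountingStep(String name, int steps) { super(name); this.remaining = steps; }
    @Override MiniProcedure[] execute(List<String> log) {
      log.add(name + ":step");
      remaining--;
      return remaining > 0 ? new MiniProcedure[] { this } : null;
    }
  }

  /** A parent that spawns two children on the first call and finishes on the second. */
  static class Parent extends MiniProcedure {
    private boolean spawned;
    Parent() { super("parent"); }
    @Override MiniProcedure[] execute(List<String> log) {
      if (!spawned) {
        spawned = true;
        log.add(name + ":spawn");
        return new MiniProcedure[] { new CountingStep("a", 2), new CountingStep("b", 1) };
      }
      log.add(name + ":finish");
      return null;
    }
  }

  /** Toy single-threaded executor: keeps calling execute() until the procedure returns null. */
  static List<String> run(MiniProcedure root) {
    List<String> log = new ArrayList<>();
    runToCompletion(root, log);
    return log;
  }

  private static void runToCompletion(MiniProcedure proc, List<String> log) {
    while (true) {
      MiniProcedure[] next = proc.execute(log);
      if (next == null) {
        return; // done
      }
      if (next.length == 1 && next[0] == proc) {
        continue; // more to do: call execute() again
      }
      for (MiniProcedure child : next) {
        runToCompletion(child, log); // children finish before the parent resumes
      }
    }
  }

  public static void main(String[] args) {
    System.out.println(run(new Parent()));
  }
}
```

The real executor is multi-threaded and persists state between steps, which is why the actual execute implementations must be idempotent; the sketch only shows the three possible return shapes.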
The ProcedureExecutor keeps its notion of Procedure State in the Procedure itself; e.g. it stamps
the Procedure as INITIALIZING, RUNNABLE, SUCCESS, etc. Here are some of the States defined in the
ProcedureState enum from protos:
- isFailed(): A procedure has executed at least once and has failed. The procedure may or may not have rolled back yet. Any procedure in FAILED state will eventually be moved to ROLLEDBACK state.
- isSuccess(): A procedure has completed successfully without exception.
- isFinished(): As a procedure in FAILED state will be tried forever for rollback, the only condition under which the scheduler/executor will drop a procedure from further processing is when the procedure state is ROLLEDBACK or isSuccess() returns true. This is a terminal state of the procedure.
- isWaiting(): Procedure is in one of the two waiting states (ProcedureProtos.ProcedureState.WAITING, ProcedureProtos.ProcedureState.WAITING_TIMEOUT).
hasLock(). The lock implementation is up to the implementor. If an entity needs to be
locked for the life of a procedure -- not just the calls to execute -- then implementations
should say so with the holdLock(Object) method.
Since we need to restore the lock when restarting to keep the logic correct (HBASE-20846), the
implementation is a bit tricky, so we add some comments here about it.
- Make the hasLock() method final, and add a locked field in Procedure to record whether we have the lock. We will set it to true in doAcquireLock(Object, ProcedureStore) and to false in doReleaseLock(Object, ProcedureStore). The subclasses do not need to manage it any more.
- Also added a locked field in the proto message. When storing, the field will be set according to the return value of hasLock(). And when loading, there is a new field in Procedure called lockedWhenLoading. We will set it to true if the locked field in the proto message is true.
- The reason why we cannot set the locked field directly to true by calling doAcquireLock(Object, ProcedureStore) is that, during initialization, most procedures need to wait until the master is initialized. So the solution here is that we introduced a new method called waitInitialized(Object) in Procedure, and moved the wait-for-master-initialized code from acquireLock(Object) to this method. We also added a restoreLock method to Procedure: if lockedWhenLoading is true, we will call acquireLock(Object) to get the lock, but do not set locked to true. Later, when we call doAcquireLock(Object, ProcedureStore) and pass the waitInitialized(Object) check, we will test lockedWhenLoading; if it is true, then we just set the locked field to true and return, without actually calling the acquireLock(Object) method since we have already called it once.
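The lock-restore steps described above can be modelled in a few lines. This is a simplified sketch under stated assumptions: MiniProcedure, the initialized flag, and the acquireCalls counter are hypothetical, the environment and ProcedureStore parameters are omitted, and returning LOCK_EVENT_WAIT while waiting for initialization is an assumption rather than something this page states.

```java
/** Toy model of the locked / lockedWhenLoading handshake; not the HBase implementation. */
class LockRestoreSketch {

  enum LockState { LOCK_ACQUIRED, LOCK_YIELD_WAIT, LOCK_EVENT_WAIT }

  static class MiniProcedure {
    private boolean locked;            // whether we hold the lock, managed by the base class
    private boolean lockedWhenLoading; // set from the stored "locked" flag on load
    int acquireCalls;                  // hypothetical counter, for demonstration only
    boolean initialized;               // stands in for "master is initialized"

    boolean waitInitialized() { return !initialized; }

    LockState acquireLock() { acquireCalls++; return LockState.LOCK_ACQUIRED; }

    void releaseLock() { }

    final boolean hasLock() { return locked; }

    /** Called on load when the stored procedure said it held the lock. */
    final void lockedWhenLoading() { lockedWhenLoading = true; }

    /** Re-takes the underlying lock at restart, but leaves 'locked' false for now. */
    final void restoreLock() {
      if (lockedWhenLoading) {
        acquireLock();
      }
    }

    final LockState doAcquireLock() {
      if (waitInitialized()) {
        return LockState.LOCK_EVENT_WAIT; // assumption: wait for an initialization event
      }
      if (lockedWhenLoading) {
        lockedWhenLoading = false; // reset so it is not considered again
        locked = true;             // lock was already taken in restoreLock()
        return LockState.LOCK_ACQUIRED;
      }
      LockState state = acquireLock();
      if (state == LockState.LOCK_ACQUIRED) {
        locked = true;
      }
      return state;
    }

    final void doReleaseLock() {
      locked = false;
      releaseLock();
    }
  }

  /** Simulates a restart of a procedure that held its lock before the restart. */
  static String simulateRestart() {
    MiniProcedure p = new MiniProcedure();
    p.lockedWhenLoading();
    p.restoreLock();
    boolean hasLockAfterRestore = p.hasLock();
    LockState beforeInit = p.doAcquireLock();
    p.initialized = true;
    LockState afterInit = p.doAcquireLock();
    return hasLockAfterRestore + "," + beforeInit + "," + afterInit + "," + p.hasLock()
        + "," + p.acquireCalls;
  }

  public static void main(String[] args) {
    System.out.println(simulateRestart());
  }
}
```

The point of the split is visible in the counter: acquireLock runs exactly once across restoreLock and the later doAcquireLock call.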
setTimeout(int), and setTimeoutFailure(Object). See TestProcedureEvents and the TestTimeoutEventProcedure class for an example usage.
There are hooks for collecting metrics on submit of the procedure and on finish. See updateMetricsOnSubmit(Object) and updateMetricsOnFinish(Object, long, boolean).
-
Nested Class Summary
-
Field Summary
Modifier and Type | Field | Description
private boolean | bypass | Used for override complete of the procedure without actually doing any logic in the procedure.
private int | childrenLatch |
private RemoteProcedureException | exception |
private long | lastUpdate |
private boolean | locked |
private boolean | lockedWhenLoading |
private static final org.slf4j.Logger | LOG |
static final long | NO_PROC_ID |
protected static final int | NO_TIMEOUT |
private NonceKey | nonceKey |
private String | owner |
private long | parentProcId |
private boolean | persist | Indicate whether we need to persist the procedure to ProcedureStore after execution.
private long | procId |
private byte[] | result |
private long | rootProcId |
private int[] | stackIndexes |
private org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureState | state |
private long | submittedTime |
private int | timeout |
private boolean | wasExecuted |
-
Constructor Summary
-
Method Summary
Modifier and TypeMethodDescriptionprotected abstract boolean
abort
(TEnvironment env) The abort() call is asynchronous and each procedure must decide how to deal with it, if they want to be abortable.protected Procedure.LockState
acquireLock
(TEnvironment env) The user should override this method if they need a lock on an Entity.protected void
addStackIndex
(int index) Called by the RootProcedureState on procedure execution.protected void
afterReplay
(TEnvironment env) Called when the procedure is ready to be added to the queue after the loading/replay operation.protected void
beforeReplay
(TEnvironment env) Called when the procedure is loaded for replay.protected void
bypass
(TEnvironment env) Set the bypass to true.private boolean
Called by the ProcedureExecutor to notify that one of the sub-procedures has completed.int
compareTo
(Procedure<TEnvironment> other) protected void
Called when the procedure is marked as completed (success or rollback).protected abstract void
deserializeStateData
(ProcedureStateSerializer serializer) Called on store load to allow the user to decode the previously serialized state.(package private) final Procedure.LockState
doAcquireLock
(TEnvironment env, ProcedureStore store) Internal method called by the ProcedureExecutor that starts the user-level code acquireLock().protected Procedure<TEnvironment>[]
doExecute
(TEnvironment env) Internal method called by the ProcedureExecutor that starts the user-level code execute().(package private) final void
doReleaseLock
(TEnvironment env, ProcedureStore store) Internal method called by the ProcedureExecutor that starts the user-level code releaseLock().protected void
doRollback
(TEnvironment env) Internal method called by the ProcedureExecutor that starts the user-level code rollback().long
Returns the time elapsed between the last update and the start time of the procedure.protected abstract Procedure<TEnvironment>[]
execute
(TEnvironment env) The main code of the procedure.protected int
long
getOwner()
long
protected ProcedureMetrics
Override this method to provide procedure specific counters for submitted count, failed count and time histogram.long
static long
getProcIdHashCode
(long procId) Get an hashcode for the specified Procedure IDbyte[]
Returns the serialized result if any, otherwise nullprotected static <T> Long
getRootProcedureId
(Map<Long, Procedure<T>> procedures, Procedure<T> proc) Helper to lookup the root Procedure ID given a specified procedure.long
protected int[]
org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureState
getState()
long
int
Returns the timeout in msecprotected long
Timeout of the next timeout.protected boolean
boolean
final boolean
hasLock()
This is used in conjunction with holdLock(Object).
boolean
hasOwner()
boolean
boolean
static boolean
haveSameParent
(Procedure<?> a, Procedure<?> b) protected boolean
holdLock
(TEnvironment env) Used to keep the procedure lock even when the procedure is yielding or suspended.protected void
Called by the ProcedureExecutor on procedure-load to restore the latch stateboolean
isBypass()
boolean
isFailed()
Returns true if the procedure has failed.boolean
boolean
boolean
Can only be called when restarting, before the procedure is actually executed, as after we actually call the doAcquireLock(Object, ProcedureStore) method, we will reset lockedWhenLoading to false.
protected boolean
Return whether the procedure supports rollback.boolean
Returns true if the procedure is in a RUNNABLE state.boolean
Returns true if the procedure is finished successfully.boolean
Returns true if the procedure is waiting for a child to finish or for an external event.protected boolean
By default, the procedure framework/executor will try to run procedures start to finish.(package private) final void
Will only be called when loading procedures from procedure store, where we need to record whether the procedure has already held a lock.(package private) boolean
protected void
releaseLock
(TEnvironment env) The user should override this method, and release lock if necessary.protected boolean
(package private) void
(package private) final void
restoreLock
(TEnvironment env) protected abstract void
rollback
(TEnvironment env) The code to undo what was done by the execute() code.protected abstract void
serializeStateData
(ProcedureStateSerializer serializer) The user-level code of the procedure may have some state to persist (e.g.protected void
setAbortFailure
(String source, String msg) protected void
setChildrenLatch
(int numChildren) Called by the ProcedureExecutor on procedure-load to restore the latch stateprotected void
protected void
setFailure
(String source, Throwable cause) protected void
setFailure
(RemoteProcedureException exception) protected void
setLastUpdate
(long lastUpdate) Called on store load to initialize the Procedure internals after the creation/deserialization.protected void
setNonceKey
(NonceKey nonceKey) Called by the ProcedureExecutor to set the value to the newly created procedure.void
void
protected void
setParentProcId
(long parentProcId) Called by the ProcedureExecutor to assign the parent to the newly created procedure.protected void
setProcId
(long procId) Called by the ProcedureExecutor to assign the ID to the newly created procedure.protected void
setResult
(byte[] result) The procedure may leave a "result" on completion.protected void
setRootProcId
(long rootProcId) protected void
setStackIndexes
(List<Integer> stackIndexes) Called on store load to initialize the Procedure internals after the creation/deserialization.protected void
setState
(org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureState state) protected void
setSubmittedTime
(long submittedTime) Called on store load to initialize the Procedure internals after the creation/deserialization.protected void
setTimeout
(int timeout) protected boolean
Called by the ProcedureExecutor when the timeout set by setTimeout() is expired.protected boolean
By default, the executor will keep the procedure result around until the eviction TTL is expired.
protected final void
protected final ProcedureSuspendedException
suspend
(int timeoutMillis, boolean jitter) toString()
protected String
protected void
toStringClassDetails
(StringBuilder builder) Extend the toString() information with the procedure details e.g.Extend the toString() information with more procedure detailsprotected StringBuilder
Build the StringBuilder for the simple form of procedure string.protected void
toStringState
(StringBuilder builder) Called from toString() when interpolating Procedure State.
(package private) boolean
Try to set this procedure into RUNNABLE state.protected void
updateMetricsOnFinish
(TEnvironment env, long runtime, boolean success) This function will be called just after procedure execution is finished.protected void
This function will be called just when procedure is submitted for execution.protected void
Called by ProcedureExecutor after each time a procedure step is executed.protected boolean
The doAcquireLock(Object, ProcedureStore) will be split into two steps: first, it will call us to determine whether we need to wait for initialization; second, it will call acquireLock(Object) to actually handle the lock for this procedure.
boolean
-
Field Details
-
LOG
-
NO_PROC_ID
- See Also:
-
NO_TIMEOUT
- See Also:
-
nonceKey
-
owner
-
parentProcId
-
rootProcId
-
procId
-
submittedTime
-
state
-
exception
-
stackIndexes
-
childrenLatch
-
wasExecuted
-
timeout
-
lastUpdate
-
result
-
locked
-
lockedWhenLoading
-
bypass
Used for override complete of the procedure without actually doing any logic in the procedure. If bypass is set to true, when executing it will return null when doExecute(Object) is called to finish the procedure and release any locks it may currently hold. The bypass does cleanup around the Procedure as far as the Procedure framework is concerned. It does not clean any internal state that the Procedures themselves may have set. That is for the Procedures to do themselves when bypass is called. They should override bypass and do their cleanup in the overridden bypass method (be sure to call the parent bypass to ensure proper processing).
The abort(Object) method is overridable; some procedures may have chosen to ignore the aborting.
-
persist
Indicate whether we need to persist the procedure to ProcedureStore after execution. Defaults to true, and the implementation can call skipPersistence() to let the framework skip the persistence of the procedure. This is useful when the procedure is in error and you want to retry later. The retry interval and the number of retries are usually not critical, so skipping the persistence can save some resources and also speed up the restart processing. Notice that this value will be reset to true every time before execution. And when rolling back we do not test this value.
-
-
Constructor Details
-
Procedure
public Procedure()
-
-
Method Details
-
isBypass
-
bypass
Set the bypass to true. Only called in ProcedureExecutor.bypassProcedure(long, long, boolean, boolean) for now. DO NOT use this method alone, since we can't just bypass one single procedure. We need to bypass its ancestor too. If your Procedure has set state, it needs to undo it in here.
- Parameters:
env
- Current environment. May be null because of context; e.g. pretty-printing procedure WALs where there is no 'environment' (and where Procedures that require an 'environment' won't be run).
-
needPersistence
boolean needPersistence() -
resetPersistence
void resetPersistence() -
skipPersistence
-
execute
protected abstract Procedure<TEnvironment>[] execute(TEnvironment env) throws ProcedureYieldException, ProcedureSuspendedException, InterruptedException
The main code of the procedure. It must be idempotent since execute() may be called multiple times in case of machine failure in the middle of the execution.
- Parameters:
env
- the environment passed to the ProcedureExecutor
- Returns:
- a set of sub-procedures to run or ourselves if there is more work to do or null if the procedure is done.
- Throws:
ProcedureYieldException
- the procedure will be added back to the queue and retried later.
InterruptedException
- the procedure will be added back to the queue and retried later.
ProcedureSuspendedException
- Signal to the executor that Procedure has suspended itself and has set itself up waiting for an external event to wake it back up again.
-
rollback
The code to undo what was done by the execute() code. It is called when the procedure or one of the sub-procedures failed or an abort was requested. It should clean up all the resources created by the execute() call. The implementation must be idempotent since rollback() may be called multiple times in case of machine failure in the middle of the execution.
- Parameters:
env
- the environment passed to the ProcedureExecutor
- Throws:
IOException
- temporary failure, the rollback will retry later
InterruptedException
- the procedure will be added back to the queue and retried later
-
abort
The abort() call is asynchronous and each procedure must decide how to deal with it, if they want to be abortable. The simplest implementation is to have an AtomicBoolean set in the abort() method and then have execute() check whether the abort flag is set. abort() may be called multiple times from the client, so the implementation must be idempotent.
NOTE: abort() is not like Thread.interrupt(). It is just a notification that allows the procedure implementor to abort.
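The AtomicBoolean approach mentioned above might look roughly like the following. AbortableWork and its methods are hypothetical stand-ins rather than an HBase Procedure subclass, and throwing an exception on abort is a simplification chosen to keep the sketch short; it is not how the framework reports an abort.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model of an abort flag checked by execute(); not an HBase Procedure. */
class AbortFlagSketch {

  static class AbortableWork {
    private final AtomicBoolean aborted = new AtomicBoolean(false);
    private int stepsDone;

    /** Asynchronous notification; idempotent because setting the flag twice changes nothing. */
    boolean abort() {
      aborted.set(true);
      return true;
    }

    /** Runs one step and returns true while there is more work to do. */
    boolean executeStep(int totalSteps) {
      if (aborted.get()) {
        // Simplification: a real procedure would record a failure and let rollback run.
        throw new IllegalStateException("aborted after " + stepsDone + " steps");
      }
      stepsDone++;
      return stepsDone < totalSteps;
    }
  }

  static String demo() {
    AbortableWork work = new AbortableWork();
    work.executeStep(5);
    work.executeStep(5);
    work.abort();
    work.abort(); // a second call is harmless
    try {
      work.executeStep(5);
      return "not aborted";
    } catch (IllegalStateException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Nothing interrupts the running step here: the flag is only observed the next time the step method is entered, which matches the note that abort() is a notification and not an interrupt.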
-
serializeStateData
The user-level code of the procedure may have some state to persist (e.g. input arguments or current position in the processing state) to be able to resume on failure.
- Parameters:
serializer
- stores the serializable state
- Throws:
IOException
-
deserializeStateData
protected abstract void deserializeStateData(ProcedureStateSerializer serializer) throws IOException
Called on store load to allow the user to decode the previously serialized state.
- Parameters:
serializer
- contains the serialized state
- Throws:
IOException
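The serialize/deserialize pair is easiest to reason about as a round trip: whatever one method writes, the other must read back in the same order. The sketch below uses plain java.io streams in place of ProcedureStateSerializer, whose methods are not shown on this page; CopyProgress and its fields are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Toy round trip of procedure state; java.io streams stand in for the real serializer. */
class StateRoundTripSketch {

  /** Hypothetical state: an input argument plus the current position in the work. */
  static class CopyProgress {
    String tableName;
    int nextRegionIndex;

    void serializeStateData(DataOutputStream out) throws IOException {
      out.writeUTF(tableName);
      out.writeInt(nextRegionIndex);
    }

    void deserializeStateData(DataInputStream in) throws IOException {
      // Read fields in exactly the order they were written.
      tableName = in.readUTF();
      nextRegionIndex = in.readInt();
    }
  }

  /** Serializes a state object, loads it into a fresh instance, and describes the result. */
  static String roundTrip(String tableName, int nextRegionIndex) {
    try {
      CopyProgress original = new CopyProgress();
      original.tableName = tableName;
      original.nextRegionIndex = nextRegionIndex;

      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (DataOutputStream out = new DataOutputStream(bytes)) {
        original.serializeStateData(out);
      }

      CopyProgress restored = new CopyProgress(); // as if created on store load
      try (DataInputStream in =
          new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        restored.deserializeStateData(in);
      }
      return restored.tableName + "@" + restored.nextRegionIndex;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("t1", 7));
  }
}
```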
-
waitInitialized
The doAcquireLock(Object, ProcedureStore) will be split into two steps: first, it will call us to determine whether we need to wait for initialization; second, it will call acquireLock(Object) to actually handle the lock for this procedure. This is because when the master restarts, we need to restore the lock state for all the procedures so as not to break the semantics if holdLock(Object) is true. But the ProcedureExecutor will be started before the master finishes initialization (as it is part of the initialization!), so we need to split the code into two steps, and when restoring, we just restore the lock part and ignore the waitInitialized part. Otherwise there will be a deadlock.
- Returns:
- true means we need to wait until the environment has been initialized, otherwise false.
-
acquireLock
The user should override this method if they need a lock on an Entity. A lock can be anything, and it is up to the implementor. The Procedure Framework will call this method just before it invokes execute(Object). It calls releaseLock(Object) after the call to execute. If you need to hold the lock for the life of the Procedure -- i.e. you do not want any other Procedure interfering while this Procedure is running -- see holdLock(Object).
Example: in our Master we can execute requests in parallel for different tables. We can create t1 and create t2 and these creates can be executed at the same time. Anything else on t1/t2 is queued waiting for that specific table create to happen.
There are 3 LockStates:
- LOCK_ACQUIRED should be returned when the proc has the lock and the proc is ready to execute.
- LOCK_YIELD_WAIT should be returned when the proc does not have the lock and the framework should take care of re-adding the procedure back to the runnable set for retry.
- LOCK_EVENT_WAIT should be returned when the proc does not have the lock and someone will take care of re-adding the procedure back to the runnable set when the lock is available.
- Returns:
- the lock state as described above.
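The per-table example can be sketched with a small lock registry. TableLocks and its acquire/release methods are hypothetical and only illustrate when LOCK_ACQUIRED and LOCK_EVENT_WAIT would be returned; the sketch does not use LOCK_YIELD_WAIT and is not how the HBase master scheduler is implemented.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy per-table lock registry illustrating the LockState return values. */
class TableLockSketch {

  enum LockState { LOCK_ACQUIRED, LOCK_YIELD_WAIT, LOCK_EVENT_WAIT }

  static class TableLocks {
    private final Map<String, Long> owners = new HashMap<>();
    private final Map<String, Queue<Long>> waiters = new HashMap<>();

    /** Takes the table lock if it is free, otherwise queues the caller to be woken later. */
    LockState acquire(String table, long procId) {
      Long owner = owners.get(table);
      if (owner == null || owner == procId) {
        owners.put(table, procId);
        return LockState.LOCK_ACQUIRED;
      }
      waiters.computeIfAbsent(table, t -> new ArrayDeque<>()).add(procId);
      return LockState.LOCK_EVENT_WAIT;
    }

    /** Releases the lock and returns the id of the next waiter to wake, or -1 if none. */
    long release(String table, long procId) {
      Long owner = owners.get(table);
      if (owner == null || owner != procId) {
        return -1L; // caller does not hold this lock
      }
      owners.remove(table);
      Queue<Long> queue = waiters.get(table);
      Long next = (queue == null) ? null : queue.poll();
      return next == null ? -1L : next;
    }
  }

  static String demo() {
    TableLocks locks = new TableLocks();
    StringBuilder out = new StringBuilder();
    out.append(locks.acquire("t1", 1L)).append(' '); // create t1
    out.append(locks.acquire("t2", 2L)).append(' '); // create t2 runs in parallel
    out.append(locks.acquire("t1", 3L)).append(' '); // other work on t1 must wait
    out.append(locks.release("t1", 1L)).append(' '); // wakes procedure 3
    out.append(locks.acquire("t1", 3L));
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```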
-
releaseLock
The user should override this method, and release lock if necessary. -
holdLock
Used to keep the procedure lock even when the procedure is yielding or suspended.
- Returns:
- true if the procedure should hold on to the lock until completionCleanup()
-
hasLock
This is used in conjunction with holdLock(Object). If holdLock(Object) returns true, the procedure executor will call acquireLock() once and thereafter not call releaseLock(Object) until the Procedure is done (normally, it calls release/acquire around each invocation of execute(Object)).
- Returns:
- true if the procedure has the lock, false otherwise.
- See Also:
-
beforeReplay
Called when the procedure is loaded for replay. The procedure implementor may use this method to perform some quick operation before replay. e.g. failing the procedure if the state on replay may be unknown. -
afterReplay
Called when the procedure is ready to be added to the queue after the loading/replay operation. -
completionCleanup
Called when the procedure is marked as completed (success or rollback). The procedure implementor may use this method to cleanup in-memory states. This operation will not be retried on failure. If a procedure took a lock, it will have been released when this method runs. -
isYieldAfterExecutionStep
By default, the procedure framework/executor will try to run procedures start to finish. Return true to make the executor yield between each execution step to give other procedures a chance to run.
- Parameters:
env
- the environment passed to the ProcedureExecutor
- Returns:
- Return true if the executor should yield on completion of an execution step. Defaults to return false.
-
shouldWaitClientAck
By default, the executor will keep the procedure result around until the eviction TTL is expired. The client can cut down the waiting time by requesting that the result is removed from the executor. In case of a system-started procedure, we can force the executor to auto-ack.
- Parameters:
env
- the environment passed to the ProcedureExecutor
- Returns:
- true if the executor should wait for the client ack for the result. Defaults to return true.
-
getProcedureMetrics
Override this method to provide procedure specific counters for submitted count, failed count and time histogram.
- Parameters:
env
- The environment passed to the procedure executor
- Returns:
- Container object for procedure related metric
-
updateMetricsOnSubmit
This function will be called just when procedure is submitted for execution. Override this method to update the metrics at the beginning of the procedure. The default implementation updates submitted counter if getProcedureMetrics(Object) returns non-null ProcedureMetrics.
-
updateMetricsOnFinish
This function will be called just after procedure execution is finished. Override this method to update metrics at the end of the procedure. If getProcedureMetrics(Object) returns non-null ProcedureMetrics, the default implementation adds runtime of a procedure to a time histogram for successfully completed procedures. Increments failed counter for failed procedures.
TODO: As any of the sub-procedures on failure rolls back all procedures in the stack, including successfully finished siblings, this function may get called twice in certain cases for certain procedures. Explore further if this can be called once.
- Parameters:
env
- The environment passed to the procedure executor
runtime
- Runtime of the procedure in milliseconds
success
- true if procedure is completed successfully
-
toString
-
toStringSimpleSB
Build the StringBuilder for the simple form of procedure string.
- Returns:
- the StringBuilder
-
toStringDetails
Extend the toString() information with more procedure details -
toStringClass
-
toStringState
Called from toString() when interpolating Procedure State. Allows decorating generic Procedure State with Procedure particulars.
- Parameters:
builder
- Append current ProcedureProtos.ProcedureState
-
toStringClassDetails
Extend the toString() information with the procedure details e.g. className and parameters
- Parameters:
builder
- the string builder to use to append the proc specific information
-
getProcId
-
hasParent
-
getParentProcId
-
getRootProcId
-
getProcName
-
getNonceKey
-
getSubmittedTime
-
getOwner
-
hasOwner
-
setProcId
Called by the ProcedureExecutor to assign the ID to the newly created procedure. -
setParentProcId
Called by the ProcedureExecutor to assign the parent to the newly created procedure. -
setRootProcId
-
setNonceKey
Called by the ProcedureExecutor to set the value to the newly created procedure. -
setOwner
-
setOwner
-
setSubmittedTime
Called on store load to initialize the Procedure internals after the creation/deserialization. -
setTimeout
- Parameters:
timeout
- timeout interval in msec
-
hasTimeout
-
getTimeout
Returns the timeout in msec -
setLastUpdate
Called on store load to initialize the Procedure internals after the creation/deserialization. -
updateTimestamp
Called by ProcedureExecutor after each time a procedure step is executed. -
getLastUpdate
-
getTimeoutTimestamp
Timestamp of the next timeout. Called by the ProcedureExecutor if the procedure has timeout set and the procedure is in the waiting queue.
- Returns:
- the timestamp of the next timeout.
-
elapsedTime
Returns the time elapsed between the last update and the start time of the procedure. -
getResult
Returns the serialized result if any, otherwise null -
setResult
The procedure may leave a "result" on completion.
- Parameters:
result
- the serialized result that will be passed to the client
-
lockedWhenLoading
Will only be called when loading procedures from procedure store, where we need to record whether the procedure has already held a lock. Later we will call restoreLock(Object) to actually acquire the lock.
-
isLockedWhenLoading
Can only be called when restarting, before the procedure is actually executed, as after we actually call the doAcquireLock(Object, ProcedureStore) method, we will reset lockedWhenLoading to false. Now it is only used in the ProcedureScheduler to determine whether we should put a Procedure in front of a queue.
-
isRunnable
Returns true if the procedure is in a RUNNABLE state. -
isInitializing
-
isFailed
Returns true if the procedure has failed. It may or may not have rolled back. -
isSuccess
Returns true if the procedure is finished successfully. -
isFinished
- Returns:
- true if the procedure is finished. The Procedure may be completed successfully or rolledback.
-
isWaiting
Returns true if the procedure is waiting for a child to finish or for an external event. -
setState
protected void setState(org.apache.hadoop.hbase.shaded.protobuf.generated.ProcedureProtos.ProcedureState state) -
getState
-
setFailure
-
setFailure
-
setAbortFailure
-
setTimeoutFailure
Called by the ProcedureExecutor when the timeout set by setTimeout() is expired. Another usage for this method is to implement retrying. A procedure can set the state to WAITING_TIMEOUT by calling the setState method, and throw a ProcedureSuspendedException to halt the execution of the procedure; do not forget to call the setTimeout(int) method to set the timeout. You should also override this method to wake up the procedure, and return false to tell the ProcedureExecutor that the timeout event has been handled.
- Returns:
- true to let the framework handle the timeout as abort, false in case the procedure handled the timeout itself.
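The retry pattern described above can be modelled as follows. RetryingProcedure, SuspendedException, the State enum, the linear backoff, and the toy driver loop are all hypothetical stand-ins; a real implementation would extend Procedure, and its override would also need some way to get the procedure rescheduled, which this page does not describe.

```java
/** Toy model of retrying via WAITING_TIMEOUT and setTimeoutFailure(); not HBase code. */
class TimeoutRetrySketch {

  enum State { RUNNABLE, WAITING_TIMEOUT, SUCCESS }

  /** Stand-in for ProcedureSuspendedException. */
  static class SuspendedException extends Exception {
    private static final long serialVersionUID = 1L;
  }

  static class RetryingProcedure {
    State state = State.RUNNABLE;
    int timeoutMillis = -1;
    int attempts;
    private final int failuresBeforeSuccess;

    RetryingProcedure(int failuresBeforeSuccess) {
      this.failuresBeforeSuccess = failuresBeforeSuccess;
    }

    /** Fails the first few attempts, suspending itself with a timeout each time. */
    void execute() throws SuspendedException {
      attempts++;
      if (attempts <= failuresBeforeSuccess) {
        state = State.WAITING_TIMEOUT;  // corresponds to setState(WAITING_TIMEOUT)
        timeoutMillis = 100 * attempts; // corresponds to setTimeout(int); backoff is made up
        throw new SuspendedException(); // halt execution until the timeout fires
      }
      state = State.SUCCESS;
    }

    /** Wakes the procedure up and reports that the timeout was handled here. */
    boolean setTimeoutFailure() {
      state = State.RUNNABLE;
      return false; // false: do not treat the timeout as an abort
    }
  }

  /** Toy driver: fires the timeout immediately instead of waiting timeoutMillis. */
  static int runUntilSuccess(RetryingProcedure proc) {
    while (proc.state != State.SUCCESS) {
      try {
        proc.execute();
      } catch (SuspendedException e) {
        if (proc.setTimeoutFailure()) {
          throw new IllegalStateException("timeout treated as abort");
        }
      }
    }
    return proc.attempts;
  }

  public static void main(String[] args) {
    System.out.println(runUntilSuccess(new RetryingProcedure(2)));
  }
}
```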
-
hasException
-
getException
-
setChildrenLatch
Called by the ProcedureExecutor on procedure-load to restore the latch state -
incChildrenLatch
Called by the ProcedureExecutor on procedure-load to restore the latch state -
childrenCountDown
Called by the ProcedureExecutor to notify that one of the sub-procedures has completed. -
tryRunnable
boolean tryRunnable()
Try to set this procedure into RUNNABLE state. Succeeds if all subprocedures/children are done.
- Returns:
- True if we were able to move procedure to RUNNABLE state.
-
hasChildren
-
getChildrenLatch
-
addStackIndex
Called by the RootProcedureState on procedure execution. Each procedure stores its stack-index positions. -
removeStackIndex
-
setStackIndexes
Called on store load to initialize the Procedure internals after the creation/deserialization. -
setExecuted
-
wasExecuted
-
getStackIndexes
-
isRollbackSupported
Return whether the procedure supports rollback. If the procedure does not support rollback, we can skip the rollback state management which could increase the performance. See HBASE-28210 and HBASE-28212. -
doExecute
protected Procedure<TEnvironment>[] doExecute(TEnvironment env) throws ProcedureYieldException, ProcedureSuspendedException, InterruptedException
Internal method called by the ProcedureExecutor that starts the user-level code execute().
- Throws:
ProcedureSuspendedException
- This is used when procedure wants to halt processing and skip out without changing states or releasing any locks held.
ProcedureYieldException
InterruptedException
-
doRollback
Internal method called by the ProcedureExecutor that starts the user-level code rollback().
- Throws:
IOException
InterruptedException
-
restoreLock
-
doAcquireLock
Internal method called by the ProcedureExecutor that starts the user-level code acquireLock(). -
doReleaseLock
Internal method called by the ProcedureExecutor that starts the user-level code releaseLock(). -
suspend
protected final ProcedureSuspendedException suspend(int timeoutMillis, boolean jitter) throws ProcedureSuspendedException - Throws:
ProcedureSuspendedException
-
compareTo
- Specified by:
compareTo
in interface Comparable<Procedure<TEnvironment>>
-
getProcIdHashCode
Get a hashcode for the specified Procedure ID
- Returns:
- the hashcode for the specified procId
-
getRootProcedureId
Helper to lookup the root Procedure ID given a specified procedure. -
haveSameParent
- Parameters:
a
- the first procedure to be compared.
b
- the second procedure to be compared.
- Returns:
- true if the two procedures have the same parent
-