Class TestFanOutOneBlockAsyncDFSOutputHang
java.lang.Object
org.apache.hadoop.hbase.io.asyncfs.AsyncFSTestBase
org.apache.hadoop.hbase.io.asyncfs.TestFanOutOneBlockAsyncDFSOutputHang
Test case for HBASE-26679. It lives in a separate test class rather than in TestFanOutOneBlockAsyncDFSOutput because a heartbeat is sent to the DN whenever there is no outgoing packet. The timeout is controlled by TestFanOutOneBlockAsyncDFSOutput.READ_TIMEOUT_MS, which is 2 seconds, so that class would keep sending packets, the DN would respond immediately, and those responses would interfere with the testing handler added by this test. This test class therefore uses the default timeout of 60 seconds, which is long enough for this test.
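To make the timeout argument concrete, here is a minimal sketch that counts how many heartbeats would fire during a quiet period. It is a toy model and not HBase code: the class name HeartbeatModel, the 10-second idle window, and the rule that one heartbeat fires per full timeout interval of idleness are all assumptions made for illustration; the real trigger interval may differ.

```java
/** Hypothetical model that counts idle heartbeats during a quiet period. Not HBase code. */
final class HeartbeatModel {

  /**
   * Assumes one heartbeat is sent each time the connection has been idle for
   * timeoutMs. This rule is an illustration only; the real interval may differ.
   */
  static long heartbeatsDuring(long timeoutMs, long idleWindowMs) {
    if (timeoutMs <= 0) {
      throw new IllegalArgumentException("timeoutMs must be positive");
    }
    return idleWindowMs / timeoutMs;
  }

  public static void main(String[] args) {
    // Hypothetical time the test spends with no outgoing packet.
    long idleWindowMs = 10_000;
    System.out.println("2 s timeout:  " + heartbeatsDuring(2_000, idleWindowMs) + " heartbeats");
    System.out.println("60 s timeout: " + heartbeatsDuring(60_000, idleWindowMs) + " heartbeats");
  }
}
```

Under these assumptions the 2-second timeout produces several heartbeats (and DN responses) inside the window, while the 60-second default produces none, which is why the default keeps the testing handler undisturbed.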
Field Summary
Modifier and Type | Field | Description
private static Class<? extends org.apache.hbase.thirdparty.io.netty.channel.Channel> | CHANNEL_CLASS
static final HBaseClassTestRule | CLASS_RULE
private static org.apache.hbase.thirdparty.io.netty.channel.EventLoopGroup | EVENT_LOOP_GROUP
private static org.apache.hadoop.hdfs.DistributedFileSystem | FS
private static final org.slf4j.Logger | LOG
private static org.apache.hadoop.hbase.io.asyncfs.monitor.StreamSlowMonitor | MONITOR
org.junit.rules.TestName | name
private static org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput | OUT
Fields inherited from class org.apache.hadoop.hbase.io.asyncfs.AsyncFSTestBase
CLUSTER, CLUSTER_TEST_DIR, UTIL
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
private static org.apache.hadoop.hdfs.MiniDFSCluster.DataNodeProperties | findAndKillFirstDataNode(org.apache.hadoop.hdfs.protocol.DatanodeInfo firstDatanodeInfo)
static void | setUp()
static void | tearDown()
void | testFlushHangWhenOneDataNodeFailedBeforeOtherDataNodeAck() | This test is for HBASE-26679.
Methods inherited from class org.apache.hadoop.hbase.io.asyncfs.AsyncFSTestBase
setupClusterTestDir, shutdownMiniDFSCluster, startMiniDFSCluster
-
Field Details
-
CLASS_RULE
-
LOG
-
FS
-
EVENT_LOOP_GROUP
-
CHANNEL_CLASS
-
MONITOR
-
OUT
-
name
-
-
Constructor Details
-
TestFanOutOneBlockAsyncDFSOutputHang
public TestFanOutOneBlockAsyncDFSOutputHang()
-
-
Method Details
-
setUp
- Throws:
Exception
-
tearDown
- Throws:
Exception
-
testFlushHangWhenOneDataNodeFailedBeforeOtherDataNodeAck
This test is for HBASE-26679. Consider two DataNodes, dn1 and dn2, where dn2 is a slow DN. Before HBASE-26679, the sequence of events across threads is:
1. We write some data to FanOutOneBlockAsyncDFSOutput and then flush it, so there is one FanOutOneBlockAsyncDFSOutput.Callback in FanOutOneBlockAsyncDFSOutput.waitingAckQueue.
2. The ack from dn1 arrives first and triggers Netty to invoke FanOutOneBlockAsyncDFSOutput.completed(org.apache.hbase.thirdparty.io.netty.channel.Channel) with dn1's channel. In that method, dn1's channel is removed from FanOutOneBlockAsyncDFSOutput.Callback#unfinishedReplicas.
3. dn2 responds slowly, and before dn2 sends its ack, dn1 is shut down or hits an exception, so Netty triggers FanOutOneBlockAsyncDFSOutput.failed(org.apache.hbase.thirdparty.io.netty.channel.Channel, java.util.function.Supplier<java.lang.Throwable>) with dn1's channel. Because FanOutOneBlockAsyncDFSOutput.Callback#unfinishedReplicas no longer contains dn1's channel, the FanOutOneBlockAsyncDFSOutput.Callback is skipped in that method, FanOutOneBlockAsyncDFSOutput.state is set to FanOutOneBlockAsyncDFSOutput.State#BROKEN, and dn1 and dn2 are both closed at the end of the method.
4. FanOutOneBlockAsyncDFSOutput.failed(org.apache.hbase.thirdparty.io.netty.channel.Channel, java.util.function.Supplier<java.lang.Throwable>) is triggered again, this time by dn2 because it has been closed. Since FanOutOneBlockAsyncDFSOutput.state is already FanOutOneBlockAsyncDFSOutput.State#BROKEN, the whole method body is skipped.
As a result, waiting on the future returned by FanOutOneBlockAsyncDFSOutput.flush(boolean) would be stuck forever. After HBASE-26679, in step 4, even if FanOutOneBlockAsyncDFSOutput.state is already FanOutOneBlockAsyncDFSOutput.State#BROKEN, we still try to trigger FanOutOneBlockAsyncDFSOutput.Callback#future.
- Throws:
Exception
Exception
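The four-step sequence described for this test can be replayed with a small self-contained model. This is a simplified sketch and not the HBase implementation: FlushHangModel, its withFix flag, the use of plain strings for channels, and the exact shape of the fix are assumptions for illustration. Only the names Callback, unfinishedReplicas, waitingAckQueue, state, BROKEN, completed, failed and flush echo the description.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Simplified, hypothetical model of the ack bookkeeping; not the HBase code. */
final class FlushHangModel {
  enum State { STREAMING, BROKEN }

  /** One pending flush, finished once every replica has acked or one has failed. */
  static final class Callback {
    final CompletableFuture<Long> future = new CompletableFuture<>();
    final Set<String> unfinishedReplicas;

    Callback(Collection<String> channels) {
      this.unfinishedReplicas = new HashSet<>(channels);
    }
  }

  private final ArrayDeque<Callback> waitingAckQueue = new ArrayDeque<>();
  private final boolean withFix; // assumption: toggles the post-HBASE-26679 behaviour
  private State state = State.STREAMING;

  FlushHangModel(boolean withFix) {
    this.withFix = withFix;
  }

  CompletableFuture<Long> flush(Collection<String> channels) {
    Callback c = new Callback(channels);
    waitingAckQueue.add(c);
    return c.future;
  }

  /** An ack arrived on the given channel. */
  void completed(String channel) {
    Callback c = waitingAckQueue.peek();
    if (c == null) {
      return;
    }
    c.unfinishedReplicas.remove(channel);
    if (c.unfinishedReplicas.isEmpty()) {
      waitingAckQueue.poll();
      c.future.complete(0L);
    }
  }

  /** The given channel failed or was closed. */
  void failed(String channel, Throwable error) {
    if (state == State.BROKEN) {
      if (withFix) {
        failWaitingCallbacks(channel, error); // still try to trigger the futures
      }
      return; // without the fix the whole method is skipped
    }
    state = State.BROKEN;
    failWaitingCallbacks(channel, error);
  }

  private void failWaitingCallbacks(String channel, Throwable error) {
    for (Iterator<Callback> it = waitingAckQueue.iterator(); it.hasNext();) {
      Callback c = it.next();
      // Only callbacks still waiting on this channel are failed.
      if (c.unfinishedReplicas.remove(channel)) {
        it.remove();
        c.future.completeExceptionally(error);
      }
    }
  }

  /** Replays steps 1 to 4 and reports whether the flush future ever finishes. */
  static boolean flushFutureCompletes(boolean withFix) {
    FlushHangModel out = new FlushHangModel(withFix);
    Throwable closed = new java.io.IOException("channel closed");
    CompletableFuture<Long> future = out.flush(Arrays.asList("dn1", "dn2")); // step 1
    out.completed("dn1");      // step 2: dn1 acks first
    out.failed("dn1", closed); // step 3: dn1 fails, state becomes BROKEN
    out.failed("dn2", closed); // step 4: dn2 is closed as a consequence
    return future.isDone();
  }

  public static void main(String[] args) {
    System.out.println("before fix, flush future done: " + flushFutureCompletes(false));
    System.out.println("after fix, flush future done: " + flushFutureCompletes(true));
  }
}
```

In this model the future stays incomplete without the fix, which corresponds to the hang, and completes exceptionally with the fix, so a caller waiting on flush gets an error instead of blocking.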
-
findAndKillFirstDataNode
private static org.apache.hadoop.hdfs.MiniDFSCluster.DataNodeProperties findAndKillFirstDataNode(org.apache.hadoop.hdfs.protocol.DatanodeInfo firstDatanodeInfo)
-