Class CompactingMemStore
java.lang.Object
org.apache.hadoop.hbase.regionserver.AbstractMemStore
org.apache.hadoop.hbase.regionserver.CompactingMemStore
- All Implemented Interfaces:
Closeable, AutoCloseable, MemStore
A memstore implementation which supports in-memory compaction. A compaction pipeline is added
between the active set and the snapshot data structures; it consists of a list of segments that
are subject to compaction. Like the snapshot, all pipeline segments are read-only; updates only
affect the active set. To ensure this property we take advantage of the existing blocking
mechanism -- the active set is pushed to the pipeline while holding the region's updatesLock in
exclusive mode. Periodically, a compaction is applied in the background to all pipeline segments
resulting in a single read-only component. The "old" segments are discarded when no scanner is
reading them.
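
The layout described above can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not the HBase implementation: the class name PipelineMemStoreSketch, the String rows and values, and the put/get/compact methods are all hypothetical, and a ReentrantReadWriteLock stands in for the region's updatesLock.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy model of the active set and the compaction pipeline (hypothetical, not HBase code). */
class PipelineMemStoreSketch {
  // Stand-in for the region's updatesLock: writers share it, the push holds it exclusively.
  private final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();
  private volatile ConcurrentSkipListMap<String, String> active = new ConcurrentSkipListMap<>();
  // Newest segment first; every segment stored here is read-only.
  private final Deque<NavigableMap<String, String>> pipeline = new ArrayDeque<>();

  /** Updates only ever touch the active set. */
  void put(String row, String value) {
    updatesLock.readLock().lock();
    try {
      active.put(row, value);
    } finally {
      updatesLock.readLock().unlock();
    }
  }

  /** Push the active set into the pipeline while no update can be in flight. */
  void flushInMemory() {
    updatesLock.writeLock().lock();
    try {
      if (active.isEmpty()) {
        return;
      }
      synchronized (pipeline) {
        pipeline.addFirst(Collections.unmodifiableNavigableMap(active));
      }
      active = new ConcurrentSkipListMap<>();
    } finally {
      updatesLock.writeLock().unlock();
    }
  }

  /** Merge all pipeline segments into a single read-only segment. */
  void compact() {
    synchronized (pipeline) {
      if (pipeline.size() < 2) {
        return;
      }
      TreeMap<String, String> merged = new TreeMap<>();
      // Walk oldest to newest so that newer values overwrite older ones.
      Iterator<NavigableMap<String, String>> oldestFirst = pipeline.descendingIterator();
      while (oldestFirst.hasNext()) {
        merged.putAll(oldestFirst.next());
      }
      pipeline.clear();
      pipeline.addFirst(Collections.unmodifiableNavigableMap(merged));
    }
  }

  /** Look in the active set first, then in the pipeline from newest to oldest. */
  String get(String row) {
    String value = active.get(row);
    if (value != null) {
      return value;
    }
    synchronized (pipeline) {
      for (NavigableMap<String, String> segment : pipeline) {
        value = segment.get(row);
        if (value != null) {
          return value;
        }
      }
    }
    return null;
  }

  int pipelineSize() {
    synchronized (pipeline) {
      return pipeline.size();
    }
  }
}
```

The model leaves out scanners, so it simply drops the old segments at the end of compact(); the real class defers that until no scanner is reading them.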
-
Nested Class Summary
Modifier and Type    Class    Description
static enum    CompactingMemStore.IndexType
Types of indexes (part of immutable segments) to be used after flattening, compaction, or merge are applied.
private class    CompactingMemStore.InMemoryCompactionRunnable
The in-memory-flusher thread performs the flush asynchronously.
-
Field Summary
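Modifier and Type    Field    Description
protected final AtomicBoolean    allowCompaction
static final String    COMPACTING_MEMSTORE_TYPE_DEFAULT
static final String    COMPACTING_MEMSTORE_TYPE_KEY
protected MemStoreCompactor    compactor
private boolean    compositeSnapshot
static final long    DEEP_OVERHEAD
static final int    IN_MEMORY_CONPACTION_POOL_SIZE_DEFAULT
static final String    IN_MEMORY_CONPACTION_POOL_SIZE_KEY
private static final int    IN_MEMORY_FLUSH_MULTIPLIER
static final String    IN_MEMORY_FLUSH_THRESHOLD_FACTOR_KEY
private CompactingMemStore.IndexType    indexType
private final AtomicBoolean    inMemoryCompactionInProgress
private long    inmemoryFlushSize
private boolean    inWalReplay
private static final org.slf4j.Logger    LOG
private CompactionPipeline    pipeline
private HStore    store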
Fields inherited from class org.apache.hadoop.hbase.regionserver.AbstractMemStore
FIXED_OVERHEAD, regionServices, snapshot, snapshotId
-
Constructor Summary
Constructor    Description
CompactingMemStore(org.apache.hadoop.conf.Configuration conf, CellComparator c, HStore store, RegionServicesForStores regionServices, MemoryCompactionPolicy compactionPolicy)
-
Method Summary
Modifier and Type    Method    Description
protected boolean
checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd, MemStoreSizing memstoreSizing)
Check whether anything needs to be done based on the current active set size.
protected List<KeyValueScanner>
createList(int capacity)
protected MemStoreCompactor
createMemStoreCompactor(MemoryCompactionPolicy compactionPolicy)
void
debug()
void
flattenOneSegment(long requesterVersion, MemStoreCompactionStrategy.Action action)
(package private) void
protected void
flushInMemory(MutableSegment currActive)
private byte[]
Flush will first clear out the data in snapshot if any (It will take a second flush invocation to clear the current Cell set).
(package private) long
private Segment
(package private) Cell
getNextRow(Cell cell)
private ThreadPoolExecutor
getPool()
private RegionServicesForStores
getScanners(long readPt)
This method is protected under HStore#lock read lock.
Returns an ordered list of segments from most recent to oldest in memstore
long
getStore()
boolean
protected long
heapSize()
private void
initInmemoryFlushSize(org.apache.hadoop.conf.Configuration conf)
(package private) void
(package private) boolean
boolean
isSloppy()
protected long
keySize()
Returns the total size of cells in this memstore.
protected void
postUpdate(MutableSegment currentActive)
Issue any post update synchronization and tests
long
This method is called before the flush is executed.
protected boolean
preUpdate(MutableSegment currentActive, Cell cell, MemStoreSizing memstoreSizing)
Issue any synchronization and test needed before applying the update. For compacting memstore this means checking that the update can increase the size without overflow.
protected void
pushActiveToPipeline(MutableSegment currActive, boolean checkEmpty)
NOTE: When flushInMemory(MutableSegment) calls this method, the checkEmpty parameter is false. Because of concurrent writes, and because the cell size is added to currActive.getDataSize before the cell is actually added to currActive.cellSet, currActive.getDataSize may be unable to accommodate cellToAdd while currActive.cellSet is still empty, since pending writes have not yet added their cells.
private void
private void
private void
pushToSnapshot(List<ImmutableSegment> segments)
void
setCompositeSnapshot(boolean useCompositeSnapshot)
(package private) void
void
protected boolean
size()
protected boolean
snapshot()
Push the current active memstore segment into the pipeline and create a snapshot of the tail of the current compaction pipeline. The snapshot must be cleared by a call to AbstractMemStore.clearSnapshot(long).
void
This message intends to inform the MemStore that the upcoming updates are part of replaying edits from the WAL.
private void
The request to cancel the asynchronous compaction task (caused by in-memory flush). The compaction may still happen if the request was sent too late. Non-blocking request.
void
This message intends to inform the MemStore that replaying edits from the WAL is done.
boolean
swapCompactedSegments(VersionedSegmentsList versionedList, ImmutableSegment result, boolean merge)
protected boolean
swapPipelineWithNull(VersionedSegmentsList segments)
private void
tryFlushInMemoryAndCompactingAsync(MutableSegment currActive)
Try to flush the currActive in memory and submit the background CompactingMemStore.InMemoryCompactionRunnable to RegionServicesForStores.getInMemoryCompactionPool().
void
updateLowestUnflushedSequenceIdInWAL(boolean onlyIfGreater)
Updates the WAL with the lowest sequence id (oldest entry) that is still in memory
Methods inherited from class org.apache.hadoop.hbase.regionserver.AbstractMemStore
add, add, addToScanners, addToScanners, clearSnapshot, close, doAdd, doClearSnapShot, dump, getActive, getComparator, getConfiguration, getLowest, getNextRow, getSnapshot, getSnapshotSize, resetActive, resetTimeOfOldestEdit, timeOfOldestEdit, toString, upsert
-
Field Details
-
COMPACTING_MEMSTORE_TYPE_KEY
- See Also:
-
COMPACTING_MEMSTORE_TYPE_DEFAULT
-
IN_MEMORY_FLUSH_THRESHOLD_FACTOR_KEY
- See Also:
-
IN_MEMORY_FLUSH_MULTIPLIER
- See Also:
-
IN_MEMORY_CONPACTION_POOL_SIZE_KEY
- See Also:
-
IN_MEMORY_CONPACTION_POOL_SIZE_DEFAULT
- See Also:
-
LOG
-
store
-
pipeline
-
compactor
-
inmemoryFlushSize
-
inMemoryCompactionInProgress
-
inWalReplay
-
allowCompaction
-
compositeSnapshot
-
indexType
-
DEEP_OVERHEAD
-
-
Constructor Details
-
CompactingMemStore
public CompactingMemStore(org.apache.hadoop.conf.Configuration conf, CellComparator c, HStore store, RegionServicesForStores regionServices, MemoryCompactionPolicy compactionPolicy) throws IOException
- Throws:
IOException
-
-
Method Details
-
createMemStoreCompactor
protected MemStoreCompactor createMemStoreCompactor(MemoryCompactionPolicy compactionPolicy) throws IllegalArgumentIOException
- Throws:
IllegalArgumentIOException
-
initInmemoryFlushSize
-
size
- Returns:
- Total memory occupied by this MemStore. This won't include any size occupied by the snapshot. We assume the snapshot will get cleared soon. This is not thread safe and the memstore may be changed while computing its size. It is the responsibility of the caller to make sure this doesn't happen.
-
preFlushSeqIDEstimation
This method is called before the flush is executed.
- Returns:
- an estimate (lower bound) of the unflushed sequence id in the memstore after the flush is executed. If the memstore will be cleared, returns HConstants.NO_SEQNUM.
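
One way to read the "lower bound" contract above is as a minimum over whatever stays in memory after the flush. The sketch below is an illustration under assumptions, not the method's actual body: the class SeqIdEstimateSketch, the estimate method and its list argument are hypothetical, and NO_SEQNUM is a local stand-in assumed to mirror HConstants.NO_SEQNUM.

```java
import java.util.List;

/** Hypothetical illustration of a lower-bound estimate; not the HBase implementation. */
class SeqIdEstimateSketch {
  // Local stand-in; assumed to mirror HConstants.NO_SEQNUM.
  static final long NO_SEQNUM = -1L;

  /**
   * @param minSeqIdsRemaining smallest sequence id of each segment that stays in memory
   *                           after the flush
   * @return the lowest of those ids, or NO_SEQNUM when nothing remains
   */
  static long estimate(List<Long> minSeqIdsRemaining) {
    if (minSeqIdsRemaining.isEmpty()) {
      return NO_SEQNUM; // the memstore will be cleared
    }
    long lowest = Long.MAX_VALUE;
    for (long id : minSeqIdsRemaining) {
      lowest = Math.min(lowest, id);
    }
    return lowest;
  }
}
```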
-
isSloppy
-
snapshot
Push the current active memstore segment into the pipeline and create a snapshot of the tail of the current compaction pipeline. The snapshot must be cleared by a call to AbstractMemStore.clearSnapshot(long).
- Returns:
MemStoreSnapshot
-
getFlushableSize
Description copied from interface: MemStore
Flush will first clear out the data in snapshot if any (It will take a second flush invocation to clear the current Cell set). If snapshot is empty, current Cell set will be flushed.
- Returns:
- On flush, how much memory we will clear.
-
setInMemoryCompactionCompleted
-
setInMemoryCompactionFlag
-
keySize
Description copied from class: AbstractMemStore
Returns the total size of cells in this memstore. Cells in the snapshot are not counted.
- Specified by:
keySize in class AbstractMemStore
-
heapSize
- Specified by:
heapSize in class AbstractMemStore
- Returns:
- The total heap size of cells in this memstore. We will not consider cells in the snapshot
-
updateLowestUnflushedSequenceIdInWAL
Description copied from class: AbstractMemStore
Updates the WAL with the lowest sequence id (oldest entry) that is still in memory
- Specified by:
updateLowestUnflushedSequenceIdInWAL in class AbstractMemStore
- Parameters:
onlyIfGreater - a flag that marks whether to update the sequence id no matter what or only if it is greater than the previous sequence id
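
The effect of the onlyIfGreater flag can be shown with a small stand-alone tracker. This is a sketch of the rule as the parameter description states it, not the WAL code: LowestSeqIdTrackerSketch and its update method are hypothetical names, and the initial value of -1 is an arbitrary choice for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical tracker illustrating the onlyIfGreater rule; not HBase code. */
class LowestSeqIdTrackerSketch {
  private final AtomicLong lowestUnflushed = new AtomicLong(-1L);

  /**
   * Records a candidate sequence id.
   *
   * @param candidate     the new lowest sequence id still in memory
   * @param onlyIfGreater when true, keep the previous value unless the candidate is greater;
   *                      when false, overwrite unconditionally
   * @return the value recorded after this call
   */
  long update(long candidate, boolean onlyIfGreater) {
    if (onlyIfGreater) {
      // Atomically keep the larger of the previous value and the candidate.
      return lowestUnflushed.accumulateAndGet(candidate, Math::max);
    }
    lowestUnflushed.set(candidate);
    return candidate;
  }
}
```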
-
startReplayingFromWAL
This message intends to inform the MemStore that the upcoming updates are part of replaying edits from the WAL.
-
stopReplayingFromWAL
This message intends to inform the MemStore that replaying edits from the WAL is done.
-
preUpdate
Issue any synchronization and test needed before applying the update. For compacting memstore this means checking that the update can increase the size without overflow.
- Specified by:
preUpdate in class AbstractMemStore
- Parameters:
currentActive - the segment to be updated
cell - the cell to be added
memstoreSizing - object to accumulate region size changes
- Returns:
- true iff can proceed with applying the update
-
postUpdate
Description copied from class: AbstractMemStore
Issue any post update synchronization and tests
- Specified by:
postUpdate in class AbstractMemStore
- Parameters:
currentActive - updated segment
-
sizeAddedPreOperation
- Specified by:
sizeAddedPreOperation in class AbstractMemStore
-
getSegments
Description copied from class: AbstractMemStore
Returns an ordered list of segments from most recent to oldest in memstore
- Specified by:
getSegments in class AbstractMemStore
-
setCompositeSnapshot
-
swapCompactedSegments
public boolean swapCompactedSegments(VersionedSegmentsList versionedList, ImmutableSegment result, boolean merge)
-
flattenOneSegment
- Parameters:
requesterVersion
- The caller must hold the VersionedList of the pipeline with version taken earlier. This version must be passed as a parameter here. The flattening happens only if versions match.
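
The version check described for requesterVersion is an optimistic-concurrency pattern: the caller remembers the version it saw, and the operation is refused if the pipeline changed in between. The sketch below illustrates only that pattern. VersionedPipelineSketch, its String segments and the "flat:" prefix are hypothetical, and leaving the version unchanged after a flatten is a modelling choice, not a statement about HBase.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical versioned pipeline illustrating the version check; not HBase code. */
class VersionedPipelineSketch {
  private final List<String> segments = new ArrayList<>(); // index 0 is the newest
  private long version = 0;

  /** Structural change: add a segment at the head and bump the version. */
  synchronized void push(String segment) {
    segments.add(0, segment);
    version++;
  }

  synchronized long currentVersion() {
    return version;
  }

  /**
   * Flattens the oldest segment that is not yet flat, but only when the caller's
   * version still matches the pipeline's current version.
   */
  synchronized boolean flattenOneSegment(long requesterVersion) {
    if (requesterVersion != version) {
      return false; // the pipeline changed since the caller took its view
    }
    for (int i = segments.size() - 1; i >= 0; i--) {
      String segment = segments.get(i);
      if (!segment.startsWith("flat:")) {
        segments.set(i, "flat:" + segment);
        return true;
      }
    }
    return false; // nothing left to flatten
  }

  synchronized List<String> view() {
    return new ArrayList<>(segments);
  }
}
```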
-
setIndexType
-
getIndexType
-
hasImmutableSegments
-
getImmutableSegments
-
getSmallestReadPoint
-
getStore
-
getFamilyName
-
getScanners
This method is protected under HStore#lock read lock.
- Returns:
- scanner over the memstore. This might include scanner over the snapshot when one is present.
- Throws:
IOException
-
createList
-
checkAndAddToActiveSize
protected boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd, MemStoreSizing memstoreSizing)
Check whether anything needs to be done based on the current active set size. The method is invoked upon every addition to the active set. For CompactingMemStore, flush the active set to the read-only memory if its size is above the threshold.
- Parameters:
currActive - intended segment to update
cellToAdd - cell to be added to the segment
memstoreSizing - object to accumulate changed size
- Returns:
- true if the cell can be added to the currActive
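
A size check of this kind is commonly written as a compare-and-set loop on the segment's data size. The following is one plausible shape, offered as a sketch only: ActiveSizeCheckSketch, the plain long cell size and the flushRequests counter are hypothetical, and the exact conditions used by CompactingMemStore may differ.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of a threshold check on the active set size; not HBase code. */
class ActiveSizeCheckSketch {
  private final long inmemoryFlushSize;
  private final AtomicLong activeDataSize = new AtomicLong();
  private final AtomicInteger flushRequests = new AtomicInteger();

  ActiveSizeCheckSketch(long inmemoryFlushSize) {
    this.inmemoryFlushSize = inmemoryFlushSize;
  }

  /** @return true if the cell's size was reserved in the active set */
  boolean checkAndAddToActiveSize(long cellSize) {
    boolean added = false;
    while (true) {
      long current = activeDataSize.get();
      if (current > inmemoryFlushSize) {
        break; // already above the threshold: do not grow this segment further
      }
      if (activeDataSize.compareAndSet(current, current + cellSize)) {
        added = true; // reserved atomically, even under concurrent writers
        break;
      }
    }
    if (activeDataSize.get() > inmemoryFlushSize) {
      flushRequests.incrementAndGet(); // stand-in for requesting an in-memory flush
    }
    return added;
  }

  int flushRequests() {
    return flushRequests.get();
  }
}
```

In this model a false return means the caller should retry against a fresh active segment once the flush has replaced it.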
-
tryFlushInMemoryAndCompactingAsync
Try to flush the currActive in memory and submit the background CompactingMemStore.InMemoryCompactionRunnable to RegionServicesForStores.getInMemoryCompactionPool(). Just one thread can do the actual flushing in memory.
- Parameters:
currActive - current active segment to be flushed in memory.
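
The "just one thread" guarantee is the kind of thing an AtomicBoolean guard provides. The sketch below shows that pattern under assumptions: SingleFlusherSketch and tryFlushAsync are hypothetical, the flag borrows the name of the inMemoryCompactionInProgress field listed above without claiming the same usage, and a plain ExecutorService stands in for the in-memory compaction pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical single-flusher guard; not HBase code. */
class SingleFlusherSketch {
  private final AtomicBoolean inMemoryCompactionInProgress = new AtomicBoolean(false);
  private final ExecutorService pool; // stand-in for the in-memory compaction pool

  SingleFlusherSketch(ExecutorService pool) {
    this.pool = pool;
  }

  /**
   * Submits the background task only if no other one is in progress.
   *
   * @return true if this caller won the right to flush
   */
  boolean tryFlushAsync(Runnable compaction) {
    // Only the caller that flips false -> true proceeds; everyone else backs off.
    if (!inMemoryCompactionInProgress.compareAndSet(false, true)) {
      return false;
    }
    try {
      pool.execute(() -> {
        try {
          compaction.run();
        } finally {
          inMemoryCompactionInProgress.set(false); // allow the next flush
        }
      });
    } catch (RejectedExecutionException e) {
      inMemoryCompactionInProgress.set(false); // never leave the flag stuck
      throw e;
    }
    return true;
  }
}
```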
-
flushInMemory
void flushInMemory()
-
flushInMemory
-
inMemoryCompaction
void inMemoryCompaction()
-
getLastSegment
-
getFamilyNameInBytes
-
getPool
-
stopCompaction
The request to cancel the asynchronous compaction task (caused by in-memory flush). The compaction may still happen if the request was sent too late. Non-blocking request.
-
pushActiveToPipeline
NOTE: When flushInMemory(MutableSegment) calls this method, the checkEmpty parameter is false. Because of concurrent writes, and because the cell size is added to currActive.getDataSize before the cell is actually added to currActive.cellSet, currActive.getDataSize may be unable to accommodate cellToAdd while currActive.cellSet is still empty, since pending writes have not yet added their cells. When snapshot() calls this method there are no pending writes, so checkEmpty can be true.
-
pushTailToSnapshot
-
pushPipelineToSnapshot
-
swapPipelineWithNull
-
pushToSnapshot
-
getRegionServices
-
isMemStoreFlushingInMemory
boolean isMemStoreFlushingInMemory()
-
getNextRow
- Parameters:
cell
- Find the row that comes after this one. If null, we return the first.
- Returns:
- Next row or null if none found.
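
The contract above (null argument returns the first row, otherwise the next greater row, or null when there is none) maps directly onto a sorted set. A minimal illustration, assuming String rows in place of Cell and a hypothetical NextRowSketch class:

```java
import java.util.NavigableSet;

/** Hypothetical illustration of the getNextRow contract on a sorted set of row keys. */
class NextRowSketch {
  /**
   * @param rows sorted row keys
   * @param row  the row to look after, or null to ask for the first row
   * @return the next row, or null if none is found
   */
  static String getNextRow(NavigableSet<String> rows, String row) {
    if (rows.isEmpty()) {
      return null;
    }
    // higher() returns the least element strictly greater than row, or null.
    return row == null ? rows.first() : rows.higher(row);
  }
}
```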
-
getInmemoryFlushSize
long getInmemoryFlushSize()
-
debug
-