Class RatioBasedCompactionPolicy

Direct Known Subclasses:
ExploringCompactionPolicy

The default algorithm for selecting files for compaction. Combines the compaction configuration and the provisional file selection that it's given to produce the list of suitable candidates for compaction.
  • Field Details

    • LOG

      private static final org.slf4j.Logger LOG
  • Constructor Details

  • Method Details

    • shouldPerformMajorCompaction

      public boolean shouldPerformMajorCompaction(Collection<HStoreFile> filesToCompact) throws IOException
      Specified by:
      shouldPerformMajorCompaction in class SortedCompactionPolicy
      Parameters:
      filesToCompact - Files to compact. Can be null.
      Returns:
      True if we should run a major compaction.
      Throws:
      IOException
    • createCompactionRequest

      protected CompactionRequestImpl createCompactionRequest(ArrayList<HStoreFile> candidateSelection, boolean tryingMajor, boolean mayUseOffPeak, boolean mayBeStuck) throws IOException
      Specified by:
      createCompactionRequest in class SortedCompactionPolicy
      Throws:
      IOException
    • applyCompactionPolicy

      protected ArrayList<HStoreFile> applyCompactionPolicy(ArrayList<HStoreFile> candidates, boolean mayUseOffPeak, boolean mayBeStuck) throws IOException
      Default minor compaction selection algorithm: choose CompactSelection from candidates.

      First exclude bulk-load files if indicated in configuration. Then start at the oldest file and stop at the first file that meets either compaction criterion:
        (1) it is a recently flushed, small file (i.e. <= minCompactSize), OR
        (2) its size is within compactRatio * sum(newer_files).
      Given normal skew, any newer files will also meet these criteria.

      Additional note: If fileSizes.size() >> maxFilesToCompact, we will recurse on compact(). Consider the oldest files first to avoid a situation where we always compact [end-threshold,end). In that situation, the last file becomes an aggregate of the previous compactions.

      Normal skew:

                older ----> newer (increasing seqID)
            _
           | |   _
           | |  | |   _
         --|-|- |-|- |-|---_-------_-------  minCompactSize
           | |  | |  | |  | |  _  | |
           | |  | |  | |  | | | | | |
           | |  | |  | |  | | | | | |

      Parameters:
      candidates - the candidate files before filtering
      Returns:
      filtered subset
      Throws:
      IOException
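
      The selection rule described for applyCompactionPolicy can be sketched as follows. This is a minimal illustration under stated assumptions, not the class's actual code: files are represented only by their sizes (ordered oldest to newest), sum(newer_files) is taken over all newer files rather than a window bounded by maxFilesToCompact, bulk-load exclusion is omitted, and the names RatioSelectionSketch, firstSelectedIndex, select and minFiles are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch only; not the HBase implementation.
final class RatioSelectionSketch {

  /**
   * Returns the index of the oldest file kept in the selection.
   * sizes must be ordered oldest to newest.
   */
  static int firstSelectedIndex(long[] sizes, double compactRatio,
                                long minCompactSize, int minFiles) {
    int n = sizes.length;
    // suffix[i] = total size of files i..n-1, so suffix[i + 1] is sum(newer_files) for file i.
    long[] suffix = new long[n + 1];
    for (int i = n - 1; i >= 0; i--) {
      suffix[i] = suffix[i + 1] + sizes[i];
    }
    int start = 0;
    // Skip a file while it is neither small nor within the ratio of its newer files,
    // as long as enough files would remain to form a compaction.
    while (start < n
        && n - start >= minFiles
        && sizes[start] > Math.max(minCompactSize, (long) (suffix[start + 1] * compactRatio))) {
      start++;
    }
    return start;
  }

  /** Returns the selected sizes, or an empty array if fewer than minFiles remain. */
  static long[] select(long[] sizes, double compactRatio, long minCompactSize, int minFiles) {
    int start = firstSelectedIndex(sizes, compactRatio, minCompactSize, minFiles);
    if (sizes.length - start < minFiles) {
      return new long[0];
    }
    return Arrays.copyOfRange(sizes, start, sizes.length);
  }

  public static void main(String[] args) {
    long[] sizes = {1000, 400, 50, 30, 20}; // oldest to newest
    System.out.println(Arrays.toString(select(sizes, 1.2, 10, 3))); // prints [50, 30, 20]
  }
}
```

      With a ratio of 1.2, the two oldest files are skipped because each is larger than 1.2 times the combined size of the files newer than it; the 50-byte file is the first one within the ratio, so it and everything newer are selected.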
    • needsCompaction

      public boolean needsCompaction(Collection<HStoreFile> storeFiles, List<HStoreFile> filesCompacting)
      Heuristic that decides whether to schedule a compaction request.
      Specified by:
      needsCompaction in class SortedCompactionPolicy
      Parameters:
      storeFiles - files in the store.
      filesCompacting - files being scheduled to compact.
      Returns:
      true to schedule a request.
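
      One plausible form of such a heuristic is sketched below: request a compaction once the number of store files not already being compacted reaches a configured minimum. This is an assumption made for illustration, not a statement of the exact check this class performs; NeedsCompactionSketch and the minFilesToCompact parameter are hypothetical names, and the element type is left generic so the sketch runs without HBase on the classpath.

```java
import java.util.Collection;
import java.util.List;

// Illustrative sketch only; the real method takes HStoreFile collections
// and reads its threshold from the compaction configuration.
final class NeedsCompactionSketch {

  static <T> boolean needsCompaction(Collection<T> storeFiles,
                                     List<T> filesCompacting,
                                     int minFilesToCompact) {
    // Files already scheduled for compaction are not available as new candidates.
    int numCandidates = storeFiles.size() - filesCompacting.size();
    return numCandidates >= minFilesToCompact;
  }
}
```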
    • setMinThreshold

      public void setMinThreshold(int minThreshold)
      Overwrites the minimum threshold for compaction.