Class ExploringCompactionPolicy

Direct Known Subclasses:
FIFOCompactionPolicy

Class to pick which files, if any, to compact together. This class searches all candidate selections and, if it gets stuck, chooses the smallest set of files to compact.
  • Field Details

    • LOG

      private static final org.slf4j.Logger LOG
  • Constructor Details

    • ExploringCompactionPolicy

      public ExploringCompactionPolicy(org.apache.hadoop.conf.Configuration conf, StoreConfigInformation storeConfigInfo)
      Constructor for ExploringCompactionPolicy.
      Parameters:
      conf - The configuration object
      storeConfigInfo - An object to provide info about the store.
  • Method Details

    • applyCompactionPolicy

      protected final ArrayList<HStoreFile> applyCompactionPolicy(ArrayList<HStoreFile> candidates, boolean mayUseOffPeak, boolean mightBeStuck) throws IOException
      Description copied from class: RatioBasedCompactionPolicy
      -- Default minor compaction selection algorithm: choose CompactSelection from candidates --

      First exclude bulk-load files if indicated in configuration. Start at the oldest file and stop when you find the first file that meets compaction criteria:
        (1) a recently-flushed, small file (i.e. <= minCompactSize) OR
        (2) within the compactRatio of sum(newer_files)
      Given normal skew, any newer files will also meet this criteria

      Additional Note: If fileSizes.size() >> maxFilesToCompact, we will recurse on compact(). Consider the oldest files first to avoid a situation where we always compact [end-threshold,end). Then, the last file becomes an aggregate of the previous compactions.

      normal skew:

              older ----> newer (increasing seqID)
          _
         | |   _
         | |  | |   _
       --|-|- |-|- |-|---_-------_-------  minCompactSize
         | |  | |  | |  | |  _  | |
         | |  | |  | |  | | | | | |
         | |  | |  | |  | | | | | |

      Overrides:
      applyCompactionPolicy in class RatioBasedCompactionPolicy
      Parameters:
      candidates - the pre-filtered candidate files
      Returns:
      filtered subset
      Throws:
      IOException
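      The inherited description above (the default ratio-based selection, not the exploring search this class performs) can be illustrated with a minimal sketch. It assumes file sizes are given as a plain array ordered oldest to newest; the class and method names are hypothetical, and the sketch ignores bulk-load exclusion and the minimum/maximum file counts.

```java
// Hypothetical illustration of the ratio-based rule described above.
// It works on raw sizes (oldest first), not on HStoreFile objects.
final class RatioSelectionSketch {

    // Returns the index of the first (oldest) file that meets the criteria;
    // files from that index to the end would form the selection.
    // Returns sizes.length when no file qualifies.
    static int firstSelectedIndex(long[] sizes, long minCompactSize, double compactRatio) {
        // suffixSum[i] holds the total size of files i..end.
        long[] suffixSum = new long[sizes.length + 1];
        for (int i = sizes.length - 1; i >= 0; i--) {
            suffixSum[i] = sizes[i] + suffixSum[i + 1];
        }
        int start = 0;
        // Skip a file while it is neither small (criterion 1)
        // nor within compactRatio of the sum of newer files (criterion 2).
        while (start < sizes.length
                && sizes[start] > minCompactSize
                && sizes[start] > suffixSum[start + 1] * compactRatio) {
            start++;
        }
        return start;
    }

    public static void main(String[] args) {
        long[] sizes = {1000, 100, 50, 40, 30};
        // The 1000-byte file is skipped: 1000 > 1.2 * (100 + 50 + 40 + 30).
        System.out.println(firstSelectedIndex(sizes, 10, 1.2)); // prints 1
    }
}
```

      With normal skew (older files larger), once one file qualifies the newer, smaller files behind it tend to qualify as well, which is why the sketch stops at the first match.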
    • applyCompactionPolicy

      public List<HStoreFile> applyCompactionPolicy(List<HStoreFile> candidates, boolean mightBeStuck, boolean mayUseOffPeak, int minFiles, int maxFiles)
    • selectCompactFiles

      public List<HStoreFile> selectCompactFiles(List<HStoreFile> candidates, int maxFiles, boolean isOffpeak)
      Selects at least one file from the candidates list to compact, by choosing files from the head of the list up to the index at which the accumulated length exceeds the max compaction size. This method supplements selectSimpleCompaction() and aims to ensure that at least one file can be selected for compaction, for cases such as L0 files, where all files need to be compacted as soon as possible.
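      As a rough sketch of the head-accumulation idea described here, the following example walks a plain array of file sizes and reports how many files from the head would be selected. The class name, the maxCompactSize parameter, and the fallback of taking at most maxFiles files when the threshold is never crossed are assumptions made for illustration, not the behavior of the actual method.

```java
// Hypothetical illustration of selecting files from the head of the list
// until the accumulated size exceeds a maximum compaction size.
final class HeadSelectionSketch {

    // Returns the number of files, counted from the head, that would be selected.
    static int countSelected(long[] sizes, int maxFiles, long maxCompactSize) {
        int limit = Math.min(sizes.length, maxFiles);
        long accumulated = 0L;
        for (int end = 0; end < limit; end++) {
            accumulated += sizes[end];
            if (accumulated > maxCompactSize) {
                // The file that crosses the threshold is still included.
                return end + 1;
            }
        }
        // Assumption: if the threshold is never crossed, take up to maxFiles files.
        return limit;
    }

    public static void main(String[] args) {
        // 5 + 5 + 5 = 15 exceeds 12, so the first three files are selected.
        System.out.println(countSelected(new long[] {5, 5, 5, 5}, 10, 12)); // prints 3
    }
}
```

      Because the file that crosses the threshold is included, a non-empty candidate list always yields at least one selected file in this sketch.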
    • isBetterSelection

      private boolean isBetterSelection(List<HStoreFile> bestSelection, long bestSize, List<HStoreFile> selection, long size, boolean mightBeStuck)
    • getTotalStoreSize

      private long getTotalStoreSize(List<HStoreFile> potentialMatchFiles)
      Find the total size of a list of store files.
      Parameters:
      potentialMatchFiles - StoreFile list.
      Returns:
      Sum of StoreFile.getReader().length() over all files in the list.
    • filesInRatio

      private boolean filesInRatio(List<HStoreFile> files, double currentRatio)
      Check that all files satisfy the constraint
       FileSize(i) <= ( Sum(0,N,FileSize(_)) - FileSize(i)) * Ratio.
       
      Parameters:
      files - List of store files to consider as a compaction candidate.
      currentRatio - The ratio to use.
      Returns:
      true if these files satisfy the ratio constraint, false otherwise.
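      The constraint above can be checked as in the following minimal sketch, which operates on raw file sizes rather than HStoreFile objects. The class name is hypothetical, and treating a list of fewer than two files as trivially in ratio is an assumption of this sketch.

```java
// Hypothetical illustration of the constraint
//   FileSize(i) <= (Sum(0,N,FileSize(_)) - FileSize(i)) * Ratio
// applied to every file in a candidate selection.
final class FilesInRatioSketch {

    static boolean filesInRatio(long[] sizes, double currentRatio) {
        // Assumption: fewer than two files are considered in ratio.
        if (sizes.length < 2) {
            return true;
        }
        long total = 0L;
        for (long size : sizes) {
            total += size;
        }
        for (long size : sizes) {
            long sumOfOthers = total - size;
            if (size > sumOfOthers * currentRatio) {
                return false; // this file is too large relative to the rest
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(filesInRatio(new long[] {10, 10, 10}, 1.2));  // prints true
        System.out.println(filesInRatio(new long[] {100, 10, 10}, 1.2)); // prints false
    }
}
```

      In the second call the 100-byte file fails the check because 100 > (10 + 10) * 1.2 = 24, so the selection as a whole is rejected.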