Class MobUtils

java.lang.Object
org.apache.hadoop.hbase.mob.MobUtils

@Private public final class MobUtils extends Object
The mob utilities
  • Field Details

  • Constructor Details

    • MobUtils

      private MobUtils()
      Private constructor to keep this class from being instantiated.
  • Method Details

    • formatDate

      public static String formatDate(Date date)
      Formats a date to a string.
      Parameters:
      date - The date.
      Returns:
      The date as a string in yyyymmdd format (year, month, day).
    • parseDate

      public static Date parseDate(String dateString) throws ParseException
      Parses the string to a date.
      Parameters:
      dateString - The date as a string in yyyymmdd format.
      Returns:
      A date.
      Throws:
      ParseException
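
      The round trip between formatDate and parseDate can be sketched with the JDK alone. This is a minimal illustration, not the HBase implementation: the class name MobDateSketch and the roundTrip helper are hypothetical, and the pattern "yyyyMMdd" is an assumption about what the documented yyyymmdd layout maps to in SimpleDateFormat.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-in for the formatDate/parseDate pair; not the HBase class.
final class MobDateSketch {
  // Assumption: the documented "yyyymmdd" layout corresponds to this pattern.
  private static final String PATTERN = "yyyyMMdd";

  // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
  static String formatDate(Date date) {
    return new SimpleDateFormat(PATTERN).format(date);
  }

  static Date parseDate(String dateString) throws ParseException {
    SimpleDateFormat format = new SimpleDateFormat(PATTERN);
    format.setLenient(false); // reject out-of-range fields such as month 13
    return format.parse(dateString);
  }

  // Convenience helper for callers that prefer an unchecked exception.
  static String roundTrip(String dateString) {
    try {
      return formatDate(parseDate(dateString));
    } catch (ParseException e) {
      throw new IllegalArgumentException("Not a yyyymmdd date: " + dateString, e);
    }
  }

  public static void main(String[] args) throws ParseException {
    Date parsed = parseDate("20240110");
    System.out.println(formatDate(parsed));
  }
}
```

      Parsing and formatting use the same default time zone here, so a valid date string survives the round trip unchanged.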
    • isMobReferenceCell

      public static boolean isMobReferenceCell(Cell cell)
      Whether the current cell is a mob reference cell.
      Parameters:
      cell - The current cell.
      Returns:
      True if the cell has a mob reference tag, false if it doesn't.
    • getTableNameTag

      private static Optional<Tag> getTableNameTag(Cell cell)
      Gets the table name tag.
      Parameters:
      cell - The current cell.
      Returns:
      The table name tag.
    • getTableNameString

      public static Optional<String> getTableNameString(Cell cell)
      Gets the table name from when this cell was written into a mob hfile as a string.
      Parameters:
      cell - The cell to extract the tag from.
      Returns:
      The table name as a string. Empty if the tag is not found.
    • getTableName

      public static Optional<TableName> getTableName(Cell cell)
      Gets the table name from when this cell was written into a mob hfile as a TableName.
      Parameters:
      cell - The cell to extract the tag from.
      Returns:
      The name of the table as a TableName. Empty if the tag is not found.
    • hasMobReferenceTag

      public static boolean hasMobReferenceTag(List<Tag> tags)
      Whether the tag list has a mob reference tag.
      Parameters:
      tags - The tag list.
      Returns:
      True if the list has a mob reference tag, false if it doesn't.
    • isRawMobScan

      public static boolean isRawMobScan(Scan scan)
      Indicates whether it's a raw scan. The information is set in the attribute "hbase.mob.scan.raw" of the scan. For a mob cell, in a normal scan the scanner retrieves the mob cell from the mob file. In a raw scan, the scanner directly returns the cell stored in HBase without retrieving the one in the mob file.
      Parameters:
      scan - The current scan.
      Returns:
      True if it's a raw scan.
    • isRefOnlyScan

      public static boolean isRefOnlyScan(Scan scan)
      Indicates whether it's a reference only scan. The information is set in the attribute "hbase.mob.scan.ref.only" of scan. If it's a ref only scan, only the cells with ref tag are returned.
      Parameters:
      scan - The current scan.
      Returns:
      True if it's a ref only scan.
    • isCacheMobBlocks

      public static boolean isCacheMobBlocks(Scan scan)
      Indicates whether the scan contains the information of caching blocks. The information is set in the attribute "hbase.mob.cache.blocks" of scan.
      Parameters:
      scan - The current scan.
      Returns:
      True when the Scan attribute specifies to cache the MOB blocks.
    • setCacheMobBlocks

      public static void setCacheMobBlocks(Scan scan, boolean cacheBlocks)
      Sets the attribute of caching blocks in the scan.
      Parameters:
      scan - The current scan.
      cacheBlocks - If true, the caching attribute is set on the scan and the scanner for this scan caches blocks. If false, the scanner doesn't cache blocks for this scan.
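
      The scan-attribute methods above all follow one pattern: a boolean flag is stored under a well-known attribute name and read back later, with a missing attribute treated as false. The sketch below models that pattern with a plain map. It is an illustration only: ScanAttributeSketch, setFlag and isFlagSet are hypothetical names, and the single-byte encoding of the boolean is an assumption rather than the encoding HBase uses.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of boolean flags carried as named byte[] attributes.
final class ScanAttributeSketch {
  // Attribute names quoted from the method descriptions above.
  static final String MOB_SCAN_RAW = "hbase.mob.scan.raw";
  static final String MOB_SCAN_REF_ONLY = "hbase.mob.scan.ref.only";
  static final String MOB_CACHE_BLOCKS = "hbase.mob.cache.blocks";

  private final Map<String, byte[]> attributes = new HashMap<>();

  // Assumption: a flag is encoded as one byte, non-zero meaning true.
  void setFlag(String name, boolean value) {
    attributes.put(name, new byte[] { (byte) (value ? 1 : 0) });
  }

  // A missing or malformed attribute is treated as "not set".
  boolean isFlagSet(String name) {
    byte[] raw = attributes.get(name);
    return raw != null && raw.length == 1 && raw[0] != 0;
  }

  public static void main(String[] args) {
    ScanAttributeSketch scan = new ScanAttributeSketch();
    scan.setFlag(MOB_SCAN_RAW, true);
    System.out.println(scan.isFlagSet(MOB_SCAN_RAW));
    System.out.println(scan.isFlagSet(MOB_SCAN_REF_ONLY));
  }
}
```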
    • cleanExpiredMobFiles

      public static void cleanExpiredMobFiles(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, TableName tableName, ColumnFamilyDescriptor columnDescriptor, CacheConfig cacheConfig, long current) throws IOException
      Cleans the expired mob files. A file is cleaned when its creation date is older than (current - columnFamily.ttl) and the minVersions of that column family is 0.
      Parameters:
      fs - The current file system.
      conf - The current configuration.
      tableName - The current table name.
      columnDescriptor - The descriptor of the current column family.
      cacheConfig - The cacheConfig that disables the block cache.
      current - The current time.
      Throws:
      IOException
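
      The expiry rule stated above (creation date older than current minus the TTL, with minVersions equal to 0) can be written out as a small predicate. This is a sketch under stated assumptions, not the HBase code: MobExpirySketch and isExpired are hypothetical names, the TTL is assumed to be in seconds, the current time in epoch milliseconds, and the file date is interpreted as midnight UTC. The real implementation may round or compare differently.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical predicate mirroring the documented expiry rule.
final class MobExpirySketch {
  static boolean isExpired(long ttlSeconds, int minVersions, long currentMillis, String fileDate) {
    // Files are only cleaned when the column family keeps no minimum versions.
    if (minVersions > 0) {
      return false;
    }
    // BASIC_ISO_DATE parses eight-digit dates such as "20240101".
    long fileMillis = LocalDate.parse(fileDate, DateTimeFormatter.BASIC_ISO_DATE)
        .atStartOfDay(ZoneOffset.UTC) // assumption: dates are taken as UTC midnight
        .toInstant()
        .toEpochMilli();
    long cutoffMillis = currentMillis - ttlSeconds * 1000L;
    return fileMillis < cutoffMillis;
  }

  public static void main(String[] args) {
    long now = LocalDate.of(2024, 3, 1).atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    long thirtyDays = 30L * 24 * 60 * 60;
    System.out.println(isExpired(thirtyDays, 0, now, "20240101"));
    System.out.println(isExpired(thirtyDays, 0, now, "20240215"));
  }
}
```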
    • getMobHome

      public static org.apache.hadoop.fs.Path getMobHome(org.apache.hadoop.conf.Configuration conf)
      Gets the root dir of the mob files. It's {HBASE_DIR}/mobdir.
      Parameters:
      conf - The current configuration.
      Returns:
      the root dir of the mob file.
    • getMobHome

      public static org.apache.hadoop.fs.Path getMobHome(org.apache.hadoop.fs.Path rootDir)
      Gets the root dir of the mob files under the qualified HBase root dir. It's {rootDir}/mobdir.
      Parameters:
      rootDir - The qualified path of HBase root directory.
      Returns:
      The root dir of the mob file.
    • getQualifiedMobRootDir

      public static org.apache.hadoop.fs.Path getQualifiedMobRootDir(org.apache.hadoop.conf.Configuration conf) throws IOException
      Gets the qualified root dir of the mob files.
      Parameters:
      conf - The current configuration.
      Returns:
      The qualified root dir.
      Throws:
      IOException
    • getMobTableDir

      public static org.apache.hadoop.fs.Path getMobTableDir(org.apache.hadoop.fs.Path rootDir, TableName tableName)
      Gets the table dir of the mob files under the qualified HBase root dir. It's {rootDir}/mobdir/data/{namespace}/{tableName}.
      Parameters:
      rootDir - The qualified path of HBase root directory.
      tableName - The name of table.
      Returns:
      The table dir of the mob file.
    • getMobRegionPath

      public static org.apache.hadoop.fs.Path getMobRegionPath(org.apache.hadoop.conf.Configuration conf, TableName tableName)
      Gets the region dir of the mob files. It's {HBASE_DIR}/mobdir/data/{namespace}/{tableName}/{regionEncodedName}.
      Parameters:
      conf - The current configuration.
      tableName - The current table name.
      Returns:
      The region dir of the mob files.
    • getMobRegionPath

      public static org.apache.hadoop.fs.Path getMobRegionPath(org.apache.hadoop.fs.Path rootDir, TableName tableName)
      Gets the region dir of the mob files under the specified root dir. It's {rootDir}/mobdir/data/{namespace}/{tableName}/{regionEncodedName}.
      Parameters:
      rootDir - The qualified path of HBase root directory.
      tableName - The current table name.
      Returns:
      The region dir of the mob files.
    • getMobFamilyPath

      public static org.apache.hadoop.fs.Path getMobFamilyPath(org.apache.hadoop.conf.Configuration conf, TableName tableName, String familyName)
      Gets the family dir of the mob files. It's {HBASE_DIR}/mobdir/data/{namespace}/{tableName}/{regionEncodedName}/{columnFamilyName}.
      Parameters:
      conf - The current configuration.
      tableName - The current table name.
      familyName - The current family name.
      Returns:
      The family dir of the mob files.
    • getMobFamilyPath

      public static org.apache.hadoop.fs.Path getMobFamilyPath(org.apache.hadoop.fs.Path regionPath, String familyName)
      Gets the family dir of the mob files under the given mob region path. It's {HBASE_DIR}/mobdir/data/{namespace}/{tableName}/{regionEncodedName}/{columnFamilyName}.
      Parameters:
      regionPath - The path of the mob region, which is a dummy one.
      familyName - The current family name.
      Returns:
      The family dir of the mob files.
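
      The directory methods above nest one level at a time: mob home, then table dir, then region dir, then family dir. The sketch below composes that layout with plain strings to make the nesting explicit. It is illustrative only: MobPathSketch and its methods are hypothetical, it uses strings rather than org.apache.hadoop.fs.Path, it assumes the family dir sits directly under the region dir, and regionEncodedName is passed in as a placeholder because this page does not say how the dummy region's name is derived.

```java
// Hypothetical string-based model of the mob directory layout.
final class MobPathSketch {
  static final String MOB_DIR = "mobdir";

  // {rootDir}/mobdir
  static String mobHome(String rootDir) {
    return rootDir + "/" + MOB_DIR;
  }

  // {rootDir}/mobdir/data/{namespace}/{tableName}
  static String mobTableDir(String rootDir, String namespace, String tableName) {
    return mobHome(rootDir) + "/data/" + namespace + "/" + tableName;
  }

  // {rootDir}/mobdir/data/{namespace}/{tableName}/{regionEncodedName}
  static String mobRegionPath(String rootDir, String namespace, String tableName,
      String regionEncodedName) {
    return mobTableDir(rootDir, namespace, tableName) + "/" + regionEncodedName;
  }

  // {regionPath}/{columnFamilyName}
  static String mobFamilyPath(String regionPath, String familyName) {
    return regionPath + "/" + familyName;
  }

  public static void main(String[] args) {
    // "REGION" is a placeholder, not a real encoded region name.
    String region = mobRegionPath("/hbase", "default", "t1", "REGION");
    System.out.println(mobFamilyPath(region, "cf"));
  }
}
```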
    • getMobRegionInfo

      public static RegionInfo getMobRegionInfo(TableName tableName)
      Gets the RegionInfo of the mob files. This is a dummy region, because the mob files are not saved in a region in HBase. It is for internal use only.
      Returns:
      A dummy mob region info.
    • isMobRegionInfo

      public static boolean isMobRegionInfo(RegionInfo regionInfo)
      Indicates whether the current RegionInfo is a mob one.
      Parameters:
      regionInfo - The current RegionInfo.
      Returns:
      True if the current RegionInfo is a mob one.
    • isMobRegionName

      public static boolean isMobRegionName(TableName tableName, byte[] regionName)
      Indicates whether the current region name follows the pattern of a mob region name.
      Parameters:
      tableName - The current table name.
      regionName - The current region name.
      Returns:
      True if the current region name follows the pattern of a mob region name.
    • removeMobFiles

      public static boolean removeMobFiles(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, TableName tableName, org.apache.hadoop.fs.Path tableDir, byte[] family, Collection<HStoreFile> storeFiles)
      Archives the mob files.
      Parameters:
      conf - The current configuration.
      fs - The current file system.
      tableName - The table name.
      tableDir - The table directory.
      family - The name of the column family.
      storeFiles - The files to be removed (archived).
    • createMobRefCell

      public static Cell createMobRefCell(Cell cell, byte[] fileName, Tag tableNameTag)
      Creates a mob reference KeyValue. The value of the mob reference KeyValue is mobCellValueSize + mobFileName.
      Parameters:
      cell - The original Cell.
      fileName - The mob file name where the mob reference KeyValue is written.
      tableNameTag - The tag of the current table name. It's very important in cloning the snapshot.
      Returns:
      The mob reference KeyValue.
    • createMobRefCell

      public static Cell createMobRefCell(Cell cell, byte[] fileName, byte[] refCellTags)
    • createWriter

      public static StoreFileWriter createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, ColumnFamilyDescriptor family, String date, org.apache.hadoop.fs.Path basePath, long maxKeyCount, Compression.Algorithm compression, String startKey, CacheConfig cacheConfig, Encryption.Context cryptoContext, boolean isCompaction, String regionName) throws IOException
      Creates a writer for the mob file in temp directory.
      Parameters:
      conf - The current configuration.
      fs - The current file system.
      family - The descriptor of the current column family.
      date - The date string, in yyyymmdd format.
      basePath - The base path for a temp directory.
      maxKeyCount - The key count.
      compression - The compression algorithm.
      startKey - The hex string of the start key.
      cacheConfig - The current cache config.
      cryptoContext - The encryption context.
      isCompaction - If the writer is used in compaction.
      Returns:
      The writer for the mob file.
      Throws:
      IOException
    • createWriter

      public static StoreFileWriter createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, ColumnFamilyDescriptor family, MobFileName mobFileName, org.apache.hadoop.fs.Path basePath, long maxKeyCount, Compression.Algorithm compression, CacheConfig cacheConfig, Encryption.Context cryptoContext, boolean isCompaction) throws IOException
      Creates a writer for the mob file in temp directory.
      Parameters:
      conf - The current configuration.
      fs - The current file system.
      family - The descriptor of the current column family.
      mobFileName - The mob file name.
      basePath - The base path for a temp directory.
      maxKeyCount - The key count.
      compression - The compression algorithm.
      cacheConfig - The current cache config.
      cryptoContext - The encryption context.
      isCompaction - If the writer is used in compaction.
      Returns:
      The writer for the mob file.
      Throws:
      IOException
    • createWriter

      public static StoreFileWriter createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, ColumnFamilyDescriptor family, org.apache.hadoop.fs.Path path, long maxKeyCount, Compression.Algorithm compression, CacheConfig cacheConfig, Encryption.Context cryptoContext, ChecksumType checksumType, int bytesPerChecksum, int blocksize, BloomType bloomType, boolean isCompaction) throws IOException
      Creates a writer for the mob file in temp directory.
      Parameters:
      conf - The current configuration.
      fs - The current file system.
      family - The descriptor of the current column family.
      path - The path for a temp directory.
      maxKeyCount - The key count.
      compression - The compression algorithm.
      cacheConfig - The current cache config.
      cryptoContext - The encryption context.
      checksumType - The checksum type.
      bytesPerChecksum - The bytes per checksum.
      blocksize - The HFile block size.
      bloomType - The bloom filter type.
      isCompaction - If the writer is used in compaction.
      Returns:
      The writer for the mob file.
      Throws:
      IOException
    • createWriter

      public static StoreFileWriter createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, ColumnFamilyDescriptor family, org.apache.hadoop.fs.Path path, long maxKeyCount, Compression.Algorithm compression, CacheConfig cacheConfig, Encryption.Context cryptoContext, ChecksumType checksumType, int bytesPerChecksum, int blocksize, BloomType bloomType, boolean isCompaction, Consumer<org.apache.hadoop.fs.Path> writerCreationTracker) throws IOException
      Creates a writer for the mob file in temp directory.
      Parameters:
      conf - The current configuration.
      fs - The current file system.
      family - The descriptor of the current column family.
      path - The path for a temp directory.
      maxKeyCount - The key count.
      compression - The compression algorithm.
      cacheConfig - The current cache config.
      cryptoContext - The encryption context.
      checksumType - The checksum type.
      bytesPerChecksum - The bytes per checksum.
      blocksize - The HFile block size.
      bloomType - The bloom filter type.
      isCompaction - If the writer is used in compaction.
      writerCreationTracker - to track the current writer in the store
      Returns:
      The writer for the mob file.
      Throws:
      IOException
    • hasValidMobRefCellValue

      public static boolean hasValidMobRefCellValue(Cell cell)
      Indicates whether the current mob ref cell has a valid value. A mob ref cell has a mob reference tag. The value of a mob ref cell consists of two parts, real mob value length and mob file name. The real mob value length takes 4 bytes. The remaining part is the mob file name.
      Parameters:
      cell - The mob ref cell.
      Returns:
      True if the cell has a valid value.
    • getMobValueLength

      public static int getMobValueLength(Cell cell)
      Gets the mob value length from the mob ref cell. A mob ref cell has a mob reference tag. The value of a mob ref cell consists of two parts, real mob value length and mob file name. The real mob value length takes 4 bytes. The remaining part is the mob file name.
      Parameters:
      cell - The mob ref cell.
      Returns:
      The real mob value length.
    • getMobFileName

      public static String getMobFileName(Cell cell)
      Gets the mob file name from the mob ref cell. A mob ref cell has a mob reference tag. The value of a mob ref cell consists of two parts, real mob value length and mob file name. The real mob value length takes 4 bytes. The remaining part is the mob file name.
      Parameters:
      cell - The mob ref cell.
      Returns:
      The mob file name.
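
      The value layout described above (a 4-byte mob value length followed by the mob file name) can be illustrated with a small encoder and decoder. This is a sketch, not the HBase code: MobRefValueSketch and its methods are hypothetical, the length is assumed to be a big-endian int, the file name is assumed to be UTF-8 text, and "valid" is assumed to mean only that the value is longer than the 4-byte prefix.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical model of a mob ref cell value: [4-byte length][file name bytes].
final class MobRefValueSketch {
  private static final int LENGTH_BYTES = Integer.BYTES; // 4

  static byte[] encode(int mobValueLength, String mobFileName) {
    byte[] name = mobFileName.getBytes(StandardCharsets.UTF_8);
    // ByteBuffer writes ints in big-endian order by default (an assumption here).
    return ByteBuffer.allocate(LENGTH_BYTES + name.length)
        .putInt(mobValueLength)
        .put(name)
        .array();
  }

  // Assumption: a value is valid when it holds the prefix plus a non-empty name.
  static boolean hasValidValue(byte[] value) {
    return value != null && value.length > LENGTH_BYTES;
  }

  static int mobValueLength(byte[] value) {
    return ByteBuffer.wrap(value).getInt();
  }

  static String mobFileName(byte[] value) {
    return new String(value, LENGTH_BYTES, value.length - LENGTH_BYTES, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // "d41d8cd9" is a made-up file name used only for illustration.
    byte[] value = encode(1048576, "d41d8cd9");
    System.out.println(mobValueLength(value) + " " + mobFileName(value));
  }
}
```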
    • hasMobColumns

      public static boolean hasMobColumns(TableDescriptor htd)
      Checks whether this table has mob-enabled columns.
      Parameters:
      htd - The current table descriptor.
      Returns:
      Whether this table has mob-enabled columns.
    • getMobColumnFamilies

      Gets the list of mob column families, if any exist.
      Parameters:
      htd - table descriptor
      Returns:
      list of Mob column families
    • isReadEmptyValueOnMobCellMiss

      public static boolean isReadEmptyValueOnMobCellMiss(Scan scan)
      Indicates whether to return a null value when the mob file is missing or corrupt. The information is set in the attribute "empty.value.on.mobcell.miss" of the scan.
      Parameters:
      scan - The current scan.
      Returns:
      True if the readEmptyValueOnMobCellMiss is enabled.
    • isMobFileExpired

      public static boolean isMobFileExpired(ColumnFamilyDescriptor column, long current, String fileDate)
      Checks if the mob file is expired.
      Parameters:
      column - The descriptor of the current column family.
      current - The current time.
      fileDate - The date string parsed from the mob file name.
      Returns:
      True if the mob file is expired.
    • serializeMobFileRefs

      public static byte[] serializeMobFileRefs(org.apache.hbase.thirdparty.com.google.common.collect.SetMultimap<TableName,String> mobRefSet)
      Serializes a set of referenced mob hfiles.
      Parameters:
      mobRefSet - The set to serialize. May be null.
      Returns:
      A byte array to put into, e.g., store file metadata. Will not be null.
    • deserializeMobFileRefs

      public static org.apache.hbase.thirdparty.com.google.common.collect.ImmutableSetMultimap.Builder<TableName,String> deserializeMobFileRefs(byte[] bytes) throws IllegalStateException
      Deserializes the set of referenced mob hfiles from store file metadata.
      Parameters:
      bytes - Compatibly serialized data. Cannot be null.
      Returns:
      A SetMultimap builder mapping each original table to its hfile names. Empty if there are no values.
      Throws:
      IllegalStateException - if there are values but no table name