Class RestoreSnapshotHelper

java.lang.Object
org.apache.hadoop.hbase.snapshot.RestoreSnapshotHelper

@Private public class RestoreSnapshotHelper extends Object
Helper to Restore/Clone a Snapshot

The helper assumes that the table already exists; calling restore() replaces the content of the table with the content present in the snapshot.

Clone from Snapshot: If the target table is empty, the restore operation is just a "clone operation", where the only operations are:

  • for each region in the snapshot create a new region (note that the region will have a different name, since the encoding contains the table name)
  • for each file in the region create a new HFileLink to point to the original file.
  • restore the logs, if any

Restore from Snapshot:

  • for each region in the table verify which are available in the snapshot and which are not
    • if the region is not present in the snapshot, remove it.
    • if the region is present in the snapshot
      • for each file in the table region verify which are available in the snapshot
        • if the hfile is not present in the snapshot, remove it
        • if the hfile is present, keep it (nothing to do)
      • for each file in the snapshot region but not in the table
        • create a new HFileLink that points to the original file
  • for each region in the snapshot not present in the current table state
    • create a new region and for each file in the region create a new HFileLink (This is the same as the clone operation)
  • restore the logs, if any
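
The region-level decisions listed above amount to a three-way set comparison. The following is a minimal sketch of that classification only, not the helper's actual code: the class name `RestorePlanSketch` and the use of plain strings as region identifiers are assumptions made for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: classifies regions the way the restore steps describe. */
final class RestorePlanSketch {
    final Set<String> toRemove = new TreeSet<>();   // in the table, not in the snapshot
    final Set<String> toRestore = new TreeSet<>();  // in both the table and the snapshot
    final Set<String> toClone = new TreeSet<>();    // in the snapshot, not in the table

    /** Hypothetical helper: region identifiers are plain strings here. */
    static RestorePlanSketch plan(Set<String> tableRegions, Set<String> snapshotRegions) {
        RestorePlanSketch p = new RestorePlanSketch();
        for (String region : tableRegions) {
            if (snapshotRegions.contains(region)) {
                p.toRestore.add(region);
            } else {
                p.toRemove.add(region);
            }
        }
        for (String region : snapshotRegions) {
            if (!tableRegions.contains(region)) {
                p.toClone.add(region);
            }
        }
        return p;
    }

    public static void main(String[] args) {
        RestorePlanSketch p = plan(Set.of("r1", "r2"), Set.of("r2", "r3"));
        System.out.println("remove=" + p.toRemove
            + " restore=" + p.toRestore + " clone=" + p.toClone);
    }
}
```

With an empty target table every snapshot region ends up in `toClone`, which corresponds to the pure clone case described earlier.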
  • Method Details

    • restoreHdfsRegions

      Restore the on-disk table to a specified snapshot state.
      Returns:
      the set of regions touched by the restore operation
      Throws:
      IOException
    • restoreHdfsRegions

      Throws:
      IOException
    • removeHdfsRegions

      private void removeHdfsRegions(ThreadPoolExecutor exec, List<RegionInfo> regions) throws IOException
      Remove specified regions from the file-system, using the archiver.
      Throws:
      IOException
    • restoreHdfsRegions

      private void restoreHdfsRegions(ThreadPoolExecutor exec, Map<String,org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest> regionManifests, List<RegionInfo> regions) throws IOException
      Restore specified regions by restoring content to the snapshot state.
      Throws:
      IOException
    • restoreHdfsMobRegions

      private void restoreHdfsMobRegions(ThreadPoolExecutor exec, Map<String,org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest> regionManifests, List<RegionInfo> regions) throws IOException
      Restore specified mob regions by restoring content to the snapshot state.
      Throws:
      IOException
    • getRegionHFileReferences

      private Map<String,List<org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest.StoreFile>> getRegionHFileReferences(org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest manifest)
    • restoreRegion

      private void restoreRegion(RegionInfo regionInfo, org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest regionManifest) throws IOException
      Restore region by removing files not in the snapshot and adding the missing ones from the snapshot.
      Throws:
      IOException
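
As a rough illustration of the per-region step described above, the sketch below computes which files of one column family would be removed and which would be linked. It assumes both sets already use the same file naming; the class and method names are hypothetical and are not part of RestoreSnapshotHelper.

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: file-level difference for a single column family. */
final class FamilyRestoreSketch {
    /** Files present in the table family directory but absent from the snapshot. */
    static Set<String> filesToRemove(Set<String> tableFiles, Set<String> snapshotFiles) {
        Set<String> result = new TreeSet<>(tableFiles);
        result.removeAll(snapshotFiles);
        return result;
    }

    /** Files present in the snapshot but missing from the table family directory. */
    static Set<String> filesToLink(Set<String> tableFiles, Set<String> snapshotFiles) {
        Set<String> result = new TreeSet<>(snapshotFiles);
        result.removeAll(tableFiles);
        return result;
    }

    public static void main(String[] args) {
        Set<String> table = Set.of("a", "b");
        Set<String> snapshot = Set.of("b", "c");
        System.out.println("remove=" + filesToRemove(table, snapshot));
        System.out.println("link=" + filesToLink(table, snapshot));
    }
}
```

Files that appear in both sets are left untouched, matching the "keep it (nothing to do)" case.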
    • restoreMobRegion

      private void restoreMobRegion(RegionInfo regionInfo, org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest regionManifest) throws IOException
      Restore mob region by removing files not in the snapshot and adding the missing ones from the snapshot.
      Throws:
      IOException
    • restoreRegion

      private void restoreRegion(RegionInfo regionInfo, org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest regionManifest, org.apache.hadoop.fs.Path regionDir) throws IOException
      Restore region by removing files not in the snapshot and adding the missing ones from the snapshot.
      Throws:
      IOException
    • getTableRegionFamilyFiles

      private Set<String> getTableRegionFamilyFiles(org.apache.hadoop.fs.Path familyDir) throws IOException
      Returns the set of files in the specified family directory.
      Throws:
      IOException
    • cloneHdfsRegions

      private RegionInfo[] cloneHdfsRegions(ThreadPoolExecutor exec, Map<String,org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest> regionManifests, List<RegionInfo> regions) throws IOException
      Clone the specified regions. For each region, create a new region and an HFileLink for each hfile.
      Throws:
      IOException
    • cloneHdfsMobRegion

      private void cloneHdfsMobRegion(Map<String,org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest> regionManifests, RegionInfo region) throws IOException
      Clone the mob region. For the region, create a new region and an HFileLink for each hfile.
      Throws:
      IOException
    • cloneRegion

      private void cloneRegion(RegionInfo newRegionInfo, org.apache.hadoop.fs.Path regionDir, RegionInfo snapshotRegionInfo, org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest manifest) throws IOException
      Clone region directory content from the snapshot info. Each region is encoded with the table name, so the cloned region will have a different region name. Instead of copying the hfiles, an HFileLink is created for each one.
      Parameters:
      regionDir - path of the cloned region directory
      Throws:
      IOException
    • cloneRegion

      private void cloneRegion(HRegion region, RegionInfo snapshotRegionInfo, org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest manifest) throws IOException
      Clone region directory content from the snapshot info. Each region is encoded with the table name, so the cloned region will have a different region name. Instead of copying the hfiles, an HFileLink is created for each one.
      Parameters:
      region - the cloned HRegion
      Throws:
      IOException
    • restoreStoreFile

      private String restoreStoreFile(org.apache.hadoop.fs.Path familyDir, RegionInfo regionInfo, org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest.StoreFile storeFile, boolean createBackRef) throws IOException
      Create a new HFileLink to reference the store file.

      The store file in the snapshot can be a simple hfile, an HFileLink or a reference.

      • hfile: abc -> table=region-abc
      • reference: abc.1234 -> table=region-abc.1234
      • hfilelink: table=region-hfile -> table=region-hfile
      Parameters:
      familyDir - destination directory for the store file
      regionInfo - destination region info for the table
      createBackRef - whether a back reference should be created. Defaults to true.
      storeFile - store file name (can be a Reference, HFileLink or simple HFile)
      Throws:
      IOException
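
The three name mappings listed above can be sketched as a single string transformation. This is a simplified illustration that follows the listed mappings literally: the class and method names are hypothetical, and detecting an existing link by the presence of `=` is an assumption that ignores details such as namespaced table names.

```java
/** Illustrative only: mirrors the hfile / reference / hfilelink name mapping. */
final class LinkNameSketch {
    /** Builds a name of the form "table=region-hfile". */
    static String linkName(String table, String region, String fileName) {
        return table + "=" + region + "-" + fileName;
    }

    /**
     * Returns the name the restored store file would get under the listed mapping.
     * Assumption: a name containing '=' is already a link and is kept as-is.
     */
    static String restoredName(String table, String region, String storeFileName) {
        if (storeFileName.indexOf('=') >= 0) {
            return storeFileName;
        }
        // Plain hfiles ("abc") and references ("abc.1234") get the same prefix.
        return linkName(table, region, storeFileName);
    }

    public static void main(String[] args) {
        System.out.println(restoredName("table", "region", "abc"));
        System.out.println(restoredName("table", "region", "abc.1234"));
        System.out.println(restoredName("table", "region", "table=region-hfile"));
    }
}
```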
    • restoreReferenceFile

      private String restoreReferenceFile(org.apache.hadoop.fs.Path familyDir, RegionInfo regionInfo, org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotRegionManifest.StoreFile storeFile) throws IOException
      Create a new Reference as copy of the source one.

       The source table looks like:
          1234/abc      (original file)
          5678/abc.1234 (reference file)
      
       After the clone operation, the table looks like:
         wxyz/table=1234-abc
         stuv/table=1234-abc.wxyz
      
       NOTE that the region name in the clone changes (md5 of regioninfo)
       and the reference should reflect that change.
       
      Parameters:
      familyDir - destination directory for the store file
      regionInfo - destination region info for the table
      storeFile - reference file name
      Throws:
      IOException
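
The renaming in the example above can be sketched as follows. The sketch assumes a reference file name has the form `hfile.parentRegion` and that a mapping from snapshot region names to cloned region names is available; the class name, method name and map parameter are hypothetical.

```java
import java.util.Map;

/** Illustrative only: derives the cloned reference name from the example layout. */
final class ReferenceNameSketch {
    /**
     * @param table        name of the snapshot table (hypothetical plain string)
     * @param refName      reference file name, e.g. "abc.1234"
     * @param clonedRegion maps a snapshot region name to its cloned region name
     */
    static String clonedReferenceName(String table, String refName,
                                      Map<String, String> clonedRegion) {
        int dot = refName.lastIndexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("not a reference file name: " + refName);
        }
        String hfile = refName.substring(0, dot);
        String parentRegion = refName.substring(dot + 1);
        String newParent = clonedRegion.get(parentRegion);
        if (newParent == null) {
            throw new IllegalStateException("no cloned region for " + parentRegion);
        }
        // The link part still names the original region; the suffix names the clone.
        return table + "=" + parentRegion + "-" + hfile + "." + newParent;
    }

    public static void main(String[] args) {
        System.out.println(
            clonedReferenceName("table", "abc.1234", Map.of("1234", "wxyz")));
    }
}
```

For the example layout, `abc.1234` with region `1234` cloned as `wxyz` yields `table=1234-abc.wxyz`.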
    • cloneRegionInfo

      public RegionInfo cloneRegionInfo(RegionInfo snapshotRegionInfo)
      Create a new RegionInfo from the snapshot region info. Keep the same startKey, endKey, regionId and split information but change the table name.
      Parameters:
      snapshotRegionInfo - Info for region to clone.
      Returns:
      the new RegionInfo instance
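
To illustrate why a clone keeps its key range but gets a different region name, here is a small self-contained sketch. It does not use the HBase RegionInfo API: `RegionInfoSketch`, its string keys, and the MD5 input used for `encodedName()` are all assumptions chosen for the example, not HBase's actual encoding.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: a simplified stand-in for a region descriptor. */
final class RegionInfoSketch {
    final String table;
    final String startKey;
    final String endKey;
    final long regionId;
    final boolean split;

    RegionInfoSketch(String table, String startKey, String endKey,
                     long regionId, boolean split) {
        this.table = table;
        this.startKey = startKey;
        this.endKey = endKey;
        this.regionId = regionId;
        this.split = split;
    }

    /** Keeps startKey, endKey, regionId and split; only the table name changes. */
    RegionInfoSketch cloneForTable(String newTable) {
        return new RegionInfoSketch(newTable, startKey, endKey, regionId, split);
    }

    /** Hypothetical encoding: a hash that includes the table name. */
    String encodedName() {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(
                (table + "," + startKey + "," + regionId).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        RegionInfoSketch source = new RegionInfoSketch("t1", "a", "m", 42L, false);
        RegionInfoSketch clone = source.cloneForTable("t2");
        System.out.println(source.encodedName() + " -> " + clone.encodedName());
    }
}
```

Because the table name is part of the hashed input, the clone's encoded name differs from the source's even though every other field is identical.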
    • cloneRegionInfo

      public static RegionInfo cloneRegionInfo(TableName tableName, RegionInfo snapshotRegionInfo)
    • getTableRegions

      Returns the set of the regions contained in the table
      Throws:
      IOException
    • copySnapshotForScanner

      public static RestoreSnapshotHelper.RestoreMetaChanges copySnapshotForScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path restoreDir, String snapshotName) throws IOException
      Copy the snapshot files for a snapshot scanner and discard the meta changes.
      Throws:
      IOException
    • restoreSnapshotAcl

      public static void restoreSnapshotAcl(org.apache.hadoop.hbase.shaded.protobuf.generated.SnapshotProtos.SnapshotDescription snapshot, TableName newTableName, org.apache.hadoop.conf.Configuration conf) throws IOException
      Throws:
      IOException