Class TableSnapshotScanner

java.lang.Object
org.apache.hadoop.hbase.client.AbstractClientScanner
org.apache.hadoop.hbase.client.TableSnapshotScanner
All Implemented Interfaces:
Closeable, AutoCloseable, Iterable<Result>, ResultScanner

@Private public class TableSnapshotScanner extends AbstractClientScanner
A scanner that performs a scan over snapshot files. Using this class requires copying the snapshot to a temporary empty directory; only the snapshot reference files are copied into that directory. The actual data files are not copied.

This also allows one to run the scan from an online or offline HBase cluster. The snapshot files can be exported to a pure-HDFS cluster with the org.apache.hadoop.hbase.snapshot.ExportSnapshot tool, and this scanner can then run the scan directly over the exported snapshot files. The snapshot should not be deleted while open scanners are reading from its files.

An internal RegionScanner is used to execute the Scan obtained from the user for each region in the snapshot.

HBase owns all the data and snapshot files on the filesystem, and only the HBase user can read them. HBase also enforces security because all requests are handled by the server layer, so a user cannot read the data files directly. To read snapshot files directly from the filesystem, the user running the MR job must have sufficient permissions to access the snapshot and reference files. This means that to run MapReduce over snapshot files, the job has to be run as the HBase user, or the user must have group or other privileges in the filesystem (see HBASE-8369). Note that giving other users access to read from snapshot/data files will completely circumvent the access control enforced by HBase. See org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat.
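The description above can be illustrated with a minimal usage sketch based on the four-argument constructor documented on this page. The restore directory path and the snapshot name are hypothetical placeholders, the HBase client libraries are assumed to be on the classpath, and the class is annotated @Private, so it may change between releases without notice.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.TableSnapshotScanner;
import org.apache.hadoop.hbase.util.Bytes;

public class SnapshotScanExample {
    public static void main(String[] args) throws IOException {
        // Picks up hbase-site.xml / core-site.xml from the classpath.
        Configuration conf = HBaseConfiguration.create();

        // Hypothetical values: an empty, writable directory outside the HBase
        // root directory, and the name of an existing snapshot.
        Path restoreDir = new Path("/tmp/snapshot-restore");
        String snapshotName = "my_snapshot";

        // A default Scan reads every row; narrow it as needed.
        Scan scan = new Scan();

        // The scanner is Closeable, so try-with-resources closes it, which
        // also cleans up the contents of restoreDir as documented.
        try (TableSnapshotScanner scanner =
                 new TableSnapshotScanner(conf, restoreDir, snapshotName, scan)) {
            for (Result result : scanner) {
                System.out.println(Bytes.toStringBinary(result.getRow()));
            }
        }
    }
}
```

The process running this sketch needs filesystem read access to the snapshot and data files, as discussed in the permissions note above.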

  • Field Details

  • Constructor Details

    • TableSnapshotScanner

      public TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan) throws IOException
      Creates a TableSnapshotScanner.
      Parameters:
      conf - the configuration
      restoreDir - a temporary directory to copy the snapshot files into. The current user should have write permissions to this directory, and it should not be a subdirectory of rootDir. The scanner deletes the contents of the directory once the scanner is closed.
      snapshotName - the name of the snapshot to read from
      scan - a Scan representing scan parameters
      Throws:
      IOException - in case of error
    • TableSnapshotScanner

      public TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan) throws IOException
      Throws:
      IOException
    • TableSnapshotScanner

      public TableSnapshotScanner(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.Path restoreDir, String snapshotName, Scan scan, boolean snapshotAlreadyRestored) throws IOException
      Creates a TableSnapshotScanner.
      Parameters:
      conf - the configuration
      rootDir - root directory for HBase.
      restoreDir - a temporary directory to copy the snapshot files into. The current user should have write permissions to this directory, and it should not be a subdirectory of rootDir. The scanner deletes the contents of the directory once the scanner is closed.
      snapshotName - the name of the snapshot to read from
      scan - a Scan representing scan parameters
      snapshotAlreadyRestored - true to indicate that the snapshot has already been restored.
      Throws:
      IOException - in case of error
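The restoreDir constraint described for these constructors (it should not be a subdirectory of rootDir) can be checked before a scanner is built. The following is a minimal sketch using only the JDK; the class RestoreDirCheck and its method isUnder are hypothetical helpers for illustration, not part of HBase, and the sketch makes no claim about how the scanner itself validates its arguments.

```java
import java.net.URI;
import java.util.Objects;

public class RestoreDirCheck {

    /** Returns true if candidate is rootDir itself or lies beneath it on the same filesystem. */
    public static boolean isUnder(URI rootDir, URI candidate) {
        // normalize() resolves "." and ".." segments before comparing.
        URI root = rootDir.normalize();
        URI cand = candidate.normalize();

        // A different scheme or authority is treated as a different filesystem.
        if (!Objects.equals(root.getScheme(), cand.getScheme())
                || !Objects.equals(root.getAuthority(), cand.getAuthority())) {
            return false;
        }

        String rootPath = stripTrailingSlash(root.getPath());
        String candPath = stripTrailingSlash(cand.getPath());

        // Everything on the filesystem is under the filesystem root.
        if (rootPath.equals("/")) {
            return true;
        }
        // Compare whole path segments so that /hbase2 is not considered under /hbase.
        return candPath.equals(rootPath) || candPath.startsWith(rootPath + "/");
    }

    private static String stripTrailingSlash(String path) {
        return (path.length() > 1 && path.endsWith("/"))
            ? path.substring(0, path.length() - 1)
            : path;
    }

    public static void main(String[] args) {
        // Hypothetical locations, for illustration only.
        URI rootDir = URI.create("hdfs://namenode:8020/hbase");
        URI bad = URI.create("hdfs://namenode:8020/hbase/tmp/restore");
        URI good = URI.create("hdfs://namenode:8020/tmp/snapshot-restore");

        System.out.println(isUnder(rootDir, bad));
        System.out.println(isUnder(rootDir, good));
    }
}
```

A caller could run such a check and fail fast with a clear message when it returns true for the intended restoreDir.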
  • Method Details