Before beginning the upgrade process, you must take a complete backup of HBase’s backing data. The following instructions cover backing up the data within the current HDFS instance. Alternatively, you can use the distcp command to copy the data to another HDFS cluster.
- Stop the HBase cluster
- Copy the HBase data directory to a backup location using the distcp command as the HDFS super user.
Using distcp to back up the HBase data directory
$ kinit -k -t hdfs.keytab hdfs@EXAMPLE.COM
$ hadoop distcp /hbase /hbase-pre-upgrade-backup
The distcp command launches a MapReduce job that copies the files in a distributed fashion. Check the output of the command to confirm that this job completed successfully.