Note: Documentation for Pivotal GemFire 7.0.x is now available at http://gemfire.docs.pivotal.io/7.0.2/index.html. Please refer to the Pivotal site for the latest and most up-to-date documentation on GemFire 7.0.x. The vFabric GemFire 7.0 documentation site will no longer be updated.

Back Up and Restore a Disk Store

Backup and restore procedures differ for online and offline distributed systems.


Online Backup

The gfsh backup operation creates a backup of disk stores for all members running in the distributed system when the backup command is invoked.
Note: Do not try to create backup files from a running system by using your operating system's file copy commands. You will get incomplete and unusable copies.

The backup works by passing commands to the running system members. Each member with persistent data creates a backup of its own configuration and disk stores. The backup does not block any activities in the distributed system, but it does use resources.

Preparing for Backup
  1. Consider compacting your disk store before running the backup. If auto-compaction is turned off, a manual compaction reduces the amount of data that the backup copies over your network. For more information on configuring a manual compaction, see Manual Compaction.
  2. Run the backup during a period of low activity in your system. The backup does not block system activities, but it uses file system resources on all hosts in your distributed system and can affect performance.
  3. Configure each member’s cache.xml with any files or directories you want backed up in addition to the disk store files. VMware recommends that you back up:
    • cache.xml
    • gemfire.properties
    • Application jar files
    • Other files that the application needs when starting (a file that sets the classpath, for example)
    Any directory that you specify is copied recursively, with any disk stores that are found excluded from this user-specified backup. Example:
    <backup>./myExtraBackupStuff</backup>
  4. Back up to a SAN (recommended) or to a directory that all members can access. Make sure the directory exists and has the proper permissions for your members to write to it and create subdirectories.

    The directory you specify for backup can be used multiple times. Each backup first creates a top level directory for the backup, under the directory you specify, identified to the minute.

    You can use one of two methods:
    • Use a single physical location, such as a network file server. Example:
      /export/fileServerDirectory/gemfireBackupLocation
    • Use a directory that is local to all host machines in the system. Example:
      ./gemfireBackupLocation
  5. Make sure there is a gemfire.properties file for the distributed system in the directory where you run the gemfire command. The gemfire.properties file is required by the backup command so that it can connect to the specified distributed system and instruct members to back up their disk stores. Make sure that locators or mcast-port are correctly set in the gemfire.properties file to connect to the distributed system that you want to back up.
  6. Make sure all members with persistent data are running in the system. Offline members cannot back up their disk stores. The tool gives a message telling you about any members that are offline:
    The backup may be incomplete. The following disk stores are not online:
        DiskStore at hostc.gemstone.com /home/dsmith/dir3
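
The directory requirements in step 4 can be checked before you start the backup. The following is a minimal Python sketch, not part of GemFire. The function name check_backup_dir and the probe directory prefix are hypothetical. It confirms that the directory exists and that the current user can create a subdirectory in it, which is what each member does during a backup. It assumes you run it on each host as the same operating system user that runs the member.

```python
import os
import tempfile


def check_backup_dir(path):
    """Return a list of problems found with a candidate backup directory.

    An empty list means the directory exists and the current user could
    create and remove a subdirectory in it.
    """
    problems = []
    if not os.path.isdir(path):
        problems.append("not an existing directory: " + path)
        return problems
    try:
        # Each backup creates its own subdirectory, so test exactly that.
        probe = tempfile.mkdtemp(prefix="backup_probe_", dir=path)
        os.rmdir(probe)
    except OSError as err:
        problems.append("cannot create subdirectories in %s: %s" % (path, err))
    return problems
```

For example, check_backup_dir("/export/fileServerDirectory/gemfireBackupLocation") returns an empty list when the location is usable from the host where it is run.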

Performing a Full Online Backup
  1. If you have disabled auto-compaction, run manual compaction:
    gfsh>compact disk-store --name=Disk1
  2. Run the gfsh backup command, providing your backup directory location. Example:
    gfsh>backup disk-store --dir=/export/fileServerDirectory/gemfireBackupLocation
  3. The tool reports on the success of the operation. If the operation is successful, you see a message like this:
    The following disk stores were backed up:
    	DiskStore at hosta.gemstone.com /home/dsmith/dir1
    	DiskStore at hostb.gemstone.com /home/dsmith/dir2
    Backup successful.
    If the operation does not succeed at backing up all known members, you see a message like this:
    Connecting to distributed system: locators=warsaw.gemstone.com[26357]
    The following disk stores were backed up:
    	DiskStore at hosta.gemstone.com /home/dsmith/dir1
    	DiskStore at hostb.gemstone.com /home/dsmith/dir2
    The backup may be incomplete. The following disk stores are not online:
    	DiskStore at hostc.gemstone.com /home/dsmith/dir3

    A member that fails to complete its backup is noted in this final status message and leaves a file named INCOMPLETE_BACKUP in its top-level backup directory. Offline members leave nothing behind, so the message from the backup operation is your only record of them.

  4. Validate the backup. To ensure that the backup can be recovered, validate the backed-up files. Run the validate offline-disk-store command on the backed-up files for each disk store.
    cd 2010-04-10-11-35/straw_14871_53406_34322/diskstores/ds1
    gemfire validate-disk-store ds1 dir0 dir1 [... dirN]
    Repeat for all disk stores of all members.
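
Step 3 notes that a member that fails to complete its backup leaves a file named INCOMPLETE_BACKUP. If the status message is no longer available, the marker files can be found by scanning the backup. The following Python sketch is an illustration only. The function name incomplete_members is hypothetical, and the code assumes that the timestamped backup directory contains one subdirectory per member and that the marker file sits directly inside the member's subdirectory.

```python
import os


def incomplete_members(backup_dir):
    """Return the member directories under backup_dir that hold INCOMPLETE_BACKUP.

    backup_dir is one timestamped backup directory. Assumption: each member
    has its own subdirectory, with the marker file at the top of it.
    """
    flagged = []
    for name in sorted(os.listdir(backup_dir)):
        member_dir = os.path.join(backup_dir, name)
        marker = os.path.join(member_dir, "INCOMPLETE_BACKUP")
        if os.path.isdir(member_dir) and os.path.exists(marker):
            flagged.append(name)
    return flagged
```

An empty result does not prove that the backup is complete, because offline members leave no marker at all.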

Performing an Incremental Backup


  1. Run the gfsh backup command, providing your backup directory location and, with --baselineDir, the directory of the previous backup to use as the baseline. Example:
    gfsh>backup disk-store --dir=/export/fileServerDirectory/gemfireBackupLocation
    --baselineDir=/export/fileServerDirectory/gemfireBackupLocation/2012-10-01-12-30
  2. The tool reports on the success of the operation. If the operation is successful, you see a message like this:
    The following disk stores were backed up:
    	DiskStore at hosta.gemstone.com /home/dsmith/dir1
    	DiskStore at hostb.gemstone.com /home/dsmith/dir2
    Backup successful.
    If the operation does not succeed at performing an incremental backup on all known members, you see a message like this:
    The following disk stores were backed up:
    	DiskStore at hosta.gemstone.com /home/dsmith/dir1
    	DiskStore at hostb.gemstone.com /home/dsmith/dir2
    The backup may be incomplete. The following disk stores are not online:
    	DiskStore at hostc.gemstone.com /home/dsmith/dir3

    A member that fails to complete its backup is noted in this final status message and leaves the file INCOMPLETE_BACKUP. The next time you perform a backup operation, a full backup is performed.

  3. Repeat for all disk stores for all members.
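
When incremental backups run on a schedule, the baseline is usually the most recent backup under the backup location. The following Python sketch shows one way to pick it and to assemble the gfsh command shown in step 1. The function names are hypothetical. The code assumes that backup directories are named with zero-padded timestamps such as 2012-10-01-12-30, and it does not check whether the chosen baseline is itself complete.

```python
import os
import re

# Timestamped backup directory names, with or without a seconds field,
# for example 2012-10-01-12-30 or 2012-10-18-13-44-53.
TIMESTAMP_DIR = re.compile(r"^\d{4}(-\d{2}){4,5}$")


def latest_backup(backup_root):
    """Return the path of the most recent timestamped backup, or None."""
    names = [
        name for name in os.listdir(backup_root)
        if TIMESTAMP_DIR.match(name)
        and os.path.isdir(os.path.join(backup_root, name))
    ]
    if not names:
        return None
    # Zero-padded timestamps sort chronologically as plain strings.
    return os.path.join(backup_root, max(names))


def incremental_backup_command(backup_root):
    """Build the gfsh command text, adding --baselineDir when a baseline exists."""
    command = "backup disk-store --dir=" + backup_root
    baseline = latest_backup(backup_root)
    if baseline is not None:
        command += " --baselineDir=" + baseline
    return command
```

The returned string is meant to be pasted at the gfsh prompt. With no earlier backup present, the command falls back to the full backup form.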

What a Full Online Backup Saves

For each member with persistent data, a full backup includes the following:
  • Disk store files for all stores containing persistent region data.
  • Files and directories you have configured to be backed up in cache.xml <backup> elements. Example:
    <backup>./systemConfig/gf.jar</backup>
    <backup>/users/jpearson/gfSystemInfo/myCustomerConfig.doc</backup>
  • JAR files that you deployed using the gfsh deploy command.
  • Configuration files from the member startup.
    • gemfire.properties, including the properties with which the member was started.
    • cache.xml, if used.
    These configuration files are not automatically restored, to avoid interfering with more recent configurations. In particular, if these are extracted from a master jar file, copying the separate files into your working area can override the files in the jar. If you want to back up and restore these files, add them as custom <backup> elements.
  • A restore script, written for the member’s operating system, that copies the files back to their original locations. For example, in Windows, the file is restore.bat and in Linux, it is restore.sh.

What an Incremental Online Backup Saves

An incremental backup saves the difference between the last backup and the current data. An incremental backup copies only operation logs that are not already present in the baseline directories for each member. For incremental backups, the restore script contains explicit references to operation logs in one or more previously chained incremental backups. When the restore script is run from an incremental backup, it also restores the operation logs from previous incremental backups that are part of the backup chain.

If members are missing from the baseline directory because they were offline or did not exist at the time of the baseline backup, those members place full backups of all their files into the incremental backup directory.

Disk Store Backup Directory Structure and Contents

dsmith@dasmith-e6410:test_backup$ ls -R
./2012-10-18-13-44-53:
dasmith_e6410_server1_8623_v1_33892 dasmith_e6410_server2_8940_v2_45565

./2012-10-18-13-44-53/dasmith_e6410_server1_8623_v1_33892:
config diskstores README.txt restore.sh user

./2012-10-18-13-44-53/dasmith_e6410_server1_8623_v1_33892/config:
cache.xml

./2012-10-18-13-44-53/dasmith_e6410_server1_8623_v1_33892/diskstores:
DEFAULT

./2012-10-18-13-44-53/dasmith_e6410_server1_8623_v1_33892/diskstores/DEFAULT:
dir0

./2012-10-18-13-44-53/dasmith_e6410_server1_8623_v1_33892/diskstores/DEFAULT/dir0:
BACKUPDEFAULT_1.crf BACKUPDEFAULT_1.drf BACKUPDEFAULT.if

./2012-10-18-13-44-53/dasmith_e6410_server1_8623_v1_33892/user:
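
The layout in the listing above can be walked to build an inventory of what was backed up, for example to know which directories to validate for each disk store. The following Python sketch is an illustration only. The function name list_disk_store_dirs is hypothetical, and the code assumes the member/diskstores/store/dirN nesting shown in the listing.

```python
import os


def list_disk_store_dirs(backup_dir):
    """Map (member, disk store) to the sorted list of its backed-up directories.

    backup_dir is one timestamped backup directory with the layout
    <backup_dir>/<member>/diskstores/<store>/<dirN>.
    """
    inventory = {}
    for member in sorted(os.listdir(backup_dir)):
        stores_root = os.path.join(backup_dir, member, "diskstores")
        if not os.path.isdir(stores_root):
            continue  # not a member directory, or no disk stores backed up
        for store in sorted(os.listdir(stores_root)):
            store_dir = os.path.join(stores_root, store)
            if not os.path.isdir(store_dir):
                continue  # skip stray files
            inventory[(member, store)] = [
                os.path.join(store_dir, d)
                for d in sorted(os.listdir(store_dir))
                if os.path.isdir(os.path.join(store_dir, d))
            ]
    return inventory
```

For the listing above, the result would hold a single DEFAULT disk store with one directory, dir0, for the server1 member directory. The listing does not show the contents of the server2 member directory.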

Offline Members: Manual Catch-Up to an Online Backup

If you must have a member offline during an online backup, you can manually back up its disk stores. Do one of the following:
  • Keep the member’s backup and restore separated, doing offline manual backup and offline manual restore, if needed.
  • Bring this member’s files into the online backup framework manually and create a restore script by hand, from a copy of another member’s script:
    1. Duplicate the directory structure of a backed up member for this member.
    2. Rename directories as needed to reflect this member’s particular backup, including disk store names.
    3. Clear out all files but the restore script.
    4. Copy in this member’s files.
    5. Modify the restore script to work for this member.
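
Steps 1 and 3 of the second option can be scripted. The following Python sketch copies the directory tree of a backed-up member without its files and keeps only the restore script. The function name make_member_skeleton is hypothetical, and the code assumes that the restore script is named restore.sh or restore.bat and sits at the top of the member's backup directory. Renaming directories, copying in the offline member's files, and editing the script (steps 2, 4, and 5) remain manual.

```python
import os
import shutil

RESTORE_SCRIPTS = ("restore.sh", "restore.bat")


def make_member_skeleton(template_member_dir, new_member_dir):
    """Recreate the template's directory tree, copying only its restore script."""
    for current, _subdirs, files in os.walk(template_member_dir):
        relative = os.path.relpath(current, template_member_dir)
        target = os.path.normpath(os.path.join(new_member_dir, relative))
        os.makedirs(target, exist_ok=True)
        if relative == ".":
            # Only the top level holds the restore script; all other
            # files are left behind so the new tree starts out empty.
            for name in files:
                if name in RESTORE_SCRIPTS:
                    shutil.copy2(os.path.join(current, name), target)
```

Create the new member directory next to the template, not inside it, so that the walk does not visit its own output.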

Restore an Online Backup

The restore script copies files back to their original locations. You can do this manually if you wish.
  1. Restore your disk stores when your members are offline and the system is down.
  2. Read the restore scripts to see where they will place the files and make sure the destination locations are ready. The restore scripts refuse to copy over files with the same names.
  3. Run the restore scripts. Run each script on the host where the backup originated.
The restore copies these files back to their original location:
  • Disk store files for all stores containing persistent region data.
  • Any files or directories you have configured to be backed up in the cache.xml <backup> elements.
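
Because the restore scripts refuse to copy over files with the same names, it can help to check a destination for name collisions before running a script. The following Python sketch compares one backed-up directory with the destination you read from the restore script. The function name restore_conflicts is hypothetical, and the sketch does not parse the restore script itself.

```python
import os


def restore_conflicts(backed_up_dir, destination_dir):
    """Return the file names present in both the backup and the destination."""
    if not os.path.isdir(destination_dir):
        return []  # nothing there yet, so nothing can collide
    backed_up = set(os.listdir(backed_up_dir))
    existing = set(os.listdir(destination_dir))
    return sorted(backed_up & existing)
```

An empty result for every backed-up directory means that no file names in the backup collide with files already in those destinations.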

Offline File Backup and Restore

With the system offline, you copy and restore your files using your file system commands.

To back up your offline system:
  1. Validate, and consider compacting, your disk stores before backing them up.
  2. Copy all disk store files, and any other files you want to save, to your backup locations.
To restore a backup of an offline system:
  1. Make sure the system is either down or not using the directories you will use for the restored files.
  2. Reverse your backup file copy procedure, copying all the backed up files into the directories you want to use.
  3. Make sure your members are configured to use the directories where you put the files.
  4. Start the system members.
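
The offline copy and its reversal can be done with any file copy tool. The following Python sketch shows the idea for a single disk store directory. The function name copy_disk_store_dir and the paths in the usage note are hypothetical. It assumes that the system is offline, so the files are not changing while they are copied.

```python
import os
import shutil


def copy_disk_store_dir(source_dir, target_dir):
    """Copy one directory tree, refusing to write into an existing target."""
    if os.path.exists(target_dir):
        raise FileExistsError("target already exists: " + target_dir)
    # copytree creates target_dir, including missing parent directories,
    # and copies file contents together with their metadata.
    shutil.copytree(source_dir, target_dir)
```

To back up, call copy_disk_store_dir("/home/dsmith/dir1", "/backups/offline/dir1"). To restore, swap the two arguments after making sure the original location is empty or absent.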