Storage and Distribution Options

GemFire provides several models for data storage and distribution: regions can be partitioned or replicated, and they can be distributed or non-distributed (stored only in the local cache).

Peer-to-Peer Region Storage and Distribution

At its most general, data management means having current data available when and where your applications need it. In a properly configured VMware® vFabric™ GemFire® installation, you store your data in your local members and GemFire automatically distributes it to the other members that need it, according to your cache configuration settings. You may be storing very large data objects that require special consideration, or you may have a high volume of data that requires careful configuration to safeguard your application's performance or memory use. You may also need to explicitly lock some data during particular operations.

Most data management features are available as configuration options, which you can specify either through the cache.xml file or through the API. Once configured, GemFire manages the data automatically. Data distribution, disk storage, data expiration, and data partitioning are all managed this way. A few features are managed at run time through the API.

At the architectural level, data distribution runs between peers in a single system, between clients and servers, and between separate distributed system sites.
  • Peer-to-peer provides the core distribution and storage models, which are specified as attributes on the data regions.

  • For client/server, you choose which data regions to share between the client and server tiers. Then, within each region, you can fine-tune the data that the server automatically sends to the client by subscribing to subsets.

  • For multi-site, you choose which data regions to share between sites. Because of the high cost of distributing data between disparate geographical locations, not all changes are passed between sites.

Data storage in any type of installation is based on the peer-to-peer configuration for each individual distributed system. Data and event distribution is based on a combination of the peer-to-peer and system-to-system configurations.

The storage and distribution models are configured through cache and region attributes. The main choices are partitioned, replicated, or simply distributed (neither partitioned nor replicated). All server regions must be partitioned or replicated. Each region’s data-policy and subscription-attributes, and its scope if it is not a partitioned region, interact for finer control of data distribution.
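
As a rough illustration, the three main choices might be declared in a peer or server cache.xml along the following lines. This is a sketch only: the region names are hypothetical, the DOCTYPE or schema declaration required by your GemFire version is omitted, and the scope and data-policy values on the third region are one possible combination rather than a recommendation.

```xml
<cache>
  <!-- Partitioned: entries are divided among the members that host the region -->
  <region name="orders" refid="PARTITION"/>

  <!-- Replicated: each hosting member holds a full copy of the region -->
  <region name="productCatalog" refid="REPLICATE"/>

  <!-- Distributed, non-replicated: scope and data-policy set explicitly
       (hypothetical choice of values for illustration) -->
  <region name="sessionNotes">
    <region-attributes scope="distributed-ack" data-policy="normal"/>
  </region>
</cache>
```

The first two regions rely on region shortcuts to set data-policy, while the third sets the attributes directly, which is where scope and data-policy interact as described above.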

Storing Data in the Local Cache

To store data in your local cache, create the region with a refid that names a RegionShortcut or ClientRegionShortcut with local state. These shortcuts automatically set the region data-policy to a non-empty policy, so all entry operations that the region receives are stored locally. Regions without local storage can still send and receive event distributions without storing anything in your application heap.
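
The difference between shortcuts with and without local state might look like this in cache.xml. Both fragments are minimal sketches: the region names, the pool name, and the locator host and port are hypothetical, and the DOCTYPE or schema declaration is omitted. In a peer or server cache:

```xml
<cache>
  <!-- Local state: entries are stored in this member's cache -->
  <region name="localCopy" refid="REPLICATE"/>

  <!-- No local state: the member takes part in distribution
       but keeps no entries in its own heap -->
  <region name="eventsOnly" refid="REPLICATE_PROXY"/>
</cache>
```

In a client cache, the corresponding ClientRegionShortcut values would be used:

```xml
<client-cache>
  <!-- Hypothetical connection details; adjust to your own locator -->
  <pool name="serverPool">
    <locator host="localhost" port="10334"/>
  </pool>

  <!-- Local state: entries fetched from the servers are cached in the client -->
  <region name="cachedOrders" refid="CACHING_PROXY"/>

  <!-- No local state: operations are forwarded to the servers -->
  <region name="passThroughOrders" refid="PROXY"/>
</client-cache>
```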