Region types define region behavior within a single distributed system. You
have various options for region data storage and distribution.
Within a vFabric GemFire distributed system, you can define distributed regions
and non-distributed regions. You can also define regions whose data is spread
across the distributed system and regions whose data is entirely contained in a
single member.
Your choice of region type is governed in part by the type of
application you are running. In particular, you need to use specific region
types for your servers and clients for effective communication between the two
tiers:
- Server regions are created inside a Cache by servers and are accessed by
clients that connect to the servers from outside the server’s distributed
system. Server regions must have region type partitioned or replicated. Server
region configuration uses the RegionShortcut enum settings.
- Client regions are created inside a ClientCache by clients and are configured
to distribute data and events between the client and the server tier. Client
regions must have region type local. Client region configuration uses the
ClientRegionShortcut enum settings.
- Peer regions are created inside a Cache. Peer regions may be server regions,
or they may be regions that are not accessed by clients. Peer regions can have
any region type. Peer region configuration uses the RegionShortcut enum
settings.
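The tier rules above can be summarized in a small, self-contained Java sketch. It is only a model of the rules as stated in the text: the names `Tier`, `RegionType`, `allowedTypes`, `isValid`, and `shortcutEnumFor` are hypothetical and are not part of the GemFire API. Only the enum names RegionShortcut and ClientRegionShortcut come from the text, and they appear here as plain strings.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the tier rules; these enums are NOT GemFire types.
final class RegionTierRules {
    enum Tier { SERVER, CLIENT, PEER }
    enum RegionType { PARTITIONED, REPLICATED, DISTRIBUTED_NON_REPLICATED, LOCAL }

    // Region types each tier may use, as stated in the text.
    static Set<RegionType> allowedTypes(Tier tier) {
        switch (tier) {
            case SERVER:
                return EnumSet.of(RegionType.PARTITIONED, RegionType.REPLICATED);
            case CLIENT:
                return EnumSet.of(RegionType.LOCAL);
            default: // PEER: any region type
                return EnumSet.allOf(RegionType.class);
        }
    }

    // True when the given tier is allowed to use the given region type.
    static boolean isValid(Tier tier, RegionType type) {
        return allowedTypes(tier).contains(type);
    }

    // Name of the shortcut enum the text associates with each tier.
    static String shortcutEnumFor(Tier tier) {
        return tier == Tier.CLIENT ? "ClientRegionShortcut" : "RegionShortcut";
    }

    public static void main(String[] args) {
        for (Tier tier : Tier.values()) {
            System.out.println(tier + " -> " + allowedTypes(tier)
                    + " via " + shortcutEnumFor(tier));
        }
    }
}
```

A check such as `RegionTierRules.isValid(Tier.SERVER, RegionType.LOCAL)` returns false in this model, matching the rule that server regions must be partitioned or replicated.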
These are the primary configuration choices for each data region.
| Region Type | Description | Best suited for... |
| --- | --- | --- |
| Partitioned | System-wide setting for the data set. Data is divided into buckets across the members that define the region. For high availability, configure redundant copies so each bucket is stored in multiple members with one member holding the primary. | Server regions and peer regions: very large data sets; high availability; write performance; partitioned event listeners and data loaders |
| Replicated (distributed) | Holds all data from the distributed region. The data from the distributed region is copied into the member replica region. Can be mixed with non-replication, with some members holding replicas and some holding non-replicas. | Server regions and peer regions: read-heavy, small data sets; asynchronous distribution; query performance |
| Distributed non-replicated | Data is spread across the members that define the region. Each member holds only the data it has expressed interest in. Can be mixed with replication, with some members holding replicas and some holding non-replicas. | Peer regions, but not server regions and not client regions: asynchronous distribution; query performance |
| Non-distributed (local) | The region is visible only to the defining member. | Client regions and peer regions: data that is not shared between applications |
Partitioned Regions
Partitioning is a good choice for very large server regions.
Partitioned regions are ideal for data sets in the hundreds of gigabytes and
beyond.
Note: Partitioned regions generally require more JDBC connections than
other region types because each member that hosts data must have a connection.
Partitioned regions group your data into buckets, each of which is
stored on a subset of all of the system members. Data location in the buckets
does not affect the logical view - all members see the same logical data set.
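As a rough illustration of how keys can be grouped into buckets that live on a subset of members, here is a minimal Java sketch. It is not GemFire's actual placement algorithm: the class `BucketSketch`, the hash-modulo mapping, the round-robin host assignment, and the bucket count of 113 used in `main` are all assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a simplified bucket model, not GemFire's real algorithm.
final class BucketSketch {
    final int bucketCount;      // hypothetical fixed number of buckets
    final List<String> members; // members that host the partitioned region

    BucketSketch(int bucketCount, List<String> members) {
        this.bucketCount = bucketCount;
        this.members = new ArrayList<>(members);
    }

    // Map a key to a bucket; floorMod keeps the result non-negative.
    int bucketFor(Object key) {
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // Members storing a bucket: one primary plus up to `redundantCopies`
    // additional hosts, assigned round-robin (an assumption for this sketch).
    List<String> hostsFor(int bucketId, int redundantCopies) {
        int copies = Math.min(redundantCopies + 1, members.size());
        List<String> hosts = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            hosts.add(members.get((bucketId + i) % members.size()));
        }
        return hosts; // hosts.get(0) plays the role of the primary
    }

    public static void main(String[] args) {
        BucketSketch region = new BucketSketch(113, Arrays.asList("m1", "m2", "m3"));
        int bucket = region.bucketFor("customer-42");
        // Every member computes the same bucket for the same key, so all
        // members agree on the logical data set regardless of where it is stored.
        System.out.println("bucket " + bucket + " hosted by " + region.hostsFor(bucket, 1));
    }
}
```

Because the key-to-bucket mapping depends only on the key, any member can locate an entry without holding it locally, which is the sense in which data location does not affect the logical view.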
Use partitioning for:
- Large data sets. Store data sets that are too large to fit into a single
member, and all members will see the same logical data set. Partitioned regions
divide the data into units of storage called buckets that are split across the
members hosting the partitioned region data, so no member needs to host all of
the region’s data. GemFire provides dynamic redundancy recovery and rebalancing
of partitioned regions, making them the choice for large-scale data containers.
More members in the system can accommodate more uniform balancing of the data
across all host members, allowing system throughput (both gets and puts) to
scale as new members are added.
- High availability. Partitioned regions allow you to configure the number of
copies of your data that GemFire should make. If a member fails, your data will
be available without interruption from the remaining members. Partitioned
regions can also be persisted to disk for additional high availability.
- Scalability. Partitioned regions can scale to large amounts of data because
the data is divided between the members available to host the region. Increase
your data capacity dynamically by simply adding new members. Partitioned regions
also allow you to scale your processing capacity. Because your entries are
spread out across the members hosting the region, reads and writes to those
entries are also spread out across those members.
- Good write performance. You can configure the number of copies of your data.
The amount of data transmitted per write does not increase with the number of
members. By contrast, with replicated regions, each write must be sent to every
member that has the region replicated, so the amount of data transmitted per
write increases with the number of members.
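The write-performance comparison in the last bullet can be made concrete with a simple count of how many copies of an entry each write has to reach. This is a simplified model under stated assumptions, not a measurement of GemFire: the class `WriteFanOut` and its methods are hypothetical, the count ignores which member originates the write, and it assumes every member of the replicated region holds a replica.

```java
// Simplified model: counts copies touched per write, not actual network traffic.
final class WriteFanOut {
    // Partitioned region: the primary copy plus the configured redundant
    // copies, capped by the number of members available to host them.
    static int partitionedCopies(int members, int redundantCopies) {
        return Math.min(members, redundantCopies + 1);
    }

    // Replicated region: every member that replicates the region gets the write.
    static int replicatedCopies(int members) {
        return members;
    }

    public static void main(String[] args) {
        int redundantCopies = 1; // hypothetical configuration value
        for (int members : new int[] {4, 10, 40}) {
            System.out.println(members + " members: partitioned="
                    + partitionedCopies(members, redundantCopies)
                    + ", replicated=" + replicatedCopies(members));
        }
    }
}
```

In this model, the partitioned count stays at the configured number of copies as members are added, while the replicated count grows linearly with the member count.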
In partitioned regions, you can collocate keys within buckets and
across multiple partitioned regions. You can also control which members store
which data buckets.
Replicated Regions
Note: Replication is a good choice for small to medium size server
regions.
Replicated regions provide the highest performance in terms of
throughput and latency.
Use replicated regions for:
- Small amounts of data required by all members of the distributed system. For
example, currency rate information and mortgage rates.
- Data sets that can be contained entirely in a single member. Each replicated
region holds the complete data set for the region.
- High-performance data access. Replication guarantees local access from the
heap for application threads, providing the lowest possible latency for data
access.
- Asynchronous distribution. All distributed regions, replicated and
non-replicated, provide the fastest distribution speeds.
Distributed, Non-Replicated Regions
Distributed, non-replicated regions provide the same performance as replicated
regions, but each member stores only the data it has expressed interest in,
either by subscribing to events from other members or by defining the data
entries in its cache.
Use distributed, non-replicated regions for:
- Peer regions, but not server regions or client regions. Server regions must
be either replicated or partitioned. Client regions must be local.
- Data sets where individual members only need notification and updates for
changes to a subset of the data. In non-replicated regions, each member only
receives update events for the data entries it has defined in the local cache.
- Asynchronous distribution. All distributed regions, replicated and
non-replicated, provide the fastest distribution speeds.
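The difference between a replica and a non-replica member can be sketched as a filter on incoming updates. The following Java model is an assumption-laden illustration, not GemFire code: the classes `InterestSketch` and `Member` and the method `onRemoteUpdate` are hypothetical, and "interest" is reduced to whether the key already exists in the member's local map.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of replica vs. non-replica update handling.
final class InterestSketch {
    static final class Member {
        final boolean replica;                            // true: holds the full data set
        final Map<String, String> entries = new HashMap<>();

        Member(boolean replica) {
            this.replica = replica;
        }

        // Apply an update that arrived from another member.
        // Returns true if the update was stored locally.
        boolean onRemoteUpdate(String key, String value) {
            // A replica accepts every update; a non-replica accepts only
            // updates for entries it has already defined locally.
            if (replica || entries.containsKey(key)) {
                entries.put(key, value);
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Member replica = new Member(true);
        Member partial = new Member(false);
        partial.entries.put("USD", "1.00"); // the only entry this member defines

        for (Member m : new Member[] {replica, partial}) {
            m.onRemoteUpdate("USD", "1.01");
            m.onRemoteUpdate("EUR", "1.08");
        }
        System.out.println("replica: " + replica.entries.keySet());
        System.out.println("partial: " + partial.entries.keySet());
    }
}
```

In this model the replica ends up with both keys, while the non-replica keeps only the entry it defined, updated to the latest value.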
Local Regions
Note: When created using the ClientRegionShortcut settings, client regions are
automatically defined as local, since all client distribution activities go to
and come from the server tier.
The local region has no peer-to-peer distribution activity.
Use local regions for:
- Client regions. Distribution is only between the client and server tier.
- Private data sets for the defining member. The local region is not visible to
peer members.