Configure vFabric GemFire to Handle Network Partitioning

This section lists the configuration steps for network partition detection.

The system uses a combination of locators and system members, designated as lead members, to detect and resolve network partitioning problems.
  1. Use locators for member discovery. See Configuring Peer-to-Peer Discovery. In addition, use multiple locators.
  2. Enable partition detection in the locators and in all system members by setting this in their gemfire.properties:
    enable-network-partition-detection=true

    All system members should have the same setting for enable-network-partition-detection. If they don’t, the system throws a GemFireConfigException upon startup.

  3. Configure regions you want to protect from network partitioning with DISTRIBUTED_ACK or GLOBAL scope. Do not use DISTRIBUTED_NO_ACK scope. The region configurations provided in the region shortcut settings use DISTRIBUTED_ACK scope. This setting prevents operations from being performed throughout the distributed system before a network partition is detected.
    Note: GemFire issues an alert if it detects distributed-no-ack regions when network partition detection is enabled:
    Region {0} is being created with scope {1} but enable-network-partition-detection is enabled in the distributed system. 
    This can lead to cache inconsistencies if there is a network failure.
  4. The following configuration parameters affect or interact with network partition detection. Check whether they are appropriate for your installation and modify them as needed.
    • If you have network partition detection enabled, the threshold percentage value for allowed membership weight loss is automatically configured to 51. You cannot modify this value.
    • Failure detection is initiated if a member's gemfire.properties ack-wait-threshold (default is 15 seconds) and ack-severe-alert-threshold (15 seconds) elapse before the member receives a response to a message. If you modify the ack-wait-threshold value, modify ack-severe-alert-threshold to match it.
    • If the system has clients connecting to it, the clients' cache.xml <cache> <pool> read-timeout should be set to at least three times the member-timeout setting in the server's gemfire.properties. The default <cache> <pool> read-timeout setting is 10000 milliseconds.
    • You can adjust the default weights of members by specifying the system property gemfire.member-weight upon startup. For example, if you have some VMs that host a needed service, you could assign them a higher weight upon startup.
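
The checks described in the steps above can be sketched as a small standalone Java program. This is an illustrative sketch only, not part of GemFire: the class name PartitionConfigCheck and the method check are hypothetical, and the program uses only java.util.Properties. It assumes that member-timeout is expressed in milliseconds (the same unit as the client read-timeout) and takes the 15-second threshold defaults from the text above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper (not a GemFire API): checks a member's
// gemfire.properties values against the guidance in the steps above.
public class PartitionConfigCheck {

    // Returns human-readable warnings; an empty list means all checks passed.
    static List<String> check(Properties member, long clientReadTimeoutMs) {
        List<String> warnings = new ArrayList<>();

        // Step 2: partition detection must be enabled on every member.
        String enabled = member.getProperty("enable-network-partition-detection", "false");
        if (!Boolean.parseBoolean(enabled.trim())) {
            warnings.add("enable-network-partition-detection is not true");
        }

        // Step 4: the two ack thresholds should match.
        // The defaults of 15 seconds are taken from the text above.
        int ackWait = Integer.parseInt(member.getProperty("ack-wait-threshold", "15").trim());
        int ackSevere = Integer.parseInt(member.getProperty("ack-severe-alert-threshold", "15").trim());
        if (ackWait != ackSevere) {
            warnings.add("ack-wait-threshold (" + ackWait
                    + ") differs from ack-severe-alert-threshold (" + ackSevere + ")");
        }

        // Step 4: client read-timeout should be at least 3 x member-timeout.
        // Assumption: member-timeout is given in milliseconds.
        String memberTimeout = member.getProperty("member-timeout");
        if (memberTimeout == null) {
            warnings.add("member-timeout is not set; cannot check the client read-timeout");
        } else {
            long required = 3 * Long.parseLong(memberTimeout.trim());
            if (clientReadTimeoutMs < required) {
                warnings.add("client read-timeout " + clientReadTimeoutMs
                        + " ms is below 3 x member-timeout (" + required + " ms)");
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        Properties member = new Properties();
        member.setProperty("enable-network-partition-detection", "true");
        member.setProperty("ack-wait-threshold", "20");
        member.setProperty("ack-severe-alert-threshold", "20");
        member.setProperty("member-timeout", "5000"); // example value

        // 10000 ms is the default <pool> read-timeout mentioned above.
        check(member, 10000).forEach(System.out::println);
    }
}
```

With the example values in main, the only warning concerns the client read-timeout, because 10000 milliseconds is less than three times the example member-timeout of 5000 milliseconds. In practice you would load the Properties object from each member's gemfire.properties file rather than setting values in code.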