Rebalancing Partitioned Region Data
Rebalancing Partitioned Region Data
In a distributed system with minimal contention to the concurrent threads reading or updating from the members, you can use rebalancing to dynamically increase or decrease your data and processing capacity.
- If the configured partition region redundancy is not satisfied, rebalancing does what it can to recover redundancy. See Configure High Availability for a Partitioned Region.
- Rebalancing moves the partitioned region data buckets between host members as needed to establish the most fair balance of data and behavior across the distributed system.
For efficiency, when starting multiple members, trigger the rebalance a single time, after you have added all members.
First, starting a gfsh prompt and connect to the GemFire
distributed system. Then type the following command:
gfsh>rebalanceOptionally, you can specify regions to include or exclude from rebalancing, specify a time-out for the rebalance operation or just simulate a rebalance operation. Type help rebalance or see rebalance for more information.
- API call:
ResourceManager manager = cache.getResourceManager(); RebalanceOperation op = manager.createRebalanceFactory().start(); //Wait until the rebalance is complete and then get the results RebalanceResults results = op.getResults(); //These are some of the details we can get about the run from the API System.out.println("Took " + results.getTotalTime() + " milliseconds\n"); System.out.println("Transfered " + results.getTotalBucketTransferBytes()+ "bytes\n");
ResourceManager manager = cache.getResourceManager(); RebalanceOperation op = manager.createRebalanceFactory().simulate(); RebalanceResults results = op.getResults(); System.out.println("Rebalance would transfer " + results.getTotalBucketTransferBytes() +" bytes "); System.out.println(" and create " + results.getTotalBucketCreatesCompleted() + " buckets.\n");
How Partitioned Region Rebalancing Works
The rebalancing operation runs asynchronously.
As a general rule, rebalancing is performed on one partitioned region at a time. For regions that have co-located data, the rebalancing works on the regions as a group, maintaining the data co-location between the regions.
You can continue to use your partitioned regions normally while rebalancing is in progress. Read operations, write operations, and function executions continue while data is moving. If a function is executing on a local data set, you may see a performance degradation if that data moves to another host during function execution. Future function invocations are routed to the correct member.
GemFire tries to ensure that each member has the same percentage of its available space used for each partitioned region. The percentage is configured in the partition-attributes local-max-memory setting.
- Does not allow the local-max-memory setting to be exceeded unless LRU eviction is enabled with overflow to disk.
- Places multiple copies of the same bucket on different host IP addresses whenever possible.
- Resets entry time to live and idle time statistics during bucket migration.
- Replaces offline members.
When to Rebalance a Partitioned Region
You typically want to trigger rebalancing when capacity is increased or reduced through member startup, shut down or failure.
- You use redundancy for high availability and have configured your region to not automatically recover redundancy after a loss. In this case, GemFire only restores redundancy when you invoke a rebalance. See Configure High Availability for a Partitioned Region.
- You have uneven hashing of data. Uneven hashing can occur if your keys do not have a hash code method, which ensures uniform distribution, or if you use a PartitionResolver to collocate your partitioned region data (see Co-locate Data from Different Partitioned Regions). In either case, some buckets may receive more data than others. Rebalancing can be used to even out the load between data stores by putting fewer buckets on members that are hosting large buckets.
How to Simulate Region Rebalancing