Skip to content

Commit

Permalink
Open Add NPR guard when reading instance config - Race when reading …
Browse files Browse the repository at this point in the history
…config while adding/removing instance (#2669)

When adding/removing instance from a cluster, it is possible that at a point of time, instance config is gone but the INSTANCE ZNode is still there. This would cause config map in helix controller cache to have instance mapped to null config.
The failure is not a blocking error causing further pipeline failure but we would like to avoid noise in log.
This change add null check before access instance config.
  • Loading branch information
xyuanlu authored Oct 18, 2023
1 parent 65e657d commit 38a4f47
Show file tree
Hide file tree
Showing 2 changed files with 15 additions and 8 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -132,9 +132,10 @@ private static Set<String> getActiveNodes(Set<String> allNodes, Set<String> live

public static Set<String> filterOutEvacuatingInstances(Map<String, InstanceConfig> instanceConfigMap,
Set<String> nodes) {
return nodes.stream()
.filter(instance -> !instanceConfigMap.get(instance).getInstanceOperation().equals(
InstanceConstants.InstanceOperation.EVACUATE.name()))
return nodes.stream()
.filter(instance -> (instanceConfigMap.get(instance) != null && !instanceConfigMap.get(instance)
.getInstanceOperation()
.equals(InstanceConstants.InstanceOperation.EVACUATE.name())))
.collect(Collectors.toSet());
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -410,15 +410,21 @@ public ZNRecord update(ZNRecord currentData) {

@Override
public boolean isEvacuateFinished(String clusterName, String instanceName) {
return !instanceHasCurrentSateOrMessage(clusterName, instanceName) && (getInstanceConfig(clusterName,
instanceName).getInstanceOperation().equals(InstanceConstants.InstanceOperation.EVACUATE.name()));
if (!instanceHasCurrentSateOrMessage(clusterName, instanceName)) {
InstanceConfig config = getInstanceConfig(clusterName, instanceName);
return config != null && config.getInstanceOperation().equals(InstanceConstants.InstanceOperation.EVACUATE.name());
}
return false;
}

@Override
public boolean isReadyForPreparingJoiningCluster(String clusterName, String instanceName) {
return !instanceHasCurrentSateOrMessage(clusterName, instanceName)
&& DelayedAutoRebalancer.INSTANCE_OPERATION_TO_EXCLUDE_FROM_ASSIGNMENT.contains(
getInstanceConfig(clusterName, instanceName).getInstanceOperation());
if (!instanceHasCurrentSateOrMessage(clusterName, instanceName)) {
InstanceConfig config = getInstanceConfig(clusterName, instanceName);
return config != null && DelayedAutoRebalancer.INSTANCE_OPERATION_TO_EXCLUDE_FROM_ASSIGNMENT.contains(
config.getInstanceOperation());
}
return false;
}

/**
Expand Down

0 comments on commit 38a4f47

Please sign in to comment.