HDDS-14671. Remove healthy_readonly state from SCM #9836

Open
sodonnel wants to merge 9 commits into apache:HDDS-14496-zdu from sodonnel:HDDS-14671-healthy-readonly

@sodonnel
Contributor

What changes were proposed in this pull request?

Healthy_Readonly was added only for the upgrade flow, and it is no longer needed, so we should remove it.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-14671

How was this patch tested?

Existing tests, some of which were modified to reflect the new behavior.

@sodonnel changed the title from "HDDS-14671" to "HDDS-14671. Remove healthy_readonly state from SCM" on Feb 26, 2026
@github-actions bot added the zdu label (Pull requests for Zero Downtime Upgrade (ZDU), https://issues.apache.org/jira/browse/HDDS-14496) on Feb 26, 2026
Contributor

@Gargi-jais11 left a comment

Thanks @sodonnel for the patch.
A few more places need updating, mostly comments.

Comment on lines +263 to +271
  // First set the node to decommissioned, then run through all op states in
  // order and ensure the healthy_to_healthy_readonly event gets fired
  nsm.setNodeOperationalState(dn,
      HddsProtos.NodeOperationalState.DECOMMISSIONED);
  for (HddsProtos.NodeOperationalState s :
      HddsProtos.NodeOperationalState.values()) {
    eventPublisher.clearEvents();
    nsm.setNodeOperationalState(dn, s);
- assertEquals(SCMEvents.HEALTHY_READONLY_TO_HEALTHY_NODE, eventPublisher.getLastEvent());
+ assertEquals(SCMEvents.UNHEALTHY_TO_HEALTHY_NODE, eventPublisher.getLastEvent());
Need to update the comment as well to:

order and ensure the unhealthy_to_healthy event gets fired
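
For readers unfamiliar with the test under review, the pattern it follows can be sketched in isolation: clear the recorded events, change the node's operational state, then check that the last published event is the expected one. The sketch below is a toy model only. `OpState`, `Event`, `RecordingPublisher`, and `ToyNodeStateManager` are hypothetical stand-ins, not the real Ozone classes, and the assumption that every operational-state change on a healthy node fires the event is taken from the assertions in the snippet above, not from the SCM source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HddsProtos.NodeOperationalState (assumed values).
enum OpState {
  IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED, ENTERING_MAINTENANCE, IN_MAINTENANCE
}

// Hypothetical stand-in for the single event the updated test asserts on.
enum Event { UNHEALTHY_TO_HEALTHY_NODE }

// Records published events so a test can inspect the most recent one.
class RecordingPublisher {
  private final List<Event> events = new ArrayList<>();

  void fire(Event e) { events.add(e); }

  void clearEvents() { events.clear(); }

  // Returns null when nothing has been published since the last clear.
  Event getLastEvent() {
    return events.isEmpty() ? null : events.get(events.size() - 1);
  }
}

// Toy manager. Assumption: each operational-state change on a healthy node
// publishes UNHEALTHY_TO_HEALTHY_NODE, mirroring what the test expects.
class ToyNodeStateManager {
  private final RecordingPublisher publisher;
  private OpState current = OpState.IN_SERVICE;

  ToyNodeStateManager(RecordingPublisher publisher) { this.publisher = publisher; }

  void setNodeOperationalState(OpState next) {
    current = next;
    publisher.fire(Event.UNHEALTHY_TO_HEALTHY_NODE);
  }

  OpState getNodeOperationalState() { return current; }
}

class OpStateEventSketch {
  // Same shape as the reviewed test: start from DECOMMISSIONED, then walk
  // through every operational state and check the last event each time.
  static boolean allStatesFireHealthyEvent() {
    RecordingPublisher publisher = new RecordingPublisher();
    ToyNodeStateManager nsm = new ToyNodeStateManager(publisher);
    nsm.setNodeOperationalState(OpState.DECOMMISSIONED);
    for (OpState s : OpState.values()) {
      publisher.clearEvents();
      nsm.setNodeOperationalState(s);
      if (publisher.getLastEvent() != Event.UNHEALTHY_TO_HEALTHY_NODE) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println(allStatesFireHealthyEvent());
  }
}
```

Clearing the events before each state change matters: without it, an event left over from the previous iteration could make the assertion pass even if the current change published nothing.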

Comment on lines 314 to 315
  // All datanodes on the SCM should have moved to HEALTHY-READONLY state.
- TestHddsUpgradeUtils.testDataNodesStateOnSCM(
-     cluster.getStorageContainerManagersList(), NUM_DATA_NODES,
-     HEALTHY_READONLY, HEALTHY);
+ TestHddsUpgradeUtils.testDataNodesStateOnSCM(cluster.getStorageContainerManagersList(), NUM_DATA_NODES, HEALTHY);
Please update the comment above to:

All datanodes on the SCM should have moved to HEALTHY state.

Please update the comment here as well.

Comment on lines 202 to 207
Please update this comment as well.

@Gargi-jais11
Contributor

I think the dropdown and query in the Overall-Metrics Grafana dashboard still include "healthy_readonly". These should be removed since the state is deprecated.

@Gargi-jais11
Contributor

In TestSCMNodeMetrics, at lines 177 and 249, the metric is still referenced even though it is deprecated. Have you kept it for compatibility checks? If the metric remains, it will always be 0.

@sodonnel
Contributor Author

> In TestSCMNodeMetrics, at lines 177 and 249, the metric is still referenced even though it is deprecated. Have you kept it for compatibility checks? If the metric remains, it will always be 0.

I just didn't know that was there. I have removed the metrics now. It is better to take them away I think.

I have addressed the other comments too.

@Gargi-jais11
Contributor

> > In TestSCMNodeMetrics, at lines 177 and 249, the metric is still referenced even though it is deprecated. Have you kept it for compatibility checks? If the metric remains, it will always be 0.
>
> I just didn't know that was there. I have removed the metrics now. It is better to take them away I think.
>
> I have addressed the other comments too.

Thank you. It happens: when removing code, we don't always know about every place it is used.
