Skip to content

Conversation

@sureshanaparti
Copy link
Contributor

Description

This PR improves PowerFlex/ScaleIO client initialization , authentication and command execution.

  • Handles busy-wait for ScaleIO client authentication, and client creation potential deadlock
  • Adds default timeout 5 minutes for PowerFlex commands (dfv_cfg, etc.) execution (instead of default 60 min.) to fit into asynchronous command (Management Server -> Agent) timeout.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@sureshanaparti
Copy link
Contributor Author

@blueorangutan package

@blueorangutan
Copy link

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov
Copy link

codecov bot commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 27.71084% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.59%. Comparing base (425c4e3) to head (cce01f1).

Files with missing lines Patch % Lines
...ore/client/ScaleIOGatewayClientConnectionPool.java 15.78% 32 Missing ⚠️
...orage/datastore/manager/ScaleIOSDCManagerImpl.java 0.00% 16 Missing ⚠️
...age/datastore/client/ScaleIOGatewayClientImpl.java 44.44% 3 Missing and 2 partials ⚠️
...atastore/driver/ScaleIOPrimaryDataStoreDriver.java 20.00% 4 Missing ⚠️
.../apache/cloudstack/framework/config/ConfigKey.java 0.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.22   #12391      +/-   ##
============================================
- Coverage     17.59%   17.59%   -0.01%     
- Complexity    15598    15600       +2     
============================================
  Files          5910     5910              
  Lines        529692   529710      +18     
  Branches      64716    64713       -3     
============================================
+ Hits          93202    93205       +3     
- Misses       425998   426015      +17     
+ Partials      10492    10490       -2     
Flag Coverage Δ
uitests 3.57% <ø> (ø)
unittests 18.66% <27.71%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances PowerFlex/ScaleIO client initialization, authentication, and command execution by implementing lazy authentication, improving busy-wait handling, and adding default command timeouts. The changes address potential deadlock scenarios and ensure commands complete within asynchronous timeout limits.

Key Changes:

  • Deferred client authentication from constructor to first API call with proper session expiry detection
  • Added 5-minute default timeout for PowerFlex command execution to fit within async command timeouts
  • Improved connection pool synchronization to prevent concurrent client initialization

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
ScaleIOGatewayClientImpl.java Removed eager authentication from constructor, added volatile flag and busy-wait with Thread.yield(), enhanced session expiry checks
ScaleIOGatewayClientConnectionPool.java Improved synchronization with explicit lock object, refactored client creation with Optional, added debug logging
ScaleIOUtil.java Added DEFAULT_TIMEOUT_MS constant, updated all command executions to use timeout, added --file parameter to query commands, simplified removeMdms error handling
ScaleIOSDCManagerImpl.java Fixed canUnprepareSDC logic to return false when multiple pools share same SDC, improved isHostSdcConnected loop structure and logging
ScaleIOPrimaryDataStoreDriver.java Removed throws Exception declarations from method signatures
ScaleIOGatewayClientImplTest.java Moved authentication verifications after API calls to align with lazy authentication
ScaleIOStoragePoolTest.java Updated mocks to include timeout parameter in Script.runSimpleBashScript calls
ScaleIOStorageAdaptorTest.java Updated mocks to include timeout parameter and --file parameter in commands
ConfigKey.java Added valueInDomain convenience method

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +442 to +444
logger.debug("Final attempt succeeded the SDC {} of the pool {} to connect", sdcId, dataStore);
} else {
logger.debug("Final attempt failed the SDC {} of the pool {} to connect", sdcId, dataStore);
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message contains grammatical issues. "Final attempt succeeded the SDC" should be "Final attempt succeeded for the SDC" or "Final attempt to connect the SDC succeeded". Similarly, "Final attempt failed the SDC" should be "Final attempt failed for the SDC" or "Final attempt to connect the SDC failed".

Suggested change
logger.debug("Final attempt succeeded the SDC {} of the pool {} to connect", sdcId, dataStore);
} else {
logger.debug("Final attempt failed the SDC {} of the pool {} to connect", sdcId, dataStore);
logger.debug("Final attempt to connect the SDC {} of the pool {} succeeded", sdcId, dataStore);
} else {
logger.debug("Final attempt to connect the SDC {} of the pool {} failed", sdcId, dataStore);

Copilot uses AI. Check for mistakes.
while (authenticating); // wait for authentication request (if any) to complete (and to pick the new session key)
while (authenticating) { // wait for authentication request (if any)
// to complete (and to pick the new session key)
Thread.yield();
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The busy-wait loop with Thread.yield() can cause excessive CPU usage when waiting for authentication to complete. Thread.yield() is a hint to the scheduler but provides no guaranteed waiting period, so the loop may spin very rapidly consuming CPU cycles. Consider using a small Thread.sleep() instead, or better yet, using proper synchronization primitives like CountDownLatch or wait/notify to eliminate busy-waiting entirely.

Suggested change
Thread.yield();
try {
Thread.sleep(10L);
} catch (InterruptedException ie) {
Thread.currentThread().interrupt();
break;
}

Copilot uses AI. Check for mistakes.
Comment on lines +403 to +408
/**
* Use {@link ConfigKey#valueInScope(Scope, Long)} instead.
*/
public T valueInDomain(Long domainId) {
return valueInScope(Scope.Domain, domainId);
}
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation comment is misleading. It states "Use ConfigKey#valueInScope(Scope, Long) instead" which suggests this method is deprecated, but there is no @deprecated annotation. If this is a deprecated method meant to guide users to the preferred API, it should have the @deprecated annotation. If it's just a convenience method, the documentation should clarify that rather than suggesting users should use a different method.

Copilot uses AI. Check for mistakes.
Comment on lines +128 to +130

wireMockRule.verify(getRequestedFor(urlEqualTo("/api/login"))
.withBasicAuth(new BasicCredentials(username, password)));
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verification call at lines 129-130 is unreachable code. Since the test expects a CloudRuntimeException to be thrown at line 127, execution will not reach the verification statement. This verification should be removed as it serves no purpose and creates misleading test code.

Suggested change
wireMockRule.verify(getRequestedFor(urlEqualTo("/api/login"))
.withBasicAuth(new BasicCredentials(username, password)));

Copilot uses AI. Check for mistakes.
Comment on lines +145 to +147

wireMockRule.verify(getRequestedFor(urlEqualTo("/api/login"))
.withBasicAuth(new BasicCredentials(username, password)));
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verification call at lines 146-147 is unreachable code. Since the test expects a ServerApiException to be thrown at line 144, execution will not reach the verification statement. This verification should be removed as it serves no purpose and creates misleading test code.

Suggested change
wireMockRule.verify(getRequestedFor(urlEqualTo("/api/login"))
.withBasicAuth(new BasicCredentials(username, password)));

Copilot uses AI. Check for mistakes.
@blueorangutan
Copy link

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16303

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

2 participants