Skip to content

Comments

feat: allow completely disabling connect#80

Open
stevenolen wants to merge 7 commits intomainfrom
chore-disable-connect
Open

feat: allow completely disabling connect#80
stevenolen wants to merge 7 commits intomainfrom
chore-disable-connect

Conversation

@stevenolen
Copy link
Collaborator

No description provided.

@claude
Copy link

claude bot commented Feb 10, 2026

Claude finished @stevenolen's task —— View job


  • Gather context and read review guidelines
  • Read changed files and analyze diff
  • Submit PR review with inline feedback

Summary

Submitted a review covering:

  • Downstream side effects: network policies and Workbench DefaultRSConnectServer still reference Connect when it's disabled — worth confirming this is intentional
  • Test improvements: suggest using apierrors.IsNotFound for the error assertion and adding a nil-Enabled complementary test
  • Unrelated generated file: connectruntimeimagespec.go looks like a code-gen catch-up bundled into this PR
  • PR description: suggested adding context on the use case

@stevenolen stevenolen changed the title allow completely disabling connect feat: allow completely disabling connect Feb 10, 2026
Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: allow completely disabling connect

The implementation follows the established pattern from the Flightdeck Enabled field — API type, deepcopy, client-go apply config, CRD, controller guard, and test. Looks clean overall.

A few items worth considering:


1. Downstream side effects when Connect is disabled (site_controller_connect.go:32-35)

When Enabled=false, reconcileConnect returns early, but other reconcilers still run:

  • Network policies: reconcileConnectNetworkPolicy and reconcileConnectSessionNetworkPolicy (site_controller_networkpolicies.go:46-54) will still create Connect-related network policies even though no Connect pods exist. Should these be conditionally skipped?
  • Workbench DefaultRSConnectServer: site_controller_workbench.go:164-168 hardcodes the Workbench spec with a Connect URL. When Connect is disabled, this references a non-existent service. Should this be conditionally omitted?

These may be acceptable (orphan network policies are harmless, dead URL just won't resolve), but worth confirming they don't cause unexpected behavior.


2. Test coverage (site_test.go:1045-1059)

The test correctly verifies the Connect CR is not created. Two suggestions:

  • Assert apierrors.IsNotFound(err) instead of just assert.Error(t, err) to distinguish from unexpected error types.
  • Consider adding a complementary test confirming that when Enabled is nil (the default), Connect is still reconciled normally — this guards against accidental inversion of the nil-check logic.

3. Unrelated generated file (connectruntimeimagespec.go)

The new ConnectRuntimeImageSpecApplyConfiguration file appears unrelated to the Enabled field change. If this was picked up by re-running code generation, that's fine — just noting it looks like a separate catch-up bundled into this PR.


4. PR description

A brief description would help reviewers understand the use case — e.g., why a user would want to disable Connect entirely (cost savings? Connect deployed separately? etc.).

Copy link

@Lytol Lytol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something, but this just skips the reconciliation which isn't going to be enough, especially for existing workloads. At the very least IMO, it should either remove the deployment or set the replicas to 0.

)

// Skip Connect reconciliation if explicitly disabled
if site.Spec.Connect.Enabled != nil && !*site.Spec.Connect.Enabled {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The de-reference of the pointer is pretty weird when Go does that automatically.

@stevenolen
Copy link
Collaborator Author

I might be missing something, but this just skips the reconciliation which isn't going to be enough, especially for existing workloads. At the very least IMO, it should either remove the deployment or set the replicas to 0.

I think initially i was planning to just allow them to be orphaned and clean up, since it's not really that common of a situation. added a bit of work to do the cleanup though.

i'm not sure I love the impl of the cleanup functions, but it appears reasonable. I also handled the network policy reconicilation feedback from claude

@stevenolen stevenolen requested a review from Lytol February 10, 2026 22:22
@ian-flores
Copy link
Collaborator

Updated thoughts after looking at the latest changes — nice progress on the cleanup logic and network policies.

What's looking good now

  • Connect CRD cleanup via cleanupConnect() + BasicDelete
  • Network policies gated behind Enabled check with cleanupConnectNetworkPolicies() for teardown
  • The check moved to the caller in site_controller.go so the enabled/disabled branching is in one place

Remaining concern: CleanupConnect finalizer destroys state

The cleanupConnect() function calls BasicDelete on the Connect CRD. When the Connect CRD is deleted, its finalizer (CleanupConnect in connect.go) runs a full teardown — including CleanupDatabasePasswordSecret and CleanupDatabase. If the intent is to support re-enabling Connect later (set Enabled=true and have it come back), deleting the CRD destroys the state needed for that.

Two options depending on what we want:

  1. If re-enablement isn't needed: current approach works, but worth documenting that disabling Connect is a destructive one-way operation.
  2. If re-enablement should work: instead of deleting the Connect CRD, propagate Enabled to the Connect CRD's own spec (ConnectSpec) and have the Connect controller handle the disabled state — remove runtime resources (Deployment, Service, Ingress) but preserve stateful ones (DB, secrets, PVCs). The CRD stays around as a "dormant" record of the configuration.

Still open

Volume provisioning — FSx/NFS volume creation for Connect in site_controller.go still runs regardless of Enabled. Not critical (unused volumes are just wasteful), but should be gated for consistency.

Workbench DefaultRSConnectServerconnectUrl is still built and passed to Workbench even when Connect is disabled. Workbench will have a config entry pointing at a non-existent service. Should be set to empty string when disabled.

FlightdeckIsEmptyStruct() check won't catch Enabled=false when other fields (like License, Image) are set. Connect would still appear on the dashboard even though it's not running.

@stevenolen
Copy link
Collaborator Author

addressed the feedback. I think claude went a little overboard on the documentation, but it feels "reasonable" given the situation.

the fsx volume creation blocking will leave an orphaned pv, which should happen anyway with the protections we've chosen for these volumes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants