Skip to content

Conversation

@katarimanojk
Copy link
Contributor

No description provided.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 12, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign dasm for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@katarimanojk katarimanojk force-pushed the fix_missing_packages_in_cephnodes branch from 2241460 to 0c9047f Compare December 12, 2025 14:28
@github-actions
Copy link

This PR is stale because it has been for over 15 days with no activity.
Remove stale label or comment or this will be closed in 7 days.

Copy link
Contributor

@fultonj fultonj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you please consider a few small changes?

Copy link
Contributor

@fmount fmount left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than minor things that we can even address in a follow up, the patch seems good to go. @katarimanojk let me know if you have other iterations on it, otherwise we can land it.

@katarimanojk katarimanojk force-pushed the fix_missing_packages_in_cephnodes branch from e90853a to 483825e Compare January 20, 2026 07:38
@katarimanojk
Copy link
Contributor Author

katarimanojk commented Jan 20, 2026

Other than minor things that we can even address in a follow up, the patch seems good to go. @katarimanojk let me know if you have other iterations on it, otherwise we can land it.

@fmount Thanks for the suggestions, updated the patch now

@katarimanojk katarimanojk requested a review from fmount January 20, 2026 07:39
fmount
fmount previously approved these changes Jan 20, 2026
Copy link
Contributor

@fmount fmount left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@fmount
Copy link
Contributor

fmount commented Jan 21, 2026

@katarimanojk as per the recent failures I'm wondering if this patch should be extended to install not only cephadm but also the missing requirements [1].

[1] https://github.com/openstack-k8s-operators/ci-framework/blob/main/roles/cifmw_block_device/tasks/main.yml#L23

@katarimanojk
Copy link
Contributor Author

katarimanojk commented Jan 21, 2026

@katarimanojk as per the recent failures I'm wondering if this patch should be extended to install not only cephadm but also the missing requirements [1].

[1] https://github.com/openstack-k8s-operators/ci-framework/blob/main/roles/cifmw_block_device/tasks/main.yml#L23

@fmount i just updated the patch to install missing package, please review the latest change

@softwarefactory-project-zuul
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/e5db0069d16f45ca8a97bf33b950d6fe

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 08m 33s
✔️ podified-multinode-edpm-deployment-crc SUCCESS in 1h 30m 57s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 22m 58s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 1h 42m 47s
podified-multinode-hci-deployment-crc FAILURE in 53m 22s
✔️ cifmw-pod-zuul-files SUCCESS in 4m 34s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 9m 05s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 25s
✔️ cifmw-molecule-cifmw_cephadm SUCCESS in 4m 14s

hosts: "{{ cifmw_ceph_target | default('computes') }}"
gather_facts: false
become: true
when: cifmw_cephadm_install_on_all_nodes | default(false) | bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope.

ERROR! 'when' is not a valid attribute for a Play

The error appears to be in '/home/zuul/src/github.com/openstack-k8s-operators/ci-framework/hooks/playbooks/ceph.yml': line 101, column 3, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:


- name: Ceph prerequisites
  ^ here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, i missed it, thanks for pointing it. I will fix it in the follow up patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is fixed in the latest patch.

@katarimanojk katarimanojk force-pushed the fix_missing_packages_in_cephnodes branch from f17e5c7 to 3b24394 Compare January 21, 2026 17:21
@michburk
Copy link
Contributor

Could you also include the [cifmw_cephadm] prefix in the commit title and include a signed-off-by line in the commit body?

Also when/if you have a log of this running and working, please share a link (if you need to share a downstream link, include a link in a comment on a jira and mark the comment as Restriced to Red Hat Employee)

Thanks!

@katarimanojk katarimanojk force-pushed the fix_missing_packages_in_cephnodes branch from 3b24394 to 3f4c116 Compare January 22, 2026 06:57
@katarimanojk
Copy link
Contributor Author

katarimanojk commented Jan 22, 2026

Could you also include the [cifmw_cephadm] prefix in the commit title and include a signed-off-by line in the commit body?

Also when/if you have a log of this running and working, please share a link (if you need to share a downstream link, include a link in a comment on a jira and mark the comment as Restriced to Red Hat Employee)

Thanks!

Thanks for the suggestion, commit msg is updated accordingly and Jira has logs (only red hat employee)

@katarimanojk katarimanojk force-pushed the fix_missing_packages_in_cephnodes branch from 3f4c116 to 8d0b439 Compare January 22, 2026 07:02
@katarimanojk katarimanojk requested a review from fmount January 22, 2026 07:02
pre_tasks:
- name: Early stop ceph prerequisites
when:
- not cifmw_cephadm_install_on_all_nodes | default(false) | bool
Copy link
Contributor

@fmount fmount Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still a valid option? I think we extended this playbook to support more than just cephadm, so you might need to install jq in any case.
So either we set a global option to skip_prereq entirely, or we scope this variable to cephadm only.
How much we gain from this option? In theory we have two use cases in CI:

  1. HCI (it might be ok to install cephadm on the 3 compute nodes, because mons are deployed there)
  2. external Ceph: still 3 nodes, still mons co-located w/ OSDs

So the question is: what do you think about removing the condition entirely from both here (L109) and pre.yaml (L25)? This way we could rely on the fact that ansible is idempotent (and not break the existing use cases with this change), we could decide to skip this playbook entirely with --skip-tags if needed, and we can consolidate prerequisites in a single place w/ no distinction between cephadm or other things that we might need in the future (e.g. due to new ceph versions).
I'd like also to hear @fultonj's opinion on this ^

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fmount Thanks for the feedback, I agree with your point about removing the condition
so that the prerequisite packages are installed unconditionally on all nodes.
Also I can add the ceph_prereq tag to the entire play, so if needed, the prerequisite installation can be skipped using --skip-tags ceph_prereq.

Some context on the use cases:

  • Greenfield scenarios: These scenarios will not be harmed by installing the packages again due to Ansible's idempotency
  • HCI adoption: , we use adoption_deploy_ceph.yml (i.e., openstack overcloud ceph deploy) instead of the cifmw ceph playbook, so this change won't impact that workflow

@fultonj we need your thoughts on this.

Copy link
Contributor

@fultonj fultonj Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the proposal from @fmount

Greenfield only installed cephadm on the boostrap node because that's the only node where it is needed for greenfield. That assumption got us this far but now for adoption the requirement has changed so let's just accept it and remove while we add instead of trying to add more complexity to support two scenarios we don't actually have to go out of our way to support. Wish I had thought of that sooner.

pre.yml is called in the bootstrap context; i.e. target hosts are only the single bootstrap node:

https://github.com/openstack-k8s-operators/ci-framework/blob/main/hooks/playbooks/ceph.yml#L410

it could be kept as is and thought of as pre-for-bootstrap and this line could be removed:

https://github.com/openstack-k8s-operators/ci-framework/blob/main/roles/cifmw_cephadm/tasks/pre.yml#L24-L25

You could then take care of prerequisites for all nodes including the cephadm installation separately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fmount and @fultonj . I've updated the PR following your suggestion:

  • Removed the conditional from pre.yml - cephadm always installs on the bootstrap node
  • The "Ceph prerequisites" play installs jq and cephadm on all nodes without any condition.

… nodes

For adoption scenarios with external Ceph, cephadm and dependent packages
may be missing on nodes. Add separate play to install required packages on
all nodes.

Signed-off-by: mkatari <mkatari@redhat.com>
Jira: OSPRH-24863
@katarimanojk katarimanojk force-pushed the fix_missing_packages_in_cephnodes branch from 8d0b439 to 3dcaa11 Compare January 29, 2026 15:02
@katarimanojk katarimanojk requested a review from fmount January 29, 2026 15:07
@fmount
Copy link
Contributor

fmount commented Jan 29, 2026

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Jan 29, 2026
@softwarefactory-project-zuul
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/7404c62f4eae4372badea8c7438ce04f

✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 40m 17s
podified-multinode-edpm-deployment-crc FAILURE in 1h 05m 34s
✔️ cifmw-crc-podified-edpm-baremetal SUCCESS in 1h 33m 13s
✔️ cifmw-crc-podified-edpm-baremetal-minor-update SUCCESS in 2h 13m 56s
✔️ podified-multinode-hci-deployment-crc SUCCESS in 1h 56m 04s
✔️ noop SUCCESS in 0s
✔️ cifmw-pod-ansible-test SUCCESS in 8m 53s
✔️ cifmw-pod-pre-commit SUCCESS in 8m 13s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants