@pohly
Contributor

@pohly pohly commented Dec 9, 2025

#36028 broke https://testgrid.k8s.io/conformance-all#local-up-cluster,%20master%20(dev) and https://testgrid.k8s.io/sig-node-dynamic-resource-allocation#ci-dra-integration (both using local-up-cluster.sh).

Instead of merging an image bump blindly and hoping that it goes well, let's do at least some trial runs with jobs that will be affected by an image bump. The new pull-test-infra-local-e2e is such a job. It gets triggered by edits to the job file (like image bumps) and is optional (can be ignored if the normal job is unstable).
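For illustration only, a presubmit with those two properties (triggered by edits to the job file, and optional) might look roughly like the sketch below. This is not the job definition from this PR: the path regex, the image, and the command are placeholders, and only the job name and the resource values appear in this conversation.

```yaml
# Hypothetical sketch of a Prow presubmit; not the actual job from this PR.
presubmits:
  kubernetes/test-infra:
  - name: pull-test-infra-local-e2e
    # Assumed path: run only when the job file itself changes (e.g. an image bump).
    run_if_changed: '^config/jobs/kubernetes/sig-testing/local-e2e\.yaml$'
    # A failing optional job does not block merging.
    optional: true
    decorate: true
    spec:
      containers:
      - image: example.invalid/placeholder-image:tag  # placeholder, not the real image
        command:
        - echo                                        # placeholder command
        - "run local-up-cluster based e2e here"
        resources:
          limits:
            memory: 6Gi
          requests:
            cpu: 4
            memory: 6Gi
```

With `run_if_changed`, Prow schedules the job automatically only for PRs that touch a matching file; `optional: true` keeps its result from gating the merge.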

/assign @upodroid

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pohly
Once this PR has been reviewed and has the lgtm label, please assign priyankasaggu11929 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/config Issues or PRs related to code in /config area/jobs size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 9, 2025
@k8s-ci-robot k8s-ci-robot added sig/testing Categorizes an issue or PR as relevant to SIG Testing. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 9, 2025
    memory: 6Gi
  requests:
    cpu: 4
    memory: 6Gi
Contributor Author

@pohly pohly Dec 9, 2025

At the moment, this is expected to fail the same way that https://testgrid.k8s.io/conformance-all#local-up-cluster,%20master%20(dev) fails:

E1206 14:48:52.954172 46837 kuberuntime_manager.go:1558] "CreatePodSandbox for pod failed" err="rpc error: code = Unknown desc = failed to start sandbox "5704c40aa11f000b6c1027ed4ecb6c5ccd75154bfee41d50727b14a11c347fd9": failed to create containerd task: failed to create shim task: failed to mount rootfs component: mount source: "overlay", target: "/run/containerd/io.containerd.runtime.v2.task/k8s.io/5704c40aa11f000b6c1027ed4ecb6c5ccd75154bfee41d50727b14a11c347fd9/rootfs", fstype: overlay, flags: 0, data: "workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/127/work,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/127/fs,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1/fs,index=off", err: invalid argument" pod="kube-system/coredns-5c44b89985-kvnxm"

My plan is to verify that it fails, then make a single change to try out the solution that @BenTheElder proposed (mounting an empty dir on /var/lib/containerd).

If that works, we can move that volume mount to the presets to fix all jobs.
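As a rough sketch of what that could look like (the volume name and preset label below are hypothetical, not taken from this PR), the per-job variant adds an `emptyDir` volume to the pod spec, and the preset variant moves the same volume and mount into a Prow preset:

```yaml
# Hypothetical per-job variant: mount an emptyDir on /var/lib/containerd.
spec:
  containers:
  - image: example.invalid/placeholder-image:tag  # placeholder image
    volumeMounts:
    - name: containerd-state                      # hypothetical volume name
      mountPath: /var/lib/containerd
  volumes:
  - name: containerd-state
    emptyDir: {}
---
# Hypothetical preset variant: jobs carrying the label get the same mount.
presets:
- labels:
    preset-containerd-emptydir: "true"            # hypothetical label
  volumes:
  - name: containerd-state
    emptyDir: {}
  volumeMounts:
  - name: containerd-state
    mountPath: /var/lib/containerd
```

The idea, as far as it can be inferred from the error above, is that containerd's overlayfs snapshots would then live on the emptyDir rather than on the container's own root filesystem.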

/cc @dims @bart0sh

@k8s-ci-robot k8s-ci-robot requested review from bart0sh and dims December 9, 2025 07:13
annotations:
  testgrid-create-test-group: 'true'
  testgrid-dashboards: sig-testing-misc
  description: Brings up a cluster using kubetest with local-up-cluster
Contributor Author

It's debatable whether this should be under testgrid-dashboards: presubmits-test-infra. I doubt that it would make much difference in practice. 🤷

@pohly
Contributor Author

pohly commented Dec 9, 2025

/hold

Might be better done as a canary in https://testgrid.k8s.io/sig-testing-canaries.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 9, 2025