Which component are you using?:
/area cluster-autoscaler
What version of the component are you using?:
Component version: 9.53.0
What k8s version are you using (kubectl version)?:
kubectl version Output
$ kubectl version
Client Version: v1.34.2
Kustomize Version: v5.7.1
Server Version: v1.34.2
What environment is this in?:
Exoscale
What did you expect to happen?:
cluster-autoscaler scales down unneeded nodes
What happened instead?:
cluster-autoscaler fails to get node groups:
I1205 08:23:11.326874 1 static_autoscaler.go:270] Starting main loop
I1205 08:23:11.327193 1 log.go:36] exoscale-provider: refreshing node groups cache
I1205 08:23:11.327214 1 log.go:32] exoscale-provider: cluster-autoscaler is disabled: no node groups found
I1205 08:23:11.328046 1 log.go:36] exoscale-provider: looking up node group for node ID REDACTED
I1205 08:23:12.279320 1 filter_out_schedulable.go:65] Filtering out schedulables
I1205 08:23:12.279352 1 filter_out_schedulable.go:122] 0 pods marked as unschedulable can be scheduled.
I1205 08:23:12.279378 1 filter_out_schedulable.go:85] No schedulable pods
I1205 08:23:12.279386 1 filter_out_daemon_sets.go:47] Filtered out 0 daemon set pods, 0 unschedulable pods left
I1205 08:23:12.279436 1 static_autoscaler.go:520] No unschedulable pods
I1205 08:23:12.279457 1 static_autoscaler.go:559] Calculating unneeded nodes
I1205 08:23:12.279479 1 log.go:36] exoscale-provider: looking up node group for node ID REDACTED
I1205 08:23:12.733911 1 static_autoscaler.go:602] Scale down status: lastScaleUpTime=2025-12-05 07:18:01.453406785 +0000 UTC m=-3583.024926371 lastScaleDownDeleteTime=2025-12-05 07:18:01.453406785 +0000 UTC m=-3583.024926371 lastScaleDownFailTime=2025-12-05 07:18:01.453406785 +0000 UTC m=-3583.024926371 scaleDownForbidden=false scaleDownInCooldown=false
I1205 08:23:12.733970 1 static_autoscaler.go:624] Starting scale down: no scale down candidates. skipping...
I1205 08:23:12.733997 1 log.go:36] exoscale-provider: looking up node group for node ID REDACTED
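As far as I can tell, the node ID the provider is looking up comes from each node's spec.providerID. For reference, those values can be inspected with plain kubectl (nothing Exoscale-specific assumed):

$ kubectl get nodes -o custom-columns=NAME:.metadata.name,PROVIDER_ID:.spec.providerID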
How to reproduce it (as minimally and precisely as possible):
- Create a Kubernetes cluster with the metrics server in Exoscale
- Create an IAM role in Exoscale with the following policy:
{
  "default-service-strategy": "deny",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {
          "expression": "operation == 'get-instance'",
          "action": "allow"
        },
        {
          "expression": "operation == 'get-instance-pool'",
          "action": "allow"
        },
        {
          "expression": "operation == 'get-operation'",
          "action": "allow"
        },
        {
          "expression": "operation == 'get-quota'",
          "action": "allow"
        },
        {
          "expression": "operation == 'list-sks-clusters'",
          "action": "allow"
        },
        {
          "expression": "operation == 'scale-sks-nodepool'",
          "action": "allow"
        },
        {
          "expression": "operation == 'evict-sks-nodepool-members'",
          "action": "allow"
        }
      ]
    }
  }
}
- Create an API key for this IAM role and put it in a Secret together with the zone of the cluster:
apiVersion: v1
data:
  api-key: REDACTED
  api-secret: REDACTED
  api-zone: 'ch-dk-2'
kind: Secret
metadata:
  name: cluster-autoscaler-exoscale-cluster-autoscaler
type: Opaque
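The manifest above is what the resulting Secret looks like; a minimal sketch of creating it with kubectl instead, which base64-encodes the data values automatically (the Secret name mirrors the manifest, and the namespace is assumed to be the one the chart is installed into):

$ kubectl create secret generic cluster-autoscaler-exoscale-cluster-autoscaler \
    --namespace kube-system \
    --from-literal=api-key=REDACTED \
    --from-literal=api-secret=REDACTED \
    --from-literal=api-zone=ch-dk-2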
- Install cluster-autoscaler with the following values:
cluster-autoscaler:
  cloudProvider: exoscale
  autoDiscovery:
    clusterName: gitlab-runner
  extraArgs:
    scale-down-unneeded-time: 15m
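For completeness, a minimal install sketch with Helm; the values above are nested under a cluster-autoscaler: key, which suggests the chart is consumed as a dependency of a parent chart, so for a direct install of the upstream chart drop that top-level key and save the rest as values.yaml:

$ helm repo add autoscaler https://kubernetes.github.io/autoscaler
$ helm repo update
$ helm upgrade --install cluster-autoscaler autoscaler/cluster-autoscaler \
    --namespace kube-system \
    -f values.yaml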
- Scale a workload on it to trigger scaling (a sketch follows below)
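Any workload that outgrows the current capacity works; a hypothetical example (the deployment name is made up) that first forces a scale-up and then leaves nodes unneeded:

$ kubectl scale deployment my-workload --replicas=30   # force a scale-up
$ kubectl scale deployment my-workload --replicas=1    # after scale-down-unneeded-time (15m above), the extra nodes should be removed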
Anything else we need to know?:
It started happening on 4 December 2025 at 13:00 UTC. I think it might be related to an Exoscale API change:
https://openapi-v2.exoscale.com/compare/main..e63a3347-d83b-4c4d-b2f8-acda36f96348
I was on 9.52.1 and tried updating to 9.53.0, but it didn't change anything.