cluster-autoscaler: Exoscale fails to get node group #8893

@owngr

Which component are you using?:

/area cluster-autoscaler

What version of the component are you using?:

Component version: 9.53.0

What k8s version are you using (kubectl version)?:

kubectl version Output
$ kubectl version

Client Version: v1.34.2
Kustomize Version: v5.7.1
Server Version: v1.34.2

What environment is this in?:

Exoscale

What did you expect to happen?:

cluster-autoscaler scaling down unneeded nodes

What happened instead?:

cluster-autoscaler fails to get node groups:

I1205 08:23:11.326874       1 static_autoscaler.go:270] Starting main loop
I1205 08:23:11.327193       1 log.go:36] exoscale-provider: refreshing node groups cache
I1205 08:23:11.327214       1 log.go:32] exoscale-provider: cluster-autoscaler is disabled: no node groups found
I1205 08:23:11.328046       1 log.go:36] exoscale-provider: looking up node group for node ID REDACTED
I1205 08:23:12.279320       1 filter_out_schedulable.go:65] Filtering out schedulables
I1205 08:23:12.279352       1 filter_out_schedulable.go:122] 0 pods marked as unschedulable can be scheduled.
I1205 08:23:12.279378       1 filter_out_schedulable.go:85] No schedulable pods
I1205 08:23:12.279386       1 filter_out_daemon_sets.go:47] Filtered out 0 daemon set pods, 0 unschedulable pods left
I1205 08:23:12.279436       1 static_autoscaler.go:520] No unschedulable pods
I1205 08:23:12.279457       1 static_autoscaler.go:559] Calculating unneeded nodes
I1205 08:23:12.279479       1 log.go:36] exoscale-provider: looking up node group for node ID REDACTED
I1205 08:23:12.733911       1 static_autoscaler.go:602] Scale down status: lastScaleUpTime=2025-12-05 07:18:01.453406785 +0000 UTC m=-3583.024926371 lastScaleDownDeleteTime=2025-12-05 07:18:01.453406785 +0000 UTC m=-3583.024926371 lastScaleDownFailTime=2025-12-05 07:18:01.453406785 +0000 UTC m=-3583.024926371 scaleDownForbidden=false scaleDownInCooldown=false
I1205 08:23:12.733970       1 static_autoscaler.go:624] Starting scale down: no scale down candidates. skipping...
I1205 08:23:12.733997       1 log.go:36] exoscale-provider: looking up node group for node ID REDACTED
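
To spot this condition quickly in a longer log, the provider lines can be filtered out and checked for the "no node groups found" message. The following is only a minimal sketch: the function names `provider_messages` and `node_groups_missing` are hypothetical helpers, not part of cluster-autoscaler, and the message text is copied from the excerpt above, so it may differ in other versions.

```python
import re

# Captures the message part of lines emitted by the Exoscale provider, e.g.
# "... log.go:36] exoscale-provider: refreshing node groups cache".
PROVIDER_LINE = re.compile(r"exoscale-provider: (.*)$")

# Message copied from the log excerpt in this issue (assumed stable across versions).
DISABLED_MESSAGE = "cluster-autoscaler is disabled: no node groups found"


def provider_messages(log_text):
    """Return the message part of every exoscale-provider log line, in order."""
    messages = []
    for line in log_text.splitlines():
        match = PROVIDER_LINE.search(line)
        if match:
            messages.append(match.group(1).strip())
    return messages


def node_groups_missing(log_text):
    """True if the provider reported that it found no node groups."""
    return DISABLED_MESSAGE in provider_messages(log_text)


# First lines of the log excerpt above, used as sample input.
SAMPLE_LOG = """\
I1205 08:23:11.326874       1 static_autoscaler.go:270] Starting main loop
I1205 08:23:11.327193       1 log.go:36] exoscale-provider: refreshing node groups cache
I1205 08:23:11.327214       1 log.go:32] exoscale-provider: cluster-autoscaler is disabled: no node groups found
I1205 08:23:11.328046       1 log.go:36] exoscale-provider: looking up node group for node ID REDACTED
"""
```

Calling `node_groups_missing(SAMPLE_LOG)` on the excerpt reports the condition; the same function can be fed the full output of `kubectl logs` for the autoscaler pod.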

How to reproduce it (as minimally and precisely as possible):

  1. Create a Kubernetes cluster with metrics server in Exoscale
  2. Create an IAM Role in Exoscale with the following policy:
{
  "default-service-strategy": "deny",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {
          "expression": "operation == 'get-instance'",
          "action": "allow"
        },
        {
          "expression": "operation == 'get-instance-pool'",
          "action": "allow"
        },
        {
          "expression": "operation == 'get-operation'",
          "action": "allow"
        },
        {
          "expression": "operation == 'get-quota'",
          "action": "allow"
        },
        {
          "expression": "operation == 'list-sks-clusters'",
          "action": "allow"
        },
        {
          "expression": "operation == 'scale-sks-nodepool'",
          "action": "allow"
        },
        {
          "expression": "operation == 'evict-sks-nodepool-members'",
          "action": "allow"
        }
      ]
    }
  }
}
  3. Create a key for this IAM Role and put it in a secret with the zone of the cluster:
apiVersion: v1
data:
  api-key: REDACTED
  api-secret: REDACTED
  api-zone: 'ch-dk-2'
kind: Secret
metadata:
  name: cluster-autoscaler-exoscale-cluster-autoscaler
type: Opaque
  4. Install cluster-autoscaler with the following values:
cluster-autoscaler:
  cloudProvider: exoscale
  autoDiscovery:
    clusterName: gitlab-runner
  extraArgs:
    scale-down-unneeded-time: 15m
  5. Scale a workload on the cluster to trigger scaling
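
Since the suspicion is an API change, it may help to list exactly which operations the IAM policy from the steps above allows, so that the list can be compared against whatever operations the provider calls now. The following is a rough sketch, not an Exoscale tool: `allowed_operations` is a hypothetical helper, and it only understands the simple `operation == '...'` expressions used in this policy.

```python
import json
import re

# Matches the simple equality expressions used in the policy above,
# e.g. "operation == 'get-instance'". Other expression forms are ignored.
OPERATION_EQ = re.compile(r"operation\s*==\s*'([^']+)'")


def allowed_operations(policy, service="compute"):
    """Return the sorted operation names that the policy explicitly allows."""
    rules = policy.get("services", {}).get(service, {}).get("rules", [])
    operations = []
    for rule in rules:
        if rule.get("action") != "allow":
            continue
        operations.extend(OPERATION_EQ.findall(rule.get("expression", "")))
    return sorted(operations)


# The policy from the reproduction steps, reformatted compactly.
POLICY = json.loads("""
{
  "default-service-strategy": "deny",
  "services": {
    "compute": {
      "type": "rules",
      "rules": [
        {"expression": "operation == 'get-instance'", "action": "allow"},
        {"expression": "operation == 'get-instance-pool'", "action": "allow"},
        {"expression": "operation == 'get-operation'", "action": "allow"},
        {"expression": "operation == 'get-quota'", "action": "allow"},
        {"expression": "operation == 'list-sks-clusters'", "action": "allow"},
        {"expression": "operation == 'scale-sks-nodepool'", "action": "allow"},
        {"expression": "operation == 'evict-sks-nodepool-members'", "action": "allow"}
      ]
    }
  }
}
""")

for operation in allowed_operations(POLICY):
    print(operation)
```

Because the policy's default strategy is `deny`, any operation that does not appear in this list would be rejected for this key.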

Anything else we need to know?:

It started happening on 4 December 2025 at 13:00 UTC. I think it might be related to an Exoscale API change:
https://openapi-v2.exoscale.com/compare/main..e63a3347-d83b-4c4d-b2f8-acda36f96348

I was on 9.52.1 and tried updating to 9.53.0, but it didn't change anything.
