Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: Allda. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
A new init container is added to the workspace deployment in case the user chooses to restore the workspace from a backup. By setting the workspace attribute "controller.devfile.io/restore-workspace", the controller sets up a new init container instead of cloning data from the git repository. By default, an automated path to the restore image is derived from cluster settings. However, the user can override that value using another attribute, "controller.devfile.io/restore-source-image". The restore container runs a workspace-recovery.sh script that pulls an image using oras and extracts its files to the /projects directory. Signed-off-by: Ales Raszka <araszka@redhat.com>
New tests verify that the workspace is created from a backup. They check whether the deployment is ready and whether it contains the new restore init container with the proper configuration. There are two tests: one focused on the common PVC and the other on per-workspace storage. Signed-off-by: Ales Raszka <araszka@redhat.com>
The condition for whether a workspace should be restored from a backup was in the restore module itself, which made the code harder to read. Now the condition is checked in the controller itself, and the restore container is only added when enabled. This commit also addresses a few minor items from the code review comments:
- License header
- Attribute validation
- A test for disabled workspace recovery
- Typos
Signed-off-by: Ales Raszka <araszka@redhat.com>
A new config section is added to control the restore container. Default values are set for the new init container, and the user can change them in the config. The config uses the same logic as the project-clone container config. Signed-off-by: Ales Raszka <araszka@redhat.com>
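As a rough illustration of the kind of override this enables, a DevWorkspaceOperatorConfig fragment might look like the sketch below. The `restoreContainer` key and its fields are hypothetical, named only by analogy with the project-clone container config; the `backupCronJob` fields are the ones that appear elsewhere in this thread.

```yaml
# Hypothetical DWOC fragment -- "restoreContainer" and its fields are
# illustrative assumptions, not the operator's actual schema.
apiVersion: controller.devfile.io/v1alpha1
kind: DevWorkspaceOperatorConfig
metadata:
  name: devworkspace-operator-config
  namespace: openshift-operators
config:
  workspace:
    backupCronJob:
      enable: true
      schedule: '*/30 * * * *'
      registry:
        path: registry.example.com/backups   # assumption: example registry
        authSecret: my-registry-auth         # assumption: example secret name
    # Hypothetical override section, by analogy with projectCloneConfig:
    restoreContainer:
      image: registry.example.com/restore-tool:latest
      imagePullPolicy: IfNotPresent
```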
Force-pushed from 1b95d94 to 5480c1c
@Allda : I'm facing a strange issue while testing this functionality on a CRC cluster. I've tried both the amd64 and arm64 variants but hit the same issue. I used samples/plain-workspace.yaml for testing. Everything goes fine until step 4, but when I create the restore backup manifest I can see that the devworkspace resource is created but there is no corresponding pod for it:

oc create -f restore-dw.yaml
devworkspace.workspace.devfile.io/plain-devworkspace created
oc get dw
NAME DEVWORKSPACE ID PHASE INFO
plain-devworkspace workspace612b8ddca9ff45d5 Running Workspace is running
oc get pods
No resources found in rokumar-dev namespace.

I had just modified the name from the restore manifest you shared:

kind: DevWorkspace
apiVersion: workspace.devfile.io/v1alpha2
metadata:
  labels:
    controller.devfile.io/creator: ""
  name: plain-devworkspace
spec:
  started: true
  routingClass: 'basic'
  template:
    attributes:
      controller.devfile.io/storage-type: common
      controller.devfile.io/restore-workspace: 'true'

Could you please check if I'm missing something?
I am not sure why it doesn't work on your system. I tried the workspace you mentioned, and the backup and other pods were created successfully. Are there any logs you can share, or can you check whether the workspace has any pods at the very start?
Signed-off-by: Ales Raszka <araszka@redhat.com>
I think I found the cause of #1572 (comment). It seems unrelated to this PR. I will need to gather more evidence and create a fix.
pkg/library/env/workspaceenv.go
Outdated
Name:  devfileConstants.ProjectsRootEnvVar,
Value: constants.DefaultProjectsSourcesRoot,
})
if workspace.Config.Workspace.BackupCronJob.OrasConfig != nil {
To me it seems like a potential nil pointer issue when workspace restore is enabled but the backup configuration is not set in the DWOC.
Suggested change:

if workspace.Config.Workspace.BackupCronJob != nil &&
	workspace.Config.Workspace.BackupCronJob.OrasConfig != nil {
pkg/secrets/backup.go
Outdated
err = c.Delete(ctx, existingNamespaceSecret)
if err != nil {
	return nil, err
}
}
namespaceSecret = &corev1.Secret{
	ObjectMeta: metav1.ObjectMeta{
		Name:      constants.DevWorkspaceBackupAuthSecretName,
		Namespace: workspace.Namespace,
		Labels: map[string]string{
			constants.DevWorkspaceIDLabel:          workspace.Status.DevWorkspaceId,
			constants.DevWorkspaceWatchSecretLabel: "true",
		},
	},
	Data: sourceSecret.Data,
	Type: sourceSecret.Type,
}
if err := controllerutil.SetControllerReference(workspace, namespaceSecret, scheme); err != nil {
	return nil, err
}
err = c.Create(ctx, namespaceSecret)
When multiple workspaces start simultaneously, they race to copy the same secret, causing failures:

err = c.Delete(ctx, existingNamespaceSecret) // race window opens
// time gap
err = c.Create(ctx, namespaceSecret) // race window closes

I think you can update instead of delete and create. Does this make sense?
I replaced the function with SyncObjectWithCluster, which performs object sync in a standardized way and only syncs when the secret has changed.
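The difference between delete/re-create and sync can be sketched with a toy in-memory "cluster". This is illustrative only; SyncObjectWithCluster itself operates on real Kubernetes objects through the controller-runtime client. The idea is that an upsert never leaves a window where the object is absent.

```go
package main

import "fmt"

// fakeCluster stands in for the API server: secret name -> secret data.
type fakeCluster map[string]string

// syncSecret creates the secret if absent and updates it only when the
// data differs, avoiding the delete/re-create race window where the
// secret briefly does not exist.
func syncSecret(c fakeCluster, name, data string) string {
	existing, ok := c[name]
	switch {
	case !ok:
		c[name] = data
		return "created"
	case existing != data:
		c[name] = data
		return "updated"
	default:
		return "unchanged"
	}
}

func main() {
	c := fakeCluster{}
	fmt.Println(syncSecret(c, "backup-auth", "token-a")) // created
	fmt.Println(syncSecret(c, "backup-auth", "token-a")) // unchanged
	fmt.Println(syncSecret(c, "backup-auth", "token-b")) // updated
}
```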
pkg/library/restore/restore.go
Outdated
	MountPath: constants.DefaultProjectsSourcesRoot,
	},
}
registryAuthSecret, err := secrets.HandleRegistryAuthSecret(ctx, k8sClient, workspace.DevWorkspace, workspace.Config, "", scheme, log)
Could you please clarify why you passed an empty string for operatorNamespace?
I noticed that the backup implementation retrieves the operator namespace and passes it to the same function.
Sorry if this has already been discussed. If so, please point that out to me.
@akurinnoy I created a similar discussion here: #1572 (comment)
I addressed this based on the suggestion.
Signed-off-by: Ales Raszka <araszka@redhat.com>
Signed-off-by: Ales Raszka <araszka@redhat.com>
In case the backup config is not present, the value might be nil and cause a failure. The new condition handles it. Signed-off-by: Ales Raszka <araszka@redhat.com>
The delay between checking the empty dir and copying the backup content might cause an issue. This fix moves the check right before the content is copied, which minimizes the delay. Signed-off-by: Ales Raszka <araszka@redhat.com>
The previous solution always deleted the secret if it existed and re-created it, which can lead to potential issues. The new code uses SyncObjectWithCluster, which is used across the whole codebase and minimizes the risk of issues. Signed-off-by: Ales Raszka <araszka@redhat.com>
Force-pushed from 50d4dd0 to 27ee785
@dkwon17 I addressed all the code review comments. Is this PR good to be merged?
pkg/constants/constants.go
Outdated
// Role kinds
Role = "Role"
// ClusterRole kind
ClusterRole = "ClusterRole"
This seems like a very generic name and can be mistaken for an API concept or an exported constant from Kubernetes (which doesn't exist).
I suggest renaming it to something like rbacRoleKind / rbacClusterRoleKind, or documenting that this is a literal Kind string used for comparison.
Suggested change:

// Role kind
rbacRoleKind = "Role"
// ClusterRole kind
rbacClusterRoleKind = "ClusterRole"
func HandleRegistryAuthSecret(ctx context.Context, c client.Client, workspace *dw.DevWorkspace,
	dwOperatorConfig *controllerv1alpha1.OperatorConfiguration, operatorConfigNamespace string, scheme *runtime.Scheme, log logr.Logger,
) (*corev1.Secret, error) {
	secretName := dwOperatorConfig.Workspace.BackupCronJob.Registry.AuthSecret
Potential nil pointer dereference:
Suggested change:

if dwOperatorConfig.Workspace == nil ||
	dwOperatorConfig.Workspace.BackupCronJob == nil ||
	dwOperatorConfig.Workspace.BackupCronJob.Registry == nil {
	return nil, fmt.Errorf("backup/restore configuration not properly set in DevWorkspaceOperatorConfig")
}
secretName := dwOperatorConfig.Workspace.BackupCronJob.Registry.AuthSecret
// Construct the desired secret state
desiredSecret := &corev1.Secret{
	ObjectMeta: metav1.ObjectMeta{
		Name:      constants.DevWorkspaceBackupAuthSecretName,
		Namespace: workspace.Namespace,
		Labels: map[string]string{
			constants.DevWorkspaceIDLabel:          workspace.Status.DevWorkspaceId,
			constants.DevWorkspaceWatchSecretLabel: "true",
		},
	},
	Data: sourceSecret.Data,
	Type: sourceSecret.Type,
}
There seems to be another race condition in secret copying: if multiple workspaces are restoring simultaneously in the same namespace, they will race to create/update the same secret name. Does this make sense?
Suggested change (make the secret name per-workspace):

// Construct the desired secret state
desiredSecret := &corev1.Secret{
	ObjectMeta: metav1.ObjectMeta{
		Name:      constants.DevWorkspaceBackupAuthSecretName + "-" + workspace.Status.DevWorkspaceId,
		Namespace: workspace.Namespace,
		Labels: map[string]string{
			constants.DevWorkspaceIDLabel:          workspace.Status.DevWorkspaceId,
			constants.DevWorkspaceWatchSecretLabel: "true",
		},
	},
	Data: sourceSecret.Data,
	Type: sourceSecret.Type,
}
I don't think we need an auth secret for each workspace, especially since the secret has the same data. How about just removing the constants.DevWorkspaceIDLabel: workspace.Status.DevWorkspaceId label from the secret?
@Allda I suggest removing constants.DevWorkspaceIDLabel :
Suggested change (drop the per-workspace ID label):

// Construct the desired secret state
desiredSecret := &corev1.Secret{
	ObjectMeta: metav1.ObjectMeta{
		Name:      constants.DevWorkspaceBackupAuthSecretName,
		Namespace: workspace.Namespace,
		Labels: map[string]string{
			constants.DevWorkspaceWatchSecretLabel: "true",
		},
	},
	Data: sourceSecret.Data,
	Type: sourceSecret.Type,
}
Now that I'm testing with the updated changes, I'm seeing an issue after creating the restored workspace manifest. The restore pod correctly starts; however, the DevWorkspace is not able to come out of
}
if workspace.Config.Workspace.ProjectCloneConfig.ImagePullPolicy != "" {
	projectCloneOptions.PullPolicy = config.Workspace.ProjectCloneConfig.ImagePullPolicy
if restore.IsWorkspaceRestoreRequested(&workspace.Spec.Template) {
I see that every reconcile checks IsWorkspaceRestoreRequested(); I think this would keep adding the restore init container, and the workspace would never reach the Running phase.
The restore attribute should be automatically removed after successful completion.
I think this would keep adding restore init container, and the workspace would never reach Running phase.
In this situation, the restore container is basically an alternative to the project-clone container, which (before this PR) was also added on every reconciliation, so I don't think this is a problem.
Did you face specific issues during testing?
While I was testing yesterday, I kept bumping into #1572 (comment).
The DevWorkspace was able to reach the Running state once I patched it to remove the restore attribute.
I'll check again whether it is a problem with my setup.
@rohanKanojia could you please share the DevWorkspace yaml that you used for the restore workspace?
@dkwon17 : I'm trying to reproduce it via a script (restore-external-registry-test.sh):

#!/usr/bin/env bash
set -euo pipefail
source ./utils.sh
NAMESPACE="openshift-operators"
RESTORE_ATTRIBUTE="controller.devfile.io/restore-workspace"
WORKSPACE_NAME_PREFIX="${1:-test-devworkspace}"
BACKUP_WORKSPACE_NAME="${WORKSPACE_NAME_PREFIX}-should-get-backup"
# 1️⃣ Delete the workspace if it exists
if kubectl get devworkspace "$BACKUP_WORKSPACE_NAME" -n "$NAMESPACE" >/dev/null 2>&1; then
echo "Deleting existing workspace: $BACKUP_WORKSPACE_NAME"
kubectl delete devworkspace "$BACKUP_WORKSPACE_NAME" -n "$NAMESPACE" --wait
fi
# 2️⃣ Apply the restore workspace manifest
cat <<EOF | kubectl apply -f -
apiVersion: workspace.devfile.io/v1alpha2
kind: DevWorkspace
metadata:
name: $BACKUP_WORKSPACE_NAME
namespace: $NAMESPACE
spec:
started: true
template:
attributes:
$RESTORE_ATTRIBUTE: 'true'
projects:
- name: web-nodejs-sample
git:
remotes:
origin: "https://github.com/che-samples/web-nodejs-sample.git"
components:
- name: dev
container:
image: quay.io/devfile/universal-developer-image:latest
memoryLimit: 512Mi
memoryRequest: 256Mi
cpuRequest: 1000m
commands:
- id: say-hello
exec:
component: dev
commandLine: echo "Hello from \$(pwd)"
workingDir: \${PROJECT_SOURCE}/app
contributions:
- name: che-code
uri: https://eclipse-che.github.io/che-plugin-registry/main/v3/plugins/che-incubator/che-code/latest/devfile.yaml
components:
- name: che-code-runtime-description
container:
env:
- name: CODE_HOST
value: 0.0.0.0
EOF
echo "Workspace $BACKUP_WORKSPACE_NAME created with restore attribute"
# 3️⃣ Wait until the workspace is ready
echo "Waiting for workspace to start..."
kubectl wait devworkspace "$BACKUP_WORKSPACE_NAME" -n "$NAMESPACE" --for=condition=Ready --timeout=120s
# Wait up to 120s for pod to appear
echo "Waiting for workspace pod to be created..."
for i in {1..24}; do
POD_NAME=$(kubectl get pods -n "$NAMESPACE" \
-l "controller.devfile.io/devworkspace_name=$BACKUP_WORKSPACE_NAME" \
-o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
if [[ -n "$POD_NAME" ]]; then
break
fi
sleep 5
done
if [[ -z "$POD_NAME" ]]; then
echo "❌ Workspace pod was not created in time"
exit 1
fi
echo "✅ Workspace pod created: $POD_NAME"
# 4️⃣ Print controller logs to verify restore logic execution
echo ""
echo "=========================================="
echo "Controller logs (restore logic execution):"
echo "=========================================="
CONTROLLER_POD=$(kubectl get pods -n "$NAMESPACE" -l app.kubernetes.io/name=devworkspace-controller -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "")
if [[ -n "$CONTROLLER_POD" ]]; then
kubectl logs -n "$NAMESPACE" "$CONTROLLER_POD" --tail=100 | grep -i "restore\|$BACKUP_WORKSPACE_NAME" || echo "No restore-related logs found in recent entries"
else
echo "⚠️ Controller pod not found, skipping controller logs"
fi
# 5️⃣ Print workspace pod logs
echo ""
echo "=========================================="
echo "Workspace pod logs:"
echo "=========================================="
kubectl logs -n "$NAMESPACE" "$POD_NAME" --all-containers=true --tail=50 2>/dev/null || echo "⚠️ Could not fetch pod logs (pod may still be initializing)"
# 6️⃣ Verify restored file exists (example: README.md from backup)
# Adjust this path based on what your backup contains
echo ""
echo "=========================================="
echo "Verifying restored file:"
echo "=========================================="
# First, list the projects directory to see what was restored
echo "Contents of /projects directory:"
kubectl exec -n "$NAMESPACE" "$POD_NAME" -- ls -la /projects/ || true
# Find the project clone directory (it may have a random suffix)
PROJECT_DIR=$(kubectl exec -n "$NAMESPACE" "$POD_NAME" -- sh -c 'find /projects -maxdepth 1 -type d -name "project-clone-*" | head -n 1' 2>/dev/null || echo "")
if [[ -z "$PROJECT_DIR" ]]; then
echo "⚠️ No project-clone-* directory found, checking for direct web-nodejs-sample directory..."
PROJECT_DIR="/projects"
fi
echo "Project directory: $PROJECT_DIR"
# Check for the restored file
RESTORED_FILE="$PROJECT_DIR/web-nodejs-sample/README.md"
echo "Checking if restored file exists: $RESTORED_FILE"
if kubectl exec -n "$NAMESPACE" "$POD_NAME" -- test -f "$RESTORED_FILE"; then
echo "✅ Restored file exists!"
echo ""
echo "Content preview (first 5 lines):"
kubectl exec -n "$NAMESPACE" "$POD_NAME" -- head -n 5 "$RESTORED_FILE"
echo ""
# Verify the specific modification from backup-external-registry-test.sh is present
echo "Verifying backup modification is present..."
if kubectl exec -n "$NAMESPACE" "$POD_NAME" -- grep -q "## Modified via backup test" "$RESTORED_FILE"; then
echo "✅ Backup modification found in restored file!"
echo ""
echo "Last 3 lines of restored file:"
kubectl exec -n "$NAMESPACE" "$POD_NAME" -- tail -n 3 "$RESTORED_FILE"
else
echo "❌ Backup modification NOT found in restored file"
echo ""
echo "Full file content:"
kubectl exec -n "$NAMESPACE" "$POD_NAME" -- cat "$RESTORED_FILE"
exit 1
fi
else
echo "❌ Restored file missing: $RESTORED_FILE"
echo ""
echo "Directory structure:"
kubectl exec -n "$NAMESPACE" "$POD_NAME" -- find /projects -type f -name "README.md" || true
exit 1
fi
echo "✅ Restore test passed!"
@rohanKanojia I can reproduce the issue, I am investigating
Signed-off-by: Ales Raszka <araszka@redhat.com>
Thank you @Allda, after these suggestions, I believe we are good to merge
While testing the latest changes, I'm facing an issue with the backup process in the OpenShift internal registry. It seems to be an authentication-related problem. The same backup OCP flow test script runs successfully on the main branch without any issues. I'm sharing logs. I'm testing it via this script:

#!/usr/bin/env bash
set -euo pipefail
source ./utils.sh
# -------------------------
# Defaults
# -------------------------
WORKSPACE_NAME_PREFIX="${1:-test-devworkspace}"
WORKSPACE_STOPPED="${WORKSPACE_NAME_PREFIX}-should-get-backup"
WORKSPACE_RUNNING="${WORKSPACE_NAME_PREFIX}-no-backup"
MANIFEST_URL="${2:-https://raw.githubusercontent.com/devfile/devworkspace-operator/refs/heads/main/samples/code-latest.yaml}"
DWO_CONFIG_NAME="devworkspace-operator-config"
DWO_NS="openshift-operators"
kubectl config set-context --current --namespace="$DWO_NS"
# -------------------------
# Get OpenShift internal registry route
# -------------------------
echo "🔍 Getting OpenShift internal registry route..."
REGISTRY_SERVICE="default-route-openshift-image-registry.apps-crc.testing"
log_success "Registry route: $REGISTRY_SERVICE"
echo "Will stop workspace to allow backup : $WORKSPACE_STOPPED"
echo "Will keep running workspace to avoid backup : $WORKSPACE_RUNNING"
echo
# -------------------------
# Create or Patch DevWorkspaceOperatorConfig
# -------------------------
echo "⚙️ Enabling backup CronJob with OpenShift internal registry..."
if kubectl get devworkspaceoperatorconfig "$DWO_CONFIG_NAME" -n "$DWO_NS" >/dev/null 2>&1; then
# Config exists, patch it
echo "DevWorkspaceOperatorConfig exists, patching..."
kubectl patch devworkspaceoperatorconfig "$DWO_CONFIG_NAME" -n "$DWO_NS" --type merge -p "
config:
workspace:
backupCronJob:
oras:
extraArgs: '--insecure'
enable: true
schedule: '*/1 * * * *'
registry:
path: ${REGISTRY_SERVICE}
authSecret: ""
"
else
# Config doesn't exist, create it
echo "DevWorkspaceOperatorConfig not found, creating..."
cat <<EOF | kubectl apply -f -
apiVersion: controller.devfile.io/v1alpha1
kind: DevWorkspaceOperatorConfig
metadata:
name: $DWO_CONFIG_NAME
namespace: $DWO_NS
config:
workspace:
backupCronJob:
oras:
extraArgs: '--insecure'
enable: true
schedule: '*/1 * * * *'
registry:
path: ${REGISTRY_SERVICE}
authSecret: ""
EOF
fi
log_success "DevWorkspaceOperatorConfig configured for backup"
# -------------------------
# Create both DevWorkspaces
# -------------------------
echo "🚀 Creating DevWorkspaces..."
deploy_devworkspace "$WORKSPACE_STOPPED" "$MANIFEST_URL"
deploy_devworkspace "$WORKSPACE_RUNNING" "$MANIFEST_URL"
log_success "Both workspaces are running"
sleep 5
# -------------------------
# Modify file in stopped workspace
# -------------------------
echo "📝 Modifying README.md in $WORKSPACE_STOPPED..."
POD_STOPPED=$(kubectl get pod -n "$DWO_NS" \
-l controller.devfile.io/devworkspace_name="$WORKSPACE_STOPPED" \
-o jsonpath='{.items[0].metadata.name}')
sleep 5
kubectl exec "$POD_STOPPED" -n "$DWO_NS" -- \
bash -c 'echo "## Modified via backup test" >> /projects/web-nodejs-sample/README.md'
log_success "File modified"
# -------------------------
# Stop ONLY one workspace
# -------------------------
echo "🛑 Stopping workspace: $WORKSPACE_STOPPED"
kubectl patch dw "$WORKSPACE_STOPPED" -n "$DWO_NS" \
--type merge -p '{"spec":{"started":false}}'
log_success "Workspace stopped"
# =====================================================
# Monitor for backup Jobs
# =====================================================
MONITOR_TIME=600
INTERVAL=5
echo
echo "👀 Monitoring for backup Jobs for ${MONITOR_TIME}s..."
FOUND_STOPPED_JOB=""
FOUND_RUNNING_JOB=""
end=$((SECONDS + MONITOR_TIME))
while [[ $SECONDS -lt $end ]]; do
WORKSPACE_ID_STOPPED=$(kubectl get dw "$WORKSPACE_STOPPED" -n "$DWO_NS" -o jsonpath='{.status.devworkspaceId}')
WORKSPACE_ID_RUNNING=$(kubectl get dw "$WORKSPACE_RUNNING" -n "$DWO_NS" -o jsonpath='{.status.devworkspaceId}')
FOUND_STOPPED_JOB=$(kubectl get jobs -n "$DWO_NS" \
-l "controller.devfile.io/backup-job=true,controller.devfile.io/devworkspace_id=$WORKSPACE_ID_STOPPED" \
-o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
FOUND_RUNNING_JOB=$(kubectl get jobs -n "$DWO_NS" \
-l "controller.devfile.io/backup-job=true,controller.devfile.io/devworkspace_id=$WORKSPACE_ID_RUNNING" \
-o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
if [[ -n "$FOUND_RUNNING_JOB" ]]; then
echo "❌ Backup Job created for RUNNING workspace: $FOUND_RUNNING_JOB"
exit 1
fi
if [[ -n "$FOUND_STOPPED_JOB" ]]; then
log_success "Backup Job detected for STOPPED workspace: $FOUND_STOPPED_JOB"
break
fi
sleep "$INTERVAL"
done
if [[ -z "$FOUND_STOPPED_JOB" ]]; then
echo "❌ Backup Job not created for stopped workspace"
exit 1
fi
# Delete Running DevWorkspace to avoid Multi-Attach Error
kubectl delete dw "$WORKSPACE_RUNNING" -n "$DWO_NS" --ignore-not-found
# -------------------------
# Wait for stopped workspace backup Job completion
# -------------------------
echo "⏳ Waiting for backup Job to complete..."
TIMEOUT=300
ELAPSED=0
CHECK_INTERVAL=5
while [[ $ELAPSED -lt $TIMEOUT ]]; do
# Get pod associated with the job
JOB_POD_NAME=$(kubectl get pods -n "$DWO_NS" \
-l "job-name=$FOUND_STOPPED_JOB" \
-o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
if [[ -n "$JOB_POD_NAME" ]]; then
# Check pod status
POD_PHASE=$(kubectl get pod "$JOB_POD_NAME" -n "$DWO_NS" \
-o jsonpath='{.status.phase}' 2>/dev/null || true)
# Check if any container is in error state
CONTAINER_STATE=$(kubectl get pod "$JOB_POD_NAME" -n "$DWO_NS" \
-o jsonpath='{.status.containerStatuses[0].state}' 2>/dev/null || true)
if [[ "$POD_PHASE" == "Failed" ]] || echo "$CONTAINER_STATE" | grep -q "waiting.*Error\|terminated.*Error"; then
echo "❌ Backup Job pod is in Error state. Printing logs..."
kubectl logs "$JOB_POD_NAME" -n "$DWO_NS" --all-containers=true
exit 1
fi
fi
# Check if job completed successfully
JOB_STATUS=$(kubectl get job "$FOUND_STOPPED_JOB" -n "$DWO_NS" \
-o jsonpath='{.status.conditions[?(@.type=="Complete")].status}' 2>/dev/null || true)
if [[ "$JOB_STATUS" == "True" ]]; then
log_success "Backup Job completed for stopped workspace"
break
fi
sleep "$CHECK_INTERVAL"
ELAPSED=$((ELAPSED + CHECK_INTERVAL))
done
if [[ $ELAPSED -ge $TIMEOUT ]]; then
echo "❌ Backup Job did not complete in time. Printing logs..."
JOB_POD_NAME=$(kubectl get pods -n "$DWO_NS" \
-l "job-name=$FOUND_STOPPED_JOB" \
-o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
if [[ -n "$JOB_POD_NAME" ]]; then
kubectl logs "$JOB_POD_NAME" -n "$DWO_NS" --all-containers=true
fi
exit 1
fi
# -------------------------
# Verify backup artifact using ImageStream
# -------------------------
echo
echo "📦 Verifying backup artifact (ImageStream) for $WORKSPACE_STOPPED..."
if kubectl get imagestream "$WORKSPACE_STOPPED" -n "$DWO_NS" >/dev/null 2>&1; then
log_success "ImageStream exists for stopped workspace"
# Show ImageStream details
echo "📋 ImageStream details:"
kubectl get imagestream "$WORKSPACE_STOPPED" -n "$DWO_NS" -o jsonpath='{.status.dockerImageRepository}'
echo
else
echo "❌ ImageStream missing for stopped workspace"
exit 1
fi
# -------------------------
# Verify NO ImageStream for running workspace
# -------------------------
echo
echo "📦 Verifying NO backup artifact for $WORKSPACE_RUNNING..."
if kubectl get imagestream "$WORKSPACE_RUNNING" -n "$DWO_NS" >/dev/null 2>&1; then
echo "❌ ImageStream exists for running workspace"
exit 1
else
log_success "No ImageStream for running workspace"
fi
echo
echo "🎉 Backup validation successful"
log_success "Backup created ONLY for stopped workspace"
log_success "No backup for running workspace"
# -------------------------
# Cleanup logic
# -------------------------
cleanup() {
echo "🗑️ Deleting DevWorkspaces..."
kubectl delete dw "$WORKSPACE_STOPPED" "$WORKSPACE_RUNNING" -n "$DWO_NS" --ignore-not-found
log_success "Cleanup complete"
}
trap cleanup EXIT
Signed-off-by: Ales Raszka <araszka@redhat.com>
I somehow missed those. It is fixed now.
@rohanKanojia I noticed it's because it seems the role name got accidentally changed from And see if that fixes it? |
@dkwon17 : Thanks a lot for your investigation. I can confirm that with the fix, the OpenShift backup seems to be working. All backup test scenarios are passing.
I think the PR should be good to merge once this change #1572 (comment) is committed.
What does this PR do?
Add init container for workspace restoration
A new init container is added to the workspace deployment in case the user chooses to restore the workspace from a backup.
By setting the workspace attribute "controller.devfile.io/restore-workspace", the controller sets up a new init container instead of cloning data from the git repository.
By default, an automated path to the restore image is derived from cluster settings. However, the user can override that value using another attribute, "controller.devfile.io/restore-source-image".
The restore container runs a workspace-recovery.sh script that pulls an image using oras and extracts its files to the /projects directory.
What issues does this PR fix or reference?
#1525
Is it tested? How?
No automated tests are available in the first phase. I will add tests once I get the first approval that the concept is ok.
How to test:
kubectl delete devworkspace restore-workspace-2 (controller.devfile.io/restore-workspace)

PR Checklist

/test v8-devworkspace-operator-e2e, v8-che-happy-path (to trigger)
- v8-devworkspace-operator-e2e: DevWorkspace e2e test
- v8-che-happy-path: Happy path for verification of integration with Che

What's missing: