Classification backbone with ViT results in argument 'input' (position 1) must be Tensor, not tuple #1614

Hi,

I am trying to use ViT as follows:

net = monai.networks.nets.ViT(spatial_dims=2, in_channels=1, img_size=(400, 400), proj_type='conv', patch_size=(64, 64),
                                  num_classes=6, classification=True, post_activation='0').to(device)

but I am running into the same issue as reported here: https://github.com/Project-MONAI/tutorials/issues/464

return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
TypeError: cross_entropy_loss(): argument 'input' (position 1) must be Tensor, not tuple

That discussion concluded that the API would be enhanced with respect to hidden_states_out, but I do not see this implemented, apparently by design.
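
For context, the tuple in the traceback appears to come from the ViT forward pass returning two values: the classification output and the list of per-block hidden states (hidden_states_out). Assuming that return shape, a minimal sketch of a workaround is to pass only the first element to the loss. The helper name logits_only and the stand-in fake_vit_forward below are hypothetical; plain lists replace tensors so the sketch runs without torch or MONAI.

```python
def logits_only(output):
    """Return the first element if the model returned a tuple, else the output itself."""
    if isinstance(output, tuple):
        return output[0]
    return output


# Hypothetical stand-in for a ViT-style forward pass that returns
# (logits, hidden_states). It is NOT the MONAI implementation.
def fake_vit_forward(batch):
    logits = [[0.0] * 6 for _ in batch]      # one row of 6 class scores per sample
    hidden_states = ["block_0", "block_1"]   # placeholder per-block outputs
    return logits, hidden_states


# Only the logits reach the loss; the hidden states are dropped.
logits = logits_only(fake_vit_forward(["img_a", "img_b"]))
print(len(logits), len(logits[0]))
```

With the real model this would look something like `loss = loss_function(logits_only(net(images)), labels)`, or simply `logits, _ = net(images)` before computing the loss. Here `loss_function`, `images`, and `labels` are placeholder names for whatever the training loop already uses.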

MONAI version: 1.3.0
PyTorch version: 2.1.1+cu121

Thanks for any advice.
