Adaptive Length Image Tokenization via Recurrent Allocation | How many tokens is an image worth?
Updated Feb 11, 2025 - Python
A minimal, hackable Vision-Language Model built on Karpathy’s nanochat — add image understanding and multimodal chat for under $200 in compute.