I am trying to get Ollama working with tools. I have them both running in docker-compose. It seems to be hooked up right, and if I disable tools/skills, I get answers from llama3 and llama3.2,
but tools/skills seem to confuse it.
Also, skills are fine with Hugging Face, using Kimi K2. It took a few steps, but this worked:
as a test, write a haiku about testing. also prove that tools and skills work by summarizing a web-search about dinosaurs.
Looking here for better models, I found that `qwen3-vl:latest` has tool support, but it just seems to hang and eventually throws a timeout error in the logs (maybe due to low VRAM?). I also pulled `qwen3-vl:4b` and got the same thing (hang/timeout).
Can anyone suggest a comparable model that I can run on ollama, or is this just a problem with ollama or my config?
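For what it's worth, a minimal way to check whether a given model emits tool calls at all is to hit Ollama's `/api/chat` endpoint directly with a `tools` array, bypassing the front-end entirely. A sketch, where the `web_search` tool is a made-up stand-in for whatever your skills actually expose:

```python
import json
import urllib.request

# Minimal tool-calling smoke test against Ollama's REST API.
# The web_search tool below is a hypothetical stand-in for a real skill.
payload = {
    "model": "qwen2.5:1.5b",
    "messages": [
        {"role": "user", "content": "Summarize a web search about dinosaurs."}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web and return result snippets.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "Search terms"}
                },
                "required": ["query"],
            },
        },
    }],
    "stream": False,
}

def check_tool_support(base_url="http://localhost:11434"):
    """POST the payload and return any tool calls the model produced."""
    req = urllib.request.Request(
        base_url + "/api/chat",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        body = json.loads(resp.read())
    return body.get("message", {}).get("tool_calls", [])

# Only call check_tool_support() when Ollama is actually reachable.
```

If the returned `tool_calls` list is empty (or the model answers in plain text instead), the model itself is the likely problem rather than the wiring.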
I am not working with a great card:
```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.119.02             Driver Version: 580.119.02      CUDA Version: 13.0    |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 Ti     On  |   00000000:01:00.0  On |                  N/A |
|  0%   47C    P5             17W /  160W |    1782MiB /   8188MiB |      21%     Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
```
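Regarding the "maybe due to low VRAM?" guess: some back-of-the-envelope arithmetic (all figures here are my own rough assumptions, not measurements) suggests a 4b vision model can come uncomfortably close to 8 GiB once quantized weights, runtime overhead, and the claimed 2-3x vision multiplier are stacked up:

```python
# Back-of-the-envelope VRAM estimate. All numbers are rough assumptions
# for illustration, not measured figures:
#   ~0.6 bytes/param for 4-bit quantized weights (incl. scales),
#   a flat overhead for KV cache + CUDA context,
#   and a 2-3x multiplier for vision models (the claim quoted below).
def est_vram_gib(params_b, bytes_per_param=0.6, vision_factor=1.0,
                 overhead_gib=1.5):
    return params_b * 1e9 * bytes_per_param / 2**30 * vision_factor + overhead_gib

for name, params, vf in [("qwen2.5:1.5b", 1.5, 1.0),
                         ("qwen2.5:3b", 3.0, 1.0),
                         ("qwen3-vl:4b", 4.0, 2.5),
                         ("mistral:7b", 7.0, 1.0)]:
    print(f"{name:13s} ~{est_vram_gib(params, vision_factor=vf):.1f} GiB "
          f"vs 8.0 GiB card")
```

With ~1.7 GiB of the card already in use by the desktop per the nvidia-smi output above, an estimate in that range would leave very little headroom, which at least is consistent with a hang/timeout.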
Asking a similar question of Kimi K2 gave me this answer, but I wonder whether it is really up to date.
Summary
Your issues stem from two main problems:
- Llama3 models have poor tool implementation - they generate tool calls but don't understand responses
- Qwen3-VL is a vision model - these need 2-3x more VRAM than text-only models
Immediate Solutions:
Best models to try (in order):
- `qwen2.5:1.5b` - Excellent tool support, minimal VRAM
- `qwen2.5:3b` - Better performance, still VRAM-friendly
- `qwen2:1.5b` - Proven stable tool support
- `mistral:7b` - Good tools, needs more VRAM

Configuration fixes:
Test this immediately:
Qwen2.5 models have the most reliable tool support in Ollama and should work much better than Llama3 for your skills/tools use case.
I tried it, and it did seem to work:
Sure, here's a Haiku on the theme of Testing:
Test in code so clean,
No flaws emerge by design—
A clear outcome
Now showing Skills work:
create-plan: A concise plan is made.

```json
{"name": "create-plan"}
```

read_file: The path to a text file that includes a haiku test:

```json
{"path": "/tmp/haiku_test.txt"}
```

Please note, the above JSON objects are just examples. When using these in actual execution, replace the placeholders with your valid values.
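When a model does cooperate, those calls come back under `message.tool_calls` in the `/api/chat` response, and the caller is expected to run them and feed the results back. A minimal dispatch sketch, using a hand-written sample response and hypothetical handlers for the two skills above:

```python
# Dispatching tool calls from an Ollama /api/chat response. The
# response dict is a hand-written sample shaped like Ollama's output
# (arguments come back as a JSON object, not a string); both handlers
# are hypothetical stand-ins for real skills.
sample_response = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "create-plan", "arguments": {}}},
            {"function": {"name": "read_file",
                          "arguments": {"path": "/tmp/haiku_test.txt"}}},
        ],
    }
}

HANDLERS = {
    "create-plan": lambda **kw: "plan created",
    "read_file": lambda path: f"read {path}",
}

def dispatch(response):
    # Run each requested tool; in a real loop you would append each
    # result as a {"role": "tool", ...} message and call the model again.
    results = []
    for call in response["message"].get("tool_calls", []):
        fn = call["function"]
        results.append(HANDLERS[fn["name"]](**fn["arguments"]))
    return results

print(dispatch(sample_response))  # → ['plan created', 'read /tmp/haiku_test.txt']
```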