Best model for programming?

absGeekNZ@lemmy.nz · 3 days ago

Best model for programming?

kata1yst@sh.itjust.works · 3 days ago

I’d recommend Qwen 2.5 Coder. Just try to ask very direct questions with smaller context.

QWQ is a bit stronger at more advanced coding tasks but I had a ton of trouble finding a version that would fit in my 24G 7900xtx.

SmokeyDope@lemmy.world · 3 days ago

in your range 32b models work well give qwen coder a try

Boomkop3@reddthat.com · 3 days ago

The one on top of your neck

Possibly linux@lemmy.zip · edit-2 3 days ago

Deepseek r1 14b

Gemma (assuming you are ok with the license)

mutual_ayed@sh.itjust.works · 3 days ago

Make sure you pull the 14b model or you quantize the 32b model. 16gb is plenty for one user using it locally.

wise_pancake@lemmy.ca · 2 days ago

What’s different about the Gemma license?

Possibly linux@lemmy.zip · 2 days ago

Not foss

Although foss is debatable in a foss context.

ghost@feddit.org · 3 days ago

As others have already mentioned, try qwen2.5-cider. With 16 GB, you should be able to confortably fit a quantised version of the 14b variant into VRAM. You can also try the 32b variant, but it will be much slower because not all layers can be off-loaded to the GPU.

absGeekNZ@lemmy.nz · 3 days ago

I’m running ollama 0.6.3 (pre-release) and rocm v6.10.5 on linux 6.11.0-21

Still getting

level=INFO source=gpu.go:377 msg=“no compatible GPUs were discovered”

Fisch@discuss.tchncs.de · 3 days ago

I have an RX 6700 XT and I needed to change an environment variable to make it work. Maybe something similar is needed for you GPU. I’d try googling something like “RX 9700 XT ROCM” or “RX 9700 XT ROCM no compatible GPUs were discovered” if you haven’t done that already.

SmokeyDope@lemmy.world · edit-2 3 days ago

When I had my AMD GPU going the best way to get models running was kobold.cpp and using vulcan. The flag is like --usevulcan or something. Its way easier than getting a rocm fork working from source.

badcodecat@lemux.minnix.dev · 3 days ago

are you looking for autocomplete or chat?

JamonBear@sh.itjust.works · 3 days ago

Is there a different recommendation for autocomplete?

badcodecat@lemux.minnix.dev · 3 days ago

in general, you would want something fast (probably something that fits in your GPU/VRAM) so you can get suggestions as fast as you can type. for chat, you’ll probably want the most intelligent/lorgest model you can run, it’s likely fine if it’s running on the CPU/RAM since the quality of an individual answer is more important than the speed in which many small answers can be generated. so, probably qwen for both, but, different sizes/quant for different use cases.

absGeekNZ@lemmy.nz · 3 days ago

chat to start with

icecreamtaco@lemmy.world · 3 days ago

ChatGPT works great too if you don’t want to use ram

raldone01@lemmy.world · 3 days ago

I am running local models only for privacy sensitive stuff. If you have ollama you can also setup openwebui and access both local and remote models through the same very nice interface! Also chatgpt API is much cheaper than subscribing.