vLLM with custom model - how?

Hi

This might be a very noob question, so bear with me :slight_smile:

I’m trying to deploy a vLLM instance in order to run inference with a Hugging Face (HF) model.

I’ve checked the official vLLM one-click app, but there doesn’t seem to be a way to specify which model to load. Is there a specific environment variable I can use for this?

When I tried to use custom AI images from Docker Hub, I got this error:

Hello,

Try using the vllm/vllm-openai Docker image, and set `vllm serve --model <your-model>` in the entrypoint configuration.

For example:
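Assuming you’re running the image directly with Docker (the model name `mistralai/Mistral-7B-Instruct-v0.2`, the token placeholder, and the port below are just illustrative; on a hosted platform the trailing arguments would go into the service’s command/args or entrypoint field instead), a minimal sketch looks like this:

```bash
# Run the OpenAI-compatible vLLM server and point it at a Hugging Face model.
# --gpus all requires the NVIDIA container toolkit on the host; mounting the
# HF cache avoids re-downloading the weights on every restart.
docker run --gpus all --ipc=host -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e HUGGING_FACE_HUB_TOKEN=<your-hf-token> \
  vllm/vllm-openai:latest \
  --model mistralai/Mistral-7B-Instruct-v0.2
```

Once the container is up, the server exposes the usual OpenAI-style endpoints, so you can sanity-check it with a plain HTTP call:

```bash
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mistralai/Mistral-7B-Instruct-v0.2", "prompt": "Hello,", "max_tokens": 32}'
```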
