vLLM with custom model - how?

Hi

This might be a very noob question, so bear with me :slight_smile:

I’m trying to deploy a vLLM instance in order to run inference with a Hugging Face (HF) model.

I’ve checked the official vLLM one-click app, but there doesn’t seem to be a way to specify which model to load. Is there a specific environment variable I can use for this?

When I tried to use custom AI images from Docker Hub, I got this error:

Hello,

Try using the vllm/vllm-openai Docker image, and set `vllm serve --model <your-model>` in the entrypoint configuration.

For example:
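Assuming you’re running the image directly with Docker (the model name `mistralai/Mistral-7B-Instruct-v0.2`, the token placeholder, and the port below are just illustrative; on a hosted platform the trailing arguments would go into the service’s command/args or entrypoint field instead), a minimal sketch looks like this:

```bash
# Run the OpenAI-compatible vLLM server and point it at a Hugging Face model.
# --gpus all requires the NVIDIA container toolkit on the host; mounting the
# HF cache avoids re-downloading the weights on every restart.
docker run --gpus all --ipc=host -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e HUGGING_FACE_HUB_TOKEN=<your-hf-token> \
  vllm/vllm-openai:latest \
  --model mistralai/Mistral-7B-Instruct-v0.2
```

Once the container is up, the server exposes the usual OpenAI-style endpoints, so you can sanity-check it with a plain HTTP call:

```bash
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mistralai/Mistral-7B-Instruct-v0.2", "prompt": "Hello,", "max_tokens": 32}'
```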
