Hello, and welcome to our first changelog update of the year! Let’s dive into what’s new:
New AI models in one-click app catalog
We’ve added 7 new AI models to our one-click app catalog, giving you a wider choice when deploying a dedicated endpoint for inference requests. Among the new models are Phi-4, the latest model in Microsoft’s Phi family, and Qwen 2 VL 7B Instruct, the first vision-language model in our catalog.
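If you want to try one of the new models once your endpoint is up, here is a minimal sketch of an inference request to Qwen 2 VL 7B Instruct. It assumes the deployed endpoint exposes an OpenAI-compatible chat completions API; the URL, model identifier, and API token below are placeholders, not actual values from the catalog.

```python
import os
import requests

# Placeholders: replace with your deployed endpoint's URL, model name, and token.
ENDPOINT_URL = "https://your-endpoint.example.com/v1/chat/completions"
API_TOKEN = os.environ.get("ENDPOINT_API_TOKEN", "")

payload = {
    "model": "Qwen2-VL-7B-Instruct",  # hypothetical model identifier
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
            ],
        }
    ],
    "max_tokens": 256,
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```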
Faster out-of-memory detection
Out-of-memory issues are now surfaced earlier in the deployment events, so you can address them quickly by selecting a larger instance and successfully deploying your service. Previously, the error appeared later in the deployment process, delaying troubleshooting.
Fixed autoscaling issue
Services with both autoscaling and scale-to-zero enabled did not scale down unless inbound traffic stopped completely, which led to unnecessary resource consumption during periods of low traffic. This is now fixed: Services scale down appropriately, optimizing resource usage.
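To make the corrected behavior concrete, here is a minimal sketch of a scale-down decision under low traffic. The controller logic, threshold, and function names are hypothetical and only mirror the fix described above; they are not the actual autoscaler implementation.

```python
def desired_replicas(
    current_replicas: int,
    requests_per_second: float,
    scale_to_zero: bool,
    low_traffic_threshold: float = 1.0,  # hypothetical threshold
) -> int:
    """Illustrative scale-down decision.

    Old (buggy) behavior: replicas only dropped when traffic was exactly zero.
    Fixed behavior: low but non-zero traffic also lets the service shrink.
    """
    if requests_per_second == 0 and scale_to_zero:
        return 0  # no inbound traffic at all: scale to zero if enabled
    if requests_per_second < low_traffic_threshold:
        # low traffic: shrink gradually instead of holding the current size
        return max(1, current_replicas - 1)
    return current_replicas
```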