Hello, and welcome to our first changelog update of the year! Let’s dive into what’s new:
New AI models in one-click app catalog
We’ve added 7 new AI models to our one-click app catalog, giving you a wider choice when deploying a dedicated endpoint for inference requests. Among the new models are Phi-4, the latest model in Microsoft’s Phi family, and Qwen 2 VL 7B Instruct, the first vision-language model in our catalog.
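If you want to try one of the new models once your endpoint is up, here is a minimal sketch of an inference request to Qwen 2 VL 7B Instruct. It assumes the deployed endpoint exposes an OpenAI-compatible chat completions API; the URL, model identifier, and API token below are placeholders, not actual values from the catalog.

```python
import os
import requests

# Placeholders: replace with your deployed endpoint's URL, model name, and token.
ENDPOINT_URL = "https://your-endpoint.example.com/v1/chat/completions"
API_TOKEN = os.environ.get("ENDPOINT_API_TOKEN", "")

payload = {
    "model": "Qwen2-VL-7B-Instruct",  # hypothetical model identifier
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
            ],
        }
    ],
    "max_tokens": 256,
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```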
Faster out-of-memory detection
Out-of-memory issues are now surfaced earlier in the deployment events, so you can address them quickly by selecting a larger instance and successfully deploying your service. Previously, the error appeared later in the deployment process, delaying troubleshooting.
Fixed autoscaling issue
Services with both autoscaling and scale-to-zero enabled did not scale down unless inbound traffic stopped completely, which led to unnecessary resource consumption during periods of low traffic. This is now fixed: Services scale down appropriately, optimizing resource usage.
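To make the corrected behavior concrete, here is a minimal sketch of a scale-down decision under low traffic. The controller logic, threshold, and function names are hypothetical and only mirror the fix described above; they are not the actual autoscaler implementation.

```python
def desired_replicas(
    current_replicas: int,
    requests_per_second: float,
    scale_to_zero: bool,
    low_traffic_threshold: float = 1.0,  # hypothetical threshold
) -> int:
    """Illustrative scale-down decision.

    Old (buggy) behavior: replicas only dropped when traffic was exactly zero.
    Fixed behavior: low but non-zero traffic also lets the service shrink.
    """
    if requests_per_second == 0 and scale_to_zero:
        return 0  # no inbound traffic at all: scale to zero if enabled
    if requests_per_second < low_traffic_threshold:
        # low traffic: shrink gradually instead of holding the current size
        return max(1, current_replicas - 1)
    return current_replicas
```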