Since Workers can’t scale to 0, we should only have to pay for the compute when it’s actively utilized.
Or make scaling to 0 an option for Workers, which effectively does the same thing.
Hey @Nate_Robinson,
Scale to zero is based on incoming HTTP requests: when no new requests arrive for a given period, we scale a service down, and when a new request comes in, we scale it back up.
Workers don’t accept HTTP requests, so we don’t know when to scale them down. Even if we used CPU usage to scale down, which isn’t very reliable, what would we use to scale back up?
Some users have implemented automation to pause/unpause a service. What is your use case?
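The pause/unpause automation typically boils down to polling your queue depth and flipping the service state through the platform API. A minimal sketch of the decision logic in Python — the function name, the `idle_limit` parameter, and the idea of counting empty polls are all illustrative, not a specific platform API:

```python
def desired_state(queue_depth, currently_running, idle_cycles, idle_limit=5):
    """Decide whether the worker service should be running.

    Returns (should_run, new_idle_cycles). The service is unpaused as soon
    as work appears, and paused only after idle_limit consecutive empty polls,
    so a briefly empty queue doesn't flap the service on and off.
    """
    if queue_depth > 0:
        # Work is waiting: run the worker and reset the idle counter.
        return True, 0
    idle_cycles += 1
    if currently_running and idle_cycles >= idle_limit:
        # Queue has been empty long enough: pause the service.
        return False, idle_cycles
    return currently_running, idle_cycles
```

A small scheduled job (cron or similar) would call this each poll and invoke your platform’s pause/unpause endpoint whenever the desired state changes.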
I have a web service that acts as an API and handles all incoming traffic. Some requests require heavy ML libraries and also need to download files, so extra memory is required. Those jobs are passed to the Worker via an external queue, and they are infrequent. Scaling via memory usage is what I’m doing now.
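Roughly what the Worker side of that looks like — the payload shape and the deferred heavy import are illustrative, not my exact code:

```python
import json

def handle_job(payload):
    """Process one queued job in the Worker.

    The heavy ML libraries are imported lazily here, so only the Worker
    process pays the memory cost, never the API web service.
    """
    job = json.loads(payload)
    # Heavy import deferred until a job actually arrives, e.g.:
    # import torch
    files = job.get("files", [])
    # ...download each file, run the model, store results...
    return {"job_id": job["id"], "n_files": len(files)}
```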
If you have an example of how to pause/unpause the service, that would be great.