Andy
December 7, 2024, 6:49pm
1
I’m using a Dockerfile to deploy my NestJS app. At the very last step, I get this error:
Failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
The build works locally, so I’m not sure what is causing the problem.
Dockerfile
# Use an official Node.js runtime as a parent image
FROM node:18
# Set the working directory in the container
WORKDIR /usr/src/app
# Copy package.json and package-lock.json
COPY package*.json ./
# Install dependencies
RUN npm install
# Copy the rest of the application code
COPY . .
# Build the application
RUN npm run build
# Show build files
RUN cd dist && ls -la
# Command to run the application
CMD ["npm", "run", "start:prod"]
David
December 9, 2024, 8:52am
2
Hi,
Your services are up and running. Did you fix the issue? If so, could you please share how you solved it, for the benefit of the community?
Andy
December 10, 2024, 6:08am
3
Hey David, the services are up and running, but the error is still there, so I guess it doesn’t affect the services?
It seems related to the cache service. These are the logs at the end of the build:
#14 DONE 14.0s
#16 exporting cache to registry
#16 preparing build cache for export
error: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
sh: can't kill pid 9: No such process
Build failed ❌
Andy
December 10, 2024, 6:27am
4
This issue has been reported before:
opened 04:04PM - 29 Sep 23 UTC
closed 11:15AM - 23 Oct 23 UTC
### Contributing guidelines
- [X] I've read the [contributing guidelines](https://github.com/docker/buildx/blob/master/.github/CONTRIBUTING.md) and wholeheartedly agree
### I've found a bug and checked that ...
- [X] ... the documentation does not mention anything about my problem
- [x] ... there are no open or closed issues that are related to my problem
### Description
During docker (compose) builds, we occasionally see this error in our CI:
`failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF`
This can happen at various stages in docker builds, including:
- importing cache manifest from ...
- load build context
- RUN pip install --upgrade pip
We used our instance monitoring to investigate if there was any correlation with resource uses. We looked into network, memory, and cpu utilization and none of these spiked in correlation to these errors.
This error can kill multiple builds happening in parallel on our CI nodes, but it also happens to single builds as well.
### Expected behaviour
docker compose build progress
### Actual behaviour
docker compose builds fail
### Buildx version
github.com/docker/buildx v0.11.2 9872040
### Docker info
```text
+ docker system info
Client: Docker Engine - Community
Version: 24.0.6
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.11.2
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.21.0
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 16
Running: 16
Paused: 0
Stopped: 0
Images: 16
Server Version: 24.0.6
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 8165feabfdfe38c65b599c4993d227328c231fca
runc version: v1.1.8-0-g82f18fe
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
Kernel Version: 5.15.0-1044-aws
Operating System: Ubuntu 20.04.6 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 30.67GiB
Name: ip-10-10-15-71
ID: 8d7a5a77-4225-4887-a2c3-419a6c5ab76e
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: cmtlouis
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Default Address Pools:
Base: 172.17.0.0/12, Size: 20
Base: 192.168.0.0/16, Size: 24
```
### Builders list
```text
+ docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
default * docker
default default running v0.11.6+616c3f613b54 linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/386
```
### Configuration
We are not able to consistently reproduce our issues, though we are building multiple images with multiple stages using docker-compose, which may be relevant.
We also run multiple jobs on the same instances in our CI, so multiple docker compose builds are happening in parallel at times. Furthermore, it seems this error can hit multiple docker compose builds at the same time while they are running in parallel on the same node.
### Build logs
_No response_
### Additional info
seems like it could be a similar error to:
https://github.com/microsoft/vscode-remote-release/issues/7958
I'm wondering if it is some other race condition that only happens occasionally.
It does not seem correlated to resource usage.
opened 11:03AM - 27 Apr 22 UTC
closed 02:44PM - 12 Jul 23 UTC
buildx-version:0.7.0
buildkit-version:0.9.0
driver: --driver kubernetes
![image](https://user-images.githubusercontent.com/28646240/165504234-2d5f7373-1f6c-455b-9913-8f9ef0579f00.png)
Error: failed to receive status: rpc error: code = Unavailable desc = closing transport due to: connection error: desc = "error reading from server: io: read/write on closed pipe"
I’m not sure what caused the problem. Could you take a look?
Andy
December 14, 2024, 4:01am
5
Bumping this issue. It’s still happening on my services.
David
December 15, 2024, 7:32pm
6
Hi,
Yes, so the cache export step fails.
This does not prevent the deployment from moving forward. However, it prevents you from using the cache when you redeploy.
We need to investigate how we can solve this.
I’ll get back to you on this.
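In the meantime, one way to check whether a failed build matches this pattern (the error arriving during the cache export step rather than during an image build step) is to look at which step logged last before the error. The following is only a rough sketch in Python and is not part of the platform or of BuildKit. The function names `last_step_before_error` and `failed_during_cache_export` are hypothetical, and it assumes log lines of the form `#<n> <text>` followed by an `error:` line, as in the logs posted above.

```python
import re

# Assumed line shape, taken from the logs in this thread:
# "#16 exporting cache to registry"
STEP_LINE = re.compile(r"^#(\d+)\s+(.*)$")


def last_step_before_error(log_text):
    """Return (step_id, first text logged for that step) for the last step
    seen before the first 'error:' line, or None if there is no such step."""
    first_text = {}  # step id -> first line logged for that step
    last_id = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.lower().startswith("error:"):
            break  # stop at the first error line
        match = STEP_LINE.match(line)
        if match:
            step_id, text = int(match.group(1)), match.group(2)
            first_text.setdefault(step_id, text)
            last_id = step_id
    else:
        return None  # the loop finished without finding an error line
    if last_id is None:
        return None
    return last_id, first_text[last_id]


def failed_during_cache_export(log_text):
    """True if the last step logged before the error is a cache export step."""
    step = last_step_before_error(log_text)
    return step is not None and "exporting cache" in step[1]


# Tail of the build log posted earlier in this thread.
SAMPLE = """\
#14 DONE 14.0s
#16 exporting cache to registry
#16 preparing build cache for export
error: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
"""

if __name__ == "__main__":
    print(last_step_before_error(SAMPLE))
    print(failed_during_cache_export(SAMPLE))
```

If `failed_during_cache_export` returns `True` for a build log, the image build steps completed and only the cache export was interrupted, which matches the behaviour described above. This is a heuristic on log text only; it does not explain why the connection was closed.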