Why Your Docker Container Keeps Restarting (and How to Fix It)
A Docker container that “keeps restarting” is almost never random. Docker is doing exactly what you (or an image author, or an orchestrator) told it to do: run a process, and if that process exits, apply a restart policy (or a higher-level controller) to bring it back.
This tutorial teaches you how to identify the true reason a container exits, how to read Docker’s signals, and how to fix the root cause—not just silence symptoms.
Table of Contents
- 1. Understand What “Restarting” Really Means
- 2. Quick Triage Checklist (Fastest Path to Root Cause)
- 3. Inspect Restart Policies and Container State
- 4. Read Logs Correctly (and Persist Them)
- 5. Interpret Exit Codes (and What They Usually Mean)
- 6. Common Root Causes and Fixes
- 6.1 The Main Process Exits Immediately
- 6.2 Your Entrypoint/Command Is Wrong
- 6.3 App Crashes on Startup (Missing Config/Env/Secrets)
- 6.4 Port Binding Conflicts
- 6.5 Healthcheck Fails (and Something Restarts It)
- 6.6 OOMKilled (Out of Memory)
- 6.7 Permission Problems (Volumes, Users, Filesystems)
- 6.8 Crash Loops Caused by Dependencies (DB Not Ready, DNS, Network)
- 6.9 “Exec format error” (Wrong Architecture)
- 6.10 Container Runs, But Exits When Shell Ends (Bad Debug Pattern)
- 7. Debugging Techniques That Actually Work
- 8. Fixing Restart Loops in Docker Compose
- 9. Fixing Restart Loops in Kubernetes (If Docker Isn’t the Real Culprit)
- 10. Prevention: Build Containers That Don’t Crash Loop
1. Understand What “Restarting” Really Means
A container is (conceptually) a wrapper around a process. When that process exits, the container stops. If Docker is configured to restart it, you’ll see it bounce between states like:
`Up ... (health: starting)`, then `Restarting (1) ...`, `Restarting (137) ...`, or `Exited (0) ...` followed immediately by `Up ...` again.
Docker’s restart behavior is controlled by:
- Restart policy on the container (`--restart` or Compose `restart:`)
- Orchestrator logic (Docker Swarm, Kubernetes, Nomad, systemd, etc.)
- External watchdog scripts (cron jobs, supervisors)
So “keeps restarting” usually means:
- The app exits (crash or normal exit)
- Something restarts it (policy/controller)
- The underlying problem is still present
- Repeat (crash loop)
Your job is to answer two questions:
- Why is the main process exiting?
- Who/what is restarting it?
2. Quick Triage Checklist (Fastest Path to Root Cause)
Run these in order; they provide the highest signal quickly.
2.1 See container status and restart count
docker ps -a --no-trunc
Look for:
- `STATUS` like `Restarting (1) 10 seconds ago`
- A high restart count (`docker inspect --format '{{.RestartCount}}' <container>`)
2.2 Inspect the container state and exit code
docker inspect <container_name_or_id> --format \
'Status={{.State.Status}} ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} StartedAt={{.State.StartedAt}} FinishedAt={{.State.FinishedAt}}'
If `OOMKilled=true`, jump to section 6.6 (OOMKilled).
2.3 Read logs from the last run
docker logs --tail=200 <container_name_or_id>
If it restarts quickly, include timestamps:
docker logs -t --tail=200 <container_name_or_id>
2.4 Check restart policy
docker inspect <container> --format 'RestartPolicy={{json .HostConfig.RestartPolicy}}'
If it’s `always` or `unless-stopped`, Docker will keep trying.
3. Inspect Restart Policies and Container State
3.1 Restart policies: what they do
Docker supports these policies:
- `no` (default): do not restart automatically
- `on-failure[:max-retries]`: restart only if the exit code is non-zero
- `always`: restart regardless of exit code
- `unless-stopped`: like `always`, but won’t restart if you manually stopped it
Show it:
docker inspect <container> --format '{{.HostConfig.RestartPolicy.Name}} {{.HostConfig.RestartPolicy.MaximumRetryCount}}'
3.2 Temporarily stop the restart loop (to debug)
If the container restarts too fast to inspect, you can disable restart:
docker update --restart=no <container>
docker stop <container>
Then start it manually:
docker start <container>
Or run a new container without restart policy to test:
docker run --rm --name test-no-restart <image:tag>
3.3 Get a full picture with docker inspect
docker inspect <container> > inspect.json
Key fields to search inside inspect.json:
- `.State.ExitCode`
- `.State.Error`
- `.State.OOMKilled`
- `.Config.Entrypoint`
- `.Config.Cmd`
- `.HostConfig.RestartPolicy`
- `.Mounts`
- `.HostConfig.Binds`
- `.NetworkSettings.Ports`
4. Read Logs Correctly (and Persist Them)
4.1 Basic logs
docker logs <container>
4.2 Follow logs while it crash-loops
docker logs -f <container>
4.3 Get logs from a previous instance
Docker keeps a single log stream per container across restarts (logs are not split per restart), so use timestamps to find the last crash:
docker logs -t --since=10m <container>
4.4 If logs are empty
Empty logs can mean:
- The process exits before writing to stdout/stderr
- Logging is going to a file inside the container
- The entrypoint fails before the app starts
Try:
docker inspect <container> --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'
And run the image interactively:
docker run --rm -it --entrypoint sh <image:tag>
Then try to start the app manually from inside.
4.5 Logging drivers matter
Check the logging driver:
docker inspect <container> --format 'LogConfig={{json .HostConfig.LogConfig}}'
Only the `json-file`, `local`, and `journald` drivers support reading logs back; with syslog, fluentd, and similar drivers, `docker logs` shows nothing at all.
5. Interpret Exit Codes (and What They Usually Mean)
Exit codes are one of the best clues.
Get the exit code:
docker inspect <container> --format 'ExitCode={{.State.ExitCode}}'
Common patterns:
- `0`: process exited successfully (but maybe you expected it to keep running)
- `1`: generic error (app-level failure)
- `2`: misuse of shell builtins / CLI usage error
- `126`: command found but not executable (permissions)
- `127`: command not found
- `137`: killed (often OOM kill or `SIGKILL`)
- `139`: segmentation fault
- `143`: terminated (`SIGTERM`), often from `docker stop` or an orchestrator
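Codes above 128 follow the shell convention of 128 + signal number (`SIGKILL` = 9 gives 137, `SIGSEGV` = 11 gives 139, `SIGTERM` = 15 gives 143). As an illustrative aid, a tiny helper can turn an exit code into a first hypothesis:

```shell
#!/bin/sh
# Map a container exit code to a likely cause.
# Codes > 128 are 128 + the terminating signal's number.
explain_exit() {
  case "$1" in
    0)   echo "clean exit (did you expect a long-running process?)" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found" ;;
    137) echo "SIGKILL (128+9): often an OOM kill" ;;
    139) echo "SIGSEGV (128+11): segmentation fault" ;;
    143) echo "SIGTERM (128+15): graceful stop requested" ;;
    *)   echo "application-specific failure" ;;
  esac
}

explain_exit 137
```

Feed it the value from `docker inspect --format '{{.State.ExitCode}}' <container>` as a starting point, then confirm with logs.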
Also check if Docker says it was OOM killed:
docker inspect <container> --format 'OOMKilled={{.State.OOMKilled}}'
6. Common Root Causes and Fixes
6.1 The Main Process Exits Immediately
Symptom: container starts and exits quickly; exit code might be 0.
This often happens when the container runs a command that completes immediately, like `echo`, or a server that daemonizes into the background, leaving the foreground process to exit.
Example:
docker run --name mycontainer alpine echo "hello"
That will exit immediately, because echo finishes.
Fix: run a long-lived foreground process. For example:
docker run --name mycontainer alpine sh -c 'while true; do sleep 3600; done'
For real services (nginx, node, python), ensure they run in the foreground:
- Nginx: `nginx -g 'daemon off;'`
- Apache httpd: `httpd -DFOREGROUND`
- Many base images already do this correctly; custom scripts often break it.
Key idea: In Docker, you generally want one main process that stays in the foreground.
6.2 Your Entrypoint/Command Is Wrong
Symptom: exit code 127 (command not found) or 126 (not executable).
Check what Docker is trying to run:
docker inspect <container> --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'
If your Dockerfile uses:
ENTRYPOINT ["./start.sh"]
But start.sh isn’t copied, or isn’t executable, you’ll crash-loop.
Fixes:
- Ensure the file exists in the image:
docker run --rm -it --entrypoint sh <image:tag> -c 'ls -la /path && file /path/start.sh'
- Ensure it’s executable:
COPY start.sh /usr/local/bin/start.sh
RUN chmod +x /usr/local/bin/start.sh
ENTRYPOINT ["/usr/local/bin/start.sh"]
- Watch out for Windows line endings (CRLF) causing `/bin/sh^M: bad interpreter`:
Inside the container:
cat -A /usr/local/bin/start.sh | head
If you see ^M, convert to LF on the host:
sed -i 's/\r$//' start.sh
Then rebuild.
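The CRLF problem is easy to reproduce and verify locally before rebuilding. This sketch writes a throwaway script with Windows line endings into a temp file, detects them, and strips them (the file path is disposable, not from the article):

```shell
#!/bin/sh
# Reproduce CRLF endings in a throwaway script, then strip them.
tmp=$(mktemp)
printf '#!/bin/sh\r\necho hello\r\n' > "$tmp"

# Count lines still ending in a carriage return (should print 2 here)
grep -c "$(printf '\r')\$" "$tmp"

# Strip trailing CR from every line, as in the fix above
sed -i 's/\r$//' "$tmp"

grep -q "$(printf '\r')\$" "$tmp" || echo "line endings are clean"
rm -f "$tmp"
```

Running the same `grep` check inside the container (against the real entrypoint script) tells you whether CRLF is the culprit before you touch the Dockerfile.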
6.3 App Crashes on Startup (Missing Config/Env/Secrets)
Symptom: logs show configuration errors, missing environment variables, missing files.
Examples:
- “DATABASE_URL not set”
- “Cannot find /app/config.json”
- “Permission denied reading /run/secrets/…”
Debug:
docker logs --tail=200 <container>
docker inspect <container> --format 'Env={{json .Config.Env}}'
Fix: pass environment variables correctly
Run:
docker run -e DATABASE_URL='postgres://user:pass@db:5432/app' <image:tag>
Or with an env file:
docker run --env-file .env <image:tag>
Fix: mount required config files
docker run -v "$PWD/config.json:/app/config.json:ro" <image:tag>
Tip: If the app expects a file at a specific path, confirm the mount path matches exactly.
6.4 Port Binding Conflicts
Port conflicts typically prevent a container from starting at all (you’ll see an error like “bind: address already in use”), but in some setups you might see repeated attempts.
Check what’s using the port on the host:
sudo lsof -iTCP -sTCP:LISTEN -P | grep ':8080'
Or:
sudo ss -ltnp | grep ':8080'
Fix: change the host port mapping:
docker run -p 8081:8080 <image:tag>
Or stop the conflicting service.
6.5 Healthcheck Fails (and Something Restarts It)
Docker’s built-in HEALTHCHECK does not restart containers by itself. It only marks them as unhealthy. However:
- Docker Compose `depends_on` with `condition: service_healthy` can cause cascading behavior
- Orchestrators (Kubernetes) may restart pods if probes fail
- External tooling may watch health and restart
Check health status:
docker inspect <container> --format 'Health={{json .State.Health}}'
See the healthcheck command:
docker inspect <container> --format 'Healthcheck={{json .Config.Healthcheck}}'
Fix: Make healthchecks realistic:
- Ensure the service is actually listening before the check runs
- Increase `start_period`, `interval`, and `retries` (in the Dockerfile or Compose)
- Validate the endpoint used by the healthcheck
To manually test the healthcheck command, run it inside the container:
docker exec -it <container> sh
# then run the healthcheck command (e.g., curl -f http://localhost:8080/health)
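For reference, a Dockerfile healthcheck with generous startup timing might look like this (the endpoint, port, and timings are illustrative, and it assumes `curl` exists in the image):

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
```

A generous `--start-period` prevents slow-booting services from being marked unhealthy before they ever had a chance to listen.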
6.6 OOMKilled (Out of Memory)
Symptom: exit code often 137, and .State.OOMKilled=true.
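The 137 is not arbitrary: it is the 128 + signal convention, and `SIGKILL` is signal 9. You can reproduce the code locally without Docker:

```shell
#!/bin/sh
# A process killed by SIGKILL reports exit status 137 (= 128 + 9).
# The subshell kills itself; the parent captures the status.
sh -c 'kill -KILL $$' || status=$?
echo "exit status: $status"   # prints "exit status: 137"
```

So a 137 by itself only proves a `SIGKILL`; check `OOMKilled` to confirm the kernel (rather than `docker kill` or an orchestrator) was the sender.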
Check:
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}}'
Check memory limits:
docker inspect <container> --format 'Memory={{.HostConfig.Memory}} MemorySwap={{.HostConfig.MemorySwap}}'
If Memory is non-zero, the container has a limit. If the app exceeds it, the kernel may kill it.
See live memory usage:
docker stats <container>
Fix options:
- Increase memory limit:
docker run --memory=1g --memory-swap=1g <image:tag>
- Reduce app memory usage:
- Lower the JVM heap (`-Xmx`)
- Fix memory leaks
- Reduce concurrency / worker counts
- If you’re on Docker Desktop, increase the VM memory in Docker Desktop settings.
Important: OOM kills can happen even without explicit container limits if the host is under memory pressure.
6.7 Permission Problems (Volumes, Users, Filesystems)
Symptom: logs show EACCES, “permission denied”, inability to write to a directory, or failure to create PID files.
Common scenario: container runs as non-root, but a mounted volume is owned by root on the host.
Inspect the container user:
docker inspect <container> --format 'User={{.Config.User}}'
Check permissions inside:
docker exec -it <container> sh -c 'id && ls -ld /data && touch /data/testfile'
If it can’t write, fix by one of:
- Adjust host directory ownership to match container UID/GID:
sudo chown -R 1000:1000 ./data
- Run container with a specific user:
docker run --user 1000:1000 -v "$PWD/data:/data" <image:tag>
- Use Docker-managed volumes (often easier than bind mounts):
docker volume create appdata
docker run -v appdata:/data <image:tag>
6.8 Crash Loops Caused by Dependencies (DB Not Ready, DNS, Network)
Symptom: app exits because it can’t reach database, cache, or an API. Logs show connection refused, timeout, DNS failure.
Examples:
- `ECONNREFUSED db:5432`
- `getaddrinfo ENOTFOUND redis`
- `dial tcp: lookup ...: no such host`
Debug network basics:
- Confirm the container is on the expected network:
docker network ls
docker inspect <container> --format '{{json .NetworkSettings.Networks}}'
- Test DNS and connectivity from inside:
docker exec -it <container> sh -c 'cat /etc/resolv.conf && nslookup db || true'
docker exec -it <container> sh -c 'nc -vz db 5432 || true'
(If nc isn’t installed, use busybox tools or install temporarily in a debug image.)
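If the image has bash but no `nc`, bash’s `/dev/tcp` pseudo-device can stand in as a crude port check. A sketch, reusing the hypothetical `db` host and port from the example above (note `/dev/tcp` is a bash feature, not POSIX `sh`):

```shell
#!/usr/bin/env bash
# Crude TCP reachability check using bash's /dev/tcp (no nc required).
# Usage: tcp_check <host> <port>
tcp_check() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

tcp_check db 5432
```

An unresolvable host or refused connection both report `closed`, which is exactly the distinction you need when a crash loop blames a dependency.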
Fix: implement retry/backoff. Your app should not crash instantly if a dependency is temporarily unavailable. Add:
- retry loops
- exponential backoff
- “wait-for-it” style scripts (carefully)
Example simple wait loop:
#!/bin/sh
set -eu
until nc -z db 5432; do
echo "Waiting for db..."
sleep 2
done
exec ./your-app
Fix Compose startup ordering
Compose depends_on does not guarantee readiness, only start order. Use healthchecks + waiting logic.
6.9 “Exec format error” (Wrong Architecture)
Symptom: container exits immediately with logs like:
exec /bin/sh: exec format error
This happens when you run an image built for a different CPU architecture (e.g., ARM image on x86_64).
Check host architecture:
uname -m
Check image architecture:
docker image inspect <image:tag> --format '{{.Architecture}}/{{.Os}}'
Fix: pull/build the correct platform
Pull with platform:
docker pull --platform=linux/amd64 <image:tag>
Or build multi-arch with Buildx:
docker buildx build --platform linux/amd64,linux/arm64 -t yourname/yourimage:latest --push .
6.10 Container Runs, But Exits When Shell Ends (Bad Debug Pattern)
Symptom: you started a container with an interactive shell as PID 1, then detached incorrectly, or your command ends.
Example:
docker run -it ubuntu bash
When you exit the shell, PID 1 exits, container stops.
Fix: run the actual service as PID 1, not a shell, or use -d with a long-running command.
7. Debugging Techniques That Actually Work
When a container restarts too quickly, you need a way to “pause the scene” and inspect.
7.1 Override entrypoint to get a shell
Run a new container from the same image:
docker run --rm -it --entrypoint sh <image:tag>
Or with bash if available:
docker run --rm -it --entrypoint bash <image:tag>
Then manually run what the container normally runs (from docker inspect output).
7.2 Start the container with a “sleep” command
This keeps it alive so you can exec in:
docker run -d --name debug --entrypoint sh <image:tag> -c 'sleep 36000'
docker exec -it debug sh
From there, test filesystem paths, env vars, and run the app command.
7.3 Inspect events in real time
Docker events can show restart cycles and reasons:
docker events --filter container=<container_name_or_id>
In another terminal, watch for die, start, restart events.
7.4 Check resource constraints and kernel messages
If you suspect OOM or kernel kills, check dmesg (on Linux host):
sudo dmesg -T | tail -n 200 | grep -i -E 'oom|killed process'
7.5 Confirm what PID 1 is doing
Inside a running container:
docker exec -it <container> sh -c 'ps aux'
If PID 1 is a shell script, ensure it uses exec to hand over PID 1 to the app. Without exec, signals may not be handled properly, causing weird shutdowns and restarts.
Bad:
#!/bin/sh
./app
Better:
#!/bin/sh
exec ./app
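You can see what `exec` changes without Docker: with `exec`, the new command replaces the shell and inherits its PID (inside a container, that means it inherits PID 1 and the signals Docker sends to it):

```shell
#!/bin/sh
# Without exec, the app would run as a child with a new PID;
# with exec, it replaces the shell and keeps the same PID.
sh -c 'echo "shell pid: $$"; exec sh -c "echo \"app pid:   \$\$\""'
```

Both lines print the same PID. Without the `exec`, a `docker stop` delivers `SIGTERM` to the wrapper shell, which typically neither handles it nor forwards it to the app, so the container is `SIGKILL`ed after the grace period (exit 137).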
8. Fixing Restart Loops in Docker Compose
Compose often introduces restart loops because it makes restarts easy to enable and services depend on each other.
8.1 Check Compose service status
docker compose ps
8.2 View logs per service
docker compose logs --tail=200 <service>
docker compose logs -f <service>
8.3 Identify the restart policy
In Compose, you might have:
- `restart: always`
- `restart: on-failure`
- `restart: unless-stopped`
To temporarily disable restarts for debugging, remove restart: and re-run:
docker compose up -d --force-recreate
Or stop the service entirely (halting the loop while you inspect other services):
docker compose stop <service>
8.4 Use healthchecks and readiness logic (Compose reality)
Compose depends_on does not ensure readiness. If your app needs Postgres ready, add:
- Postgres healthcheck
- App waits/retries on startup
Then you can gate startup with service_healthy (supported in modern Compose implementations), but still keep retries in the app for robustness.
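A minimal sketch of that gating, assuming a Postgres service named `db` and a hypothetical app image (names and timings are illustrative):

```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 20s
  app:
    image: yourapp:latest
    restart: on-failure
    depends_on:
      db:
        condition: service_healthy
```

With this, `app` is not started until `pg_isready` succeeds; the in-app retry loop then covers the remaining failure modes (restarts, transient network errors) that a one-time startup gate cannot.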
9. Fixing Restart Loops in Kubernetes (If Docker Isn’t the Real Culprit)
Sometimes you’re looking at a “Docker container restarting,” but the restart is happening because Kubernetes is managing the container.
9.1 Check pod restart reason
kubectl get pods
kubectl describe pod <pod>
Look for:
- `Last State: Terminated`
- `Reason: OOMKilled`
- `Exit Code: ...`
- `Back-off restarting failed container`
9.2 View logs from the previous crash
kubectl logs <pod> -c <container> --previous
9.3 Probes can cause restarts
If livenessProbe fails, Kubernetes restarts the container. Confirm probe endpoints and timings.
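A common probe shape, with room for startup, looks like this (path, port, and timings are illustrative and must match what the app actually serves):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3
```

A probe that fires before the app listens, or that hits an endpoint doing heavy work, manufactures exactly the restart loop you are trying to debug.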
10. Prevention: Build Containers That Don’t Crash Loop
10.1 Run one foreground process and handle signals
- Ensure PID 1 is your app (or a minimal init like `tini`)
- Use `exec` in entrypoint scripts
- Handle `SIGTERM` for graceful shutdown
Example Dockerfile pattern:
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini","--"]
CMD ["./app"]
10.2 Fail fast, but log clearly
If you must exit due to missing config, print a clear message to stderr and exit non-zero:
- “Missing DATABASE_URL”
- “Cannot read /app/config.json: permission denied”
This makes docker logs immediately useful.
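In a POSIX entrypoint script, the `${VAR:?message}` expansion gives you this behavior almost for free: it prints the message to stderr and exits non-zero when the variable is unset or empty. A sketch, demonstrated in a subshell so the demo itself can continue:

```shell
#!/bin/sh
# ${VAR:?message} aborts with `message` on stderr when VAR is unset or empty.
# In a real entrypoint you would put the expansion at the top of the script;
# here it runs in a subshell so we can observe the refusal.
sh -c 'unset DATABASE_URL; : "${DATABASE_URL:?DATABASE_URL is not set}"' \
  || echo "startup refused with a non-zero exit"
```

The refusal message lands on stderr, so it shows up in `docker logs` exactly where the next person debugging the loop will look.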
10.3 Add sane retry behavior for dependencies
Databases and message brokers often start slower than apps. A small retry loop prevents unnecessary restarts.
10.4 Set appropriate resource limits
If you deploy with memory limits, configure your runtime accordingly (JVM heap, Node memory flags, worker counts).
10.5 Use healthchecks thoughtfully
Healthchecks should:
- represent real readiness/health
- allow enough startup time
- not overload the service
A Practical Walkthrough (Putting It All Together)
Assume your container web is restarting.
Step 1: Identify restart policy and exit code
docker ps -a
docker inspect web --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Restart={{.HostConfig.RestartPolicy.Name}}'
Step 2: Read logs
docker logs -t --tail=200 web
Step 3: If logs show “command not found” (127)
docker inspect web --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'
docker run --rm -it --entrypoint sh <image:tag> -c 'ls -la && which yourbinary || true'
Fix Dockerfile paths or install missing binary.
Step 4: If exit code is 137 and OOMKilled=true
docker stats web
docker inspect web --format 'Memory={{.HostConfig.Memory}}'
Increase memory or reduce usage.
Step 5: If exit code is 0
Your app is exiting cleanly. That means you’re not running a long-lived server process. Fix the CMD/ENTRYPOINT to run the service in the foreground.
Reference Commands (Cheat Sheet)
# Status and restarts
docker ps -a
# Logs
docker logs --tail=200 <container>
docker logs -f <container>
# Inspect state, exit codes, OOM
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}}'
# Restart policy
docker inspect <container> --format '{{json .HostConfig.RestartPolicy}}'
# Disable restart policy (debug)
docker update --restart=no <container>
# Events
docker events --filter container=<container>
# Exec into a running container
docker exec -it <container> sh
# Run image with overridden entrypoint for debugging
docker run --rm -it --entrypoint sh <image:tag>
# Resource usage
docker stats <container>
Closing Mental Model
A restart loop is a feedback loop:
- Container starts
- Main process exits (for a reason)
- A policy/controller restarts it
- Repeat
If you consistently focus on (a) exit reason and (b) who restarts it, you’ll solve restart loops quickly—even in complex stacks.
When you hit a stubborn case, collect the output of these three commands first; together they usually pinpoint the cause:
docker ps -a --no-trunc
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} Restart={{json .HostConfig.RestartPolicy}}'
docker logs --tail=200 <container>