Why Your Docker Container Keeps Restarting (and How to Fix It)
A Docker container that “keeps restarting” is almost never random. Docker is doing exactly what you (or an image author, or an orchestrator) told it to do: run a process, and if that process exits, apply a restart policy (or a higher-level controller) to bring it back.
This tutorial teaches you how to identify the true reason a container exits, how to read Docker’s signals, and how to fix the root cause—not just silence symptoms.
Table of Contents
- 1. Understand What “Restarting” Really Means
- 2. Quick Triage Checklist (Fastest Path to Root Cause)
- 3. Inspect Restart Policies and Container State
- 4. Read Logs Correctly (and Persist Them)
- 5. Interpret Exit Codes (and What They Usually Mean)
- 6. Common Root Causes and Fixes
- 6.1 The Main Process Exits Immediately
- 6.2 Your Entrypoint/Command Is Wrong
- 6.3 App Crashes on Startup (Missing Config/Env/Secrets)
- 6.4 Port Binding Conflicts
- 6.5 Healthcheck Fails (and Something Restarts It)
- 6.6 OOMKilled (Out of Memory)
- 6.7 Permission Problems (Volumes, Users, Filesystems)
- 6.8 Crash Loops Caused by Dependencies (DB Not Ready, DNS, Network)
- 6.9 “Exec format error” (Wrong Architecture)
- 6.10 Container Runs, But Exits When Shell Ends (Bad Debug Pattern)
- 7. Debugging Techniques That Actually Work
- 8. Fixing Restart Loops in Docker Compose
- 9. Fixing Restart Loops in Kubernetes (If Docker Isn’t the Real Culprit)
- 10. Prevention: Build Containers That Don’t Crash Loop
1. Understand What “Restarting” Really Means
A container is (conceptually) a wrapper around a process. When that process exits, the container stops. If Docker is configured to restart it, you’ll see it bounce between states like:
`Up ... (health: starting)`, then `Restarting (1) ...`, `Restarting (137) ...`, or `Exited (0) ...` followed immediately by `Up ...` again.
Docker’s restart behavior is controlled by:
- Restart policy on the container (`--restart` or Compose `restart:`)
- Orchestrator logic (Docker Swarm, Kubernetes, Nomad, systemd, etc.)
- External watchdog scripts (cron jobs, supervisors)
So “keeps restarting” usually means:
- The app exits (crash or normal exit)
- Something restarts it (policy/controller)
- The underlying problem is still present
- Repeat (crash loop)
Your job is to answer two questions:
- Why is the main process exiting?
- Who/what is restarting it?
2. Quick Triage Checklist (Fastest Path to Root Cause)
Run these in order; they provide the highest signal quickly.
2.1 See container status and restart count
docker ps -a --no-trunc
Look for:
- `STATUS` like `Restarting (1) 10 seconds ago`
- A high restart count (`docker inspect --format '{{.RestartCount}}' <container>`)
2.2 Inspect the container state and exit code
docker inspect <container_name_or_id> --format \
'Status={{.State.Status}} ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} StartedAt={{.State.StartedAt}} FinishedAt={{.State.FinishedAt}}'
If `OOMKilled=true`, jump to section 6.6 (OOMKilled).
2.3 Read logs from the last run
docker logs --tail=200 <container_name_or_id>
If it restarts quickly, include timestamps:
docker logs -t --tail=200 <container_name_or_id>
2.4 Check restart policy
docker inspect <container> --format 'RestartPolicy={{json .HostConfig.RestartPolicy}}'
If it’s `always` or `unless-stopped`, Docker will keep trying.
3. Inspect Restart Policies and Container State
3.1 Restart policies: what they do
Docker supports these policies:
- `no` (default): do not restart automatically
- `on-failure[:max-retries]`: restart only if the exit code is non-zero
- `always`: restart regardless of exit code
- `unless-stopped`: like `always`, but won’t restart if you manually stopped it
Show it:
docker inspect <container> --format '{{.HostConfig.RestartPolicy.Name}} {{.HostConfig.RestartPolicy.MaximumRetryCount}}'
3.2 Temporarily stop the restart loop (to debug)
If the container restarts too fast to inspect, you can disable restart:
docker update --restart=no <container>
docker stop <container>
Then start it manually:
docker start <container>
Or run a new container without restart policy to test:
docker run --rm --name test-no-restart <image:tag>
3.3 Get a full picture with docker inspect
docker inspect <container> > inspect.json
Key fields to search inside inspect.json:
- `.State.ExitCode`
- `.State.Error`
- `.State.OOMKilled`
- `.Config.Entrypoint`
- `.Config.Cmd`
- `.HostConfig.RestartPolicy`
- `.Mounts`
- `.HostConfig.Binds`
- `.NetworkSettings.Ports`
4. Read Logs Correctly (and Persist Them)
4.1 Basic logs
docker logs <container>
4.2 Follow logs while it crash-loops
docker logs -f <container>
4.3 Get logs from a previous instance
Docker keeps a single log stream per container across restarts (logs are not split per restart), so use timestamps to find the last crash:
docker logs -t --since=10m <container>
4.4 If logs are empty
Empty logs can mean:
- The process exits before writing to stdout/stderr
- Logging is going to a file inside the container
- The entrypoint fails before the app starts
Try:
docker inspect <container> --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'
And run the image interactively:
docker run --rm -it --entrypoint sh <image:tag>
Then try to start the app manually from inside.
4.5 Logging drivers matter
Check the logging driver:
docker inspect <container> --format 'LogConfig={{json .HostConfig.LogConfig}}'
Only the `json-file`, `local`, and `journald` drivers support reading logs back; with syslog, fluentd, and similar drivers, `docker logs` shows nothing at all.
5. Interpret Exit Codes (and What They Usually Mean)
Exit codes are one of the best clues.
Get the exit code:
docker inspect <container> --format 'ExitCode={{.State.ExitCode}}'
Common patterns:
- `0`: process exited successfully (but maybe you expected it to keep running)
- `1`: generic error (app-level failure)
- `2`: misuse of shell builtins / CLI usage error
- `126`: command found but not executable (permissions)
- `127`: command not found
- `137`: killed (often OOM kill or `SIGKILL`)
- `139`: segmentation fault
- `143`: terminated (`SIGTERM`), often from `docker stop` or an orchestrator
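Codes above 128 follow the shell convention of 128 + signal number (`SIGKILL` = 9 gives 137, `SIGSEGV` = 11 gives 139, `SIGTERM` = 15 gives 143). As an illustrative aid, a tiny helper can turn an exit code into a first hypothesis:

```shell
#!/bin/sh
# Map a container exit code to a likely cause.
# Codes > 128 are 128 + the terminating signal's number.
explain_exit() {
  case "$1" in
    0)   echo "clean exit (did you expect a long-running process?)" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found" ;;
    137) echo "SIGKILL (128+9): often an OOM kill" ;;
    139) echo "SIGSEGV (128+11): segmentation fault" ;;
    143) echo "SIGTERM (128+15): graceful stop requested" ;;
    *)   echo "application-specific failure" ;;
  esac
}

explain_exit 137
```

Feed it the value from `docker inspect --format '{{.State.ExitCode}}' <container>` as a starting point, then confirm with logs.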
Also check if Docker says it was OOM killed:
docker inspect <container> --format 'OOMKilled={{.State.OOMKilled}}'
6. Common Root Causes and Fixes
6.1 The Main Process Exits Immediately
Symptom: container starts and exits quickly; exit code might be 0.
This often happens when the container runs a command that completes immediately, like `echo`, or a server that daemonizes into the background, leaving the foreground process to exit.
Example:
docker run --name mycontainer alpine echo "hello"
That will exit immediately, because echo finishes.
Fix: run a long-lived foreground process. For example:
docker run --name mycontainer alpine sh -c 'while true; do sleep 3600; done'
For real services (nginx, node, python), ensure they run in the foreground:
- Nginx: `nginx -g 'daemon off;'`
- Apache httpd: `httpd -DFOREGROUND`
- Many base images already do this correctly; custom scripts often break it.
Key idea: In Docker, you generally want one main process that stays in the foreground.
6.2 Your Entrypoint/Command Is Wrong
Symptom: exit code 127 (command not found) or 126 (not executable).
Check what Docker is trying to run:
docker inspect <container> --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'
If your Dockerfile uses:
ENTRYPOINT ["./start.sh"]
But start.sh isn’t copied, or isn’t executable, you’ll crash-loop.
Fixes:
- Ensure the file exists in the image:
docker run --rm -it --entrypoint sh <image:tag> -c 'ls -la /path && file /path/start.sh'
- Ensure it’s executable:
COPY start.sh /usr/local/bin/start.sh
RUN chmod +x /usr/local/bin/start.sh
ENTRYPOINT ["/usr/local/bin/start.sh"]
- Watch out for Windows line endings (CRLF) causing `/bin/sh^M: bad interpreter`:
Inside the container:
cat -A /usr/local/bin/start.sh | head
If you see ^M, convert to LF on the host:
sed -i 's/\r$//' start.sh
Then rebuild.
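The CRLF problem is easy to reproduce and verify locally before rebuilding. This sketch writes a throwaway script with Windows line endings into a temp file, detects them, and strips them (the file path is disposable, not from the article):

```shell
#!/bin/sh
# Reproduce CRLF endings in a throwaway script, then strip them.
tmp=$(mktemp)
printf '#!/bin/sh\r\necho hello\r\n' > "$tmp"

# Count lines still ending in a carriage return (should print 2 here)
grep -c "$(printf '\r')\$" "$tmp"

# Strip trailing CR from every line, as in the fix above
sed -i 's/\r$//' "$tmp"

grep -q "$(printf '\r')\$" "$tmp" || echo "line endings are clean"
rm -f "$tmp"
```

Running the same `grep` check inside the container (against the real entrypoint script) tells you whether CRLF is the culprit before you touch the Dockerfile.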
6.3 App Crashes on Startup (Missing Config/Env/Secrets)
Symptom: logs show configuration errors, missing environment variables, missing files.
Examples:
- “DATABASE_URL not set”
- “Cannot find /app/config.json”
- “Permission denied reading /run/secrets/…”
Debug:
docker logs --tail=200 <container>
docker inspect <container> --format 'Env={{json .Config.Env}}'
Fix: pass environment variables correctly
Run:
docker run -e DATABASE_URL='postgres://user:pass@db:5432/app' <image:tag>
Or with an env file:
docker run --env-file .env <image:tag>
Fix: mount required config files
docker run -v "$PWD/config.json:/app/config.json:ro" <image:tag>
Tip: If the app expects a file at a specific path, confirm the mount path matches exactly.
6.4 Port Binding Conflicts
Port conflicts typically prevent a container from starting at all (you’ll see an error like “bind: address already in use”), but in some setups you might see repeated attempts.
Check what’s using the port on the host:
sudo lsof -iTCP -sTCP:LISTEN -P | grep ':8080'
Or:
sudo ss -ltnp | grep ':8080'
Fix: change the host port mapping:
docker run -p 8081:8080 <image:tag>
Or stop the conflicting service.
6.5 Healthcheck Fails (and Something Restarts It)
Docker’s built-in HEALTHCHECK does not restart containers by itself. It only marks them as unhealthy. However:
- Docker Compose `depends_on` with `condition: service_healthy` can cause cascading behavior
- Orchestrators (Kubernetes) may restart pods if probes fail
- External tooling may watch health and restart
Check health status:
docker inspect <container> --format 'Health={{json .State.Health}}'
See the healthcheck command:
docker inspect <container> --format 'Healthcheck={{json .Config.Healthcheck}}'
Fix: Make healthchecks realistic:
- Ensure the service is actually listening before the check runs
- Increase `start_period`, `interval`, and `retries` (in the Dockerfile or Compose)
- Validate the endpoint used by the healthcheck
To manually test the healthcheck command, run it inside the container:
docker exec -it <container> sh
# then run the healthcheck command (e.g., curl -f http://localhost:8080/health)
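For reference, a Dockerfile healthcheck with generous startup timing might look like this (the endpoint, port, and timings are illustrative, and it assumes `curl` exists in the image):

```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
```

A generous `--start-period` prevents slow-booting services from being marked unhealthy before they ever had a chance to listen.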
6.6 OOMKilled (Out of Memory)
Symptom: exit code often 137, and .State.OOMKilled=true.
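The 137 is not arbitrary: it is the 128 + signal convention, and `SIGKILL` is signal 9. You can reproduce the code locally without Docker:

```shell
#!/bin/sh
# A process killed by SIGKILL reports exit status 137 (= 128 + 9).
# The subshell kills itself; the parent captures the status.
sh -c 'kill -KILL $$' || status=$?
echo "exit status: $status"   # prints "exit status: 137"
```

So a 137 by itself only proves a `SIGKILL`; check `OOMKilled` to confirm the kernel (rather than `docker kill` or an orchestrator) was the sender.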
Check:
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}}'
Check memory limits:
docker inspect <container> --format 'Memory={{.HostConfig.Memory}} MemorySwap={{.HostConfig.MemorySwap}}'
If Memory is non-zero, the container has a limit. If the app exceeds it, the kernel may kill it.
See live memory usage:
docker stats <container>
Fix options:
- Increase memory limit:
docker run --memory=1g --memory-swap=1g <image:tag>
- Reduce app memory usage:
- Lower the JVM heap (`-Xmx`)
- Fix memory leaks
- Reduce concurrency / worker counts
- If you’re on Docker Desktop, increase the VM memory in Docker Desktop settings.
Important: OOM kills can happen even without explicit container limits if the host is under memory pressure.
6.7 Permission Problems (Volumes, Users, Filesystems)
Symptom: logs show EACCES, “permission denied”, inability to write to a directory, or failure to create PID files.
Common scenario: container runs as non-root, but a mounted volume is owned by root on the host.
Inspect the container user:
docker inspect <container> --format 'User={{.Config.User}}'
Check permissions inside:
docker exec -it <container> sh -c 'id && ls -ld /data && touch /data/testfile'
If it can’t write, fix by one of:
- Adjust host directory ownership to match container UID/GID:
sudo chown -R 1000:1000 ./data
- Run container with a specific user:
docker run --user 1000:1000 -v "$PWD/data:/data" <image:tag>
- Use Docker-managed volumes (often easier than bind mounts):
docker volume create appdata
docker run -v appdata:/data <image:tag>
6.8 Crash Loops Caused by Dependencies (DB Not Ready, DNS, Network)
Symptom: app exits because it can’t reach database, cache, or an API. Logs show connection refused, timeout, DNS failure.
Examples:
- `ECONNREFUSED db:5432`
- `getaddrinfo ENOTFOUND redis`
- `dial tcp: lookup ...: no such host`
Debug network basics:
- Confirm the container is on the expected network:
docker network ls
docker inspect <container> --format '{{json .NetworkSettings.Networks}}'
- Test DNS and connectivity from inside:
docker exec -it <container> sh -c 'cat /etc/resolv.conf && nslookup db || true'
docker exec -it <container> sh -c 'nc -vz db 5432 || true'
(If nc isn’t installed, use busybox tools or install temporarily in a debug image.)
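If the image has bash but no `nc`, bash’s `/dev/tcp` pseudo-device can stand in as a crude port check. A sketch, reusing the hypothetical `db` host and port from the example above (note `/dev/tcp` is a bash feature, not POSIX `sh`):

```shell
#!/usr/bin/env bash
# Crude TCP reachability check using bash's /dev/tcp (no nc required).
# Usage: tcp_check <host> <port>
tcp_check() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

tcp_check db 5432
```

An unresolvable host or refused connection both report `closed`, which is exactly the distinction you need when a crash loop blames a dependency.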
Fix: implement retry/backoff. Your app should not crash instantly if a dependency is temporarily unavailable. Add:
- retry loops
- exponential backoff
- “wait-for-it” style scripts (carefully)
Example simple wait loop:
#!/bin/sh
set -eu
until nc -z db 5432; do
echo "Waiting for db..."
sleep 2
done
exec ./your-app
Fix Compose startup ordering
Compose depends_on does not guarantee readiness, only start order. Use healthchecks + waiting logic.
6.9 “Exec format error” (Wrong Architecture)
Symptom: container exits immediately with logs like:
exec /bin/sh: exec format error
This happens when you run an image built for a different CPU architecture (e.g., ARM image on x86_64).
Check host architecture:
uname -m
Check image architecture:
docker image inspect <image:tag> --format '{{.Architecture}}/{{.Os}}'
Fix: pull/build the correct platform
Pull with platform:
docker pull --platform=linux/amd64 <image:tag>
Or build multi-arch with Buildx:
docker buildx build --platform linux/amd64,linux/arm64 -t yourname/yourimage:latest --push .
6.10 Container Runs, But Exits When Shell Ends (Bad Debug Pattern)
Symptom: you started a container with an interactive shell as PID 1, then detached incorrectly, or your command ends.
Example:
docker run -it ubuntu bash
When you exit the shell, PID 1 exits, container stops.
Fix: run the actual service as PID 1, not a shell, or use -d with a long-running command.
7. Debugging Techniques That Actually Work
When a container restarts too quickly, you need a way to “pause the scene” and inspect.
7.1 Override entrypoint to get a shell
Run a new container from the same image:
docker run --rm -it --entrypoint sh <image:tag>
Or with bash if available:
docker run --rm -it --entrypoint bash <image:tag>
Then manually run what the container normally runs (from docker inspect output).
7.2 Start the container with a “sleep” command
This keeps it alive so you can exec in:
docker run -d --name debug --entrypoint sh <image:tag> -c 'sleep 36000'
docker exec -it debug sh
From there, test filesystem paths, env vars, and run the app command.
7.3 Inspect events in real time
Docker events can show restart cycles and reasons:
docker events --filter container=<container_name_or_id>
In another terminal, watch for die, start, restart events.
7.4 Check resource constraints and kernel messages
If you suspect OOM or kernel kills, check dmesg (on Linux host):
sudo dmesg -T | tail -n 200 | grep -i -E 'oom|killed process'
7.5 Confirm what PID 1 is doing
Inside a running container:
docker exec -it <container> sh -c 'ps aux'
If PID 1 is a shell script, ensure it uses exec to hand over PID 1 to the app. Without exec, signals may not be handled properly, causing weird shutdowns and restarts.
Bad:
#!/bin/sh
./app
Better:
#!/bin/sh
exec ./app
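You can see what `exec` changes without Docker: with `exec`, the new command replaces the shell and inherits its PID (inside a container, that means it inherits PID 1 and the signals Docker sends to it):

```shell
#!/bin/sh
# Without exec, the app would run as a child with a new PID;
# with exec, it replaces the shell and keeps the same PID.
sh -c 'echo "shell pid: $$"; exec sh -c "echo \"app pid:   \$\$\""'
```

Both lines print the same PID. Without the `exec`, a `docker stop` delivers `SIGTERM` to the wrapper shell, which typically neither handles it nor forwards it to the app, so the container is `SIGKILL`ed after the grace period (exit 137).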
8. Fixing Restart Loops in Docker Compose
Compose often introduces restart loops because it makes restarts easy to enable and services depend on each other.
8.1 Check Compose service status
docker compose ps
8.2 View logs per service
docker compose logs --tail=200 <service>
docker compose logs -f <service>
8.3 Identify the restart policy
In Compose, you might have:
- `restart: always`
- `restart: on-failure`
- `restart: unless-stopped`
To temporarily disable restarts for debugging, remove restart: and re-run:
docker compose up -d --force-recreate
Or stop the service entirely (halting the loop while you inspect other services):
docker compose stop <service>
8.4 Use healthchecks and readiness logic (Compose reality)
Compose depends_on does not ensure readiness. If your app needs Postgres ready, add:
- Postgres healthcheck
- App waits/retries on startup
Then you can gate startup with service_healthy (supported in modern Compose implementations), but still keep retries in the app for robustness.
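A minimal sketch of that gating, assuming a Postgres service named `db` and a hypothetical app image (names and timings are illustrative):

```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 20s
  app:
    image: yourapp:latest
    restart: on-failure
    depends_on:
      db:
        condition: service_healthy
```

With this, `app` is not started until `pg_isready` succeeds; the in-app retry loop then covers the remaining failure modes (restarts, transient network errors) that a one-time startup gate cannot.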
9. Fixing Restart Loops in Kubernetes (If Docker Isn’t the Real Culprit)
Sometimes you’re looking at a “Docker container restarting,” but the restart is happening because Kubernetes is managing the container.
9.1 Check pod restart reason
kubectl get pods
kubectl describe pod <pod>
Look for:
- `Last State: Terminated`
- `Reason: OOMKilled`
- `Exit Code: ...`
- `Back-off restarting failed container`
9.2 View logs from the previous crash
kubectl logs <pod> -c <container> --previous
9.3 Probes can cause restarts
If livenessProbe fails, Kubernetes restarts the container. Confirm probe endpoints and timings.
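A common probe shape, with room for startup, looks like this (path, port, and timings are illustrative and must match what the app actually serves):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 15
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3
```

A probe that fires before the app listens, or that hits an endpoint doing heavy work, manufactures exactly the restart loop you are trying to debug.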
10. Prevention: Build Containers That Don’t Crash Loop
10.1 Run one foreground process and handle signals
- Ensure PID 1 is your app (or a minimal init like `tini`)
- Use `exec` in entrypoint scripts
- Handle `SIGTERM` for graceful shutdown
Example Dockerfile pattern:
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini","--"]
CMD ["./app"]
10.2 Fail fast, but log clearly
If you must exit due to missing config, print a clear message to stderr and exit non-zero:
- “Missing DATABASE_URL”
- “Cannot read /app/config.json: permission denied”
This makes docker logs immediately useful.
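In a POSIX entrypoint script, the `${VAR:?message}` expansion gives you this behavior almost for free: it prints the message to stderr and exits non-zero when the variable is unset or empty. A sketch, demonstrated in a subshell so the demo itself can continue:

```shell
#!/bin/sh
# ${VAR:?message} aborts with `message` on stderr when VAR is unset or empty.
# In a real entrypoint you would put the expansion at the top of the script;
# here it runs in a subshell so we can observe the refusal.
sh -c 'unset DATABASE_URL; : "${DATABASE_URL:?DATABASE_URL is not set}"' \
  || echo "startup refused with a non-zero exit"
```

The refusal message lands on stderr, so it shows up in `docker logs` exactly where the next person debugging the loop will look.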
10.3 Add sane retry behavior for dependencies
Databases and message brokers often start slower than apps. A small retry loop prevents unnecessary restarts.
10.4 Set appropriate resource limits
If you deploy with memory limits, configure your runtime accordingly (JVM heap, Node memory flags, worker counts).
10.5 Use healthchecks thoughtfully
Healthchecks should:
- represent real readiness/health
- allow enough startup time
- not overload the service
A Practical Walkthrough (Putting It All Together)
Assume your container web is restarting.
Step 1: Identify restart policy and exit code
docker ps -a
docker inspect web --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Restart={{.HostConfig.RestartPolicy.Name}}'
Step 2: Read logs
docker logs -t --tail=200 web
Step 3: If logs show “command not found” (127)
docker inspect web --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'
docker run --rm -it --entrypoint sh <image:tag> -c 'ls -la && which yourbinary || true'
Fix Dockerfile paths or install missing binary.
Step 4: If exit code is 137 and OOMKilled=true
docker stats web
docker inspect web --format 'Memory={{.HostConfig.Memory}}'
Increase memory or reduce usage.
Step 5: If exit code is 0
Your app is exiting cleanly. That means you’re not running a long-lived server process. Fix the CMD/ENTRYPOINT to run the service in the foreground.
Reference Commands (Cheat Sheet)
# Status and restarts
docker ps -a
# Logs
docker logs --tail=200 <container>
docker logs -f <container>
# Inspect state, exit codes, OOM
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}}'
# Restart policy
docker inspect <container> --format '{{json .HostConfig.RestartPolicy}}'
# Disable restart policy (debug)
docker update --restart=no <container>
# Events
docker events --filter container=<container>
# Exec into a running container
docker exec -it <container> sh
# Run image with overridden entrypoint for debugging
docker run --rm -it --entrypoint sh <image:tag>
# Resource usage
docker stats <container>
Closing Mental Model
A restart loop is a feedback loop:
- Container starts
- Main process exits (for a reason)
- A policy/controller restarts it
- Repeat
If you consistently focus on (a) exit reason and (b) who restarts it, you’ll solve restart loops quickly—even in complex stacks.
When you hit a stubborn case, collect the output of these three commands first; together they usually pinpoint the cause:
docker ps -a --no-trunc
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} Restart={{json .HostConfig.RestartPolicy}}'
docker logs --tail=200 <container>