
Why Your Docker Container Keeps Restarting (and How to Fix It)



A Docker container that “keeps restarting” is almost never random. Docker is doing exactly what you (or an image author, or an orchestrator) told it to do: run a process, and if that process exits, apply a restart policy (or a higher-level controller) to bring it back.

This tutorial teaches you how to identify the true reason a container exits, how to read Docker’s signals, and how to fix the root cause—not just silence symptoms.


1. Understand What “Restarting” Really Means

A container is (conceptually) a wrapper around a process. When that process exits, the container stops. If Docker is configured to restart it, you’ll see it bounce between states in docker ps, such as Restarting (1) 2 seconds ago, Up 1 second, and Exited (1) 1 second ago.

Docker’s restart behavior is controlled by:

  1. Restart policy on the container (--restart or Compose restart:)
  2. Orchestrator logic (Docker Swarm, Kubernetes, Nomad, systemd, etc.)
  3. External watchdog scripts (cron jobs, supervisors)

So “keeps restarting” usually means: the main process keeps exiting, and something keeps starting it again.

Your job is to answer two questions:

  1. Why is the main process exiting?
  2. Who/what is restarting it?

2. Quick Triage Checklist (Fastest Path to Root Cause)

Run these in order; they provide the highest signal quickly.

2.1 See container status and restart count

docker ps -a --no-trunc

Look for:

  1. A STATUS of Restarting (exit code), showing how long ago it last died
  2. An Exited (code) status with a very recent finish time
  3. A CREATED time much older than the STATUS churn (evidence of a loop)

2.2 Inspect the container state and exit code

docker inspect <container_name_or_id> --format \
'Status={{.State.Status}} ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} StartedAt={{.State.StartedAt}} FinishedAt={{.State.FinishedAt}}'

If OOMKilled=true, jump to section 6.6 (OOMKilled).

2.3 Read logs from the last run

docker logs --tail=200 <container_name_or_id>

If it restarts quickly, include timestamps:

docker logs -t --tail=200 <container_name_or_id>

2.4 Check restart policy

docker inspect <container> --format 'RestartPolicy={{json .HostConfig.RestartPolicy}}'

If it’s always or unless-stopped, Docker will keep trying.


3. Inspect Restart Policies and Container State

3.1 Restart policies: what they do

Docker supports these policies:

  1. no (the default): never restart
  2. on-failure[:max-retries]: restart only on a non-zero exit, up to an optional retry limit
  3. always: always restart, including after the Docker daemon restarts
  4. unless-stopped: like always, except it stays down if you explicitly stopped the container

Show it:

docker inspect <container> --format '{{.HostConfig.RestartPolicy.Name}} {{.HostConfig.RestartPolicy.MaximumRetryCount}}'

3.2 Temporarily stop the restart loop (to debug)

If the container restarts too fast to inspect, you can disable restart:

docker update --restart=no <container>
docker stop <container>

Then start it manually:

docker start <container>

Or run a new container without restart policy to test:

docker run --rm --name test-no-restart <image:tag>

3.3 Get a full picture with docker inspect

docker inspect <container> > inspect.json

Key fields to search inside inspect.json:

  1. .State (Status, ExitCode, OOMKilled, Error, StartedAt, FinishedAt)
  2. .HostConfig.RestartPolicy
  3. .Config.Entrypoint, .Config.Cmd, and .Config.Env
  4. .Mounts and .NetworkSettings.Networks


4. Read Logs Correctly (and Persist Them)

4.1 Basic logs

docker logs <container>

4.2 Follow logs while it crash-loops

docker logs -f <container>

4.3 Get logs from a previous instance

Docker keeps one log stream per container rather than one per restart, so use timestamps to separate runs:

docker logs -t --since=10m <container>

4.4 If logs are empty

Empty logs can mean:

  1. The process crashed before writing anything to stdout/stderr
  2. The app writes to a log file instead of stdout/stderr
  3. The entrypoint failed before the app even started
  4. A non-default logging driver is swallowing output (see 4.5)

Try:

docker inspect <container> --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'

And run the image interactively:

docker run --rm -it --entrypoint sh <image:tag>

Then try to start the app manually from inside.

4.5 Logging drivers matter

Check the logging driver:

docker inspect <container> --format 'LogConfig={{json .HostConfig.LogConfig}}'

If you’re using journald, syslog, fluentd, etc., docker logs might not show what you expect.


5. Interpret Exit Codes (and What They Usually Mean)

Exit codes are one of the best clues.

Get the exit code:

docker inspect <container> --format 'ExitCode={{.State.ExitCode}}'

Common patterns:

  1. 0: clean exit (the process finished; nothing “crashed”)
  2. 1: generic application error (read the logs)
  3. 126: command found but not executable
  4. 127: command not found
  5. 128+N: killed by signal N (137 = 128+9 = SIGKILL, often OOM; 143 = 128+15 = SIGTERM)
  6. 139: 128+11 = SIGSEGV (segfault, often native-code or architecture trouble)

Also check if Docker says it was OOM killed:

docker inspect <container> --format 'OOMKilled={{.State.OOMKilled}}'
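Exit codes above 128 usually mean “killed by signal (code − 128)”. A small helper makes that explicit during triage (a sketch using the common conventions above; the mapping is conventional, not a guarantee):

```shell
#!/bin/sh
# explain_exit_code CODE: print a rough interpretation of a container
# exit code. Codes > 128 follow the shell convention 128 + signal number.
explain_exit_code() {
  code=$1
  case $code in
    0)   echo "clean exit (process finished normally)" ;;
    126) echo "command found but not executable" ;;
    127) echo "command not found" ;;
    137) echo "killed by SIGKILL (128+9); check OOMKilled" ;;
    139) echo "killed by SIGSEGV (128+11); segfault" ;;
    143) echo "killed by SIGTERM (128+15); graceful stop" ;;
    *)
      if [ "$code" -gt 128 ] 2>/dev/null; then
        echo "killed by signal $((code - 128))"
      else
        echo "application-specific error (read the logs)"
      fi
      ;;
  esac
}
```

You could feed it the value from docker inspect, e.g. explain_exit_code "$(docker inspect <container> --format '{{.State.ExitCode}}')".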

6. Common Root Causes and Fixes

6.1 The Main Process Exits Immediately

Symptom: container starts and exits quickly; exit code might be 0.

This often happens when the container runs a command that completes immediately, like echo, or a server that daemonizes itself into the background, leaving the foreground process (PID 1) with nothing left to do.

Example:

docker run --name mycontainer alpine echo "hello"

That will exit immediately, because echo finishes.

Fix: run a long-lived foreground process. For example:

docker run --name mycontainer alpine sh -c 'while true; do sleep 3600; done'

For real services (nginx, node, python), ensure they run in the foreground: nginx with daemon off;, node server.js directly (not via a backgrounding wrapper), python app.py rather than a forking launcher.

Key idea: In Docker, you generally want one main process that stays in the foreground.


6.2 Your Entrypoint/Command Is Wrong

Symptom: exit code 127 (command not found) or 126 (not executable).

Check what Docker is trying to run:

docker inspect <container> --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'

If your Dockerfile uses:

ENTRYPOINT ["./start.sh"]

But start.sh isn’t copied, or isn’t executable, you’ll crash-loop.

Fixes:

  1. Ensure the file exists in the image:
docker run --rm -it --entrypoint sh <image:tag> -c 'ls -la /path && file /path/start.sh'
  2. Ensure it’s executable:
COPY start.sh /usr/local/bin/start.sh
RUN chmod +x /usr/local/bin/start.sh
ENTRYPOINT ["/usr/local/bin/start.sh"]
  3. Watch out for Windows line endings (CRLF) causing /bin/sh^M: bad interpreter:

Inside the container:

cat -A /usr/local/bin/start.sh | head

If you see ^M, convert to LF on the host:

sed -i 's/\r$//' start.sh

Then rebuild.
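To catch CRLF before it ships, a quick host-side sweep of the build context helps (a sketch; it flags any text file containing carriage returns):

```shell
#!/bin/sh
# find_crlf [DIR]: list text files under DIR (default: current directory)
# that contain carriage-return characters, i.e. Windows line endings.
find_crlf() {
  dir=${1:-.}
  # -r: recurse, -l: list filenames only, -I: skip binary files.
  # The pattern is a literal CR character.
  grep -rlI "$(printf '\r')" "$dir" 2>/dev/null
}
```

Running it over your build context before docker build catches the problem earlier than a crash loop does.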


6.3 App Crashes on Startup (Missing Config/Env/Secrets)

Symptom: logs show configuration errors, missing environment variables, missing files.

Examples: a required DATABASE_URL is unset, a config file the app expects isn’t mounted, or a secrets file path differs between environments.

Debug:

docker logs --tail=200 <container>
docker inspect <container> --format 'Env={{json .Config.Env}}'

Fix: pass environment variables correctly

Run:

docker run -e DATABASE_URL='postgres://user:pass@db:5432/app' <image:tag>

Or with an env file:

docker run --env-file .env <image:tag>

Fix: mount required config files

docker run -v "$PWD/config.json:/app/config.json:ro" <image:tag>

Tip: If the app expects a file at a specific path, confirm the mount path matches exactly.


6.4 Port Binding Conflicts

Port conflicts typically prevent a container from starting at all (you’ll see an error like “bind: address already in use”), but in some setups you might see repeated attempts.

Check what’s using the port on the host:

sudo lsof -iTCP -sTCP:LISTEN -P | grep ':8080'

Or:

sudo ss -ltnp | grep ':8080'

Fix: change the host port mapping:

docker run -p 8081:8080 <image:tag>

Or stop the conflicting service.


6.5 Healthcheck Fails (and Something Restarts It)

Docker’s built-in HEALTHCHECK does not restart containers by itself. It only marks them as unhealthy. However: orchestrators (Swarm; Kubernetes via its own probes) and watchdog tools such as autoheal do restart containers based on health status, so a flaky healthcheck can still drive a restart loop.

Check health status:

docker inspect <container> --format 'Health={{json .State.Health}}'

See the healthcheck command:

docker inspect <container> --format 'Healthcheck={{json .Config.Healthcheck}}'

Fix: Make healthchecks realistic:

  1. Give the app time to boot (use a start period)
  2. Use sensible interval/timeout/retries so one slow response doesn’t flip the state
  3. Probe a cheap endpoint that reflects real readiness, not a heavy one

To manually test the healthcheck command, run it inside the container:

docker exec -it <container> sh
# then run the healthcheck command (e.g., curl -f http://localhost:8080/health)
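For reference, a healthcheck with a grace period might look like this in a Dockerfile (a sketch; the port, the /health path, and the assumption that curl exists in the image are all about your app, not a given):

```dockerfile
# Assumes the app serves HTTP on 8080, exposes /health, and curl is installed.
HEALTHCHECK --interval=30s --timeout=3s --start-period=30s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1
```

The start-period keeps slow boots from being counted as failures.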

6.6 OOMKilled (Out of Memory)

Symptom: exit code often 137, and .State.OOMKilled=true.

Check:

docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}}'

Check memory limits:

docker inspect <container> --format 'Memory={{.HostConfig.Memory}} MemorySwap={{.HostConfig.MemorySwap}}'

If Memory is non-zero, the container has a limit. If the app exceeds it, the kernel may kill it.

See live memory usage:

docker stats <container>

Fix options:

  1. Increase the memory limit:
docker run --memory=1g --memory-swap=1g <image:tag>
  2. Reduce app memory usage (heap sizes, worker counts, caches).
  3. If you’re on Docker Desktop, increase the VM memory in Docker Desktop settings.

Important: OOM kills can happen even without explicit container limits if the host is under memory pressure.


6.7 Permission Problems (Volumes, Users, Filesystems)

Symptom: logs show EACCES, “permission denied”, inability to write to a directory, or failure to create PID files.

Common scenario: container runs as non-root, but a mounted volume is owned by root on the host.

Inspect the container user:

docker inspect <container> --format 'User={{.Config.User}}'

Check permissions inside:

docker exec -it <container> sh -c 'id && ls -ld /data && touch /data/testfile'

If it can’t write, fix by one of:

  1. Adjust host directory ownership to match container UID/GID:
sudo chown -R 1000:1000 ./data
  2. Run container with a specific user:
docker run --user 1000:1000 -v "$PWD/data:/data" <image:tag>
  3. Use Docker-managed volumes (often easier than bind mounts):
docker volume create appdata
docker run -v appdata:/data <image:tag>

6.8 Crash Loops Caused by Dependencies (DB Not Ready, DNS, Network)

Symptom: app exits because it can’t reach database, cache, or an API. Logs show connection refused, timeout, DNS failure.

Examples: the app connects to Postgres before it accepts connections, a hostname like db only resolves on a specific Docker network, or an external API is briefly unreachable.

Debug network basics:

  1. Confirm the container is on the expected network:
docker network ls
docker inspect <container> --format '{{json .NetworkSettings.Networks}}'
  2. Test DNS and connectivity from inside:
docker exec -it <container> sh -c 'cat /etc/resolv.conf && nslookup db || true'
docker exec -it <container> sh -c 'nc -vz db 5432 || true'

(If nc isn’t installed, use busybox tools or install temporarily in a debug image.)

Fix: implement retry/backoff. Your app should not crash instantly if a dependency is temporarily unavailable. Add retries with backoff, connection timeouts, and clear log messages while waiting.

Example simple wait loop:

#!/bin/sh
set -eu

until nc -z db 5432; do
  echo "Waiting for db..."
  sleep 2
done

exec ./your-app
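The same idea generalizes to a bounded retry helper, so a permanently dead dependency fails loudly instead of hanging forever (a sketch; in the loop above, the probe would be nc -z db 5432):

```shell
#!/bin/sh
# retry MAX DELAY CMD [ARGS...]: run CMD until it succeeds, at most MAX
# times, sleeping DELAY seconds between attempts. Returns non-zero if it
# gives up, so the entrypoint can exit with a clear error.
retry() {
  max=$1; delay=$2; shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "retry: giving up on '$*' after $max attempts" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

In the entrypoint you might then write retry 30 2 nc -z db 5432 || exit 1 before exec ./your-app.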

Fix Compose startup ordering. Compose depends_on does not guarantee readiness, only start order. Use healthchecks plus waiting logic.


6.9 “Exec format error” (Wrong Architecture)

Symptom: container exits immediately with logs like exec /app: exec format error or standard_init_linux.go: exec user process caused: exec format error.

This happens when you run an image built for a different CPU architecture (e.g., ARM image on x86_64).

Check host architecture:

uname -m

Check image architecture:

docker image inspect <image:tag> --format '{{.Architecture}}/{{.Os}}'

Fix: pull/build the correct platform

Pull with platform:

docker pull --platform=linux/amd64 <image:tag>

Or build multi-arch with Buildx:

docker buildx build --platform linux/amd64,linux/arm64 -t yourname/yourimage:latest --push .

6.10 Container Runs, But Exits When Shell Ends (Bad Debug Pattern)

Symptom: you started a container with an interactive shell as PID 1, then detached incorrectly, or your command ends.

Example:

docker run -it ubuntu bash

When you exit the shell, PID 1 exits and the container stops.

Fix: run the actual service as PID 1, not a shell, or use -d with a long-running command.


7. Debugging Techniques That Actually Work

When a container restarts too quickly, you need a way to “pause the scene” and inspect.

7.1 Override entrypoint to get a shell

Run a new container from the same image:

docker run --rm -it --entrypoint sh <image:tag>

Or with bash if available:

docker run --rm -it --entrypoint bash <image:tag>

Then manually run what the container normally runs (from docker inspect output).

7.2 Start the container with a “sleep” command

This keeps it alive so you can exec in:

docker run -d --name debug --entrypoint sh <image:tag> -c 'sleep 36000'
docker exec -it debug sh

From there, test filesystem paths, env vars, and run the app command.

7.3 Inspect events in real time

Docker events can show restart cycles and reasons:

docker events --filter container=<container_name_or_id>

In another terminal, watch for die, start, restart events.

7.4 Check resource constraints and kernel messages

If you suspect OOM or kernel kills, check dmesg (on Linux host):

sudo dmesg -T | tail -n 200 | grep -i -E 'oom|killed process'

7.5 Confirm what PID 1 is doing

Inside a running container:

docker exec -it <container> sh -c 'ps aux'

If PID 1 is a shell script, ensure it uses exec to hand over PID 1 to the app. Without exec, signals may not be handled properly, causing weird shutdowns and restarts.

Bad:

#!/bin/sh
./app

Better:

#!/bin/sh
exec ./app

8. Fixing Restart Loops in Docker Compose

Compose often introduces restart loops because it makes restarts easy to enable and services depend on each other.

8.1 Check Compose service status

docker compose ps

8.2 View logs per service

docker compose logs --tail=200 <service>
docker compose logs -f <service>

8.3 Identify the restart policy

In Compose, you might have restart: always or restart: unless-stopped set on a service, so Compose keeps bringing it back.

To temporarily disable restarts for debugging, remove restart: and re-run:

docker compose up -d --force-recreate

Or scale to zero (to stop the loop while you inspect other services):

docker compose stop <service>

8.4 Use healthchecks and readiness logic (Compose reality)

Compose depends_on does not ensure readiness. If your app needs Postgres ready, add a healthcheck to the database service.

Then you can gate startup with service_healthy (supported in modern Compose implementations), but still keep retries in the app for robustness.
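A minimal sketch of that pattern (service and image names are examples; the pg_isready-based test is a common choice for Postgres):

```yaml
services:
  db:
    image: postgres:16
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s
  web:
    build: .
    depends_on:
      db:
        condition: service_healthy
```

Even with this gate, keep retries in the app: the database can become unready again after startup.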


9. Fixing Restart Loops in Kubernetes (If Docker Isn’t the Real Culprit)

Sometimes you’re looking at a “Docker container restarting,” but the restart is happening because Kubernetes is managing the container.

9.1 Check pod restart reason

kubectl get pods
kubectl describe pod <pod>

Look for:

  1. CrashLoopBackOff in the pod status
  2. Last State / Reason / Exit Code in the container status
  3. OOMKilled as a termination reason
  4. Events showing failed liveness/readiness probes

9.2 View logs from the previous crash

kubectl logs <pod> -c <container> --previous

9.3 Probes can cause restarts

If livenessProbe fails, Kubernetes restarts the container. Confirm probe endpoints and timings.
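A typical liveness probe with a generous initial delay looks like this (a sketch; the port and /health path are assumptions about your app):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  timeoutSeconds: 3
  failureThreshold: 3
```

An aggressive probe (short delay, low failure threshold) against a slow-booting app produces exactly the restart loop this section describes.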


10. Prevention: Build Containers That Don’t Crash Loop

10.1 Run one foreground process and handle signals

Example Dockerfile pattern:

RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini","--"]
CMD ["./app"]

10.2 Fail fast, but log clearly

If you must exit due to missing config, print a clear message to stderr and exit non-zero.

This makes docker logs immediately useful.
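A minimal entrypoint guard makes this concrete (a sketch; DATABASE_URL is an example variable name, not something your app necessarily uses):

```shell
#!/bin/sh
# require_env NAME: fail with a clear stderr message if the named
# environment variable is unset or empty.
require_env() {
  name=$1
  eval "val=\${$name:-}"
  if [ -z "$val" ]; then
    echo "FATAL: required environment variable $name is not set" >&2
    return 1
  fi
}

# In an entrypoint you might then write:
#   require_env DATABASE_URL || exit 1
#   exec ./app
```

The FATAL line lands in docker logs, so the next person to triage the loop sees the cause immediately.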

10.3 Add sane retry behavior for dependencies

Databases and message brokers often start slower than apps. A small retry loop prevents unnecessary restarts.

10.4 Set appropriate resource limits

If you deploy with memory limits, configure your runtime accordingly (JVM heap, Node memory flags, worker counts).

10.5 Use healthchecks thoughtfully

Healthchecks should:

  1. Allow a startup grace period
  2. Be cheap to run and quick to time out
  3. Reflect real readiness, not just that the process exists
  4. Avoid flapping on transient slowness


A Practical Walkthrough (Putting It All Together)

Assume your container web is restarting.

Step 1: Identify restart policy and exit code

docker ps -a
docker inspect web --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Restart={{.HostConfig.RestartPolicy.Name}}'

Step 2: Read logs

docker logs -t --tail=200 web

Step 3: If logs show “command not found” (127)

docker inspect web --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}'
docker run --rm -it --entrypoint sh <image:tag> -c 'ls -la && which yourbinary || true'

Fix Dockerfile paths or install missing binary.

Step 4: If exit code is 137 and OOMKilled=true

docker stats web
docker inspect web --format 'Memory={{.HostConfig.Memory}}'

Increase memory or reduce usage.

Step 5: If exit code is 0

Your app is exiting cleanly. That means you’re not running a long-lived server process. Fix the CMD/ENTRYPOINT to run the service in the foreground.


Reference Commands (Cheat Sheet)

# Status and restarts
docker ps -a

# Logs
docker logs --tail=200 <container>
docker logs -f <container>

# Inspect state, exit codes, OOM
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}}'

# Restart policy
docker inspect <container> --format '{{json .HostConfig.RestartPolicy}}'

# Disable restart policy (debug)
docker update --restart=no <container>

# Events
docker events --filter container=<container>

# Exec into a running container
docker exec -it <container> sh

# Run image with overridden entrypoint for debugging
docker run --rm -it --entrypoint sh <image:tag>

# Resource usage
docker stats <container>

Closing Mental Model

A restart loop is a feedback loop:

  1. Container starts
  2. Main process exits (for a reason)
  3. A policy/controller restarts it
  4. Repeat

If you consistently focus on (a) exit reason and (b) who restarts it, you’ll solve restart loops quickly—even in complex stacks.

When you ask for help (in a forum or an issue tracker), include the output of these three commands; together they usually pinpoint the cause:

docker ps -a --no-trunc
docker inspect <container> --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} Error={{.State.Error}} Restart={{json .HostConfig.RestartPolicy}}'
docker logs --tail=200 <container>