Best Practices and Tips for Writing a Dockerfile

Docker is a containerization platform that lets you package software as isolated units called containers. Containers are created from images that include everything needed to run the packaged workload, such as executable binaries and dependency libraries. Images are defined in Dockerfiles, which resemble sequential scripts that are executed to assemble an image. Dockerfiles can include several kinds of instructions, such as RUN, which executes a command in the container’s file system, and COPY, which adds files from your host. In this article, you’ll learn about the key characteristics of Dockerfiles and some best practices to be aware of when writing your own. Adhering to these guidelines helps you reap the benefits of containerization while minimizing security and performance risks.

What Does a Dockerfile Do?

Dockerfiles are sets of instructions used to create a Docker image. The Dockerfile format is defined by Docker and built around a set of keywords that let you manipulate a container’s file system.

Each keyword is capitalized and followed by specific arguments:

FROM ubuntu:latest
COPY /source /destination
RUN touch /example-demo

Three instructions are used here: FROM, which defines the base image from which yours will inherit; COPY, which lets you add files from your host to paths inside the container; and RUN, which is a pass-through for executing commands inside the container’s file system.

Best Practices for Writing a Dockerfile

Writing your own Dockerfiles? Crafting a production-ready Dockerfile is a bit more involved than the minimal example shown above, but it shouldn’t be a daunting exercise. Here’s what you need to know to get the most from your image builds.

Scan for Security Vulnerabilities

You can perform active scans of your Docker images to identify key vulnerabilities. Several tools are available, including Docker Scan, which is built into the Docker CLI.

Scans assess the operating system packages installed in the image and match them against known lists of Common Vulnerabilities and Exposures (CVEs). Suggested remediation steps are provided for each detected problem.

You should use a container scanning engine as a best-practice step to avoid issues slipping into production unseen. Add a scan to the image build stage of your CI pipeline to protect against developers unwittingly adding risky packages to your image.
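
As an illustration, a CI step that builds the image and then scans it could look like the following. The image name and tag are placeholders, and docker scan could be swapped for another scanner such as Trivy.

# Build the image, then fail the CI job if the scan reports vulnerabilities
docker build -t my-app:ci .
docker scan my-app:ci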

Remove Unneeded Dependencies

Docker images should be as minimal as possible. Streamlining your builds to include just the bare essentials makes for smaller image sizes, faster builds, and a reduced attack surface. You’ll also use less network bandwidth when transferring images to registries and hosting providers.

Don’t install any packages or libraries that your application doesn’t actually use. As you’ll rarely interact with running containers yourself, there’s no point in adding common CLI utilities on the off chance you might want them later. Maintaining this practice will help your Dockerfile stay focused on containerizing your application instead of an entire OS environment.
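
For example, on a Debian-based image you might skip recommended extras and clear the package index in the same layer to keep it small. The base image and package names below are only placeholders.

FROM debian:bookworm-slim
# Install only what the application needs, without recommended extras,
# and remove the apt package lists so they don't bloat the layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates && \
    rm -rf /var/lib/apt/lists/*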

Secure Injected Credentials

This is one of the most common and dangerous Dockerfile mistakes: it can be tempting to copy config files containing credentials into your image as part of the build, but this is an anti-pattern you should avoid.

Anything copied as part of your Dockerfile is baked into your image and visible to anyone who can pull it. If you add a file containing a database password, you’ve just exposed that secret to every user with access to your image registry.

Configuration values and secrets must be supplied to individual containers, not the images they’re built from. Use environment variables, config files mounted into a volume, or Docker’s dedicated secrets mechanism to inject data when containers start. This avoids accidental information exposure and keeps your images reusable across environments.
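
As a rough sketch, the same database password could be injected when the container starts rather than at build time. The variable name, env file path, and image tag here are hypothetical.

# Pass a single secret from the host environment at run time
docker run -e DATABASE_PASSWORD="$DATABASE_PASSWORD" my-image:latest

# Or load several values from a file that stays outside the image
docker run --env-file ./production.env my-image:latest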

Run as a Non-Root User

Docker containers usually default to running their processes as the root user. Running as root means that a successful exploit of, say, a web server inside your container could let an attacker take control of the container or even your host machine, because by default root inside the container is the same user as root on your host.

The USER instruction directs Docker to run a container’s command as a specified non-root user. It accepts the name or ID of a user and group. All subsequent RUN, CMD, and ENTRYPOINT instructions in your Dockerfile will run as that account.

USER demo-user:example-group
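
The user and group named in USER must already exist inside the image. On a Debian- or Ubuntu-based image, a minimal sketch might create them first; the names and IDs below are examples.

# Create a dedicated group and user, then switch to them for all later instructions
RUN groupadd --gid 1001 example-group && \
    useradd --uid 1001 --gid example-group --create-home demo-user
USER demo-user:example-group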

For even greater isolation, you can run the Docker daemon itself as a non-root user. This rootless mode uses user namespace remapping to avoid using root on the host entirely: even if a process breaks out of your container, it won’t be able to fully compromise your machine. Rootless mode requires extra setup and doesn’t support every Docker feature.
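
If you want to experiment with rootless mode, recent Docker releases ship a setup script; the commands below are a rough sketch, and the socket path varies with your user ID.

# Run as the unprivileged user that will own the daemon
dockerd-rootless-setuptool.sh install
# Point the CLI at the rootless daemon's socket (path depends on your UID)
export DOCKER_HOST=unix:///run/user/1000/docker.sock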

Track Dockerfile Versions

Dockerfiles are text files that can change over time, so you should commit them to version control. Some organizations keep Dockerfiles in the same repository as their code, facilitating one git pull followed by a docker build to build the image, whereas others choose to use a separate repository. In the latter case, you’d need to pull the code repository separately before building your image.

Using version control lets you track, revert, and review the changes you make to your Dockerfile. You’ll have a useful starting point to debug issues if you build a new image and then find that it fails when deployed.

Create Stateless, Reproducible Containers

Docker containers should be ephemeral and completely stateless. Images need to define reproducible builds: the exact same image should result each time you run docker build against a given Dockerfile. Specifying exact versions of the packages you install helps here, as generic tags like latest, release, or even v1 can point to significantly different content in the future.
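
For instance, pinning both the base image and the packages you install keeps rebuilds predictable. The versions shown here are purely illustrative.

# Pin the base image and dependency versions so repeated builds match
FROM python:3.11.7-slim
RUN pip install --no-cache-dir flask==3.0.0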

Another trait of reproducible builds is the absence of side effects. Builds are meant to create containers, not to update data in your database, push artifacts into your CI system, or send you emails about progress. Those are all jobs to be performed in a separate stage in your pipeline.

Heeding these practices ensures that the image you build with docker build in a week’s time matches the one you build today. It also makes backups much simpler: Docker registries can quickly become large, but properly built images don’t necessarily need to be archived. If you do lose your registry, you can quickly rebuild replicas of your images from the Dockerfiles in your source repositories.

Handle Multiline Arguments

It’s inevitable that some Dockerfile instructions will become long and unwieldy. This is especially true of RUN instructions that execute commands within your container. It’s best practice to combine related commands into a single RUN instruction, as this reduces the number of image layers and keeps steps that belong together (such as apt-get update and apt-get install) within the same cache unit. The drawback is having some very long lines in your Dockerfile.

To mitigate this, split long instructions across multiple lines with backslash continuations. This makes your Dockerfile easier to read, so newcomers can quickly scan through the instructions. Docker suggests that you sort lines alphabetically; while this won’t be possible in all command sequences, it works well for lists of packages and other downloaded files.

RUN apt-get update && \
    apt-get install -y apache2 && \
    service apache2 restart
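
For longer package lists, the alphabetical ordering Docker recommends could look like the following; the packages are just an example.

# One package per line, sorted alphabetically, so diffs stay easy to review
RUN apt-get update && \
    apt-get install -y \
        ca-certificates \
        curl \
        git \
        unzip && \
    rm -rf /var/lib/apt/lists/*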

Know the Difference Between CMD and ENTRYPOINT

CMD and ENTRYPOINT are two instructions used to define the process that’s run in your container. ENTRYPOINT sets the process that’s executed when the container first starts; if no ENTRYPOINT is defined, shell-form commands fall back to /bin/sh. CMD provides the default arguments for the ENTRYPOINT process.

The following pair of instructions results in date +%A running when the container is created:

ENTRYPOINT ["date"]CMD ["+%A"]

Setting a custom ENTRYPOINT lets image users run your CLI binary without repeating its name or full path each time. When you pass arguments to docker run, Docker overrides the CMD but reuses the ENTRYPOINT:

docker run my-image:latest +%B

The above command would start the container and run date +%B.
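
If you ever need to replace the ENTRYPOINT itself rather than just its arguments, docker run accepts an --entrypoint flag; the shell shown here is an example for a one-off debugging session.

# Swap the image's ENTRYPOINT for an interactive shell
docker run --entrypoint /bin/sh -it my-image:latest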

Use COPY Instead of ADD

Dockerfiles support two similar but subtly different instructions. ADD and COPY both let you add existing files to your image; whereas COPY only works with local files on your host, ADD also accepts remote URLs and automatically extracts tar archives. A simple ADD archive.tar has very different results from COPY archive.tar, because ADD extracts the contents of the archive into the destination rather than copying the archive file itself.

Because ADD possesses this extra magic, it’s advisable to use COPY as your go-to when copying content from your file system. ADD creates unnecessary ambiguity, especially when referencing archive files, which means it should be reserved for use when it’s really needed. This helps you communicate your intentions to other contributors who might work with your Dockerfile.
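
To make the intent explicit, a sketch like the following keeps the archive intact with COPY and reserves ADD for deliberate extraction; the file names are placeholders.

# COPY keeps the archive exactly as it is on the host
COPY archive.tar /app/archive.tar

# ADD extracts a local tarball into the image - use it only when that is what you want
ADD rootfs.tar.gz /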

Build Images without a Dockerfile Using STDIN

You can build images without a Dockerfile. While not strictly a Dockerfile best practice, this technique is useful when you’re dynamically creating one-off utility images as part of a CI pipeline.

Use docker build - to have Docker accept a Dockerfile on stdin:

echo -e 'FROM ubuntu:latest\nRUN echo "Built from stdin"' | docker build -
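
The same idea works with a shell here-document, which is easier to read once the inline Dockerfile grows beyond a line or two; the tag name is arbitrary.

docker build -t stdin-demo - <<'EOF'
FROM ubuntu:latest
RUN echo "Built from stdin"
EOF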

Conclusion

Dockerfiles are a set of instructions that Docker uses to build images. They provide a standardized, reproducible, and auditable mechanism for creating container images. While the syntax is simple, it’s important to keep common gotchas and best practices in mind so you don’t hit unexpected issues in production.

Keep your Dockerfiles as small as possible, actively scan them for security vulnerabilities, and ensure you’re not baking in secrets or config keys. These simple steps will ensure your Dockerfiles stay reusable, resilient, and secure against unintentional information disclosure.

When it’s time to launch your images into production, Qovery is a platform that can deploy your Dockerfile straight to AWS. Install the Qovery CLI, initialize a new project, and then push your code to your Git repository. Qovery will automatically build your Dockerfile and launch the resulting container in the cloud, exposing the ports you’ve specified in your Dockerfile. This removes the burden of maintaining a Docker Engine installation on your own servers.
