Docker skills are in high demand, mainly because Docker lets us automate the deployment of applications inside so-called containers, creating tailored environments that can be easily replicated anywhere the Docker technology is supported. In this tutorial we will see how to create a Docker image from scratch, using a Dockerfile. We will learn the most important instructions we can use to customize our image, how to build the image, and how to run containers based on it.
In this tutorial you will learn:
- How to create a Docker image using a Dockerfile
- Some of the most frequently used Dockerfile instructions
- How to achieve data persistence in containers
Software Requirements and Conventions Used
Category | Requirements, Conventions or Software Version Used
---|---
System | OS-independent
Software | Docker
Other |
Conventions | # – requires given linux commands to be executed with root privileges either directly as a root user or by use of the sudo command; $ – requires given linux commands to be executed as a regular non-privileged user
Images and containers
Before we start, it may be useful to define clearly what we mean when we talk about images and containers in the context of Docker. Images can be considered the building blocks of the Docker world. They represent the “blueprints” used to create containers. Indeed, when a container is created, it represents a concrete instance of the image it is based on.
Many containers can be created from the same image. In the rest of this article we will learn how to provide the instructions needed to create an image tailored to our needs inside a Dockerfile, how to actually build the image, and how to run a container based on it.
Building our own image using a Dockerfile
To build our own image we will use a Dockerfile. A Dockerfile contains all the instructions needed to create and set up an image. Once our Dockerfile is ready, we will use the docker build command to actually build the image.
The first thing we should do is create a new directory to host our project. For the sake of this tutorial we will build an image containing the Apache web server, so we will name the root directory of the project “dockerized-apache”:
$ mkdir dockerized-apache
This directory is what we call the build context. During the build process, all the files and directories contained in it, including the Dockerfile we will create, are sent to the Docker daemon so they can be easily accessed, unless they are listed in the .dockerignore file.
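As a side note, a .dockerignore file uses one pattern per line, much like .gitignore. The entries below are only examples for a typical project; adapt them to your own layout:

```
# keep version control metadata and documentation out of the build context
.git
*.md

# keep editor swap files out as well
*.swp
```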
Let’s create our Dockerfile. The file must be called Dockerfile and will contain, as we said above, all the instructions needed to create an image with the desired features. We fire up our favorite text editor and start by writing the following instructions:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
The first instruction we must provide is FROM: with it we can specify an existing image to use as a base (this is called a base image) to create our own. In this case our base image will be ubuntu. Apart from the image name, we also used a tag, in order to specify the version of the image we want to use, in this case 18.10. If no tag is specified, the latest tag is used by default: this will cause the latest available version of the base image to be used. If the image is not already present on our system, it will be downloaded from Dockerhub.
After the FROM instruction, we used LABEL. This instruction is optional, can be repeated multiple times, and is used to add metadata to our image. In this case we used it to specify the image maintainer.
The RUN instruction
At this point, if we run docker build, we will just produce an image identical to the base one, except for the metadata we added. This would be of no use to us. We said we want to “dockerize” the Apache web server, so the next thing to do in our Dockerfile is to provide an instruction to install the web server as part of the image. The instruction that lets us accomplish this task is RUN:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
RUN apt-get update && apt-get -y install apache2
The RUN instruction is used to execute commands on top of the image. One very important thing to remember is that for every RUN instruction we use, a new layer is created and added to the stack. In this regard Docker is very smart: already built layers are “cached”. This means that if we build an image based on our Dockerfile, and then decide, for example, to add another RUN instruction (and thus a new layer) at the end of it, the build will not start from scratch, but will run only the new instructions.
For this to happen, of course, the instructions already built in the Dockerfile must not be modified. It is even possible to avoid this behavior completely when building an image, by using the --no-cache option of the docker build command.
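Because each RUN instruction produces a layer, related commands are usually chained in a single instruction. A common pattern, equivalent to ours with a cleanup step added, looks like this:

```dockerfile
# one layer: update the package index, install, and clean up together,
# so the stale index data never ends up in a cached layer of its own
RUN apt-get update \
    && apt-get -y install apache2 \
    && rm -rf /var/lib/apt/lists/*
```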
In our case we used the RUN instruction to execute the apt-get update && apt-get -y install apache2 commands. Notice how we passed the -y option to the apt-get install command: this option causes an affirmative answer to be given automatically to all the confirmations required by the command. This is necessary because we are installing the package non-interactively.
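The apache2 package installs without prompting, but some packages ask debconf configuration questions that -y alone does not answer. A common complement, sketched here, is to silence debconf for the duration of the instruction:

```dockerfile
# DEBIAN_FRONTEND=noninteractive suppresses debconf prompts; setting it inline
# (rather than with ENV) keeps the variable out of the final image
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install apache2
```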
Exposing port 80
As we know, the Apache web server listens on port 80
for standard connections. We must instruct Docker to make that port accessible on the container. To accomplish the task we use the EXPOSE
function and provide the port number. For security reasons the specified port is opened only when the container is launched. Let’s add this instruction to our Dockerfile
:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
RUN apt-get update && apt-get -y install apache2
EXPOSE 80
Building the image
At this point we can already try to build our image. From inside the root directory of our project, “dockerized-apache”, we run the following command:
$ sudo docker build -t linuxconfig/dockerized-apache .
Let’s examine the command. First of all, we prefixed the command with sudo, in order to run it with administrative privileges. It would be possible to avoid this by adding a user to the docker group, but this represents a security risk. The -t option we provided, short for --tag, lets us apply a repository name and, optionally, a tag to our image if the build succeeds.
Finally, the . instructs docker to look for the Dockerfile in the current directory. As soon as we launch the command, the build process will start. The progress and build messages will be displayed on screen:
Sending build context to Docker daemon 2.048 kB
Step 1/4 : FROM ubuntu:18.10
Trying to pull repository docker.io/library/ubuntu ...
[...]
In a few minutes our image should be created successfully. To verify it, we can run the docker images command, which returns a list of all the images existing in our local Docker repository:
$ sudo docker images
REPOSITORY                      TAG      IMAGE ID       CREATED         SIZE
linuxconfig/dockerized-apache   latest   7ab7b6873614   2 minutes ago   191 MB
As expected, the image appears in the list. As we can notice, since we didn’t provide a tag (only a repository name, linuxconfig/dockerized-apache), the latest tag has been automatically applied to our image. An ID has also been assigned to it, 7ab7b6873614: we can use it to reference the image in future commands.
Launching a container based on the image
Now that our image is ready, we can create and launch a container based on it. To accomplish the task we use the docker run command:
$ sudo docker run --name=linuxconfig-apache -d -p 8080:80 linuxconfig/dockerized-apache apachectl -D FOREGROUND
Let’s examine the command above. The first option we provided was --name: with it, we specify a name for the container, in this case “linuxconfig-apache”. If we omitted this option, a randomly generated name would have been assigned to our container.
The -d option (short for --detach) causes the container to run in the background.
The -p option, short for --publish, is needed in order to publish a container port (or a range of ports) to the host system. The syntax of the option is the following:
-p host_port:container_port
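For reference, a few common variants of the flag (all standard docker run syntax):

```
-p 8080:80            # map host port 8080 to container port 80, on all interfaces
-p 127.0.0.1:8080:80  # same, but only on the host loopback interface
-p 8080:80/udp        # map a UDP port instead of the default TCP
```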
In this case we published port 80, which we previously exposed in the container, to the host port 8080. For the sake of completeness we must say that it’s also possible to use the -P option (short for --publish-all) instead, causing all the ports exposed in the container to be mapped to random ports on the host.
The last two things we specified in the command above are: the image the container should be based on, and the command to run when the container is started, which is optional. The image is of course linuxconfig/dockerized-apache, the one we built before.
The command we specified is apachectl -D FOREGROUND. With this command the Apache web server is launched in foreground mode: this is mandatory for it to work in the container. The docker run command runs the specified command in a new container:
$ sudo docker run --name=linuxconfig-apache -d -p 8080:80 linuxconfig/dockerized-apache apachectl -D FOREGROUND
a51fc9a6dd66b02117f00235a341003a9bf0ffd53f90a040bc1122cbbc453423
What is the number printed on the screen? It is the ID of the container! Once we have the container up and running, we should be able to access the page served by the default Apache VirtualHost at the localhost:8080 address (port 8080 on the host is mapped to port 80 on the container).
Our setup is working correctly. If we run the docker ps command, which lists all the active containers in the system, we can retrieve information about our container: its id (short version, easier to reference from the command line for a human), the image it was run from, the command used, its creation time and current status, the port mappings and name.
$ sudo docker ps
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS          PORTS                  NAMES
a51fc9a6dd66   linuxconfig/dockerized-apache   "apachectl -D FORE..."   28 seconds ago   Up 28 seconds   0.0.0.0:8080->80/tcp   linuxconfig-apache
To stop the container, all we need to do is reference it by its id or name, and run the docker stop command. For example:
$ sudo docker stop linuxconfig-apache
To start it again:
$ sudo docker start linuxconfig-apache
Executing a command directly via the Dockerfile
Until now we built a basic image and, at runtime, using the docker run command, we specified the command to be launched when the container is started. Sometimes we want to specify the latter directly inside the Dockerfile. We can do it in two ways: using CMD or ENTRYPOINT.
Both instructions can be used for the same purpose, but they behave differently when a command is also specified from the command line. Let’s see how.
The CMD instruction
The CMD instruction can basically be used in two forms. The first is the exec form:
CMD ["/usr/sbin/apachectl", "-D", "FOREGROUND"]
The other one is the shell form:
CMD /usr/sbin/apachectl -D FOREGROUND
The exec form is usually preferred. It is worth noticing that when using the exec form a shell is not invoked, therefore variable expansion will not happen. If variable expansion is needed we can use the shell form, or we can invoke a shell directly in the exec form, as:
CMD ["sh", "-c", "echo $HOME"]
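The difference is easy to reproduce outside of Docker: what matters is whether a shell ever parses the string. A minimal sketch using plain sh (the /home/demo value is just an example):

```shell
# shell form: a shell parses the command string, so $HOME is expanded
HOME=/home/demo sh -c 'echo $HOME'
# prints: /home/demo

# exec form with ["echo", "$HOME"]: no shell is involved; echo receives
# the literal string as its argument and prints it verbatim
env HOME=/home/demo echo '$HOME'
# prints: $HOME
```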
The CMD instruction can be specified only once in the Dockerfile. If multiple CMD instructions are provided, only the last will take effect. The purpose of the instruction is to provide a default command to be launched when the container starts:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
RUN apt-get update && apt-get -y install apache2
EXPOSE 80
CMD ["/usr/sbin/apachectl", "-D", "FOREGROUND"]
The command specified with CMD inside the Dockerfile works as a default, and will be overridden if another command is specified from the command line when executing docker run.
The ENTRYPOINT instruction
The ENTRYPOINT instruction can also be used to configure a command to be used when the container is started, and like CMD, both the exec and shell forms can be used with it. The big difference between the two is that a command passed from the command line will not override the one specified with ENTRYPOINT: instead it will be appended to it.
By using this instruction we can specify a basic command and modify it with the options we provide when running the docker run command, making our container behave like an executable. Let’s see an example with our Dockerfile:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
RUN apt-get update && apt-get -y install apache2
EXPOSE 80
ENTRYPOINT ["/usr/sbin/apachectl"]
In this case we substituted the CMD instruction with ENTRYPOINT, and also removed the -D FOREGROUND options from the exec form. Suppose we now rebuild the image, and recreate the container using the following command:
$ sudo docker run --name=linuxconfig-apache -d -p 8080:80 linuxconfig/dockerized-apache -D FOREGROUND
When the container starts, the -D FOREGROUND arguments are appended to the command provided in the Dockerfile with the ENTRYPOINT instruction, but only when using the exec form. This can be verified by running the docker ps command (here we added some options to the command, to better display and format its output, selecting only the information we need):
$ sudo docker ps --no-trunc --format "{{.Names}}\t{{.Command}}"
linuxconfig-apache "/usr/sbin/apachectl -D FOREGROUND"
Just like CMD, the ENTRYPOINT instruction can be provided only once. If it appears multiple times in the Dockerfile, only the last occurrence will be considered. It is possible to override the default ENTRYPOINT of the image from the command line, by using the --entrypoint option of the docker run command.
Combining CMD and ENTRYPOINT
Now that we know the peculiarities of the CMD and ENTRYPOINT instructions, we can also combine them. What can we obtain by doing so? We can use ENTRYPOINT to specify a valid base command, and the CMD instruction to specify default parameters for it.
The command will run with those default parameters, unless we override them from the command line when running docker run. Sticking to our Dockerfile, we could write:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
RUN apt-get update && apt-get -y install apache2
EXPOSE 80
ENTRYPOINT ["/usr/sbin/apachectl"]
CMD ["-D", "FOREGROUND"]
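With the exec form, Docker simply concatenates the two arrays: the ENTRYPOINT elements first, then the CMD elements (or the arguments given to docker run, if any). The behavior can be sketched with a plain shell, using echo as a stand-in for apachectl:

```shell
# stand-in ENTRYPOINT: ["echo"]; default CMD: ["-D", "FOREGROUND"]
entrypoint=echo

# no extra arguments given to `docker run`: the CMD defaults are appended
"$entrypoint" -D FOREGROUND
# prints: -D FOREGROUND

# arguments passed on the command line replace the CMD defaults entirely
"$entrypoint" -X
# prints: -X
```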
If we rebuild the image from this Dockerfile, remove the previous container we created, and re-launch the docker run command without specifying any additional argument, the /usr/sbin/apachectl -D FOREGROUND command will be executed. If we instead provide some arguments, they will override those specified in the Dockerfile with the CMD instruction. For example, if we run:
$ sudo docker run --name=linuxconfig-apache -d -p 8080:80 linuxconfig/dockerized-apache -X
The command that will be executed when starting the container will be /usr/sbin/apachectl -X. Let’s verify it:
$ sudo docker ps --no-trunc --format "{{.Names}}\t{{.Command}}"
linuxconfig-apache "/usr/sbin/apachectl -X"
The command launched was the one we expected. The -X option, by the way, causes the httpd daemon to be started in debug mode.
Copying files into the container
Our “dockerized” Apache server works. As we saw, if we navigate to localhost:8080, we see the default Apache welcome page. Now, say we have a website ready to be shipped with the container: how can we “load” it so that Apache will serve it instead?
Well, for the sake of this tutorial we will just replace the default index.html file. To accomplish the task we can use the COPY instruction. Suppose we have an alternative index.html file inside the root of our project (our build context) with this content:
<html>
<body>
<h2>Hello!</h2>
<h3>This file has been copied into the container with the COPY instruction!</h3>
</body>
</html>
We want to load it and copy it to the /var/www/html directory inside the container, therefore inside our Dockerfile we add the COPY instruction:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
RUN apt-get update && apt-get -y install apache2
EXPOSE 80
ENTRYPOINT ["/usr/sbin/apachectl"]
CMD ["-D", "FOREGROUND"]
COPY index.html /var/www/html/index.html
We rebuild the image and the container. If we now navigate to localhost:8080, we will see the new message.
The COPY instruction can be used to copy both files and directories. When the destination path doesn’t exist, it is created inside the container. All new files and directories are created with a UID and GID of 0.
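If the files should be owned by a different user inside the container, the COPY instruction accepts a --chown flag (the www-data account is used here just as an example, since it is the user Apache typically runs as on Debian-based systems):

```dockerfile
# copy the file and assign ownership in one step, instead of a separate RUN chown
COPY --chown=www-data:www-data index.html /var/www/html/index.html
```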
Another possible solution to copy files inside the container is to use the ADD instruction, which is more powerful than COPY. With this instruction we can copy files and directories, but also URLs. Additionally, if we copy a local tar archive in a recognized compression format, it will be automatically uncompressed and copied as a directory inside the container.
The ideal strategy is to use COPY, unless the additional features provided by ADD are really needed.
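For instance, assuming a hypothetical site.tar.gz archive exists in our build context, ADD would unpack it on the fly (COPY would instead copy the archive file as-is):

```dockerfile
# the archive is extracted automatically; its contents end up under /var/www/html
ADD site.tar.gz /var/www/html/
```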
Creating a VOLUME
In the previous example, to demonstrate how the COPY instruction works, we replaced the index.html file of the default Apache VirtualHost inside the container.
If we stop and start the container, we will still find the modification we made, but if the container for some reason is removed, all the data contained in its writable layer will be lost with it. How do we solve this problem? One approach is to use the VOLUME instruction:
FROM ubuntu:18.10
LABEL maintainer="egidio.docile@linuxconfig.org"
RUN apt-get update && apt-get -y install apache2
EXPOSE 80
ENTRYPOINT ["/usr/sbin/apachectl"]
CMD ["-D", "FOREGROUND"]
COPY index.html /var/www/html/index.html
VOLUME /var/www/html
The VOLUME instruction takes one or more directories (in this case /var/www/html) and causes them to be used as mountpoints for external, randomly-named volumes generated when the container is created.
This way, the data we put into the directories used as mountpoints will be persisted inside the mounted volumes, and will still exist even if the container is destroyed. If a directory set to be used as a mountpoint already contains data at initialization time, that data is copied inside the volume that is mounted on it.
Let’s rebuild the image and the container. We can now verify that the volume has been created and is in use by inspecting the container:
$ sudo docker inspect linuxconfig-apache
[...]
"Mounts": [
{
"Type": "volume",
"Name": "8f24f75459c24c491b2a5e53265842068d7c44bf1b0ef54f98b85ad08e673e61",
"Source": "/var/lib/docker/volumes/8f24f75459c24c491b2a5e53265842068d7c44bf1b0ef54f98b85ad08e673e61/_data",
"Destination": "/var/www/html",
"Driver": "local",
"Mode": "",
"RW": true,
"Propagation": ""
}
],
[...]
As already said, the volume will survive even after the container is destroyed, so our data will not be lost.
The VOLUME instruction inside the Dockerfile, as we can see from the output of the docker inspect command above, causes a randomly named volume to be created. To define a named volume, or to mount an already existing volume inside a container, we must specify it at runtime, when running the docker run command, using the -v option (short for --volume). Let’s see an example:
$ sudo docker run --name=linuxconfig-apache -d -p 8080:80 -v myvolume:/var/www/html linuxconfig/dockerized-apache
In the command above, we used the -v option, specifying the volume name (very important: notice that it is not a path, but a simple name) and the mountpoint inside the container, using the following syntax:
<volume_name>:<mountpoint>
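A few other forms of the same option are worth knowing (all standard docker run syntax):

```
-v myvolume:/var/www/html        # named volume
-v myvolume:/var/www/html:ro     # same volume, mounted read-only in the container
-v /path/on/host:/var/www/html   # bind-mount of a host directory (a path, not a name)
```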
When we run this command, the volume named “myvolume” will be mounted at the specified path inside the container (the volume will be created if it doesn’t already exist). As we said before, if the volume is empty, the data already existing on the mountpoint inside the container will be copied inside of it. Using the docker volume ls command, we can confirm a volume with the name we specified has been created:
$ sudo docker volume ls
DRIVER VOLUME NAME
local myvolume
To remove a volume we use the docker volume rm command, providing the name of the volume to remove. Docker, however, will not let us remove a volume used by an active container:
$ sudo docker volume rm myvolume
Error response from daemon: Unable to remove volume, volume still in use: remove myvolume: volume is in use - [95381b7b6003f6165dfe2e1912d2f827f7167ac26e22cf26c1bcab704a2d7e02]
Another approach for data persistence, especially useful during development, is to bind-mount a host directory inside the container. This approach has the advantage of letting us work on our code locally with our favorite tools and see the effect of the changes immediately reflected inside the container, but it has a big disadvantage: the container becomes dependent on the host directory structure.
For this reason, since portability is one of the main targets of Docker, it is not possible to define a bind-mount inside a Dockerfile, but only at runtime. To accomplish this task, we use the -v option of the docker run command again, but this time we provide the path of a directory inside the host filesystem instead of a volume name:
$ sudo docker run --name=linuxconfig-apache -d -p 8080:80 -v /path/on/host:/var/www/html linuxconfig/dockerized-apache
When launching the command above, the host directory /path/on/host will be mounted on /var/www/html inside the container. If the directory on the host doesn’t exist, it is created automatically. In this case the data in the mountpoint directory inside the container (/var/www/html in our example) is not copied to the host directory that is mounted on it, as instead happens for volumes.
Conclusion
In this tutorial we learned the basic concepts needed to create and build a Docker image using a Dockerfile, and how to run a container based on it. We built a very simple image which lets us run a “dockerized” version of the Apache web server. In the process, we saw how to use the FROM instruction, which is mandatory to specify a base image to work on, the LABEL instruction to add metadata to our image, and the EXPOSE instruction to declare the ports to be exposed in the container. We also learned how to map said port(s) to the host system port(s).
We learned how to use the RUN instruction to run commands on the image, and we learned how to specify a command to be executed when the container is started, both from the command line and inside the Dockerfile. We saw how to accomplish this by using the CMD and ENTRYPOINT instructions, and what the differences between the two are. Finally, we saw how to COPY data inside the container, and how to achieve data persistence using volumes. In our examples, we discussed only a small subset of the instructions that can be used in a Dockerfile.
For a complete and detailed list, please consult the official Docker documentation. In the meantime, if you want to know how to build an entire LAMP
stack using Docker and the docker-compose tool, you can take a look at our article on How to create a docker-based LAMP stack using docker-compose on Ubuntu 18.04 Bionic Beaver Linux.