YAML is a data serialization language. The name is a recursive acronym that stands for YAML Ain’t Markup Language. It is designed to be human-friendly, easy to read and write, able to represent settings and data structures, and to work well with modern programming languages. It is used, for example, as the language for docker-compose files and to specify tasks in Ansible playbooks. In this tutorial we learn the basic YAML concepts and see how the various data types are represented in the YAML syntax.
In this tutorial you will learn:
- The YAML basic concepts
- Data types used in YAML files
- How to organize multi-line content

Software requirements and conventions used
Category | Requirements, Conventions or Software Version Used |
---|---|
System | Distribution independent |
Software | No specific software needed |
Other | None |
Conventions | # – requires given linux-commands to be executed with root privileges, either directly as the root user or by use of the sudo command; $ – requires given linux-commands to be executed as a regular non-privileged user |
YAML Basic concepts
Before examining how data is represented in the YAML syntax, let’s look at some of the fundamental concepts behind its usage.
Only spaces allowed. The very first thing to know is that in the YAML syntax only spaces, never tabs, can be used for indentation. Indentation is semantic, just as in the Python programming language, since it is used to define structures and data trees.
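As a minimal sketch of how indentation defines structure (the key names are invented for illustration), the two nested keys below belong to server only because they are indented with spaces under it:

```yaml
# Hypothetical keys, used only to illustrate indentation.
server:
  host: localhost   # two spaces: child of "server"
  port: 8080        # same indentation: sibling of "host"
debug: false        # no indentation: sibling of "server"
```

The number of spaces per level is free, as long as sibling keys share the same indentation. A tab character used for indentation is expected to make a parser reject the file.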
Document delimiters. The --- and ... symbols mark, respectively, the start and the end of a document. They are optional, so a YAML file can be perfectly valid without them; however, they become necessary in some specific cases. The three hyphens must be used when a document is preceded by directives. A directive consists of a % (percent) sign followed by a name and space-delimited parameters (only two directives are currently defined: %YAML and %TAG). The --- symbol marks the end of the directives and the start of the document. A single file can contain multiple documents: each new document starts with ---, while the three dots symbol (...) explicitly ends a document and can be followed only by directives and/or the --- delimiter. The three dots are required when the next document begins with directives.
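To make the role of the delimiters more concrete, here is a small sketch of a file holding two documents, each preceded by a %YAML directive. The keys and values are made up for the example:

```yaml
%YAML 1.2
---
# first document (illustrative content)
name: first
...
%YAML 1.2
---
# second document (illustrative content)
name: second
```

When no directives are involved, a single --- line between the two documents is enough to separate them.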
Almost everything is part of a dictionary. In most YAML files data is organized as a dictionary, since it is represented as key-value pairs (the top level of a document can also be a list, as happens in Ansible playbooks). YAML is case-sensitive and keys must be unique.
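A short sketch of what case-sensitivity means for keys (the keys below are hypothetical):

```yaml
# "Name" and "name" differ in case, so they are two distinct keys.
Name: Aragorn
name: aragorn
# Adding a second "name" key here would be a duplicate,
# which the YAML specification does not allow.
```

How a duplicate key is handled depends on the parser: some report an error, others silently keep only one of the values, so it is safer to avoid duplicates altogether.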
Finally, YAML files conventionally end with the .yaml or .yml suffix; this is a widely followed convention rather than a requirement of the language.
Data types
Now that we have seen the basics, let’s see how data types are represented in the YAML syntax. We have three primitives:
- Scalars
- Lists
- Mappings (key-value pairs)
Let’s see how they are represented.
Scalars
Scalars are data which can be identified as a single value, for example a string, an integer or a boolean. Using scalars in the YAML syntax is pretty simple. Here is an example of a string from a docker-compose.yml file, in which the image to use for a container is specified:
image: httpd:latest
As we can see, to define a string we don’t need to use quotes (we can, but it’s not mandatory). Numbers, both integers and floating point values, are also easily represented:
items: 39
price: 25.5
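Booleans can be represented in multiple ways: true/false is always accepted, while yes/no, on/off and y/n come from YAML 1.1 and are not treated as booleans by every parser (the YAML 1.2 core schema recognizes only true/false):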
overwrite: no
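Since unquoted scalars are interpreted by the parser, quoting is the usual way to force a value to stay a string. The sketch below uses invented keys, and the comments describe typical parser behavior rather than a guarantee for every implementation:

```yaml
# Illustrative keys only.
version: 1.10         # unquoted: read as a floating point number
version_str: "1.10"   # quoted: kept as the string 1.10
answer: "no"          # quoted: the string "no", not a boolean
```

This matters for values such as version numbers or country codes, where an unquoted scalar may be read as a number or a boolean instead of the intended text.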
Lists
In the YAML syntax a list, or collection of values, can be represented in two ways: the first is by preceding each element, on its own line, with a hyphen and a space; the other is by enclosing the elements in square brackets, separated by commas. Here is an example of the first syntax:
list:
- first
- second
- third
The “inline” way, instead, is the following:
list: [ first, second, third ]
Mappings
Mappings (or hashes, dictionaries) are unordered collections of key/value pairs. As we said before, most of what we write in a YAML file is a member of a dictionary. Here is an example:
character:
  name: aragorn
  race: man
In the example above, the name and race keys are members of the same dictionary, mapped respectively to the “aragorn” and “man” values. The dictionary itself is the value associated with the character key.
Mappings, just like lists, can also be represented with an inline syntax, using curly braces. In that case each key is separated from its value by a : (colon) followed by a space, which is mandatory, and the pairs are separated by commas. The mapping of the previous example can also be represented in the following way:
character: { name: aragorn, race: man }
Keys in a dictionary must be unique. Data types can of course be mixed to represent complex structures. For example, we can have a list of mappings:
characters:
- { name: aragorn, race: man }
- { name: legolas, race: elf }
- { name: frodo, race: hobbit }
or:
characters:
  - name: aragorn
    race: man
  - name: legolas
    race: elf
  - name: frodo
    race: hobbit
Or we can use a list as a value in a dictionary:
character: { name: aragorn, race: man, weapons: [sword, knife] }
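For comparison, the same structure can be written in the indented block style. This is simply the inline example above expanded, with no new data:

```yaml
# Block-style equivalent of the inline mapping above.
character:
  name: aragorn
  race: man
  weapons:
    - sword
    - knife
```

Both forms should load to the same data; the block style tends to be easier to read once the nesting gets deeper.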
Multi-line content
Inside YAML documents it is possible to define multi-line content by using the | character (literal block scalar). Here is an example from an Ansible playbook task. In it, we use the content parameter of the “copy” module to define the multi-line content of a file. When we use the | character, the newlines in the content are preserved:
- name: Example
  hosts: localhost
  tasks:
    - name: Write content
      copy:
        dest: /foo.conf
        content: |
          line1
          line2
          line3
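To see what the literal block actually produces, the content value of the task above can be written as a double-quoted string. The fragment below is only a sketch of that equivalence, not something taken from a real playbook:

```yaml
# Equivalent form of the "content" value above: each line break
# becomes an explicit \n, including a single trailing one.
content: "line1\nline2\nline3\n"
```

The literal block is usually preferable, since the file content appears exactly as it will be written.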
It is also possible to use the > character (folded block scalar) to organize content on multiple lines. The difference between the two is that, while in the previous example newlines are preserved, with > single newlines are converted to spaces, so the content, once written, will appear on one line. This is particularly useful when we want to make a really long line more readable:
- name: Example
  hosts: localhost
  tasks:
    - name: Example
      copy:
        dest: /foo.conf
        content: >
          this content
          will be
          on the same line
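As with the literal block, we can sketch the resulting value of the folded content above as a double-quoted string:

```yaml
# Equivalent form of the folded "content" value above: the line
# breaks between the three lines become spaces, and one trailing
# newline is kept at the end.
content: "this content will be on the same line\n"
```

A blank line inside a folded block is not turned into a space: it is kept as a line break, which makes it possible to separate paragraphs.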
Conclusions
In this tutorial we talked about the YAML serialization language and we learned the fundamental concepts behind its usage. YAML files are used to represent settings or data. They are used, among other things, to define Ansible playbook tasks and to set how containers should be built and launched in docker-compose files. We saw the defining traits of the YAML syntax, and how data types such as scalars, lists and dictionaries are represented. Finally, we saw how to organize multi-line content.