Introduction to YAML with Examples

YAML is a data serialization language. The name is a recursive acronym that stands for YAML Ain’t Markup Language. It is designed to be human-friendly: easy to read and write, suited to representing settings and data structures, and easy to use from modern programming languages. It is used, for example, as the language for docker-compose files and to specify tasks in Ansible playbooks. In this tutorial we learn the basic YAML concepts and see how the various data types are represented in the YAML syntax.

In this tutorial you will learn:

  • The basic YAML concepts
  • Data types used in YAML files
  • How to organize multi-line content

Software requirements and conventions used

Software Requirements and Linux Command Line Conventions

Category    | Requirements, Conventions or Software Version Used
------------|---------------------------------------------------
System      | Distribution independent
Software    | No specific software needed
Other       | None
Conventions | # – requires given Linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command
            | $ – requires given Linux commands to be executed as a regular non-privileged user

YAML Basic concepts

Before examining how data is represented in the YAML syntax, we should look at some of the fundamental concepts behind its usage. Let’s go!

Only spaces allowed. The very first thing to know is that in the YAML syntax spaces, and only spaces, can be used for indentation: tab characters are not allowed. Indentation is semantic, just as in the Python programming language, since it is used to define structures and data trees.
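
As a minimal sketch of how indentation defines structure, consider the following fragment. The key names are hypothetical and chosen only for illustration; what matters is that the number of leading spaces decides which keys belong to which parent:

```yaml
# Hypothetical settings, used only to illustrate indentation.
server:
  host: example.local   # two spaces: "host" belongs to "server"
  ports:
    http: 80            # four spaces: "http" belongs to "ports"
    https: 443
logging:                # back at column zero: a sibling of "server"
  level: info
```

The exact number of spaces per level is not fixed; it just has to be consistent among the members of the same structure.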

Document delimiters. The --- and ... symbols mark, respectively, the start and the end of a document. They are optional, so a YAML file can be perfectly valid without them; however, they become necessary in some specific cases. The three hyphens must be used when a document is preceded by directives. A directive consists of a % (percent) sign followed by a name and space-delimited parameters (there are currently only two directives defined: %YAML and %TAG). The --- symbol marks the end of the directives and the start of the document. A single file can contain multiple documents: each additional document starts with its own --- line. The three dots (...) explicitly end a document, and they are required when the next document in the file is preceded by directives.
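
To make the role of the delimiters more concrete, here is a small sketch of a file that contains three documents. The content of each document is made up for the example, and how a %YAML directive is treated may vary between parsers:

```yaml
%YAML 1.2
---
# First document; the directive above applies to it.
name: first
...
%YAML 1.2
---
# Second document; the "..." above closed the first one,
# so a new directive could be placed before this "---".
name: second
---
# Third document, with no directives: "---" alone is enough to start it.
name: third
```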




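Most data lives in dictionaries. YAML data is typically organized as key-value pairs, so in most configuration files the top-level node is a dictionary; the top level can also be a list (as in Ansible playbooks) or even a single scalar. YAML is case-sensitive and keys within the same dictionary must be unique.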
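
To illustrate case sensitivity and key uniqueness, here is a minimal sketch; the keys are hypothetical and used only for the example:

```yaml
# "Name" and "name" differ in case, so they are two distinct keys.
Name: Aragorn
name: aragorn
# Repeating "name" at this level would be a duplicate key, which YAML does not allow.
```

How a duplicate is handled in practice depends on the parser: some reject the file, others silently keep the last value, so it is safer to avoid duplicates altogether.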

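Finally, by convention YAML files end with the .yaml or .yml suffix. The language itself does not require it, but editors and tools often rely on it to recognize the format.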

Data types

Now that we have seen the basics, let’s see how data types are represented in the YAML syntax. We have three primitives:

  • Scalars
  • Lists
  • Mappings (key-value pairs)

Let’s see how they are represented.

Scalars

Scalars are data which can be identified as a single value, for example: a string, an integer or a boolean. Using scalars in the YAML syntax is pretty simple. Here is an example of the usage of a string from a docker-compose.yml file in which the image to use for a container is specified:

image: httpd:latest

As we can see, we don’t need quotes to define a string (we can use them, but they are not mandatory). Numbers, both integers and floating point values, are just as easily represented:

items: 39
price: 25.5

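In YAML 1.1, booleans can be represented in multiple ways: yes/no, true/false, on/off (the 1.1 type specification also lists y/n, although not every parser honors them). The YAML 1.2 core schema recognizes only true/false, so those are the most portable choice: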

overwrite: no
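
Because unquoted scalars are interpreted by the parser, quoting is the way to force a value to stay a string. The following sketch uses hypothetical keys to show the difference; the exact interpretation of unquoted values depends on the YAML version the parser implements:

```yaml
overwrite: false      # boolean in both YAML 1.1 and 1.2
country: "no"         # quoted: always the string "no", never a boolean
version: "1.10"       # quoted: the string "1.10", not the float 1.1
port: 8080            # unquoted digits: an integer
```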

Lists

In the YAML syntax, a list, or collection of values, can be represented in two ways: the first is by placing each element on its own line, preceded by a hyphen and a space; the other is by enclosing the elements in square brackets, separated by commas. Here is an example of the first syntax:

list:
  - first
  - second
  - third

The “inline” way, instead, is the following:

list: [ first, second, third ]
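
The two syntaxes can also be combined or nested. As a small sketch with made-up data, here is a list whose items are themselves lists, written first with inline items and then entirely in block style:

```yaml
# Hypothetical data: a list whose items are themselves lists.
matrix:
  - [1, 2, 3]
  - [4, 5, 6]
# The same structure written entirely in block style.
matrix_block:
  -
    - 1
    - 2
    - 3
  -
    - 4
    - 5
    - 6
```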


Mappings

Mappings (also called hashes or dictionaries) are unordered collections of key/value pairs, and they are the structure most YAML files are built around. Here is an example:

character:
  name: aragorn
  race: man

In the example above, the name and race keys are members of the same dictionary, mapped respectively to the “aragorn” and “man” values. The dictionary itself is the value associated with the character key.

Mappings, just like lists, can also be represented with an inline syntax, using curly braces. In that case each key is separated from its value by a : (colon) followed by a space, which is mandatory, and the pairs are separated by commas. The mapping of the previous example can also be represented in the following way:

character: { name: aragorn, race: man }

Keys in a dictionary must be unique. Data types can of course be mixed to represent complex structures. For example, we can have a list of mappings:

characters:
  - { name: aragorn, race: man }
  - { name: legolas, race: elf }
  - { name: frodo, race: hobbit }

or:

characters:
  - name: aragorn
    race: man

  - name: legolas
    race: elf

  - name: frodo
    race: hobbit

Or we can use a list as a value in a dictionary:

character: { name: aragorn, race: man, weapons: [sword, knife] }
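
For comparison, the same structure can be sketched in block style, which is often easier to read and to extend as the data grows:

```yaml
# Block-style equivalent of the inline mapping above.
character:
  name: aragorn
  race: man
  weapons:
    - sword
    - knife
```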


Multi-line content

Inside YAML documents it is possible to define multi-line content by using the | character (literal block scalar). Here is an example from an Ansible playbook task. In it, we use the content parameter of the “copy” module to define the multi-line content of a file. When we use the | character, the newlines in the content are preserved:

- name: Example
  hosts: localhost
  tasks:
    - name: Write content
      copy:
        dest: /foo.conf
        content: |
          line1
          line2
          line3

It is also possible to use the > character (folded block scalar) to organize content on multiple lines. The difference between the two is that, while in the previous example newlines are preserved, with > the newlines between consecutive lines are converted to spaces, so the content, once written, appears on a single line (by default only the final newline is kept). This is particularly useful when we want to make a really long line more readable:

- name: Example
  hosts: localhost
  tasks:
    - name: Example
      copy:
        dest: /foo.conf
        content: >
          this content
          will be
          on the same line
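
Both block scalar styles keep a single trailing newline by default. As a brief sketch, appending a - to the indicator strips that final newline; the key names here are hypothetical and the comments show the value we would expect a parser to produce:

```yaml
# Parsed value: "line1\nline2\n" (default: one final newline is kept)
clip: |
  line1
  line2
# Parsed value: "line1\nline2" (the "-" strips the final newline)
strip: |-
  line1
  line2
# Parsed value: "this content will be on the same line" (folded, no final newline)
folded_strip: >-
  this content
  will be
  on the same line
```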

Conclusions

In this tutorial we talked about the YAML serialization language and learned the fundamental concepts behind its usage. YAML files are used to represent settings or data. They are used, among other things, to define Ansible playbook tasks and to set how containers should be built and launched in docker-compose files. We saw the defining traits of the YAML syntax, and how data types such as scalars, lists and dictionaries are represented. Finally, we saw how to organize multi-line content.
