YAML Files


This is part of a series on data storage with Python.

In this post, we'll cover .yaml files. YAML stands for "YAML Ain't Markup Language", and it's a convenient format for humans to read and edit the properties of some thing (e.g., an experiment, a piece of hardware, or a server). A typical YAML document looks like this:

---
author: Jonathan Wheeler
email: jonathan.m.wheeler@gmail.com
favorite_foods: 
  - sandwiches
  - pizza
  - anything my wife makes
fake_birthday: 1992-01-23
pets:
  lizard:
    name: Henrietta
    address: My wife's classroom at Mountain View Academy
  squirrel:
    name: Sammy the Squirrel
    address: tree behind my house
biography: >
  It was a dark and stormy day. Everything
  seemed lost, until out of nowhere, 
  a man appeared with a unicycle...

The syntax is easy to learn and very human-readable. YAML documents are a great way to quickly write out settings or configurations by hand in a form that a program can read later.
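To see what a program gets back from such a document, here is a minimal sketch that parses a shortened version of the example with PyYAML's yaml.safe_load. The names document and profile are just illustrative, and the biography text is trimmed to keep the example short.

```python
import datetime
import yaml

# A shortened version of the document above (illustrative only).
document = """\
author: Jonathan Wheeler
favorite_foods:
  - sandwiches
  - pizza
fake_birthday: 1992-01-23
pets:
  lizard:
    name: Henrietta
biography: >
  It was a dark
  and stormy day.
"""

profile = yaml.safe_load(document)

# Mappings become dicts and sequences become lists.
print(profile["favorite_foods"])
print(profile["pets"]["lizard"]["name"])

# An unquoted ISO date is parsed as a datetime.date, not a string.
print(type(profile["fake_birthday"]))

# The folded scalar (>) joins its lines with spaces.
print(repr(profile["biography"]))
```

Note that the date arrives as a datetime.date object. If you want it to stay a string, quote it in the YAML file.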

One of the things I love about YAML is that it supports templates through anchors (&name), aliases (*name), and the merge key (<<). This is a great way to describe many configurations without repeating definitions: you can change a setting for every entry that uses a template by editing just one line.

some_template: &foo
  current: 10 mA
  voltage: 100 mV
  author: Jonathan

second_template: &bar
  current: 20 mA
  voltage: 100 mV

first_experiment: 
  << : *foo # Loads in all of the variables in some_template

second_experiment:
  << : *foo # Loads in all of the variables in some_template
  author: Jonathan's friend # Overwrites the author
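
As a rough check of how the template is expanded, the sketch below loads a trimmed copy of the document with PyYAML. It assumes PyYAML as the parser; the merge key (<<) is a YAML 1.1 convention, so other YAML libraries may treat it differently. The variable names are illustrative.

```python
import yaml

# Trimmed copy of the template example above.
document = """\
some_template: &foo
  current: 10 mA
  voltage: 100 mV
  author: Jonathan

first_experiment:
  <<: *foo

second_experiment:
  <<: *foo
  author: Jonathan's friend
"""

config = yaml.safe_load(document)
first = config["first_experiment"]
second = config["second_experiment"]

# first_experiment receives every key from the template.
print(first)

# second_experiment keeps the template's values except the overridden author.
print(second["author"])
print(second["current"])
```

Keys written directly in an entry take precedence over the merged ones, which is why second_experiment ends up with its own author.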

In Python, you can write data to a .yaml file with the PyYAML package (installed with pip install pyyaml):

import yaml

data = {
    'author': 'Jonathan Wheeler',
    'date': '2021-03-22',
    'array': [1,2,3]
}

with open('somefile.yaml', 'w') as f:
    # Passing the file object lets PyYAML write to it directly
    yaml.dump(data, f)

And read it back with:

import yaml

with open('somefile.yaml') as f:
    # safe_load only builds plain Python types (dict, list, str, ...)
    data = yaml.safe_load(f)
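
Putting the two halves together, here is a minimal round-trip sketch. It writes to a temporary directory (my choice, so the example leaves no files behind) and uses yaml.safe_dump with sort_keys=False, an option available in PyYAML 5.1 and later, to keep the keys in their original order.

```python
import os
import tempfile
import yaml

data = {
    'author': 'Jonathan Wheeler',
    'date': '2021-03-22',
    'array': [1, 2, 3],
}

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'somefile.yaml')
    with open(path, 'w') as f:
        # sort_keys=False keeps the dict's insertion order in the file.
        yaml.safe_dump(data, f, sort_keys=False)
    with open(path) as f:
        text = f.read()

restored = yaml.safe_load(text)

print(text)              # the YAML that was written
print(restored == data)  # whether the round trip preserved the data
```

The 'date' value was a string going in, and it is still a string coming out, because PyYAML quotes it when writing so that it is not mistaken for a date.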

YAML excels at storing configurations and settings. It is not a good format for storing large amounts of data, or for files that programs edit more often than humans do.