Tag Archives: chef

Getting Involved in the Operations Working Group

For the last few years I’ve been trying to get more people involved in the Operations Working Group – the team within the OpenStreetMap Foundation that runs all of our services. Each time I think “why aren’t more people involved”, I try to figure out some plausible barriers to entry, and then work on fixing them.

One of the reasons is that we’re very quiet about what we do – there’s a pride in keeping OpenStreetMap humming along and not causing too much of a fuss. But the lack of publicity hurts us when we’re trying to get more people involved. Hence this blog post, among other reasons.

I’ve been working recently (as in, for the last few years) on making as much of our OWG activities public as possible, rather than hidden away on our mailing list or in our meetings. So we now have a public issue tracker showing our tasks, and we also publish a monthly summary of our activities.

To make OpenStreetMap work, we run a surprisingly large number of servers. For many years we maintained a list of our hardware and what each server was being used for on the OpenStreetMap wiki, which helped new people find out how everything works. Maintaining this information was a lot of work and the wiki was often outdated. For my own projects I use Jekyll to create static websites (which are easier to find hosting for than websites that need databases), and information on Jekyll sites can be generated with the help of data files. Since OWG uses Chef to configure all of our servers, and Chef knows both the hardware configuration and also what the machines are used for, the idea came that we could automate these server information pages entirely. That website is now live on hardware.openstreetmap.org so we have a public, accurate and timely list of all of our hardware and the services running on it.

Now my attention has moved to our Chef configuration. Although the configuration has been public for years, currently the barrier to entry is substantial. One straightforward (but surprisingly time-consuming) improvement was to simply write a README for each cookbook – 77 in all. I finished that project last week.

Unless you have administrator rights to OSMF hardware (and even I don’t have that!) you need to write the chef configuration ‘blind’ – that is, you can propose a change but you can’t realistically test that it works before you make a pull request. That makes proposing changes close to impossible, so it’s not surprising that few non-administrators have ever contributed changes. I have experience with a few tools that can help, the most important being test-kitchen. This allows a developer to locally check that their changes work, and have the desired effect, before making a pull request, and also it allows the administrators to check that the PR works before deploying the updated cookbook. Both Matt and I have been working on this recently, and today I proposed a basic test-kitchen configuration.

This will only be the start of a long process, since eventually most of those 77 cookbooks will need test-kitchen configurations. Even in my initial attempts to test the serverinfo cookbook (that generates hardware.openstreetmap.org) I found a bunch of problems, some of which I haven’t yet figured out how to work around. There will be many more of these niggles found, but the goal is to allow future developers to improve each cookbook using only their own laptops.

All of these are small steps on the long path to getting more people involved in the Operations Working Group. If you’re interested in helping, get stuck in to both our task list and our chef repo, and let me know when you get stuck.

Using Vagrant to test Chef cookbooks

I’ve previously discussed using Chef as a great way to manage servers, the key part of the process being writing “cookbooks” to describe what software you want installed, how it should be configured, and what services should be running. But a question that I’ve been asked by a few people is how do I test my cookbooks as I are writing them?

Of course, a simple way to test them is run them on my laptop – which would be great, except that I would end up with all kinds of things installed that I don’t need, and there’s things that I don’t want to repeatedly uninstall just to check my cookbooks install it properly. The second approach is to keep run and re-run them on a server as I go along, but that involves uploading a half-written cookbook to my chef server, running chef-client on the server, seeing if it worked, rinsing and repeating. And I’d have to be brave or foolhardy to do this when the server is in production!

Step forward Vagrant. Vagrant bills itself as a way to:

“Create and configure lightweight, reproducible, and portable development environments.”

but ignore the slogan, that’s not what I use it for. Instead, I treat Vagrant as:

“A command-line interface to test chef cookbooks using virtual machines”

After a few weeks of using chef, I’d started testing cookbooks using VirtualBox to avoid trashing my laptop. But clicking around in the GUI, installing VMs by hand, and running Chef was getting a bit tedious, never mind soaking up disk-space with lots of virtual machine images that I was loathe to delete. With Vagrant, however, things become much more straightforward.

Vagrant creates virtual machines using a simple config file, and lets you specify a local path to your cookbooks, and which recipes you want to run. An example config file looks like:

Vagrant.configure("2") do |config|
  config.vm.box = "precise64"
  config.vm.provision :chef_solo do |chef|
    chef.cookbooks_path = "/home/andy/src/toolkit-chef/cookbooks"
    chef.add_recipe("toolkit")
  end
  config.vm.network :forwarded_port, guest: 80, host: 11180
end

You then run `vagrant up` and it will create the virtual machine from the “precise64” base box, set up networking, shared folders and any other customisations, and run your cookbooks. If, inevitably, your in-development cookbook has a mistake, you can fix it and run `vagrant provision` to re-run Chef. No need to upload cookbooks anywhere or copy them around, and it keeps your development safely separated from your production machines. Other useful commands are `vagrant ssh` to log into the virtual machine (if you need to poke around to figure out if the recipes are doing what you want), `vagrant halt` to shut down the VM when you’re done for the day, and finally `vagrant destroy` to remove the virtual machine entirely. I do this fairly regularly – I’ve got half a dozen Vagrant instances configured for different projects and so often need to free up the disk space – but given I keep the config files then recreating the virtual machine a few months later is no more complex than `vagrant up` and a few minutes wait.

Going back to the original purpose of Vagrant, it’s based around redistributing “boxes” i.e. virtual machines configured in a particular way. I’ve never needed to redistibute a box, but once or twice found myself needing a particular base box that’s not available on vagrantbox.es – for example, testing cookbooks on old versions of Ubuntu. Given my dislike of creating virtual machines manually, I found the Veewee project useful. It takes a config file and installs the OS for you (effectively pressing the keyboard on your behalf during the install) and creates a reusable Vagrant base box. The final piece of the jigsaw is then writing the Veewee config files – step forward Bento, which is a pre-written collection of them. Using all these, you can start with a standard Ubuntu .iso file, convert that into a base box with Veewee, and use that base box in as many Vagrant projects as you like.

Finally, I’ve also used Vagrant purely as a command line for VirtualBox – if I’m messing around with a weekend project and don’t want to mess up my laptop installing random depenedencies, I instead create a minimal Vagrantfile using vagrant init, vagrant up, vagrant ssh, and mess around in the virtual machine – it’s much quicker than installing a fresh VM by hand, and useful even if you aren’t using Chef.

Do you have your own way of testing Chef cookbooks? Or any other tricks or useful projects? If so, let me know!

Getting Started With Chef

A little over a year ago I was plugging through setting up another OpenCycleMap server. I knew what needed installing, and I’d done it many times before, but I suspected that there was a better way than having a terminal open in one screen and my trusty installation notes in the other.

Previously I’d taken a copy of my notes, and tried reworking them into something resembling an automated installation script. I got it to the point where I could work through my notes line-by-line, pasting most of them into the terminal and checking the output, with the occasional note requiring actual typing (typically when I was editing configuration files). But to transform the notes into a robust hands-off script would have been a huge amount of work – probably involving far too many calls to sed and grep – and making everything work when it’s re-run or when I change the script a bit would be hard. I suspected that I would be re-inventing a wheel – but I didn’t know which wheel!

The first thing was to figure out some jargon – what’s the name of this particular wheel? Turns out that it’s known as “configuration management“. The main principle is to write code to describe the server setup, rather than running commands. That twigged with me straight away – every time I was adding more software to the OpenCycleMap servers I had this sinking feeling that I’d need to type the same stuff in over and over on different servers – I’d prefer to write some code once, and run that code over and over instead. The code also needs to be idempotent – i.e. it doesn’t matter how many times you run the code, the end result is the same. That’s about the sum of what configuration management entails.

There’s a few open-source options for configuration management, but one in particular caught my eye. Opscode’s Chef is ruby-based, which works for me since I do a fair amount of ruby development and it’s a language that I enjoy working with. And chef is also what the OpenStreetMap sysadmins use to configure their servers, so having people around who use the same system would simply be a bonus.

What started off as a few days effort turned into a massive multi-week project as I learned chef for the first time, and plugged through creating cookbooks for all the components of my server. It was a massive task and took much longer than I’d initially expected, but 18 months on it was clearly worth it – I’d have never been able to run enough servers for all the styles I have now, nor been able to keep up with the upgrades to the software and hardware without it. It’s awesome.

So here’s some tips, for those who have their own servers and are in a similar position to what I was.

  • How many servers before it’s worth it? Configuration management really kicks in to its own when you have dozens of servers, but how few are too few to be worth the hassle? It’s a tough one. Nowadays I’d say if you have only one server it’s still worth it – just – since one server really means three, right? The one you’re running, the VM on your laptop that you’re messing around with for the next big software upgrade, and the next one you haven’t installed yet. If you’re running a server with anything remotely important on it, then having some chef-scripts to get a replacement up and running if the first goes up in smoke is a really good time-critical aid when you need it most.
  • How do you get started with chef? Well, it’s tough, the learning curve is like a cliff. Chef setups have three main parts – the server(s) you’re setting up (the “node”), the machine you’re pressing keys on (the “workstation”) and the confusingly-named “chef server” which is where “nodes” grab their scripts (“cookbooks”) from. It makes sense to cut down the learning, so I’d recommend using the free 5-node trial of their Hosted Chef offering. That way you only need to concentrate on the nodes and workstation setup at first – and when you run out of nodes, there’s always the open-source chef-server if the platform is too expensive.
  • Which recipes should I use? There are loads available on github, and there’s links all over the chef website. In general, I recommend avoiding them, at least at first. Like I mentioned, the learning curve is cliff-like and while you can do super-complex whizz-bang stuff with chef, the public recipes are almost all vastly overcomplicated, and more importantly, hard to learn from. Start out writing your own – mine were little more than a list of packages to install at first. Then I started adding in some templates, a few scripts resources here and there, and built up from there as I learned new features. Make sure your chef repository is in git, and that you’re committing your cookbook changes as you go along
  • Where’s the documentation? I’d recommend following the tutorial to get things all set up, while trying not to worry too much about the details. Then start writing recipes. For that, the resources page on the wiki tells you everything you need to know – start with the package resource, then the template resource, then on to the rest. There’s a whole bunch of stuff that you won’t need for a long time – attributes, tags, searches – so don’t try learning everything in one go.

I’ll be writing more about developing and testing cookbooks in the future – it’s a whole subject in itself!