Cracking the Monolith with Librarian-Puppet

First post of the new year! I’ve been experimenting with librarian-puppet as a tool to break down monolithic (puppet) infrastructure codebases into a form that’s more compatible with the provisioning of distributed systems.

Throw down your Masters

I’ve touched on the benefits of masterless vs mastered puppet topologies in a previous post and it’s still a principle I believe in. As we scale up systems to the point where replaceability and fast provisioning become important, having a single orchestration point (or should I say point of failure?) becomes more difficult to maintain effectively. Moving towards masterless puppet is an option worth exploring.

As a consequence, I’m experimenting with using librarian-puppet as a way of removing the need for a Puppet master by breaking its shared codebase down into constituent parts. Do different teams need to know how other projects are configured in production? If the code is truly being shared, we run the risk of accidentally changing each other’s code in unexpected ways. Isolated per-project configuration, then, seems to be the way to go.

Dynamic vs Static

To me there are two kinds of provisioning that happen via any kind of configuration management system - dynamic configuration and static configuration.

A file that is exactly the same in both the configuration codebase and in the production environment is static. Examples might be /etc/motd with the server name or a relational database config file - things that shouldn’t change once the machine is provisioned.
Files that are templated on provision and have the potential to change on any configuration run are dynamic. Examples might be firewall rules to let only application servers connect to the database, or ssh keys for new developers.
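
To make the distinction concrete, here is a minimal Ruby sketch using ERB, the template format behind Puppet's template() function. The host name, IP addresses, and rule syntax are all made up for illustration. The point is only that the static content never varies, while the rendered template follows whatever inventory it is given.

```ruby
require 'erb'

# Static: content shipped verbatim, identical in the repo and on the host.
# (Hypothetical host name.)
STATIC_MOTD = "Welcome to db01.example.com\n".freeze

# Dynamic: an ERB template rendered against the inventory that is current
# at run time. The "allow from" rule syntax is illustrative, not a real
# firewall format.
FIREWALL_TEMPLATE = ERB.new(<<~RULES, trim_mode: '-')
  <% app_servers.each do |ip| -%>
  allow from <%= ip %> to port 3306
  <% end -%>
RULES

# Render one rule per application server.
def render_firewall(app_servers)
  FIREWALL_TEMPLATE.result_with_hash(app_servers: app_servers)
end

before = render_firewall(['10.0.0.11'])
after  = render_firewall(['10.0.0.11', '10.0.0.12'])

puts STATIC_MOTD
puts before
puts after
```

Adding a second application server changes the rendered rules but leaves the motd untouched, which is why the static files are the easier ones to move first.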

In breaking open a monolithic puppet codebase, the most important tactic is to pull out the static configuration first. This is where librarian-puppet starts to shine - it enables an incremental migration from one paradigm to the other (the reverse is also true).

Divide-and-Conquer - the Static case

Take the example monolithic puppet codebase below, where web_app is a deployable project in a separate codebase that depends on both riak and python in production.

$ tree ./puppet/
└── modules
    ├── common
    ├── mysql
    ├── python
    ├── riak
    └── web_app
    ...

If the web_app module consists of just static files then we’re almost done - librarian-puppet supports git repositories and sub-paths to modules, making it easy to pull out single modules into a Puppetfile which can be pulled down with librarian-puppet in the same manner as pip for python.

mod 'probablyfine-web_app',
  :git  => 'git://git.example.com/puppet-config.git',
  :path => 'modules/web_app'

If no other teams are using this code (and they ideally shouldn’t be if web_app is a single-team responsibility) then eventually it will be possible to pull this module out in its entirety. Where it ends up depends on the team - some may prefer to locate their puppet code with the actual application code, some may want their own git repos for a team’s modules.

But what of the dependencies on riak and python? At the time of writing a quick search on the Puppet Forge reveals 11 Riak modules and 28 python modules. The next step then would be re-using open-source modules rather than writing your own, and contributing fixes/feature requests if needed, reducing the amount of puppet code that you personally need to supervise.
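
As a rough sketch of where that leads, the Puppetfile might end up looking something like the one below. The example_author names are placeholders rather than recommendations of particular Forge modules, so substitute whichever riak and python modules suit you.

```ruby
# Puppetfile (sketch). The example_author/* names are placeholders.
forge 'https://forgeapi.puppetlabs.com'

# Dependencies pulled from the Puppet Forge instead of the monolith
mod 'example_author/riak'
mod 'example_author/python'

# The team's own module, still sourced from the monolithic repository
mod 'probablyfine-web_app',
  :git  => 'git://git.example.com/puppet-config.git',
  :path => 'modules/web_app'
```

Running librarian-puppet install next to this file should fetch the modules and record the resolved versions in Puppetfile.lock.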

Divide-and-Conquer - the Dynamic case

The bad news is that the static files are the easy case. Effectively provisioning dynamic files requires more thought and there isn’t actually one hard-and-fast answer - a good solution depends a lot on the requirements of a team. Here are a few potential solutions, and there are likely many more:

Use a DNS-based solution for situations that need hosts to know where a particular other host is. A fixed CNAME for lookups, that can be changed to point to a different location, can remove the need for templating.
Use an external lookup service like ZooKeeper or etcd. Delegate the lookups to the machines themselves and regularly update their state. This doesn’t avoid the change of state entirely but moves the configuration closer to being static.
Do you need templating at all? Nagios (as an example) supports a MySQL backend via a plugin rather than files that are templated out with current inventory.
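
The DNS option can be sketched in a few lines of Ruby. The CNAME below is hypothetical, and the resolver is passed in as a parameter purely so the example can be run without real DNS. The idea is that the configuration only ever holds a fixed name, and the address is looked up when it is needed.

```ruby
require 'resolv'

# Static config: the name never changes, so the file needs no templating.
# (Hypothetical CNAME.)
DATABASE_HOST = 'db.internal.example.com'

# Resolve the fixed name at connection time. Defaults to the system
# resolver; anything that responds to getaddress(name) can be substituted.
def database_address(resolver = Resolv)
  resolver.getaddress(DATABASE_HOST)
end
```

Repointing the CNAME moves every client to the new database host without another configuration run touching their files.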

Conclusion

I’m really enjoying librarian-puppet as a way to move towards what I feel is a more intuitive and correct way to break down infrastructure code. It enables incremental migration by pulling individual modules out of larger puppet codebases, and it encourages the use of existing open-source modules on the Puppet Forge.